java - Getting the type of Unicode in java

Question

This is an interview question:

Return true or false for a given string value and its corresponding unicode

public boolean decode (String value, String unicode){
    // logic goes here
}

For example, if the given inputs are:

String value = "abc"; String unicode = "UTF-8"; return value is false
String value = "\u00A3"; String unicode = "ASCII"; return value is true

I read in an article that Unicode values are determined internally by their bytes. So my first idea was to check the range: for example, if a value falls in the range between 40 and 63, it is ASCII. Please correct me if I am wrong with this logic, and tell me if there is a better way to find out the Unicode type.

score 0 · Accepted Answer

Unicode Equivalent of ANSI

ANSI characters 32 to 127 correspond to those in the 7-bit ASCII character set, which forms the Basic Latin Unicode character range. Characters 160–255 correspond to those in the Latin-1 Supplement Unicode character range.

As you can observe, there are Unicode values equivalent to ASCII in that table. So you had better ask the interviewer what the requirements really are.
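To make the ranges quoted above concrete, here is a minimal sketch that asks Java which Unicode block each code point belongs to. The class name `BlockCheck` and the helpers `isBasicLatin` and `isLatin1Range` are hypothetical names chosen for illustration; only `Character.UnicodeBlock` comes from the JDK. It checks code point ranges only and is not a solution to the interview question as stated.

```java
import java.lang.Character.UnicodeBlock;

class BlockCheck {
    // Hypothetical helper: true if every code point is in Basic Latin
    // (U+0000..U+007F), the range shared with 7-bit ASCII.
    static boolean isBasicLatin(String text) {
        return text.codePoints()
                   .allMatch(cp -> UnicodeBlock.of(cp) == UnicodeBlock.BASIC_LATIN);
    }

    // Hypothetical helper: true if every code point is at most U+00FF,
    // i.e. Basic Latin plus the Latin-1 Supplement block.
    static boolean isLatin1Range(String text) {
        return text.codePoints().allMatch(cp -> cp <= 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(isBasicLatin("abc"));      // true
        System.out.println(isBasicLatin("\u00A3"));   // false: U+00A3 is in Latin-1 Supplement
        System.out.println(isLatin1Range("\u00A3"));  // true
    }
}
```

Note that a range check like this says which block the characters fall in, not which encoding was used to store them, so it cannot by itself reproduce the return values in the question.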

score 0 · Accepted Answer

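That is a pretty bad specification for a function. In an interview, you should respond as though a client had asked you to implement this software, so gently ask for clarification of the intent behind the spec. Or introduce your criticism through your questions, as though you were a student who wants to be taught. You might say:

I am not used to the word "Unicode" being used as a general term for encodings such as ASCII or UTF-8. Am I right that this is the purpose of the parameter? Could we name it "encoding" so that I can recall its purpose more easily?

I guess you are interested in certain specific encodings, rather than, say, every encoding ever named by the Internet Engineering Task Force. As you can see, I am referring to the MIME standard, which provides that the IETF specify an official registry of encoding names. There are hundreds or thousands of them.

I notice that you return false for a query about UTF-8 when the text is "abc". Is that because the code points of that text are all in the range that UTF-8 has in common with ASCII, so that the encoded text is the same in UTF-8 and in ASCII? Would you do the same for another encoding, such as ISO-8859-1, that includes ASCII as a subset?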
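The point about shared ranges can be checked directly. The following is a rough sketch, assuming the parameter really is an encoding name; the class `EncodingCheck` and the helpers `canEncode` and `sameBytes` are hypothetical names, while `Charset`, `CharsetEncoder.canEncode` and `String.getBytes(Charset)` are standard JDK calls. It does not reproduce the return values given in the question, which is exactly why the intent needs clarifying.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EncodingCheck {
    // Hypothetical helper: can every character of the text be represented
    // in the named encoding? Throws if the JVM does not know the name.
    static boolean canEncode(String text, String encodingName) {
        return Charset.forName(encodingName).newEncoder().canEncode(text);
    }

    // Hypothetical helper: does the text encode to identical bytes
    // in both encodings?
    static boolean sameBytes(String text, Charset a, Charset b) {
        return Arrays.equals(text.getBytes(a), text.getBytes(b));
    }

    public static void main(String[] args) {
        System.out.println(canEncode("abc", "US-ASCII"));       // true
        System.out.println(canEncode("\u00A3", "US-ASCII"));    // false
        System.out.println(canEncode("\u00A3", "ISO-8859-1"));  // true

        // "abc" is byte-for-byte identical in UTF-8, ASCII and ISO-8859-1.
        System.out.println(sameBytes("abc", StandardCharsets.UTF_8,
                                     StandardCharsets.US_ASCII));    // true
        // U+00A3 is one byte in ISO-8859-1 but two bytes in UTF-8.
        System.out.println(sameBytes("\u00A3", StandardCharsets.UTF_8,
                                     StandardCharsets.ISO_8859_1));  // false
    }
}
```

Because "abc" produces the same bytes in all three encodings, nothing in the string itself can tell them apart; any true/false answer depends on a rule the interviewer has to supply.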
