java - Java: Convert UTF-8 to a String

Question

When I run the following program:

public static void main(String args[]) throws Exception
{
    byte str[] = {(byte)0xEC, (byte)0x96, (byte)0xB4};
    String s = new String(str, "UTF-8");
}

On Linux, inspecting the value of s in jdb correctly shows:

s = "어"

On Windows, it is incorrectly displayed as:

s = "?"

My byte sequence is a valid UTF-8 encoding of a Korean character, so why do the two platforms produce such different results?

score 3 · Accepted Answer


Answered 2012-10-02T21:22:16.447

score 1 · Accepted Answer


Answered 2012-10-02T21:20:11.713

score 1 · Accepted Answer

You get the correct string; it is the Windows console that does not display it correctly.

Here is a link to an article that discusses a way to make Java console produce correct Unicode output using JNI.
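As a rough illustration of the point that the string itself is fine and only the display differs, here is a minimal sketch. It does not use the JNI approach from the article; instead it wraps `System.out` in a `PrintStream` with an explicit UTF-8 encoding. The class name `Utf8ConsoleSketch` and the helper `decode` are made up for this example, and whether the glyph actually renders still depends on the console's code page and font, which the code cannot control.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this sketch.
class Utf8ConsoleSketch {
    // Decode the bytes as UTF-8, independent of the platform default charset.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        byte[] bytes = {(byte) 0xEC, (byte) 0x96, (byte) 0xB4};
        String s = decode(bytes);

        // Compare against the expected character U+C5B4 without printing it,
        // so the result does not depend on how the console renders text.
        System.out.println(s.equals("\uC5B4"));

        // Assumption: the terminal expects UTF-8 output. If it does not,
        // this line may still show up garbled even though s is correct.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(s);
    }
}
```

The comparison line gives the same answer on any platform, while the last line is only readable when the terminal is set up for UTF-8.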

score 0 · Accepted Answer

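JDB is not displaying the data correctly. This code works the same way on both Windows and Linux. Try running this more conclusive test:

import java.math.BigInteger;

// The class name is arbitrary; any name works.
public class Utf8Test {
    public static void main(String[] args) throws Exception {
        byte[] str = {(byte)0xEC, (byte)0x96, (byte)0xB4};
        String s = new String(str, "UTF-8");
        // Print each char as hex so the result does not depend on the console encoding
        for (int i = 0; i < s.length(); i++) {
            System.out.println(BigInteger.valueOf((int)s.charAt(i)).toString(16));
        }
    }
}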

This prints the hexadecimal value of every character in the string. On both Windows and Linux it correctly prints "c5b4".
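To see where "c5b4" comes from, the three bytes can also be decoded by hand. The sketch below follows the standard three-byte UTF-8 layout (1110xxxx 10xxxxxx 10xxxxxx) and does no validation of the input. The class name `Utf8ManualDecode` and the method `decodeThreeByte` are made up for this illustration.

```java
// Hypothetical class name, used only for this sketch.
class Utf8ManualDecode {
    // Decode one three-byte UTF-8 sequence: 1110xxxx 10xxxxxx 10xxxxxx.
    // Assumes the input really is a well-formed three-byte sequence.
    static int decodeThreeByte(byte b0, byte b1, byte b2) {
        return ((b0 & 0x0F) << 12)   // low 4 bits of the lead byte
             | ((b1 & 0x3F) << 6)    // low 6 bits of the first continuation byte
             | (b2 & 0x3F);          // low 6 bits of the second continuation byte
    }

    public static void main(String[] args) {
        int codePoint = decodeThreeByte((byte) 0xEC, (byte) 0x96, (byte) 0xB4);
        // 0xEC -> 0xC000, 0x96 -> 0x0580, 0xB4 -> 0x0034, combined: 0xC5B4
        System.out.println(Integer.toHexString(codePoint)); // prints c5b4
    }
}
```

The result matches the value the test above prints, which is the code point of the Korean syllable 어.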

