c# - Utf7Encoding truncating text

Question

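I ran into a problem with the Utf7Encoding class truncating the "+4" sequence, and I would like to know why this happens. I tried Utf8Encoding to get the string from the byte[] array instead, and it seems to work fine. Are there any known issues like this with Utf8? Essentially, I use the output produced by this conversion to build HTML from an RTF string.

Here is the snippet: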

    UTF7Encoding utf = new UTF7Encoding();
    UTF8Encoding utf8 = new UTF8Encoding();

    string test = "blah blah 9+4";

    char[] chars = test.ToCharArray();
    byte[] charBytes = new byte[chars.Length];

    for (int i = 0; i < chars.Length; i++)
    {
        charBytes[i] = (byte)chars[i];
    }

    string resultString = utf8.GetString(charBytes);
    string resultStringWrong = utf.GetString(charBytes);

    Console.WriteLine(resultString);       // blah blah 9+4
    Console.WriteLine(resultStringWrong);  // blah blah 9

score 1 · Accepted Answer

You are not converting the string to UTF-7 bytes correctly. You need to call utf.GetBytes() instead of casting the characters to bytes.

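In UTF-7, the ASCII code for '+' is reserved: it marks the start of a modified Base64 run used to encode non-ASCII Unicode characters, so a literal '+' has to be written as "+-". Bytes produced by a plain cast contain a bare '+', which the decoder treats as the start of an encoded run rather than as text.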
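As a rough illustration of the point about '+', the sketch below encodes a short string with UTF7Encoding and prints the resulting bytes as ASCII, then decodes both the properly encoded bytes and raw ASCII bytes. The class name `Utf7PlusDemo` is made up for this example, and the `#pragma` is only there on the assumption that you build against .NET 5 or later, where UTF7Encoding is flagged as obsolete.

```csharp
using System;
using System.Text;

#pragma warning disable SYSLIB0001 // UTF7Encoding is marked obsolete on .NET 5 and later

class Utf7PlusDemo // hypothetical name, for illustration only
{
    static void Main()
    {
        var utf7 = new UTF7Encoding();

        // Encoding escapes the literal '+' as the two-byte sequence "+-".
        byte[] encoded = utf7.GetBytes("9+4");
        Console.WriteLine(Encoding.ASCII.GetString(encoded)); // 9+-4

        // Decoding properly encoded bytes restores the original text.
        Console.WriteLine(utf7.GetString(encoded)); // 9+4

        // Raw ASCII bytes contain a bare '+', which the decoder reads as the
        // start of a Base64 run; this is the case where the question saw
        // the "+4" disappear.
        byte[] raw = Encoding.ASCII.GetBytes("9+4");
        Console.WriteLine(utf7.GetString(raw));
    }
}
```

The takeaway is that bytes handed to utf.GetString() should come from utf.GetBytes() (or another real UTF-7 producer), not from a cast.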

score 1 · Accepted Answer

You cannot convert to a byte array by going through a char array like that. If you need the string as a charset-specific byte[], do it like this:

    UTF7Encoding utf = new UTF7Encoding();
    UTF8Encoding utf8 = new UTF8Encoding();

    string test = "blah blah 9+4";

    // Let each encoding produce its own byte representation.
    byte[] utfBytes = utf.GetBytes(test);
    byte[] utf8Bytes = utf8.GetBytes(test);

    // Decode each byte array with the encoding that produced it.
    string utfString = utf.GetString(utfBytes);
    string utf8String = utf8.GetString(utf8Bytes);

    Console.WriteLine(utfString);
    Console.WriteLine(utf8String);

Output:

    blah blah 9+4
    blah blah 9+4
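On the question's side point about UTF-8: the cast-based conversion only appears to work there because the test string is pure ASCII, where each character maps to the same single byte. A minimal sketch of where it stops working, assuming a test string with one non-ASCII character (the string and the class name `Utf8CastDemo` are invented for this example):

```csharp
using System;
using System.Text;

class Utf8CastDemo // hypothetical name, for illustration only
{
    static void Main()
    {
        // "café 9+4", with the accented letter written as an escape.
        string text = "caf\u00E9 9+4";

        // Casting keeps only the low 8 bits of each UTF-16 code unit,
        // which is not a UTF-8 encoding for anything outside ASCII.
        byte[] castBytes = new byte[text.Length];
        for (int i = 0; i < text.Length; i++)
        {
            castBytes[i] = (byte)text[i];
        }

        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);

        Console.WriteLine(castBytes.Length); // 8
        Console.WriteLine(utf8Bytes.Length); // 9 (U+00E9 takes two bytes in UTF-8)

        Console.WriteLine(Encoding.UTF8.GetString(castBytes) == text); // False
        Console.WriteLine(Encoding.UTF8.GetString(utf8Bytes) == text); // True
    }
}
```

So UTF-8 has no reserved '+' to trip over, but the cast is still the wrong tool as soon as the input leaves the ASCII range.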
