c# - Converting a byte array to a string and back in C#

Question

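I'm trying to open a file (as bytes), convert it to a string so I can mess with some metadata in the header, convert it back to bytes, and save it. The problem I'm running into right now is with this code. When I compare the original byte array with the bytes I get after converting to a string and back (without altering anything), they are not equal. How can I make this work?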

public static byte[] StringToByteArray(string str)
{
    UTF8Encoding encoding = new UTF8Encoding();
    return encoding.GetBytes(str);
}

public string ByteArrayToString(byte[] input)
{
    UTF8Encoding enc = new UTF8Encoding();
    string str = enc.GetString(input);
    return str;
}

This is how I'm comparing them:

byte[] fileData = GetBinaryData(filesindir[0], Convert.ToInt32(fi.Length));
string fileDataString = ByteArrayToString(fileData);
byte[] recapturedBytes = StringToByteArray(fileDataString);
Response.Write((fileData == recapturedBytes));

I'm pretty sure it's UTF-8:

StreamReader sr = new StreamReader(filesindir[0]);
Response.Write(sr.CurrentEncoding);

This returns "System.Text.UTF8Encoding".

score 16 · Accepted Answer

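Try the static members on the Encoding class, which give you ready-made instances of the various encodings. You shouldn't need to instantiate an Encoding just to convert to and from a byte array. How are you comparing the strings in code?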
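A minimal sketch of that suggestion, using the static Encoding.UTF8 property instead of constructing a UTF8Encoding on every call. The class name EncodingHelpers is made up for illustration; the method names mirror the ones in the question. Note that this only round-trips cleanly when the input bytes are valid UTF-8 text.

```csharp
using System.Text;

// Hypothetical helper class; the name is not from the original post.
public static class EncodingHelpers
{
    // Encodes a .NET string as UTF-8 bytes (no byte-order mark is added).
    public static byte[] StringToByteArray(string str)
    {
        return Encoding.UTF8.GetBytes(str);
    }

    // Decodes UTF-8 bytes into a .NET string.
    public static string ByteArrayToString(byte[] input)
    {
        return Encoding.UTF8.GetString(input);
    }
}
```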

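Edit

You're comparing the arrays, not the strings. They are not equal because they refer to two different arrays. The == operator only compares the references, not the values. You need to go through each element of the arrays to determine whether they are equal.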

public bool CompareByteArrays(byte[] lValue, byte[] rValue)
{
    if(lValue == rValue) return true; // referentially equal
    if(lValue == null || rValue == null) return false; // one is null, the other is not
    if(lValue.Length != rValue.Length) return false; // different lengths

    for(int i = 0; i < lValue.Length; i++)
    {
        if(lValue[i] != rValue[i]) return false;
    }

    return true;
}
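If LINQ is available, the same element-by-element check can be sketched with Enumerable.SequenceEqual. The class and method names here are illustrative only, not part of the original answer.

```csharp
using System.Linq;

// Hypothetical helper; names are for illustration only.
public static class ByteArrayComparison
{
    // Null-safe, element-wise comparison of two byte arrays.
    public static bool AreEqual(byte[] left, byte[] right)
    {
        if (object.ReferenceEquals(left, right)) return true; // same array, or both null
        if (left == null || right == null) return false;      // exactly one is null
        return left.SequenceEqual(right);                     // compares elements in order
    }
}
```

In the question's code, the comparison would then read ByteArrayComparison.AreEqual(fileData, recapturedBytes) rather than fileData == recapturedBytes.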

score 7 · Accepted Answer

When you have raw bytes (8-bit possibly-not-printable characters) and want to manipulate them as a .NET string and turn them back into bytes, you can do so by using

Encoding.GetEncoding(1252)

instead of UTF8Encoding. That encoding works to take any 8-bit value and convert it to a .NET 16-bit char, and back again, without losing any information.
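A rough sketch of that kind of single-byte round trip. Note the swap: it uses ISO-8859-1 (Latin-1) rather than code page 1252, because Latin-1 maps every byte value 0x00 to 0xFF to the Unicode code point with the same number, and on newer .NET runtimes code page 1252 may need an extra encoding provider to be registered first. The class and method names are hypothetical.

```csharp
using System.Text;

// Hypothetical helper; assumes ISO-8859-1 as the single-byte encoding.
public static class BinaryStringRoundTrip
{
    // ISO-8859-1 maps byte N to char N for all 256 byte values.
    private static readonly Encoding Latin1 = Encoding.GetEncoding("iso-8859-1");

    // Each byte becomes exactly one char with the same numeric value.
    public static string ToBinaryString(byte[] data)
    {
        return Latin1.GetString(data);
    }

    // Only valid for strings whose chars are all in the range U+0000 to U+00FF.
    public static byte[] FromBinaryString(string text)
    {
        return Latin1.GetBytes(text);
    }
}
```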

In the specific case you describe above, with a binary file, you will not be able to "mess with metadata in the header" and have things work correctly unless the length of the data you mess with is unchanged. For example, if the header contains

{any}{any}ABC{any}{any}

and you want to change ABC to DEF, that should work as you'd like. But if you want to change ABC to WXYZ, you will have to write over the byte that follows "C" or you will (in essence) move everything one byte further to the right. In a typical binary file, that will mess things up greatly.

If the bytes after "ABC" are spaces or null characters, there's a better chance that writing larger replacement data will not cause trouble -- but you still cannot just replace ABC with WXYZ in the .NET string, making it longer -- you would have to replace ABC{whatever_follows_it} with WXYZ. Given that, you might find that it's easier just to leave the data as bytes and write the replacement data one byte at a time.
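As an illustration of leaving the data as bytes, here is a sketch that overwrites a marker in place and refuses any replacement of a different length. It assumes the marker is plain ASCII text; HeaderPatcher and ReplaceSameLength are made-up names.

```csharp
using System;
using System.Text;

// Hypothetical helper; assumes the header field is ASCII text.
public static class HeaderPatcher
{
    // Overwrites the first occurrence of oldValue with newValue, in place.
    // Returns false if oldValue is not found. The array length never changes.
    public static bool ReplaceSameLength(byte[] data, string oldValue, string newValue)
    {
        byte[] oldBytes = Encoding.ASCII.GetBytes(oldValue);
        byte[] newBytes = Encoding.ASCII.GetBytes(newValue);
        if (oldBytes.Length == 0 || oldBytes.Length != newBytes.Length)
            throw new ArgumentException("Replacement must have the same byte length as the original.");

        for (int i = 0; i <= data.Length - oldBytes.Length; i++)
        {
            // Count how many bytes match starting at offset i.
            int j = 0;
            while (j < oldBytes.Length && data[i + j] == oldBytes[j]) j++;

            if (j == oldBytes.Length)
            {
                // Full match: write the new bytes over the old ones.
                Buffer.BlockCopy(newBytes, 0, data, i, newBytes.Length);
                return true;
            }
        }
        return false;
    }
}
```

With this shape, changing ABC to DEF succeeds and leaves every other byte where it was, while changing ABC to WXYZ throws instead of silently shifting the rest of the file.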

score 5 · Accepted Answer

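Because .NET strings are Unicode strings, you can no longer do this the way people did in C. In most cases, you should not even attempt to go back and forth between string and byte array unless the contents are actually text.

I have to make this point clear: in .NET, if the byte[] data is not text, do not attempt to convert it to a string, except for the special Base64 encoding used to send binary data over a text channel. Thinking otherwise is a widely held misunderstanding among people who work in .NET.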
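A small sketch of the difference, assuming arbitrary binary input. Base64 is designed to carry any bytes through a string, while decoding bytes that are not valid UTF-8 replaces the invalid sequences, so the original bytes cannot be recovered. BinaryTextDemo and its method names are invented for illustration.

```csharp
using System;
using System.Text;

// Hypothetical demo class; names are for illustration only.
public static class BinaryTextDemo
{
    // Base64 round trip: returns bytes identical to the input.
    public static byte[] RoundTripBase64(byte[] data)
    {
        string text = Convert.ToBase64String(data);
        return Convert.FromBase64String(text);
    }

    // UTF-8 round trip: invalid byte sequences are replaced with U+FFFD
    // during decoding, so non-text input comes back as different bytes.
    public static byte[] RoundTripUtf8(byte[] data)
    {
        string text = Encoding.UTF8.GetString(data);
        return Encoding.UTF8.GetBytes(text);
    }
}
```

For example, an input such as { 0xFF, 0xFE, 0x00, 0x41 } survives RoundTripBase64 unchanged but not RoundTripUtf8, because 0xFF and 0xFE can never appear in valid UTF-8.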

score 3 · Accepted Answer

The problem appears to be the way you are comparing the byte arrays:

Response.Write((fileData == recapturedBytes));

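This will always return false, because you are comparing the addresses of the byte arrays, not the values they contain. Either compare the string data, or use a method that compares the byte arrays element by element. You could also do this instead: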

Response.Write(Convert.ToBase64String(fileData) == Convert.ToBase64String(recapturedBytes));
