c# - How to continue a loop when an error occurs while parsing byte data

Question

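My question follows on from this one: (Loop for reading different data types and sizes from a very large byte array from a file)

I have a raw byte stream stored in a file (rawbytes.txt or bytes.data) that I need to parse and write out to CSV-style text files.

The raw byte input (when read as characters/longs/ints, etc.) looks something like this: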

A2401028475764B241102847576511001200C...

Once parsed, it should look like this:

OutputA.txt

(Field1,Field2,Field3) - heading

A,240,1028475764

OutputB.txt

(Field1,Field2,Field3,Field4,Field5) - heading

B,241,1028475765,1100,1200

OutputC.txt

C,...//and so on

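Basically, the input is a hex-dump-style stream of bytes that is continuous, with no line terminators or gaps between the pieces of data that need to be parsed. As shown above, the data consists of different data types one after another.

Here is a snippet of my code. Since no field contains a comma and there is no need for "" (i.e. a CSV wrapper), I simply use a TextWriter to create the CSV-style text files, as follows: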

if (File.Exists(fileName))
        {
        using (BinaryReader reader = new BinaryReader(File.Open(fileName, FileMode.Open)))
            {
        while (reader.BaseStream.Position != reader.BaseStream.Length)
            {
                inputCharIdentifier = reader.ReadChar();
                switch (inputCharIdentifier)
                     case 'A':

                        field1 = reader.ReadUInt64();
                        field2 = reader.ReadUInt64();
                        field3 = reader.ReadChars(10);
                        string strtmp = new string(field3);
                        //and so on
                        using (TextWriter writer = File.AppendText("outputA.txt"))
                        {
                            writer.WriteLine(field1 + "," + field2 + "," + strtmp); // +  
                        }
                        case 'B':
                        //code...

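My question stems from the fact that some of the raw byte data contains null values. This makes it hard to parse, because there is an unknown number of null bytes (or non-null, out-of-place bytes) between consecutive data blocks (each of which starts with A, B or C when the block is not corrupted).

Question

So how do I add a default case, or some other mechanism, so that the loop continues even when an error occurs because of corrupted or faulty data? Would the following code work?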

    inputCharIdentifier = reader.ReadChar();
    ...
    case default:
    //I need to know what to add here, instead of default 
    //(i.e. the case when the character could not be read)
    while (binReader.PeekChar() != -1)
    {
         filling = binReader.readByte();
         //filling is a single byte
         try {
             fillingChar = Convert.ToChar(filling);

             break;
         }
         catch (Exception ex) { break; }
         if (fillingChar == 'A' || fillingChar == 'B')
             break;

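The remaining part is adding code to each switch case (e.g. 'A') so that the program continues without stopping. Is there a way to do this without multiple try-catch blocks? [For example, the character identifier of the block is A, but the bytes after A are corrupted. In that case I need to either exit the loop or read (i.e. skip) a defined number of bytes, which would be known here if the message header correctly identifies the remaining bytes.]

[Note: the cases A, B and so on have different input sizes, e.g. A is 40 bytes in total and B is 50 bytes. So a fixed-size buffer such as inputBuf[1000] or [50], which would work well if they were all the same size, does not work well here.]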
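The point about differing block sizes can be pictured with a per-identifier length table: the whole body is read into a buffer of exactly the right size, and a single length check stands in for a try-catch around every field read. This is only a minimal sketch. `RecordReader`, `BodyLength` and `ReadBody` are hypothetical names, and the lengths 39 and 49 are placeholders (the 40 and 50 bytes mentioned above minus the identifier byte), not the real layouts.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class RecordReader
{
    // Hypothetical body lengths in bytes (block size minus the 1-byte identifier).
    // The real values depend on the actual record layouts.
    public static readonly Dictionary<byte, int> BodyLength = new Dictionary<byte, int>
    {
        { (byte)'A', 39 },
        { (byte)'B', 49 },
    };

    // Reads the body that follows the given identifier.
    // Returns null when the identifier is unknown (nothing is consumed)
    // or when the stream ends before the body is complete.
    public static byte[] ReadBody(BinaryReader reader, byte id)
    {
        int length;
        if (!BodyLength.TryGetValue(id, out length))
            return null;

        // ReadBytes returns a shorter array instead of throwing at end of stream.
        byte[] body = reader.ReadBytes(length);
        return body.Length == length ? body : null;
    }
}
```

The fields could then be pulled out of the returned buffer with `BitConverter.ToUInt64(body, 0)`, `BitConverter.ToInt32(body, offset)` and so on, so a truncated block is detected once, before any field is decoded.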

Any suggestions? Please help! I am relatively new to C# (2 months)...

Update: my entire code is as follows:

         class Program
{
    const string fileName = "rawbytes.txt";
    static void Main(string[] args)
    {
                    try
        {
            var program = new Program();
            program.Parser();
        }
        catch (Exception e)
        {
            Console.WriteLine(e);
        }
        Console.ReadLine();
    }
    public void Parser()
    {
        char inputCharIdentifier = 'Z';
        //only because without initializing inputCharIdentifier I ended up with an error
        //note that in the real code, 'Z' is not a switch-case alphabet
        //it's an "inconsequential character", i.e. i have defined it to be 'Z'
        //just to avoid that error, and to avoid leaving it as a null value
        ulong field1common = 0;
        ulong field2common = 0;
        char[] charArray = new char[10];
        char char1;
        char char2;
        char char3;
        int valint1 = 0;
        int valint2 = 0;
        int valint3 = 0;
        int valint4 = 0;
        int valint5 = 0;
        int valint6 = 0;
        int valint7 = 0;
        double valdouble;
        /*
        char[] filler = new char[53];
        byte[] filling = new byte[4621];
        byte[] unifiller = new byte[8];
        //these values above were temporary measures to manually filter through
        //null bytes - unacceptable for the final program
        */
        if (File.Exists(fileName))
        {
            using (BinaryReader reader = new BinaryReader(File.Open(fileName, FileMode.Open)))
            {
                while (reader.BaseStream.Position != reader.BaseStream.Length)
                {
                    //inputCharIdentifier = reader.ReadChar();
                    //if (inputCharIdentifier != null)
                    //{
                        try
                        {
                            inputCharIdentifier = reader.ReadChar();
                            try
                            {
                                switch (inputCharIdentifier)
                                {
                                    case 'A':

                                        field1common = reader.ReadUInt64();
                                        field2common = reader.ReadUInt64();
                                        //unifiller = reader.ReadBytes(8);
                                        //charArray = reader.ReadString();
                                        //result.ToString("o");
                                        //Console.WriteLine(result.ToString());
                                        charArray = reader.ReadChars(10);
                                        string charArraystr = new string(charArray);
                                        char1 = reader.ReadChar();
                                        valint1 = reader.ReadInt32();
                                        valint2 = reader.ReadInt32();
                                        valint3 = reader.ReadInt32();
                                        valint4 = reader.ReadInt32();
                                        using (TextWriter writer = File.AppendText("A.txt"))
                                        {
                                            writer.WriteLine(field1common + "," + /*result.ToString("o")*/ field2common + "," + charArraystr + "," + char1 + "," + valint1 + "," + valint2 + "," + valint3 + "," + valint4);
                                            writer.Close();
                                        }
                                        break;


                                    case 'B':
                                    case 'C':

                                        field1common = reader.ReadUInt64();
                                        field2common = reader.ReadUInt64();
                                        //charArray = reader.ReadString();
                                        charArray = reader.ReadChars(10);
                                        string charArraystr2 = new string(charArray);
                                        char1 = reader.ReadChar();
                                        valint1 = reader.ReadInt32();
                                        valint2 = reader.ReadInt32();
                                        using (TextWriter writer = File.AppendText("C.txt"))
                                        {
                                            writer.WriteLine(field1common + "," + result2.ToString("o") + "," + charArraystr2 + "," + char1 + "," + valint1 + "," + valint2);
                                            writer.Close();
                                        }
                                        break;
                                    case 'S':
                                        //market status message
                                        field1common = reader.ReadUInt64();
                                        char2 = reader.ReadChar();
                                        char3 = reader.ReadChar();
                                        break;
                                    case 'L':
                                        filling = reader.ReadBytes(4);
                                        break;
                                    case 'D':
                                    case 'E':
                                        field1common = reader.ReadUInt64();
                                        field2common = reader.ReadUInt64();
                                        //charArray = reader.ReadString();
                                        charArray = reader.ReadChars(10);
                                        string charArraystr3 = new string(charArray);
                                        //char1 = reader.ReadChar();
                                        valint1 = reader.ReadInt32();
                                        valint2 = reader.ReadInt32();
                                        valint5 = reader.ReadInt32();
                                        valint7 = reader.ReadInt32();
                                        valint6 = reader.ReadInt32();
                                        valdouble = reader.ReadDouble();
                                        using (TextWriter writer = File.AppendText("D.txt"))
                                        {
                                            writer.WriteLine(field1common + "," + result3.ToString("o") + "," + charArraystr3 + "," + valint1 + "," + valint2 + "," + valint5 + "," + valint7 + "," + valint6 + "," + valdouble);
                                            writer.Close();
                                        }
                                        break;
                                    }
                            }
                            catch (Exception ex)
                            {
                                Console.WriteLine("Parsing didn't work");
                                Console.WriteLine(ex.ToString());
                                break;
                            }
                        }
                        catch (Exception ex)
                        {
                            Console.WriteLine("Here's why the character read attempt didn't work");
                            Console.WriteLine(ex.ToString());

                            continue;
                            //continue;
                        }
                    //}
                }
            }
            }
            }

The error I receive is as follows:

    Here's why the character read attempt didn't work

    System.ArgumentException: The output char buffer is too small to contain the decoded characters, encoding 'Unicode (UTF-8)' fallback 'System.Text.DecoderReplacementFallback'.
    Parameter name: chars
    at System.Text.Encoding.ThrowCharsOverflow()
    at System.Text.Encoding.ThrowCharsOverflow(DecoderNLS decoder, Boolean nothingDecoded)
    at System.Text.UTF8Encoding.GetChars(Byte* bytes, Int32 byteCount, Char* chars, Int32 charCount, DecoderNLS baseDecoder)
    at System.Text.DecoderNLS.GetChars(Byte* bytes, Int32 byteCount, Char* chars, Int32 charCount, Boolean flush)
    at System.Text.DecoderNLS.GetChars(Byte[] bytes, Int32 byteIndex, Int32 byteCount, Char[] chars, Int32 charIndex, Boolean flush)
    at System.Text.DecoderNLS.GetChars(Byte[] bytes, Int32 byteIndex, Int32 byteCount, Char[] chars, Int32 charIndex)
    at System.IO.BinaryReader.InternalReadOneChar()
    at System.IO.BinaryReader.Read()
    at System.IO.BinaryReader.ReadChar()
    at line 69: i.e. inputCharIdentifier = reader.ReadChar();

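Update: a sample file that produces the same error as above is available at the following link: http://www.wikisend.com/download/106394/rawbytes.txt

In particular, note that there are 8 unexpected null bytes between consecutive data blocks, even though the data block header (i.e. inputCharIdentifier) is valid. The number of bytes that follow such a header is always unpredictable and generally varies. My problem is that I need to be able to remove or skip such a run of bytes until the next available uncorrupted data block occurs. In the sample file, that run is the 8 out-of-place null bytes.

The 8 null bytes can be located in the file as follows: byte counter 1056, line 2, column 783 (according to Notepad++).

The crux of the problem is that the run of 8 null bytes could be of any size (3, 7, 15, 50, etc.). As a direct result of the data corruption, its length is never known in advance. With "traditional" data corruption, a fixed number of bytes inside a data block, say 50, might be unreadable, and could therefore be skipped (by exactly that number of bytes). The corruption I face is different: it consists of an unknown number of bytes between valid data blocks.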
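One way to picture the skipping described above is a resynchronisation step: read byte by byte until a known identifier turns up, then step back one byte so the normal switch can process that block. The sketch below is an illustration under assumptions, not a tested solution for the sample file. `Resync` and `SkipToNextHeader` are hypothetical names, the identifier set is taken from the switch cases in the code above, and the stream is assumed to be seekable (a FileStream is).

```csharp
using System;
using System.IO;

public static class Resync
{
    // Identifiers taken from the switch cases in the code above.
    private const string Headers = "ABCDELS";

    // Advances the stream until the next byte is a known identifier.
    // Returns the number of bytes skipped. The stream is left positioned
    // on the identifier, or at end of stream if none was found.
    public static long SkipToNextHeader(Stream stream)
    {
        long skipped = 0;
        int b;
        while ((b = stream.ReadByte()) != -1)
        {
            if (Headers.IndexOf((char)b) >= 0)
            {
                // Step back so the caller reads the identifier itself.
                stream.Seek(-1, SeekOrigin.Current);
                return skipped;
            }
            skipped++;
        }
        return skipped;
    }
}
```

Calling `Resync.SkipToNextHeader(reader.BaseStream)` from a default case would skip a run of null bytes of any length. Note the limitation: a junk byte that happens to equal an identifier such as 'A' would be taken for a header, so the block that follows should still be validated.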

score 2 · Accepted Answer

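You cannot assign a case to these situations because the target variable (inputCharIdentifier) holds a null value, so a condition that avoids these cases is enough. To be completely sure, I have also included a try...catch (if an error occurs while performing any of the given actions, the code automatically skips to the next iteration).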

using (BinaryReader reader = new BinaryReader(File.Open(fileName, FileMode.Open), Encoding.ASCII))
{
    while (reader.BaseStream.Position != reader.BaseStream.Length)
    {
        // the try...catch sits inside the loop so that a failed read
        // skips to the next iteration instead of ending the loop
        try
        {
            inputCharIdentifier = reader.ReadChar();
            // a char is never null in C#; a null byte is read as '\0'
            if (inputCharIdentifier != '\0')
            {
                switch (inputCharIdentifier)
                {
                    case 'A':
                        field1 = reader.ReadUInt64();
                        field2 = reader.ReadUInt64();
                        field3 = reader.ReadChars(10);
                        string strtmp = new string(field3);
                        //and so on
                        using (TextWriter writer = File.AppendText("outputA.txt"))
                        {
                            writer.WriteLine(field1 + "," + field2 + "," + strtmp);
                        }
                        break;
                    case 'B':
                        //code...
                        break;
                }
            }
        }
        catch (Exception)
        {
            // ignore the faulty data and carry on with the next iteration
        }
    }
}
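The stack trace in the question comes from the UTF-8 decoder, which BinaryReader uses by default when no encoding is passed; specifying Encoding.ASCII as above sidesteps it. Another option, shown here as a sketch that is not part of the original answer, is to read the identifier as a raw byte so that no text decoding takes place at all. `IdentifierReader` and `ReadIdentifier` are hypothetical names.

```csharp
using System;
using System.IO;

public static class IdentifierReader
{
    // Returns the next identifier as a char, or null at end of stream.
    // ReadByte never invokes a text decoder, so out-of-place bytes
    // (including 0x00 and values above 0x7F) cannot raise a decoding error.
    public static char? ReadIdentifier(BinaryReader reader)
    {
        if (reader.BaseStream.Position >= reader.BaseStream.Length)
            return null;

        return (char)reader.ReadByte();
    }
}
```

A null byte then arrives as '\0' and simply falls through to the default case of the switch.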
