java - Simple data serialization in C

Question

I am currently redesigning an application and have run into a problem serializing some data.

Say I have an array of size m x n

double **data;

which I want to serialize into a

char *dataSerialized

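using simple delimiters (one for rows, one for elements).

Deserialization is fairly straightforward: count the delimiters and allocate storage for the data accordingly. But what about the serialize function, say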

serialize_matrix(double **data, int m, int n, char **dataSerialized);

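What would be the best strategy for determining the size needed for the char array and allocating the appropriate memory for it?

Perhaps use a fixed-width exponential representation of the doubles in a string? Is it possible to just convert all the bytes of each double into chars and have a sizeof(double)-aligned char array? How would I keep the precision of the numbers intact?

NOTE:

I need the data in a char array, not in binary and not in a file.

The serialized data will be sent over the wire using ZeroMQ between a C server and a Java client. Given the array dimensions and sizeof(double), is it possible to always reconstruct it accurately between those two?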

score 3 · Accepted Answer

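Java has pretty good support for reading raw bytes and converting them into whatever you need. You can decide on a simple wire format, then serialize to it in C and deserialize in Java.

Here is an example of a very simple format, with code to serialize and deserialize.

I wrote a slightly larger test program that I can dump somewhere if you want. It creates a random data array in C, serializes it, and writes the serialized string, base64 encoded, to stdout. A much smaller Java program then reads, decodes, and deserializes it.

C code to serialize: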

/*
I'm using this format:
32 bit signed int                   32 bit signed int                   See below
[number of elements in outer array] [number of elements in inner array] [elements]

[elements] is built like
[element(0,0)][element(0,1)]...[element(0,y)][element(1,0)]...

Each element is sent as a 64 bit IEEE 754 "double". If your C compiler/architecture is doing something different with its "double"s, look forward to hours of fun :)

I'm using a couple of non-standard functions for byte-swapping here (htobe32, htobe64), originally from a BSD, but present in glibc>=2.9.
*/
#define _DEFAULT_SOURCE /* exposes htobe32/htobe64 in glibc's <endian.h> under strict -std modes */
#include <endian.h>     /* on the BSDs this is <sys/endian.h> */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Calculate the bytes required to store a message of x*y doubles */
size_t calculate_size(size_t x, size_t y)
{
    /* The two dimensions of the array, each in 32 bits (2 * 4) */
    size_t sz = 8;
    /* a 64 bit IEEE 754 double is by definition 8 bytes long :) */
    sz += ((x * y) * 8);
    /* and a NUL */
    sz++;
    return sz;
}

/* Helpers */
static char* write_int32(int32_t, char*);
static char* write_double(double, char*);
/* Actual conversion. That wasn't so hard, was it?
   dst must point to at least calculate_size(x, y) bytes. */
void convert_data(double** src, size_t x, size_t y, char* dst)
{

    dst = write_int32((int32_t) x, dst);
    dst = write_int32((int32_t) y, dst);

    /* size_t counters avoid a signed/unsigned comparison with x and y */
    for(size_t i = 0; i < x; i++) {
        for(size_t j = 0; j < y; j++) {
            dst = write_double(src[i][j], dst);
        }
    }
    *dst = '\0';
}


static char* write_int32(int32_t num,  char* c)
{
    /* Convert to network byte order (big-endian) */
    uint32_t be = htobe32((uint32_t) num);
    memcpy(c, &be, sizeof be);
    return c + sizeof be;
}

static char* write_double(double d, char* c)
{
    /* Here I'm assuming your C programs use IEEE 754 'double' precision natively.
    If you don't, you should be able to convert into this format. A helper library most likely already exists for your platform.
    Note that IEEE 754 endianness isn't defined, but in practice, normal platforms use the same byte order as they do for integers.
*/
    uint64_t num;
    /* Copy the bit pattern with memcpy; casting &d to uint64_t* would break strict aliasing */
    memcpy(&num, &d, sizeof num);
    /* convert to network byte order */
    num = htobe64(num);
    memcpy(c, &num, sizeof num);
    return c + sizeof num;
}
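To check the wire format without a Java client, the read side can also be sketched in C. This is a minimal sketch and not part of the original test program: `parse_matrix`, `read_be32` and `read_be_double` are hypothetical names, the result is returned as a flat row-major array for simplicity, and the code assumes the host `double` is a 64-bit IEEE 754 value with the same byte order as the host's integers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Read a big-endian unsigned 32-bit value from 4 bytes. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Read a big-endian IEEE 754 double from 8 bytes. */
static double read_be_double(const unsigned char *p)
{
    uint64_t bits = 0;
    double d;
    assert(sizeof d == sizeof bits); /* assumption: 64-bit double */
    for (int i = 0; i < 8; i++)
        bits = (bits << 8) | p[i];
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Hypothetical helper: parse [x][y][x*y doubles] into a flat row-major
   array. Returns NULL if the message is malformed or truncated.
   The caller frees the result. */
double *parse_matrix(const unsigned char *buf, size_t len,
                     size_t *x, size_t *y)
{
    if (len < 8)
        return NULL;
    uint32_t dx = read_be32(buf);
    uint32_t dy = read_be32(buf + 4);
    if (dx == 0 || dy == 0 || dx > INT32_MAX || dy > INT32_MAX)
        return NULL;
    /* Reject headers that claim more elements than the buffer holds;
       written as divisions so the check itself cannot overflow. */
    if ((size_t)dx > (len - 8) / 8 / (size_t)dy)
        return NULL;

    size_t count = (size_t)dx * (size_t)dy;
    double *out = malloc(count * sizeof *out);
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++)
        out[i] = read_be_double(buf + 8 + 8 * i);
    *x = dx;
    *y = dy;
    return out;
}
```

Element (i, j) of the parsed matrix is then `out[i * y + j]`. The length check matters on a network boundary: the header is untrusted input, so the dimensions should never be used for allocation before being compared against the number of bytes actually received.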

Java code to deserialize:

/* Needs java.io.ByteArrayInputStream, java.io.DataInputStream and java.io.IOException */
/* The raw char array from C is now read into the byte[] `bytes` in Java */
DataInputStream stream = new DataInputStream(new ByteArrayInputStream(bytes));

int dim_x; int dim_y;
double[][] data;

try {   
    dim_x = stream.readInt();
    dim_y = stream.readInt();
    data = new double[dim_x][dim_y];
    for(int i = 0; i < dim_x; ++i) {
        for(int j = 0; j < dim_y; ++j) {
            data[i][j] = stream.readDouble();
        }
    }

    System.out.println("Client:");
    System.out.println("Dimensions: "+dim_x+" x "+dim_y);
    System.out.println("Data:");
    for(int i = 0; i < dim_x; ++i) {
        for(int j = 0; j < dim_y; ++j) {
            System.out.print(" "+data[i][j]);
        }
        System.out.println();
    }


} catch(IOException e) {
    System.err.println("Error reading input");
    System.err.println(e.getMessage());
    System.exit(1);
}

score 1 · Accepted Answer

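If you are writing a binary file, you need to think of a good way to serialize the actual binary data (64 bits) of a double. This could range from writing the content of the double directly to the file (minding endianness) to more elaborate normalizing serialization schemes (e.g. with a well-defined representation of NaN). It's up to you, really. If you expect to stay basically among homogeneous architectures, a direct memory dump will probably suffice.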
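As an illustration of the "minding endianness" and "well-defined NaN" points, here is a minimal sketch of a normalizing writer. `store_double_be` is a hypothetical name; the code assumes a 64-bit IEEE 754 `double` stored in the same byte order as the host's integers, and it maps every NaN to the quiet NaN pattern 0x7FF8000000000000, which is also the canonical NaN that Java's `Double.doubleToLongBits` produces.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write d to out[0..7] as big-endian IEEE 754 bits.
   Uses shifts instead of htobe64, so it needs no non-standard headers. */
void store_double_be(double d, unsigned char out[8])
{
    uint64_t bits;
    assert(sizeof d == sizeof bits); /* assumption: 64-bit double */
    memcpy(&bits, &d, sizeof bits);

    /* Normalize: all NaN payloads collapse to one canonical quiet NaN. */
    if (d != d)
        bits = UINT64_C(0x7FF8000000000000);

    /* Emit the most significant byte first, regardless of host order. */
    for (int i = 7; i >= 0; i--) {
        out[i] = (unsigned char)(bits & 0xFF);
        bits >>= 8;
    }
}
```

Because the bytes are produced by shifting the integer value, the output is the same on little-endian and big-endian hosts, which is what makes the format safe to exchange between heterogeneous machines.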

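If you want to write to a text file and are looking for an ASCII representation, I would strongly discourage a decimal numeric representation. Instead, you could convert the 64-bit raw data to ASCII using base64 or something like that.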
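The base64 route can be sketched with a small standard-alphabet encoder. This is an illustrative sketch rather than the code anyone in this thread used; `base64_encode` is a hypothetical name, and the caller is assumed to supply a destination buffer of at least 4 * ((len + 2) / 3) + 1 chars.

```c
#include <assert.h>
#include <stddef.h>

static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical helper: encode len bytes of src as NUL-terminated base64
   in dst. Returns the number of chars written, excluding the NUL. */
size_t base64_encode(const unsigned char *src, size_t len, char *dst)
{
    size_t o = 0;
    assert(dst != NULL);
    for (size_t i = 0; i < len; i += 3) {
        size_t rem = len - i; /* bytes left, at least 1 */
        /* Pack up to three bytes into a 24-bit group. */
        unsigned long v = (unsigned long)src[i] << 16;
        if (rem > 1) v |= (unsigned long)src[i + 1] << 8;
        if (rem > 2) v |= src[i + 2];
        /* Emit four 6-bit symbols, padding a short final group with '='. */
        dst[o++] = B64[(v >> 18) & 63];
        dst[o++] = B64[(v >> 12) & 63];
        dst[o++] = rem > 1 ? B64[(v >> 6) & 63] : '=';
        dst[o++] = rem > 2 ? B64[v & 63] : '=';
    }
    dst[o] = '\0';
    return o;
}
```

The eight big-endian bytes of the double 1.0 (3F F0 00 00 00 00 00 00) encode to the 12-character string `P/AAAAAAAAA=`. On the Java side, `java.util.Base64.getDecoder().decode(...)` turns such a string back into the raw bytes.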

You really want to keep all the precision that you have in your double!
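To see how easily a decimal text form loses bits, the following sketch prints a double in exponential notation with a chosen number of significant digits and parses it back. `survives_decimal` is a hypothetical helper, not something from this thread, and it assumes a C library whose `printf` and `strtod` round correctly (as glibc's do).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if d is recovered exactly after being
   printed with `digits` significant digits in exponential notation and
   parsed back with strtod, 0 otherwise. */
int survives_decimal(double d, int digits)
{
    char buf[64];
    /* "%.*e" prints one digit before the point plus (digits - 1) after. */
    int n = snprintf(buf, sizeof buf, "%.*e", digits - 1, d);
    assert(n > 0 && (size_t)n < sizeof buf);
    return strtod(buf, NULL) == d;
}
```

With 6 significant digits, a value such as 1.0/3.0 does not come back bit-for-bit, whereas 17 significant digits are enough to recover any finite IEEE 754 double. A decimal format is therefore only safe when the digit count is chosen deliberately, which is one reason a raw-bits encoding is the more robust choice.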
