c++ - File operations in C on different architectures

Question

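As part of a research project I am creating an abstraction layer on top of the standard C (BINARY) file handling library (stdio), providing some extra features for file handling with transactions.

The workflow is as follows: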

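The user opens a file using the API (or the standard fopen). Both return a FILE*. The file is opened in BINARY mode!
The user writes data to the file using standard library calls (such as fwrite).
The user opens a transaction on the opened file using the API: TRANSACTION a = trans_start(FILE*)
The user sets data validators for the TRANSACTION object: set_validator(TRANSACTION, int(*)(char*))
The user "writes" data to the file using the API's own function: int trans_write_string(TRANSACTION*, char*, length)
- In reality this "write" places the data in the memory of the validators defined above, which can perform operations on the data and set some flags somewhere ... not relevant to the question.
The user calls trans_commit(TRANSACTION) to actually write the data to the file. Depending on the flags set by the validators, this might NOT write the data to the file but report an error to the user instead (which can be resolved programmatically ... not really relevant to the question).
The user closes the file using the standard fclose.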

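So far the API has only a string handling method (trans_write_string), which works fine. It builds up its own in-memory data buffer, modifies the contents if required, and calls the validators. On consecutive calls it appends the new data to its internal memory buffer and handles the allocations. On a successful commit it writes the data to the file using fwrite (yes, this is mostly a C project, but C++ answers are not excluded either).

But now I want to (... have to) extend the API so that it can also write numbers (16 bit, 32 bit, 64 bit) and floats, in a manner very similar to what the standard C stdio API does. Reusing the existing string implementation, this assumes a data buffer in memory that holds N bytes of characters (the string itself), followed by, say, M bytes for another string, 8 bytes for a 64-bit value, 2 bytes for a 16-bit value, and so on.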

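I am stuck at the point of "how to represent a number in the file so that it can also be read by someone else who uses a different computer/architecture/OS/endianness".

Inserting the number into the memory stream is theoretically possible via a cast to char (char* addr = (char*)&my_16bit_int), placing *(addr) and *(addr + 1) at the required address (i.e. after the N characters of the string), and writing that to the file is possible too. But what if I want to read the resulting file on a different architecture with a different endianness? And what if the "other" computer is just an ancient 16-bit pile of metal? What happens to the 64-bit values written in the file in that case?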
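
To make the concern concrete, here is a minimal sketch of what the cast approach does. The helper names put_u16_raw and host_is_little_endian are made up for illustration, and 8-bit bytes are assumed. It copies the in-memory representation of a 16-bit value into a byte buffer, so the order of the two bytes follows the host:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies the object representation of v, which is
 * what reading it through a char* does. The byte order in out[] is
 * whatever the host uses. */
static void put_u16_raw(unsigned char out[2], uint16_t v)
{
    assert(out != NULL);
    memcpy(out, &v, sizeof v);
}

/* Hypothetical helper: returns 1 when the least significant byte of a
 * 16-bit value is stored first (little endian), 0 otherwise. */
static int host_is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

For 0x1122 the buffer holds 22 11 on a little-endian host and 11 22 on a big-endian host, so a file produced this way is only readable where the byte order happens to match.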

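What good practices exist for solving this kind of problem?

EDIT: The target file must be binary. It will be accompanied by a text file (XML) describing its format (such as: N 8-byte characters, one 16-bit value, etc.). This text file is generated from the output of our beloved validators. A validator "says" things like: yes, I accept this 16-bit value; no, I reject this long string; and so on. Someone else creates the data format XML based on this "output".

EDIT2: Yes, we need to share the file across various platforms, even 20-year-old, huge, fridge-sized boxes :)

EDIT3: Yes, we need floats too!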

score 2 · Accepted Answer

Casting is not sufficient. I think the sockets functions htons and htonl are a sufficient solution for int16 and int32. For int64 you should build the conversion yourself, since there is no standard function for it.

Note that all of these functions reverse the byte order only if needed, so you can use the same function to 'fix' a number back to host order when reading:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

//fixed-width types are used because unsigned long is 8 bytes on many 64-bit platforms
typedef union{
    unsigned char c[2];
    uint16_t s;
}U2;

//you can use the standard htons or this
//(named my_htons so that it cannot clash with the standard declaration)
uint16_t my_htons(uint16_t s)
{
    U2 mask,res;
    unsigned char* p = (unsigned char*)&s;
    mask.s = 0x0001; //c is {1,0} on a little-endian host, {0,1} on a big-endian one
    res.c[mask.c[0]] = p[0];
    res.c[mask.c[1]] = p[1];
    return res.s;
}

//the same for 4 bytes
typedef union{
    unsigned char c[4];
    uint16_t s[2];
    uint32_t l;
}U4;

//you can use the standard htonl or this
uint32_t my_htonl(uint32_t l)
{
    U4 mask,res;
    unsigned char* p = (unsigned char*)&l;
    mask.l = 0x00010203; //c is {3,2,1,0} on little endian, {0,1,2,3} on big endian
    res.c[mask.c[0]] = p[0];
    res.c[mask.c[1]] = p[1];
    res.c[mask.c[2]] = p[2];
    res.c[mask.c[3]] = p[3];
    return res.l;
}

typedef union{
    unsigned char c[8];
    unsigned char c2[2][4];
    uint16_t s[4];
    uint32_t l[2];
    uint64_t ll;
}U8;

//for int64 you can use a 64-bit mask and do the same, or do it as 2*4 bytes like I did
//you can pass a void pointer as well..
uint64_t my_htonll(uint64_t ll)//void my_htonll(void* arg, void* result)
{
    U2 mask1;
    U4 mask2;
    U8 res;

    unsigned char* p = (unsigned char*)&ll; //or (unsigned char*)arg
    mask1.s = 0x0001;
    mask2.l = 0x00010203;
    //I didn't use a 64-bit mask for the conversion
    res.c2[mask1.c[0]][mask2.c[0]] = p[0];
    res.c2[mask1.c[0]][mask2.c[1]] = p[1];
    res.c2[mask1.c[0]][mask2.c[2]] = p[2];
    res.c2[mask1.c[0]][mask2.c[3]] = p[3];
    res.c2[mask1.c[1]][mask2.c[0]] = p[4];
    res.c2[mask1.c[1]][mask2.c[1]] = p[5];
    res.c2[mask1.c[1]][mask2.c[2]] = p[6];
    res.c2[mask1.c[1]][mask2.c[3]] = p[7];

    //memcpy(result,res.c,8);
    return res.ll;
}
//or if you want to use the 4-byte version:
uint64_t my_htonll2(uint64_t ll)
{
    U2 mask1;
    U8 in,res;
    mask1.s = 0x0001;
    in.ll = ll; //read the two halves through the union instead of casting the pointer
    res.l[0] = my_htonl(in.l[mask1.c[0]]);
    res.l[1] = my_htonl(in.l[mask1.c[1]]);
    return res.ll;
}

int main(void)
{
    uint16_t s = 0x1122;
    printf("%x\n", (unsigned)my_htons(s));
    uint32_t l = 0x11223344;
    printf("%" PRIx32 "\n", my_htonl(l));
    uint64_t ll = 0x1122334455667788ULL;
    printf("%" PRIx64 "\n", my_htonll(ll));
    printf("%" PRIx64 "\n", my_htonll2(ll));
    return 0;
}
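
As a sketch of the "fix it back" remark, the following uses the standard htonl and ntohl from <arpa/inet.h> (POSIX systems only) to place a 32-bit value into a byte buffer and to read it back. The names store_u32_net and load_u32_net are hypothetical and not part of any library:

```c
#include <arpa/inet.h> /* POSIX: htonl, ntohl */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: stores host_value in network (big-endian) order. */
static void store_u32_net(unsigned char out[4], uint32_t host_value)
{
    uint32_t net = htonl(host_value); /* swaps only on little-endian hosts */
    assert(out != NULL);
    memcpy(out, &net, sizeof net);
}

/* Hypothetical helper: reads 4 bytes in network order, returns host order. */
static uint32_t load_u32_net(const unsigned char in[4])
{
    uint32_t net;
    memcpy(&net, in, sizeof net);
    return ntohl(net);
}
```

Whatever the host byte order, the buffer for 0x11223344 holds 11 22 33 44, and load_u32_net returns the original value on the reading side.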

score 1 · Accepted Answer

You need to define a format, or choose an existing binary format such as XDR, and read and write that. So, for example, to write a 32-bit integer in XDR:

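#include <stdio.h>
#include <stdint.h>

/* Writes the low 32 bits of value, most significant byte first (XDR is big-endian). */
void
write32Bits( FILE* dest, uint_least32_t value )
{
    putc( (value >> 24) & 0xFF, dest );
    putc( (value >> 16) & 0xFF, dest );
    putc( (value >>  8) & 0xFF, dest );
    putc( (value      ) & 0xFF, dest );
}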
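
The reading side mirrors this. Below is a minimal sketch of a counterpart; the name read32Bits and its return convention are assumptions for illustration and are not taken from XDR or any library:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical counterpart to write32Bits: reads 4 bytes, most significant
 * first, and assembles them with shifts so the host byte order never
 * matters. Returns 0 on success, -1 on EOF or read error. */
static int read32Bits(FILE *src, uint_least32_t *value)
{
    uint_least32_t result = 0;
    int i;

    assert(src != NULL && value != NULL);
    for (i = 0; i < 4; ++i) {
        int c = getc(src);
        if (c == EOF)
            return -1;
        result = (result << 8) | (uint_least32_t)(c & 0xFF);
    }
    *value = result;
    return 0;
}
```

Because the value is built arithmetically rather than by copying memory, the same code works unchanged on little-endian and big-endian readers.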

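Floating point is more complicated, but if you are willing to restrict your platforms to those which support IEEE float, you can type pun float to uint32_t and double to uint64_t, and output them as unsigned ints. Similarly, if you restrict yourself to 2's complement machines with a 32-bit integral type, you can use the shift-and-mask procedure above on signed values as well (the integral types will then be uint32_t and int32_t).

With regards to portability: I think IEEE is universal except for mainframes, and 2's complement is universal except for some very exotic mainframes. (IBM mainframes are 2's complement, but not IEEE. Unisys mainframes are not 2's complement and have no 32-bit integral type. I am not sure what other mainframes still exist, but in the past there were all sorts of exotics.)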
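
A minimal sketch of both ideas, under the restrictions just stated (float is IEEE 754 binary32 and is stored with the same byte order as 32-bit integers on the host). All function names are hypothetical. The pun is done with memcpy, and the bytes are written to a memory buffer rather than a FILE*:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Shift-and-mask, most significant byte first. */
static void put_u32_be(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)((v >> 24) & 0xFF);
    out[1] = (unsigned char)((v >> 16) & 0xFF);
    out[2] = (unsigned char)((v >>  8) & 0xFF);
    out[3] = (unsigned char)( v        & 0xFF);
}

static uint32_t get_u32_be(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] <<  8) |  (uint32_t)in[3];
}

/* Assumption: float is IEEE 754 binary32, so its bits fit a uint32_t. */
static void put_float_be(unsigned char out[4], float f)
{
    uint32_t bits;
    assert(sizeof f == sizeof bits);
    memcpy(&bits, &f, sizeof bits);
    put_u32_be(out, bits);
}

static float get_float_be(const unsigned char in[4])
{
    uint32_t bits = get_u32_be(in);
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Signed values: int32_t is 2's complement by definition, so it can go
 * through uint32_t and back without losing information. */
static void put_i32_be(unsigned char out[4], int32_t v)
{
    put_u32_be(out, (uint32_t)v);
}

static int32_t get_i32_be(const unsigned char in[4])
{
    uint32_t bits = get_u32_be(in);
    int32_t v;
    memcpy(&v, &bits, sizeof v);
    return v;
}
```

The same pattern extends to double with uint64_t and eight bytes.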

score 1 · Accepted Answer

If you are using glibc, you can use its functions from "endian.h" for le <-> be conversion:

SYNOPSIS
   #define _BSD_SOURCE             /* See feature_test_macros(7) */
   #include <endian.h>

   uint16_t htobe16(uint16_t host_16bits);
   uint16_t htole16(uint16_t host_16bits);
   uint16_t be16toh(uint16_t big_endian_16bits);
   uint16_t le16toh(uint16_t little_endian_16bits);

   uint32_t htobe32(uint32_t host_32bits);
   uint32_t htole32(uint32_t host_32bits);
   uint32_t be32toh(uint32_t big_endian_32bits);
   uint32_t le32toh(uint32_t little_endian_32bits);

   uint64_t htobe64(uint64_t host_64bits);
   uint64_t htole64(uint64_t host_64bits);
   uint64_t be64toh(uint64_t big_endian_64bits);
   uint64_t le64toh(uint64_t little_endian_64bits);
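
A short usage sketch for the 64-bit pair, assuming glibc on Linux. The helper names store_u64_be and load_u64_be are made up for illustration. It defines _DEFAULT_SOURCE, which newer glibc versions use in place of _BSD_SOURCE:

```c
#define _DEFAULT_SOURCE /* newer glibc; older releases used _BSD_SOURCE */
#include <endian.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: stores host_value as 8 big-endian bytes. */
static void store_u64_be(unsigned char out[8], uint64_t host_value)
{
    uint64_t be = htobe64(host_value); /* no-op on big-endian hosts */
    assert(out != NULL);
    memcpy(out, &be, sizeof be);
}

/* Hypothetical helper: reads 8 big-endian bytes into a host-order value. */
static uint64_t load_u64_be(const unsigned char in[8])
{
    uint64_t be;
    memcpy(&be, in, sizeof be);
    return be64toh(be);
}
```

Choosing big endian for the file is an arbitrary decision here; htole64 and le64toh work the same way if the format is defined as little endian.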

If you are not using glibc, take a look at glibc-2.18/bits/byteswap.h.
