c - GCC 4.7 Source Character Encoding and Execution Character Encoding For String Literals?

Question

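Does GCC 4.7 on Linux/x86_64 have a default character encoding that it uses to validate and decode the contents of string literals in C source files? Is this configurable?

Further, when the string data from string literals is linked into the data section of the output, is there a default execution character encoding? Is this configurable?

In any configuration, is it possible to have a source character encoding that differs from the execution character encoding? (That is, will gcc transcode between character encodings?)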

score 13 · Accepted Answer

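I don't know how well these options actually work (I'm not using them at the moment; I still prefer to treat string literals as "ASCII only", since localized strings come from external files anyway, so the literals are mostly things like format strings or filenames), but GCC has added options like these: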

-fexec-charset=charset
Set the execution character set, used for string and character constants. The default
is UTF-8. charset can be any encoding supported by the system's iconv library routine. 

-fwide-exec-charset=charset
Set the wide execution character set, used for wide string and character constants.
The default is UTF-32 or UTF-16, whichever corresponds to the width of wchar_t. As
with -fexec-charset, charset can be any encoding supported by the system's iconv
library routine; however, you will have problems with encodings that do not fit
exactly in wchar_t.

-finput-charset=charset
Set the input character set, used for translation from the character set of the
input file to the source character set used by GCC. If the locale does not specify,
or GCC cannot get this information from the locale, the default is UTF-8. This can
be overridden by either the locale or this command line option. Currently the command
line option takes precedence if there's a conflict. charset can be any encoding
supported by the system's iconv library routine.
