これはあなたのニーズに合うかもしれません(しかし、私にはわかりません)。ロケールと utf8_decode を設定し、mt_detect_encoding のmb_check_encoding
代わりに使用すると、有用な出力が得られるようです。
// some text from http://chinesenotes.com/chinese_text_l10n.php
// have tried both as string and content loaded from a file
$chinese = '譧躆 礛簼繰 剆坲姏 潧 騔鯬 跠 瘱瘵瘲 忁曨曣 蛃袚觙';
$chinese=utf8_decode($chinese);
$chinese_encodings ='EUC-CN,HZ,GBK,CP936,GB18030,EUC-TW,BIG5,CP950,BIG5-HKSCS,BIG5-HKSCS:2004,BIG5-HKSCS:2001,BIG5-HKSCS:1999,ISO-2022-CN,ISO-2022-CN-EXT';
$encodings = explode(',',$chinese_encodings);
//set chinese locale
setlocale(LC_CTYPE, 'Chinese');
foreach($encodings as $encoding) {
if (@mb_check_encoding($chinese, $encoding)) {
echo 'The string seems to be compatible with '.$encoding.'<br>';
} else {
echo 'Not compatible with '.$encoding.'<br>';
}
}
出力
The string seems to be compatible with EUC-CN
The string seems to be compatible with HZ
The string seems to be compatible with GBK
The string seems to be compatible with CP936
Not compatible with GB18030
The string seems to be compatible with EUC-TW
The string seems to be compatible with BIG5
The string seems to be compatible with CP950
Not compatible with BIG5-HKSCS
Not compatible with BIG5-HKSCS:2004
Not compatible with BIG5-HKSCS:2001
Not compatible with BIG5-HKSCS:1999
Not compatible with ISO-2022-CN
Not compatible with ISO-2022-CN-EXT
全くの推測です。現在、少なくとも中国語のエンコーディングの一部を認識しているようです。完全ジャンクでしたら削除します。