php - Is UTF-8 encoding/decoding necessary in PHP?

Question

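OK, in the function below I read a UTF-8 file, locate the comments I previously wrote into it, and remove the text between those comments. My question is: do I need to do anything differently here for this to work correctly on UTF-8 files, or will the code below work as it is? Basically, I'm wondering whether I need utf8_decode, utf8_encode, or iconv.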

// This holds the current file we are working on.
$lang_file = 'files/DreamTemplates.russian-utf8.php';

// Can't read from the file if it doesn't exist now can we?
if (!file_exists($lang_file))
    continue;

// This helps to remove the language strings for the template, since the comment is unique
$template_begin_comment = '// ' . ' Template - ' . $lang_file . ' BEGIN...';
$template_end_comment = '// ' . ' Template - ' . $lang_file . ' END!';

$fp = fopen($lang_file, 'rb');
$content = fread($fp, filesize($lang_file));
fclose($fp);

// Searching within the string, extracting only what we need.
$start = strpos($content, $template_begin_comment);
$end = strpos($content, $template_end_comment);

// We can't do this unless both are found.
if ($start !== false && $end !== false)
{
    $begin = substr($content, 0, $start);
    $finish = substr($content, $end + strlen($template_end_comment));

    $new_content = $begin . $finish;

    // Write it into the file.
    $fo = fopen($lang_file, 'wb');
    @fwrite($fo, $new_content);
    fclose($fo);
}

Thanks for any help with UTF-8 encoding and decoding of strings, including strings that contain comments.

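When I write the PHP comments into the UTF-8 file, I don't use any conversion. Should I? The string definitions between the PHP comments are already UTF-8 encoded and seem to work fine in the file. Any help here would be appreciated.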

score 1 · Accepted Answer

You could use preg_replace to do this instead:

$content = file_get_contents($lang_file);

$template_begin_comment = '// ' . ' Template - ' . $lang_file . ' BEGIN...';
$template_end_comment = '// ' . ' Template - ' . $lang_file . ' END!';

// find from begin comment to end comment
// replace with emptiness
// keep track of how many replacements have been made
$new_content = preg_replace('/' . 
      preg_quote($template_begin_comment, '/') . 
      '.*?' . 
      preg_quote($template_end_comment, '/') . '/s', 
    '', 
    $content, 
    -1, 
    $replace_count
);

if ($replace_count) {
  // if replacements have been made, write the file back again
  file_put_contents($lang_file, $new_content);
}

Because the pattern being matched contains only ASCII and everything else is copied through unchanged, this approach is safe enough for a UTF-8 file.
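The reason this is safe is that in UTF-8 every byte of a multibyte character has its high bit set, so an ASCII-only marker can never match in the middle of one. A minimal sketch to check this on sample data follows; the file name and the Russian strings are made up for illustration, and the script itself is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical sample data; only the marker format comes from the code above.
$lang_file = 'files/Example.russian-utf8.php';
$template_begin_comment = '// ' . ' Template - ' . $lang_file . ' BEGIN...';
$template_end_comment = '// ' . ' Template - ' . $lang_file . ' END!';

$content = "\$txt['hello'] = 'Привет';\n"
    . $template_begin_comment . "\n"
    . "\$txt['removed'] = 'Удалить';\n"
    . $template_end_comment . "\n"
    . "\$txt['bye'] = 'Пока';\n";

// Same pattern as above: ASCII markers, non-greedy match, /s so "." spans newlines.
$pattern = '/' . preg_quote($template_begin_comment, '/')
    . '.*?'
    . preg_quote($template_end_comment, '/') . '/s';

$new_content = preg_replace($pattern, '', $content, -1, $replace_count);

// preg_match with the /u modifier fails on invalid UTF-8, so it doubles as a validity check.
$still_valid_utf8 = preg_match('//u', $new_content) === 1;

var_dump($replace_count);
var_dump($still_valid_utf8);
var_dump(strpos($new_content, 'Привет') !== false);  // text outside the markers is kept
var_dump(strpos($new_content, 'Удалить') === false); // text between the markers is gone
```

The /u modifier is deliberately left off the replacement pattern itself: byte-wise matching is enough here, and it avoids preg_replace returning null if the file happens to contain an invalid sequence elsewhere.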

Disclaimer

The code above is untested. Let me know if you run into any problems.

score 1 · Accepted Answer

No, you don't need to convert anything.
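To illustrate why, here is a rough sketch comparing a plain read/write round trip with a conversion to ISO-8859-1, which is what utf8_decode targets (that function is deprecated as of PHP 8.2, so mb_convert_encoding is used here as a stand-in). It assumes the mbstring extension is available and that the script is saved as UTF-8; the sample string and temporary file are made up for illustration.

```php
<?php
// Hypothetical sample line from a language file.
$original = "\$txt['hello'] = 'Привет';";

// Round trip through a temporary file with no conversion at all.
$tmp = tempnam(sys_get_temp_dir(), 'utf8');
file_put_contents($tmp, $original);
$read_back = file_get_contents($tmp);
unlink($tmp);

// Converting to ISO-8859-1 cannot represent Cyrillic, so those characters are lost.
$converted = mb_convert_encoding($original, 'ISO-8859-1', 'UTF-8');

var_dump($read_back === $original);                 // bytes survive untouched
var_dump(strpos($converted, 'Привет') !== false);   // the Cyrillic text does not survive
```

Since the file is already UTF-8 and the comments being written are plain ASCII, passing the bytes through unchanged is the only step needed.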

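Also, your extraction code is reliable in the sense that it will not break multibyte characters, although you may want to check that the end position actually comes after the start position.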
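One way to enforce that ordering, sketched here under the assumption of PHP 7.1 or later, is to search for the end marker only from the byte just past the begin marker. The helper name remove_between is hypothetical and not part of the original code.

```php
<?php
/**
 * Hypothetical helper: removes everything from $begin_marker through
 * $end_marker (inclusive). Returns null when the pair is not found in order.
 */
function remove_between(string $content, string $begin_marker, string $end_marker): ?string
{
    $start = strpos($content, $begin_marker);
    if ($start === false) {
        return null;
    }

    // Start the search after the begin marker, so $end > $start is guaranteed.
    $end = strpos($content, $end_marker, $start + strlen($begin_marker));
    if ($end === false) {
        return null;
    }

    // Byte offsets are fine here: both cuts fall on the edges of ASCII markers.
    return substr($content, 0, $start) . substr($content, $end + strlen($end_marker));
}

// Usage: a stray end marker before the begin marker is ignored.
$content = "A\n// END!\nB\n// BEGIN...\nПривет\n// END!\nC\n";
$result = remove_between($content, '// BEGIN...', '// END!');
var_dump($result === "A\n// END!\nB\n\nC\n");
```

With the original two independent strpos calls, the stray end marker in this sample would have been found first and the two substr calls would have duplicated part of the file instead of removing the block.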
