mysql - Replacing garbage characters in MySQL

Question

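My database is full of sequences like â€&quot; or ��&quot; (depending on whether my terminal is set to latin1 or Unicode, respectively). From context, I think they are supposed to be em dashes. They appear to be causing nasty bugs when rendered (or not rendered) in IE. I would like to find and replace them. The problem is that neither the â nor the � character matches in replace. Running the query: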

    update TABLE set COLUMN = replace(COLUMN,'��&quot;','---');

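runs without error but does nothing (0 rows changed). When I copy the "diamond question mark" characters into the terminal, it is clear that they are not being matched. Is there a way to find out their code and match on that? The mysql console is so close to letting me do this in one line that I would rather not write a script outside the terminal if I can avoid it.

The db is hosted on Amazon RDS, so I can't install the regexp UDF referenced in other questions here. In the long run the whole db needs to be properly converted to utf8, but this rendering problem needs to be fixed right away.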

Edit:

I isolated the bad characters with hexdump. They are e280 (which I don't think corresponds to any Unicode character). How can I feed that to the replace function?

    update TABLE set COLUMN = replace(COLUMN, char(0xe2,0x80),'---');

does nothing.

score 1 · Accepted Answer

I figured it out. I used mysql's built-in hex function to dump an entry that I knew was bad.

    select hex(column) from table where id=666;

Then I picked out the words (the runs of hex digits sandwiched between "20"s, the code for a space) and discovered that my offending set of bytes was in fact x'C3A2E282AC2671756F743B'. How this corresponds to the way I saw it encoded in PHP and by my system (as e2 80) I don't know, and at this point I don't really care.
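
One plausible reading of the e280 mismatch, offered as a guess rather than something the answer confirms: the stored bytes C3A2E282AC are the UTF-8 encoding of the two characters â and €, and those same two characters occupy bytes E2 and 80 in cp1252, which MySQL's latin1 follows for the euro sign. A minimal sketch of that check from the mysql console, assuming the server has the utf8 and latin1 character sets available:

```sql
-- Treat the stored bytes as UTF-8 text, transcode that text to latin1,
-- and dump the resulting bytes as hex.
-- Assumption: latin1 on this server maps the euro sign to byte 0x80 (cp1252 behaviour).
select hex(convert(convert(x'C3A2E282AC' using utf8) using latin1));
```

If that returns E280, the bytes seen in hexdump and the bytes stored in the column are the same text at two different encoding layers, which would also explain why char(0xe2,0x80) matched nothing.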

To verify before destroying any data, plug that back into mysql:

    select x'C3A2E282AC2671756F743B';
    +---------------------------+
    | x'C3A2E282AC2671756F743B' |
    +---------------------------+
    | â€&quot;                  |
    +---------------------------+
    1 row in set (0.00 sec)

So, using the replace query like above, I was able to get rid of all the bad data at once.

For the record it was:

    update TABLE set COLUMN = replace(COLUMN, x'C3A2E282AC2671756F743B','--');
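
Before running an update like this, it may be worth counting how many rows contain the byte sequence, and running the same count afterward to confirm that nothing is left. A minimal sketch, where posts and body are hypothetical stand-ins for TABLE and COLUMN:

```sql
-- posts and body are hypothetical names; substitute the real table and column.
-- The hex literal is a binary string, so the comparison is done on bytes.
select count(*) as rows_with_bad_bytes
from posts
where instr(body, x'C3A2E282AC2671756F743B') > 0;
```

A count of zero before the update would suggest the sequence was copied incorrectly from the hex dump.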

I really hope this is useful for someone. Though encoding snafus appear to be pretty common in mysql, I searched everywhere and I couldn't find an explanation for this ultimately rather simple process.


Answered 2012-02-13T22:09:40.767
