java - Android: Unicode to readable string

Question

When I read text from a web page, I have a problem with Unicode characters shown in a TextView.

I use the following code to fetch the web content:

try {
    HttpGet request = new HttpGet();
    request.addHeader("User-Agent", USER_AGENT);
    request.setURI(new URI(wwwlink));
    try {
        response4 = httpClient.execute(request);
    } catch (ClientProtocolException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
} catch (URISyntaxException e) {
    e.printStackTrace();
}
try {
    in2 = null;
    String UTF8 = "UTF-8";
    in2 = new BufferedReader(new InputStreamReader(response4.getEntity().getContent(), UTF8));
} catch (IllegalStateException e) {
    Log.i(tag, e.toString());
} catch (IOException e) {
    Log.i(tag, e.toString());
}

The page I am reading has the following meta tag in its HTML head:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>

The problem is this: I read a line, and the text I need contains Unicode escape sequences such as:

20 \u00b0C (20 degree symbol C )

I am trying to convert this so that it is shown as a degree sign in the TextView.

The following works:

textview.setText("\u00b0");

But when the line I read contains the Unicode escape and I do this:

line = in2.readLine();
textview.setText(line);

the TextView shows the escape sequence literally, for example: some text \u00b0 some text

I have checked everything on both the emulator and a phone.

score 0 · Accepted Answer

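The input text contains the Java-style escaped representation of Unicode characters (a literal backslash, the letter u, and four hex digits), so you have to replace such sequences with the actual characters yourself. To give a rough idea, here is an example of how to replace one such character in a string: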

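    String input = "some text \\u00b0 some text";
    Scanner scanner = new Scanner(input);
    // Find the first backslash-u escape followed by four hex digits (null if there is none).
    String unicodeCharStr = scanner.findWithinHorizon("\\\\u[0-9a-fA-F]{4}", 0);
    if (unicodeCharStr != null) {
        // Parse the four hex digits after the backslash-u prefix into a char.
        char unicodeChar = (char) Integer.parseInt(unicodeCharStr.substring(2, 6), 16);
        input = input.replace(unicodeCharStr, String.valueOf(unicodeChar));
    }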
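
The snippet above only handles the first escape it finds. If a line may contain several different escapes, the same idea can be extended with java.util.regex. The following is a minimal sketch, not a definitive implementation: the class name UnicodeUnescaper and the method name unescape are made up for illustration, and it assumes every escape has exactly four hex digits, as in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (name chosen for this example only).
final class UnicodeUnescaper {
    // A literal backslash, the letter u, then exactly four hex digits (captured).
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Returns a copy of input with every such escape replaced by the character it encodes.
    static String unescape(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Convert the captured hex digits into a single UTF-16 code unit.
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement stops a decoded backslash or dollar sign
            // from being interpreted as replacement syntax.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With the variables from the question, this could be used as textview.setText(UnicodeUnescaper.unescape(line)). Text without escapes passes through unchanged.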
