haskell - Aeson does not decode strings containing Unicode characters

Question

I'm trying to decode a JSON string using Data.Aeson ( https://hackage.haskell.org/package/aeson-0.6.1.0/docs/Data-Aeson.html ), but it fails to parse strings that contain non-standard characters.

As an example, take this file:

import Data.Aeson
import Data.ByteString.Lazy.Char8 (pack)

test1 :: Maybe Value
test1 = decode $ pack "{ \"foo\": \"bar\"}"

test2 :: Maybe Value
test2 = decode $ pack "{ \"foo\": \"bòz\"}"

Running it in ghci gives the following results:

*Main> :l ~/test.hs
[1 of 1] Compiling Main             ( /Users/ltomlin/test.hs, interpreted )
Ok, modules loaded: Main.
*Main> test1
Just (Object fromList [("foo",String "bar")])
*Main> test2
Nothing

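Is there a reason it fails to parse strings containing Unicode characters? I was under the impression that Haskell was pretty good with Unicode. Any suggestions would be greatly appreciated!

Thanks,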

テチギ

Edit

Investigating further using eitherDecode, I get the following error message:

 *Main> test2
 Left "Failed reading: Cannot decode byte '\\x61': Data.Text.Encoding.decodeUtf8: Invalid UTF-8 stream"

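x61 is the Unicode character for 'z', which comes right after the special Unicode character. I don't understand why it fails to read the characters after the special character!

If I instead change test2 to test2 = decode $ pack "{ \"foo\": \"bòz\"}", I get the error: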

Left "Failed reading: Cannot decode byte '\\xf2': Data.Text.Encoding.decodeUtf8: Invalid UTF-8 stream"

This is the character for 'ò', which makes a bit more sense.

score 7 · Accepted Answer

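The problem is your use of pack from the Char8 module, which doesn't work for non-Latin-1 data: it truncates each Char to a single byte, so 'ò' becomes the lone byte 0xF2, which is not valid UTF-8. Instead, use encodeUtf8 from text.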
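As a rough sketch of that suggestion, assuming the lazy variants Data.Text.Lazy.pack and Data.Text.Lazy.Encoding.encodeUtf8 (chosen here because decode takes a lazy ByteString), the example could look something like the following. The names test2Fixed, test2FixedEither, truncatedBytes and utf8Bytes are made up for illustration and do not appear in the question.

```haskell
import Data.Aeson (Value, decode, eitherDecode)
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BLC
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Encoding (encodeUtf8)
import Data.Word (Word8)

-- Hypothetical replacement for test2: go through Text and encode as UTF-8
-- instead of truncating every Char to one byte with Char8.pack.
test2Fixed :: Maybe Value
test2Fixed = decode . encodeUtf8 . TL.pack $ "{ \"foo\": \"bòz\"}"

-- Same input through eitherDecode, to see an error message if parsing fails.
test2FixedEither :: Either String Value
test2FixedEither = eitherDecode . encodeUtf8 . TL.pack $ "{ \"foo\": \"bòz\"}"

-- What Char8.pack produces for 'ò' (U+00F2): the single byte 0xF2.
truncatedBytes :: [Word8]
truncatedBytes = BL.unpack (BLC.pack "ò")

-- What UTF-8 encoding produces for 'ò': the two bytes 0xC3 0xB2.
utf8Bytes :: [Word8]
utf8Bytes = BL.unpack (encodeUtf8 (TL.pack "ò"))
```

With this change test2Fixed should come back as a Just value rather than Nothing. Comparing truncatedBytes with utf8Bytes shows why the original failed: the parser was handed a bare 0xF2, which is the byte named in the second error message.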
