c++ - Handling Hunspell suggestions with special characters

Question

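I have integrated Hunspell into an unmanaged C++ app on Windows 7 using Visual Studio 2010.

Spell checking and suggestions work for English, and I am now trying to get them working for Spanish. Whenever I get Spanish suggestions, the ones that contain accented characters are not converted correctly into std::wstring objects.

Here is an example of a suggestion returned by the Hunspell->suggest method:

[Image: result of Hunspell->suggest(...)]

This is the code I am using to convert the std::string to a std::wstring: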

std::wstring StringToWString(const std::string& str)
{
    std::wstring convertedString;
    int requiredSize = MultiByteToWideChar(CP_UTF8, 0, str.c_str(), -1, 0, 0);
    if(requiredSize > 0)
    {
        std::vector<wchar_t> buffer(requiredSize);
        MultiByteToWideChar(CP_UTF8, 0, str.c_str(), -1, &buffer[0], requiredSize);
        convertedString.assign(buffer.begin(), buffer.end() - 1);
    }

    return convertedString;
}

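After running it, I get this, with a funky character at the end:

[Image: after converting to wstring]

Can anyone help me understand what is going wrong in the conversion? I suspect it is related to the negative char values returned by Hunspell, but I don't know how to turn them into something the std::wstring conversion code can handle.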

score 1 · Accepted Answer

It looks like Hunspell's output is 8-bit text in code page 852. Use 852 instead of CP_UTF8; see http://msdn.microsoft.com/en-us/library/windows/desktop/dd317756(v=vs.85).aspx

Alternatively, configure Hunspell to return UTF-8.
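Rather than hard-coding a code page, one option is to pick it from the encoding the dictionary declares. The following is a minimal sketch, not part of Hunspell: the helper name `CodePageForEncoding` and its table are assumptions made for illustration, and the table covers only a few of the encoding names a dictionary's SET line may use.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not part of Hunspell): map a dictionary encoding name,
// as written on the SET line of an .aff file, to a Windows code page
// identifier. Returns 0 for names this small table does not know.
unsigned int CodePageForEncoding(std::string name)
{
    // Upper-case the name so the comparison is case-insensitive.
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });

    if (name == "UTF-8")      return 65001; // same value as CP_UTF8
    if (name == "ISO8859-1")  return 28591; // Latin 1, Western European
    if (name == "ISO8859-2")  return 28592; // Latin 2, Central European
    if (name == "ISO8859-15") return 28605; // Latin 9
    if (name == "KOI8-R")     return 20866;
    if (name == "KOI8-U")     return 21866;
    return 0; // unknown: the caller has to decide on a fallback
}
```

Assuming your Hunspell version exposes `get_dic_encoding()` on the Hunspell object, its return value could be passed to this helper, and the result used as the first argument of MultiByteToWideChar in place of CP_UTF8.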

score 1 · Accepted Answer

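Hunspell's output appears to be 8-bit text in code page 28591 (ISO 8859-1 Latin 1; Western European (ISO)), which I found by looking at the default settings of the Hunspell command-line utility on Unix.

Changing CP_UTF8 to 28591 works for me.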

// Updated code page to 28591 from CP_UTF8
std::wstring StringToWString(const std::string& str)
{
    std::wstring convertedString;
    int requiredSize = MultiByteToWideChar(28591, 0, str.c_str(), -1, 0, 0);
    if(requiredSize > 0)
    {
        std::vector<wchar_t> buffer(requiredSize);
        MultiByteToWideChar(28591, 0, str.c_str(), -1, &buffer[0], requiredSize);
        convertedString.assign(buffer.begin(), buffer.end() - 1);
    }

    return convertedString;
}

The list of code pages on MSDN is what helped me find the correct code page integer.
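For ISO 8859-1 specifically, the conversion can also be sketched without the Windows API, because each Latin-1 byte value equals the Unicode code point of the same number. The function name `Latin1ToWString` is an assumption for illustration. The sketch also shows why the "negative" chars matter: where char is signed, bytes above 0x7F must go through unsigned char before widening.

```cpp
#include <string>

// Hypothetical helper: convert ISO 8859-1 (Latin-1) bytes to a wide string.
// Latin-1 byte values 0x00-0xFF equal Unicode code points U+0000-U+00FF,
// so each byte can be widened directly.
std::wstring Latin1ToWString(const std::string& str)
{
    std::wstring result;
    result.reserve(str.size());
    for (std::string::size_type i = 0; i < str.size(); ++i)
    {
        // Cast through unsigned char first: where char is signed, bytes above
        // 0x7F are negative and would otherwise sign-extend to wrong values.
        result.push_back(static_cast<wchar_t>(static_cast<unsigned char>(str[i])));
    }
    return result;
}
```

This only applies when the dictionary really is Latin-1; for any other encoding, MultiByteToWideChar with the matching code page is still needed.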
