c++ - How can I use a regular expression to match a string consisting only of Chinese characters?

Question

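I want a regular expression that matches only strings made up entirely of Chinese characters, with no English or other characters. [\u4e00-\u9fa5] does not work at all, and [^x00-xff] also matches punctuation and characters from other languages.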

boost::wregex reg(L"\\w*");
bool b = boost::regex_match(L"我a", reg);    // expected to be false
b = boost::regex_match(L"我,", reg);         // expected to be false
b = boost::regex_match(L"我", reg);          // expected to be true

score 3 · Accepted Answer

Boost built with ICU supports Unicode character classes. I think you are looking for the \p{Han} script. Alternatively, U+4E00..U+9FFF is \p{InCJK_Unified_Ideographs}.
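
For comparison, the U+4E00..U+9FFF block mentioned above can also be checked without any regex engine. The following is only a minimal sketch: it swaps the regex for a plain code point loop, it assumes the input is already decoded into a std::u32string, and the helper names in_cjk_unified_block and is_all_cjk_unified are hypothetical. Note that the Han script covers more than this one block (for example the extension blocks), so this is narrower than \p{Han}.

```cpp
#include <string>

// Hypothetical helper: true if cp lies in the CJK Unified Ideographs
// block (U+4E00..U+9FFF). This is narrower than the full Han script.
constexpr bool in_cjk_unified_block(char32_t cp) {
    return cp >= 0x4E00 && cp <= 0x9FFF;
}

// Hypothetical helper: true only for a non-empty string in which every
// code point falls inside that block.
bool is_all_cjk_unified(const std::u32string& text) {
    if (text.empty()) {
        return false;
    }
    for (char32_t cp : text) {
        if (!in_cjk_unified_block(cp)) {
            return false;
        }
    }
    return true;
}
```

Called as is_all_cjk_unified(U"\u6211"), the sketch accepts a string that is a single ideograph, while a string that also contains a Latin letter or a punctuation mark is rejected.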

score 1 · Accepted Answer

The following regular expression works fine.

boost::wregex reg(L"^[\u4e00-\u9fa5]+");
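
To try the same character range without a Boost installation, here is a minimal sketch that uses std::wregex from the standard library as a stand-in for boost::wregex. It assumes the compiler accepts \u escapes in wide string literals and that wchar_t can hold these code points; the helper name matches_cjk_only is hypothetical. The range stops at U+9FA5 as in the line above, so a few later code points of the block are not covered.

```cpp
#include <regex>
#include <string>

// Hypothetical helper; std::wregex stands in for boost::wregex here.
bool matches_cjk_only(const std::wstring& text) {
    // regex_match succeeds only if the whole string matches the pattern,
    // so no ^ or $ anchors are needed.
    static const std::wregex pattern(L"[\u4e00-\u9fa5]+");
    return std::regex_match(text, pattern);
}
```

With the three inputs from the question, the intent is that only the string consisting of a single ideograph is accepted, because regex_match rejects any input containing a character outside the range.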

answered 2013-03-29T08:34:30.587
