c++ - Glib::ustring and Japanese characters

Question

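Glib::ustring is supposed to work well with UTF-8, but I run into a problem when handling Japanese strings.

When I compare the two strings "わたし" and "ワタシ" using either the == operator or the compare method, they are reported as equal.

I don't understand why. How does Glib::ustring work?

The only way I have found to make the comparison fail is to compare strings of different lengths, for example "海外わたわ" and "海外わた".

Very strange...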

score 1 · Accepted Answer

#include <iostream>
#include <glibmm/ustring.h>
int main() {
  Glib::ustring s1 = "わたし";
  Glib::ustring s2 = "ワタシ";
  std::cerr << (s1 == s2) << std::endl;
  return 0;
}

Output: 0

Edit: However, I dug a little deeper:

#include <iostream>
#include <locale>
#include <glibmm.h>
int main() {
  Glib::ustring s1 = "わたし";
  Glib::ustring s2 = "ワタシ";
  std::cout << (s1 == s1) << std::endl;
  std::cout << (s1 == s2) << std::endl;
  std::locale::global(std::locale(""));
  std::cout << (s1 == s1) << std::endl;
  std::cout << (s1 == s2) << std::endl;
  std::cout << s1 << std::endl;
  std::cout << s2 << std::endl;
  return 0;
}

Output:

1
0
1
1
わたし
ワタシ

So the result of s1 == s2 changes from 0 to 1 once the global locale is set, which does look strange.

score 1 · Accepted Answer

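Glib::ustring::compare uses g_utf8_collate() internally, which compares strings according to the collation rules of the current locale. Could your locale be set to something other than Japanese?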
