I am trying to help a friend with a project that was supposed to take one hour and has now taken three days. Needless to say, I feel very frustrated and angry ;-) ooooouuuu... I breathe.
So the program, written in C++, just reads a bunch of files and processes them. The problem is that the files use a UTF-16 encoding (because they contain words written in different languages), and simply using ifstream just doesn't seem to work (it reads and outputs garbage). It took me a while to realise that this was because the files were in UTF-16.
Now I have spent literally the whole afternoon on the web trying to find info about READING UTF-16 files and converting the content of a UTF-16 line to char! I just can't seem to find anything! It's a nightmare. I tried to learn about <locale> and <codecvt>, wstring, etc., which I have never used before (I specialise in graphics apps, not desktop apps). I just can't get it.
This is what I have done so far (but it doesn't work):
    std::wifstream file2(fileFullPath);
    std::locale loc (std::locale(), new std::codecvt_utf16<char32_t>);
    std::cout.imbue(loc);
    while (!file2.eof()) {
        std::wstring line;
        std::getline(file2, line);
        std::wcout << line << std::endl;
    }
That's the best I could come up with, but it doesn't work: the output is no better than with plain ifstream. The real problem is that I don't understand what I am doing in the first place.
SO PLEASE PLEASE HELP! It is really driving me crazy that I can't even read a G*** D*** text file.
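For comparison, here is a minimal sketch of how a <codecvt>-based read might be set up. It rests on several assumptions: the standard library actually ships <codecvt>, the files start with a byte order mark, and wchar_t is 32 bits wide (as on Linux). The function name read_utf16_lines is hypothetical. Compared with the attempt above, the facet is imbued on the input stream rather than on std::cout, its character type matches the wchar_t used by wifstream, and the loop tests the result of getline instead of eof(). Note that <codecvt> is deprecated since C++17.

```cpp
#include <codecvt>
#include <fstream>
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper: read every line of a UTF-16 file into wide strings.
std::vector<std::wstring> read_utf16_lines(const char* path) {
    // Binary mode so that no byte is translated before the facet sees it.
    std::wifstream file(path, std::ios::binary);

    // The conversion facet belongs on the stream being read.
    // consume_header makes it read the BOM to pick the byte order and skip it.
    file.imbue(std::locale(
        file.getloc(),
        new std::codecvt_utf16<wchar_t, 0x10ffff, std::consume_header>));

    std::vector<std::wstring> lines;
    std::wstring line;
    // getline returns the stream, which converts to false once reading fails,
    // so no empty extra line is produced at end of file.
    while (std::getline(file, line)) {
        lines.push_back(line);
    }
    return lines;
}
```

Printing the resulting wide strings is a separate step that depends on the terminal's locale, so it is left out of the sketch.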
On top of that, my friend uses Ubuntu (I use clang++), and this code needs -stdlib=libc++, which doesn't seem to be supported by gcc on his side (even though he uses a fairly recent version of gcc, 4.6.3 I believe). So I am not even sure that using codecvt and locale is a good idea (as in "possible"). Would there be a better (or simply another) option?
If I convert all the files to UTF-8 from the command line (using a Linux command), am I going to potentially lose information?
Thanks a lot, I will be forever grateful to you if you help me with this.
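One option that avoids <codecvt> entirely (older libstdc++ releases, such as the one shipped with gcc 4.6, do not provide that header) is to decode the bytes by hand. The following is only a sketch under stated assumptions: the function names are hypothetical, a missing BOM is assumed to mean little-endian, and unpaired surrogates are replaced with U+FFFD. It reads the raw bytes, combines surrogate pairs into code points, and re-encodes them as UTF-8 in an ordinary std::string, using nothing newer than C++03.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Append one Unicode code point to `out` as UTF-8 (1 to 4 bytes).
void append_utf8(std::string& out, unsigned long cp) {
    assert(cp <= 0x10FFFF);  // largest valid Unicode code point
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Combine the two bytes at position i into one 16-bit code unit.
static unsigned long read_unit(const std::string& b, std::size_t i,
                               bool big_endian) {
    unsigned long first = static_cast<unsigned char>(b[i]);
    unsigned long second = static_cast<unsigned char>(b[i + 1]);
    return big_endian ? (first << 8) | second : (second << 8) | first;
}

// Convert raw UTF-16 bytes to UTF-8. A leading BOM selects the byte order
// and is dropped. Assumption: without a BOM the data is little-endian.
std::string utf16_bytes_to_utf8(const std::string& bytes) {
    std::size_t i = 0;
    bool big_endian = false;
    if (bytes.size() >= 2) {
        unsigned char b0 = bytes[0];
        unsigned char b1 = bytes[1];
        if (b0 == 0xFE && b1 == 0xFF) { big_endian = true; i = 2; }
        else if (b0 == 0xFF && b1 == 0xFE) { i = 2; }
    }
    std::string out;
    while (i + 1 < bytes.size()) {
        unsigned long cp = read_unit(bytes, i, big_endian);
        i += 2;
        // A high surrogate followed by a low surrogate encodes one code
        // point above U+FFFF.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < bytes.size()) {
            unsigned long low = read_unit(bytes, i, big_endian);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                i += 2;
            }
        }
        // Any surrogate still left here is unpaired: substitute U+FFFD.
        if (cp >= 0xD800 && cp <= 0xDFFF) cp = 0xFFFD;
        append_utf8(out, cp);
    }
    return out;
}

// Read a whole UTF-16 file and return its content as UTF-8.
std::string read_utf16_file_as_utf8(const char* path) {
    std::ifstream in(path, std::ios::binary);
    std::string bytes((std::istreambuf_iterator<char>(in)),
                      std::istreambuf_iterator<char>());
    return utf16_bytes_to_utf8(bytes);
}
```

The returned string can then be split into lines and processed like any other char-based text, and it prints correctly on a UTF-8 terminal with plain std::cout.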