java - Japanese text in Unicode

Question

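I'm having trouble converting Japanese text into readable text. I currently have a trial program that takes values from the user. These values are passed to a class I called Word to create an object. Once the object is created, I want to write it to a file and read it back. Because I'm reading and writing objects, I'm using object output and input streams. The problem is that I don't know how to encode the file as UTF-8 while using object output and input streams. Without an encoding, I get question marks where the kana or kanji should be.

Is there any way to convert the file to Unicode while using object output or input streams? If not, is there some other way to keep question marks from appearing where the kana or kanji should be?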

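    import java.awt.FontFormatException;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.util.ArrayList;
    import java.util.Scanner;

    public class JavaApplication1 {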

    /**
     * @param args the command line arguments
     */

    Scanner scan = new Scanner(System.in);

    public static void main(String[] args) throws FileNotFoundException, IOException, ClassNotFoundException, FontFormatException {
        // TODO code application logic here
        JavaApplication1 ja = new JavaApplication1();
        ja.start();
    }
    public void start() throws FileNotFoundException, IOException, ClassNotFoundException, FontFormatException{

        System.out.println("Enter Kanji");
        String Kanji = scan.next();
        System.out.println("Enter Romanji");
        String Romanji = scan.next();
        System.out.println("How common is it");
        int common = scan.nextInt();
        System.out.println("How many types of word is it?");
        int loop = scan.nextInt();
        ArrayList<Integer> type = new ArrayList<>();
        for(int i = 0; i<loop;i++){
            System.out.println("What type of word");
            type.add(scan.nextInt());
        }
        System.out.println("What type of adjective");
        int adjective = scan.nextInt();
        System.out.println("What type of verb");
        int verb = scan.nextInt();
        System.out.println("How many radicals");
        int loop2 = scan.nextInt();
        ArrayList<Integer> radical = new ArrayList<>();
        for(int i = 0; i<loop2;i++){
            System.out.println("radical");
            radical.add(scan.nextInt());
        }
        //String newKanji = GetUnicode(Kanji);
        Word word = new Word(Kanji,Romanji,common,type,adjective,verb,radical);
        word.getKanaKanji();
        store(word);
        //store(word);
        read();

    }
    public void store(Word word) throws FileNotFoundException, IOException, FontFormatException{
        File file = new File("test.dat");
        FileOutputStream outFileStream = new FileOutputStream(file);
        ObjectOutputStream oos = new ObjectOutputStream(outFileStream);
        oos.writeObject(word);
        oos.close();
    }
    public void read() throws FileNotFoundException, IOException, ClassNotFoundException, FontFormatException{
        File file = new File("test.dat");
        FileInputStream filein = new FileInputStream(file);
        ObjectInputStream ois = new ObjectInputStream(filein);
        Word word = (Word) ois.readObject();
        ois.close();
        System.out.println(word.getKanaKanji());//this gets the kanakanji  

    }
}

When I call the getKanaKanji method of the Word class, I get question marks.

My OS supports Japanese characters, so that is not the problem.

Thanks in advance!

score 0 · Accepted Answer

When a String object is written through ObjectOutputStream, the length of the String is written first, as 2 bytes, and then the contents of the String are written in modified UTF-8. See the description of DataOutput.writeUTF(String).

http://docs.oracle.com/javase/7/docs/api/java/io/DataOutput.html#writeUTF%28java.lang.String%29
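As a rough illustration of that format, the sketch below writes the two-character string 漢字 (given as Unicode escapes) with DataOutputStream.writeUTF into an in-memory buffer and dumps the resulting bytes as hex. The class name WriteUtfDemo and its helper methods are made up for this example and are not part of the question's code.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the original program.
class WriteUtfDemo {

    /** Returns the exact bytes that DataOutputStream.writeUTF emits for the given string. */
    static byte[] writeUtfBytes(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(text);
        } catch (IOException e) {
            // Not expected for an in-memory buffer; rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Formats bytes as space-separated two-digit hex values. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two kanji, written as escapes so the source file encoding does not matter.
        System.out.println(toHex(writeUtfBytes("\u6F22\u5B57")));
    }
}
```

This should print `00 06 E6 BC A2 E5 AD 97`: a 2-byte length of 6, followed by three bytes for each of the two characters.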

The question marks you are seeing are the leading 2 bytes that represent the length of the string.
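One way to check what happens to the text itself is to round-trip an object through the object streams in memory and compare the string before and after. The following is only a sketch under assumptions: DemoWord is a hypothetical stand-in for the Word class (which the question does not show), and the result is printed as code points so the check does not depend on how the console renders Japanese.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class, not part of the original program.
class ObjectStreamRoundTrip {

    /** Assumed stand-in for the question's Word class; only the text field is modelled. */
    static class DemoWord implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String kanaKanji;

        DemoWord(String kanaKanji) {
            this.kanaKanji = kanaKanji;
        }

        String getKanaKanji() {
            return kanaKanji;
        }
    }

    /** Serializes the word into a byte array and deserializes it again. */
    static DemoWord roundTrip(DemoWord word) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
                oos.writeObject(word);
            }
            try (ObjectInputStream ois =
                    new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (DemoWord) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Renders each UTF-16 char as U+XXXX, so the output is plain ASCII. */
    static String toCodePoints(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        DemoWord original = new DemoWord("\u6F22\u5B57");
        DemoWord restored = roundTrip(original);
        System.out.println(original.getKanaKanji().equals(restored.getKanaKanji()));
        System.out.println(toCodePoints(restored.getKanaKanji()));
    }
}
```

If the second line prints `U+6F22 U+5B57`, the characters came back from the stream unchanged.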
