java - How to remove duplicate words from an array in Java

Question

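I am reading in a file and then calling a String[] method that splits each line into individual words, adds each word to an array of unique words (no duplicates), and returns that array.

I can't figure out how to print each word only once, but here is what I have so far: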

public static String[] sortUnique(String[] unique, int count)
{
    String temp;
    for(int i = 1; i < count; i++) {
        temp = unique[i].replaceAll("([a-z]+)[,.?]*", "$1");
        int j;
        for(j = i - 1; j>= 0 && (temp.compareToIgnoreCase(unique[j]) < 0);j--) {
            unique[j+1] = unique[j];
        }
        unique[j+1] = temp;
    }
    return unique;
}

And here is the data file:

    Is this a dagger which I see before me,
    The handle toward my hand? Come, let me clutch thee.
    I have thee not, and yet I see thee still.
    Art thou not, fatal vision, sensible
    To feeling as to sight? Or art thou but
    A dagger of the mind, a false creation,

Any help is greatly appreciated!

score 4 · Accepted Answer

To read a file and remove duplicate words:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.StreamTokenizer;
import java.util.Set;
import java.util.TreeSet;

public class WordReader {

   public static void main( String[] args ) throws Exception {
      // A TreeSet keeps its elements sorted and silently ignores duplicates.
      Set< String > words = new TreeSet<>();
      // try-with-resources closes the reader even if reading fails.
      try( BufferedReader br =
              new BufferedReader(
                 new FileReader( "F:/docs/Notes/Notes.txt" ))) {
         StreamTokenizer st = new StreamTokenizer( br );
         while( st.nextToken() != StreamTokenizer.TT_EOF ) {
            // Keep word tokens only; numbers and punctuation are skipped.
            if( st.ttype == StreamTokenizer.TT_WORD ) {
               words.add( st.sval );
            }
         }
      }
      System.out.println( words );
   }
}
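
If you need a String[] back, as in the question, and want "A" and "a" treated as the same word (the question compares with compareToIgnoreCase), a variation along these lines might work. This is only a sketch: the class name UniqueWords, the helper uniqueWords, and the choice to split on non-letter characters are my own assumptions, not something taken from the code above. A TreeSet with natural ordering is case-sensitive, so here it is given String.CASE_INSENSITIVE_ORDER instead, and punctuation is dropped by splitting on anything that is not a letter.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class UniqueWords {

    // Hypothetical helper: returns the distinct words found in the given
    // lines, sorted case-insensitively. When a word appears with different
    // capitalization, the first spelling encountered is the one kept.
    static String[] uniqueWords(List<String> lines) {
        Set<String> words = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        for (String line : lines) {
            // Split on runs of non-letter characters, which also removes
            // punctuation such as "," "." and "?".
            for (String word : line.split("[^\\p{L}]+")) {
                if (!word.isEmpty()) {   // a leading delimiter yields ""
                    words.add(word);
                }
            }
        }
        return words.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Two lines from the data file in the question.
        List<String> lines = Arrays.asList(
            "Is this a dagger which I see before me,",
            "A dagger of the mind, a false creation,");
        System.out.println(Arrays.toString(uniqueWords(lines)));
    }
}
```

To use it with a file, you could read the lines first (for example with Files.readAllLines) and pass the resulting list to uniqueWords. If you would rather have every word lower-cased, calling toLowerCase() on each word before adding it would be one way to do that.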
