java - Lucene search using a wildcard is very slow

Question

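I have a list of 1000 text-only files, each up to 8 MB in size (and the number doubles every year), and I am trying to find file names by matching their content against a (wildcard) expression.

For example, every file contains data like this: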

COD1004129641208240002709991455671866 4IT / HUF 4400QQQUF 3300QQQUF

My search is "*9991455671866", which matches the line above.

The problem (maybe my expectations are too high) is that it takes more than a minute to return a result.

My document indexing looks like this:

private Document getDocument(File file) throws IOException
{
    FileReader reader = new FileReader(file);
    Document doc = new Document();
    doc.add(new Field(IndexProperties.FIELD_FILENAME, file.getName(), Field.Store.YES, Field.Index.NOT_ANALYZED)); 
    doc.add(new Field(IndexProperties.FIELD_CONTENT, reader));

    return doc;
}

The analyzer:

        Directory fsDir = FSDirectory.open(new File(indexFolder));
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);

        // build the writer
        IndexWriterConfig indexWriter = new IndexWriterConfig(Version.LUCENE_36, analyzer);
        IndexWriter writer = new IndexWriter(fsDir, indexWriter);

The wildcard search is as follows:

public List<String> findFilenameByContent(String wildCardContent, String INDEX_FOLDER, String TICKETS_FOLDER) throws Exception
{   
    long start = System.currentTimeMillis();
    Term term = new Term(IndexProperties.FIELD_CONTENT, wildCardContent); //eg *9991455671866
    Query query = new WildcardQuery(term);

    // open the index, run the query, then collect the matching file names
    Directory fsDir = FSDirectory.open(new File(INDEX_FOLDER));
    IndexSearcher searcher = new IndexSearcher(IndexReader.open(fsDir));
    ScoreDoc[] queryResults = searcher.search(query, 10).scoreDocs;  
    List<String> strs = new ArrayList<String>();

    for (ScoreDoc scoreDoc : queryResults) 
    {  
        Document doc = searcher.doc(scoreDoc.doc);  
        strs.add(doc.get(IndexProperties.FIELD_FILENAME));
    }

    searcher.close();
    long end = System.currentTimeMillis();
    System.out.println("TOTAL SEARCH TIME: "+(end-start)/1000.0+ "secs");
    return strs;
}

score 1 · Accepted Answer

There is nothing wrong with your code. If you only need to search, try opening the reader read-only:

IndexReader.open(fsDir,true);

This may improve the search time.

This suggestion may also help.
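
Separately, one more thing that may be worth checking: a pattern with a leading wildcard such as *9991455671866 cannot use the sorted term dictionary to jump straight to its matches, so every term in the field has to be tested. A common workaround is to also index each token reversed, which turns the suffix match into a prefix match. Below is a minimal plain-Java sketch of that idea, not Lucene code. The class ReversedTermLookup and its methods are hypothetical names made up for illustration, and a TreeSet stands in for the term dictionary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Hypothetical sketch: answer "*suffix" lookups with a prefix scan over reversed terms. */
public class ReversedTermLookup {
    // Sorted set of reversed terms; an assumption standing in for a sorted term dictionary.
    private final NavigableSet<String> reversedTerms = new TreeSet<String>();

    private static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    /** Stores the term reversed, so that a suffix of the term becomes a prefix. */
    public void add(String term) {
        reversedTerms.add(reverse(term));
    }

    /** Returns all stored terms that end with the given suffix. */
    public List<String> endsWith(String suffix) {
        String prefix = reverse(suffix);
        List<String> matches = new ArrayList<String>();
        // tailSet positions us at the first term >= prefix; all terms sharing the
        // prefix are contiguous in sorted order, so we stop at the first mismatch.
        for (String candidate : reversedTerms.tailSet(prefix, true)) {
            if (!candidate.startsWith(prefix)) {
                break;
            }
            matches.add(reverse(candidate));
        }
        return matches;
    }
}
```

With the sample line from the question, adding the token COD1004129641208240002709991455671866 and calling endsWith("9991455671866") returns that token after visiting only the terms that share the reversed prefix, instead of scanning everything. In Lucene itself, I believe the analyzers module provides a ReverseStringFilter intended for this kind of setup, but check what is available in the version you are on.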
