java - Compass query containing / (slash)

Question

I am using a Compass-based index in my project. The annotation-based configuration for the field 'name' is as follows:

@SearchableProperty(name="name")
@SearchableMetaData(name="ordering_name", index=Index.NOT_ANALYZED)
private String name;

The following values are stored in the 'name' field:

1. Temp 0 New n/a
2. e/f search
3. c/d search

The search results for the different scenarios are as follows (search term -> generated query -> result):

1. 'c/d' -> +(+alias:TempClass +(c/d*)) +(alias:TempClass) -> 1 record found
2. 'n/a' -> +(+alias:TempClass +(n/a*)) +(alias:TempClass) -> 0 record found
3. 'search' -> +(+alias:TempClass +(search*)) +(alias:TempClass) -> 2 records found

So when I search for 'n/a', I expect it to find the first record, whose value is 'Temp 0 New n/a', but nothing is returned.

Any help would be highly appreciated!

score 1 · Accepted Answer

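At some point, your query analysis does not match your document analysis.

Most likely, Lucene's StandardAnalyzer is being used internally for query parsing but not at index time, as indicated by:

@SearchableMetaData(name="ordering_name", index=Index.NOT_ANALYZED)

The StandardTokenizer used inside this analyzer treats the character / as a word boundary (like a space), so n/a produces the tokens n and a. Later, the token a is removed by a StopFilter because it is an English stop word.

The following code illustrates this explanation (the input is "c/d e/f n/a"):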

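import java.io.StringReader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
import org.apache.lucene.util.Version;

// Analyze the sample input with the same analyzer the query parser uses.
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);
TokenStream tokenStream = analyzer.tokenStream("CONTENT", new StringReader("c/d e/f n/a"));
CharTermAttribute term = tokenStream.getAttribute(CharTermAttribute.class);
PositionIncrementAttribute position = tokenStream.getAttribute(PositionIncrementAttribute.class);
tokenStream.reset(); // the TokenStream contract asks for reset() before the first incrementToken()
int pos = 0;
while (tokenStream.incrementToken()) {
    String termStr = term.toString();
    int incr = position.getPositionIncrement();
    if (incr == 0) {
        // Same position as the previous token (e.g. a synonym)
        System.out.print(" [" + termStr + "]");
    } else {
        // Advance the position and print the token on its own line
        pos += incr;
        System.out.println(" " + pos + ": [" + termStr + "]");
    }
}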

You will see the following extracted tokens:

 1: [c]
 2: [d]
 3: [e]
 4: [f]
 5: [n]

Note that the expected position 6 with the token a is missing. Lucene's QueryParser performs the same tokenization, as this snippet shows:

QueryParser parser = new QueryParser(Version.LUCENE_36, "content", new StandardAnalyzer(Version.LUCENE_36));
System.out.println(parser.parse("+n/a*"));

The output is:

+content:n

EDIT: The solution is to use WhitespaceAnalyzer and to set the field to ANALYZED. The following code is a proof of concept in Lucene:

// Index one document with WhitespaceAnalyzer so that "n/a" stays a single token.
IndexWriter writer = new IndexWriter(new RAMDirectory(), new IndexWriterConfig(Version.LUCENE_36, new WhitespaceAnalyzer(Version.LUCENE_36)));
Document doc = new Document();
doc.add(new Field("content", "Temp 0 New n/a", Store.YES, Index.ANALYZED));
writer.addDocument(doc);
writer.commit();
// Search with the same analyzer so query analysis matches index analysis.
IndexReader reader = IndexReader.open(writer, true);
IndexSearcher searcher = new IndexSearcher(reader);
QueryParser parser = new QueryParser(Version.LUCENE_36, "content", new WhitespaceAnalyzer(Version.LUCENE_36));
TopDocs docs = searcher.search(parser.parse("+n/a"), 10);
System.out.println(docs.totalHits);
reader.close();
writer.close();

The output is 1.
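
The difference between the two analyzers can also be imitated in plain Java, without Lucene on the classpath. This is only a simplified sketch: the class name TokenizerSketch, its two methods, and the three-word stop list are assumptions made for this illustration, and the splitting rules are a rough approximation rather than Lucene's actual tokenizer logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TokenizerSketch {
    // Assumption: a tiny subset of English stop words, for illustration only.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the"));

    // Rough imitation of StandardAnalyzer: split on anything that is not a
    // letter or digit, lowercase each piece, then drop stop words.
    public static List<String> standardLike(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            String token = raw.toLowerCase();
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Rough imitation of WhitespaceAnalyzer: split on whitespace only,
    // keeping case and punctuation untouched.
    public static List<String> whitespace(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (!raw.isEmpty()) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(standardLike("Temp 0 New n/a")); // [temp, 0, new, n]
        System.out.println(whitespace("Temp 0 New n/a"));   // [Temp, 0, New, n/a]
    }
}
```

With the standard-like rules the stored value never yields a token n/a, while whitespace splitting keeps it intact. Whitespace splitting also preserves case, so a search for "temp" would no longer match "Temp" unless lowercasing is added separately.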
