
I am new to Elasticsearch. I have mapped a field as "string" in my Elasticsearch index. I need to retrieve documents whose field value contains the given search text.

JSON1 : "{\"id\":\"1\",\"message\":\"Welcome to elastic search\"}"
JSON2 : "{\"id\":\"2\",\"message\":\"elasticsearch\"}"

When I search for "elastic", both records should be returned, but I only get the first one.

Currently I am retrieving documents via full-text search. How can I achieve like/ilike-style matching (as in PostgreSQL) in Elasticsearch?

Thanks in advance.


1 Answer


This is a tokenizer issue. Take a look at the NGram tokenizer: http://www.elasticsearch.org/guide/reference/index-modules/analysis/ngram-tokenizer/

You can test this with the /_analyze endpoint.

Here is how Elasticsearch tokenizes the text by default:

curl -XGET 'localhost:9200/_analyze?tokenizer=standard' -d 'this is a test elasticsearch'

{
    "tokens": [{
            "token": "this",
            "start_offset": 0,
            "end_offset": 4,
            "type": "<ALPHANUM>",
            "position": 1
        }, {
            "token": "is",
            "start_offset": 5,
            "end_offset": 7,
            "type": "<ALPHANUM>",
            "position": 2
        }, {
            "token": "a",
            "start_offset": 8,
            "end_offset": 9,
            "type": "<ALPHANUM>",
            "position": 3
        }, {
            "token": "test",
            "start_offset": 10,
            "end_offset": 14,
            "type": "<ALPHANUM>",
            "position": 4
        }, {
            "token": "elasticsearch",
            "start_offset": 15,
            "end_offset": 28,
            "type": "<ALPHANUM>",
            "position": 5
        }
    ]
}

Notice that "elasticsearch" is indexed as a single token, which is why a search for "elastic" does not match your second document. Here is the same text with the nGram tokenizer and its default settings:

curl -XGET 'localhost:9200/_analyze?tokenizer=nGram' -d 'this is a test elasticsearch'

{
    "tokens": [{
            "token": "t",
            "start_offset": 0,
            "end_offset": 1,
            "type": "word",
            "position": 1
        }, {
            "token": "h",
            "start_offset": 1,
            "end_offset": 2,
            "type": "word",
            "position": 2
        }, {
            "token": "i",
            "start_offset": 2,
            "end_offset": 3,
            "type": "word",
            "position": 3
        }, {
            "token": "s",
            "start_offset": 3,
            "end_offset": 4,
            "type": "word",
            "position": 4
        }, {
            "token": " ",
            "start_offset": 4,
            "end_offset": 5,
            "type": "word",
            "position": 5
        }, {
            "token": "i",
            "start_offset": 5,
            "end_offset": 6,
            "type": "word",
            "position": 6
        }, {
            "token": "s",
            "start_offset": 6,
            "end_offset": 7,
            "type": "word",
            "position": 7
        }, {
            "token": " ",
            "start_offset": 7,
            "end_offset": 8,
            "type": "word",
            "position": 8
        }, {
            "token": "a",
            "start_offset": 8,
            "end_offset": 9,
            "type": "word",
            "position": 9
        }, {
            "token": " ",
            "start_offset": 9,
            "end_offset": 10,
            "type": "word",
            "position": 10
        }, {
            "token": "t",
            "start_offset": 10,
            "end_offset": 11,
            "type": "word",
            "position": 11
        }, {
            "token": "e",
            "start_offset": 11,
            "end_offset": 12,
            "type": "word",
            "position": 12
        }, {
            "token": "s",
            "start_offset": 12,
            "end_offset": 13,
            "type": "word",
            "position": 13
        }, {
            "token": "t",
            "start_offset": 13,
            "end_offset": 14,
            "type": "word",
            "position": 14
        }, {
            "token": " ",
            "start_offset": 14,
            "end_offset": 15,
            "type": "word",
            "position": 15
        }, {
            "token": "e",
            "start_offset": 15,
            "end_offset": 16,
            "type": "word",
            "position": 16
        }, {
            "token": "l",
            "start_offset": 16,
            "end_offset": 17,
            "type": "word",
            "position": 17
        }, {
            "token": "a",
            "start_offset": 17,
            "end_offset": 18,
            "type": "word",
            "position": 18
        }, {
            "token": "s",
            "start_offset": 18,
            "end_offset": 19,
            "type": "word",
            "position": 19
        }, {
            "token": "t",
            "start_offset": 19,
            "end_offset": 20,
            "type": "word",
            "position": 20
        }, {
            "token": "i",
            "start_offset": 20,
            "end_offset": 21,
            "type": "word",
            "position": 21
        }, {
            "token": "c",
            "start_offset": 21,
            "end_offset": 22,
            "type": "word",
            "position": 22
        }, {
            "token": "s",
            "start_offset": 22,
            "end_offset": 23,
            "type": "word",
            "position": 23
        }, {
            "token": "e",
            "start_offset": 23,
            "end_offset": 24,
            "type": "word",
            "position": 24
        }, {
            "token": "a",
            "start_offset": 24,
            "end_offset": 25,
            "type": "word",
            "position": 25
        }, {
            "token": "r",
            "start_offset": 25,
            "end_offset": 26,
            "type": "word",
            "position": 26
        }, {
            "token": "c",
            "start_offset": 26,
            "end_offset": 27,
            "type": "word",
            "position": 27
        }, {
            "token": "h",
            "start_offset": 27,
            "end_offset": 28,
            "type": "word",
            "position": 28
        }, {
            "token": "th",
            "start_offset": 0,
            "end_offset": 2,
            "type": "word",
            "position": 29
        }, {
            "token": "hi",
            "start_offset": 1,
            "end_offset": 3,
            "type": "word",
            "position": 30
        }, {
            "token": "is",
            "start_offset": 2,
            "end_offset": 4,
            "type": "word",
            "position": 31
        }, {
            "token": "s ",
            "start_offset": 3,
            "end_offset": 5,
            "type": "word",
            "position": 32
        }, {
            "token": " i",
            "start_offset": 4,
            "end_offset": 6,
            "type": "word",
            "position": 33
        }, {
            "token": "is",
            "start_offset": 5,
            "end_offset": 7,
            "type": "word",
            "position": 34
        }, {
            "token": "s ",
            "start_offset": 6,
            "end_offset": 8,
            "type": "word",
            "position": 35
        }, {
            "token": " a",
            "start_offset": 7,
            "end_offset": 9,
            "type": "word",
            "position": 36
        }, {
            "token": "a ",
            "start_offset": 8,
            "end_offset": 10,
            "type": "word",
            "position": 37
        }, {
            "token": " t",
            "start_offset": 9,
            "end_offset": 11,
            "type": "word",
            "position": 38
        }, {
            "token": "te",
            "start_offset": 10,
            "end_offset": 12,
            "type": "word",
            "position": 39
        }, {
            "token": "es",
            "start_offset": 11,
            "end_offset": 13,
            "type": "word",
            "position": 40
        }, {
            "token": "st",
            "start_offset": 12,
            "end_offset": 14,
            "type": "word",
            "position": 41
        }, {
            "token": "t ",
            "start_offset": 13,
            "end_offset": 15,
            "type": "word",
            "position": 42
        }, {
            "token": " e",
            "start_offset": 14,
            "end_offset": 16,
            "type": "word",
            "position": 43
        }, {
            "token": "el",
            "start_offset": 15,
            "end_offset": 17,
            "type": "word",
            "position": 44
        }, {
            "token": "la",
            "start_offset": 16,
            "end_offset": 18,
            "type": "word",
            "position": 45
        }, {
            "token": "as",
            "start_offset": 17,
            "end_offset": 19,
            "type": "word",
            "position": 46
        }, {
            "token": "st",
            "start_offset": 18,
            "end_offset": 20,
            "type": "word",
            "position": 47
        }, {
            "token": "ti",
            "start_offset": 19,
            "end_offset": 21,
            "type": "word",
            "position": 48
        }, {
            "token": "ic",
            "start_offset": 20,
            "end_offset": 22,
            "type": "word",
            "position": 49
        }, {
            "token": "cs",
            "start_offset": 21,
            "end_offset": 23,
            "type": "word",
            "position": 50
        }, {
            "token": "se",
            "start_offset": 22,
            "end_offset": 24,
            "type": "word",
            "position": 51
        }, {
            "token": "ea",
            "start_offset": 23,
            "end_offset": 25,
            "type": "word",
            "position": 52
        }, {
            "token": "ar",
            "start_offset": 24,
            "end_offset": 26,
            "type": "word",
            "position": 53
        }, {
            "token": "rc",
            "start_offset": 25,
            "end_offset": 27,
            "type": "word",
            "position": 54
        }, {
            "token": "ch",
            "start_offset": 26,
            "end_offset": 28,
            "type": "word",
            "position": 55
        }
    ]
}

Here is a link with an example of setting the right analyzer/tokenizer on your index: how to set up a tokenizer in elasticsearch
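For reference, a minimal sketch of such a setup. The index name, analyzer/tokenizer names, and gram sizes here are just illustrative, and the "string" mapping matches the API that was current when this was written:

# Hypothetical index "myindex"; gram sizes 2-3 chosen for illustration
curl -XPUT 'localhost:9200/myindex' -d '{
    "settings": {
        "analysis": {
            "tokenizer": {
                "my_ngram_tokenizer": {
                    "type": "nGram",
                    "min_gram": 2,
                    "max_gram": 3
                }
            },
            "analyzer": {
                "my_ngram_analyzer": {
                    "type": "custom",
                    "tokenizer": "my_ngram_tokenizer",
                    "filter": ["lowercase"]
                }
            }
        }
    },
    "mappings": {
        "doc": {
            "properties": {
                "message": {
                    "type": "string",
                    "analyzer": "my_ngram_analyzer"
                }
            }
        }
    }
}'

With this mapping, "elasticsearch" is indexed as its 2- and 3-character grams, which include "el", "ela", "la", and so on, so a partial term like "elastic" can now match it.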

After that, your query should return the expected documents.
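For example, a simple match query for "elastic" should then hit both documents (index, type, and field names here follow the hypothetical sketch above):

# Hypothetical names from the sketch above
curl -XGET 'localhost:9200/myindex/doc/_search' -d '{
    "query": {
        "match": {
            "message": "elastic"
        }
    }
}'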

answered 2013-05-29T17:23:56.497