postgresql - Elasticsearch search on phone numbers

Question

I have a Postgres array column that I want to index and use for search. Here is an example:

phones = [ "+175 (2) 123-25-32", "123456789", "+12 111-111-11" ]

I inspected the tokens with the analyze API. Elasticsearch tokenizes the field into multiple tokens, as shown below:

curl -XGET 'localhost:9200/_analyze' -d '
{
  "analyzer" : "standard",
  "text" : [ "+175 (2) 123-25-32", "123456789", "+12 111-111-11" ]
}'


{
  "tokens": [
    {
      "token": "analyzer",
      "start_offset": 6,
      "end_offset": 14,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "standard",
      "start_offset": 19,
      "end_offset": 27,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "text",
      "start_offset": 33,
      "end_offset": 37,
      "type": "<ALPHANUM>",
      "position": 3
    },
    {
      "token": "175",
      "start_offset": 45,
      "end_offset": 48,
      "type": "<NUM>",
      "position": 4
    },
    {
      "token": "2",
      "start_offset": 50,
      "end_offset": 51,
      "type": "<NUM>",
      "position": 5
    },
    {
      "token": "123",
      "start_offset": 53,
      "end_offset": 56,
      "type": "<NUM>",
      "position": 6
    },
    {
      "token": "25",
      "start_offset": 57,
      "end_offset": 59,
      "type": "<NUM>",
      "position": 7
    },
    {
      "token": "32",
      "start_offset": 60,
      "end_offset": 62,
      "type": "<NUM>",
      "position": 8
    },
    {
      "token": "123456789",
      "start_offset": 66,
      "end_offset": 75,
      "type": "<NUM>",
      "position": 9
    },
    {
      "token": "12",
      "start_offset": 80,
      "end_offset": 82,
      "type": "<NUM>",
      "position": 10
    },
    {
      "token": "111",
      "start_offset": 83,
      "end_offset": 86,
      "type": "<NUM>",
      "position": 11
    },
    {
      "token": "111",
      "start_offset": 87,
      "end_offset": 90,
      "type": "<NUM>",
      "position": 12
    },
    {
      "token": "11",
      "start_offset": 91,
      "end_offset": 93,
      "type": "<NUM>",
      "position": 13
    }
  ]
}

I was hoping Elasticsearch would skip tokenization and store each number without its special characters, so that the numbers show up in search results.

My mapping is as follows:

{ :id => { :type => "string"}, :secondary_phones => { :type => "string" } }

This is how I am trying to run the query:

      settings = {
        query: {
          filtered: {
            filter: {
              bool: {
                should: [
                  { terms: { phones: [ "+175 (2) 123-25-32", "123456789", "+12 111-111-11" ] } },
                ]
              }
            }
          }
        },
        size: 100,
      }

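P.S. I also tried removing the special characters, but that did not work either.

I am sure this is achievable and that I am missing something. Any suggestions are welcome.

Thanks.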

score 0 · Accepted Answer

If you only need exact matches against your data, as in your terms query example, the best approach is simply to set the index mapping parameter to not_analyzed in your mapping. See the documentation.

This disables analysis (tokenization) of the value entirely, so the content of the field (each item in the array) is treated as a single token/keyword.
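As a rough sketch of what that could look like, here is the mapping from the question with index set to not_analyzed, together with a matching terms filter. The field name phones is taken from the query in the question (the posted mapping uses secondary_phones, so use whichever name your index really has). The normalize_phone and build_query helpers are hypothetical and not part of any Elasticsearch client. Stripping non-digits is an assumption about what "without special characters" should mean, and it has to be applied both when indexing and when querying, because a not_analyzed field only matches the exact stored string. The string type and the filtered query assume a pre-5.x Elasticsearch, as in the question.

```ruby
require "json"

# Mapping sketch: "not_analyzed" keeps each array item as one exact term.
# Assumes the pre-5.x "string" type used in the question.
MAPPING = {
  id:     { type: "string" },
  phones: { type: "string", index: "not_analyzed" }
}

# Hypothetical helper: drop everything except digits, so that
# "+175 (2) 123-25-32" and "17521232532" end up as the same term.
def normalize_phone(raw)
  raw.gsub(/\D/, "")
end

# Hypothetical helper: build a terms filter over normalized numbers.
# A terms filter matches a document if any listed value equals a stored term.
def build_query(phones)
  {
    query: {
      filtered: {
        filter: { terms: { phones: phones.map { |p| normalize_phone(p) } } }
      }
    },
    size: 100
  }
end

phones = ["+175 (2) 123-25-32", "123456789", "+12 111-111-11"]

puts JSON.pretty_generate({ properties: MAPPING })
puts JSON.pretty_generate(build_query(phones))
```

The same normalize_phone call would be applied to each array element before the document is sent to the index. If the original formatting must be kept for display, keep it in Postgres or in a separate field and search only on the normalized one.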
