types - Elasticsearch: Is there a way to declare all (possibly dynamic) subfields of an object field as string?

Question

I have a doc_type with a mapping similar to this very simplified one:

{
   "test":{
      "properties":{
         "name":{
            "type":"string"
         },
         "long_searchable_text":{
            "type":"string"
         },
         "clearances":{
            "type":"object"
         }
      }
   }
}

The clearances field should be an object holding a set of alphanumeric identifiers used for filtering. A typical document has this format:

{
    "name": "Lord Macbeth",
    "long_searchable_text": "Life's but a walking shadow, a poor player, that...",
    "clearances": {
        "glamis": "aa2862jsgd",
        "cawdor": "3463463551"
    }
}

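The problem is that sometimes, during indexing, the first indexed content of a new field inside the clearances object field is completely numeric, as in the case above. This causes Elasticsearch to infer the type of that field as long. But this is an accident: the field may be alphanumeric in another document. When a later document with an alphanumeric value in this field arrives, I get a parsing exception: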

{"error":"MapperParsingException[failed to parse [clearances.cawdor]]; nested: NumberFormatException[For input string: \"af654hgss1\"]; ","status":400}%
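The failure mode described above can be modelled in a few lines of shell. This is only a toy sketch, not how Elasticsearch is implemented: it assumes the first value arrives as a bare JSON integer, records long for that field, and then rejects any later value that is not all digits. The helper names infer_type and accepts are hypothetical.

```shell
#!/bin/sh
# Toy model only; the real logic lives inside Elasticsearch.

# infer_type VALUE -> prints the type a first-seen value would lock in
# (assumption: an all-digit value stands for a bare JSON integer).
infer_type() {
    case "$1" in
        ''|*[!0-9]*) echo string ;;
        *)           echo long ;;
    esac
}

# accepts TYPE VALUE -> succeeds if VALUE can be stored in a field of TYPE.
accepts() {
    case "$1" in
        long) [ "$(infer_type "$2")" = long ] ;;
        *)    return 0 ;;
    esac
}

field_type=$(infer_type 3463463551)   # the first document fixes the type
echo "clearances.cawdor mapped as: $field_type"

if accepts "$field_type" af654hgss1; then
    echo "second document indexed"
else
    echo "second document rejected: cannot parse af654hgss1 as $field_type"
fi
```

In this model, as in the error message above, the order in which documents arrive decides whether indexing succeeds, which is why the type has to be pinned in the mapping up front.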

I tried to solve this with a dynamic template defined like this:

{
   "test":{
      "properties":{
         "name":{
            "type":"string"
         },
         "long_searchable_text":{
            "type":"string"
         },
         "clearances":{
            "type":"object"
         }
      }
   },
   "dynamic_templates":[
      {
         "source_template":{
            "match":"clearances.*",
            "mapping":{
               "type":"string",
               "index":"not_analyzed"
            }
         }
      }
   ]
}

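But what keeps happening is that if the first indexed document has a clearances.some_subfield value that can be parsed as an integer, the subfield is inferred as an integer, and every subsequent document with an alphanumeric value in that subfield fails to be indexed.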

I could list all the current subfields in the mapping, but there are many of them, and I expect their number to grow in the future (triggering a mapping update and the need for a full reindex...).

Is there a way to make this work without resorting to a full reindex every time a new subfield is added?

score 8 · Accepted Answer

You're almost there.

First, the dynamic template has to match on the path clearances.*, so it must use path_match rather than a plain match.
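The reason the plain match never fired is that match is tested against the field's own name (here cawdor), while path_match is tested against the full dotted path (clearances.cawdor). The following is only a rough shell model of that difference using glob patterns; the function names glob_match and template_applies are made up for illustration and are not part of Elasticsearch.

```shell
#!/bin/sh
# Rough model only: Elasticsearch performs this matching internally.

# glob_match PATTERN STRING -> succeeds if STRING matches the glob PATTERN.
glob_match() {
    case "$2" in
        $1) return 0 ;;
        *)  return 1 ;;
    esac
}

# template_applies KIND PATTERN FULL_PATH
#   KIND=match       compares PATTERN with the last path segment (the field name)
#   KIND=path_match  compares PATTERN with the full dotted path
template_applies() {
    case "$1" in
        match)      glob_match "$2" "${3##*.}" ;;
        path_match) glob_match "$2" "$3" ;;
        *)          return 2 ;;
    esac
}

for mode in match path_match; do
    if template_applies "$mode" 'clearances.*' 'clearances.cawdor'; then
        echo "$mode: template applies"
    else
        echo "$mode: template does not apply"
    fi
done
```

Under this model the pattern clearances.* can never equal a bare field name such as cawdor, so the template from the question is silently skipped and the default type inference takes over.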

Here is a runnable example: https://www.found.no/play/gist/df030f005da71827ca96

export ELASTICSEARCH_ENDPOINT="http://localhost:9200"

# Create indexes

curl -XPUT "$ELASTICSEARCH_ENDPOINT/play" -d '{
    "settings": {},
    "mappings": {
        "test": {
            "dynamic_templates": [
                {
                    "clearances_as_string": {
                        "path_match": "clearances.*",
                        "mapping": {
                            "type": "string",
                            "index": "not_analyzed"
                        }
                    }
                }
            ]
        }
    }
}'


# Index documents
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_bulk?refresh=true" -d '
{"index":{"_index":"play","_type":"test"}}
{"clearances":{"glamis":1234,"cawdor":5678}}
{"index":{"_index":"play","_type":"test"}}
{"clearances":{"glamis":"aa2862jsgd","cawdor":"some string"}}
'

# Do searches

curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
    "facets": {
        "cawdor": {
            "terms": {
                "field": "clearances.cawdor"
            }
        }
    }
}
'
