parent-child - Querying parent/child documents in Elasticsearch

Question

In Elasticsearch (ES) I work with two types of documents, items and slots, where an item is the parent of its slot documents. I define the index with the following command:

curl -XPOST 'localhost:9200/items' -d @itemsdef.json

where itemsdef.json contains the following definition:

{
"mappings" : {
    "item" : {
        "properties" : {
            "id" : {"type" : "long" },
            "name" : {
                "type" : "string",
                "_analyzer" : "textIndexAnalyzer"   
            },
            "location" : {"type" : "geo_point" }
        }
    }
},
"settings" : {
    "analysis" : {
        "analyzer" : {

                "activityIndexAnalyzer" : {
                    "alias" : ["activityQueryAnalyzer"],
                    "type" : "custom",
                    "tokenizer" : "whitespace",
                    "filter" : ["trim", "lowercase", "asciifolding", "spanish_stop", "spanish_synonym"]
                },
                "textIndexAnalyzer" : {
                    "type" : "custom",
                    "tokenizer" : "whitespace",
                    "filter" : ["word_delimiter_impl", "trim", "lowercase", "asciifolding", "spanish_stop", "spanish_synonym"]
                },
                "textQueryAnalyzer" : {
                    "type" : "custom",
                    "tokenizer" : "whitespace",
                    "filter" : ["trim", "lowercase", "asciifolding", "spanish_stop"]
                }       
        },
        "filter" : {        
                "spanish_stop" : {
                    "type" : "stop",
                    "ignore_case" : true,
                    "enable_position_increments" : true,
                    "stopwords_path" : "analysis/spanish-stopwords.txt"
                },
                "spanish_synonym" : {
                    "type" : "synonym",
                    "synonyms_path" : "analysis/spanish-synonyms.txt"
                },
                "word_delimiter_impl" : {
                    "type" : "word_delimiter",
                    "generate_word_parts" : true,
                    "generate_number_parts" : true,
                    "catenate_words" : true,
                    "catenate_numbers" : true,
                    "split_on_case_change" : false                  
                }               
        }
    }
}
}

Then I add the child document definition with the following command:

curl -XPOST 'localhost:9200/items/slot/_mapping' -d @slotsdef.json

where slotsdef.json contains the following definition:

{
"slot" : {
    "_parent" : {"type" : "item"},
    "_routing" : {
        "required" : true,
        "path" : "parent_id"
    },
    "properties": {
        "id" : { "type" : "long" },
        "parent_id" : { "type" : "long" },
        "activity" : {
            "type" : "string",
            "_analyzer" : "activityIndexAnalyzer"
        },
        "day" : { "type" : "integer" },
        "start" : { "type" : "integer" },
        "end" :  { "type" : "integer" }
    }
}   
}

Finally, I run a bulk index with the following command:

curl -XPOST 'localhost:9200/items/_bulk' --data-binary @testbulk.json

where testbulk.json holds the following data:

{"index":{"_type": "item", "_id":35}}
{"location":[40.4,-3.6],"id":35,"name":"A Name"}
{"index":{"_type":"slot","_id":126,"_parent":35}}
{"id":126,"start":1330,"day":1,"end":1730,"activity":"An Activity","parent_id":35}

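I am trying to run the following query: find all items within a certain distance of a location that have children (slots) on a given day whose start and end fall within a given range.

Items with more slots that meet the criteria should score higher.

I tried starting from the existing samples, but documentation is very scarce and it is hard to make progress.

Any clues?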

score 0 · Accepted Answer

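I don't think there is a way to write an efficient query that does this without moving the location to the slots. You can do something like this, but it may be quite inefficient for some data: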

{
    "query": {
        "top_children" : {
            "type": "slot",
            "query" : {
                "constant_score" : {
                    "query" : {
                        ... your query for children goes here ...
                    }
                }            
            },
            "score" : "sum",
            "factor" : 5,
            "incremental_factor" : 2
        }
    },
    "filter": {
        "geo_distance" : {
            "distance" : "200km",
            "location" : {
                "lat" : 40,
                "lon" : -70
            }
        }
    }
}

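Basically, this is what the query does. It takes your range query or filter for the children, plus any other conditions you need, and wraps it in a constant_score query so that every matching child has a score of 1.0. The top_children query collects all of these children and accumulates their scores onto their parents. The filter then removes the parents that are too far away.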
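
As a rough illustration of the pieces described above, here is one way the placeholder might be filled in for the slot mapping from the question. The values (day 1, a start of at least 1300, an end of at most 1800, the 200km radius and the coordinates) are made-up examples, and combining the conditions with a bool filter inside constant_score is only one possible choice. Treat it as an untested sketch rather than a definitive query.

```json
{
    "query": {
        "top_children": {
            "type": "slot",
            "query": {
                "constant_score": {
                    "filter": {
                        "bool": {
                            "must": [
                                { "term": { "day": 1 } },
                                { "range": { "start": { "gte": 1300 } } },
                                { "range": { "end": { "lte": 1800 } } }
                            ]
                        }
                    }
                }
            },
            "score": "sum",
            "factor": 5,
            "incremental_factor": 2
        }
    },
    "filter": {
        "geo_distance": {
            "distance": "200km",
            "location": { "lat": 40, "lon": -70 }
        }
    }
}
```

Because each matching slot contributes exactly 1.0 and the scores are summed, an item's score would equal the number of its slots that satisfy the three conditions. If the location were copied onto each slot, the geo_distance filter could instead join the must list, so distant children would be discarded before they are collected. Note that this uses the same old-style request shape as the query above: in Elasticsearch 1.0 and later the top-level filter is called post_filter, and top_children was removed in 2.0, so newer versions would need a rewrite, for example around has_child.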
