solr - Counting the total frequency of a word in a SOLR index

Question

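When I search for a word in a SOLR index, I get the number of documents that contain the word, but if the word occurs several times in a document, the total count still goes up by only 1 per document.

I need each returned document to be counted by the number of times the searched word occurs in the field.

I read "Word frequency in Solr" and "SOLR term frequency" and enabled the Term Vector Component, but it does not work.

I configured my field like this: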

<field name="text_text" type="textgen" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true" />

But when I make the following query:

http://localhost:8888/solr/sources/select?q=text_text%3A%22Peter+Pan%22&fl=text_text&wt=json&indent=true&tv.tf

there are no counts:

{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "fl":"text_text",
      "tv.tf":"",
      "indent":"true",
      "q":"text_text:\"Peter Pan\"",
      "wt":"json"}},
  "response":{"numFound":12,"start":0,"docs":[
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"}]
  }}

The value of "numFound" is 12, but the word "Peter Pan" occurs 20 times in total across all 12 documents.

Could you please help me find where I am going wrong?

Thank you very much!

score 0 · Accepted Answer

Try this structure to get the word frequency in the response:

http://localhost:8983/solr/core/select?indent=on&q=solr&fl=field,termfreq("field","term")&wt=json
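
To get one total rather than a per-document value, the termfreq results can be summed on the client. Below is a minimal Python sketch of that idea, not a definitive implementation. The helper names (build_termfreq_params, total_term_frequency), the "tf" alias, and the lowercase term 'peter' are assumptions made for illustration; the term has to match whatever the field's analyzer actually indexed. As far as I know, termfreq looks up a single indexed term, so a phrase such as "Peter Pan" would likely need a different approach.

```python
from urllib.parse import urlencode


def build_termfreq_params(field, term, query, rows=100):
    """Build select parameters that return termfreq(field, term) per document.

    The function result is aliased as "tf" (an arbitrary name chosen here)
    so it is easy to pick out of each returned document.
    """
    return {
        "q": query,
        "fl": f"tf:termfreq({field},'{term}')",
        "rows": rows,  # must cover numFound, otherwise only one page is summed
        "wt": "json",
    }


def total_term_frequency(response, key="tf"):
    """Sum the per-document term frequencies in a parsed Solr JSON response."""
    docs = response["response"]["docs"]
    return sum(int(doc.get(key, 0)) for doc in docs)


if __name__ == "__main__":
    params = build_termfreq_params("text_text", "peter", "text_text:peter")
    # Query string to append to a select URL such as the one shown above.
    print(urlencode(params))

    # Hand-written stand-in for a parsed response, used instead of a live server.
    sample = {"response": {"numFound": 3, "start": 0,
                           "docs": [{"tf": 3}, {"tf": 1}, {"tf": 0}]}}
    print(total_term_frequency(sample))
```

Only the documents actually returned are summed, so rows needs to be at least numFound (the response in the question lists fewer documents than its numFound of 12). If the total over the whole index is enough, and not just over the documents matching the query, Solr also has a totaltermfreq(field,term) function (alias ttf) that returns that number directly.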
