postgresql - Querying lexemes by number of occurrences in a ts_vector in PostgreSQL 9.4

Question

Is it possible to query PostgreSQL with a WHERE clause based on the number of times a lexeme occurs in a ts_vector?

For example, if I create a ts_vector from the phrase "top hat on top of the cat", can I run something like SELECT * FROM table WHERE ts_vector @@ {the lexeme 'top' appears twice}?

score 1 · Accepted Answer

You can use this function:

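create or replace function number_of_occurrences(vector tsvector, token text)
returns integer language sql stable as $$
    -- Each element of the text form looks like 'lexeme':pos1,pos2,...
    -- so the number of positions is the number of commas plus one.
    select coalesce((
        select length(elem) - length(replace(elem, ',', '')) + 1
        from unnest(string_to_array(vector::text, ' ')) elem
        -- Compare the quoted lexeme exactly, so 'top' does not also match 'topic'.
        where left(elem, length(token) + 3) = '''' || token || ''':'), 0)
$$;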

select number_of_occurrences(to_tsvector('top hat on top of the cat'), 'top');

 number_of_occurrences 
-----------------------
                     2
(1 row)

Of course, the function works correctly only if the vector contains lexemes with positions:

select to_tsvector('top hat on top of the cat');

                   to_tsvector                   
-------------------------------------------------
 'cat':7 'hat':2 'of':5 'on':3 'the':6 'top':1,4
(1 row)
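
To illustrate the case where this assumption does not hold, here is a minimal sketch using the built-in strip() function, which removes positions from a tsvector. The 'simple' configuration is an assumption made here so that the lexemes match the output shown above.

```sql
-- 'simple' is specified explicitly (an assumption), so that words such as
-- 'on' and 'the' are kept as lexemes, as in the output above.
select strip(to_tsvector('simple', 'top hat on top of the cat'));

-- The stripped vector lists each lexeme once with no position list
-- (for example 'top' instead of 'top':1,4), so the number of
-- occurrences can no longer be recovered from it.
```

If the stored vectors may have been stripped, the count has to come from the original text instead.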

Example usage of the function:

SELECT * 
FROM a_table 
WHERE ts_vector @@ to_tsquery('top')
AND number_of_occurrences(ts_vector, 'top') = 2;
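
The @@ condition in this query is the part that an index can serve; the function call can then be evaluated only for the rows that already match. A minimal sketch, assuming the a_table table and ts_vector column from the example, with a hypothetical index name:

```sql
-- Hypothetical index name; a_table and ts_vector come from the example query.
create index a_table_ts_vector_idx on a_table using gin (ts_vector);
```

Whether the planner actually uses the index depends on the table size and statistics, so it is worth checking the plan with EXPLAIN.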

score 0 · Answer

You can use a combination of unnest and array_length for this purpose:

SELECT *
FROM a_table
WHERE (
  SELECT array_length(positions, 1)
  FROM unnest(ts_vector)
  WHERE lexeme = 'top'
) = 2;

I don't think this can use a GIN index on ts_vector, but it is likely to be faster than the string manipulation performed in the accepted answer's function.
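
One possible way to get both benefits is to keep the @@ condition from the accepted answer as a pre-filter and apply the unnest-based count afterwards. This is only a sketch: it assumes the a_table table and ts_vector column used above, and note that unnest(tsvector) is available from PostgreSQL 9.6 onward, so it will not run on 9.4.

```sql
-- Requires PostgreSQL 9.6 or later for unnest(tsvector).
SELECT *
FROM a_table
-- Index-friendly pre-filter: keeps only rows that contain the lexeme at all.
WHERE ts_vector @@ to_tsquery('top')
-- Exact count of positions recorded for the lexeme 'top'.
AND (
  SELECT array_length(positions, 1)
  FROM unnest(ts_vector)
  WHERE lexeme = 'top'
) = 2;
```

As with the function-based approach, the count relies on the vector still having its positions.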
