python - Noun phrases with spaCy

Question

How can I extract noun phrases from text using spaCy?
I am not referring to part-of-speech tags. I can't find anything about noun phrases or regular parse trees in the documentation.

score 64 · Accepted Answer

If you want base NPs, i.e. NPs without coordination, prepositional phrases or relative clauses, you can use the noun_chunks iterator on the Doc and Span objects:

>>> from spacy.en import English  # spaCy 1.x; in spaCy 2+ use nlp = spacy.load('en_core_web_sm')
>>> nlp = English()
>>> doc = nlp(u'The cat and the dog sleep in the basket near the door.')
>>> for np in doc.noun_chunks:
...     np.text
...
u'The cat'
u'the dog'
u'the basket'
u'the door'

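If you want something else, the best approach is to iterate over the words of the sentence and use the syntactic context to decide whether a word governs the phrase type you want. If it does, yield its subtree: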

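from spacy.symbols import nsubj, nsubjpass, dobj, iobj, pobj

# Dependency labels whose words typically head a noun phrase (probably others too)
np_labels = {nsubj, nsubjpass, dobj, iobj, pobj}

def iter_nps(doc):
    """Yield the subtree (an iterator of tokens) of each word that heads an NP."""
    for word in doc:
        # word.dep is the integer ID of the label, so it compares with spacy.symbols
        if word.dep in np_labels:
            yield word.subtree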
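
To see what the subtree approach returns without loading a model, here is a minimal sketch on a hand-annotated toy parse. The Token tuple, TOY_PARSE, subtree_indices and iter_np_texts are hypothetical names made up for this illustration, and the heads and labels are an assumed parse of the example sentence, not actual spaCy output. It only mimics the idea behind word.subtree: collect a word and all of its descendants.

```python
from collections import namedtuple

# Hand-annotated toy parse (an assumption, not real spaCy output):
# each token stores its text, the index of its head, and its dependency label.
Token = namedtuple("Token", ["text", "head", "dep"])

TOY_PARSE = [
    Token("The", 1, "det"),
    Token("cat", 5, "nsubj"),
    Token("and", 1, "cc"),
    Token("the", 4, "det"),
    Token("dog", 1, "conj"),
    Token("sleep", 5, "ROOT"),   # the root points at itself
    Token("in", 5, "prep"),
    Token("the", 8, "det"),
    Token("basket", 6, "pobj"),
    Token("near", 8, "prep"),
    Token("the", 11, "det"),
    Token("door", 9, "pobj"),
    Token(".", 5, "punct"),
]

NP_LABELS = {"nsubj", "nsubjpass", "dobj", "iobj", "pobj"}


def subtree_indices(tokens, root):
    """Return the sorted indices of `root` and all of its descendants."""
    children = {}
    for i, tok in enumerate(tokens):
        if tok.head != i:  # skip the self-loop of the sentence root
            children.setdefault(tok.head, []).append(i)
    found, stack = [], [root]
    while stack:
        i = stack.pop()
        found.append(i)
        stack.extend(children.get(i, []))
    return sorted(found)


def iter_np_texts(tokens):
    """Yield the text of the subtree under every word with an NP-like label."""
    for i, tok in enumerate(tokens):
        if tok.dep in NP_LABELS:
            yield " ".join(tokens[j].text for j in subtree_indices(tokens, i))


print(list(iter_np_texts(TOY_PARSE)))
# ['The cat and the dog', 'the basket near the door', 'the door']
```

Unlike noun_chunks, the subtrees keep the coordination and the attached prepositional phrase, and nested phrases such as 'the door' are yielded again on their own, so you may want to filter or deduplicate depending on your use case.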
