python - Randomly generated parse tree using a fixed set of vocabulary

Question

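I'm using Python 3.2 and tried to build a parse tree for randomly generated sentences. The code does generate a sentence, but I'm not sure how random the parse tree is. Is there a better or more efficient way to improve this code? (I'm new to programming and to Python itself, and have recently become interested in NLP. Any advice, solutions, or corrections are welcome.)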

from random import choice

N = ['man', 'dog', 'cat', 'telescope', 'park']  # nouns
P = ['in', 'on', 'by', 'with']                  # prepositions
det = ['a', 'an', 'the', 'my']                  # determiners
V = ['saw', 'ate', 'walked']                    # verbs
NP = ['John', 'Mary', 'Bob']                    # noun phrases

PP = choice(NP) + ' ' + choice(P)               # prepositional phrase
# Note: `or` always yields its left operand here, because a non-empty string
# is truthy, so the variant with PP is never chosen.
VP = choice(V) + ' ' + choice(NP) or choice(V) + ' ' + choice(NP) + ' ' + PP  # verb phrase
S = choice(NP) + ' ' + VP                       # sentence
print(S)

score 2 · Accepted Answer

Try NLTK: http://nltk.org/book/ch08.html

import nltk
from random import choice, shuffle

# Sometimes I find that reading the terminals as values into a dict keyed by POS helps.
vocab = {
    'Det': ['a', 'an', 'the', 'my'],
    'N': ['man', 'dog', 'cat', 'telescope', 'park'],
    'V': ['saw', 'ate', 'walked'],
    'P': ['in', 'on', 'by', 'with'],
    'NP': ['John', 'Mary', 'Bob'],
}

# One production per POS, e.g.  V -> 'saw' | 'ate' | 'walked'
vocab2string = [pos + " -> '" + "' | '".join(vocab[pos]) + "'" for pos in vocab]

# Rules are simpler to craft by hand, so I left them as a string.
rules = '''
S -> NP VP
VP -> V NP
VP -> V NP PP
PP -> NP P
NP -> Det N
'''

mygrammar = rules + "\n".join(vocab2string)
# NLTK 3 names; NLTK 2 called these nltk.parse_cfg and parser.nbest_parse.
grammar = nltk.CFG.fromstring(mygrammar)  # load your grammar
parser = nltk.ChartParser(grammar)        # load the grammar into a parser

# Randomly select one terminal from each POS, based on the infinite monkey theorem,
# i.e. selection of words without grammatical order,
# see https://en.wikipedia.org/wiki/Infinite_monkey_theorem
words = [choice(vocab[pos]) for pos in vocab if pos != 'P']  # without a PP
# With a PP you need 3 NPs, so add one more; uncomment to use this variant instead.
# words = [choice(vocab[pos]) for pos in vocab] + [choice(vocab['NP'])]

# Keep shuffling until the word order parses, so that you always end up
# with a grammatical sentence.
trees = []
while not trees:
    shuffle(words)
    trees = list(parser.parse(words))

for t in trees:
    print(t)
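
If you would rather not shuffle and retry until something parses, another option is to expand the grammar top-down and record the tree as you go, so every generated sentence is grammatical by construction. Below is a minimal sketch in plain Python (no NLTK), using the same rules and vocabulary as above. The names GRAMMAR, generate, and leaves are made up for illustration, and the tree is a nested tuple rather than an nltk.Tree object.

```python
import random

# Same rules and vocabulary as above: nonterminal -> list of possible expansions.
# Each expansion is a list of symbols; a symbol that is not a key is a terminal word.
GRAMMAR = {
    'S': [['NP', 'VP']],
    'VP': [['V', 'NP'], ['V', 'NP', 'PP']],
    'PP': [['NP', 'P']],
    'NP': [['Det', 'N'], ['John'], ['Mary'], ['Bob']],
    'Det': [['a'], ['an'], ['the'], ['my']],
    'N': [['man'], ['dog'], ['cat'], ['telescope'], ['park']],
    'V': [['saw'], ['ate'], ['walked']],
    'P': [['in'], ['on'], ['by'], ['with']],
}


def generate(symbol='S', rng=random):
    """Expand `symbol` with randomly chosen rules.

    Returns a word (str) for a terminal, or a tuple (label, child, ...) otherwise.
    """
    if symbol not in GRAMMAR:
        return symbol  # terminal word
    expansion = rng.choice(GRAMMAR[symbol])
    return (symbol,) + tuple(generate(s, rng) for s in expansion)


def leaves(tree):
    """Collect the words of a tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:
        words.extend(leaves(child))
    return words


tree = generate()
print(tree)
print(' '.join(leaves(tree)))
```

How random the tree is depends on how the expansions are listed: with a uniform choice per nonterminal, VP takes the PP variant half the time, while NP becomes Det N only one time in four because each proper name counts as its own expansion. This grammar has no recursive rules, so the expansion always terminates.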
