python - How to extract significant keywords from text containing numbers

Translated from: https://stackoverflow.com/questions/17158184 2013-06-17T22:56:35.467

Viewed 171 times

I am extracting keywords from a text file using stemming and NLP methods.

This gave me the following output keywords:

keywords = ['the lounge lizards', 'jazz', 'john lurie', 'musical', 'albums', 'bass guitar', 'drums', 'edit', 'erik satie', 'erik sanko']

import re
# Next, get the significant keywords that contain numbers.
# I applied this regex to the same text (content):
re.findall(r'\w+\s\d+.*?\s\w+', content)

numeric_keywords = ['in 1978 by', 'History\n2 Past', 'members\n3 Discography', 'albums\n3.2 Live', 'June 4th, 1979', 'October 7,1986): "The Lounge', 'In 1984 the', 'early 1990s; prominent']
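Entries such as 'History\n2 Past' appear because `\s` also matches a newline, so the pattern pairs a heading word with the section number on the following line. The check below is a minimal sketch on a made-up `sample` string (an assumption, not the original file). It compares the original pattern with a variant that uses `[^\S\n]`, meaning whitespace other than a newline.

```python
import re

# Hypothetical sample: a table-of-contents line followed by prose.
sample = "History\n2 Past members\nThe band formed in 1978 by John Lurie."

# Original pattern: \s may match "\n", so matches can span two lines.
matches = re.findall(r'\w+\s\d+.*?\s\w+', sample)

# Variant: [^\S\n] is "whitespace except newline", keeping matches on one line.
same_line = re.findall(r'\w+[^\S\n]\d+.*?[^\S\n]\w+', sample)

print(matches)    # ['History\n2 Past', 'in 1978 by']
print(same_line)  # ['in 1978 by']
```

Restricting the whitespace this way removes the table-of-contents noise but still returns the surrounding words ("in", "by") along with the number.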

Is there a better way to extract the numeric values? Both outputs come from the same file.
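One possible direction, offered as a sketch and not a definitive answer, is to match the numeric forms of interest directly (full dates, then standalone years and decades) instead of "word, number, word". The helper name `extract_numeric_keywords`, the 1500-2099 year range, the English month list, and the `sample` text are all assumptions made for illustration.

```python
import re

# Assumption: dates are written in English as "Month D, YYYY".
MONTHS = (r'(?:January|February|March|April|May|June|July|August|'
          r'September|October|November|December)')
DATE_RE = re.compile(r'\b' + MONTHS + r'\s+\d{1,2}(?:st|nd|rd|th)?,\s*\d{4}\b')

# Assumption: a "year" is 1500-2099, optionally a decade such as "1990s".
YEAR_RE = re.compile(r'\b(?:1[5-9]\d{2}|20\d{2})s?\b')

def extract_numeric_keywords(text):
    """Return full dates and standalone years found in text."""
    dates = DATE_RE.findall(text)
    # Blank out the dates first so their years are not counted twice.
    remainder = DATE_RE.sub(' ', text)
    years = YEAR_RE.findall(remainder)
    return {'dates': dates, 'years': years}

# Hypothetical sample text, loosely modelled on the output in the question.
sample = ('History\n2 Past members\n3 Discography\n'
          'Formed in 1978 by John Lurie, the band played on June 4th, 1979 '
          'and (October 7,1986) stayed active into the early 1990s.')
result = extract_numeric_keywords(sample)
print(result)
# {'dates': ['June 4th, 1979', 'October 7,1986'], 'years': ['1978', '1990s']}
```

Section numbers such as "2" and "3.2" are ignored because they do not fit either pattern. Other numeric forms, such as quantities or ISO dates, would need patterns of their own.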
