python - Creating a dictionary of word indexes in Python 3

Question

Input: a list of strings ['who are they','are you there?','Yes! you be there']

Output: a dictionary that maps each word in the strings to a set made up of the IDs of all the strings containing that word.

output = {'who':[1], 'are':[1,2], 'they':[1], 'you':[2,3], 'there':[2], 'Yes':[3], 'be':[3]}

I can't work out a method or procedure that does this. Please help.

score 7 · Accepted Answer

Use a collections.defaultdict object to collect the IDs, and enumerate() to generate them.

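from collections import defaultdict

inputlist = ['who are they', 'are you there?', 'Yes! you be there']
output = defaultdict(list)

for index, sentence in enumerate(inputlist):
    for word in sentence.lower().split():
        # strip leading/trailing punctuation from each word
        output[word.strip('!?. ')].append(index)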

Note that the sentence is lowercased and any remaining punctuation is stripped from each word.

Result:

defaultdict(<class 'list'>, {'are': [0, 1], 'they': [0], 'be': [2], 'who': [0], 'yes': [2], 'there': [1, 2], 'you': [1, 2]})

This uses 0-based indexing (like everything in Python). If you need to count from 1, tell enumerate() to start counting from there:

for index, sentence in enumerate(inputlist, 1):
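
The pieces above can be combined into a small helper. This is a minimal sketch, not the answerer's own code: the function name build_word_index and its start parameter are hypothetical. It assumes sets are wanted, as the question describes, so a word repeated within one string is recorded only once.

```python
from collections import defaultdict


def build_word_index(sentences, start=0):
    """Map each lowercased word to the sorted IDs of the sentences containing it."""
    index = defaultdict(set)  # a set ignores a word repeated within one sentence
    for sentence_id, sentence in enumerate(sentences, start):
        for token in sentence.lower().split():
            word = token.strip('!?. ')  # trims leading/trailing punctuation only
            if word:  # skip tokens that were nothing but punctuation
                index[word].add(sentence_id)
    # Convert to a plain dict of sorted lists for stable, readable output.
    return {word: sorted(ids) for word, ids in index.items()}


inputlist = ['who are they', 'are you there?', 'Yes! you be there']
word_index = build_word_index(inputlist, start=1)
print(word_index['are'])    # [1, 2]
print(word_index['there'])  # [2, 3]
```

Returning a plain dict rather than the defaultdict means that looking up a missing word raises KeyError instead of silently creating an empty entry.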

score 0 · Answer

How about this fun solution:

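import string

a = ['who are they', 'are you there?', 'Yes! you be there']
x = {}
# Python 3: str.translate() takes a table built with str.maketrans()
table = str.maketrans('', '', string.punctuation)
for word in ' '.join(a).translate(table).lower().split():
    try:
        x[word] += 1
    except KeyError:
        x[word] = 1
print(x)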

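join() the list of strings into a single string, since we don't care how the words are grouped
translate() to remove the punctuation
lower() to lowercase every character, so that "Yes" and "yes" aren't treated as different words
split() the string into words
try, except, and a bit of code golf to avoid a longer if statement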
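
Note that the loop above counts how often each word occurs across the whole list. Joining the strings first discards which string a word came from. The following is a minimal sketch, assuming the goal is still the index from the question, that keeps the same translate(), lower() and split() steps but processes the strings one at a time. The names remove_punctuation, i and ids are illustrative and not from the answer.

```python
import string

a = ['who are they', 'are you there?', 'Yes! you be there']
# Translation table that deletes every ASCII punctuation character.
remove_punctuation = str.maketrans('', '', string.punctuation)

x = {}
for i, sentence in enumerate(a, 1):  # 1-based IDs, as in the question
    for word in sentence.translate(remove_punctuation).lower().split():
        ids = x.setdefault(word, [])  # create the list on first sight of a word
        if i not in ids:  # record each string at most once per word
            ids.append(i)
print(x)
```

Unlike strip(), translate() also removes punctuation inside a word, so "don't" becomes "dont". Whether that is desirable depends on the input.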
