python - How to create a binary list based on whether list elements are included in another list

Question

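Given two lists of words, dictionary and sentence, I'm trying to create a binary representation based on whether the words of dictionary are included in sentence, such as [1,0,0,0,0,0,1,...,0], where a 1 indicates that the i-th word of the dictionary appears in the sentence.

What is the fastest way to do this?

Sample data: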

dictionary =  ['aardvark', 'apple','eat','I','like','maize','man','to','zebra', 'zed']
sentence = ['I', 'like', 'to', 'eat', 'apples']
result = [0,0,1,1,1,0,0,1,0,0]

Given that I'm working with very large lists of about 56,000 elements, is there anything faster than the following?

x = [int(i in sentence) for i in dictionary]

score 0 · Accepted Answer

Use sets; the total time complexity is O(N):

>>> sentence = ['I', 'like', 'to', 'eat', 'apples']
>>> dictionary =  ['aardvark', 'apple','eat','I','like','maize','man','to','zebra', 'zed']
>>> s= set(sentence)
>>> [int(word in s) for word in dictionary]
[0, 0, 1, 1, 1, 0, 0, 1, 0, 0]
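
The same idea can be wrapped in a small helper. This is only a minimal sketch: the function name binary_inclusion is made up for illustration and is not part of the original answer.

```python
def binary_inclusion(dictionary, sentence):
    """Return a list with 1 where the dictionary word occurs in sentence, else 0."""
    # Build the set once so that each membership test is O(1) on average.
    sentence_words = set(sentence)
    return [int(word in sentence_words) for word in dictionary]


dictionary = ['aardvark', 'apple', 'eat', 'I', 'like', 'maize', 'man', 'to', 'zebra', 'zed']
sentence = ['I', 'like', 'to', 'eat', 'apples']

result = binary_inclusion(dictionary, sentence)
print(result)  # [0, 0, 1, 1, 1, 0, 0, 1, 0, 0]
```

Note that 'apple' is not flagged, because the sentence contains 'apples' and the test is an exact match.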

If the sentence list contains actual sentences rather than single words, try this:

>>> sentences= ["foobar foo", "spam eggs" ,"monty python"]
>>> words=["foo", "oof", "bar", "pyth" ,"spam"]
>>> from itertools import chain

# fetch words from each sentence and create a flattened set of all words
>>> s = set(chain(*(x.split() for x in sentences)))

>>> [int(x in s) for x in words]
[1, 0, 0, 0, 1]
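
As a possible variation on the snippet above, itertools.chain.from_iterable flattens the split sentences without unpacking the generator with *, and a set comprehension gives the same set without itertools. A small sketch on the same sample data (the names s_alt and flags are introduced here for illustration):

```python
from itertools import chain

sentences = ["foobar foo", "spam eggs", "monty python"]
words = ["foo", "oof", "bar", "pyth", "spam"]

# Lazily flatten the per-sentence word lists into one set of whole words.
s = set(chain.from_iterable(x.split() for x in sentences))

# Equivalent set comprehension, without itertools.
s_alt = {word for x in sentences for word in x.split()}

flags = [int(x in s) for x in words]
print(flags)  # [1, 0, 0, 0, 1]
```

Matching is on whole whitespace-separated words, which is why "bar" and "pyth" are not flagged even though they occur as substrings of "foobar" and "python".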

score 0 · Accepted Answer

I'd suggest something like this:

words = set(['hello','there']) #have the words available as a set
sentance = ['hello','monkey','theres','there']
rep = [ 1 if w in words else 0 for w in sentance ]
>>> 
[1, 0, 0, 1]

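I take this approach because sets have O(1) lookup time on average, so checking whether a word w is in the set words takes constant time. The list comprehension is therefore O(n), since each word only has to be visited once. I believe this is close to, or exactly, as efficient as you are going to get.

You also mentioned creating a "boolean" array, in which case you can simply use the following instead: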
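
One way to check the difference on your own machine is a rough timing sketch like the one below. The data is synthetic and the sizes are assumptions chosen so that the slow version still finishes quickly; actual timings will vary, so none are quoted here.

```python
import timeit

# Synthetic data (assumed sizes, not the asker's real lists).
dictionary = [f"word{i}" for i in range(2000)]
sentence = [f"word{i}" for i in range(0, 4000, 2)]  # 2000 words, half of them overlap


def with_list():
    # Each `in` scans the list: O(len(sentence)) per dictionary word.
    return [int(w in sentence) for w in dictionary]


def with_set():
    # One O(len(sentence)) set build, then O(1) average lookups.
    s = set(sentence)
    return [int(w in s) for w in dictionary]


same = with_list() == with_set()
print("same result:", same)
print("list lookup:", timeit.timeit(with_list, number=3))
print("set lookup: ", timeit.timeit(with_set, number=3))
```

Both functions return the same list; only the cost of each membership test differs.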

rep = [ w in words for w in sentance ]
>>> 
[True, False, False, True]
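
Because bool is a subclass of int in Python, the boolean list can still be used numerically if needed. A short sketch using the data from this answer (the name flags is introduced here for illustration):

```python
words = {'hello', 'there'}
sentance = ['hello', 'monkey', 'theres', 'there']  # spelling kept from the answer

flags = [w in words for w in sentance]
print(flags)                    # [True, False, False, True]
print(sum(flags))               # 2, since True counts as 1
print([int(f) for f in flags])  # [1, 0, 0, 1]
```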
