python - How to find all occurrences of an element in a list

Question

index() gives the first occurrence of an item in a list. Is there a neat trick that returns all indices of an element in a list?

score 758 · Accepted Answer

You can use a list comprehension:

indices = [i for i, x in enumerate(my_list) if x == "whatever"]

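The iterator enumerate(my_list) yields pairs (index, item) for each item in the list. Using i, x as the loop variable target unpacks these pairs into the index i and the list item x. We filter down to all x that match our criterion and select the indices i of these elements.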
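As a minimal sketch of the unpacking described above, the comprehension can be wrapped in a small helper. The name `find_all` and the sample data are illustrative assumptions, not part of the original answer.

```python
def find_all(items, target):
    """Return the indices of every element of items equal to target."""
    # enumerate yields (index, item) pairs; "i, x" unpacks each pair
    return [i for i, x in enumerate(items) if x == target]


my_list = ["a", "whatever", "b", "whatever"]
print(find_all(my_list, "whatever"))  # [1, 3]
print(find_all(my_list, "missing"))   # [] (no ValueError, unlike list.index)
```

An empty list is returned when nothing matches, so callers do not need a try/except as they would with index().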

score 153 · Accepted Answer

While not a solution for lists directly, numpy really shines for this sort of thing:

import numpy as np
values = np.array([1,2,3,1,2,4,5,6,3,2,1])
searchval = 3
ii = np.where(values == searchval)[0]

returns:

ii ==>array([2, 8])

For lists (arrays) with a large number of elements, this can be significantly faster than some of the other solutions.

score 37 · Accepted Answer

A solution using list.index:

def indices(lst, element):
    result = []
    offset = -1
    while True:
        try:
            offset = lst.index(element, offset+1)
        except ValueError:
            return result
        result.append(offset)

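For large lists, this is much faster than the list comprehension with enumerate. It is also much slower than the numpy solution if you already have the array; otherwise, the cost of converting outweighs the speed gain (tested on integer lists with 100, 1000 and 10000 elements).

NOTE: A note of caution based on Chris_Rands' comment: this solution is faster than the list comprehension if the results are sufficiently sparse, but if the list has many instances of the element being searched for (more than ~15% of the list, in a test with a list of 1000 integers), the list comprehension is faster.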
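The sparse-versus-dense trade-off described in the note can be checked with a rough timing sketch like the one below. The helper names, the test data (1% and 50% matches in 1000 integers), and the repetition count are assumptions chosen for illustration; actual timings depend on the machine and Python version, so no numbers are claimed here.

```python
import timeit


def indices_index(lst, element):
    """All positions of element, found by repeated list.index calls."""
    result = []
    offset = -1
    while True:
        try:
            offset = lst.index(element, offset + 1)
        except ValueError:
            return result
        result.append(offset)


def indices_comprehension(lst, element):
    """All positions of element, found with enumerate."""
    return [i for i, x in enumerate(lst) if x == element]


# Hypothetical test data: 1000 integers, target value 1.
sparse = [1 if i % 100 == 0 else 0 for i in range(1000)]  # 1% matches
dense = [1 if i % 2 == 0 else 0 for i in range(1000)]     # 50% matches

for name, data in (("sparse", sparse), ("dense", dense)):
    # Both approaches must agree before their timings are compared.
    assert indices_index(data, 1) == indices_comprehension(data, 1)
    t_index = timeit.timeit(lambda: indices_index(data, 1), number=200)
    t_comp = timeit.timeit(lambda: indices_comprehension(data, 1), number=200)
    print(f"{name}: list.index {t_index:.4f}s, comprehension {t_comp:.4f}s")
```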

score 24 · Accepted Answer

How about:

In [1]: l=[1,2,3,4,3,2,5,6,7]

In [2]: [i for i,val in enumerate(l) if val==3]
Out[2]: [2, 4]

score 10 · Accepted Answer

occurrences = lambda s, lst: (i for i,e in enumerate(lst) if e == s)
list(occurrences(1, [1,2,3,1])) # = [0, 3]

score 6 · Accepted Answer

Or use range (python 3):

l=[i for i in range(len(lst)) if lst[i]=='something...']

For (python 2):

l=[i for i in xrange(len(lst)) if lst[i]=='something...']

And then (both cases):

print(l)

The output is as expected.

score 5 · Accepted Answer

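- There is an answer using np.where to find the indices of a single value, which is not faster than a list comprehension if the time to convert the list to an array is included.
- The overhead of importing numpy and converting a list to a numpy.array probably makes numpy a less efficient option in most circumstances. A careful timing analysis would be necessary.
  - However, when multiple functions/operations need to be performed on the list, converting the list to an array and then using numpy functions will likely be the faster option.
- This solution uses np.where and np.unique to find the indices of all unique elements in a list.
  - Using np.where on an array (including the time to convert the list to an array) is slightly faster than a list comprehension on a list for finding all indices of all unique elements.
  - This has been tested on a 2M element list with 4 unique values; the size of the list/array and the number of unique elements will have an impact.
- Other solutions using numpy on an array can be found in "Get a list of all indices of repeated elements in a numpy array".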

import numpy as np
import random  # to create test list

# create sample list
random.seed(365)
l = [random.choice(['s1', 's2', 's3', 's4']) for _ in range(20)]

# convert the list to an array for use with these numpy methods
a = np.array(l)

# create a dict of each unique entry and the associated indices
idx = {v: np.where(a == v)[0].tolist() for v in np.unique(a)}

# print(idx)
{'s1': [7, 9, 10, 11, 17],
 's2': [1, 3, 6, 8, 14, 18, 19],
 's3': [0, 2, 13, 16],
 's4': [4, 5, 12, 15]}

`%timeit`

# create 2M element list
random.seed(365)
l = [random.choice(['s1', 's2', 's3', 's4']) for _ in range(2000000)]

Find the indices of one value

Find the indices of a single element in a 2M element list with 4 unique elements

# np.where: convert list to array
%%timeit
a = np.array(l)
np.where(a == 's1')
[out]:
409 ms ± 41.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# list-comprehension: on list l
%timeit [i for i, x in enumerate(l) if x == "s1"]
[out]:
201 ms ± 24 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# filter: on list l
%timeit list(filter(lambda i: l[i]=="s1", range(len(l))))
[out]:
344 ms ± 36.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Find the indices of all values

Find the indices of all unique elements in a 2M element list with 4 unique elements

# use np.where and np.unique: convert list to array
%%timeit
a = np.array(l)
{v: np.where(a == v)[0].tolist() for v in np.unique(a)}
[out]:
682 ms ± 28 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# list comprehension inside dict comprehension: on list l
%timeit {req_word: [idx for idx, word in enumerate(l) if word == req_word] for req_word in set(l)}
[out]:
713 ms ± 16.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
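Both timed approaches above scan the data once per unique value. A possible alternative, which is not from the original answer and was not part of its benchmark, is a single pass that appends each position to a per-value list. The sketch below uses `collections.defaultdict`; the function name `all_indices` and the sample list are assumptions, and the values must be hashable.

```python
from collections import defaultdict


def all_indices(items):
    """Map each distinct value to the list of positions where it occurs."""
    positions = defaultdict(list)
    for i, value in enumerate(items):
        positions[value].append(i)  # one pass over the list
    return dict(positions)


sample = ["s1", "s2", "s1", "s3", "s2", "s1"]
print(all_indices(sample))
# {'s1': [0, 2, 5], 's2': [1, 4], 's3': [3]}
```

Whether this beats the np.where version would need to be measured on the data at hand.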

score 4 · Accepted Answer

One more solution for all occurrences (sorry if it is a duplicate):

values = [1,2,3,1,2,4,5,6,3,2,1]
map(lambda val: (val, [i for i in xrange(len(values)) if values[i] == val]), values)
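The snippet above is written for Python 2. A hedged sketch of a Python 3 equivalent follows, assuming the same `values` list: `range` replaces `xrange`, and because `map` is lazy in Python 3 the result is wrapped in `list()`. The name `pairs` is introduced here for illustration.

```python
values = [1, 2, 3, 1, 2, 4, 5, 6, 3, 2, 1]

# Python 3: range replaces xrange, and map is lazy, so wrap it in list().
pairs = list(map(
    lambda val: (val, [i for i in range(len(values)) if values[i] == val]),
    values,
))

print(pairs[0])    # (1, [0, 3, 10])
print(len(pairs))  # 11, one pair per element, so repeated values repeat
```

Note that this produces one (value, indices) pair per list element, so a value that occurs three times appears in three identical pairs.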

score 4 · Accepted Answer

Using filter() in python2:

>>> q = ['Yeehaw', 'Yeehaw', 'Googol', 'B9', 'Googol', 'NSM', 'B9', 'NSM', 'Dont Ask', 'Googol']
>>> filter(lambda i: q[i]=="Googol", range(len(q)))
[2, 4, 9]
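In Python 3, filter() returns a lazy iterator instead of a list, so the same approach needs an explicit list() call. A minimal sketch, reusing the list `q` from above (the name `matches` is an assumption):

```python
q = ['Yeehaw', 'Yeehaw', 'Googol', 'B9', 'Googol',
     'NSM', 'B9', 'NSM', 'Dont Ask', 'Googol']

# In Python 3, filter() returns a lazy iterator rather than a list,
# so it is wrapped in list() to get the same result as in Python 2.
matches = list(filter(lambda i: q[i] == "Googol", range(len(q))))
print(matches)  # [2, 4, 9]
```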

score 4 · Accepted Answer

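Getting all the occurrences and the positions of one or more (identical) items in a list

With enumerate(alist) you can store the first element (n), which is the index into the list, whenever the element x is equal to what you are looking for.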

>>> alist = ['foo', 'spam', 'egg', 'foo']
>>> foo_indexes = [n for n,x in enumerate(alist) if x=='foo']
>>> foo_indexes
[0, 3]
>>>

Let's make our function findindex

This function takes the item and the list as arguments and returns the positions of the item in the list, as we saw before.

def indexlist(item2find, list_or_string):
  "Returns all indexes of an item in a list or a string"
  return [n for n,item in enumerate(list_or_string) if item==item2find]

print(indexlist("1", "010101010"))

Output

[1, 3, 5, 7]

Simple

for n, i in enumerate([1, 2, 3, 4, 1]):
    if i == 1:
        print(n)

Output:

0
4

score -1 · Accepted Answer

Here is a time performance comparison between using np.where and a list comprehension. np.where seems to be faster on average.

# imports needed by both timing loops
import time
import numpy as np

# np.where
start_times = []
end_times = []
for i in range(10000):
    start = time.time()
    start_times.append(start)
    temp_list = np.array([1,2,3,3,5])
    ixs = np.where(temp_list==3)[0].tolist()
    end = time.time()
    end_times.append(end)
print("Took on average {} seconds".format(
    np.mean(end_times)-np.mean(start_times)))

Took on average 3.81469726562e-06 seconds

# list_comprehension
start_times = []
end_times = []
for i in range(10000):
    start = time.time()
    start_times.append(start)
    temp_list = np.array([1,2,3,3,5])
    ixs = [i for i in range(len(temp_list)) if temp_list[i]==3]
    end = time.time()
    end_times.append(end)
print("Took on average {} seconds".format(
    np.mean(end_times)-np.mean(start_times)))

Took on average 4.05311584473e-06 seconds
