python - Pythonでの「in」および「for」操作の効率

Question

ここには、A と B という 2 つのリストがあります。

len(A) > len(B)

何かを行うには、次の 2 つの方法があります。

方法 1:

def f():
    return [someFunc(i) for i in A if i in B]

方法 2:

def f():
    return [someFunc(i) for i in B if i in A]

どちらがより効率的で、その理由は何ですか?

score 4 · Accepted Answer

操作の時間計算量はであり、in操作O(n)全体はですがO(nm)、これは主に問題のサイズに応じてどのようにスケーリングするかに当てはまります。Aパフォーマンスに関しては、とBが相互に排他的であるという最悪のシナリオを除けば、一致が見つかると反復が停止するため、 for B if i in A（ここでlen(A) > len(B)）実行する方が高速である必要があります。in

のすべてのエントリが同じ値Aを持つベストケースのシナリオを考えてみましょう。操作は効果的になりB、操作全体になります。inO(1)O(m)

そして、みんなのお気に入り、いくつかのベンチマーク：

$ python -m timeit "A=list(range(100000));B=list(range(100))" "[i for i in A if i in B]"
10 loops, best of 3: 113 msec per loop

$ python -m timeit "A=list(range(100000));B=list(range(100))" "[i for i in B if i in A]"
100 loops, best of 3: 2.6 msec per loop

Performance aside, do note that the two functions you provided do not do the same thing. The first iterates through A and discards values that do not appear in B and this is not the same as iterating through B and discarding values that do not appear in A. Going back to the scenario where all values in both list are the same, the first function will return len(A) elements while the second len(B) elements.

score 4 · Accepted Answer

他のリストの各要素に対して各リストを反復処理しているため、両方とも O(mn) です。

score 3 · Accepted Answer

def f(A, B):
    return [someFunc(_) for _ in set(A).intersection(B)]

これを行う最も効率的な方法である必要があります（少なくとも、リストが時間の複雑さが問題になるのに十分な長さである場合）

python - Pythonでの「in」および「for」操作の効率

3 に答える 3

Related

Reference