python - Why is this lambda containing a list.index() call so slow?

Question

Using cProfile:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   17.834   17.834 <string>:1(<module>)
        1    0.007    0.007   17.834   17.834 basher.py:5551(_refresh)
        1    0.000    0.000   10.522   10.522 basher.py:1826(RefreshUI)
        4    0.024    0.006   10.517    2.629 basher.py:961(PopulateItems)
      211    1.494    0.007    7.488    0.035 basher.py:1849(PopulateItem)
      231    0.074    0.000    6.734    0.029 {method 'sort' of 'list' objects}
      215    0.002    0.000    6.688    0.031 bosh.py:4764(getOrdered)
     1910    3.039    0.002    6.648    0.003 bosh.py:4770(<lambda>)
      253    0.178    0.001    5.600    0.022 bosh.py:3325(getStatus)
        1    0.000    0.000    5.508    5.508 bosh.py:4327(refresh)
     1911    3.051    0.002    3.330    0.002 {method 'index' of 'list' objects}

The line 1910 3.039 0.002 6.648 0.003 bosh.py:4770(<lambda>) puzzles me. At bosh.py:4770 I have modNames.sort(key=lambda a: (a in data) and data.index(a)), where data and modNames are both lists. Note that the line 1911 3.051 0.002 3.330 0.002 {method 'index' of 'list' objects} seems to come from this same statement.

So why is this so slow? Is there a way to rewrite this sort() so that it runs faster?

EDIT: the final ingredient I was missing to understand this lambda:

>>> True and 3
3
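
To make the short-circuit behaviour concrete, here is a minimal sketch of what the sort key evaluates to. The sample list and the helper name key are made up for illustration; only the expression itself comes from the question.

```python
# Hypothetical sample data; the real list lives in bosh.py.
data = ["a.esp", "b.esp", "c.esp"]

def key(a):
    # `x and y` returns x if x is falsy, otherwise y.
    # So this yields False when a is missing, else its index in data.
    return (a in data) and data.index(a)

print(key("b.esp"))    # 1
print(key("zzz.esp"))  # False
print(key("a.esp"))    # 0
```

Because False == 0 in Python, names that are missing from data compare equal to the name at index 0, and the stable sort keeps them in their original relative order.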

score 4 · Accepted Answer

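As YardGlassOfCode stated, it is not the lambda per se that is slow; it is the O(n) operations inside the lambda. Both a in data and data.index(a) are O(n) operations, where n is the length of data. As an additional affront to efficiency, the call to index repeats much of the work already done by a in data. If the items in data are hashable, you can speed this up considerably by preparing a dict first: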

weight = dict(zip(data, range(len(data))))
modNames.sort(key=weight.get)  # Python2, or
modNames.sort(key=lambda a: weight.get(a, -1))  # works in Python3

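Since each dict lookup is an O(1) operation on average, computing all the sort keys takes time proportional to len(modNames) rather than len(modNames) * len(data), so this is much faster.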
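
As a rough illustration of the dict-based approach, the following sketch wraps it in a small function. The function name sort_by_position and the sample file names are assumptions made for this example, not part of the original code.

```python
def sort_by_position(names, order):
    """Sort names in place by position in order; unknown names sort first."""
    # One O(n) pass builds the item -> index map.
    weight = {item: i for i, item in enumerate(order)}
    # Each key lookup is now a dict access instead of a list scan.
    names.sort(key=lambda a: weight.get(a, -1))
    return names

# Hypothetical sample data.
data = ["c.esp", "a.esp", "b.esp"]
mod_names = ["b.esp", "x.esp", "a.esp", "c.esp"]
print(sort_by_position(mod_names, data))
# ['x.esp', 'c.esp', 'a.esp', 'b.esp']
```

One caveat: if order contains duplicates, the dict keeps the last index of each item, whereas list.index returns the first, so the two approaches can order such items differently.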

Note that modNames.sort(key=weight.get) relies on None comparing as less than any integer, which holds in Python2:

In [39]: None < 0
Out[39]: True

In Python3, None < 0 raises a TypeError, so lambda a: weight.get(a, -1) is used instead: it returns -1 when a is not in weight.
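
If the exact ordering of the original key matters, note that a default of -1 places missing names before everything, while the original lambda returns False, which ties with index 0. The sketch below is one possible way to reproduce the original ordering under Python3; the helper first_index_map and the sample data are hypothetical.

```python
def first_index_map(order):
    """Map each item to the index of its first occurrence, like list.index."""
    weight = {}
    for i, item in enumerate(order):
        weight.setdefault(item, i)  # later duplicates do not overwrite
    return weight

# Hypothetical sample data, with a duplicate entry in data.
data = ["c.esp", "a.esp", "c.esp", "b.esp"]
mod_names = ["b.esp", "x.esp", "a.esp", "c.esp"]

# Original O(n)-per-key version from the question.
slow = sorted(mod_names, key=lambda a: (a in data) and data.index(a))

# Dict-based version; default 0 mirrors the False returned for missing names.
weight = first_index_map(data)
fast = sorted(mod_names, key=lambda a: weight.get(a, 0))

print(slow == fast)  # True
```

Whether missing names should tie with the first entry or sort ahead of it is a design choice; the -1 default is the simpler option when that distinction is not important.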
