python - When is `starmap` preferable to a `List Comprehension`?

Question

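While answering the question "Clunky calculation of differences between an incrementing set of numbers, is there a more beautiful way?", I came up with two solutions: one using a List Comprehension and the other using itertools.starmap.

To me, list comprehension syntax looks clearer, more readable, less verbose and more Pythonic. Still, since starmap is readily available in itertools, I figured there has to be a reason for it.

My question is: when is starmap preferable to a List Comprehension?

Note: if it is a matter of style, then it definitely contradicts "There should be one-- and preferably only one --obvious way to do it."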

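Head-to-head comparison

Readability counts. --- LC

This is again a matter of perception, but to me the LC is more readable than starmap. To use starmap, you need to either import operator, or define a lambda or some explicit multi-variable function, and on top of that you still need an extra import from itertools.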
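
To make the comparison concrete, here is a minimal Python 3 sketch of the spellings being compared. The sample list `nums` is made up for illustration, and the built-in `zip` stands in for Python 2's `izip`:

```python
from itertools import starmap
from operator import sub

nums = [1, 4, 9, 16, 25]  # hypothetical sample data

# starmap with a function imported from operator
deltas_operator = list(starmap(sub, zip(nums[1:], nums)))

# starmap with a two-argument lambda instead of operator.sub
deltas_lambda = list(starmap(lambda x, y: x - y, zip(nums[1:], nums)))

# list comprehension: no imports needed at all
deltas_lc = [x - y for x, y in zip(nums[1:], nums)]

print(deltas_operator, deltas_lambda, deltas_lc)
```

All three produce the same list of consecutive differences; they differ only in how many imports and helper definitions the reader has to take in.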

Performance --- LC

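>>> import random
>>> from timeit import Timer
>>> from itertools import starmap, izip
>>> from operator import sub
>>> def using_star_map(nums):
...     delta=starmap(sub,izip(nums[1:],nums))
...     return sum(delta)/float(len(nums)-1)
...
>>> def using_LC(nums):
...     delta=(x-y for x,y in izip(nums[1:],nums))
...     return sum(delta)/float(len(nums)-1)
...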
>>> nums=[random.randint(1,10) for _ in range(100000)]
>>> t1=Timer(stmt='using_star_map(nums)',setup='from __main__ import nums,using_star_map;from itertools import starmap,izip')
>>> t2=Timer(stmt='using_LC(nums)',setup='from __main__ import nums,using_LC;from itertools import izip')
>>> print "%.2f usec/pass" % (1000000 * t1.timeit(number=1000)/100000)
235.03 usec/pass
>>> print "%.2f usec/pass" % (1000000 * t2.timeit(number=1000)/100000)
181.87 usec/pass

score 13 · Accepted Answer

The difference I normally see is map()/starmap() are most appropriate where you are literally just calling a function on every item in a list. In this case, they are a little clearer:

(f(x) for x in y)
map(f, y) # itertools.imap(f, y) in 2.x

(f(*x) for x in y)
starmap(f, y)

As soon as you start needing to throw in lambda or filter as well, you should switch to the list comprehension/generator expression, but in cases where it's a single function, the syntax feels very verbose for a generator expression or list comprehension.

They are interchangeable, and when in doubt, stick to the generator expression as it's more readable in general, but in a simple case (map(int, strings), starmap(Vector, points)) using map()/starmap() can sometimes make things easier to read.

Example:

An example where I think starmap() is more readable:

from collections import namedtuple
from itertools import starmap

points = [(10, 20), (20, 10), (0, 0), (20, 20)]

Vector = namedtuple("Vector", ["x", "y"])

for vector in (Vector(*point) for point in points):
    ...

for vector in starmap(Vector, points):
    ...

And for map():

values = ["10", "20", "0"]

for number in (int(x) for x in values):
    ...

for number in map(int, values):
    ...

Performance:

python -m timeit -s "from itertools import starmap" -s "from operator import sub" -s "numbers = zip(range(100000), range(100000))" "sum(starmap(sub, numbers))"                         
1000000 loops, best of 3: 0.258 usec per loop

python -m timeit -s "numbers = zip(range(100000), range(100000))" "sum(x-y for x, y in numbers)"                          
1000000 loops, best of 3: 0.446 usec per loop

For constructing a namedtuple:

python -m timeit -s "from itertools import starmap" -s "from collections import namedtuple" -s "numbers = zip(range(100000), reversed(range(100000)))" -s "Vector = namedtuple('Vector', ['x', 'y'])" "list(starmap(Vector, numbers))"
1000000 loops, best of 3: 0.98 usec per loop

python -m timeit -s "from collections import namedtuple" -s "numbers = zip(range(100000), reversed(range(100000)))" -s "Vector = namedtuple('Vector', ['x', 'y'])" "[Vector(*pos) for pos in numbers]"
1000000 loops, best of 3: 0.375 usec per loop

In my tests, where we are talking about using simple functions (no lambda), starmap() is faster than the equivalent generator expression. Naturally, performance should take a back-seat to readability unless it's a proven bottleneck.

Example of how lambda kills any performance gain, same example as in the first set, but with lambda instead of operator.sub():

python -m timeit -s "from itertools import starmap" -s "numbers = zip(range(100000), range(100000))" "sum(starmap(lambda x, y: x-y, numbers))" 
1000000 loops, best of 3: 0.546 usec per loop
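
One caveat about the timings above, assuming they were taken on Python 3 (the answer does not say which interpreter was used): there, `zip()` returns a one-shot iterator, so a `numbers` object built in the `-s` setup is exhausted by the first loop, and every later loop would be timing an empty iterator. A hedged sketch of a fairer measurement materializes the pairs into a list first. The helper names and the repeat count are illustrative choices, not part of the original answer:

```python
from itertools import starmap
from operator import sub
from timeit import timeit

# Materialize the pairs so that every timed run sees the full data set.
numbers = list(zip(range(100000), range(100000)))

def with_starmap():
    return sum(starmap(sub, numbers))

def with_genexp():
    return sum(x - y for x, y in numbers)

def with_lambda():
    return sum(starmap(lambda x, y: x - y, numbers))

for func in (with_starmap, with_genexp, with_lambda):
    # number=20 is an arbitrary choice that keeps the run short
    print(func.__name__, timeit(func, number=20))
```

No expected timings are given here, since they depend on the machine and the interpreter version.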

score 3 · Accepted Answer

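It's mostly a style thing. Choose whichever you find more readable.

In relation to "There's only one way to do it", Sven Marnach kindly provides this Guido quote:

"You may think this violates TOOWTDI, but as I've said before, that was a white lie (as well as a cheeky response to Perl's slogan around 2000). Being able to express intent (to human readers) often requires choosing between multiple forms that do essentially the same thing, but look different to the reader."

In a performance hotspot, you might want to choose the solution that runs fastest (which I guess in this case would be the starmap-based one).

On performance: starmap is slower because of its destructuring. However, starmap isn't needed here: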

from timeit import Timer
import random
from itertools import starmap, izip,imap
from operator import sub

def using_imap(nums):
    delta=imap(sub,nums[1:],nums[:-1])
    return sum(delta)/float(len(nums)-1)

def using_LC(nums):
    delta=(x-y for x,y in izip(nums[1:],nums))
    return sum(delta)/float(len(nums)-1)

nums=[random.randint(1,10) for _ in range(100000)]
t1=Timer(stmt='using_imap(nums)',setup='from __main__ import nums,using_imap')
t2=Timer(stmt='using_LC(nums)',setup='from __main__ import nums,using_LC')

On my computer:

>>> print "%.2f usec/pass" % (1000000 * t1.timeit(number=1000)/100000)
172.86 usec/pass
>>> print "%.2f usec/pass" % (1000000 * t2.timeit(number=1000)/100000)
178.62 usec/pass

imap comes out a tiny bit faster, probably because it avoids the zipping/destructuring.
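
For readers on Python 3, where `imap` and `izip` no longer exist, the same idea can be sketched with the built-in `map`, which accepts several iterables and passes one element from each to the function. The function name `average_delta` and the sample data are illustrative only:

```python
from operator import sub

def average_delta(nums):
    # map() pulls one item from each slice per call: sub(nums[i + 1], nums[i])
    deltas = map(sub, nums[1:], nums[:-1])
    return sum(deltas) / (len(nums) - 1)

nums = [1, 4, 9, 16, 25]  # hypothetical sample data
print(average_delta(nums))
```

Because `map` takes the two slices as separate arguments, no tuples are built and unpacked along the way, which is the step the answer suspects of costing time.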

score 2 · Accepted Answer

About starmap.. Let's say you have L = [(0,1,2),(3,4,5),(6,7,8),..].

A generator comprehension would look like

(f(a,b,c) for a,b,c in L)

or

(f(*item) for item in L)

and starmap would look like

starmap(f, L)

The third variant is lighter and shorter. But the first one is very obvious, and it doesn't force me to think about what it does.
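
As a quick check that the three spellings really are interchangeable, here is a small Python 3 sketch. The function `f` is a hypothetical stand-in for any three-argument callable:

```python
from itertools import starmap

L = [(0, 1, 2), (3, 4, 5), (6, 7, 8)]

def f(a, b, c):
    # hypothetical stand-in for any three-argument function
    return a + b * c

explicit = list(f(a, b, c) for a, b, c in L)  # names each argument
unpacked = list(f(*item) for item in L)       # unpacks each tuple with *
mapped = list(starmap(f, L))                  # starmap does the unpacking

print(explicit == unpacked == mapped)
```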

Ok. Now I want to write more complicated inline code..

some_result = starmap(f_res, [starmap(f1,L1), starmap(f2,L2), starmap(f3,L3)])

This line is not obvious, but it is still easy to understand.. As a generator comprehension it would look like:

some_result = (f_res(a,b,c) for a,b,c in [(f1(a,b,c) for a,b,c in L1), (f2(a,b,c) for a,b,c in L2), (f3(a,b,c) for a,b,c in L3)])

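As you can see, it is long, hard to understand, and cannot be placed on one line, because it exceeds 79 characters (PEP 8). Even the shorter variant is bad:

some_result = (f_res(*item) for item in [(f1(*item) for item in L1), (f2(*item) for item in L2), (f3(*item) for item in L3)])

Too many characters.. Too many brackets.. Too much noise.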

So. In some cases starmap is a very useful tool. With it you can write less code that is simpler to understand.
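
To see what the nested form actually computes, here is a runnable Python 3 sketch. The functions `f1`, `f2`, `f3`, `f_res` and the lists `L1`, `L2`, `L3` are hypothetical placeholders; each list holds exactly three tuples so that each inner starmap yields the three values `f_res` expects:

```python
from itertools import starmap

# Hypothetical inputs: three 3-tuples per list, so each inner starmap
# yields exactly three values for the outer starmap to unpack into f_res.
L1 = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
L2 = [(1, 1, 1), (2, 2, 2), (3, 3, 3)]
L3 = [(0, 0, 1), (0, 1, 0), (1, 0, 0)]

def f1(a, b, c):
    return a + b + c

def f2(a, b, c):
    return a * b * c

def f3(a, b, c):
    return max(a, b, c)

def f_res(a, b, c):
    return (a, b, c)

# The nested starmap form from the answer
nested = list(starmap(f_res, [starmap(f1, L1), starmap(f2, L2), starmap(f3, L3)]))

# The same computation spelled out with explicit unpacking
spelled_out = [
    f_res(*(f1(*item) for item in L1)),
    f_res(*(f2(*item) for item in L2)),
    f_res(*(f3(*item) for item in L3)),
]

print(nested == spelled_out)
```

The outer starmap unpacks each inner iterator into the arguments of `f_res`, which works because starmap accepts any iterable as an argument group, not only tuples.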

EDIT: added some dummy tests

from timeit import timeit
print timeit("from itertools import starmap\nL = [(0,1,2),(3,4,5),(6,7,8)]\nt=list((max(a,b,c)for a,b,c in L))")
print timeit("from itertools import starmap\nL = [(0,1,2),(3,4,5),(6,7,8)]\nt=list((max(*item)for item in L))")
print timeit("from itertools import starmap\nL = [(0,1,2),(3,4,5),(6,7,8)]\nt=list(starmap(max,L))")

Output (python 2.7.2)

5.23479851154
5.35265309689
4.48601346328

So starmap is up to ~15% faster here.
