numpy - Swap a subset of multi-values in numpy

Question

Given a starting numpy array that looks like:

B = np.array( [1, 1, 1, 0, 2, 2, 1, 3, 3, 0, 4, 4, 4, 4] )

What is the most efficient way to swap one set of values for another when there are duplicates? For example, let

s1 = [1,2,4]
s2 = [4,1,2]

An inefficient swapping method would iterate through s1 and s2 like so:

B2 = B.copy()
for x,y in zip(s1,s2):
    B2[B==x] = y

Giving as output

B2 -> [4, 4, 4, 0, 1, 1, 4, 3, 3, 0, 2, 2, 2, 2]
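
Note that the loop builds its masks from the original B, not from B2. A minimal sketch of why that matters (the names good and bad are only illustrative): if the masks come from the array being modified, values written in an earlier iteration can be replaced again later.

```python
import numpy as np

B = np.array([1, 1, 1, 0, 2, 2, 1, 3, 3, 0, 4, 4, 4, 4])
s1 = [1, 2, 4]
s2 = [4, 1, 2]

# Masks taken from the untouched original: each element is replaced at most once.
good = B.copy()
for x, y in zip(s1, s2):
    good[B == x] = y

# Masks taken from the array being modified: the 4s written in the first
# iteration are turned into 2s by the last one.
bad = B.copy()
for x, y in zip(s1, s2):
    bad[bad == x] = y

print(good)
print(bad)
```

This is the reason a naive fully in-place loop does not give a true swap.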

Is there a way to do this essentially in-place without the zip loop?

score 2 · Accepted Answer

>>> B = np.array( [1, 1, 1, 0, 2, 2, 1, 3, 3, 0, 4, 4, 4, 4] )
>>> s1 = [1,2,4]
>>> s2 = [4,1,2]
>>> B2 = B.copy()
>>> c, d = np.where(B == np.array(s1)[:,np.newaxis])
>>> B2[d] = np.repeat(s2,np.bincount(c))
>>> B2
array([4, 4, 4, 0, 1, 1, 4, 3, 3, 0, 2, 2, 2, 2])
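
A step-by-step sketch of what the session above does may help. It is a variant, not the exact code from the answer: it indexes s2 by the row indices c instead of using np.repeat with np.bincount, and the name matches is only illustrative. As far as I can tell, the two forms agree whenever every value in s1 occurs in B; if trailing values of s1 are missing from B, np.bincount(c) comes out shorter than s2 unless minlength=len(s1) is passed.

```python
import numpy as np

B = np.array([1, 1, 1, 0, 2, 2, 1, 3, 3, 0, 4, 4, 4, 4])
s1 = np.array([1, 2, 4])
s2 = np.array([4, 1, 2])

# Compare every element of B with every value in s1 via broadcasting.
# The result is a boolean matrix of shape (len(s1), len(B)).
matches = B == s1[:, np.newaxis]

# c[k] is the index into s1 that matched, d[k] is the matching position in B.
c, d = np.where(matches)

B2 = B.copy()
# Each matched position receives the replacement paired with its s1 value.
B2[d] = s2[c]

print(B2)
```

The boolean matrix needs len(s1) * len(B) entries, so this approach costs more memory as s1 grows.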

score 1 · Accepted Answer

If you only have integers between 0 and n (generalizing to any integer range is no problem unless the values are very sparse), the most efficient way is to use take/fancy indexing:

swap = np.arange(B.max() + 1)  # identity lookup table covering 0..B.max()
swap[s1] = s2                  # overwrite the entries for the values to be replaced

B2 = swap.take(B)              # or swap[B]

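This seems to be almost twice as fast for the small B given here, and the speedup grows with larger B: repeating B to a length of about 100000 already gives 8x. It also avoids the == operation for every element of s1, so it scales much better as s1/s2 get large.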
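
The generalization to an arbitrary integer range could look roughly like the sketch below. The helper name swap_values and the offset handling are my own assumptions, not part of the original answer; it assumes B is a non-empty integer array and that the range B.min()..B.max() is small enough for a dense table.

```python
import numpy as np

def swap_values(B, s1, s2):
    """Replace each s1[i] in B with s2[i] via a lookup table (hypothetical helper)."""
    B = np.asarray(B)
    s1 = np.asarray(s1)
    s2 = np.asarray(s2)
    lo, hi = B.min(), B.max()
    # Identity table for every value in [lo, hi]; index i stands for value lo + i.
    table = np.arange(lo, hi + 1)
    # Values of s1 outside [lo, hi] cannot occur in B, so skip them.
    in_range = (s1 >= lo) & (s1 <= hi)
    table[s1[in_range] - lo] = s2[in_range]
    # Shift B into table coordinates and look up the replacements.
    return table[B - lo]

B = np.array([1, 1, 1, 0, 2, 2, 1, 3, 3, 0, 4, 4, 4, 4])
print(swap_values(B, [1, 2, 4], [4, 1, 2]))
print(swap_values([-2, 5, -2, 0], [-2, 5], [5, -2]))
```

The table has hi - lo + 1 entries, which is why very sparse value ranges are a poor fit for this approach.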

Edit: you can also use np.put (also in the other answer) to speed up swap[s1] = s2. For these 1D problems, take/put are simply faster.
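
A minimal sketch of the take/put variant, assuming the same B, s1 and s2 as in the question. No timings are claimed here; measure on your own data.

```python
import numpy as np

B = np.array([1, 1, 1, 0, 2, 2, 1, 3, 3, 0, 4, 4, 4, 4])
s1 = [1, 2, 4]
s2 = [4, 1, 2]

swap = np.arange(B.max() + 1)  # identity lookup table covering 0..B.max()
np.put(swap, s1, s2)           # in-place counterpart of swap[s1] = s2
B2 = np.take(swap, B)          # counterpart of swap[B]

print(B2)
```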
