numpy - Scipy: sparse matrix giving incorrect values

Question

Below is my code for generating a sparse matrix.

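import numpy as np
import scipy.sparse  # import the sparse subpackage explicitly

def sparsemaker(X, Y, Z):
    'X, Y, and Z are 2D arrays of the same size'
    # Flatten first so the inverse indices are 1-D on every NumPy version
    x_, row = np.unique(np.ravel(X), return_inverse=True)
    y_, col = np.unique(np.ravel(Y), return_inverse=True)
    return scipy.sparse.csr_matrix((np.ravel(Z), (row, col)), shape=(x_.size, y_.size))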

>>> print(sparsemaker(A, B, C))  # A, B, and C are (220, 256) sized arrays.
(0, 0)  167064.269831
(0, 2)  56.6146564629
(0, 9)  53.8660340698
(0, 23) 80.6529717039
(0, 28) 0.0
(0, 33) 53.2379218326
(0, 40) 54.3868995375
 :          :

My input arrays are fairly large, so I don't know how to post them here (unless someone has an idea). But even from the first value I can already tell that something is wrong:

>>> test = sparsemaker(A, B, C)
>>> np.max(test.toarray())
167064.26983076424

>>> np.where(C==np.max(test.toarray()))
(array([], dtype=int64), array([], dtype=int64))

Does anyone know why this happens? Where does that value come from?

score 3 · Accepted Answer

You have repeated coordinates, and the constructor is adding up all the values that share a coordinate. Try the following:

x_, row = np.unique(np.ravel(X), return_inverse=True)
y_, col = np.unique(np.ravel(Y), return_inverse=True)
# Sum every Z value whose coordinates map to cell (0, 0)
print(np.ravel(Z)[(row == 0) & (col == 0)].sum())

That should print the mysterious 167064.26983076424.
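
To see the summing behaviour in isolation, here is a minimal sketch on made-up data (the arrays below are illustrative, not the asker's A, B, and C). The coordinate (0, 0) is given twice, and the resulting matrix holds the sum of both values at that cell:

```python
import numpy as np
import scipy.sparse

# Illustrative data: the coordinate (0, 0) appears twice.
row = np.array([0, 0, 1])
col = np.array([0, 0, 2])
data = np.array([1.5, 2.5, 7.0])

m = scipy.sparse.csr_matrix((data, (row, col)), shape=(2, 3))
dense = m.toarray()
print(dense)
# The two values at (0, 0) are added together, so dense[0, 0] is 4.0,
# a number that does not occur anywhere in `data`.
```

This is why searching the original value array for the matrix maximum can come back empty.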

EDIT: The following ugly code works fine on small examples, averaging the repeated entries. It uses some code borrowed from this other question. Give it a try:

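def sparsemaker(X, Y, Z):
    'X, Y, and Z are 2D arrays of the same size'
    x_, row = np.unique(np.ravel(X), return_inverse=True)
    y_, col = np.unique(np.ravel(Y), return_inverse=True)
    # zip() is lazy in Python 3, so stack the index pairs explicitly
    indices = np.column_stack((row, col))
    # View each (row, col) pair as one structured element so that
    # np.unique groups identical pairs; ravel() keeps the inverse 1-D
    _, repeats = np.unique(indices.view([('', indices.dtype)]*2),
                           return_inverse=True)
    repeats = repeats.ravel()
    # Weight every entry by 1 / (number of times its coordinate occurs),
    # so the constructor's summing of duplicates yields their mean
    counts = 1. / np.bincount(repeats)
    factor = counts[repeats]

    return scipy.sparse.csr_matrix((np.ravel(Z) * factor, (row, col)),
                                   shape=(x_.size, y_.size))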
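
If averaging is indeed what you want, the same idea can be sketched a little more directly by encoding each (row, col) pair as a single integer and letting np.bincount compute per-cell sums and counts. This is only a sketch under that assumption; the name sparsemaker_mean and the sample arrays are hypothetical and not part of the original code:

```python
import numpy as np
import scipy.sparse

def sparsemaker_mean(X, Y, Z):
    """Average Z over repeated (X, Y) coordinates (hypothetical helper)."""
    x_, row = np.unique(np.ravel(X), return_inverse=True)
    y_, col = np.unique(np.ravel(Y), return_inverse=True)
    # Encode each (row, col) pair as one integer so duplicates share a key.
    flat = row * y_.size + col
    cells, group = np.unique(flat, return_inverse=True)
    # Per-cell sum of Z and per-cell number of contributing entries.
    sums = np.bincount(group, weights=np.ravel(Z))
    counts = np.bincount(group)
    # Decode the unique keys back into row and column indices.
    r, c = np.divmod(cells, y_.size)
    return scipy.sparse.csr_matrix((sums / counts, (r, c)),
                                   shape=(x_.size, y_.size))

# Made-up input: the coordinate (1, 5) occurs three times (values 2, 4, 6).
X = np.array([[1, 1], [2, 1]])
Y = np.array([[5, 5], [6, 5]])
Z = np.array([[2.0, 4.0], [10.0, 6.0]])
result = sparsemaker_mean(X, Y, Z).toarray()
print(result)
```

Because every coordinate reaches the constructor exactly once here, nothing is left for it to sum, and each cell holds the mean of its contributing values.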
