python - Sorting in a sparse matrix

Question

I have a sparse matrix. I need to sort this matrix row by row and create another [sparse] matrix. Code may explain it better:

# The `rand` function requires a reasonably recent version of SciPy.
from scipy.sparse import rand
m = rand(6, 6, density=0.6)
d = m.getrow(0)
print(d)

Output 1

(0, 5) 0.874881629788 
(0, 4) 0.352559852239 
(0, 2) 0.504791645463 
(0, 1) 0.885898140175

So I have this matrix m. I want to create a new matrix from a sorted version of m, in which row 0 looks like this:

new_d = new_m.getrow(0)
print(new_d)

Output 2

(0, 1) 0.885898140175
(0, 5) 0.874881629788  
(0, 2) 0.504791645463
(0, 4) 0.352559852239

That way I can find out which columns hold the larger values, and so on:

print(new_d.indices)

Output 3

array([1, 5, 2, 4])

Of course, every row must be sorted individually, as shown above.

I have one solution to this problem, but it is not elegant.

score 7 · Accepted Answer

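If you want to ignore the zero-valued elements of the matrix, the code below should work. It is also much faster than implementations that use the getrow method, which is rather slow.

def sort_coo(m):
    # m is expected to be a coo_matrix, which exposes .row, .col and .data.
    # Python 3: the built-in zip is lazy, so itertools.izip is not needed.
    tuples = zip(m.row, m.col, m.data)
    # Sort by row index first, then by value (ascending) within each row.
    return sorted(tuples, key=lambda x: (x[0], x[2]))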

For example:

    >>> from numpy.random import rand
    >>> from scipy.sparse import coo_matrix
    >>>
    >>> d = rand(10, 20)
    >>> d[d > .05] = 0
    >>> s = coo_matrix(d)
    >>> sort_coo(s)
    [(0, 2, 0.004775589084940246),
     (3, 12, 0.029941507166614145),
     (5, 19, 0.015030386789436245),
     (7, 0, 0.0075044957259399192),
     (8, 3, 0.047994403933129481),
     (8, 5, 0.049401058471327031),
     (9, 15, 0.040011608000125043),
     (9, 8, 0.048541825332137023)]

Depending on your needs, you may want to tweak the sort key in the lambda or process the output further. If you want everything in a dictionary indexed by row, you could do:

from collections import defaultdict

sorted_rows = defaultdict(list)

for i in sort_coo(m):
    sorted_rows[i[0]].append((i[1], i[2]))
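
If descending order within each row is wanted, as in the question, one possible variant is to sort the COO arrays with NumPy rather than sorting Python tuples. This is only a sketch and not part of the answer above: `sort_coo_desc` is a hypothetical name, and it swaps `sorted` for `np.lexsort`, which treats the last key it is given as the primary one.

```python
import numpy as np
from scipy.sparse import coo_matrix


def sort_coo_desc(m):
    """Return (row, col, value) triples ordered by row, then by descending value.

    Hypothetical helper for illustration; accepts anything coo_matrix can convert.
    """
    m = coo_matrix(m)
    # np.lexsort sorts by the LAST key first: rows are the primary key,
    # negated values are the secondary key (giving descending values per row).
    order = np.lexsort((-m.data, m.row))
    return list(zip(m.row[order].tolist(),
                    m.col[order].tolist(),
                    m.data[order].tolist()))


# Small made-up matrix; zeros are not stored in the sparse representation.
dense = np.array([[0.0, 0.9, 0.5],
                  [0.3, 0.0, 0.7]])
print(sort_coo_desc(dense))
# [(0, 1, 0.9), (0, 2, 0.5), (1, 2, 0.7), (1, 0, 0.3)]
```

The result has the same shape as the output of `sort_coo`, so it can be fed into the same `defaultdict` loop.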

score 2 · Accepted Answer

Here is my bad solution:

from scipy.sparse import coo_matrix
import numpy as np
a = []
for i in range(m.shape[0]):  # loop over every row of m
    d = m.getrow(i)
    n = len(d.indices)
    s = zip([i] * n, d.indices, d.data)
    # Sort this row's (row, col, value) triples by value, largest first.
    sorted_s = sorted(s, key=lambda v: v[2], reverse=True)
    a.extend(sorted_s)
a = np.array(a)  # becomes a float array, so the indices are cast back to int below
new_m = coo_matrix((a[:, 2], (a[:, 0].astype(int), a[:, 1].astype(int))), m.shape)

I have not checked it yet, so there may be simple mistakes, but I think the idea is intuitive. Is there a better solution?

Edit

Creating this new matrix may be useless, because calling the getrow method breaks the order again. Only coo_matrix.col keeps the order.
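
A small sketch of that point, assuming the entries are handed to `coo_matrix` already in the desired order (the values are row 0 from Output 2): reading the `row` and `col` arrays of the COO matrix directly returns the columns in the order they were stored.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Row 0 from Output 2, listed in descending order of value.
rows = np.array([0, 0, 0, 0])
cols = np.array([1, 5, 2, 4])
vals = np.array([0.885898140175, 0.874881629788,
                 0.504791645463, 0.352559852239])
new_m = coo_matrix((vals, (rows, cols)), shape=(6, 6))

# Select the stored column indices that belong to row 0,
# in the same order they were passed to the constructor.
cols_row0 = new_m.col[new_m.row == 0]
print(cols_row0)  # [1 5 2 4]
```

Converting to another format (for example through `getrow`) may reorder the indices, so the ordering should be read from the COO arrays before any conversion.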

Another solution

This is not an exact solution, but it may help.

def sortSparseMatrix(m, rev=True, only_indices=True):
    """ Sort a sparse matrix and return column index dictionary
    """
    col_dict = dict()
    for i in range(m.shape[0]):  # loop over every row of m
        d = m.getrow(i)
        s = zip(d.indices, d.data)
        # Honor the rev argument: True sorts each row's values largest first.
        sorted_s = sorted(s, key=lambda v: v[1], reverse=rev)
        if only_indices:
            col_dict[i] = [element[0] for element in sorted_s]
        else:
            col_dict[i] = sorted_s
    return col_dict

>>> print(sortSparseMatrix(m))
{0: [5, 1, 0],
 1: [1, 3, 5],
 2: [1, 2, 3, 4],
 3: [1, 5, 2, 4],
 4: [0, 3, 5, 1],
 5: [3, 4, 2]}
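
The same dictionary could also be built without calling getrow for each row, by slicing the CSR arrays directly. This is a sketch under assumptions, not the method above: `sorted_columns_per_row` is a hypothetical name, and the input is converted to CSR first so that `indptr`, `indices` and `data` are available.

```python
import numpy as np
from scipy.sparse import csr_matrix


def sorted_columns_per_row(m, descending=True):
    """Map each row index to its stored column indices, ordered by value.

    Hypothetical helper for illustration only.
    """
    csr = csr_matrix(m)
    result = {}
    for i in range(csr.shape[0]):
        # indptr[i]:indptr[i + 1] delimits the stored entries of row i.
        start, end = csr.indptr[i], csr.indptr[i + 1]
        cols = csr.indices[start:end]
        vals = csr.data[start:end]
        order = np.argsort(-vals if descending else vals, kind="stable")
        result[i] = cols[order].tolist()
    return result


# Small made-up matrix; the last row has no stored entries.
dense = np.array([[0.0, 0.9, 0.5],
                  [0.3, 0.0, 0.7],
                  [0.0, 0.0, 0.0]])
print(sorted_columns_per_row(dense))
# {0: [1, 2], 1: [2, 0], 2: []}
```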
