python - How to find the most frequent string element in a numpy ndarray?

Question

Is there a way to find the most frequent string element in a numpy ndarray?

A = numpy.array([['a','b','c'],['d','d','e']])


result should be 'd'

score 13 · Accepted Answer

If you want a numpy answer, you can use np.unique:

>>> import numpy as np
>>> unique,pos = np.unique(A,return_inverse=True) #Finds all unique elements and, for each input element, its index into unique
>>> counts = np.bincount(pos.ravel())             #Counts occurrences of each unique element (ravel in case pos comes back with A's shape)
>>> maxpos = counts.argmax()                      #Finds the position of the maximum count

>>> (unique[maxpos],counts[maxpos])
('d', 2)
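The same result can also be obtained without the bincount step. This is a minimal sketch, assuming your NumPy version supports the return_counts argument of np.unique (it is not used in the answer above); the names values, most_common and most_common_count are only illustrative.

```python
import numpy as np

A = np.array([['a', 'b', 'c'], ['d', 'd', 'e']])

# np.unique flattens the input by default, so the 2-D array works directly.
# return_counts=True returns how often each unique value occurs.
values, counts = np.unique(A, return_counts=True)

most_common = values[counts.argmax()]   # first value with the highest count
most_common_count = counts.max()

print(most_common, most_common_count)
```

For the array from the question this picks 'd' with a count of 2.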

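However, if two elements have equal counts, this simply takes the first of them from the unique array (argmax returns the first maximum, and unique is sorted).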
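If ties matter, one option is to keep every value that reaches the maximum count instead of only the first. A rough sketch, where the array B is a made-up input containing a tie and return_counts is again an assumption about your NumPy version:

```python
import numpy as np

# Hypothetical input in which 'a' and 'b' both occur twice.
B = np.array(['a', 'b', 'b', 'a', 'c'])

values, counts = np.unique(B, return_counts=True)

# A boolean mask selects every value whose count equals the maximum.
tied = values[counts == counts.max()]

print(tied)
```

Here tied holds both 'a' and 'b', whereas argmax alone would report only 'a'.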

With this you can also easily sort by element count, like so:

>>> maxsort = counts.argsort()[::-1]
>>> (unique[maxsort],counts[maxsort])
(array(['d', 'e', 'c', 'b', 'a'],
      dtype='|S1'), array([2, 1, 1, 1, 1]))
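Reversing an ascending argsort also reverses the order of elements with equal counts, which is why 'e' comes before 'a' above. If you would rather keep tied elements in their sorted order, a possible variant (a sketch, assuming kind='stable' is available in your NumPy) is to sort on the negated counts:

```python
import numpy as np

A = np.array([['a', 'b', 'c'], ['d', 'd', 'e']])
values, counts = np.unique(A, return_counts=True)

# Sorting the negated counts gives descending order by count;
# the stable sort keeps tied values in their original (sorted) order.
order = np.argsort(-counts, kind='stable')

ranked_values = values[order]
ranked_counts = counts[order]

print(ranked_values, ranked_counts)
```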
