python - Multiple searches in an array with Python

Question

idtopick is an array of IDs

     idtopick=array([50,48,12,125,3458,155,299,6,7,84,58,63,0,8,-1])

idtolook is another array containing the IDs I am interested in

     idtolook=array([0,8,12,50])

I would like to store, in another array, the positions in idtopick that correspond to the values in idtolook.

This is my solution

    positions=array([where(idtopick==dummy)[0][0] for dummy in idtolook])

The result is

    array([12, 13,  2,  0])

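It works, but in practice the arrays I am working with hold millions of points, so the script above is rather slow. I would like to know whether there is a way to make it faster. I also want to keep the order of idtolook, so an algorithm that sorts it would not work in my case.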

score 3 · Accepted Answer

You can use sorting:

    import numpy as np

    sorter = np.argsort(idtopick, kind='mergesort')  # you need stable sorting
    sorted_ids = idtopick[sorter]
    positions = np.searchsorted(sorted_ids, idtolook)  # indices into sorted_ids
    positions = sorter[positions]  # map back to indices into idtopick

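Note that the searchsorted approach generally does not raise an error when a value in idtolook is missing from idtopick; it returns the position of a neighbouring element instead (only a value larger than every element of idtopick triggers an IndexError). You can actually also sort idtolook into the result array, which should be faster: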

    c = np.concatenate((idtopick, idtolook))
    sorter = np.argsort(c, kind='mergesort')
    # reverse = np.argsort(sorter)  # the next two lines do this, but faster:
    reverse = np.empty_like(sorter)
    reverse[sorter] = np.arange(len(sorter))
    positions = sorter[reverse[-len(idtolook):] - 1]
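
To check both snippets against the list comprehension from the question, a small self-contained sketch may help. The function names `positions_searchsorted` and `positions_concatenate` are hypothetical wrappers added here for illustration, and the sketch assumes that the IDs in each array are unique and that every ID in idtolook occurs in idtopick.

```python
import numpy as np

def positions_searchsorted(idtopick, idtolook):
    # First approach: sort idtopick once, then binary-search every ID.
    sorter = np.argsort(idtopick, kind='mergesort')
    return sorter[np.searchsorted(idtopick[sorter], idtolook)]

def positions_concatenate(idtopick, idtolook):
    # Second approach: sort both arrays together. Because the sort is stable
    # and idtolook comes last in c, each looked-up ID lands directly after
    # its match from idtopick (assuming unique IDs).
    c = np.concatenate((idtopick, idtolook))
    sorter = np.argsort(c, kind='mergesort')
    reverse = np.empty_like(sorter)
    reverse[sorter] = np.arange(len(sorter))
    return sorter[reverse[-len(idtolook):] - 1]

idtopick = np.array([50, 48, 12, 125, 3458, 155, 299, 6, 7, 84, 58, 63, 0, 8, -1])
idtolook = np.array([0, 8, 12, 50])

# Reference result computed the slow way, as in the question.
expected = np.array([np.where(idtopick == v)[0][0] for v in idtolook])

assert np.array_equal(positions_searchsorted(idtopick, idtolook), expected)
assert np.array_equal(positions_concatenate(idtopick, idtolook), expected)
```

Both functions return the positions in the order of idtolook, so the order requirement from the question is preserved even though sorting is used internally.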

The concatenation approach works similarly to NumPy's set operations.
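
If some IDs in idtolook might be missing from idtopick, the result of the searchsorted approach can be validated by comparing the values found with the values requested. The following is only a sketch under assumptions: the helper name `positions_or_missing` and the `missing=-1` sentinel are made up for illustration, and idtopick is assumed to be non-empty.

```python
import numpy as np

def positions_or_missing(idtopick, idtolook, missing=-1):
    # Hypothetical helper: same idea as the searchsorted approach, but IDs
    # that do not occur in idtopick are flagged instead of silently mapped
    # to a neighbouring element.
    sorter = np.argsort(idtopick, kind='mergesort')
    sorted_ids = idtopick[sorter]
    pos = np.searchsorted(sorted_ids, idtolook)
    # searchsorted returns len(sorted_ids) for IDs above the maximum;
    # clamp so the lookups below stay in bounds.
    pos = np.minimum(pos, len(sorted_ids) - 1)
    found = sorted_ids[pos] == idtolook
    return np.where(found, sorter[pos], missing), found

idtopick = np.array([50, 48, 12, 125, 3458, 155, 299, 6, 7, 84, 58, 63, 0, 8, -1])
idtolook = np.array([0, 8, 9999, 12, 5, 50])  # 9999 and 5 are not in idtopick

positions, found = positions_or_missing(idtopick, idtolook)
print(positions.tolist())  # [12, 13, -1, 2, -1, 0]
print(found.tolist())      # [True, True, False, True, False, True]
```

Since -1 is itself a valid index in Python, it is safer to filter with the `found` mask (for example `positions[found]`) than to index with the sentinel directly.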
