python - Returning Counter objects from multiprocessing / map function

Question

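I'm running a Python script that starts the same function in multiple threads. The function creates and processes two counters (c1 and c2). The c1 results from all of the forked processes should be merged together. The same goes for the c2 results returned by the different forks.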

My (pseudo) code looks like this:

from collections import Counter
import multiprocessing

def countIt(cfg):
    c1 = Counter()
    c2 = Counter()
    # do some things and fill the counters by counting words in a text, e.g.
    # c1 = Counter({'apple': 3, 'banana': 0})
    # c2 = Counter({'blue': 3, 'green': 0})

    return c1, c2

if __name__ == '__main__':
        cP1 = Counter()
        cP2 = Counter()
        cfg = "myConfig"
        p = multiprocessing.Pool(4)  #creating 4 forks
        c1, c2 = p.map(countIt,cfg)[:2]
        # 1.) This only works with [:2], which seems to be a bad idea
        # 2.) at this point c1 and c2 are lists, not Counters anymore,
        # so the following will not work:
        cP1 + c1
        cP2 + c2

Following the example above, I need a result like this:

    cP1 = Counter({'apple': 25, 'banana': 247, 'orange': 24})
    cP2 = Counter({'red': 11, 'blue': 56, 'green': 3})

My question: how can I count things inside the forked processes and then aggregate each counter (all c1 and all c2) in the parent process?

score 2 · Accepted Answer

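You need to "unzip" your result, for example with a for-each loop. You receive a list of tuples, where each tuple is a pair of counters: (c1, c2).
With your current solution you actually mix them up. You assigned [(c1a, c2a), (c1b, c2b)] to c1, c2, which means that c1 contains (c1a, c2a) and c2 contains (c1b, c2b).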
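
To see the mix-up without starting any processes, here is a small sketch. The `results` list is a made-up stand-in for what `p.map(countIt, ...)` would return, and the names `all_c1`, `all_c2`, `total_c1` and `total_c2` are only for illustration.

```python
from collections import Counter

# Hypothetical stand-in for the list returned by p.map(countIt, configs):
# one (c1, c2) tuple per call of the worker function.
results = [
    (Counter({'apple': 3}), Counter({'blue': 3})),
    (Counter({'apple': 2, 'banana': 1}), Counter({'green': 4})),
]

# What the question's code does: take the first two tuples and unpack them.
# c1 is now the whole first tuple (c1a, c2a), not a Counter.
c1, c2 = results[:2]

# "Unzipping" instead: group all first elements and all second elements.
all_c1, all_c2 = zip(*results)

# Summing with an empty Counter as the start value merges the counts.
total_c1 = sum(all_c1, Counter())
total_c2 = sum(all_c2, Counter())
```

`zip(*results)` needs the whole list in memory first, so it fits `map` better than `imap_unordered`.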

Try this:

if __name__ == '__main__':
        from contextlib import closing

        cP1 = Counter()
        cP2 = Counter()

        # I hope you have an actual list of configs here, otherwise map
        # will call `countIt` with the single characters of the string 'myConfig'
        cfg = "myConfig"

        # `contextlib.closing` makes sure the pool is closed after we're done.
        # In python3, Pool is itself a contextmanager and you don't need to
        # surround it with `closing` in order to be able to use it in the `with`
        # construct.
        # This approach, however, is compatible with both python2 and python3.
        with closing(multiprocessing.Pool(4)) as p:
            # Just counting, no need to order the results.
            # This might actually be a bit faster.
            for c1, c2 in p.imap_unordered(countIt, cfg):
                cP1 += c1
                cP2 += c2
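
For reference, here is a self-contained sketch of the same idea that can be run as a script on Python 3. The question does not show the real counting logic, so `count_it`, `merge_results`, the `FRUITS`/`COLORS` word sets and the sample `texts` are all assumptions made up for this example.

```python
import multiprocessing
from collections import Counter

# Hypothetical vocabularies, chosen only to mirror the question's example.
FRUITS = {'apple', 'banana', 'orange'}
COLORS = {'red', 'blue', 'green'}


def count_it(text):
    """Return (fruit_counts, color_counts) for one chunk of text."""
    words = text.lower().split()
    c1 = Counter(w for w in words if w in FRUITS)
    c2 = Counter(w for w in words if w in COLORS)
    return c1, c2


def merge_results(pairs):
    """Fold an iterable of (c1, c2) pairs into two total Counters."""
    total1, total2 = Counter(), Counter()
    for c1, c2 in pairs:
        total1.update(c1)
        total2.update(c2)
    return total1, total2


if __name__ == '__main__':
    # An actual list of inputs, so each worker call gets one whole string.
    texts = ['apple blue apple', 'banana green blue', 'orange red apple']
    with multiprocessing.Pool(4) as pool:
        # Consume the iterator while the pool is still open.
        cP1, cP2 = merge_results(pool.imap_unordered(count_it, texts))
    print(cP1)
    print(cP2)
```

One difference from the loop above: `Counter.update` keeps keys whose count is zero, while `+=` drops zero and negative counts. If entries like 'banana': 0 from the question should survive the merge, `update` is probably the safer choice.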
