python - Making a copy of an object instead of reinitializing it inside a new multiprocessing process

Question

This code shows the structure of what I am trying to do.

import multiprocessing
from foo import really_expensive_to_compute_object

## Create a really complicated object that is *hard* to initialise.
T = really_expensive_to_compute_object(10) 

def f(x):
  return T.cheap_calculation(x)

P = multiprocessing.Pool(processes=64)
results = P.map(f, range(1000000))

print(results)

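The problem is that each process starts by spending a lot of time recomputing T instead of using the original T, which was computed once. Is there a way to prevent this? T has a fast (deep) copy method, so can I get Python to use that instead of recomputing?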

score 2 · Accepted Answer

The multiprocessing documentation suggests:

Explicitly pass resources to child processes

So your code can be rewritten as follows:

import multiprocessing
import time
import functools

class really_expensive_to_compute_object(object):
    def __init__(self, arg):
        print('expensive creation')
        time.sleep(3)  # stand-in for the slow initialisation

    def cheap_calculation(self, x):
        return x * 2

def f(T, x):
    return T.cheap_calculation(x)

if __name__ == '__main__':
    ## Create a really complicated object that is *hard* to initialise.
    T = really_expensive_to_compute_object(10)
    ## helper, to pass expensive object to function
    f_helper = functools.partial(f, T)
    # I've reduced the count for tests
    P = multiprocessing.Pool(processes=4)
    results = P.map(f_helper, range(100))

    print(results)
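With functools.partial, T travels along with the tasks sent to the workers. A possible variation, sketched here under assumptions rather than taken from the answer, is to hand T to each worker once through the initializer and initargs parameters of multiprocessing.Pool. The names ExpensiveObject, init_worker and _worker_T are hypothetical and only serve the illustration.

```python
import multiprocessing
import time


class ExpensiveObject(object):
    """Hypothetical stand-in for really_expensive_to_compute_object."""

    def __init__(self, arg):
        time.sleep(0.1)  # simulate the slow construction
        self.arg = arg

    def cheap_calculation(self, x):
        return x * 2


_worker_T = None  # set once per worker process by init_worker


def init_worker(T):
    # Runs once in every worker. T arrives as a copy made by the pool
    # (pickled or inherited via fork), so __init__ is not run again.
    global _worker_T
    _worker_T = T


def f(x):
    return _worker_T.cheap_calculation(x)


if __name__ == '__main__':
    T = ExpensiveObject(10)  # built exactly once, in the parent
    with multiprocessing.Pool(processes=4,
                              initializer=init_worker,
                              initargs=(T,)) as P:
        results = P.map(f, range(100))
    print(results[:5])
```

The trade-off is a module-level global in exchange for transferring the object once per worker instead of with the submitted work, which may matter if T is large.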

score 1 · Accepted Answer

Why not have f take T as a parameter instead of referencing the global, and make the copies yourself?

import multiprocessing, copy
from foo import really_expensive_to_compute_object

def f(t, x):
  return t.cheap_calculation(x)

if __name__ == '__main__':
  ## Create a really complicated object that is *hard* to initialise.
  T = really_expensive_to_compute_object(10)

  P = multiprocessing.Pool(processes=64)
  # Pool.map accepts a single iterable, so pair each copy with its x
  # and unpack the pairs with starmap (Python 3.3+).
  results = P.starmap(f, ((copy.deepcopy(T), x) for x in range(1000000)))

  print(results)
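Whether the explicit copy.deepcopy is needed depends on how arguments reach the workers. multiprocessing pickles task arguments, and unpickling an instance of an ordinary class restores its attributes without calling __init__. The sketch below checks this in a single process; the class Expensive and its init_calls counter are hypothetical, and a class that customizes pickling (for example via __reduce__) may behave differently.

```python
import pickle


class Expensive(object):
    """Hypothetical stand-in for an object that is slow to construct."""
    init_calls = 0  # counts how often __init__ runs

    def __init__(self, arg):
        Expensive.init_calls += 1
        self.table = [arg * i for i in range(5)]

    def cheap_calculation(self, x):
        return x * 2


T = Expensive(10)
# Round-trip through pickle, as multiprocessing does for task arguments.
clone = pickle.loads(pickle.dumps(T))
```

After the round trip, clone is a separate object with the same table, and Expensive.init_calls is still 1.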
