python - Interact with a MPI binary via a (Non-MPI) python script

Question

I would like to trigger the execution of certain functions of an MPI program (written in C++) from, e.g., a (serial) Python script. This Python script should launch the MPI program at the beginning with, e.g.,

subprocess.call(['mpirun','-np', '4', 'mpibinary', 'args' ])

I need to call a function of this MPI program multiple times, and I want to avoid restarting the program for different inputs, because I would have to reinitialize all my data structures, which is costly. Therefore, I have thought about externally triggering a function when the MPI program is idle. I think this could be done with file IO, i.e., the root rank of the MPI program watches a certain file in a while(1) loop and, as soon as its content changes, parses the new content, notifies the other ranks, and calls a function. Is there a more elegant solution to my problem?

The best solution would be to have a Python class that wraps the important functions of the C++ MPI program so that I can call them from Python with

mpiprogram.superfunction(a,b)

score 5 · Accepted Answer

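Probably the most elegant solution is to make your Python code part of the MPI application. It can then send data directly (via MPI messages) to the rest of the MPI application. There are two different approaches here.

1) Embed the Python binary as rank 0 in the MPI job. To exclude it from participating in the collective operations in mpibinary, you have to create a subcommunicator that excludes rank 0 and use it for all further collective communication in mpibinary. The first step is the easy part. With Open MPI you would do: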

mpirun --hostfile hosts -np 1 pythonbinary args : -np 32 mpibinary args

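This is called an MPMD (multiple program multiple data) launch. It starts one copy of pythonbinary, which becomes rank 0, and 32 copies of mpibinary, which become rank 1, rank 2, ... rank 32 (33 processes in total). Other MPI implementations provide very similar mechanisms for MPMD launch. You would then use MPI_Comm_split() to create a new communicator that does not include the Python program. Splitting a communicator is a collective operation, which is why you have to call it in both the Python code and the C++ application. MPI_Comm_split() takes a "color" and a key and splits the communicator into multiple subcommunicators according to the different colors. Processes with the same color are then sorted by the value of their key. You would most likely call it like this: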

In Python:

python_comm = mpi.mpi_comm_split(mpi.MPI_COMM_WORLD, 0, 0)

In C++:

int rank;
MPI_Comm c_comm;

MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_split(MPI_COMM_WORLD, 1, rank, &c_comm);

Using rank as the key ensures that the order of the processes in c_comm is the same as before the split, i.e., rank 1 of MPI_COMM_WORLD becomes rank 0 of c_comm, rank 2 becomes rank 1, and so on.

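From that point on, the C++ application can use c_comm to perform collective operations as usual. To communicate between the Python and the C++ code you still have to use MPI_COMM_WORLD, in which the Python code remains rank 0.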

2) Use the MPI-2 process management facilities. First, start an MPI job that consists of the Python binary only:

mpirun --hostfile hosts -np 1 pythonbinary args

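The Python binary then spawns the other MPI binary directly with MPI_Comm_spawn(), requesting the desired number of new processes. The newly spawned processes have their own MPI_COMM_WORLD, so you do not need MPI_Comm_split(). The spawn operation also establishes an intercommunicator that allows the Python code to send messages to the other part of the MPI application.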
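A minimal sketch of the spawn approach, assuming the mpi4py bindings (the answer does not name a specific binding). The helper names spawn_arguments, spawn_mpibinary and send_inputs, the tag value, and the choice of sending doubles are illustrative assumptions, not part of the original program.

```python
from array import array


def spawn_arguments(args, nprocs):
    """Build keyword arguments for Spawn; argv entries must be strings."""
    if nprocs < 1:
        raise ValueError("nprocs must be at least 1")
    return {"args": [str(a) for a in args], "maxprocs": int(nprocs)}


def spawn_mpibinary(binary, args, nprocs):
    """Spawn nprocs copies of binary and return the intercommunicator."""
    # Imported lazily so the helpers above work without MPI installed.
    # Importing mpi4py.MPI initializes MPI (assumes mpi4py is available).
    from mpi4py import MPI

    return MPI.COMM_SELF.Spawn(binary, **spawn_arguments(args, nprocs))


def send_inputs(intercomm, values, tag=1):
    """Send a buffer of C doubles to rank 0 of the spawned group.

    The tag value is hypothetical; the C++ side must use the same one.
    """
    from mpi4py import MPI

    buf = array("d", [float(v) for v in values])
    intercomm.Send([buf, MPI.DOUBLE], dest=0, tag=tag)
```

In the script started by mpirun -np 1, one would call spawn_mpibinary("mpibinary", ["args"], 32) once and then send_inputs(...) for each new input. On the C++ side, the spawned processes can obtain the matching intercommunicator with MPI_Comm_get_parent() and receive on it.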

In both cases the hosts file contains the definitions of all execution hosts that are able to run the MPI binaries. You also need to use one of the available Python MPI bindings.

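Note that you only need to add a few MPI calls to your Python script, such as MPI_Init, MPI_Finalize, MPI_Comm_split, and the relevant MPI_Send/MPI_Recv. You do not need to make the script parallel. MPI is quite versatile in that it can be used not only for parallel worksharing but also as a general messaging framework. Keep in mind, however, that the Python bindings must use the same MPI library as the rest of the program.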
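For the first approach (MPMD launch with the Python script as rank 0), those few calls could be wrapped in a class like the one asked for in the question. This is only a sketch, assuming the mpi4py bindings; the tag values, the two-double argument layout, and the single-double reply are hypothetical conventions that the C++ side would have to implement. With mpi4py, importing mpi4py.MPI initializes MPI and finalization happens at interpreter exit, so no explicit MPI_Init/MPI_Finalize call appears.

```python
from array import array

# Hypothetical message tags; the C++ side must agree on them.
TAG_CALL = 1
TAG_RESULT = 2
TAG_QUIT = 3


def pack_args(a, b):
    """Pack two numbers into a contiguous buffer of C doubles."""
    return array("d", [float(a), float(b)])


class MpiProgram:
    """Serial Python front end running as rank 0 of MPI_COMM_WORLD."""

    def __init__(self):
        # Lazy import: this is where MPI gets initialized.
        from mpi4py import MPI

        self._MPI = MPI
        self.world = MPI.COMM_WORLD
        # Collective over MPI_COMM_WORLD; pairs with the C++ call
        # MPI_Comm_split(MPI_COMM_WORLD, 1, rank, &c_comm).
        self.python_comm = self.world.Split(color=0, key=0)

    def superfunction(self, a, b):
        """Ask the C++ program to run its function and return one double."""
        MPI = self._MPI
        # World rank 1 is rank 0 of c_comm; it is assumed to notify the
        # other C++ ranks and to send the result back.
        self.world.Send([pack_args(a, b), MPI.DOUBLE], dest=1, tag=TAG_CALL)
        result = array("d", [0.0])
        self.world.Recv([result, MPI.DOUBLE], source=1, tag=TAG_RESULT)
        return result[0]

    def close(self):
        """Tell the C++ program to leave its receive loop and finalize."""
        dummy = array("d", [0.0])
        self.world.Send([dummy, self._MPI.DOUBLE], dest=1, tag=TAG_QUIT)
```

The script would create one MpiProgram() at startup, call mpiprogram.superfunction(a, b) as often as needed, and call close() at the end. Buffer-based Send/Recv is used rather than mpi4py's pickle-based send/recv because the C++ side cannot decode pickled Python objects.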

Another solution would be to use a message queuing library or file polling (which is really just a crude MQ implementation).
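If file polling is chosen anyway, the main pitfall is the reader seeing a half-written file. The sketch below, using only the Python standard library, shows one way to avoid that: write the request to a temporary file and rename it into place. The function names, the JSON payload format, and the "consume by deleting" convention are assumptions for illustration; in the real setup the polling side would be the root rank of the C++ program.

```python
import json
import os
import tempfile


def post_request(path, payload):
    """Write payload to path so that a reader never sees a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(payload, f)
        f.flush()
        os.fsync(f.fileno())
    # Rename within one directory replaces the target in a single step.
    os.replace(tmp_path, path)


def poll_request(path):
    """Return the pending payload and remove the file, or None if absent."""
    try:
        with open(path) as f:
            payload = json.load(f)
    except FileNotFoundError:
        return None
    os.remove(path)
    return payload
```

The Python script would call post_request("request.json", {"a": 1.0, "b": 2.0}) for each new input, while the watching rank repeatedly performs the equivalent of poll_request, sleeping briefly between attempts, and then notifies the other ranks.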
