opencl - 異なるデバイスで実行すると PyOpenCL enqueue_copy がハングする

Question

カーネルを 2 つの異なる OpenCL プラットフォームで実行するのに問題があります。プラットフォームの唯一の違いは、1 つは OpenCL 1.1 で、もう 1 つは 1.2 です。

コードはこのデバイス (OS X 10.8) で動作します:

===============================================================
('Platform name:', 'Apple')
('Platform profile:', 'FULL_PROFILE')
('Platform vendor:', 'Apple')
('Platform version:', 'OpenCL 1.2 (Sep 20 2012 17:42:28)')
---------------------------------------------------------------
('Device name:', 'Intel(R) Core(TM) i5-3427U CPU @ 1.80GHz')
('Device type:', 'CPU')
('Device memory: ', 8192L, 'MB')
('Device max clock speed:', 1800, 'MHz')
('Device compute units:', 4)

ターゲットデバイス (Ubuntu 11.04):

===============================================================
('Platform name:', 'NVIDIA CUDA')
('Platform profile:', 'FULL_PROFILE')
('Platform vendor:', 'NVIDIA Corporation')
('Platform version:', 'OpenCL 1.1 CUDA 4.2.1')
---------------------------------------------------------------
('Device name:', 'Tesla M2050')
('Device type:', 'GPU')
('Device memory: ', 3071, 'MB')
('Device max clock speed:', 1147, 'MHz')
('Device compute units:', 14)
===============================================================
('Platform name:', 'NVIDIA CUDA')
('Platform profile:', 'FULL_PROFILE')
('Platform vendor:', 'NVIDIA Corporation')
('Platform version:', 'OpenCL 1.1 CUDA 4.2.1')
---------------------------------------------------------------
('Device name:', 'Tesla M2050')
('Device type:', 'GPU')
('Device memory: ', 3071, 'MB')
('Device max clock speed:', 1147, 'MHz')
('Device compute units:', 14)

次のコードのハングの原因を突き止めました。

# set up 
host_array = numpy.array(arr)
device_buffer = pyopencl.Buffer(context, pyopencl.mem_flags.WRITE_ONLY, host_array.nbytes)

# run the kernel
program.run(queue, host_array.shape, None, device_buffer)

# copy the results back --- this call causes the code to hang ----
pyopencl.enqueue_copy(queue, host_array, device_buffer)

2 つのデバイス間でコードの変更はなく、両方のデバイスで PyOpenCL 2013.1 が実行されています。何か不足していますか？どんな提案でも大歓迎です。

score 2 · Accepted Answer

にを追加してみて.wait()くださいprogram.run。これにより、実際にハングしているプログラムであるかどうかが判断されます。

score 0 · Accepted Answer

問題はスレッドの問題であることが判明しました。threading モジュールで生成された 2 番目のスレッドを使用して、pyopencl 呼び出しを行っていました。問題は、pyopencl を呼び出すために使用していたコンテキストがメインスレッドで作成されたことであり、これが何らかの問題を引き起こしていると思います。

修正するために、コンテキスト、キュー、および作成されたプログラムを、プライマリスレッドではなく 2 番目のスレッドで宣言するようにしました。

opencl - 異なるデバイスで実行すると PyOpenCL enqueue_copy がハングする

2 に答える 2

Related

Reference