tensorflow - Tensorflow: Cannot pin variables to the CPU in multi-GPU training

Question

I am training my first multi-GPU model with TensorFlow. As the tutorial describes, variables are pinned to the CPU, and the ops are placed on each GPU under a name_scope.

When I run a small test with device placement logging enabled, I can see that the ops are placed on their respective GPUs with the TOWER_1/TOWER_0 prefix, but the variables are not placed on the CPU.

Am I missing something, or am I misreading the device placement log?

Thanks

Test code

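import tensorflow as tf
from nets import inception_v3  # assumption: TF-Slim model library; the import is not shown in the post

slim = tf.contrib.slim

with tf.device('cpu:0'):
    # Input pipeline: read one JPEG, resize it to 299x299, add a batch dimension.
    imgPath = tf.placeholder(tf.string)
    imageString = tf.read_file(imgPath)
    imageJpeg = tf.image.decode_jpeg(imageString, channels=3)
    inputImage = tf.image.resize_images(imageJpeg, [299, 299])
    inputs = tf.expand_dims(inputImage, 0)
    for i in range(2):
        # Build one tower per GPU.
        with tf.device('/gpu:%d' % i):
            with tf.name_scope('%s_%d' % ('TOWER', i)) as scope:
                # Intended to pin every variable to the CPU.
                with slim.arg_scope([tf.contrib.framework.python.ops.variables.variable], device='/cpu:0'):
                    with slim.arg_scope(inception_v3.inception_v3_arg_scope()):
                        logits, endpoints = inception_v3.inception_v3(inputs, num_classes=1001, is_training=False)
                # Share the variables between towers.
                tf.get_variable_scope().reuse_variables()

# Log the device each op and variable is placed on.
with tf.Session(config=tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)) as sess:
    tf.initialize_all_variables().run()
exit(0)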

Edit: Basically, the line "with slim.arg_scope([tf.contrib.framework.python.ops.variables.variable], device='/cpu:0'):" should force all variables onto the CPU, but they are created on "gpu:0".

score 0 · Accepted Answer

Try:

with slim.arg_scope([slim.model_variable, slim.variable], device='/cpu:0'):

This is taken from: model_deploy
