python - Using two different LSTM cells in TensorFlow

Question

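I'm building a neural machine translator and need to use two different LSTM cells (one for the encoder and one for the decoder).

The two cells have different shapes:

The encoder (the first one) takes the tokens of the input sentence and produces a state vector.
The decoder (the second one) is fed the previous state vector and the tokens it generated itself.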

I wrote this in TensorFlow. When I run the script, I get the following error (raised during the decoder phase):

  outputs, states = tf.nn.rnn(cell_backward, inputs, initial_state=initial_state)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 158, in rnn
    (output, state) = call_cell()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 145, in <lambda>
    call_cell = lambda: cell(input_, state)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell.py", line 520, in __call__
    dtype, self._num_unit_shards)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell.py", line 357, in _get_concat_variable
    sharded_variable = _get_sharded_variable(name, shape, dtype, num_shards)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell.py", line 387, in _get_sharded_variable
    dtype=dtype))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 732, in get_variable
    partitioner=partitioner, validate_shape=validate_shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 596, in get_variable
    partitioner=partitioner, validate_shape=validate_shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 161, in get_variable
    caching_device=caching_device, validate_shape=validate_shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 437, in _get_single_variable
    name, "".join(traceback.format_list(tb))))
ValueError: Variable backward/RNN/LSTMCell/W_0 already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:

  File "/home/alexis/Documents/NMT/NMT.py", line 88, in dense_to_vector_state
    outputs, states = tf.nn.rnn(cell_forward, inputs, initial_state=initial_state)

How can I explicitly specify that I want to create a completely new LSTM cell?

Thanks in advance!

Alexis

score 1 · Accepted Answer

Use variable scopes:

with tf.variable_scope('enc'):
  cell_enc = LSTMCell(hidden_size)
with tf.variable_scope('dec'):
  cell_dec = LSTMCell(hidden_size)
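
To see why separate scopes make the error go away, here is a minimal pure-Python sketch of the naming rule involved. It is a toy model, not TensorFlow's actual implementation: `ToyVariableStore` and `build_cell` are hypothetical names invented for this illustration, and the only thing taken from the question is the relative variable name `RNN/LSTMCell/W_0` from the traceback. The assumption is simply that each cell asks for the same relative name and that the enclosing scope is prepended to it.

```python
import contextlib


class ToyVariableStore(object):
    """Toy stand-in for a variable-scope registry (not TensorFlow's real code)."""

    def __init__(self):
        self.variables = {}  # full variable name -> placeholder object
        self._prefix = []    # stack of currently active scope names

    @contextlib.contextmanager
    def variable_scope(self, name):
        # Entering a scope pushes its name; leaving it pops the name again.
        self._prefix.append(name)
        try:
            yield
        finally:
            self._prefix.pop()

    def get_variable(self, name):
        # The full name is the active scope path plus the requested name.
        full_name = '/'.join(self._prefix + [name])
        if full_name in self.variables:
            raise ValueError(
                'Variable %s already exists, disallowed.' % full_name)
        self.variables[full_name] = object()
        return full_name


def build_cell(store):
    # Hypothetical helper: every cell requests the same relative name.
    return store.get_variable('RNN/LSTMCell/W_0')


# Two cells built under the same scope ask for the same full name and collide.
shared = ToyVariableStore()
build_cell(shared)
try:
    build_cell(shared)
    collided = False
except ValueError:
    collided = True

# Two cells built under separate scopes end up with distinct full names.
scoped = ToyVariableStore()
with scoped.variable_scope('enc'):
    enc_name = build_cell(scoped)
with scoped.variable_scope('dec'):
    dec_name = build_cell(scoped)

print(collided)            # True
print(enc_name, dec_name)  # enc/RNN/LSTMCell/W_0 dec/RNN/LSTMCell/W_0
```

In this toy model the scope only matters at the moment a variable is requested, so the `with` block has to enclose whatever code triggers the variable creation.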

score 1 · Accepted Answer

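I'm working on machine translation as well. Here are my encoder and decoder. You just need to use a different variable scope for each RNN. Instead of using a MultiRNNCell for the encoder, I unroll each layer manually, which lets me alternate the direction between layers. Note how each layer gets its own scope.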

with tf.variable_scope('encoder'):
    rnn_cell = tf.nn.rnn_cell.LSTMCell(512, num_proj = 256, state_is_tuple = True)
    for level in range(3):
        with tf.variable_scope('level_%d' % level) as scope:
            state = [tf.zeros((BATCH_SIZE, sz)) for sz in rnn_cell.state_size]
            for t in range(TIME_STEPS) if level % 2 else reversed(range(TIME_STEPS)):
                y[t], state = rnn_cell(y[t], state)
                scope.reuse_variables()


with tf.variable_scope('decoder') as scope:
    rnn_cell = tf.nn.rnn_cell.MultiRNNCell \
    ([
        tf.nn.rnn_cell.LSTMCell(512, num_proj = 256, state_is_tuple = True),
        tf.nn.rnn_cell.LSTMCell(512, num_proj = WORD_VEC_SIZE, state_is_tuple = True)
    ], state_is_tuple = True)

    state = [[tf.zeros((BATCH_SIZE, sz)) for sz in sz_outer] for sz_outer in rnn_cell.state_size]

    W_soft = tf.get_variable('W_soft', shape = (NWORDS, WORD_VEC_SIZE), initializer = tf.truncated_normal_initializer(0.0, 1 / np.sqrt(WORD_VEC_SIZE)))
    b_soft = tf.get_variable('b_soft', shape = (NWORDS,), initializer = tf.truncated_normal_initializer(0.0, 0.01))
    cost = 0
    output = [None] * TIME_STEPS

    for t in range(TIME_STEPS):
        if t:
            last = y_[t - 1] if TRAINING else y[t - 1]
        else:
            last = tf.zeros((BATCH_SIZE, WORD_VEC_SIZE))

        y[t] = tf.concat(1, (y[t], last))
        y[t], state = rnn_cell(y[t], state)

        cost += tf.reduce_mean(tf.nn.sampled_softmax_loss(W_soft, b_soft, y[t], target_output[:, t : t + 1], 1000, NWORDS))
        output[t] = tf.reshape(tf.nn.softmax(tf.matmul(y[t], W_soft, transpose_b = True) + b_soft), (BATCH_SIZE, 1, NWORDS))

        scope.reuse_variables()

    output = tf.concat(1, output)
    cost /= TIME_STEPS
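
As a rough illustration of the encoder loop above, the following pure-Python sketch lists, for each level, the scope it would run under and the order in which the time steps are visited. It does not touch TensorFlow; `unroll_plan` is a hypothetical helper and `TIME_STEPS = 4` is an assumed value chosen only to keep the output short.

```python
TIME_STEPS = 4  # assumed small value, for illustration only


def unroll_plan(num_levels, time_steps):
    """Return (scope name, time-step visiting order) for each encoder level."""
    plan = []
    for level in range(num_levels):
        scope_name = 'encoder/level_%d' % level
        # Same rule as the encoder loop: odd levels run forward in time,
        # even levels run backward.
        if level % 2:
            order = list(range(time_steps))
        else:
            order = list(reversed(range(time_steps)))
        plan.append((scope_name, order))
    return plan


plan = unroll_plan(3, TIME_STEPS)
for scope_name, order in plan:
    print(scope_name, order)
# encoder/level_0 [3, 2, 1, 0]
# encoder/level_1 [0, 1, 2, 3]
# encoder/level_2 [3, 2, 1, 0]
```

Because every level has its own scope name, the three levels do not share weights with each other, while the time steps inside one level do share them once `scope.reuse_variables()` has been called.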
