tensorflow - How to continue training an Inception model from a TensorFlow checkpoint

Question

I loaded a pre-trained Inception model:

    if FLAGS.pretrained_model_checkpoint_path:
        assert tf.gfile.Exists(FLAGS.pretrained_model_checkpoint_path)
        variables_to_restore = tf.get_collection(
            slim.variables.VARIABLES_TO_RESTORE)
        restorer = tf.train.Saver(variables_to_restore)
        restorer.restore(sess, FLAGS.pretrained_model_checkpoint_path)
        print('%s: Pre-trained model restored from %s' %
              (datetime.now(), FLAGS.pretrained_model_checkpoint_path))

I then trained the model on my own data using flowers_train.py.

After training finished, the loss was about 1.0, and the model was saved to the specified directory.

I want to continue training, so I restore the model:

    if FLAGS.checkpoint_dir is not None:
        # restoring from the checkpoint file
        ckpt = tf.train.get_checkpoint_state(FLAGS.checkpoint_dir)
        tf.train.Saver().restore(sess, ckpt.model_checkpoint_path)

Training then continues, but the loss at the first step is about 6.5, which effectively means the model was not initialized at all.
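A symptom like this can be checked in isolation. The following is only a minimal sketch, not the code from inception_train.py: it assumes a TensorFlow build that provides `tf.compat.v1` (the snippets in this question use an older API), and the helper `restore_latest`, the function `demo`, and the variable `w` are hypothetical names introduced for illustration. It shows two things that may be worth verifying: that `tf.train.get_checkpoint_state` actually found a checkpoint (it returns `None` when the directory has none), and that the restore runs after the variable initializer, so the restored values are not overwritten.

```python
import os
import tempfile

import tensorflow as tf

tf1 = tf.compat.v1  # assumption: a TensorFlow version that ships tf.compat.v1


def restore_latest(sess, saver, checkpoint_dir):
    """Hypothetical helper: restore the newest checkpoint or fail loudly."""
    ckpt = tf1.train.get_checkpoint_state(checkpoint_dir)
    if ckpt is None or not ckpt.model_checkpoint_path:
        raise IOError('No checkpoint found in %r' % checkpoint_dir)
    saver.restore(sess, ckpt.model_checkpoint_path)
    return ckpt.model_checkpoint_path


def demo():
    train_dir = tempfile.mkdtemp()

    # First run: move a variable away from its initial value and save it.
    with tf.Graph().as_default():
        w = tf1.get_variable('w', initializer=tf.constant([1.0, 2.0]))
        saver = tf1.train.Saver()
        with tf1.Session() as sess:
            sess.run(tf1.global_variables_initializer())
            sess.run(w.assign([3.0, 4.0]))
            saver.save(sess, os.path.join(train_dir, 'model.ckpt'),
                       global_step=100)

    # Second run: initialize first, then restore, so the restored values win.
    with tf.Graph().as_default():
        w = tf1.get_variable('w', initializer=tf.constant([1.0, 2.0]))
        saver = tf1.train.Saver()
        with tf1.Session() as sess:
            sess.run(tf1.global_variables_initializer())
            path = restore_latest(sess, saver, train_dir)
            return path, sess.run(w).tolist()
```

Calling `demo()` returns the restored checkpoint path and the value of `w` read back after the restore. If the initializer ran after `restore_latest` instead, `w` would be reset to its initial value, which is one possible way to end up with a model that behaves as if it was never loaded.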

Here is the full content of my inception_train.py, which is modified from the original inception_train.py.

The first training run I launched:

    bazel-bin/inception/flowers_train \
      --train_dir="{$TRAIN_DIR}" \
      --data_dir="{$DATA_DIR}" \
      --fine_tune=True \
      --initial_learning_rate=0.001 \
      --input_queue_memory_factor=1 \
      --batch_size=64 \
      --max_steps=100 \
      --pretrained_model_checkpoint_path="/home/tensorflow/inception-v3/model.ckpt-157585"

I then tried to continue training with this command:

    bazel-bin/inception/flowers_train \
      --train_dir="{$TRAIN_NEW_DIR}" \
      --data_dir="{$DATA_DIR}" \
      --fine_tune=False \
      --initial_learning_rate=0.001 \
      --input_queue_memory_factor=1 \
      --batch_size=64 \
      --max_steps=2000 \
      --checkpoint_dir="{$TRAIN_DIR}"

Can someone explain what I am doing wrong when initializing the trained model?
