python - Reproducing CIFAR-10 results using the TF-Slim NASNet model

Question

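I would like to reproduce the results of the NASNet model on CIFAR-10 for benchmarking purposes, using the TF-Slim implementation at https://github.com/tensorflow/models/tree/master/research/slim. To train this model from scratch, I followed the instructions in the comments (lines 31 to 37) of the script /nets/nasnet/models.py and added the following lines to the original code of train_image_classifier.py.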

After line 247:

elif FLAGS.learning_rate_decay_type == 'cosine':
    return tf.train.cosine_decay(FLAGS.learning_rate,
                                 global_step,
                                 decay_steps,
                                 name='cosine_decay_learning_rate')
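
For reference, the schedule that tf.train.cosine_decay documents can be written out in a few lines. This is a pure-Python sketch rather than the TensorFlow implementation itself; the function name and the sample steps are illustrative, while 0.025 and 937500 are the learning rate and step count from this question.

```python
import math

def cosine_decay(learning_rate, global_step, decay_steps, alpha=0.0):
    """Pure-Python sketch of the documented cosine decay schedule."""
    # Clamp the step so the rate stays at its floor once decay_steps is reached.
    step = min(global_step, decay_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / decay_steps))
    # alpha is the fraction of the initial rate kept as a floor (0.0 by default).
    return learning_rate * ((1.0 - alpha) * cosine + alpha)

# Start, midpoint, end of the schedule, and a step past the end.
for step in (0, 468750, 937500, 1002284):
    print(step, cosine_decay(0.025, step, 937500))
```

With alpha left at 0, the rate reaches zero at decay_steps and stays there for any later step.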

After line 536:

clone_gradients = tf.clip_by_global_norm(clones_gradients, 5.0)
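
As a point of comparison, tf.clip_by_global_norm takes a list of tensors and returns a pair (clipped list, global norm). The sketch below reproduces that computation in pure Python, assuming the gradients arrive as (gradient, variable) pairs that have to be split out first. The sample data and the names grads_and_vars and clipped_grads_and_vars are hypothetical and are not taken from the TF-Slim code.

```python
import math

def clip_by_global_norm(grads, clip_norm):
    """Scale gradients (each a flat list of floats) so that their joint
    L2 norm is at most clip_norm. Returns (clipped_grads, global_norm)."""
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # The scale is 1.0 when the norm is already within the limit.
    scale = clip_norm / max(global_norm, clip_norm)
    clipped = [[g * scale for g in grad] for grad in grads]
    return clipped, global_norm

# Hypothetical (gradient, variable name) pairs standing in for real tensors.
grads_and_vars = [([3.0, 4.0], "w"), ([12.0], "b")]
grads, variables = zip(*grads_and_vars)

# Unpack the returned pair, then re-pair clipped gradients with their variables.
clipped, norm = clip_by_global_norm(list(grads), 5.0)
clipped_grads_and_vars = list(zip(clipped, variables))
print(norm, clipped_grads_and_vars)
```

The returned pair has to be unpacked, and the clipped gradients paired with their variables again, before they can be applied by an optimizer.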

After downloading the CIFAR-10 data and converting it to TFRecord format, I run:

DATASET_DIR=/tmp/data/cifar10
TRAIN_DIR=/tmp/train_logs
python3 train_image_classifier.py \
      --train_dir=${TRAIN_DIR} \
      --dataset_name=cifar10 \
      --dataset_split_name=train \
      --dataset_dir=${DATASET_DIR} \
      --model_name=nasnet_cifar \
      --preprocessing_name=cifarnet  \
      --learning_rate=0.025 \
      --optimizer=momentum \
      --learning_rate_decay_type=cosine \
      --num_epochs_per_decay=600.0 \
      --batch_size=32
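
The flags above fix the length of the decay schedule. A quick arithmetic sketch, assuming a single clone and that the decay steps are derived as samples per epoch divided by batch size, times epochs per decay (the variable names here are illustrative):

```python
# CIFAR-10 has 50,000 training images; the other values come from the flags above.
num_train_samples = 50000
batch_size = 32
num_epochs_per_decay = 600.0

steps_per_epoch = num_train_samples / batch_size
# Number of steps over which the cosine schedule runs (assumed formula).
decay_steps = int(steps_per_epoch * num_epochs_per_decay)
print(steps_per_epoch, decay_steps)
```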

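After 600 epochs (= 937500 steps) training still appears to be running, but the parameters are no longer updated, because the cosine decay brings the learning rate to 0 at 600 epochs. Running the evaluation script: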

DATASET_DIR=/tmp/data/cifar10
TRAIN_DIR=/tmp/train_logs
python3 eval_image_classifier.py \
      --alsologtostderr \
      --checkpoint_path=${TRAIN_DIR} \
      --dataset_name=cifar10 \
      --dataset_split_name=test \
      --dataset_dir=${DATASET_DIR} \
      --model_name=nasnet_cifar \
      --preprocessing_name=cifarnet

This produces the following output:

/home/zelaa/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
WARNING:tensorflow:From eval_image_classifier.py:91: get_or_create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.get_or_create_global_step
INFO:tensorflow:Scale of 0 disables regularizer.
2018-02-24 19:22:39.646499: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties: 
name: GeForce GTX TITAN X major: 5 minor: 2 memoryClockRate(GHz): 1.076
pciBusID: 0000:02:00.0
totalMemory: 11.92GiB freeMemory: 11.81GiB
2018-02-24 19:22:39.646538: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:02:00.0, compute capability: 5.2)
WARNING:tensorflow:From eval_image_classifier.py:155: streaming_accuracy (from tensorflow.contrib.metrics.python.ops.metric_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.metrics.accuracy. Note that the order of the labels and predictions arguments has been switched.
WARNING:tensorflow:From eval_image_classifier.py:157: streaming_recall_at_k (from tensorflow.contrib.metrics.python.ops.metric_ops) is deprecated and will be removed after 2016-11-08.
Instructions for updating:
Please use `streaming_sparse_recall_at_k`, and reshape labels from [batch_size] to [batch_size, 1].
INFO:tensorflow:Evaluating train_logs/model.ckpt-1002284
INFO:tensorflow:Starting evaluation at 2018-02-24-18:22:51
2018-02-24 19:22:52.383834: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:02:00.0, compute capability: 5.2)
INFO:tensorflow:Restoring parameters from train_logs/model.ckpt-1002284
INFO:tensorflow:Evaluation [20/200]
INFO:tensorflow:Evaluation [40/200]
INFO:tensorflow:Evaluation [60/200]
INFO:tensorflow:Evaluation [80/200]
INFO:tensorflow:Evaluation [100/200]
INFO:tensorflow:Evaluation [120/200]
INFO:tensorflow:Evaluation [140/200]
INFO:tensorflow:Evaluation [160/200]
INFO:tensorflow:Evaluation [180/200]
INFO:tensorflow:Evaluation [200/200]
eval/Recall_5[0.9985]
eval/Accuracy[0.9577]
INFO:tensorflow:Finished evaluation at 2018-02-24-18:23:26
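
The two metric lines in the log above can be turned into an error rate with a small helper. This is only a sketch: parse_metrics and the regular expression are illustrative and assume the `eval/<name>[<value>]` format shown in the log.

```python
import re

# The two metric lines copied from the evaluation log.
log_lines = [
    "eval/Recall_5[0.9985]",
    "eval/Accuracy[0.9577]",
]

def parse_metrics(lines):
    """Collect 'eval/<name>[<value>]' lines into a {name: value} dict."""
    pattern = re.compile(r"^eval/(\w+)\[([0-9.]+)\]$")
    metrics = {}
    for line in lines:
        match = pattern.match(line.strip())
        if match:
            metrics[match.group(1)] = float(match.group(2))
    return metrics

metrics = parse_metrics(log_lines)
# Top-1 test error in percent is 100 * (1 - accuracy).
test_error_percent = round(100.0 * (1.0 - metrics["Accuracy"]), 2)
print(test_error_percent)
```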

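So the test error for a single run is 4.23%, which does not correspond to any of the results reported in Learning Transferable Architectures for Scalable Image Recognition. Is there anything that prevents me from matching the paper's results?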
