c++ - How to understand Cifar10 prediction output?

Question

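I trained a Cifar10 (caffe) model for two-class classification: pedestrian and non-pedestrian. Training seems fine; the weights in the caffemodel file were updated. I used two labels, 1 for pedestrians and 2 for non-pedestrians, with pedestrian images (64 x 160) and background images (64 x 160). After training, I test with a positive image (pedestrian) and a negative image (background). My test prototxt file is as follows: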

name: "CIFAR10_quick_test"
layers 
{
  name: "data"
  type: MEMORY_DATA
  top: "data"
  top: "label"
  memory_data_param 
  {
    batch_size: 1
    channels: 3
    height: 160
    width: 64
  }
  transform_param 
  {
    crop_size: 64
    mirror: false
    mean_file: "../../examples/cifar10/mean.binaryproto"
  }
}
layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data"
  top: "conv1"
  blobs_lr: 1
  blobs_lr: 2
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
  }
}
layers {
  name: "pool1"
  type: POOLING
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layers {
  name: "relu1"
  type: RELU
  bottom: "pool1"
  top: "pool1"
}
layers {
  name: "conv2"
  type: CONVOLUTION
  bottom: "pool1"
  top: "conv2"
  blobs_lr: 1
  blobs_lr: 2
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
  }
}
layers {
  name: "relu2"
  type: RELU
  bottom: "conv2"
  top: "conv2"
}
layers {
  name: "pool2"
  type: POOLING
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layers {
  name: "conv3"
  type: CONVOLUTION
  bottom: "pool2"
  top: "conv3"
  blobs_lr: 1
  blobs_lr: 2
  convolution_param {
    num_output: 64
    pad: 2
    kernel_size: 5
    stride: 1
  }
}
layers {
  name: "relu3"
  type: RELU
  bottom: "conv3"
  top: "conv3"
}
layers {
  name: "pool3"
  type: POOLING
  bottom: "conv3"
  top: "pool3"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layers {
  name: "ip1"
  type: INNER_PRODUCT
  bottom: "pool3"
  top: "ip1"
  blobs_lr: 1
  blobs_lr: 2
  inner_product_param {
    num_output: 64
  }
}
layers {
  name: "ip2"
  type: INNER_PRODUCT
  bottom: "ip1"
  top: "ip2"
  blobs_lr: 1
  blobs_lr: 2
  inner_product_param {
    num_output: 10
  }
}
layers {
  name: "prob"
  type: SOFTMAX
  bottom: "ip2"
  top: "prob"
}

For testing, I used test_predict_imagenet.cpp with a few changes, mainly to the paths and the image size.

I don't understand the test output. When I test with a positive image, I get the following output:

I0813 01:55:30.378114  7668 test_predict_cifarnet.cpp:72] 1
I0813 01:55:30.379082  7668 test_predict_cifarnet.cpp:72] 3.90971e-007
I0813 01:55:30.381088  7668 test_predict_cifarnet.cpp:72] 0.00406029
I0813 01:55:30.383090  7668 test_predict_cifarnet.cpp:72] 0.995887
I0813 01:55:30.384119  7668 test_predict_cifarnet.cpp:72] 1.96203e-006
I0813 01:55:30.385095  7668 test_predict_cifarnet.cpp:72] 3.50333e-005
I0813 01:55:30.386119  7668 test_predict_cifarnet.cpp:72] 1.2796e-008
I0813 01:55:30.387097  7668 test_predict_cifarnet.cpp:72] 1.48836e-005
I0813 01:55:30.389093  7668 test_predict_cifarnet.cpp:72] 1.12237e-007
I0813 01:55:30.390100  7668 test_predict_cifarnet.cpp:72] 4.71238e-008
I0813 01:55:30.391101  7668 test_predict_cifarnet.cpp:72] 9.04134e-008

When I test with a negative image, I get the following output:

I0813 01:53:40.896139 10856 test_predict_cifarnet.cpp:72] 1
I0813 01:53:40.897117 10856 test_predict_cifarnet.cpp:72] 6.20882e-006
I0813 01:53:40.898115 10856 test_predict_cifarnet.cpp:72] 7.10468e-005
I0813 01:53:40.900184 10856 test_predict_cifarnet.cpp:72] 0.999911
I0813 01:53:40.901185 10856 test_predict_cifarnet.cpp:72] 3.4275e-006
I0813 01:53:40.902189 10856 test_predict_cifarnet.cpp:72] 2.38526e-007
I0813 01:53:40.903192 10856 test_predict_cifarnet.cpp:72] 2.29073e-007
I0813 01:53:40.905187 10856 test_predict_cifarnet.cpp:72] 1.7243e-006
I0813 01:53:40.906188 10856 test_predict_cifarnet.cpp:72] 5.40765e-007
I0813 01:53:40.908195 10856 test_predict_cifarnet.cpp:72] 1.57534e-006
I0813 01:53:40.909195 10856 test_predict_cifarnet.cpp:72] 3.72312e-006

How should I interpret the test output?

Also, is there a more efficient way to test the model on a video feed (frame by frame from a video clip)?

score 2 · Accepted Answer

Why do you have num_output: 10 for the last layer, ip2? Don't you need only a two-way classifier? And why are you using labels 1 and 2 instead of 0 and 1?

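What you got: you have 11 outputs. The first is the "label" output from the data layer, and the other 10 are the 10-vector output of the softmax layer. Because you trained with only two labels, it is unclear what the values in the 10-vector mean: 8 of the 10 entries were never supervised at all. Moreover, judging by the first output, both tests seem to have been run on samples labeled 1, and neither on a sample labeled 2.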
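To make the 1 + 10 split concrete, here is a minimal sketch that separates the printed values into the label and the probability vector and finds the most probable class index. `ParsedOutput` and `parse_output` are hypothetical names made up for illustration; they are not part of Caffe or of test_predict_imagenet.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical container for one forward pass as printed in the question:
// the first value is the "label" top, the remaining values are the "prob" top.
struct ParsedOutput {
    float label;               // value of the "label" blob
    std::vector<float> probs;  // softmax probabilities, one per class
    std::size_t argmax;        // index of the largest probability
};

// Splits the flat list of printed values into label + probabilities.
ParsedOutput parse_output(const std::vector<float>& values) {
    assert(values.size() >= 2 && "need a label plus at least one probability");
    ParsedOutput out;
    out.label = values.front();
    out.probs.assign(values.begin() + 1, values.end());
    out.argmax = static_cast<std::size_t>(std::distance(
        out.probs.begin(),
        std::max_element(out.probs.begin(), out.probs.end())));
    return out;
}
```

Fed the 11 values from either log in the question, this gives a label of 1 and the largest probability at index 2 of the 10-vector (0.995887 for the positive image, 0.999911 for the negative one), so the network as trained does not separate the two images.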

What you should do:
1. Change the top fully connected layer to have only 2 outputs (I also changed the format to match the newer version of protobuf):

layer {
  name: "ip2/pedestrains"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 2  # this is the value you need to change
  }
}

2. Change the binary labels in your training data to 0/1 instead of 1/2.
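The reason for step 2, as far as I know, is that Caffe's softmax loss expects integer labels in the range 0 to num_output - 1, so with num_output: 2 a label of 2 would be out of range. A minimal sketch of the remapping follows. The function names are hypothetical, and which class becomes 0 and which becomes 1 is an assumption (here pedestrian -> 1, non-pedestrian -> 0); any consistent assignment should do.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Maps the question's labels (1 = pedestrian, 2 = non-pedestrian) to 0-based
// labels. The target assignment below is an assumption made for illustration.
int remap_label(int old_label) {
    switch (old_label) {
        case 1: return 1;  // pedestrian
        case 2: return 0;  // non-pedestrian (background)
        default: throw std::invalid_argument("unexpected label");
    }
}

// Remaps a whole label list and checks every result fits in [0, num_output).
std::vector<int> remap_labels(const std::vector<int>& old_labels, int num_output) {
    std::vector<int> result;
    result.reserve(old_labels.size());
    for (int old_label : old_labels) {
        const int label = remap_label(old_label);
        assert(label >= 0 && label < num_output);
        result.push_back(label);
    }
    return result;
}
```

You would apply this when generating the training list or database, before retraining.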

Now you can train again and see what you get.
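Once ip2 has two outputs, the "prob" layer yields two values that sum to 1, and reading the result becomes a simple comparison. The sketch below only illustrates that arithmetic on plain vectors and does not call Caffe. `softmax` and `is_pedestrian` are hypothetical helpers, and both the index of the pedestrian class and the 0.5 threshold are assumptions that depend on how you assign labels.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable softmax over raw scores (what a softmax layer computes
// from the outputs of the last inner-product layer).
std::vector<double> softmax(const std::vector<double>& scores) {
    assert(!scores.empty());
    const double max_score = *std::max_element(scores.begin(), scores.end());
    std::vector<double> probs(scores.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < scores.size(); ++i) {
        probs[i] = std::exp(scores[i] - max_score);  // shift avoids overflow
        sum += probs[i];
    }
    for (double& p : probs) p /= sum;
    return probs;
}

// Returns true when the probability at pedestrian_index reaches the threshold.
// pedestrian_index is an assumption: use whichever label you gave pedestrians.
bool is_pedestrian(const std::vector<double>& probs,
                   std::size_t pedestrian_index,
                   double threshold = 0.5) {
    assert(pedestrian_index < probs.size());
    return probs[pedestrian_index] >= threshold;
}
```

For example, `is_pedestrian(softmax({-1.0, 3.0}), 1)` is true, because the second score dominates after normalization.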
