cntk - Combine image data, row ID, and labels into a single input file?

Question

I have training/test input files in this format (file name, label):

...\000881.JPG  2
...\000961.JPG  1
...\001700.JPG  1
...\001291.JPG  1

The input file above is consumed by ImageDeserializer. Because I could not retrieve the row ID and label from code after training the model, I created a second test file in the following format:

|index 881 |piece_type 0 0 1 0 0 0
|index 961 |piece_type 0 1 0 0 0 0
|index 1700 |piece_type 0 1 0 0 0 0
|index 1291 |piece_type 0 1 0 0 0 0

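The second file carries the same information as the first, just in a different format: index is the row number and |piece_type is the label in one-hot encoding. I need the file in the second format to get at the row number and label. It is read by a CTFDeserializer, which is combined with the image deserializer into a composite reader like this: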

image_source = ImageDeserializer(map_file, StreamDefs(
    features = StreamDef(field='image', transforms=transforms), # first column in map file is referred to as 'image'
    labels   = StreamDef(field='label', shape=num_classes)      # and second as 'label'
))

text_source = CTFDeserializer("test_map2.txt")
text_source.map_input('index', dim=1, format="dense")
text_source.map_input('piece_type', dim=6, format="dense")

# define a composite reader
reader_config = ReaderConfig([image_source, text_source])

minibatch_source = reader_config.minibatch_source()
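For reference, the second file can be derived mechanically from the first, so at least it does not have to be maintained by hand. This is only a minimal sketch: the helper names map_line_to_ctf and convert_map_lines are made up for illustration, and it assumes the index is the numeric stem of the image file name (as in the sample rows) and that there are 6 classes.

```python
import re


def map_line_to_ctf(line, num_classes=6):
    """Turn one 'path<whitespace>label' map-file line into a CTF line.

    Assumption: the index is the numeric stem of the image file name,
    e.g. '000881.JPG' -> 881, as in the sample rows of the question.
    """
    # Split on the last run of whitespace so paths with spaces survive.
    path, label_text = line.rstrip().rsplit(None, 1)
    label = int(label_text)
    if not 0 <= label < num_classes:
        raise ValueError("label {} is outside 0..{}".format(label, num_classes - 1))

    # Accept both '\' and '/' separators so Windows paths parse on any OS.
    filename = re.split(r"[\\/]", path)[-1]
    index = int(filename.rsplit(".", 1)[0])

    one_hot = ["1" if i == label else "0" for i in range(num_classes)]
    return "|index {} |piece_type {}".format(index, " ".join(one_hot))


def convert_map_lines(lines, num_classes=6):
    """Convert every non-blank map-file line; returns a list of CTF lines."""
    return [map_line_to_ctf(line, num_classes) for line in lines if line.strip()]
```

Writing the result out is then a matter of joining the returned lines with newlines and saving them under the name passed to CTFDeserializer.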

I added the second file so that I can build a confusion matrix: for any given minibatch under test I need both the true and the predicted labels. The row number is also handy as a pointer back to the input image.
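To make the goal concrete, the accumulation step I have in mind looks roughly like the sketch below. It is framework-agnostic plain Python: the function names are hypothetical, and it assumes the true labels arrive as one-hot rows and the predictions as per-class score rows, however they end up being pulled out of the minibatch.

```python
def new_confusion_matrix(num_classes):
    """Create a num_classes x num_classes matrix of zeros (rows = true class)."""
    return [[0] * num_classes for _ in range(num_classes)]


def argmax(values):
    """Index of the largest entry; works for one-hot rows and score rows alike."""
    return max(range(len(values)), key=lambda i: values[i])


def update_confusion(matrix, true_one_hot, predicted_scores):
    """Add one minibatch to the matrix: matrix[true_class][predicted_class] += 1."""
    for truth, scores in zip(true_one_hot, predicted_scores):
        matrix[argmax(truth)][argmax(scores)] += 1
    return matrix
```

Calling update_confusion once per test minibatch and reading the off-diagonal cells afterwards gives the misclassification counts per class pair.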

Is there a way to do this with just one input file? Dealing with multiple files and formats is a bit cumbersome.

score 0 · Accepted Answer

You could load the test images without using a reader, as described in this wiki page. Admittedly this puts the burden of all the transformations (cropping, mean subtraction, etc.) on the user, but at least the PIL package makes these easy. This CNTK tutorial uses PIL to crop and scale the input images before feeding them to CNTK.
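A rough sketch of that approach, using only the original map file, might look like the following. The names read_map_file and load_image are hypothetical, the row ID is assumed to be the numeric stem of the file name, and the preprocessing (plain resize, BGR channel order, channels-first float32 layout) is an assumption that has to be matched to whatever transforms were used during training.

```python
import re


def read_map_file(lines):
    """Parse 'path<whitespace>label' lines into (row_id, path, label) tuples.

    Assumption: the row ID is the numeric stem of the image file name.
    """
    records = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        path, label_text = line.rstrip().rsplit(None, 1)
        filename = re.split(r"[\\/]", path)[-1]
        row_id = int(filename.rsplit(".", 1)[0])
        records.append((row_id, path, int(label_text)))
    return records


def load_image(path, width, height):
    """Load one image as a channels-first float32 array.

    Assumption: the model expects BGR channel order and no mean
    subtraction; adjust this to mirror the training-time transforms.
    """
    # Imported here so that parsing the map file works without
    # NumPy or Pillow installed.
    import numpy as np
    from PIL import Image

    with Image.open(path) as img:
        resized = img.convert("RGB").resize((width, height))
    pixels = np.asarray(resized, dtype=np.float32)  # height x width x RGB
    bgr = pixels[..., ::-1]                         # reverse channels to BGR
    return np.ascontiguousarray(bgr.transpose(2, 0, 1))
```

With the records in hand, each image can be loaded and evaluated one by one, so the row ID and true label stay next to the prediction and no second file is needed.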
