python - Cannot predict the label for a single image with VGG19 in Keras

Question

I'm using transfer learning with a pre-trained VGG19 model in Keras, following [this tutorial](https://towardsdatascience.com/keras-transfer-learning-for-beginners-6c9b8b7143e). It shows how to train the model, but not how to prepare a test image for prediction.

The comments section says:

"Get an image, preprocess it using the same preprocess_image function, and call model.predict(image). This will give you the model's prediction for that image. Using argmax(prediction), you can find the class the image belongs to."

I can't find a function named preprocess_image anywhere in the code. I did some searching and decided to try a method suggested in another tutorial.

However, this raises the following error:

decode_predictions expects a batch of predictions (i.e. a 2D array of shape (samples, 1000)). Found array with shape: (1, 12)

My dataset has 12 categories. Here is the full code for training the model, followed by the session in which the error occurred.

import pandas as pd
import numpy as np
import os
import keras
import matplotlib.pyplot as plt

from keras.layers import Dense, GlobalAveragePooling2D
from keras.applications.vgg19 import VGG19
from keras.preprocessing import image
from keras.applications.vgg19 import preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Model
from keras.optimizers import Adam

base_model = VGG19(weights='imagenet', include_top=False)

x=base_model.output                                                          
x=GlobalAveragePooling2D()(x)                                                
x=Dense(1024,activation='relu')(x)                                           
x=Dense(1024,activation='relu')(x)                                           
x=Dense(512,activation='relu')(x)        

preds=Dense(12,activation='softmax')(x)                                      
model=Model(inputs=base_model.input,outputs=preds)                           

# view the layer architecture
# for i,layer in enumerate(model.layers):
#   print(i,layer.name)

for layer in model.layers:
    layer.trainable=False

for layer in model.layers[:20]:
    layer.trainable=False

for layer in model.layers[20:]:
    layer.trainable=True

train_datagen=ImageDataGenerator(preprocessing_function=preprocess_input)

train_generator=train_datagen.flow_from_directory('dataset',
                    target_size=(96,96), # 224, 224
                    color_mode='rgb',
                    batch_size=64,
                    class_mode='categorical',
                    shuffle=True)

model.compile(optimizer='Adam',loss='categorical_crossentropy',metrics=['accuracy'])

step_size_train=train_generator.n//train_generator.batch_size

model.fit_generator(generator=train_generator,
    steps_per_epoch=step_size_train,
    epochs=5)

# model.predict(new_image)

IPython:

In [3]: import classify_tl                                                                                                                                                   
Found 4750 images belonging to 12 classes.
Epoch 1/5
74/74 [==============================] - 583s 8s/step - loss: 2.0113 - acc: 0.4557
Epoch 2/5
74/74 [==============================] - 576s 8s/step - loss: 0.8222 - acc: 0.7170
Epoch 3/5
74/74 [==============================] - 563s 8s/step - loss: 0.5875 - acc: 0.7929
Epoch 4/5
74/74 [==============================] - 585s 8s/step - loss: 0.3897 - acc: 0.8627
Epoch 5/5
74/74 [==============================] - 610s 8s/step - loss: 0.2689 - acc: 0.9071

In [6]: model = classify_tl.model                                                                                                                                            

In [7]: print(model)                                                                                                                                                         
<keras.engine.training.Model object at 0x7fb3ad988518>

In [8]: from keras.preprocessing.image import load_img                                                                                                                       

In [9]: image = load_img('examples/0021e90e4.png', target_size=(96,96))                                                                                                      

In [10]: from keras.preprocessing.image import img_to_array                                                                                                                  

In [11]: image = img_to_array(image)                                                                                                                                         

In [12]: image = image.reshape((1, image.shape[0], image.shape[1], image.shape[2]))                                                                                          

In [13]: from keras.applications.vgg19 import preprocess_input                                                                                                               

In [14]: image = preprocess_input(image)                                                                                                                                     

In [15]: yhat = model.predict(image)                                                                                                                                         

In [16]: print(yhat)                                                                                                                                                         
[[1.3975363e-06 3.1069856e-05 9.9680350e-05 1.7175063e-03 6.2767825e-08
  2.6133494e-03 7.2859187e-08 6.0187017e-07 2.0794137e-06 1.3714411e-03
  9.9416250e-01 2.6067207e-07]]

In [17]: from keras.applications.vgg19 import decode_predictions                                                                                                             

In [18]: label = decode_predictions(yhat)

The last line at the IPython prompt raises the following error:

ValueError: `decode_predictions` expects a batch of predictions (i.e. a 2D array of shape (samples, 1000)). Found array with shape: (1, 12)

How should I properly feed the test image to the model and get its prediction?

score 1 · Accepted Answer

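decode_predictions decodes a model's predictions according to the class labels of the ImageNet dataset, which has 1000 classes. Your fine-tuned model, however, has only 12 classes, so it does not make sense to use decode_predictions here. You presumably know what the labels of those 12 classes are, so just take the index of the maximum score in the prediction and look up its label: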

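import numpy as np

# create a list containing the class labels, ordered by the model's output index
# (placeholder names; fill in all 12 labels in place of the "...")
class_labels = ['class1', 'class2', 'class3', ..., 'class12']

# find the index of the class with the maximum score
# (yhat is the output of model.predict(image), shape (1, 12))
pred = np.argmax(yhat, axis=-1)

# print the label of the class with the maximum score
print(class_labels[pred[0]])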
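
If typing the label list by hand is error-prone, it can probably be derived from the generator instead: the iterator returned by flow_from_directory exposes a class_indices dict that maps each sub-directory name to its output index. The following is only a minimal sketch of that lookup, written in plain Python so it runs without Keras installed. The helper names labels_by_index and top_k and the class_NN label names are made up for illustration; the scores are the ones printed in the question.

```python
def labels_by_index(class_indices):
    """Invert a {label: index} dict into a list where labels[index] == label."""
    labels = [None] * len(class_indices)
    for label, index in class_indices.items():
        labels[index] = label
    return labels


def top_k(scores, labels, k=3):
    """Return the k highest-scoring (label, score) pairs for one image."""
    if len(scores) != len(labels):
        raise ValueError("expected %d scores, got %d" % (len(labels), len(scores)))
    # Sort the class indices by score, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(labels[i], float(scores[i])) for i in ranked[:k]]


# Hypothetical mapping standing in for train_generator.class_indices;
# the real keys are the sub-directory names under 'dataset'.
class_indices = {"class_%02d" % i: i for i in range(12)}

# The prediction printed in the question (one image, 12 scores).
yhat = [[1.3975363e-06, 3.1069856e-05, 9.9680350e-05, 1.7175063e-03,
         6.2767825e-08, 2.6133494e-03, 7.2859187e-08, 6.0187017e-07,
         2.0794137e-06, 1.3714411e-03, 9.9416250e-01, 2.6067207e-07]]

labels = labels_by_index(class_indices)
best = top_k(yhat[0], labels, k=3)
print(best[0][0])  # label of the highest-scoring class
```

With the objects from the question, the call would be along the lines of top_k(yhat[0], labels_by_index(classify_tl.train_generator.class_indices)). Listing a few runners-up next to the winner also makes it easier to spot when the model is unsure between two classes.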
