tensorflow - ML-Engine prediction fails with an error, but local prediction works fine

Question

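I have searched here a lot, but unfortunately could not find an answer.

I am running TensorFlow 1.3 on my local machine (installed via pip on macOS) and created a model from the provided "ssd_mobilenet_v1_coco" checkpoint.

I trained both locally and on ML-Engine (Runtime 1.2), and successfully deployed the savedModel to ML-Engine.

Local prediction (command below) works fine and returns the model results: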

gcloud ml-engine local predict --model-dir=... --json-instances=request.json

 FILE request.json: {"inputs": [[[242, 240, 239], [242, 240, 239], [242, 240, 239], [242, 240, 239], [242, 240, 23]]]}

However, when I deploy the model and try to run a remote prediction on ML-Engine with the command below (using the same request.json file as before):

gcloud ml-engine predict --model "testModel" --json-instances request.json

I get the following error:

{
  "error": "Prediction failed: Exception during model execution: AbortionError(code=StatusCode.INVALID_ARGUMENT, details=\"NodeDef mentions attr 'data_format' not in Op<name=DepthwiseConv2dNative; signature=input:T, filter:T -> output:T; attr=T:type,allowed=[DT_FLOAT, DT_DOUBLE]; attr=strides:list(int); attr=padding:string,allowed=[\"SAME\", \"VALID\"]>; NodeDef: FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/depthwise = DepthwiseConv2dNative[T=DT_FLOAT, _output_shapes=[[-1,150,150,32]], data_format=\"NHWC\", padding=\"SAME\", strides=[1, 1, 1, 1], _device=\"/job:localhost/replica:0/task:0/cpu:0\"](FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/Relu6, FeatureExtractor/MobilenetV1/Conv2d_1_depthwise/depthwise_weights/read)\n\t [[Node: FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/depthwise = DepthwiseConv2dNative[T=DT_FLOAT, _output_shapes=[[-1,150,150,32]], data_format=\"NHWC\", padding=\"SAME\", strides=[1, 1, 1, 1], _device=\"/job:localhost/replica:0/task:0/cpu:0\"](FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/Relu6, FeatureExtractor/MobilenetV1/Conv2d_1_depthwise/depthwise_weights/read)]]\")"
}

I saw something similar here: https://github.com/tensorflow/models/issues/1581

That issue is about a problem with the "data_format" attribute. Unfortunately, I could not use its solution, because I am already on TensorFlow 1.3.

There also seems to be a problem with MobilenetV1: https://github.com/tensorflow/models/issues/2153

Any ideas?

score 2 · Accepted Answer

If you are wondering how to make sure your model version runs on the TensorFlow version it needs, first take a look at this model version list page.

You need to know which runtime version supports the TensorFlow version you need. At the time of writing:

ML version 1.4 supports TensorFlow 1.4.0 and 1.4.1
ML version 1.2 supports TensorFlow 1.2.0
ML version 1.0 supports TensorFlow 1.0.1
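
As a rough illustration of how the list above could be used, here is a minimal Python sketch that picks a runtime version for a given local TensorFlow version. The table is copied from the list above and is only as current as this answer. The helper name pick_runtime and the rule "take the lowest listed runtime that is not older than the exporting TensorFlow version" are my own assumptions, not something the service documents. For the question's setup (TensorFlow 1.3 locally) this rule would suggest runtime 1.4 rather than 1.2.

```python
# Runtime versions and the TensorFlow releases they support, copied from the
# list in this answer (accurate only as of the time of writing).
RUNTIME_TO_TF = {
    "1.4": ["1.4.0", "1.4.1"],
    "1.2": ["1.2.0"],
    "1.0": ["1.0.1"],
}


def _major_minor(version):
    """Turn '1.4.1' or '1.4' into the tuple (1, 4)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def pick_runtime(tf_version):
    """Hypothetical helper: return the lowest listed runtime version that is
    not older than the TensorFlow version used to export the model, or None
    if no listed runtime is new enough. The selection rule is an assumption.
    """
    wanted = _major_minor(tf_version)
    for runtime in sorted(_major_minor(r) for r in RUNTIME_TO_TF):
        if runtime >= wanted:
            return "%d.%d" % runtime
    return None


if __name__ == "__main__":
    print(pick_runtime("1.3.0"))
```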

Now that you know which runtime version you need, create a new version of your model like this:

gcloud ml-engine versions create <version name> \
--model=<Name of the model> \
--origin=<Model bucket link. It starts with gs://...> \
--runtime-version=1.4
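
If you create versions from a script rather than by hand, the same command can be assembled in Python. This is only a sketch: the function name build_versions_create and its parameters are hypothetical, and it uses nothing beyond the flags shown in the command above. It only builds the argument list; actually running it requires an installed and authenticated gcloud.

```python
import subprocess


def build_versions_create(version_name, model_name, origin, runtime_version):
    """Hypothetical helper: build the argument list for the
    'gcloud ml-engine versions create' command shown above."""
    # The answer notes that the model location starts with gs://.
    if not origin.startswith("gs://"):
        raise ValueError("origin must be a gs:// path")
    return [
        "gcloud", "ml-engine", "versions", "create", version_name,
        "--model=" + model_name,
        "--origin=" + origin,
        "--runtime-version=" + runtime_version,
    ]


def create_version(version_name, model_name, origin, runtime_version):
    """Run the command; raises CalledProcessError if gcloud exits non-zero."""
    cmd = build_versions_create(version_name, model_name, origin, runtime_version)
    subprocess.check_call(cmd)
```

Passing the runtime version explicitly every time avoids relying on whatever default the service would otherwise pick.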

In my case, I needed to predict with TensorFlow 1.4.1, so I used runtime version 1.4.

See this official MNIST tutorial page and this ML versioning page.
