pytorch - Error when deploying a PyTorch model using SageMaker Pipelines and RegisterModel

Question

Can anyone provide an example of deploying a PyTorch model using SageMaker Pipelines?

I built an MLOps project using the SageMaker Studio MLOps template (the MLOps template for model building, training, and deployment).

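The template uses SageMaker Pipelines to build a pipeline that preprocesses the data, trains the model, and registers it. The deployment is implemented in a YAML file and executed with CloudFormation. Once a model is registered, the deployment is triggered automatically.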

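The template trains and deploys an XGBoost model. I want to deploy a PyTorch model instead. I replaced XGBoost with PyTorch, and data preprocessing, model training, and model registration all succeeded. However, I did not provide an inference.py for the model, so the deployment fails.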

The error log when updating the endpoint is:

FileNotFoundError: [Errno 2] No such file or directory: '/opt/ml/model/code/inference.py'

I tried to find an example that uses inference.py with a PyTorch model, but I could not find one that uses SageMaker Pipelines and RegisterModel.
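
For reference, here is a minimal sketch of what such an inference.py might look like. The four function names (model_fn, input_fn, predict_fn, output_fn) are the handler hooks that the SageMaker PyTorch serving container looks for. Everything else is an assumption made for illustration: that train.py saved a TorchScript model named model.pt, and that requests and responses are plain text/csv as declared in the RegisterModel step. torch is imported inside the functions that need it only so that the CSV helpers can be exercised without it.

```python
"""Hypothetical inference.py for the SageMaker PyTorch serving container."""
import os


def model_fn(model_dir):
    """Load the model from the directory where model.tar.gz was extracted."""
    import torch  # imported lazily so the CSV helpers below work without torch

    # Assumption: train.py saved a TorchScript model under the name "model.pt".
    model = torch.jit.load(os.path.join(model_dir, "model.pt"), map_location="cpu")
    model.eval()
    return model


def input_fn(request_body, request_content_type):
    """Parse a text/csv request body into a list of rows of floats."""
    if request_content_type != "text/csv":
        raise ValueError(f"Unsupported content type: {request_content_type}")
    if isinstance(request_body, (bytes, bytearray)):
        request_body = request_body.decode("utf-8")
    rows = []
    for line in request_body.splitlines():
        if line.strip():  # skip blank lines
            rows.append([float(value) for value in line.split(",")])
    return rows


def predict_fn(input_object, model):
    """Run the model on the parsed rows and return plain Python lists."""
    import torch

    with torch.no_grad():
        features = torch.tensor(input_object, dtype=torch.float32)
        return model(features).tolist()


def output_fn(prediction, response_content_type):
    """Serialize predictions as text/csv, one row per line."""
    if response_content_type != "text/csv":
        raise ValueError(f"Unsupported response type: {response_content_type}")
    lines = []
    for row in prediction:
        values = row if isinstance(row, list) else [row]
        lines.append(",".join(str(value) for value in values))
    return "\n".join(lines)
```

If train.py saves a state_dict instead of a TorchScript model, model_fn would need to rebuild the network class before loading the weights.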

Any help would be greatly appreciated.

Below is the part of the pipeline that trains and registers the model.

import os

import sagemaker
from sagemaker.pytorch.estimator import PyTorch
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import (
    ProcessingStep,
    TrainingStep,
)
from sagemaker.workflow.step_collections import RegisterModel

# BASE_DIR, role, step_process, model_package_group_name, pipeline_name,
# sagemaker_session, and the pipeline parameters are defined elsewhere in the
# pipeline script (not shown).

# Estimator that runs train.py in the PyTorch 1.8.0 training container.
pytorch_estimator = PyTorch(
    entry_point=os.path.join(BASE_DIR, "train.py"),
    instance_type="ml.m5.xlarge",
    instance_count=1,
    role=role,
    framework_version="1.8.0",
    py_version="py3",
    hyperparameters={"epochs": 5, "batch-size": 64, "learning-rate": 0.1},
)

# Training step: each channel reads one output of the processing step.
step_train = TrainingStep(
    name="TrainModel",
    estimator=pytorch_estimator,
    inputs={
        "train": sagemaker.TrainingInput(
            s3_data=step_process.properties.ProcessingOutputConfig.Outputs[
                "train_data"
            ].S3Output.S3Uri,
            content_type="text/csv",
        ),
        "dev": sagemaker.TrainingInput(
            s3_data=step_process.properties.ProcessingOutputConfig.Outputs[
                "dev_data"
            ].S3Output.S3Uri,
            content_type="text/csv",
        ),
        "test": sagemaker.TrainingInput(
            s3_data=step_process.properties.ProcessingOutputConfig.Outputs[
                "test_data"
            ].S3Output.S3Uri,
            content_type="text/csv",
        ),
    },
)

# Registration step: registers the training artifact as-is. No inference
# script is attached here, which is where the deployment error comes from.
step_register = RegisterModel(
    name="RegisterModel",
    estimator=pytorch_estimator,
    model_data=step_train.properties.ModelArtifacts.S3ModelArtifacts,
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=["ml.t2.medium", "ml.m5.large"],
    transform_instances=["ml.m5.large"],
    model_package_group_name=model_package_group_name,
    approval_status=model_approval_status,
)

pipeline = Pipeline(
    name=pipeline_name,
    parameters=[
        processing_instance_type,
        processing_instance_count,
        training_instance_type,
        model_approval_status,
        input_data,
    ],
    steps=[step_process, step_train, step_register],
    sagemaker_session=sagemaker_session,
)
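
The path in the error, /opt/ml/model/code/inference.py, suggests that the serving container expects the script inside a code/ directory of the extracted model.tar.gz, while the artifact registered above presumably contains only what train.py saved. As a rough illustration of that layout (not a confirmed fix), the sketch below builds an archive with the model file at the root and the script under code/. The helper names package_model and list_archive and the file name model.pt are hypothetical, and only the standard library is used.

```python
import tarfile
from pathlib import Path


def package_model(model_file, inference_script, output_path):
    """Write a model.tar.gz with the model at the root and the script under code/."""
    model_file = Path(model_file)
    with tarfile.open(output_path, "w:gz") as archive:
        # The model file keeps its own name at the archive root.
        archive.add(model_file, arcname=model_file.name)
        # The handler script is stored at the path named in the error message.
        archive.add(inference_script, arcname="code/inference.py")
    return output_path


def list_archive(path):
    """Return the sorted member names of a gzip-compressed tar archive."""
    with tarfile.open(path, "r:gz") as archive:
        return sorted(archive.getnames())
```

Calling list_archive on the artifact produced by the training step is a quick way to check whether code/inference.py is actually present before the model is registered.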
