Argo で手動ワークフローを開始したいと考えています。私は Openshift と ArgoCD を使用しており、Argo で正常に実行されるワークフローをスケジュールしましたが、1 つのワークフローの手動実行をトリガーすると失敗します。
関連するワークフローは次のとおりです。
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
generateName: "obslytics-data-exporter-manual-workflow-"
labels:
owner: "obslytics-remote-reader"
app: "obslytics-data-exporter"
pipeline: "obslytics-data-exporter"
spec:
arguments:
parameters:
- name: start_timestamp
value: "2020-11-18T20:00:00Z"
entrypoint: manual-trigger
templates:
- name: manual-trigger
steps:
- - name: trigger
templateRef:
name: "obslytics-data-exporter-workflow-triggers"
template: trigger-workflow
volumes:
- name: "obslytics-data-exporter-workflow-secrets"
secret:
secretname: "obslytics-data-exporter-workflow-secrets"
コマンドを実行すると:
argo submit trigger.local.yaml
ビルド Pod は完了しましたが、残りの Pod は失敗します。
➜ dh-workflow-obslytics git:(master) ✗ oc get pods
NAME READY STATUS RESTARTS AGE
argo-ui-7fcf5ff95-9k8cc 1/1 Running 0 3d
gateway-controller-76bb888f7b-lq84r 1/1 Running 0 3d
obslytics-data-exporter-1-build 0/1 Completed 0 3d
obslytics-data-exporter-calendar-gateway-fbbb8d7-zhdnf 2/2 Running 1 3d
obslytics-data-exporter-manual-workflow-m7jdg-1074461258 0/2 Error 0 4m
obslytics-data-exporter-manual-workflow-m7jdg-1477271209 0/2 Error 0 4m
obslytics-data-exporter-manual-workflow-m7jdg-1544087495 0/2 Error 0 4m
obslytics-data-exporter-manual-workflow-m7jdg-1979266120 0/2 Completed 0 4m
obslytics-data-exporter-sensor-6594954795-xw8fk 1/1 Running 0 3d
opendatahub-operator-8994ddcf8-v8wxm 1/1 Running 0 3d
sensor-controller-58bdc7c4f4-9h4jw 1/1 Running 0 3d
workflow-controller-759649b79b-s69l7 1/1 Running 0 3d
で始まるポッドobslytics-data-exporter-manual-workflow
は、問題のあるポッドです。ポッドを記述してデバッグしようとすると:
➜ dh-workflow-obslytics git:(master) ✗ oc describe pods/obslytics-data-exporter-manual-workflow-4hzqz-3278280317
Name: obslytics-data-exporter-manual-workflow-4hzqz-3278280317
Namespace: dh-dev-argo
Priority: 0
PriorityClassName: <none>
Node: avsrivas-dev-ocp-3.11/10.0.111.224
Start Time: Tue, 24 Nov 2020 07:27:57 -0500
Labels: workflows.argoproj.io/completed=true
workflows.argoproj.io/workflow=obslytics-data-exporter-manual-workflow-4hzqz
Annotations: openshift.io/scc=restricted
workflows.argoproj.io/node-message=timeout after 0s
workflows.argoproj.io/node-name=obslytics-data-exporter-manual-workflow-4hzqz[0].trigger[1].run[0].metric-split(0:cluster_version)[0].process-metric(0)
workflows.argoproj.io/template={"name":"run-obslytics","arguments":{},"inputs":{"parameters":[{"name":"metric","value":"cluster_version"},{"name":"start_timestamp","value":"2020-11-18T20:00:00Z"},{"na...
Status: Failed
IP: 10.128.0.69
Controlled By: Workflow/obslytics-data-exporter-manual-workflow-4hzqz
Init Containers:
init:
Container ID: docker://25b95c684ef66b13520ba9deeba353082142f3bb39bafe443ee508074c58047e
Image: argoproj/argoexec:v2.4.2
Image ID: docker-pullable://docker.io/argoproj/argoexec@sha256:4e393daa6ed985cf680bcf0ecf04f7b0758940f0789505428331fcfe99cce06b
Port: <none>
Host Port: <none>
Command:
argoexec
init
State: Terminated
Reason: Completed
Exit Code: 0
Started: Tue, 24 Nov 2020 07:27:59 -0500
Finished: Tue, 24 Nov 2020 07:27:59 -0500
Ready: True
Restart Count: 0
Environment:
ARGO_POD_NAME: obslytics-data-exporter-manual-workflow-4hzqz-3278280317 (v1:metadata.name)
ARGO_CONTAINER_RUNTIME_EXECUTOR: k8sapi
Mounts:
/argo/podmetadata from podmetadata (rw)
/argo/staging from argo-staging (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-qpggm (ro)
Containers:
wait:
Container ID: docker://a94e7f1bc1cfec4c8b549120193b697c91760bb8f3af414babef1d6f7ccee831
Image: argoproj/argoexec:v2.4.2
Image ID: docker-pullable://docker.io/argoproj/argoexec@sha256:4e393daa6ed985cf680bcf0ecf04f7b0758940f0789505428331fcfe99cce06b
Port: <none>
Host Port: <none>
Command:
argoexec
wait
State: Terminated
Reason: Completed
Message: timeout after 0s
Exit Code: 0
Started: Tue, 24 Nov 2020 07:28:00 -0500
Finished: Tue, 24 Nov 2020 07:28:01 -0500
Ready: False
Restart Count: 0
Environment:
ARGO_POD_NAME: obslytics-data-exporter-manual-workflow-4hzqz-3278280317 (v1:metadata.name)
ARGO_CONTAINER_RUNTIME_EXECUTOR: k8sapi
Mounts:
/argo/podmetadata from podmetadata (rw)
/mainctrfs/argo/staging from argo-staging (rw)
/mainctrfs/etc/obslytics-data-exporter from obslytics-data-exporter-workflow-secrets (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-qpggm (ro)
main:
Container ID: docker://<some_id>
Image: docker-registry.default.svc:5000/<some_id>
Image ID: docker-pullable://docker-registry.default.svc:5000/<some_id>
Port: <none>
Host Port: <none>
Command:
/bin/sh
-e
Args:
/argo/staging/script
State: Terminated
Reason: Error
Exit Code: 126
Started: Tue, 24 Nov 2020 07:28:01 -0500
Finished: Tue, 24 Nov 2020 07:28:01 -0500
Ready: False
Restart Count: 0
Limits:
memory: 1Gi
Requests:
memory: 1Gi
Environment: <none>
Mounts:
/argo/staging from argo-staging (rw)
/etc/obslytics-data-exporter from obslytics-data-exporter-workflow-secrets (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-qpggm (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
podmetadata:
Type: DownwardAPI (a volume populated by information about the pod)
Items:
metadata.annotations -> annotations
obslytics-data-exporter-workflow-secrets:
Type: Secret (a volume populated by a Secret)
SecretName: obslytics-data-exporter-workflow-secrets
Optional: false
argo-staging:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
default-token-qpggm:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-qpggm
Optional: false
QoS Class: Burstable
Node-Selectors: node-role.kubernetes.io/compute=true
Tolerations: node.kubernetes.io/memory-pressure:NoSchedule
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 27m default-scheduler Successfully assigned dh-dev-argo/obslytics-data-exporter-manual-workflow-4hzqz-3278280317 to avsrivas-dev-ocp-3.11
Normal Pulled 27m kubelet, avsrivas-dev-ocp-3.11 Container image "argoproj/argoexec:v2.4.2" already present on machine
Normal Created 27m kubelet, avsrivas-dev-ocp-3.11 Created container
Normal Started 27m kubelet, avsrivas-dev-ocp-3.11 Started container
Normal Pulled 27m kubelet, avsrivas-dev-ocp-3.11 Container image "argoproj/argoexec:v2.4.2" already present on machine
Normal Created 27m kubelet, avsrivas-dev-ocp-3.11 Created container
Normal Started 27m kubelet, avsrivas-dev-ocp-3.11 Started container
Normal Pulling 27m kubelet, avsrivas-dev-ocp-3.11 pulling image "docker-registry.default.svc:5000/dh-dev-argo/obslytics-data-exporter:latest"
Normal Pulled 27m kubelet, avsrivas-dev-ocp-3.11 Successfully pulled image "docker-registry.default.svc:5000/dh-dev-argo/obslytics-data-exporter:latest"
Normal Created 27m kubelet, avsrivas-dev-ocp-3.11 Created container
Normal Started 27m kubelet, avsrivas-dev-ocp-3.11 Started container
上記の説明からわかる唯一のことは、エラーが原因でポッドが失敗することです。この問題をデバッグするためのエラーが表示されません。
Argo ウォッチ ログを読み取ろうとすると、次のようになります。
Name: obslytics-data-exporter-manual-workflow-8wzcc
Namespace: dh-dev-argo
ServiceAccount: default
Status: Running
Created: Tue Nov 24 08:01:10 -0500 (8 minutes ago)
Started: Tue Nov 24 08:01:10 -0500 (8 minutes ago)
Duration: 8 minutes 10 seconds
Progress:
Parameters:
start_timestamp: 2020-11-18T20:00:00Z
STEP TEMPLATE PODNAME DURATION MESSAGE
● obslytics-data-exporter-manual-workflow-8wzcc manual-trigger
└───● trigger obslytics-data-exporter-workflow-triggers/trigger-workflow
├───✔ get-labels(0) obslytics-data-exporter-workflow-template/get-labels obslytics-data-exporter-manual-workflow-8wzcc-2604296472 6s
└───● run obslytics-data-exporter-workflow-template/init
└───● metric-split(0:cluster_version) metric-worker
└───● process-metric run-obslytics
├─✖ process-metric(0) run-obslytics obslytics-data-exporter-manual-workflow-8wzcc-4222496183 6s failed with exit code 126
└─◷ process-metric(1) run-obslytics obslytics-data-exporter-manual-workflow-8wzcc-531670266 7m PodInitializing