Argo Workflows

This page provides steps for installing Argo Workflows, which is required to run batch jobs in Seldon Enterprise Platform.

Installation of Argo

We suggest installing Argo Workflows in line with the official instructions. At the time of writing, these are:

kubectl create namespace argo
kubectl apply -n argo -f https://github.com/argoproj/argo-workflows/releases/download/v3.4.6/install.yaml
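
Before moving on, you can confirm that the workflow controller and server pods started correctly (a quick check; pod names vary by release):

kubectl get pods -n argo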

Per-Namespace Setup

If you intend to use batch jobs in a namespace, the following need to be configured:

  1. A service account and role binding for workflows

  2. A storage initializer secret for retrieving data

Service Accounts and Role Bindings

A service account and role binding need to be created to allow the Seldon Enterprise Platform server to access and create Argo Workflows:

export NAMESPACE=seldon

cat << EOF > workflow-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: workflow
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - "*"
- apiGroups:
  - "apps"
  resources:
  - deployments
  verbs:
  - "*"
- apiGroups:
  - ""
  resources:
  - pods/log
  verbs:
  - "*"
- apiGroups:
  - machinelearning.seldon.io
  resources:
  - "*"
  verbs:
  - "*"
EOF

kubectl apply -n ${NAMESPACE} -f workflow-role.yaml

kubectl create -n ${NAMESPACE} serviceaccount workflow

kubectl create rolebinding -n ${NAMESPACE} workflow --role=workflow --serviceaccount=${NAMESPACE}:workflow
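
To verify the binding, you can impersonate the service account with kubectl auth can-i (a quick sanity check; it should print "yes"):

kubectl auth can-i delete pods -n ${NAMESPACE} \
    --as=system:serviceaccount:${NAMESPACE}:workflow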

Storage Initializer Secret

Argo Workflows in Seldon Enterprise Platform use a storage initializer mechanism similar to the one used for Prepackaged Model Servers in order to access data from external data stores.

Secrets containing the storage access credentials need to be created.

The format of these secrets depends on the storage initializer. By default, Seldon Enterprise Platform uses the Rclone-based storage initializer, which is specified in deploy-values.yaml:

batchjobs:
  storageInitializer:
    image: seldonio/rclone-storage-initializer:1.18.2
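
For illustration, with the Rclone-based initializer the secret carries the remote's settings as Rclone environment-variable keys (RCLONE_CONFIG_<remote>_*). The sketch below assumes a MinIO-backed S3 remote named s3; the secret name, endpoint, and credentials are placeholders to replace with your own:

cat << EOF > rclone-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: seldon-rclone-secret  # placeholder name
type: Opaque
stringData:
  RCLONE_CONFIG_S3_TYPE: s3
  RCLONE_CONFIG_S3_PROVIDER: minio
  RCLONE_CONFIG_S3_ENV_AUTH: "false"
  RCLONE_CONFIG_S3_ACCESS_KEY_ID: minioadmin
  RCLONE_CONFIG_S3_SECRET_ACCESS_KEY: minioadmin
  RCLONE_CONFIG_S3_ENDPOINT: http://minio.minio-system.svc.cluster.local:9000
EOF

kubectl apply -n ${NAMESPACE} -f rclone-secret.yaml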

Running on GKE or inside a Kind cluster

If you are running inside a Kind cluster or on GKE, you must patch Argo’s configuration. For example:

kubectl patch -n argo configmap workflow-controller-configmap --type merge \
    -p '{"data": {"config": "containerRuntimeExecutor: k8sapi"}}'

Verification and Debugging

You can check the status of Argo Workflows by going to the Argo Workflows UI. First port-forward the web server:

kubectl port-forward -n argo svc/argo-server 2746

Then go to https://localhost:2746/ in your browser.

If Argo Workflows is set up correctly then you should be able to run the batch demo.

To see running jobs, you can use the Argo Workflows UI or, if you have installed it, the Argo CLI. You can list jobs in a namespace with argo list -n <namespace>, and argo get <workflow-name> tells you the pod names of the individual steps.
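
For example, assuming batch jobs run in the seldon namespace (the workflow name is a placeholder):

# List workflows in the namespace
argo list -n seldon

# Show a workflow's status, including the pod name of each step
argo get -n seldon <workflow-name>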

To see logs for a running job, go to the relevant pod. If you don’t have the Argo Workflows CLI, you can still work out the pod name: there should be a pod in the namespace with a Running status and a name similar to the model name.
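
For instance, a sketch with kubectl (the namespace and pod name are placeholders):

# Find running pods; the batch job pod's name resembles the model name
kubectl get pods -n seldon --field-selector=status.phase=Running

# Stream logs from the identified pod
kubectl logs -f -n seldon <pod-name>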