Seldon Pipeline Canary Promotion

Iris Model

The Iris dataset describes flowers from three species of the genus Iris: setosa, versicolor, and virginica. This demo uses a classification model that predicts the species from four flower properties: sepal length, sepal width, petal length, and petal width. The three species are the classes used for the classification. Here we will:

  • Deploy a pretrained sklearn iris model

  • Load test the model

  • Observe requests and metrics

  • Deploy a canary XGBoost model

  • Load test canary model

  • Observe requests and metrics for both models

  • Promote the canary model

Launch a Seldon Pipeline

  1. From the Overview page, select the Create new deployment button in the top right corner.

  2. In the Deployment Creation Wizard, enter the deployment details as follows:

    • Name: iris-classifier

    • Namespace: seldon

    • Type: Seldon ML Pipeline

  3. Configure the default predictor as follows:

    • Runtime: Scikit Learn

    • Model URI: gs://seldon-models/scv2/samples/mlserver_1.6.0/iris-sklearn

    • Model Project: default

    • Storage Secret: (leave blank/none)

  4. Click Next to skip through the remaining steps, then click Launch.

  5. If your deployment is launched successfully, it will have an Available status on the Overview page.
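For readers who prefer to see configuration as data, the wizard inputs above can be written down as Seldon Core 2 style Model and Pipeline manifests. This is only a sketch under assumptions: the resources the wizard actually generates may be laid out differently, and the model name `iris-sklearn` is hypothetical. Only the deployment name, namespace, runtime, and Model URI come from the steps above.

```python
import json

API_VERSION = "mlops.seldon.io/v1alpha1"  # Seldon Core 2 CRD group/version


def model_manifest(name, namespace, storage_uri, requirement):
    """Describe one model artifact and the runtime capability it needs."""
    return {
        "apiVersion": API_VERSION,
        "kind": "Model",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"storageUri": storage_uri, "requirements": [requirement]},
    }


def pipeline_manifest(name, namespace, step_model):
    """Describe a single-step pipeline whose output is that one step."""
    return {
        "apiVersion": API_VERSION,
        "kind": "Pipeline",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "steps": [{"name": step_model}],
            "output": {"steps": [step_model]},
        },
    }


# "iris-sklearn" is a hypothetical model name chosen for illustration.
model = model_manifest(
    "iris-sklearn",
    "seldon",
    "gs://seldon-models/scv2/samples/mlserver_1.6.0/iris-sklearn",
    "sklearn",
)
pipeline = pipeline_manifest("iris-classifier", "seldon", "iris-sklearn")

if __name__ == "__main__":
    # Print both manifests as JSON for inspection.
    print(json.dumps([model, pipeline], indent=2))
```

Keeping the manifests as plain dictionaries makes it easy to compare them against whatever the platform shows for the launched deployment.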

Start Load Test

  1. Once the deployment is in an Available status, navigate to its Dashboard page by clicking on it.

  2. In the Requests Monitor section, click on the Start a load test button to start a load test with the following details:

    • Connections(total): 1

    • Load Parameter: Duration(seconds)

    • Value: 120

    • Json payload:

      {
          "inputs": [
              {
                  "name": "predict",
                  "data": [
                      0.38606369295833043,
                      0.006894049558299753,
                      0.6104082981607108,
                      0.3958954239450676
                  ],
                  "datatype": "FP64",
                  "shape": [
                      1,
                      4
                  ]
              }
          ]
      }
      
This will spawn a Kubernetes Job that sends continuous prediction requests to the SKLearn model in the deployment for the specified number of seconds.
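To make the load test settings concrete, here is a minimal sketch of what one connection running for a fixed duration amounts to. It is not the implementation of the Kubernetes Job. The `send` argument is a hypothetical callable that delivers one request, and `clock` is injectable so the loop can be exercised without waiting.

```python
import time


def build_payload(features):
    """Build an inference request for one row of four iris features."""
    if len(features) != 4:
        raise ValueError(
            "expected sepal length, sepal width, petal length, petal width"
        )
    return {
        "inputs": [
            {
                "name": "predict",
                "data": [float(x) for x in features],
                "datatype": "FP64",
                "shape": [1, len(features)],
            }
        ]
    }


def run_load_test(send, payload, duration_seconds, clock=time.monotonic):
    """Call send(payload) back-to-back until duration_seconds have elapsed.

    A single sequential loop corresponds to one connection.
    Returns the number of requests sent.
    """
    deadline = clock() + duration_seconds
    sent = 0
    while clock() < deadline:
        send(payload)
        sent += 1
    return sent


# Same feature values as the JSON payload used in the load test.
PAYLOAD = build_payload(
    [
        0.38606369295833043,
        0.006894049558299753,
        0.6104082981607108,
        0.3958954239450676,
    ]
)
```

In practice `send` would POST the payload as JSON to the deployment's inference endpoint. The URL and authentication depend on your installation, so they are left out here.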

Observe requests and metrics

Once the load test has started, you can monitor the incoming requests, their responses, and metrics on the Requests page of the deployment. If this doesn’t work, consult the request logging docs section for debugging.
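When reading the logged responses, it can help to map the numeric output back to a species name. The sketch below assumes the response follows the Open Inference Protocol with an `outputs` list, that the output tensor is named `predict`, and that class indices 0, 1, 2 correspond to setosa, versicolor, and virginica (the usual scikit-learn iris ordering). The sample response is made up for illustration and was not captured from a real deployment.

```python
SPECIES = ["setosa", "versicolor", "virginica"]  # assumed class order


def predicted_species(response, output_name="predict"):
    """Return the species name for each prediction in the named output tensor."""
    for output in response.get("outputs", []):
        if output.get("name") == output_name:
            return [SPECIES[int(i)] for i in output["data"]]
    raise KeyError(f"no output named {output_name!r} in response")


# Hypothetical response body, for illustration only.
sample_response = {
    "model_name": "hypothetical-model",
    "outputs": [
        {"name": "predict", "shape": [1, 1], "datatype": "INT64", "data": [2]}
    ],
}
```

If your model returns a differently named output or string labels, adjust `output_name` and `SPECIES` accordingly.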

You can also see core metrics on the Dashboard page.

Deploy a Canary model

The next step is to create an XGBoost canary model.

  1. Navigate to the Dashboard of the deployment and click on the Add Canary button.

  2. In the Canary Configuration Wizard, configure the default predictor as follows:

    • Runtime: XGBoost

    • Model URI: gs://seldon-models/xgboost/iris

    • Model Project: default

    • Storage Secret: (leave blank/none)

    • Canary Traffic Percentage: 10

  3. Click Next to skip through the remaining steps, then click Launch.

  4. If the canary model is launched successfully, the deployment will remain in an Available status.

This creates a new canary deployment running the XGBoost model and routes 10% of the traffic to it.

Note

The deployment status represents the status of the main model. If the canary model is not successfully launched, there will be a warning icon you can click on to see the error message.

Load test the canary model

This time, we will create a new load test with the canary model running and observe the requests and metrics for both models. You can reuse the JSON payload from the previous load test or construct a new one with different values or a different number of predictions.

Warning

Remember that roughly 10% of the traffic will be sent to the canary model. If, however, the canary model is not available, all the traffic will be sent to the main model.
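The behaviour described in this warning can be illustrated with a small simulation. This is a sketch of weighted routing with a fallback, assuming each request is routed independently at random. It does not reproduce how the platform actually splits traffic.

```python
import random


def choose_predictor(rng, canary_percent, canary_available=True):
    """Pick 'canary' with probability canary_percent/100, else 'main'.

    If the canary is unavailable, every request goes to 'main'.
    """
    if canary_available and rng.random() * 100 < canary_percent:
        return "canary"
    return "main"


def simulate(total_requests, canary_percent, canary_available=True, seed=0):
    """Count how many simulated requests each predictor receives."""
    rng = random.Random(seed)
    counts = {"main": 0, "canary": 0}
    for _ in range(total_requests):
        counts[choose_predictor(rng, canary_percent, canary_available)] += 1
    return counts
```

Calling `simulate(10000, 10)` gives a canary count close to, but rarely exactly, 1000, which is why the split is described as rough. With `canary_available=False` the canary count is always zero.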

Observe requests and metrics for both models

Once the load test has started, you can monitor the incoming requests, their responses, and metrics on the Requests page of the deployment. If this doesn’t work, consult the request logging docs section for debugging.

To see the requests for the canary model, select the iris-classifier-canary predictor and the related Node in the Node Selector filter on the Requests page, as shown in the screenshot. Note that the number of requests for the canary model will be roughly 10% of the total number of requests, as specified in the canary deployment.
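As a quick sanity check on the counts you see, the expected number of canary requests and its typical spread can be worked out directly. The sketch below assumes each request is routed independently, so the canary count follows a binomial distribution. The three-sigma tolerance is an arbitrary choice for illustration.

```python
import math


def expected_canary_requests(total_requests, canary_percent):
    """Return (mean, standard deviation) of the canary request count,
    assuming each request is routed independently."""
    p = canary_percent / 100
    mean = total_requests * p
    std = math.sqrt(total_requests * p * (1 - p))
    return mean, std


def share_is_plausible(canary_requests, total_requests, canary_percent, sigmas=3):
    """True if the observed canary count lies within `sigmas` standard
    deviations of the expected count."""
    mean, std = expected_canary_requests(total_requests, canary_percent)
    return abs(canary_requests - mean) <= sigmas * std
```

For 1000 total requests at 10%, the expected canary count is 100 with a standard deviation of about 9.5, so seeing around 90 or 110 canary requests is unremarkable.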

Screenshot: prediction requests and responses for the canary model

You can also see core metrics for both models on the Dashboard page.

Promote the Canary model

Great! Now we have observed the requests and metrics for both models. If we are happy with how the canary model is performing, we can promote it to become the main model.

  1. Navigate to the Dashboard of the deployment and click on the Promote Canary button.

  2. In the Promote Canary dialog, click Confirm to promote the canary model to the main model.

  3. If the canary model is promoted successfully, the deployment will remain in an Available status.
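What counts as being happy with the canary is up to you. One way to make the decision explicit before clicking Promote Canary is a simple gate over numbers read off the Dashboard. The field names and thresholds below are hypothetical choices for illustration. They are not values or an API provided by the platform.

```python
from dataclasses import dataclass


@dataclass
class PredictorStats:
    """Summary numbers for one predictor (hypothetical fields)."""
    requests: int
    errors: int
    p95_latency_ms: float

    @property
    def error_rate(self):
        # Avoid dividing by zero when no requests have been seen.
        return self.errors / self.requests if self.requests else 0.0


def should_promote(main, canary, min_requests=100, max_latency_ratio=1.2):
    """Promote only if the canary has seen enough traffic, has an error rate
    no worse than main, and has p95 latency within max_latency_ratio of main's."""
    if canary.requests < min_requests:
        return False
    if canary.error_rate > main.error_rate:
        return False
    return canary.p95_latency_ms <= main.p95_latency_ms * max_latency_ratio
```

Because the canary receives only about a tenth of the traffic, the `min_requests` check guards against promoting on too small a sample.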
