Seldon Deployment Canary Promotion

Iris Model

Iris is a genus of flowering plants; the classic iris dataset covers three of its species: setosa, versicolor, and virginica. This demo uses an iris classification model that predicts the species from four flower measurements: sepal length, sepal width, petal length, and petal width. The three species are the classes for the classification. Here we will:

  • Deploy a pretrained sklearn iris model

  • Load test the model

  • Observe requests and metrics

  • Deploy a canary XGBoost model

  • Load test canary model

  • Observe requests and metrics for both models

  • Promote the canary model

Launch a Seldon Deployment

  1. From the Overview page, select the Create new deployment button in the top right corner

  2. In the Deployment Creation Wizard, enter the deployment details as follows:

    • Name: iris-classifier

    • Namespace: seldon

    • Type: Seldon Deployment

    • Protocol: Seldon

  3. Configure the default predictor as follows:

    • Runtime: Scikit Learn

    • Model URI: gs://seldon-models/v1.18.1/sklearn/iris

    • Model Project: default

    • Storage Secret: (leave blank/none)

  4. Click Next to skip the remaining steps, then click Launch.

  5. If the deployment launches successfully, its status on the Overview page will show as Available.
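
Under the hood, the wizard creates a SeldonDeployment custom resource. For reference, here is a minimal sketch of a roughly equivalent manifest, mirroring the wizard entries above (the resource the wizard actually generates may include additional fields):

    # Sketch of a SeldonDeployment matching the wizard entries above.
    apiVersion: machinelearning.seldon.io/v1
    kind: SeldonDeployment
    metadata:
      name: iris-classifier
      namespace: seldon
    spec:
      protocol: seldon
      predictors:
        - name: default
          replicas: 1
          graph:
            name: iris-classifier
            implementation: SKLEARN_SERVER
            modelUri: gs://seldon-models/v1.18.1/sklearn/iris

You can also check the rollout from the command line with kubectl get sdep iris-classifier -n seldon (sdep is the short name for SeldonDeployment resources).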

Start Load Test

  1. Once the deployment is in an Available status, navigate to its Dashboard page by clicking on it.

  2. In the Requests Monitor section, click on the Start a load test button to start a load test with the following details:

    • Connections(total): 1

    • Load Parameter: Duration(seconds)

    • Value: 120

    • JSON payload:

      {
          "data": {
              "names": ["Sepal length", "Sepal width", "Petal length", "Petal width"],
              "ndarray": [
                  [6.8, 2.8, 4.8, 1.4],
                  [6.1, 3.4, 4.5, 1.6]
              ]
          }
      }

This will spawn a Kubernetes Job that sends continuous prediction requests to the SKLearn model in the deployment for the specified duration. Because the JSON payload contains two rows of prediction data, each request carries two predictions, so the counts shown in the metrics and on the Requests page will be double the number of requests sent.
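
The load test is a convenience; you can also send the same payload by hand. Here is a sketch using curl, assuming a hypothetical ingress host in $INGRESS_HOST and the standard Seldon protocol path for this namespace and deployment name:

    # Send one prediction request to the deployment's Seldon protocol endpoint.
    # INGRESS_HOST is an assumption: set it to your cluster's ingress address.
    curl -s -X POST \
      -H "Content-Type: application/json" \
      -d '{"data": {"ndarray": [[6.8, 2.8, 4.8, 1.4]]}}' \
      "http://$INGRESS_HOST/seldon/seldon/iris-classifier/api/v1.0/predictions"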

Observe requests and metrics

Once the load test has started, you can monitor the incoming requests, their responses, and metrics on the Requests page of the deployment. If this doesn’t work, consult the request logging section of the docs for debugging.

You can also see core metrics on the Dashboard page.
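
The Dashboard charts are backed by Prometheus, so you can also query the underlying data directly. A sketch of a request-rate query, assuming the Seldon Core v1 executor's metric names:

    # Per-second prediction request rate for the deployment over the last minute.
    rate(seldon_api_executor_server_requests_seconds_count{deployment_name="iris-classifier"}[1m])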

Deploy a Canary model

The next step is to create an XGBoost canary model that will receive a share of the traffic alongside the main model.

  1. Navigate to the Dashboard of the deployment and click on the Add Canary button.

  2. In the Update Deployment Wizard, configure the canary predictor as follows:

    • Runtime: XGBoost

    • Model URI: gs://seldon-models/xgboost/iris

    • Model Project: default

    • Storage Secret: (leave blank/none)

    • Canary Traffic Percentage: 10

  3. Click Next to skip the remaining steps, then click Launch.

  4. While the canary model is being launched, the deployment status will move to an Updating state.

  5. If the canary model is launched successfully, the deployment status will become Available.

This will create a new canary predictor running the XGBoost model, and roughly 10% of the traffic will be routed to it.
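
In SeldonDeployment terms, adding a canary amounts to adding a second predictor with a traffic split. A simplified sketch of the resulting predictors section (the wizard may generate additional fields):

    spec:
      predictors:
        - name: default
          traffic: 90
          graph:
            name: iris-classifier
            implementation: SKLEARN_SERVER
            modelUri: gs://seldon-models/v1.18.1/sklearn/iris
        - name: canary
          traffic: 10
          graph:
            name: iris-classifier
            implementation: XGBOOST_SERVER
            modelUri: gs://seldon-models/xgboost/iris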

Note

The deployment status represents the overall status of the deployment, including both the main and canary models.

Load test the canary model

This time, we will run a new load test with the canary model in place and observe the requests and metrics for both models. You can reuse the JSON payload from the previous load test or construct a new one with different values or a different number of predictions.

Warning

Remember that roughly 10% of the traffic will be sent to the canary model. If, however, the canary model is not available, all the traffic will be sent to the main model.

Observe requests and metrics for both models

Once the load test has started, you can monitor the incoming requests, their responses, and metrics on the Requests page of the deployment. If this doesn’t work, consult the request logging section of the docs for debugging.

To see the requests for the canary model, select the canary predictor in the Node Selector filter on the Requests page. Note that the number of requests for the canary model will be roughly 10% of the total, matching the traffic percentage specified for the canary.

You can also see core metrics for both models on the Dashboard page.
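
To compare the two models outside the UI, the same Prometheus metric can be broken down by predictor. A sketch, again assuming the Seldon Core v1 executor's metric names:

    # Request rate per predictor; the canary series should sit at roughly
    # 10% of the default series while the traffic split is 90/10.
    sum by (predictor_name) (
      rate(seldon_api_executor_server_requests_seconds_count{deployment_name="iris-classifier"}[1m])
    )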

Promote the Canary model

Great! Now we have observed the requests and metrics for both models. If we are happy with how the canary model is performing, we can promote it to become the main model.

  1. Navigate to the Dashboard of the deployment and click on the Promote Canary button.

  2. In the Promote Canary dialog, click Confirm to promote the canary model to the main model.

  3. If the canary model is promoted successfully, the deployment status will become Available.
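
Promotion makes the canary configuration the main predictor and removes the canary. One way to confirm the result from the command line (a sketch):

    # After promotion, the deployment should have a single predictor
    # running the XGBoost model with 100% of the traffic.
    kubectl get sdep iris-classifier -n seldon \
      -o jsonpath='{.spec.predictors[*].graph.modelUri}'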
