NVIDIA Triton Server and Alibi Explanations

In this demo we will deploy an image classification model on NVIDIA Triton with GPUs and run explanations using Seldon Alibi. This demo also uses the KServe V2 protocol for model prediction and explanation payloads. Learn more about the V2 protocol in the Predict Protocol - Version 2 Git repository.
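
A V2 inference request is a JSON document that lists named input tensors, and the response mirrors it with a list of output tensors. As a rough sketch (the tensor name, shape, and datatype below are illustrative, not tied to any specific model):

    # Minimal sketch of a KServe V2 (Open Inference Protocol) request body.
    # The tensor name, shape, and datatype are illustrative; real values
    # depend on the model being served.
    v2_request = {
        "inputs": [
            {
                "name": "input_1",               # input tensor name expected by the model
                "shape": [1, 32, 32, 3],         # batch of one 32x32 RGB image
                "datatype": "FP32",              # V2 datatype string
                "data": [0.0] * (32 * 32 * 3),   # flattened tensor values
            }
        ]
    }

    # The response carries a matching list of output tensors, e.g.:
    # {"model_name": "...", "outputs": [{"name": "...", "shape": [...],
    #  "datatype": "FP32", "data": [...]}]}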

Create Model

  1. Click Create.

  2. Enter the deployment details as follows:

    • Name: tfcifar

    • Namespace: seldon

    • Type: Seldon Deployment

    • Protocol: Inference V2

  3. Configure the default predictor as follows:

    • Runtime: Triton (ONNX, PyTorch, TensorFlow, TensorRT)

    • Model Project: default

    • For URI, you have several options. Choose one of the following:

      • TensorFlow model: gs://seldon-models/triton/tf_cifar10

      • ONNX model: gs://seldon-models/triton/onnx_cifar10

      • PyTorch model: gs://seldon-models/triton/pytorch_cifar10

      For the purposes of this demo, we will use the TensorFlow model:

       gs://seldon-models/triton/tf_cifar10
      
    • Storage Secret: (leave blank/none)

    • Model Name: cifar10

  4. Click Next to skip the Prediction Parameters step.

  5. Configure resource requests and limits. You may skip this step if you are using the TensorFlow model.

    Note

    To determine these settings, we recommend using the NVIDIA Model Analyzer.

    Warning

    Ensure GPUs are available to your cluster and you have provided enough memory for your model.

    For Resource Limits, set the following parameters:

    • For Requests, set the following values:

      • Number of GPU to 1.

      • Memory to 10Gi. (Not needed if using the TensorFlow model.)

    • For Limits, set the following values:

      • Number of GPU to 1.

      • Memory to 20Gi. (Not needed if using the TensorFlow model.)

    Note

    The Requests and Limits values for Number of GPU cannot differ (see the resources sketch after these steps).


  6. Skip to the end and click Launch.

  7. If your deployment is launched successfully, it will have Available status.
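
For reference, the GPU and memory values from step 5 correspond to a standard Kubernetes resources block on the predictor's container. A minimal sketch of the equivalent settings, expressed as a Python dict of the assumed Kubernetes fields:

    # Sketch of the Kubernetes resource settings the wizard produces (assumed
    # field layout). Kubernetes does not allow GPU overcommit, which is why the
    # nvidia.com/gpu request and limit must be identical.
    resources = {
        "requests": {"nvidia.com/gpu": "1", "memory": "10Gi"},
        "limits": {"nvidia.com/gpu": "1", "memory": "20Gi"},
    }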

Make model predictions

Once your deployment is Available, you can test your model with sample CIFAR-10 images (see the sketch below). The request payload depends on which model you launched above.

Note

For the purposes of this demo, we will use the TensorFlow ResNet32 payload.

Warning

Ensure GPUs are available to your cluster and you have provided enough memory for your model.
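
If you would rather exercise the endpoint directly than use the UI, the sketch below sends one CIFAR-10 test image to the deployment's V2 infer endpoint. It assumes the Seldon Core ingress route pattern /seldon/<namespace>/<deployment>/v2/models/<model>/infer and an input tensor named input_1 for the TensorFlow CIFAR-10 model; verify both against your cluster before using it.

    import requests
    from tensorflow.keras.datasets import cifar10

    # Load one CIFAR-10 test image and scale pixel values to [0, 1].
    (_, _), (x_test, _) = cifar10.load_data()
    image = (x_test[0:1] / 255.0).astype("float32")

    # Assumed route: replace <INGRESS_HOST> with your cluster's ingress address.
    url = "http://<INGRESS_HOST>/seldon/seldon/tfcifar/v2/models/cifar10/infer"

    payload = {
        "inputs": [
            {
                "name": "input_1",  # assumed input tensor name for tf_cifar10
                "shape": list(image.shape),
                "datatype": "FP32",
                "data": image.flatten().tolist(),
            }
        ]
    }

    resp = requests.post(url, json=payload)
    resp.raise_for_status()
    print(resp.json()["outputs"][0]["data"])  # class probabilities for 10 classes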

Configure an Alibi Anchor Images Explainer

Important

The Alibi explainer demo is limited to the TensorFlow ResNet32 model mentioned earlier, so continue with this setup only if you are running the TensorFlow model, not the ONNX or PyTorch model. The explainer artifact below is compatible only with the TensorFlow ResNet32 model.

The explanation offers insight into why an input was classified the way it was. It uses the Anchors technique to identify features that correlate with category outcomes. Create a model explainer using the URI below for the saved explainer.
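
For background only (the steps below use the pre-built explainer artifact), this is roughly how an anchor image explainer is constructed with the open-source alibi library. The predictor here is an illustrative stand-in, not the deployed model:

    import numpy as np
    from alibi.explainers import AnchorImage

    # Illustrative stand-in predictor: the real explainer wraps the deployed
    # cifar10 model; this one always predicts class 0 with full confidence.
    def predict_fn(images: np.ndarray) -> np.ndarray:
        probs = np.zeros((images.shape[0], 10))
        probs[:, 0] = 1.0
        return probs

    # Build the explainer over 32x32 RGB inputs; the Anchors algorithm toggles
    # superpixel segments on and off to find a region sufficient for the prediction.
    explainer = AnchorImage(predict_fn, image_shape=(32, 32, 3), segmentation_fn="slic")

    image = np.random.rand(32, 32, 3).astype(np.float32)  # stand-in image
    explanation = explainer.explain(image, threshold=0.95)
    print(explanation.anchor.shape)  # image region that anchors the prediction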

To configure an explainer, complete the following steps:

  1. From the tfcifar deployment dashboard, click Create within the Model Explanations section.

  2. For step 1 of the Explainer Configuration Wizard, select Image, then click Next.

  3. For step 2, make sure Anchor is selected, then click Next.

  4. For step 3, enter the following value for the Explainer URI:

    gs://seldon-models/tfserving/cifar10/cifar10_anchor_image_py3.7_alibi-0.7.0
    

    Then click Next.

  5. For step 4, click Next (do not change any fields).

  6. For step 5, click Next (do not change any fields). For a GitOps-enabled namespace, you may wish to enter a comment here.

  7. For step 6, click Launch.

After a short while, the explainer should become available.


Get an explanation for a single prediction

Navigate to the Requests page using the left navigation drawer.


Click the View explanation button to generate an explanation for the request.


Congratulations, you’ve created an explanation for the request! 🥳

Note that the explanation request is also made with the same KServe V2 protocol payload! 🚀
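
Under the hood, the View explanation button sends the original request to the explainer over the same V2 payload format. A rough sketch of calling the explainer directly, assuming the route pattern /seldon/<namespace>/<deployment>-explainer/default/v2/models/<model>/explain (the exact path can vary by installation):

    import numpy as np
    import requests

    # Assumed explainer route: replace <INGRESS_HOST> with your ingress address
    # and verify the path against your installation.
    url = ("http://<INGRESS_HOST>/seldon/seldon/tfcifar-explainer/"
           "default/v2/models/cifar10/explain")

    image = np.random.rand(1, 32, 32, 3).astype("float32")  # stand-in CIFAR-10 image
    payload = {
        "inputs": [
            {
                "name": "input_1",  # assumed tensor name, as in the prediction payload
                "shape": list(image.shape),
                "datatype": "FP32",
                "data": image.flatten().tolist(),
            }
        ]
    }

    resp = requests.post(url, json=payload)
    resp.raise_for_status()
    print(resp.json())  # explanation payload, including the anchor data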

Next Steps

Why not try our other demos? Ready to dive in? Read our operations guide to learn more about how to use Enterprise Platform.