Batch Prediction Requests

Prerequisites

MinIO should already be installed alongside Seldon Deploy, with the MinIO browser exposed at /minio/ (note the trailing forward slash). On a trial cluster, the default credentials match the Deploy login: MinIO uses the email address as its Access Key and the password as its Secret Key.

On a production cluster, the namespace must already have been set up with a service account. The steps are described in the Argo installation documentation.

iris

We will:

  • Deploy a pretrained sklearn iris model

  • Run a batch job to get predictions

  • Check the output

Deploy Model

From the Deployments page, open the Deployment Creation Wizard by clicking on + Create near the top right of the window.

deployment wizard create button

Deployment Details

Choose a name for the deployment and which namespace you want it to be in, e.g. seldon. Set the Type and Protocol as shown below:

Name: batch-demo
Namespace: seldon
Type: Seldon Deployment
Protocol: Seldon

deployment wizard details step

Default Predictor

Set SciKit Learn as the Runtime and use the following model URI:

gs://seldon-models/v1.11.2/sklearn/iris

The Model Project can be left as default, and the Env Secret Name and Service Account fields can be left blank with the trial setup.

deployment wizard predictor step

Additional Creation Wizard Steps

Complete the remaining steps in the Deployment Creation Wizard by clicking Skip or Next. The defaults should all be fine.

Set Up Input Data

Download the input data file.

Go to the MinIO browser and use the button in the bottom-right corner to create a bucket named data.

createbucket

Use the same bottom-right button to upload the input-data.txt file to the data bucket.
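
If you would rather build a small input file yourself than download one, the sketch below writes a file in the layout this demo is assumed to use: one JSON-encoded 2D array per line, each holding a single row of four iris features. The layout, the helper name write_input_file and the value range are assumptions for illustration and are not taken from the downloadable file.

```python
import json
import random


def write_input_file(path, n_rows, seed=0):
    """Write n_rows lines, each a JSON 2D array holding one row of 4 features."""
    rng = random.Random(seed)  # fixed seed so the generated file is reproducible
    with open(path, "w") as f:
        for _ in range(n_rows):
            # Hypothetical feature values, not real iris measurements.
            row = [round(rng.uniform(0.1, 8.0), 1) for _ in range(4)]
            f.write(json.dumps([row]) + "\n")
```

Calling write_input_file("input-data.txt", 100) would produce a 100-line file that can be uploaded to the data bucket in the same way.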

uploadfile

Run a Batch Job

Click on the tile for your newly-created deployment in the Deployments page of the Deploy UI.

model deployment tile

Go to the Batch Jobs page for this deployment by either clicking the ‘Batch Jobs’ button in the sidebar on the left, or by scrolling down to the ‘Requests Monitor’ pane and clicking on the ‘Batch Requests’ button under the text ‘Initiate or get the status of batch requests’.

batch jobs sidebar button

batch jobs button in Requests Monitor pane

Click on the Create your first job button, enter the following details, and click Submit:

Input Data Location: s3://data/input-data.txt
Output Data Location: s3://data/output-data-{{workflow.name}}.txt
Number of Workers: 15
Number of Retries: 3
Batch Size: 1
Minimum Batch Wait Interval (sec): 0
Method: Predict
Transport Protocol: REST
Input Data Type: ndarray
Object Store Secret Name: seldon-rclone-secret

create your first job button

batchjobdetails

Note

Here seldon-rclone-secret is a pre-created secret in the same namespace as the model. It contains the environment variables that configure rclone access to the object store.
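
To make the Method, Transport Protocol and Input Data Type settings more concrete, the sketch below shows roughly what one line of the input file would look like as a Seldon protocol REST request. The helper name build_predict_request is hypothetical, and the URL path is the usual Seldon Core REST prediction path, which is assumed here rather than confirmed for your cluster's ingress.

```python
import json


def build_predict_request(line, namespace="seldon", deployment="batch-demo"):
    """Turn one input line into a (path, body) pair for a Seldon protocol predict call."""
    # Assumed path layout: /seldon/<namespace>/<deployment>/api/v1.0/predictions
    path = f"/seldon/{namespace}/{deployment}/api/v1.0/predictions"
    # With Input Data Type ndarray, the parsed line becomes the "ndarray" payload.
    body = {"data": {"ndarray": json.loads(line)}}
    return path, body
```

The batch job builds and sends these requests for you; the sketch is only meant to show how the job settings relate to an individual request.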

Give the job a couple of minutes to complete, then refresh the page to see the status.

batchjobstatus

In MinIO you should now see an output file:

miniooutput

If you open that file you should see contents such as:

{"data": {"names": ["t:0", "t:1", "t:2"], "ndarray": [[0.0006985194531162841, 0.003668039039435755, 0.9956334415074478]]}, "meta": {"requestPath": {"iris-container": "seldonio/sklearnserver:1.5.0-dev"}, "tags": {"tags": {"batch_id": "8a8f5e26-2b44-11eb-8723-ae3ff26c8be6", "batch_index": 3.0, "batch_instance_id": "8a8ff94e-2b44-11eb-b8d0-ae3ff26c8be6"}}}}
{"data": {"names": ["t:0", "t:1", "t:2"], "ndarray": [[0.0006985194531162841, 0.003668039039435755, 0.9956334415074478]]}, "meta": {"requestPath": {"iris-container": "seldonio/sklearnserver:1.5.0-dev"}, "tags": {"tags": {"batch_id": "8a8f5e26-2b44-11eb-8723-ae3ff26c8be6", "batch_index": 6.0, "batch_instance_id": "8a903666-2b44-11eb-b8d0-ae3ff26c8be6"}}}}
{"data": {"names": ["t:0", "t:1", "t:2"], "ndarray": [[0.0006985194531162841, 0.003668039039435755, 0.9956334415074478]]}, "meta": {"requestPath": {"iris-container": "seldonio/sklearnserver:1.5.0-dev"}, "tags": {"tags": {"batch_id": "8a8f5e26-2b44-11eb-8723-ae3ff26c8be6", "batch_index": 1.0, "batch_instance_id": "8a8fbe98-2b44-11eb-b8d0-ae3ff26c8be6"}}}}
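
The lines in the output file are not necessarily in input order (the sample above shows batch_index values 3, 6 and 1). A minimal sketch for reading such a file, assuming a batch size of 1 and the nested meta.tags.tags layout shown in the sample, is given below. The function name summarize_output is hypothetical.

```python
import json


def summarize_output(lines):
    """Return (batch_index, predicted_class, probability) tuples sorted by batch_index."""
    results = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        probs = record["data"]["ndarray"][0]  # one row per response at batch size 1
        tags = record["meta"]["tags"]["tags"]  # nesting as in the sample output
        best = max(range(len(probs)), key=probs.__getitem__)  # index of the top class
        results.append((int(tags["batch_index"]), best, probs[best]))
    return sorted(results)
```

After downloading the output file from MinIO, you could pass the open file object directly, for example summarize_output(open("output.txt")), where output.txt stands for whatever local name you saved the file under.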

If the output file is missing or its contents look wrong, see the Argo section for troubleshooting.

Micro batching

You can specify a batch-size parameter to group multiple predictions into a single request. This lets you take advantage of the higher performance that batching provides for some models, and it reduces networking overhead. The response is split back into multiple single-prediction responses, so the output file looks identical to one produced by running the processor with a batch size of 1.

Micro batching is currently supported only for the ndarray and tensor payload types.
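
The idea can be illustrated with a small, self-contained sketch. It is not the batch processor's actual implementation: stand_in_model is a hypothetical placeholder for the deployed model, and all function names are made up for illustration. The sketch groups rows into micro batches, makes one model call per batch, and splits the result back into single-prediction responses.

```python
def make_micro_batches(rows, batch_size):
    """Group single-prediction rows into lists of at most batch_size rows."""
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]


def stand_in_model(ndarray):
    """Hypothetical model: returns one output row per input row (here, the row sum)."""
    return [[sum(row)] for row in ndarray]


def split_response(batch_rows, predictions):
    """Split one batched prediction into single-prediction responses."""
    if len(batch_rows) != len(predictions):
        raise ValueError("expected one prediction per input row")
    return [{"data": {"ndarray": [p]}} for p in predictions]


def run(rows, batch_size):
    """Process all rows with one model call per micro batch."""
    responses = []
    for batch in make_micro_batches(rows, batch_size):
        predictions = stand_in_model(batch)  # a single request covers the whole batch
        responses.extend(split_response(batch, predictions))
    return responses
```

In this sketch, run(rows, 1) and run(rows, 3) return the same list of responses; only the number of model calls differs, which is the property described above.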