Text Generation with Custom HuggingFace Model

In this demo we will:

  • Launch a pretrained custom text generation HuggingFace model in a Seldon Deployment

  • Send a text input request to get a generated text prediction

The custom HuggingFace text generation model is based on the TinyStories-1M model from the HuggingFace Hub.

Create a V1 Seldon Deployment

  1. On the Overview page, click on Create new deployment.

  2. Enter the deployment details as follows, then click Next:

    Parameter    Value
    Name         hf-custom-tiny-stories
    Namespace    seldon [1]
    Type         Seldon Deployment
    Protocol     Inference V2

  3. Configure the default predictor as follows, then click Next:

    Parameter        Value
    Runtime          HuggingFace
    Model Project    default
    Model URI        gs://seldon-models/v1.18.2/huggingface/text-gen-custom-tiny-stories
    Storage Secret   (leave blank/none) [2]
    Model Name       (leave blank)

  4. Fill in the predictor parameters as follows:

    Predictor Parameters   Value
    task                   text-generation

  5. Click Next for the remaining steps [3], then click Launch. A scripted alternative is sketched after the footnotes below.

1. The seldon and seldon-gitops namespaces are installed by default but may not always be available. Please select the namespace that best describes your environment.

2. A secret may be required for private buckets.
3. Additional steps may be required for your specific model.
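
If you prefer to create the same deployment from the command line rather than the wizard, the steps above correspond roughly to a SeldonDeployment custom resource. The snippet below is a minimal sketch using the Kubernetes Python client; the graph name ("model"), the HUGGINGFACE_SERVER runtime identifier, and the exact field layout are assumptions that should be checked against your Seldon Core / Enterprise Platform version.

    # Minimal sketch: create the equivalent SeldonDeployment resource programmatically.
    # Assumes kubeconfig access to the cluster and the "seldon" namespace; the runtime
    # identifier and graph name below are assumptions -- verify them for your version.
    from kubernetes import client, config

    config.load_kube_config()

    seldon_deployment = {
        "apiVersion": "machinelearning.seldon.io/v1",
        "kind": "SeldonDeployment",
        "metadata": {"name": "hf-custom-tiny-stories", "namespace": "seldon"},
        "spec": {
            "protocol": "v2",  # Inference V2, as selected in the wizard
            "predictors": [
                {
                    "name": "default",
                    "replicas": 1,
                    "graph": {
                        "name": "model",  # assumed node name (Model Name was left blank)
                        "implementation": "HUGGINGFACE_SERVER",
                        "modelUri": "gs://seldon-models/v1.18.2/huggingface/text-gen-custom-tiny-stories",
                        "parameters": [
                            {"name": "task", "type": "STRING", "value": "text-generation"}
                        ],
                    },
                }
            ],
        },
    }

    client.CustomObjectsApi().create_namespaced_custom_object(
        group="machinelearning.seldon.io",
        version="v1",
        namespace="seldon",
        plural="seldondeployments",
        body=seldon_deployment,
    )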

Get Prediction

  1. Click on the hf-custom-tiny-stories deployment created in the previous section to enter the deployment dashboard.

  2. Inside the deployment dashboard, click on the Predict button.

  3. On the Predict page, enter the following text:

    {
      "inputs": [{
        "name": "args",
        "shape": [1],
        "datatype": "BYTES",
        "data": ["this is a test"]
      }]
    }
    
  4. Click the Predict button. A scripted way to send the same request is sketched below.
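
Outside the UI, the same V2 inference request can be sent over HTTP. The sketch below uses Python and assumes the usual Seldon V2 ingress path layout (/seldon/<namespace>/<deployment>/v2/models/<model-name>/infer); the host and the model-name segment are placeholders to adjust for your cluster.

    # Sketch: send the same V2 inference request from a script.
    # SELDON_HOST and the model-name segment in the URL are placeholders --
    # replace them with your ingress address and the model name reported
    # by the deployment.
    import requests

    SELDON_HOST = "http://<ingress-host>"  # hypothetical ingress address
    url = f"{SELDON_HOST}/seldon/seldon/hf-custom-tiny-stories/v2/models/model/infer"

    payload = {
        "inputs": [{
            "name": "args",
            "shape": [1],
            "datatype": "BYTES",
            "data": ["this is a test"]
        }]
    }

    response = requests.post(url, json=payload, timeout=60)
    response.raise_for_status()
    print(response.json())  # generated text is returned under "outputs"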

(Screenshot: the Predict page with the request prepopulated and the resulting prediction.)

Congratulations, you’ve successfully sent a prediction request using a custom HuggingFace model! 🥳

Next Steps

Why not try our other demos? Or perhaps try running a larger-scale model? You can find one in gs://seldon-models/v1.18.2/huggingface/text-gen-custom-gpt2. However, you may need to request more memory!
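
If you deploy the larger model from a manifest like the sketch above, one possible way to request more memory is to add a componentSpecs section so the model container carries explicit resource requests. This is a sketch only: the container name is assumed to match the graph node name, and the memory sizes are illustrative.

    # Sketch: a predictor block for the larger GPT-2 variant with explicit
    # resource requests. The container name is assumed to match graph.name
    # ("model"); memory sizes are illustrative, not a tested recommendation.
    predictor_with_resources = {
        "name": "default",
        "replicas": 1,
        "graph": {
            "name": "model",
            "implementation": "HUGGINGFACE_SERVER",
            "modelUri": "gs://seldon-models/v1.18.2/huggingface/text-gen-custom-gpt2",
            "parameters": [
                {"name": "task", "type": "STRING", "value": "text-generation"}
            ],
        },
        "componentSpecs": [
            {
                "spec": {
                    "containers": [
                        {
                            "name": "model",  # must match graph.name for the override to apply
                            "resources": {
                                "requests": {"memory": "2Gi"},
                                "limits": {"memory": "4Gi"},
                            },
                        }
                    ]
                }
            }
        ],
    }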