Text Generation with Custom HuggingFace Model

In this demo we will:

  • Launch a pretrained custom text generation HuggingFace model in a Seldon Pipeline

  • Send a text input request to get a generated text prediction

The custom HuggingFace text generation model is based on the TinyStories-1M model from the HuggingFace Hub.

Create Model

  1. Click on Create new deployment.

  2. Enter the deployment details as follows:

    - Name: hf-custom-tiny-stories
    - Namespace: seldon
    - Type: Seldon ML Pipeline

  3. Configure the default predictor as follows:

    - Runtime: HuggingFace
    - Model Project: default
    - Model URI: gs://seldon-models/scv2/samples/mlserver_1.3.5/huggingface-text-gen-custom-tiny-stories
    - Storage Secret: (leave blank/none)

  4. Skip to the end and click Launch.

  5. Once the deployment has launched successfully, its status will show as Available.

Get Predictions

  1. Click on the hf-custom-tiny-stories deployment created in the previous section to enter the deployment dashboard.

  2. Inside the deployment dashboard, click on the Predict button.

  3. On the Predict page, enter the following text:

      {
        "inputs": [{
          "name": "args",
          "shape": [1],
          "datatype": "BYTES",
          "data": ["The brown fox jumped"]
        }]
      }

  4. Click the Predict button.

Congratulations, you’ve successfully sent a prediction request using a custom HuggingFace model! 🥳
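
The same request body can also be built programmatically. The following is a minimal sketch in Python, not part of the demo itself: the helper names build_text_generation_request and send_request are hypothetical, and the endpoint URL is left as a parameter because it depends on your installation (authentication headers, if your cluster requires them, are also omitted).

```python
import json
import urllib.request


def build_text_generation_request(prompt: str) -> dict:
    """Build the request body used in the Predict step above.

    The body has a single BYTES input named "args" holding one prompt string.
    """
    return {
        "inputs": [
            {
                "name": "args",
                "shape": [1],
                "datatype": "BYTES",
                "data": [prompt],
            }
        ]
    }


def send_request(url: str, payload: dict, timeout: float = 30.0) -> dict:
    """POST the payload as JSON and return the decoded JSON response.

    `url` is an assumption: pass the inference endpoint of your own
    deployment. No authentication is handled here.
    """
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    payload = build_text_generation_request("The brown fox jumped")
    # Print the body so it can be compared with the text entered in the UI.
    print(json.dumps(payload, indent=2))
```

Calling send_request(your_endpoint_url, payload) would then submit the prompt, where your_endpoint_url stands for whatever inference URL your deployment exposes.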

Next steps

Why not try our other demos? Or perhaps try running a larger-scale model? You can find one at gs://seldon-models/scv2/samples/mlserver_1.3.5/huggingface-text-gen-custom-gpt2. Note that a larger model may need more memory than the default, so you may have to request more for the deployment!