PostgreSQL Persistence for Model Metadata

Important

Before starting the installation procedure, download the installation resources as explained here and make sure that all prerequisites are satisfied.

This page also assumes that the main Seldon Core and Seldon Enterprise Platform components are installed.

Warning

PostgreSQL is an external component outside of the main Seldon stack. Therefore, it is the cluster administrator’s responsibility to administer and manage the PostgreSQL instance used by Seldon.

We use PostgreSQL for persisting model metadata information.

Seldon Enterprise Platform Configuration

The PostgreSQL dependency in Seldon Enterprise Platform is enabled or disabled with the Helm variable metadata.pg.enabled. If it is set to false, Seldon Enterprise Platform will not attempt to connect to a PostgreSQL database, and all model metadata functionality will be unavailable. If metadata.pg.enabled is true, Seldon Enterprise Platform expects a Kubernetes secret named metadata-postgres to be present in the namespace where Seldon Enterprise Platform is running. This secret must contain the information for connecting to a PostgreSQL database. The structure of the secret is:

kind: Secret
apiVersion: v1
metadata:
  name: metadata-postgres
data: # values under data must be base64-encoded
  dbname: the_name_of_the_database_to_use_for_model_metadata
  host: the_database_host
  user: the_database_user_to_use_to_authenticate
  password: the_database_password_to_use_to_authenticate
  port: the_port_the_database_is_exposed_on
  sslmode: the_sslmode
  ca.crt: the_ca_certificate_to_verify_identity_of_the_server # optional, based on sslmode
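
Because values under data in a Kubernetes Secret must be base64-encoded, a manifest with this structure can be rendered from plain values with a small helper. The following is a minimal sketch, not the method Seldon prescribes: the function names encode_field and render_metadata_secret are hypothetical, and the seldon-system namespace is an assumption taken from the examples elsewhere on this page.

```shell
#!/usr/bin/env bash
# Sketch: render a metadata-postgres Secret manifest from plain-text values.
# encode_field and render_metadata_secret are hypothetical helper names.

# Base64-encode a value without a trailing newline or line wrapping.
encode_field() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Arguments: dbname host user password port sslmode
# The namespace below is an assumption; use the one Seldon runs in.
render_metadata_secret() {
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: metadata-postgres
  namespace: seldon-system
data:
  dbname: $(encode_field "$1")
  host: $(encode_field "$2")
  user: $(encode_field "$3")
  password: $(encode_field "$4")
  port: $(encode_field "$5")
  sslmode: $(encode_field "$6")
EOF
}
```

Piping the output of render_metadata_secret metadata your.postgres.host your_user your_password 5432 require to kubectl apply -f - would create the secret. The kubectl create secret commands used on this page perform the same encoding automatically, so this helper is only useful when you manage the manifest yourself.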

Installation

PostgreSQL can be installed in several ways, for example as a managed service from a cloud provider or by running it in Kubernetes.

Bringing your own PostgreSQL

One option is to use a PostgreSQL instance outside of the Kubernetes cluster that runs Seldon Enterprise Platform. If you already have a database, running on-premises or in the cloud, that you want to use with Seldon Enterprise Platform, add its connection information to the metadata-postgres secret in the namespace where Seldon Enterprise Platform is running. For example, substituting the values with those of your database:

kubectl create secret generic -n seldon-system metadata-postgres \
--from-literal=user=your_user \
--from-literal=password=your_password \
--from-literal=host=your.postgres.host \
--from-literal=port=5432 \
--from-literal=dbname=metadata \
--from-literal=sslmode=require \
--dry-run=client -o yaml \
| kubectl apply -n seldon-system -f -
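
A typo in sslmode or port only surfaces later as a failed connection, so it can help to check the values before creating the secret. The sketch below is illustrative only: check_sslmode and check_port are hypothetical helpers, and the accepted sslmode list is the set defined by PostgreSQL's libpq; whether Seldon Enterprise Platform accepts every one of them is an assumption.

```shell
# Sketch: sanity-check connection values before storing them in the secret.
# check_sslmode and check_port are hypothetical helper names.

# Succeeds only for the sslmode values defined by PostgreSQL's libpq.
check_sslmode() {
  case "$1" in
    disable|allow|prefer|require|verify-ca|verify-full) return 0 ;;
    *) echo "unknown sslmode: $1" >&2; return 1 ;;
  esac
}

# Succeeds only for an integer in the valid TCP port range.
check_port() {
  case "$1" in
    ''|*[!0-9]*) echo "port is not a number: $1" >&2; return 1 ;;
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}
```

For example, check_sslmode require && check_port 5432 succeeds, while check_sslmode verify_ca fails because the separator should be a hyphen.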

In the next sections we explore how you can start using a managed PostgreSQL in AWS and GCP and connect it with Seldon Enterprise Platform.

Amazon RDS

Amazon RDS provides a managed PostgreSQL solution that can be used for Seldon Enterprise Platform’s Model Metadata Storage. For setting up RDS for the first time you can follow the docs here.

Some important points to remember while setting up RDS:

  • Make sure the instance is accessible from Seldon Enterprise Platform. If Seldon Enterprise Platform is not in the same VPC, make sure the VPC used by RDS has a public subnet as discussed here.

  • Make sure the security group used for accessing the RDS instance allows inbound and outbound traffic from and to Seldon Enterprise Platform. Setting up security groups for RDS is discussed here.

Once you have a running PostgreSQL instance, with a database and a user created, you can configure Seldon Enterprise Platform by adding the metadata-postgres secret as described in the Bringing your own PostgreSQL section.

To manage backups see the official documentation. Here is more documentation on other best practices around RDS.

Google Cloud SQL

GCP provides a managed PostgreSQL solution, Cloud SQL, that can be used for Seldon Enterprise Platform’s Model Metadata Storage. For setting up Cloud SQL for the first time you can follow the docs here.

For connection instructions, follow the official documentation. Make sure that the instance is accessible from Seldon Enterprise Platform. If you use the public IP generated for the instance, make sure the network that runs Seldon Enterprise Platform is included in the Cloud SQL authorized networks by following this guide.

Once you have a running PostgreSQL instance, with a database and a user created, you can configure Seldon Enterprise Platform by adding the metadata-postgres secret as described in the Bringing your own PostgreSQL section.

SSL Support

By default, Seldon Enterprise Platform does not verify the PostgreSQL server certificate. To enable server certificate verification, change the SSL mode to verify-ca or verify-full as needed and place one or more root certificates in the ca.crt key of the Kubernetes secret. Intermediate certificates should also be added to the file if they are needed to link the certificate chain sent by the server to the root certificates stored on the client.

kubectl create secret generic -n seldon-system metadata-postgres \
--from-literal=user=your_user \
--from-literal=password=your_password \
--from-literal=host=your.postgres.host \
--from-literal=port=5432 \
--from-literal=dbname=metadata \
--from-literal=sslmode=verify-ca \
--from-file=ca.crt=/path/to/caFile \
--dry-run=client -o yaml \
| kubectl apply -n seldon-system -f -

Further, if the server verifies the identity of the client by requesting the client’s leaf certificate, create another Kubernetes TLS secret with the client certificate and key for the connection. Here, we create a secret named postgres-client-certs for this purpose. See the Configuring Seldon Enterprise Platform section for details on how these secrets are used.

kubectl create secret tls -n seldon-system postgres-client-certs \
--cert=/path/to/cert \
--key=/path/to/key \
--dry-run=client -o yaml \
| kubectl apply -n seldon-system -f -
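
Before relying on these secrets, it can be useful to confirm that the same SSL settings work from a machine with network access to the database. The sketch below builds a libpq connection string from the values stored in the secret; pg_conninfo is a hypothetical helper, it assumes the psql client is installed, and it assumes none of the values contain spaces or quotes.

```shell
# Sketch: build a libpq connection string that mirrors the secret's SSL settings.
# pg_conninfo is a hypothetical helper; values must not contain spaces or quotes.
# Arguments: host port dbname user sslmode [ca_file] [client_cert] [client_key]
pg_conninfo() {
  local conninfo
  conninfo="host=$1 port=$2 dbname=$3 user=$4 sslmode=$5"
  # Optional server verification: root certificate(s) used with verify-ca/verify-full.
  if [ -n "${6:-}" ]; then conninfo="$conninfo sslrootcert=$6"; fi
  # Optional mutual TLS: client certificate and private key.
  if [ -n "${7:-}" ]; then conninfo="$conninfo sslcert=$7"; fi
  if [ -n "${8:-}" ]; then conninfo="$conninfo sslkey=$8"; fi
  printf '%s\n' "$conninfo"
}
```

With psql available, a command such as PGPASSWORD=your_password psql "$(pg_conninfo your.postgres.host 5432 metadata your_user verify-ca /path/to/caFile)" -c 'SELECT 1' should fail with a certificate error if the CA file does not match the server certificate.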

Running PostgreSQL in Kubernetes

You can also run PostgreSQL in the Kubernetes cluster that runs Seldon Enterprise Platform. We recommend using the Zalando PostgreSQL operator to manage the PostgreSQL installation and maintenance. The official documentation can be seen here.

Warning

If your cluster is using Kubernetes version 1.25 or higher, you should install version 1.9.0 or later of Zalando’s PostgreSQL operator. You can also consult their installation matrix.

The instructions that follow will help you quickly spin up a PostgreSQL instance. However, we do not recommend this setup for production; treat it as development only.

Below we show an example deployment of a PostgreSQL cluster:

To install the Zalando operator you can run:

git clone https://github.com/zalando/postgres-operator.git
cd postgres-operator
git checkout v1.8.2 # Pin a release tag; use v1.9.0 or later on Kubernetes 1.25+.
kubectl create namespace postgres || echo "namespace postgres exists"
helm install postgres-operator ./charts/postgres-operator --namespace postgres
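
The version guidance in the warning above can be turned into a small check so that the pinned tag matches the cluster. This is a sketch under stated assumptions: operator_tag_for is a hypothetical helper, and it only knows the two tags mentioned on this page (v1.8.2 as the pinned example and v1.9.0 as the minimum for Kubernetes 1.25 and later). Consult the operator's own matrix for anything more precise.

```shell
# Sketch: choose a postgres-operator tag from the cluster's Kubernetes version.
# operator_tag_for is a hypothetical helper. The two tags come from this page:
# v1.8.2 is the pinned example, v1.9.0 is the minimum for Kubernetes 1.25+.
operator_tag_for() {
  local ver major rest minor
  ver="${1#v}"               # drop a leading "v", e.g. v1.25.3 -> 1.25.3
  major="${ver%%.*}"         # text before the first dot
  rest="${ver#*.}"           # text after the first dot
  minor="${rest%%.*}"        # minor version, possibly with a suffix
  minor="${minor%%[!0-9]*}"  # drop suffixes such as the "+" in "27+"
  if [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 25 ]; }; then
    echo "v1.9.0"
  else
    echo "v1.8.2"
  fi
}
```

For example, git checkout "$(operator_tag_for 1.26)" would check out v1.9.0, where 1.26 stands in for the server version reported by kubectl version.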

If you want to install the operator UI you can do it by following this doc.

To install a minimal PostgreSQL setup you can run:

cat << EOF | kubectl apply -f -
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: seldon-metadata-storage
  namespace: postgres
spec:
  teamId: "seldon"
  volume:
    size: 5Gi
  numberOfInstances: 2
  users:
    seldon:  # database owner
    - superuser
    - createdb
  databases:
    metadata: seldon  # dbname: owner
  postgresql:
    version: "13"
EOF

For a more complex setup consisting of more users, databases, replicas, etc. please refer to the official documentation of the operator here.

Once the database instances have been created by the Zalando operator, you can create the expected secret using the auto-generated password:

kubectl get secret seldon.seldon-metadata-storage.credentials.postgresql.acid.zalan.do -n postgres -o 'jsonpath={.data.password}' | base64 -d > db_pass
kubectl create secret generic -n seldon-system metadata-postgres \
  --from-literal=user=seldon \
  --from-file=password=./db_pass \
  --from-literal=host=seldon-metadata-storage.postgres.svc.cluster.local \
  --from-literal=port=5432 \
  --from-literal=dbname=metadata \
  --from-literal=sslmode=require \
  --dry-run=client -o yaml \
  | kubectl apply -n seldon-system -f -
rm db_pass

Configuring Seldon Enterprise Platform

Once your PostgreSQL database and the secrets with its credentials are ready, add the following to deploy-values.yaml. See the SSL Support section for configuring client certificates for mutual TLS verification.

metadata:
  pg:
    enabled: true
    secret: metadata-postgres
    clientTLSSecret: "postgres-client-certs" # Optional, only needed when the server requests client certificates (mutual TLS)

Warning

Setting metadata.pg.enabled to true will cause the request logger to automatically try to retrieve metadata from Enterprise Platform. Ensure you have the correct configuration for this to work properly.

Production operations on self-managed PostgreSQL

One of the drawbacks of using self-hosted PostgreSQL rather than a managed solution is that you will need to handle operating the PostgreSQL cluster. Here is a list of some resources for best practices and how to handle some operations: