Metrics Monitoring

Important

Before starting the installation procedure, please download installation resources as explained here and make sure that all pre-requisites are satisfied.

This page also assumes that the main Seldon components are already installed.

Installation

Monitoring for Seldon Deploy is based on the open source analytics package (seldon-core-analytics), which installs Prometheus (and Grafana) and is required for metrics collection. The analytics component is configured through its Prometheus integration.

Before installing, set up a Prometheus recording rules file named model-usage.rules.yml. The contents of this file are given in the last section.

Copy the default model-usage.rules.yml (and edit it if desired):

cp ./seldon-deploy-install/prerequisites-setup/prometheus/model-usage.rules.yml model-usage.rules.yml

Create a ConfigMap from the file:

kubectl create configmap -n seldon-system model-usage-rules --from-file=model-usage.rules.yml --dry-run=client -o yaml | kubectl apply -f -
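To confirm that the ConfigMap carries the rules file under the expected key, a check along these lines may help. It is a minimal sketch: the helper name has_rules_key is hypothetical, and it only inspects YAML text on standard input rather than validating the rules themselves.

```shell
# Hypothetical helper: reads ConfigMap YAML on stdin and succeeds only if
# it contains a data key named model-usage.rules.yml.
has_rules_key() {
  grep -qE '^[[:space:]]+model-usage\.rules\.yml:'
}

# Example use against the cluster (requires kubectl access):
#   kubectl get configmap -n seldon-system model-usage-rules -o yaml \
#     | has_rules_key && echo "rules key present"
```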

Without these recording rules you may see warnings about usage.

The ConfigMap can be mounted into the Prometheus server by setting the extraConfigmapMounts shown below in an analytics-values.yaml file:

cat << EOF > ./analytics-values.yaml
grafana:
  resources:
    limits:
      cpu: 200m
      memory: 220Mi
    requests:
      cpu: 50m
      memory: 110Mi

prometheus:
  alertmanager:
    resources:
      limits:
        cpu: 50m
        memory: 64Mi
      requests:
        cpu: 10m
        memory: 32Mi
  nodeExporter:
    service:
      hostPort: 9200
      servicePort: 9200
    resources:
      limits:
        cpu: 200m
        memory: 220Mi
      requests:
        cpu: 50m
        memory: 110Mi
  server:
    livenessProbePeriodSeconds: 30
    retention: "90d"
    extraArgs:
      query.max-concurrency: 400
      storage.remote.read-concurrent-limit: 30
    persistentVolume:
      enabled: true
      existingClaim: ""
      mountPath: /data
      size: 32Gi
    resources:
      limits:
        cpu: 2
        memory: 4Gi
      requests:
        cpu: 800m
        memory: 1Gi
    extraConfigmapMounts:
      - name: prometheus-config-volume
        mountPath: /etc/prometheus/conf/
        subPath: ""
        configMap: prometheus-server-conf
        readOnly: true
      - name: prometheus-rules-volume
        mountPath: /etc/prometheus-rules
        subPath: ""
        configMap: prometheus-rules
        readOnly: true
      - name: model-usage-rules-volume
        mountPath: /etc/prometheus-rules/model-usage/
        subPath: ""
        configMap: model-usage-rules
        readOnly: true
EOF
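Before installing, it can be worth checking that the generated values file actually references the ConfigMap created earlier, since a typo in the name would leave the rules unmounted. The sketch below assumes the file layout shown above; the helper name mounts_configmap is hypothetical, and the check is a plain text match, not a YAML parse.

```shell
# Hypothetical helper: succeeds if FILE contains a line 'configMap: NAME'.
# Usage: mounts_configmap FILE NAME
mounts_configmap() {
  grep -qE "^[[:space:]]*configMap:[[:space:]]*$2[[:space:]]*\$" "$1"
}

# Only runs when the values file from the step above is present.
if [ -f ./analytics-values.yaml ]; then
  if mounts_configmap ./analytics-values.yaml model-usage-rules; then
    echo "model-usage-rules is mounted"
  else
    echo "model-usage-rules is NOT referenced in analytics-values.yaml" >&2
  fi
fi
```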

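The other settings above are suggestions only. Adjust them, in particular the retention period and the persistent volume size, to suit your disk availability.

Then add the Seldon chart repository and install the analytics chart: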

helm repo add seldonio https://storage.googleapis.com/seldon-charts
helm repo update

helm upgrade seldon-core-analytics seldonio/seldon-core-analytics \
    --version 1.7.0 \
    --namespace seldon-system \
    -f analytics-values.yaml \
    --install

This Prometheus installation is already configured to scrape metrics from Seldon Deployments. The Seldon Core documentation on analytics discusses the available metrics and the configuration of Prometheus itself.

The Helm chart also exposes further custom parameters, such as:

  • grafana_prom_admin_password - The admin password for Grafana

  • persistence.enabled - Enables persistence for Prometheus
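As an illustration, such parameters could be supplied with Helm's --set flag on top of the values file. This is a sketch only: the function name analytics_upgrade_cmd is hypothetical, the password is a placeholder, and the command is printed for review rather than executed.

```shell
# Hypothetical helper: prints (does not run) a helm command that passes the
# two parameters listed above via --set. The default password is a placeholder.
analytics_upgrade_cmd() {
  password="${1:-change-me}"
  cat <<EOF
helm upgrade seldon-core-analytics seldonio/seldon-core-analytics \\
    --version 1.7.0 \\
    --namespace seldon-system \\
    -f analytics-values.yaml \\
    --set grafana_prom_admin_password=${password} \\
    --set persistence.enabled=true \\
    --install
EOF
}

# Print the command so it can be inspected before being run by hand.
analytics_upgrade_cmd
```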

Bringing your own Prometheus

You can use your own Prometheus instance instead. See the prometheus section in the default values file:

seldon-deploy-install/sd-setup/helm-charts/seldon-deploy/values.yaml

Verification / Troubleshooting

We can port-forward Prometheus in order to check it. With seldon-core-analytics, the Prometheus service can be port-forwarded with:

kubectl port-forward -n seldon-system svc/seldon-core-analytics-prometheus-seldon 9090:80

Then go to localhost:9090 in the browser.
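The same reachability check can be scripted against the /-/healthy endpoint that Prometheus exposes. The sketch below assumes the localhost:9090 address given above and that curl is installed; the function names are hypothetical.

```shell
# Hypothetical helpers for checking the port-forwarded Prometheus.
prom_base_url() {
  # The local port defaults to 9090, matching the browser address above.
  echo "http://localhost:${1:-9090}"
}

prom_healthy() {
  # /-/healthy is the Prometheus health endpoint; -f makes curl return a
  # non-zero status on HTTP errors.
  curl -fsS --max-time 5 "$(prom_base_url "${1:-}")/-/healthy" >/dev/null
}

# Example (with the port-forward running in another terminal):
#   prom_healthy && echo "Prometheus is reachable"
```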

To confirm the recording rules are present, go to Status > Rules and search for model-usage.

If you have a Seldon model running, go to Status > Targets and search for seldon_app or just seldon. Any targets for Seldon models should be green.

On the /graph page, the "insert metric at cursor" drop-down should list metrics whose names begin with seldon.
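The rules and metric checks can also be made from the command line through the Prometheus HTTP API (/api/v1/rules and /api/v1/label/__name__/values). This is a minimal sketch, assuming the port-forward is active on localhost:9090 and curl is available; the helper names are hypothetical, and the JSON is filtered with grep rather than a proper parser.

```shell
# Hypothetical helper: reads the JSON from /api/v1/label/__name__/values on
# stdin and prints metric names that start with "seldon", one per line.
seldon_metric_names() {
  grep -oE '"seldon[A-Za-z0-9_:]*"' | tr -d '"'
}

# Hypothetical helper: reads the JSON from /api/v1/rules on stdin and
# succeeds if any rule file or group name mentions model-usage.
has_model_usage_rules() {
  grep -q 'model-usage'
}

# Example (with the port-forward running):
#   curl -fsS http://localhost:9090/api/v1/label/__name__/values | seldon_metric_names
#   curl -fsS http://localhost:9090/api/v1/rules | has_model_usage_rules && echo "rules loaded"
```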