Audit Logging

Configure audit logging for all requests going through Seldon Enterprise Platform

Audit logging is an optional part of the platform that can be enabled or disabled based on your requirements. With audit logging enabled, all requests to Seldon Enterprise Platform are logged to a configurable destination.

For a description of the log schema, available metrics, and limitations please refer to the audit architecture page.

Configuration

Basic configuration

Audit logging can be quite verbose, so it is disabled by default. Enterprise Platform supports two output mechanisms for audit logs:

  • writing to stdout

  • forwarding the logs to a Fluentd instance, which can then store them in your desired storage solution

Both can be set up independently by adjusting the Helm values of the Enterprise Platform chart as follows:

audit:
  stdout:
    enabled: true # Defaults to false. If true, will write audit logs to stdout.
  fluentd:
    enabled: true # Defaults to false. If true, will try to connect and send audit logs to the Fluentd instance defined in the values below.
    requireConn: true # Defaults to true. If true and fluentd.enabled is true, Enterprise Platform will not start until a connection to the audit Fluentd instance can be established.
    host: audit-fluentd-aggregator.seldon-logs.svc.cluster.local # The address of the Fluentd instance.
    port: 24224 # The port used for writing to the Fluentd instance.

Note

As audit logging is an essential security requirement, we default to requiring a connection to the audit Fluentd instance before Enterprise Platform starts serving users when Fluentd logging is enabled. To disable this behaviour, set audit.fluentd.requireConn to false in the Enterprise Platform Helm values file.
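For example, a Helm values override that keeps Fluentd logging on but lets Enterprise Platform start even if the Fluentd instance is temporarily unreachable might look like this (a minimal sketch using the chart values shown above):

```yaml
# Helm values override for the Enterprise Platform chart.
# Fluentd audit logging stays enabled, but startup no longer
# blocks on establishing a connection to the Fluentd instance.
audit:
  fluentd:
    enabled: true
    requireConn: false
```

With this override, audit events sent while Fluentd is unreachable may be lost, so weigh this against your compliance requirements.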

Fluentd Setup

Fluentd can be used to listen for audit logs and forward them to a number of storage solutions.

Copy the default Fluentd config file (and edit if desired):

cp ./seldon-deploy-install/reference-configuration/audit/values-audit-fluentd.yaml values-audit-fluentd.yaml

Install Fluentd using the bitnami/fluentd Helm chart:

helm upgrade --install audit-fluentd fluentd \
  --repo https://charts.bitnami.com/bitnami \
  --namespace seldon-logs \
  --version 5.5.11 \
  --values values-audit-fluentd.yaml

The important parts that must be present in the Fluentd configuration are the source configuration and the match section of the output configuration.

Enterprise Platform uses the forward input plugin for sending requests to Fluentd. The source configuration should look something like this, where {{ .Values.aggregator.port }} is the same as .Values.audit.fluentd.port in the Enterprise Platform Helm chart:

<source>
  @type forward
  bind 0.0.0.0
  port {{ .Values.aggregator.port }}
</source>

For output, the pattern to match Enterprise Platform’s audit logs in the <match> clause is deploy_audit.**. An example configuration writing to an S3/MinIO bucket looks like this:

<match deploy_audit.**>
  @type s3
  aws_key_id {{ .Values.aggregator.s3.accessKey }}
  aws_sec_key {{ .Values.aggregator.s3.secretKey }}
  s3_bucket deploy-audit        # The bucket to store the log data
  s3_endpoint http://minio.minio-system.svc.cluster.local:9000/  # The endpoint URL (like "http://localhost:9000/")
  s3_region us-east-1           # See the region settings of your MinIO server
  path logs/                    # This prefix is added to each file
  force_path_style true         # This prevents the AWS SDK from breaking the endpoint URL
  time_slice_format %Y%m%d%H%M  # This timestamp is added to each file name

  <format>
    @type json
  </format>

  <buffer time>
    @type file
    path /tmp/buffer/s3
    timekey 3600          # Flush the accumulated chunks every hour
    timekey_wait 10m      # Wait for 10m before flushing
    timekey_use_utc true  # Use this option if you prefer UTC timestamps
    chunk_limit_size 256m # The maximum size of each chunk
  </buffer>
</match>

For a list of all output plugins Fluentd supports, refer to the official documentation.
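While wiring up the pipeline, it can be helpful to confirm that audit events are actually reaching Fluentd before configuring a real storage backend. A minimal sketch using Fluentd’s built-in stdout output plugin, matching the same deploy_audit.** tag pattern:

```
<match deploy_audit.**>
  # Print each audit event to Fluentd's own container log for inspection
  @type stdout
</match>
```

Once events appear in the Fluentd pod logs, swap this match block for your real storage output.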

Note

This configuration of Fluentd for audit logging is different from the Elasticsearch installation for container logs.

As part of the EFK stack for container logs, Fluentd runs as a DaemonSet so it can collect application logs from all relevant pods transparently, regardless of which nodes they are placed on.

For audit logging, Fluentd does not need to run on the same nodes as Enterprise Platform because the platform explicitly forwards audit events to Fluentd over the network. Using a small number of Fluentd pods, rather than one per node, makes audit logging more resource efficient. It also allows the audit logging pipeline to scale with its actual load, rather than with the number of nodes in the Kubernetes cluster.

Troubleshooting

Preserving the source IP of the original client

The audit logs contain a SourceAddress field showing the IP of the sender of the request. Seldon Enterprise Platform first looks for an X-Forwarded-For header to determine the original client, and falls back to the address on the request only if the header is not present.

By default, Istio does not include the X-Forwarded-For header. To include it, the ingress gateway must be configured as described in the official Istio documentation, depending on the type of your load balancer.
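As one illustration, when the ingress gateway sits behind a load balancer that already appends X-Forwarded-For, Istio can be told how many trusted proxies precede the gateway via its gateway topology settings. A sketch using an IstioOperator overlay (the numTrustedProxies value depends on your network topology; verify the exact mechanism against the Istio documentation for your version):

```yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      gatewayTopology:
        # Number of trusted proxies in front of the ingress gateway.
        # Envoy uses this to pick the original client IP out of the
        # X-Forwarded-For header instead of the immediate peer address.
        numTrustedProxies: 1
```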

The Kubernetes Nginx ingress controller also has documentation on proxying and source IP addresses.
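For ingress-nginx, one commonly used option is to have the controller trust X-Forwarded-* headers set by an upstream proxy or load balancer. A sketch of the controller ConfigMap (the name and namespace are assumptions matching a default ingress-nginx install; on cloud load balancers, setting externalTrafficPolicy: Local on the controller’s Service is another common way to preserve the source IP):

```yaml
# ingress-nginx controller ConfigMap: trust the X-Forwarded-*
# headers set by an upstream proxy or load balancer.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  use-forwarded-headers: "true"
```

Only enable this when the upstream proxy is trusted, since clients could otherwise spoof X-Forwarded-For.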

If the actual client IP is still not shown properly, make sure there are no other proxies, load balancers, or firewalls overwriting the IP.