Audit Logging¶
Configure audit logging for all requests going through Seldon Enterprise Platform
Audit logging is an optional part of the platform that can be enabled or disabled based on your requirements. When enabled, all requests to Seldon Enterprise Platform are logged to a configurable destination.
For a description of the log schema, available metrics, and limitations please refer to the audit architecture page.
Configuration¶
Basic configuration¶
Audit logging can be quite verbose, so it is disabled by default. Audit logging in Enterprise Platform supports two different output mechanisms:
- writing to stdout
- forwarding the logs to a Fluentd instance, which can then store them in your desired storage solution
Both can be set up independently by adjusting the Helm values of the Enterprise Platform chart as follows:
audit:
  stdout:
    enabled: true # Defaults to false. If true, audit logs are written to stdout.
  fluentd:
    enabled: true # Defaults to false. If true, audit logs are sent to the Fluentd instance defined in the values below.
    requireConn: true # Defaults to true. If true and audit.fluentd.enabled is true, Enterprise Platform will not start until a connection to the audit Fluentd instance can be established.
    host: audit-fluentd-aggregator.seldon-logs.svc.cluster.local # The address of the Fluentd instance.
    port: 24224 # The port used for writing to the Fluentd instance.
Note
As audit logging is an essential security requirement, we default to requiring a connection to the audit Fluentd instance to be established before Enterprise Platform starts serving users if Fluentd logging is enabled.
To disable this behaviour when logging to Fluentd, set audit.fluentd.requireConn to false in the Enterprise Platform Helm values file.
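For example, a values snippet that keeps Fluentd logging enabled but lets Enterprise Platform start even while the aggregator is unreachable might look like this (host and port shown are the chart defaults from above):

```yaml
audit:
  fluentd:
    enabled: true
    requireConn: false # Start serving even if the audit Fluentd instance is not yet reachable.
    host: audit-fluentd-aggregator.seldon-logs.svc.cluster.local
    port: 24224
```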
Fluentd Setup¶
Fluentd can be used to listen for audit logs and forward them to a number of storage solutions.
Copy the default Fluentd config file (and edit if desired):
cp ./seldon-deploy-install/reference-configuration/audit/values-audit-fluentd.yaml values-audit-fluentd.yaml
Install Fluentd using the bitnami/fluentd Helm chart:
helm upgrade --install audit-fluentd fluentd \
--repo https://charts.bitnami.com/bitnami \
--namespace seldon-logs \
--version 5.5.11 \
--values values-audit-fluentd.yaml
The important parts that must be present in the Fluentd configuration are the source configuration and the match part of the output configuration. Enterprise Platform uses the forward input plugin for sending requests to Fluentd.
The source configuration should look something like this, where {{ .Values.aggregator.port }} is the same as .Values.audit.fluentd.port in the Enterprise Platform Helm chart:
<source>
  @type forward
  bind 0.0.0.0
  port {{ .Values.aggregator.port }}
</source>
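Before pointing Enterprise Platform at the aggregator, it can be worth verifying that the forward port is actually reachable. A small Python probe, sketched here as an illustration (the hostname and port shown are the chart defaults assumed above, not something you would run outside the cluster), might look like:

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# In-cluster, the chart defaults above would be probed like this:
# port_open("audit-fluentd-aggregator.seldon-logs.svc.cluster.local", 24224)
```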
For output, the pattern to use for Enterprise Platform's audit logs in the <match> clause is deploy_audit.**.
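To illustrate what that pattern covers: in Fluentd tag matching, * matches exactly one tag part while ** matches zero or more parts. The following small Python helper is purely illustrative (it is not part of Fluentd or Enterprise Platform) and mimics those semantics with a regex:

```python
import re

def fluentd_match(pattern: str, tag: str) -> bool:
    """Approximate Fluentd tag matching: '*' matches one dot-separated tag
    part, '**' matches zero or more parts (including the separating dots)."""
    regex = ""
    i = 0
    while i < len(pattern):
        if pattern[i:i + 3] == ".**":
            regex += r"(\..*)?"   # optional tail of additional parts
            i += 3
        elif pattern[i:i + 2] == "**":
            regex += ".*"         # zero or more parts
            i += 2
        elif pattern[i] == "*":
            regex += "[^.]+"      # exactly one part
            i += 1
        else:
            regex += re.escape(pattern[i])
            i += 1
    return re.fullmatch(regex, tag) is not None

print(fluentd_match("deploy_audit.**", "deploy_audit.request"))  # True
print(fluentd_match("deploy_audit.**", "kube.container"))        # False
```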
An example configuration that writes to an S3/MinIO bucket looks like this:
<match deploy_audit.**>
  @type s3
  aws_key_id {{ .Values.aggregator.s3.accessKey }}
  aws_sec_key {{ .Values.aggregator.s3.secretKey }}
  s3_bucket deploy-audit # The bucket to store the log data
  s3_endpoint http://minio.minio-system.svc.cluster.local:9000/ # The endpoint URL (like "http://localhost:9000/")
  s3_region us-east-1 # See the region settings of your Minio server
  path logs/ # This prefix is added to each file
  force_path_style true # This prevents AWS SDK from breaking endpoint URL
  time_slice_format %Y%m%d%H%M # This timestamp is added to each file name
  <format>
    @type json
  </format>
  <buffer time>
    @type file
    path /tmp/buffer/s3
    timekey 3600 # Flush the accumulated chunks every hour
    timekey_wait 10m # Wait for 10m before flushing
    timekey_use_utc true # Use this option if you prefer UTC timestamps
    chunk_limit_size 256m # The maximum size of each chunk
  </buffer>
</match>
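For a quick sanity check before wiring up object storage, a minimal match clause that simply prints audit records to Fluentd's own log (using the stdout output plugin, with the same deploy_audit.** tag) could be used instead:

```
<match deploy_audit.**>
  @type stdout
</match>
```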
For a list of all output plugins Fluentd supports, refer to the official documentation.
Note
This configuration of Fluentd for audit logging is different from the Elasticsearch installation for container logs.
As part of the EFK stack for container logs, Fluentd runs as a DaemonSet so it can collect application logs from all relevant pods transparently, regardless of which nodes they are placed on.
For audit logging, the pods do not need to be running on the same node because Enterprise Platform explicitly forwards information to Fluentd. Using a small number of Fluentd pods, rather than one per node, allows audit logging to be more resource efficient. It also allows for scaling the audit logging pipeline with the actual load on it, rather than with the number of nodes in Kubernetes.
Troubleshooting¶
Preserving the source IP of the original client¶
The audit logs contain a SourceAddress field, which shows the IP address of the sender of the request. Seldon Enterprise Platform first looks for an X-Forwarded-For header to determine the original client, and only falls back to the address on the request itself if the header is not present.
By default, Istio does not include the X-Forwarded-For header. To include it, some configuration of the Ingress Gateway is required, as described in the official documentation, depending on the type of your load balancer. The Kubernetes Nginx ingress controller also has documentation on proxying and source IP addresses.
If the actual client IP is still not shown properly, make sure there are no other proxies, load balancers, or firewalls overwriting the IP.
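The precedence described above can be sketched in a few lines of Python. This is a hypothetical helper for illustration, not Enterprise Platform's actual code: take the first entry of X-Forwarded-For when the header is present (the header may carry a comma-separated chain of proxies, with the original client first), otherwise fall back to the connection's remote address.

```python
def resolve_source_address(headers: dict, remote_addr: str) -> str:
    """Prefer the original client from X-Forwarded-For; fall back to the
    peer address of the connection when the header is absent or empty."""
    xff = headers.get("X-Forwarded-For", "").strip()
    if xff:
        # First entry in the chain is the original client.
        return xff.split(",")[0].strip()
    return remote_addr

print(resolve_source_address({"X-Forwarded-For": "203.0.113.7, 10.0.0.2"}, "10.0.0.9"))
# 203.0.113.7
```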