Audit Logging¶
Configure audit logging for all requests going through Seldon Deploy
Audit logging is an optional part of the platform that can be enabled or disabled based on your requirements. With audit logging enabled, all requests to Seldon Deploy are logged to a configurable destination.
For a description of the log schema, available metrics, and limitations please refer to the audit architecture page.
Configuration¶
Basic configuration¶
Audit logging can be quite verbose, so it is disabled by default. Audit logging in Deploy supports two output mechanisms:
- writing to `stdout`
- forwarding the logs to a Fluentd instance, which can then store them in your desired storage solution
Both can be set up independently by adjusting the Helm values of the Deploy chart as follows:
```yaml
audit:
  stdout:
    enabled: true # Defaults to false. If true, will write audit logs to stdout.
  fluentd:
    enabled: true # Defaults to false. If true, will try to connect and send audit logs to the Fluentd instance defined in the values below.
    requireConn: true # Defaults to true. If true and audit.fluentd.enabled is true, Deploy will not start until a connection to the audit Fluentd instance can be established.
    host: audit-fluentd-aggregator.seldon-logs.svc.cluster.local # The address of the Fluentd instance.
    port: 24224 # The port used for writing to the Fluentd instance.
```
Note
As audit logging is an essential security requirement, we default to requiring a connection to the audit Fluentd instance to be established before Deploy starts serving users if Fluentd logging is enabled.
To disable this behaviour when logging to Fluentd, set `audit.fluentd.requireConn` to `false` in the Deploy Helm values file.
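For example, a minimal overrides file (following the values layout above) that keeps Fluentd logging enabled but lets Deploy start before the connection is established might look like:

```yaml
audit:
  fluentd:
    enabled: true
    requireConn: false # Deploy starts serving even if Fluentd is unreachable
```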
Fluentd Setup¶
Fluentd can be used to listen for audit logs and forward them to a number of storage solutions.
Copy the default Fluentd config file (and edit if desired):
```bash
cp ./seldon-deploy-install/reference-configuration/audit/values-audit-fluentd.yaml values-audit-fluentd.yaml
```
Install Fluentd using the `bitnami/fluentd` Helm chart:

```bash
helm upgrade --install audit-fluentd fluentd \
  --repo https://charts.bitnami.com/bitnami \
  --namespace seldon-logs \
  --version 5.5.11 \
  --values values-audit-fluentd.yaml
```
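For orientation, a fragment of `values-audit-fluentd.yaml` might look like the following. The exact keys are assumptions based on the `bitnami/fluentd` chart's aggregator settings and may differ between chart versions, so check them against your copy of the reference configuration:

```yaml
aggregator:
  enabled: true
  port: 24224 # Must match audit.fluentd.port in the Deploy Helm values
  configMap: audit-fluentd-config # ConfigMap holding the source/match configuration
```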
The important parts that must be present in the Fluentd configuration are the `source` configuration and the `match` part of the output configuration. Deploy uses the `forward` input plugin to send requests to Fluentd.

The source configuration should look something like this, where `{{ .Values.aggregator.port }}` is the same as `.Values.audit.fluentd.port` in the Deploy Helm chart:
```
<source>
  @type forward
  bind 0.0.0.0
  port {{ .Values.aggregator.port }}
</source>
```
For output, the pattern to use for Deploy's audit logs in the `<match>` clause is `deploy_audit.**`. An example configuration that writes to an S3/MinIO bucket looks like this:
```
<match deploy_audit.**>
  @type s3
  aws_key_id {{ .Values.aggregator.s3.accessKey }}
  aws_sec_key {{ .Values.aggregator.s3.secretKey }}
  s3_bucket deploy-audit # The bucket to store the log data
  s3_endpoint http://minio.minio-system.svc.cluster.local:9000/ # The endpoint URL (like "http://localhost:9000/")
  s3_region us-east-1 # See the region settings of your Minio server
  path logs/ # This prefix is added to each file
  force_path_style true # This prevents AWS SDK from breaking endpoint URL
  time_slice_format %Y%m%d%H%M # This timestamp is added to each file name
  <format>
    @type json
  </format>
  <buffer time>
    @type file
    path /tmp/buffer/s3
    timekey 3600 # Flush the accumulated chunks every hour
    timekey_wait 10m # Wait for 10m before flushing
    timekey_use_utc true # Use this option if you prefer UTC timestamps
    chunk_limit_size 256m # The maximum size of each chunk
  </buffer>
</match>
```
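To see how `path` and `time_slice_format` combine into object names, here is a small Python sketch. It is an illustration only: it assumes the `fluent-plugin-s3` default key format `%{path}%{time_slice}_%{index}.%{file_extension}`, which your plugin version may override via `s3_object_key_format`:

```python
from datetime import datetime, timezone

def s3_object_key(path: str, time_slice_format: str, index: int,
                  file_extension: str, when: datetime) -> str:
    """Render an object key using the assumed default fluent-plugin-s3
    format: %{path}%{time_slice}_%{index}.%{file_extension}."""
    time_slice = when.strftime(time_slice_format)  # e.g. 202401011030
    return f"{path}{time_slice}_{index}.{file_extension}"

# A chunk flushed for the 10:30 UTC time slice under the config above:
key = s3_object_key("logs/", "%Y%m%d%H%M", 0, "gz",
                    datetime(2024, 1, 1, 10, 30, tzinfo=timezone.utc))
print(key)  # logs/202401011030_0.gz
```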
For a list of all output plugins Fluentd supports, refer to the official documentation.
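While bringing the pipeline up, it can also help to temporarily route the audit events to Fluentd's built-in `stdout` output plugin so they appear in the aggregator pod's logs (a debugging aid, not a production configuration):

```
<match deploy_audit.**>
  @type stdout
</match>
```

Running `kubectl logs` against the aggregator pod then shows each audit event as it arrives.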
Note
This configuration of Fluentd for audit logging is different from the Elasticsearch installation for container logs.
As part of the EFK stack for container logs, Fluentd runs as a DaemonSet so it can transparently collect application logs from all relevant pods, regardless of which nodes they are placed on.
For audit logging, the pods do not need to be running on the same node because Deploy explicitly forwards information to Fluentd. Using a small number of Fluentd pods, rather than one per node, allows audit logging to be more resource efficient. It also allows for scaling the audit logging pipeline with the actual load on it, rather than with the number of nodes in Kubernetes.
Troubleshooting¶
Preserving the source IP of the original client¶
The audit logs contain a `SourceAddress` field which shows the IP address of the sender of the request. Seldon Deploy first looks for an `X-Forwarded-For` header to determine the original client, and only uses the address on the request if the header is not present.
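This lookup order can be sketched in Python. The function below is a hypothetical illustration of the rule, not Deploy's actual code; taking the left-most `X-Forwarded-For` entry as the original client is the common convention and an assumption here:

```python
def client_ip(headers: dict, remote_addr: str) -> str:
    # Prefer the X-Forwarded-For header; fall back to the connection address.
    xff = headers.get("X-Forwarded-For", "")
    if xff:
        # The left-most entry is conventionally the original client;
        # later entries are proxies that handled the request.
        return xff.split(",")[0].strip()
    return remote_addr

print(client_ip({"X-Forwarded-For": "203.0.113.7, 10.1.2.3"}, "10.1.2.3"))  # 203.0.113.7
print(client_ip({}, "10.1.2.3"))  # 10.1.2.3
```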
By default, Istio does not include the `X-Forwarded-For` header. To include it, some configuration of the Ingress Gateway is required, as described in the official documentation, depending on the type of your load balancer.
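For example, Istio's gateway network topology settings let you declare how many trusted proxies sit in front of the gateway, which controls how `X-Forwarded-For` is populated. The snippet below is a sketch; the value `1` assumes a single load balancer fronts the gateway, so adjust it to your topology:

```yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      gatewayTopology:
        numTrustedProxies: 1 # hops in front of the gateway whose entries are trusted
```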
The Kubernetes Nginx ingress controller also has documentation on proxying and source IP addresses.
If the actual client IP is still not shown properly, make sure there are no other proxies, load balancers, or firewalls overwriting the IP.