Kafka Integration¶
This page contains configuration details on Kafka integration.
Note
Integration with Kafka must be configured for both Seldon Core v2 and Seldon Enterprise Platform. This page lists the relevant Helm values that need to be changed for each Helm chart installation.
Warning
Kafka is an external component outside of the main Seldon stack. Therefore, it is the cluster administrator's responsibility to administer and manage the Kafka instance used by Seldon.
Warning
For production installations we highly recommend using a managed Kafka instance.
Configuring Kafka Integration¶
General Configuration¶
The general Kafka configuration options are set via the Seldon Core v2 and Seldon Enterprise Platform Helm chart values.
In components-values.yaml you can find the kafka section with the following defaults (Seldon Core 2.5.0):
kafka:
  bootstrap: seldon-kafka-bootstrap.seldon-mesh:9092
  topicPrefix: seldon
  debug:
  consumer:
    autoOffsetReset: earliest
    sessionTimeoutMs: 6000
    topicMetadataRefreshIntervalMs: 1000
    topicMetadataPropagationMaxMs: 300000
    messageMaxBytes: 1000000000
  producer:
    lingerMs: 0
    messageMaxBytes: 1000000000
  topics:
    replicationFactor: 1
    numPartitions: 1
In most situations you will only need to set the following options:
kafka:
  bootstrap: seldon-kafka-bootstrap.kafka.svc.cluster.local:9092
  topics:
    replicationFactor: 3
    numPartitions: 4
In deploy-values.yaml you can find the requestLogger.kafka_consumer section with the following defaults:
requestLogger:
  kafka_consumer:
    enabled: true
    bootstrap_servers: seldon-kafka-bootstrap.kafka.svc.cluster.local:9092
    group_id: metronome
    auto_offset_reset: earliest
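After editing either file, apply the new values by upgrading the corresponding Helm release. A minimal sketch, assuming the repository alias, release names, and namespace used here (seldon-charts, seldon-core-v2-setup, seldon-deploy, seldon-system) match your installation:
helm upgrade --install seldon-core-v2-setup seldon-charts/seldon-core-v2-setup \
  -f components-values.yaml -n seldon-system
helm upgrade --install seldon-deploy seldon-charts/seldon-deploy \
  -f deploy-values.yaml -n seldon-system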
Kafka Encryption (TLS)¶
In production settings, it’s recommended to set up TLS encryption when talking to Kafka. This will ensure that neither the credentials nor the payloads are transported in plaintext.
Note
TLS encryption here involves only single-sided TLS. This means that the contents will be sent encrypted to the server, but the client won't send any form of certificate. Therefore, it does not take care of authenticating the client. Client authentication can be configured through mutual TLS (mTLS) or a SASL mechanism, both of which are covered in the Kafka Authentication section below.
Warning
Seldon Core v2 prior to version 2.6.0 supports TLS encryption only in combination with mTLS or SASL authentication.
When TLS is enabled, the client will need to know the root CA certificate used to create the server’s certificate. This will be used to validate the certificate sent back by the Kafka server.
Note
This certificate is expected to be PEM-encoded.
Within our cluster, we can provide the server’s root CA certificate through a secret, which can be created as:
kubectl create secret generic kafka-broker-tls -n seldon --from-file ./ca.crt
Note
It is important that the field used for the certificate within the secret is named ca.crt. Otherwise, SCv2 may not be able to find the certificate.
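You can double-check that the secret exposes the expected ca.crt key before proceeding:
kubectl describe secret kafka-broker-tls -n seldon
The Data section of the output should list ca.crt.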
Once we have created this secret, we will then need to reference it within the security.kafka.ssl.client.brokerValidationSecret
field of the Helm chart values.
The resulting set of values to include in components-values.yaml
should be similar to:
security:
  kafka:
    ssl:
      client:
        brokerValidationSecret: kafka-broker-tls
Kafka Authentication¶
In production settings, Kafka clusters typically employ some form of authentication. This is especially true when utilizing managed Kafka solutions, which are highly recommended. Therefore, during the installation of the main SCv2 components, it is necessary to ensure that we provide the correct credentials to establish a connection to Kafka.
The authentication mechanism offered by Kafka will generally depend on the Kafka flavour used. However, it will usually be one of the following:
- SASL, where credentials are in the form of a user and password combination.
- mTLS, where a set of SSL certificates is used as credentials.
- OAuth 2.0, where the client credentials flow is used to obtain a JWT token.
Within the cluster, these credentials will be provided as secrets. Therefore, when installing SCv2, we will need to take care to create the appropriate secret in the correct format and update the SCv2 Helm values accordingly. Below, you can find the high-level instructions to provide the credentials for each of the authentication mechanisms.
When using SASL
as the authentication mechanism for Kafka, the credentials will be a user and password combination.
The password will be provided through a secret, which can be created as:
kubectl create secret generic kafka-sasl-secret --from-literal password=<kafka-password> -n <namespace>
Note
This secret must be present in the seldon-logs namespace and in every namespace containing a Seldon Core v2 runtime, here the seldon namespace.
Note
It is important that the field used for the password within the secret is named password. Otherwise, SCv2 may not be able to find the correct password.
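Since the same secret is needed in several namespaces, creating it in a loop avoids missing one. A sketch, assuming seldon is your only runtime namespace:
for ns in seldon seldon-logs; do
  kubectl create secret generic kafka-sasl-secret \
    --from-literal password=<kafka-password> -n "$ns"
done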
Once we have created this secret, we will then need to reference it within the corresponding Helm values for both Seldon Core v2 and Seldon Enterprise Platform installation.
Seldon Core v2 Helm Chart
For Seldon Core v2 we need to specify the following Helm values:
- security.kafka.sasl.mechanism - SASL security mechanism, e.g. SCRAM-SHA-512
- security.kafka.sasl.client.username - Kafka username
- security.kafka.sasl.client.secret - name of the created secret containing the password
- security.kafka.ssl.client.brokerValidationSecret - Certificate Authority of the Kafka brokers
The resulting set of values to include in components-values.yaml
should be similar to:
security:
  kafka:
    protocol: SASL_SSL
    sasl:
      mechanism: SCRAM-SHA-512
      client:
        username: <kafka-username> # TODO: Replace with your Kafka username
        secret: kafka-sasl-secret # NOTE: Secret name from previous step
    ssl:
      client:
        secret: # NOTE: Leave empty
        brokerValidationSecret: kafka-broker-tls # NOTE: Optional
Note
The security.kafka.ssl.client.brokerValidationSecret field is optional. Leave it empty if your brokers use a well-known Certificate Authority such as Let's Encrypt.
Seldon Enterprise Platform Helm Chart
The modifications to the Seldon Enterprise Platform Helm values look almost exactly the same. The only difference is that the corresponding entries go under requestLogger.kafka_consumer.
The resulting set of values to include in deploy-values.yaml
should be similar to:
requestLogger:
  kafka_consumer:
    protocol: SASL_SSL
    sasl:
      mechanism: SCRAM-SHA-512
      client:
        username: <kafka-username> # TODO: Replace with your Kafka username
        secret: kafka-sasl-secret # NOTE: Secret name from earlier step
    ssl:
      client:
        secret: # NOTE: Leave empty
        brokerValidationSecret: kafka-broker-tls # NOTE: Optional
When using the OAuth 2.0 authentication mechanism for Kafka, the credentials will be a Client ID and Client Secret that can be used with your Identity Provider to obtain JWT tokens used to authenticate with Kafka brokers.
The credentials are provided in the form of a K8s secret, kafka-oauth.yaml:
apiVersion: v1
kind: Secret
metadata:
  name: kafka-oauth
type: Opaque
stringData:
  method: OIDC
  client_id: <client id>
  client_secret: <client secret>
  token_endpoint_url: <token endpoint url>
  extensions: ""
  scope: ""
which must be present in the appropriate namespaces:
kubectl apply -f kafka-oauth.yaml -n <namespace>
Note
This secret must be present in the seldon-logs namespace and in every namespace containing a Seldon Core v2 runtime, here the seldon namespace.
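As with the SASL secret above, applying the manifest in a loop over the relevant namespaces avoids missing one. A sketch, assuming seldon is your only runtime namespace:
for ns in seldon seldon-logs; do
  kubectl apply -f kafka-oauth.yaml -n "$ns"
done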
The client ID, client secret and token endpoint URL should come from your identity provider, e.g. Keycloak or Azure AD.
Seldon Core v2 Helm Chart
For Seldon Core v2 we need to specify the following Helm values:
- security.kafka.sasl.mechanism - set to OAUTHBEARER
- security.kafka.sasl.client.secret - name of the created secret with the client credentials
- security.kafka.ssl.client.brokerValidationSecret - Certificate Authority of the Kafka brokers
The resulting set of values to include in components-values.yaml
should be similar to:
security:
  kafka:
    protocol: SASL_SSL
    sasl:
      mechanism: OAUTHBEARER
      client:
        secret: kafka-oauth # NOTE: Secret name from earlier step
    ssl:
      client:
        secret: # NOTE: Leave empty
        brokerValidationSecret: kafka-broker-tls # NOTE: Optional
Note
The security.kafka.ssl.client.brokerValidationSecret field is optional. Leave it empty if your brokers use a well-known Certificate Authority such as Let's Encrypt.
Seldon Enterprise Platform Helm Chart
The modifications to the Seldon Enterprise Platform Helm values look almost exactly the same. The only difference is that the corresponding entries go under requestLogger.kafka_consumer.
The resulting set of values to include in deploy-values.yaml
should be similar to:
requestLogger:
  kafka_consumer:
    protocol: SASL_SSL
    sasl:
      mechanism: OAUTHBEARER
      client:
        secret: kafka-oauth # NOTE: Secret name from earlier step
    ssl:
      client:
        secret: # NOTE: Leave empty
        brokerValidationSecret: kafka-broker-tls # NOTE: Optional
When using mTLS (i.e. mutual TLS), Kafka will use a set of certificates to authenticate the client. These will usually be:
- A client certificate, which we will refer to as tls.crt.
- A client key, which we will refer to as tls.key.
- A root certificate, which we will refer to as ca.crt.
Note
These certificates are expected to be PEM-encoded.
These certificates will be provided through a secret, which can be created as:
kubectl create secret generic kafka-client-tls -n <namespace> \
--from-file ./tls.crt \
--from-file ./tls.key \
--from-file ./ca.crt
Note
This secret must be present in the seldon-logs namespace and in every namespace containing a Seldon Core v2 runtime, here seldon.
Note
It is important that the fields used within the secret follow the same naming convention: tls.crt, tls.key and ca.crt. Otherwise, SCv2 may not be able to find the correct set of certificates.
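To confirm that all three keys are present under the expected names, inspect the secret:
kubectl describe secret kafka-client-tls -n <namespace>
The Data section of the output should list ca.crt, tls.crt and tls.key.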
Once we have created this secret, we will then need to reference it within the corresponding Helm values for both Seldon Core v2 and Seldon Enterprise Platform installation.
Seldon Core v2 Helm Chart
For Seldon Core v2 we need to specify the following Helm values:
- security.kafka.ssl.client.secret - name of the secret containing the client certificates
- security.kafka.ssl.client.brokerValidationSecret - Certificate Authority of the Kafka brokers
The resulting set of values to include in components-values.yaml
should be similar to:
security:
  kafka:
    protocol: SSL
    ssl:
      client:
        secret: kafka-client-tls # NOTE: Secret name from earlier step
        brokerValidationSecret: kafka-broker-tls # NOTE: Optional
Note
The security.kafka.ssl.client.brokerValidationSecret field is optional. Leave it empty if your brokers use a well-known Certificate Authority such as Let's Encrypt.
Seldon Enterprise Platform Helm Chart
The modifications to the Seldon Enterprise Platform Helm values look almost exactly the same. The only difference is that the corresponding entries go under requestLogger.kafka_consumer.
The resulting set of values to include in deploy-values.yaml
should be similar to:
requestLogger:
  kafka_consumer:
    protocol: SSL
    ssl:
      client:
        secret: kafka-client-tls # NOTE: Secret name from earlier step
        brokerValidationSecret: kafka-broker-tls # NOTE: Optional
Managed Kafka Examples¶
Example configuration steps for selected managed Kafka solutions.
Confluent Cloud (SASL)¶
Create API Keys
In your Confluent Cloud environment create new API keys.
The easiest way to obtain all required information is to head to Clients -> New client (choose e.g. Go) and generate a new Kafka cluster API key from there.
See Confluent Cloud documentation in case of issues.
This will generate for you:
- Key (we use it as the username)
- Secret (we use it as the password)
Do not forget to also copy the bootstrap.servers value from the example config.
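The copied value should look similar to the following; the host here is an illustrative placeholder:
bootstrap.servers=pkc-xxxxx.us-east-1.aws.confluent.cloud:9092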
Create Kubernetes Secret
Create K8s secrets storing the SASL password for both Seldon Core v2 and Seldon Enterprise Platform to use:
kubectl create secret generic confluent-kafka-sasl --from-literal password="<Confluent Cloud API Secret>" -n seldon
kubectl create secret generic confluent-kafka-sasl --from-literal password="<Confluent Cloud API Secret>" -n seldon-system
Configure Seldon Core v2 and Seldon Enterprise Platform
Make the following adjustments to both the Seldon Core v2 and Seldon Enterprise Platform Helm values:
components-values.yaml
kafka:
  bootstrap: < Confluent Cloud Broker Endpoints >
  topics:
    replicationFactor: 3
    numPartitions: 4
  consumer:
    messageMaxBytes: 8388608
  producer:
    messageMaxBytes: 8388608
security:
  kafka:
    protocol: SASL_SSL
    sasl:
      mechanism: "PLAIN"
      client:
        username: < username >
        secret: confluent-kafka-sasl
    ssl:
      client:
        secret:
        brokerValidationSecret:
You may need to tweak replicationFactor and numPartitions to your cluster configuration.
deploy-values.yaml
requestLogger:
  kafka_consumer:
    bootstrap_servers: < Confluent Cloud Broker Endpoints >
    protocol: SASL_SSL
    sasl:
      mechanism: "PLAIN"
      client:
        username: < username >
        secret: confluent-kafka-sasl
    ssl:
      client:
        secret:
        brokerValidationSecret:
Troubleshooting
- First check the Confluent Cloud documentation.
- Set the Kafka config map debug setting to "all". For a Helm install you can set kafka.debug=all.
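A sketch of enabling this at upgrade time, assuming the repository alias and release name (seldon-charts, seldon-core-v2-setup) match your installation:
helm upgrade --install seldon-core-v2-setup seldon-charts/seldon-core-v2-setup \
  -n seldon-system -f components-values.yaml \
  --set kafka.debug=all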
Confluent Cloud (OAuth 2.0)¶
Confluent Cloud managed Kafka supports OAuth 2.0 to authenticate your Kafka clients. See the Confluent Cloud documentation for further details.
Configure Identity Provider in Confluent Cloud Console
In your Confluent Cloud Console go to Account & Access / Identity providers and register your Identity Provider.
See Confluent Cloud documentation for further details.
Configure Identity Pool
In your Confluent Cloud Console go to Account & Access / Identity providers and add a new identity pool to your newly registered Identity Provider.
See Confluent Cloud documentation for further details.
Obtain Required Details
You will need the following information from Confluent Cloud:
- Cluster ID: Cluster Overview → Cluster Settings → General → Identification
- Identity Pool ID: Accounts & access → Identity providers → <specific provider details>
You will also need the following information from your identity provider, e.g. Keycloak or Azure AD:
- Client ID
- Client secret
- Token Endpoint URL
Create Kubernetes Secret
Create a K8s secret storing the required client credentials, kafka-secret.yaml:
apiVersion: v1
kind: Secret
metadata:
  name: confluent-kafka-oauth
type: Opaque
stringData:
  method: OIDC
  client_id: <client id>
  client_secret: <client secret>
  token_endpoint_url: <token endpoint url>
  extensions: logicalCluster=<cluster id>,identityPoolId=<identity pool id>
  scope: ""
which must be present in the appropriate namespaces:
kubectl apply -f kafka-secret.yaml -n <namespace>
Note
This secret must be present in the seldon-logs namespace and in every namespace containing a Seldon Core v2 runtime, here the seldon namespace.
Note
If you are using Azure AD you may need to set scope: api://<client id>/.default.
Configure Seldon Core v2 and Seldon Enterprise Platform
Make the following adjustments to both the Seldon Core v2 and Seldon Enterprise Platform Helm values:
components-values.yaml
kafka:
  bootstrap: < Confluent Cloud Broker Endpoints >
  topics:
    replicationFactor: 3
    numPartitions: 4
  consumer:
    messageMaxBytes: 8388608
  producer:
    messageMaxBytes: 8388608
security:
  kafka:
    protocol: SASL_SSL
    sasl:
      mechanism: OAUTHBEARER
      client:
        secret: confluent-kafka-oauth
    ssl:
      client:
        secret:
        brokerValidationSecret:
You may need to tweak replicationFactor and numPartitions to your cluster configuration.
deploy-values.yaml
requestLogger:
  kafka_consumer:
    bootstrap_servers: < Confluent Cloud Broker Endpoints >
    protocol: SASL_SSL
    sasl:
      mechanism: OAUTHBEARER
      client:
        secret: confluent-kafka-oauth
    ssl:
      client:
        secret:
        brokerValidationSecret:
Troubleshooting
- First check the Confluent Cloud documentation.
- Set the Kafka config map debug setting to "all". For a Helm install you can set kafka.debug=all, as shown in the sketch in the SASL troubleshooting section above.
Azure Event Hub¶
Warning
You will need at least the Standard tier for your Event Hub namespace, as the Basic tier does not support the Kafka protocol.
Warning
Seldon Core v2 creates two Kafka topics for each pipeline and model, plus one global topic for errors. This means that the total number of topics will be 2 x (#models + #pipelines) + 1 (for example, 10 models and 5 pipelines require 2 x (10 + 5) + 1 = 31 topics), which will likely exceed the limit of the Standard tier in Azure Event Hub. See quota information here.
Prerequisites
To start you will need an Azure Event Hub Namespace. You can create one following the Azure quickstart docs. Note that you do not need to create an Event Hub (topics), as Core v2 will create all the topics it needs automatically.
Create API Keys
To connect to the Kafka API provided by Azure Event Hub you need to obtain:
- the Kafka Endpoint
- a Connection String
You can obtain both using the Azure Portal as documented here.
Note
You should get the Connection String at the namespace level, as we will need to dynamically create new topics. The Connection String should be in the format:
Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=XXXXXX;SharedAccessKey=XXXXXX
Create Kubernetes Secret
Create K8s secrets storing the SASL password for both Seldon Core v2 and Seldon Enterprise Platform to use:
kubectl create secret generic azure-kafka-secret --from-literal password="Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=XXXXXX;SharedAccessKey=XXXXXX" -n seldon
kubectl create secret generic azure-kafka-secret --from-literal password="Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=XXXXXX;SharedAccessKey=XXXXXX" -n seldon-system
Configure Seldon Core v2 and Seldon Enterprise Platform
Make the following adjustments to both the Seldon Core v2 and Seldon Enterprise Platform Helm values:
components-values.yaml
kafka:
  bootstrap: <namespace>.servicebus.windows.net:9093
  topics:
    replicationFactor: 3
    numPartitions: 4
security:
  kafka:
    protocol: SASL_SSL
    sasl:
      mechanism: "PLAIN"
      client:
        username: $ConnectionString
        secret: azure-kafka-secret
    ssl:
      client:
        secret:
        brokerValidationSecret:
You may need to tweak replicationFactor and numPartitions to your cluster configuration.
deploy-values.yaml
requestLogger:
  kafka_consumer:
    bootstrap_servers: <namespace>.servicebus.windows.net:9093
    protocol: SASL_SSL
    sasl:
      mechanism: "PLAIN"
      client:
        username: $ConnectionString
        secret: azure-kafka-secret
    ssl:
      client:
        secret:
        brokerValidationSecret:
Note
The username should read $ConnectionString literally; it is not a variable for you to replace.
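If you set the username on the command line instead of in a values file, single-quote it so your shell does not expand it as a variable. A sketch, assuming the repository alias and release name (seldon-charts, seldon-core-v2-setup) match your installation:
helm upgrade --install seldon-core-v2-setup seldon-charts/seldon-core-v2-setup \
  -n seldon-system -f components-values.yaml \
  --set security.kafka.sasl.client.username='$ConnectionString'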
Troubleshooting
- First check the Azure Event Hub troubleshooting guide.
- Set the Kafka config map debug setting to "all". For a Helm install you can set kafka.debug=all.
- Verify that you did not hit quotas for topics or partitions in your Event Hub namespace.
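One way to check topic usage is to count the Event Hubs in the namespace with the Azure CLI and compare the result against the 2 x (#models + #pipelines) + 1 estimate above. A sketch, assuming the Azure CLI is installed and logged in:
az eventhubs eventhub list \
  --resource-group <resource-group> \
  --namespace-name <namespace> \
  --query "length(@)"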