1 of 14

Enhancing Kafka topic management in Kubernetes with the Unidirectional Topic Operator

Federico Valeri

StrimziCon 2024

https://strimzi.io

2 of 14

Topics

  • A topic is a named feed where messages can be published and consumed
  • Messages are immutable and stored in one or more (replicated) partitions
  • Each partition is an append-only log made of segments


3 of 14

Partitions

  • Distributed evenly across brokers
  • The replication factor ensures fault tolerance (typical config: RF=3, min.insync.replicas=2)
  • The retention policy and ingestion rate determine how much disk space is used (set both “retention.ms” and “retention.bytes”)
  • Message ordering is only guaranteed at the partition level, not across partitions


4 of 14

Challenges

  • Manage and deploy topics declaratively (GitOps)
  • Support bulk operations (busy clusters, load testing)
  • Delegate topic management in a controlled way (config limits)
  • Monitor topic state and be notified about any issue (metrics)


5 of 14

Topic Operator

  • Extends the Kubernetes API by introducing the KafkaTopic custom resource
  • Reconciles KafkaTopic resources (KTs) in a single namespace for a single Kafka cluster
  • Can be deployed as part of the Kafka custom resource, or as a standalone deployment
  • Supports TLS and SASL authentication against the target Kafka cluster
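
As a minimal sketch of the first deployment option, the Topic Operator can be enabled through the entityOperator section of the Kafka custom resource. The cluster name "my-cluster" is an assumption for illustration, and the rest of the Kafka spec is omitted.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster            # hypothetical cluster name
spec:
  # kafka: ...                # broker configuration omitted in this sketch
  entityOperator:
    topicOperator: {}         # deploy the Topic Operator with default settings
```

With this option the Cluster Operator deploys and configures the Topic Operator; a standalone deployment has to be configured separately.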


6 of 14

KafkaTopic
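
A KafkaTopic resource might look like the following sketch. The resource name, namespace, cluster name, and retention values are assumptions chosen for illustration; the replication settings follow the typical configuration mentioned earlier (RF=3, min.insync.replicas=2).

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-topic                      # hypothetical resource name
  namespace: my-namespace             # hypothetical; must be the namespace the operator watches
  labels:
    strimzi.io/cluster: my-cluster    # hypothetical; must match the Kafka cluster name
spec:
  partitions: 3
  replicas: 3                         # replication factor
  config:
    min.insync.replicas: 2
    retention.ms: 604800000           # 7 days
    retention.bytes: 1073741824       # 1 GiB per partition
```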


7 of 14

Timeline

  • 2018 (0.6): Old gen. Bidirectional serial reconciliation. Metadata stored locally (SoT).
  • 2023 (0.36): New gen “alpha”. Unidirectional batch reconciliation. KRaft support.
  • 2024 (0.41): New gen “GA”. Replication factor change. Minor improvements.
  • 20XX (X.XX): Future. Multi-namespace reconciliation. Management delegation.

8 of 14

Comparison

Old generation

  • Bidirectional serial reconciliation
  • Stateful application
  • Limited scalability
  • Complex architecture
  • ZK requirement

New generation

  • Unidirectional batch reconciliation
  • Stateless application
  • Improved scalability
  • Simplified architecture
  • ZK and KRaft support


9 of 14

Old gen issues

  • Hard to maintain due to its internal complexity
    • There are corner cases which are not well understood (e.g. invalid state store)
  • Limited scalability due to the original design choices
    • Reconciles one topic at a time
    • Updates a persistent store containing topic metadata
  • Requires ZooKeeper to get topic events from Kafka
    • ZooKeeper has been deprecated since Kafka 3.5.0 and will be removed in 4.0.0


10 of 14

New gen features

  • Scales much better thanks to batch reconciliation and a stateless design
  • Supports unmanaging a topic
  • Supports replication factor change
  • Uses a finalizer to ensure that topic deletion events are handled properly (default: enabled)
  • Provides additional metrics for external requests (default: disabled)
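
As a sketch of unmanaging a topic, the KafkaTopic can be annotated so that the operator stops reconciling it; the resource can then be deleted without deleting the topic in Kafka. The annotation name is taken from the Strimzi documentation as I understand it, and the resource and cluster names are hypothetical.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-topic                      # hypothetical resource name
  labels:
    strimzi.io/cluster: my-cluster    # hypothetical cluster name
  annotations:
    strimzi.io/managed: "false"       # operator no longer reconciles this topic
spec:
  partitions: 3
  replicas: 3
```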


11 of 14

Replication factor change

  • Challenges
    • Kafka only provides the “kafka-reassign-partitions.sh” command line tool
    • The user has to plan replica and leader movements manually without causing imbalance
  • Cruise Control
    • The “topic_configuration” REST endpoint supports this operation
    • Leverages the same heuristic used for rebalances, and supports batch requests
  • Demo
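
From the user's point of view, a replication factor change can be sketched as an edit to spec.replicas, assuming Cruise Control is deployed alongside the Kafka cluster so that the operator can delegate the replica movements to it. The names and the previous value are hypothetical.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-topic                      # hypothetical resource name
  labels:
    strimzi.io/cluster: my-cluster    # hypothetical cluster name
spec:
  partitions: 3
  replicas: 3                         # assumed to be raised from a previous value of 2
```

The change is not instantaneous: the operator submits the request and the new replica assignment is applied in the background, so the topic only reaches the target replication factor once the data movement completes.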


12 of 14

Multi-namespace reconciliation

“I would like to have better topic support in a multi-tenant Kubernetes cluster where users (application developers) are sandboxed to their own namespace.”

Challenges:

  • Kafka has limited support for multi-tenancy
    • How to reconcile multiple KTs with the same name in different namespaces? Namespace prefix with “CreateTopicPolicy”?
  • RBAC reconciliation in watched namespaces
  • Coordination with the User Operator


13 of 14

Race conditions

  • Multiple KafkaTopic resources created with the same topicName
    • The oldest resource is reconciled; the others get a “ResourceConflict” error
  • Applications automatically creating their own topics
    • Set “auto.create.topics.enable=false” and code your applications to wait for topic creation
  • The operator reverts Cruise Control’s dynamic throttling configs
    • These configs are used while rebalancing and should be ignored by the operator (working on a fix, workaround available)
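
For the automatic topic creation case, the broker setting can be sketched as a fragment of the Kafka custom resource. The cluster name is hypothetical and the rest of the spec is omitted.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster                        # hypothetical cluster name
spec:
  kafka:
    # listeners, storage, etc. omitted in this sketch
    config:
      auto.create.topics.enable: false    # topics are created only through KafkaTopic resources
```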


14 of 14

Thank you
