About

The distributed workers are stateless.

Their data (connector configurations, offsets, and statuses) is stored in internal Kafka topics.

Management

Creation

Connect creates these topics for you, but you can also create them yourself:

# config.storage.topic=connect-configs
$ bin/kafka-topics --create --zookeeper localhost:2181 --topic connect-configs --replication-factor 3 --partitions 1 --config cleanup.policy=compact

# offset.storage.topic=connect-offsets
$ bin/kafka-topics --create --zookeeper localhost:2181 --topic connect-offsets --replication-factor 3 --partitions 50 --config cleanup.policy=compact

# status.storage.topic=connect-status
$ bin/kafka-topics --create --zookeeper localhost:2181 --topic connect-status --replication-factor 3 --partitions 10 --config cleanup.policy=compact
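
The commented property names above are the worker settings that point Connect at each topic. As a minimal sketch, the snippet below writes them to a scratch file so they can be reviewed and merged into the worker configuration by hand. The `FRAGMENT` variable and its temporary file are hypothetical and only serve the illustration.

```shell
# Hypothetical scratch file; merge its lines into your real worker
# properties file by hand rather than overwriting that file.
FRAGMENT="${FRAGMENT:-$(mktemp)}"

# Topic names must match the topics created above.
cat > "$FRAGMENT" <<'EOF'
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
status.storage.topic=connect-status
EOF

# Print the three settings back as a quick check.
grep 'storage\.topic=' "$FRAGMENT"
```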

List them

kafka-topics --list --zookeeper localhost:2181 | grep -i connect
connect-configs
connect-offsets
connect-status
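
The listing can also be checked automatically. The sketch below assumes the three topic names used in the creation commands; `missing_connect_topics` is a hypothetical helper, not part of Kafka. It reads a topic list on standard input and prints every expected internal topic that is absent.

```shell
# Print each expected internal topic that is missing from the list on stdin.
# The expected names are the ones used in the creation commands.
missing_connect_topics() {
  topics="$(cat)"
  for t in connect-configs connect-offsets connect-status; do
    # -x matches whole lines only, -F treats the name as a fixed string
    printf '%s\n' "$topics" | grep -qxF "$t" || echo "$t"
  done
}

# Demo with a hand-written list in which one topic is absent.
printf 'connect-configs\nconnect-offsets\n' | missing_connect_topics
```

Against a real cluster, the output of `kafka-topics --list --zookeeper localhost:2181` can be piped into the function instead of the demo input. Empty output means all three topics exist.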

Query them

kafka-console-consumer \
--bootstrap-server localhost:9092 \
--from-beginning \
--property print.key=true \
--topic connect-offsets
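
With `print.key=true`, the console consumer prints each record as a key and a value separated by a tab. The sketch below keeps only the records of one connector. It assumes the JSON converter without schemas, so that each key is a JSON array starting with the connector name; `filter_connector_offsets` is a hypothetical helper, and the sample records are made up for illustration.

```shell
# Keep only the records whose key starts with ["<connector-name>",
# Input: one "key<TAB>value" record per line, as printed by the consumer.
filter_connector_offsets() {
  awk -F '\t' -v c="$1" 'index($1, "[\"" c "\",") == 1'
}

# Demo with two made-up records; only the first one is printed.
printf '%s\t%s\n' \
  '["file-source",{"filename":"a.txt"}]' '{"position":10}' \
  '["other-source",{"filename":"b.txt"}]' '{"position":20}' |
  filter_connector_offsets file-source
```

To use it on live data, pipe the `kafka-console-consumer` command above into `filter_connector_offsets` followed by the connector name.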

Persistence

The offsets, statuses, and configurations are written to the topics using the converters specified by the following required properties:

internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false

Most users should use the JSON converter without schemas, because offset and configuration data in this format is never visible outside of Connect.