About
Distributed workers are stateless.
Their state is stored in internal Kafka topics:
- connector and task configurations,
- offsets,
- and status.
Management
Creation
Connect creates these topics for you, but you can also create them yourself:
# config.storage.topic=connect-configs
$ bin/kafka-topics --create --zookeeper localhost:2181 --topic connect-configs --replication-factor 3 --partitions 1 --config cleanup.policy=compact
# offset.storage.topic=connect-offsets
$ bin/kafka-topics --create --zookeeper localhost:2181 --topic connect-offsets --replication-factor 3 --partitions 50 --config cleanup.policy=compact
# status.storage.topic=connect-status
$ bin/kafka-topics --create --zookeeper localhost:2181 --topic connect-status --replication-factor 3 --partitions 10 --config cleanup.policy=compact
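The three commands above differ only in topic name and partition count, so they can be generated in a loop. The following is a minimal sketch, not an official tool: the function name `create_connect_topics` and the `DRY_RUN` variable are hypothetical names introduced for this example, and the topic names and partition counts are simply copied from the commands above. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: print (or run) the three topic-creation commands in one loop.
# create_connect_topics and DRY_RUN are hypothetical names for this example.
create_connect_topics() {
  zookeeper="${1:-localhost:2181}"
  # Each entry is <topic name>:<partition count>, as in the commands above.
  for spec in connect-configs:1 connect-offsets:50 connect-status:10; do
    topic="${spec%%:*}"
    partitions="${spec##*:}"
    cmd="bin/kafka-topics --create --zookeeper $zookeeper --topic $topic --replication-factor 3 --partitions $partitions --config cleanup.policy=compact"
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "$cmd"   # dry run: only show what would be executed
    else
      $cmd          # run the command against the cluster
    fi
  done
}

# Dry run by default: prints the three commands without touching the cluster.
create_connect_topics localhost:2181
```

Setting `DRY_RUN=0` before the call executes the commands instead of printing them.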
List them
kafka-topics --list --zookeeper localhost:2181 | grep -i connect
connect-configs
connect-offsets
connect-status
Query them
kafka-console-consumer \
--bootstrap-server localhost:9092 \
--from-beginning \
--property print.key=true \
--topic connect-offsets
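With print.key=true, the console consumer prints each record as the key and the value separated by a tab (the default separator). To look at a single connector, the output can be filtered on the key. The sketch below assumes that offset keys are JSON arrays starting with the connector name, such as `["my-connector",{...}]`. Check this against your own output before relying on it. The function name `offsets_for` is hypothetical.

```shell
# offsets_for: keep only the records whose key names the given connector.
# Reads "key<TAB>value" lines on stdin, as printed with print.key=true.
# Assumption: the key starts with ["<connector-name>",
offsets_for() {
  connector="$1"
  # Compare the prefix against the key only (first tab-separated field),
  # so a connector name appearing in a value does not match.
  awk -F '\t' -v prefix="[\"$connector\"," 'index($1, prefix) == 1'
}
```

It is meant to be placed at the end of the pipeline, for example by appending `| offsets_for my-connector` to the consumer command above.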
Persistence
The offsets, status, and configurations are written to the topics using the converters specified through the following required properties:
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
Most users will want to use the JSON converter without schemas. Offset and configuration data stored in this format is never visible outside of Connect.
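To verify that a worker configuration file sets all four properties, a small check like the following can help. It is a sketch only: the function name `check_internal_converters` is hypothetical, and it assumes each property is written as `name=value` at the start of a line, with no spaces around the equals sign.

```shell
# check_internal_converters: list which of the four properties are missing
# from the worker properties file given as $1.
# Returns 0 when all are present, 1 otherwise.
check_internal_converters() {
  file="$1"
  missing=0
  for prop in internal.key.converter internal.value.converter \
              internal.key.converter.schemas.enable \
              internal.value.converter.schemas.enable; do
    # Anchoring on "^name=" keeps internal.key.converter from matching
    # the longer internal.key.converter.schemas.enable line.
    if ! grep -q "^${prop}=" "$file"; then
      echo "missing: $prop"
      missing=1
    fi
  done
  return "$missing"
}
```

It takes the path of the worker properties file as its only argument, for example `check_internal_converters config/connect-distributed.properties`.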