Kafka - Stream Application

About

What is a stream? In Kafka, a stream is also referred to as a pipe, a message queue, or event processing.

The stream API

The Kafka cluster stores streams of records in categories called topics.

Management

Configuration

Application Id

Application Id (application.id): Each stream processing application must have an id that is unique within the Kafka cluster.

This id is used in the following places to isolate resources used by the application from others:

As the default Kafka consumer and producer client.id prefix
As the Kafka consumer group.id for coordination
As the name of the sub-directory in the state directory (cf. state.dir, the directory location for state stores)
As the prefix of internal Kafka topic names
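
The list above can be sketched in code. The snippet below is a minimal illustration, not the library's implementation: it builds the java.util.Properties object that would typically be handed to a Kafka Streams application and derives the resource names from application.id. The class name StreamsAppConfig, its helper methods, and the sample values are hypothetical; only the keys application.id, bootstrap.servers and state.dir are real configuration names. Plain string keys are used so that the sketch compiles without the Kafka jars, and the "-" separator after the topic prefix is an assumption about the naming convention.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

public class StreamsAppConfig {

    // Builds a minimal configuration for a stream application.
    // The three keys are real Kafka Streams configuration names.
    public static Properties build(String applicationId, String bootstrapServers, String stateDir) {
        Properties props = new Properties();
        props.put("application.id", applicationId);
        props.put("bootstrap.servers", bootstrapServers);
        props.put("state.dir", stateDir);
        return props;
    }

    // The consumer group.id used for coordination is the application id itself.
    public static String consumerGroupId(Properties props) {
        return props.getProperty("application.id");
    }

    // State stores live in a sub-directory of state.dir named after the application id.
    public static Path stateSubDirectory(Properties props) {
        return Paths.get(props.getProperty("state.dir"), props.getProperty("application.id"));
    }

    // Internal topics are prefixed with the application id.
    // Assumption: a "-" separates the prefix from the rest of the topic name.
    public static String internalTopicPrefix(Properties props) {
        return props.getProperty("application.id") + "-";
    }

    public static void main(String[] args) {
        // Sample values, chosen for illustration only.
        Properties props = build("my-app-v1.0.0", "localhost:9092", "/tmp/kafka-streams");
        System.out.println("group.id              = " + consumerGroupId(props));
        System.out.println("state sub-directory   = " + stateSubDirectory(props));
        System.out.println("internal topic prefix = " + internalTopicPrefix(props));
    }
}
```

Because every derived name comes from the same id, two applications that share an application.id would also share a consumer group, a state directory and internal topics, which is why the id has to be unique.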

Update

When an application is updated, it is recommended to change application.id unless it is safe to let the updated application re-use the existing data in internal topics and state stores. One pattern could be to embed version information within application.id, e.g., my-app-v1.0.0 vs. my-app-v1.0.2.
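
The versioning pattern could look like the following sketch. The class VersionedApplicationId and its method are hypothetical helpers, and the "-v" separator simply mirrors the my-app-v1.0.0 example above; any scheme that yields a distinct id per incompatible release would serve the same purpose.

```java
public class VersionedApplicationId {

    // Joins a base name and a version into an application.id such as "my-app-v1.0.2".
    public static String of(String baseName, String version) {
        if (baseName == null || baseName.isEmpty()) {
            throw new IllegalArgumentException("baseName must not be empty");
        }
        if (version == null || version.isEmpty()) {
            throw new IllegalArgumentException("version must not be empty");
        }
        return baseName + "-v" + version;
    }

    public static void main(String[] args) {
        // Two releases get two distinct ids, so the newer one starts with
        // fresh internal topics and state stores instead of re-using old data.
        System.out.println(of("my-app", "1.0.0"));
        System.out.println(of("my-app", "1.0.2"));
    }
}
```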

Concept

https://docs.confluent.io/3.0.0/streams/concepts.html#streams-concepts

A stream processor is a node in the processor topology that represents a single processing step.
A stream is an unbounded, continuously updating data set.
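
To make the two definitions concrete, here is a plain-Java sketch of a tiny processor topology. It does not use the Kafka Streams API: the Processor interface, the class TopologySketch and the two processing steps are hypothetical, and a finite list stands in for what would be an unbounded stream in practice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Consumer;

public class TopologySketch {

    // One node of the topology: a single processing step that receives a record
    // and may forward zero or more records to the next node.
    interface Processor<I, O> {
        void process(I record, Consumer<O> downstream);
    }

    // Wires the topology: source -> dropEmpty -> upperCase -> sink.
    public static List<String> run(Iterable<String> source) {
        List<String> sink = new ArrayList<>();

        // Step 1: forward only non-empty records.
        Processor<String, String> dropEmpty = (record, next) -> {
            if (!record.isEmpty()) {
                next.accept(record);
            }
        };
        // Step 2: transform each record and forward it.
        Processor<String, String> upperCase =
                (record, next) -> next.accept(record.toUpperCase(Locale.ROOT));

        // Records are handled one at a time, as they arrive from the source.
        for (String record : source) {
            dropEmpty.process(record, kept -> upperCase.process(kept, sink::add));
        }
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "", "b")));
    }
}
```

Each record flows through the chain independently, which is what lets the same wiring work on a data set that never ends.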

API

Kafka Streams provides two APIs:

A high-level Kafka Streams DSL that provides common data transformation operations in a functional programming style such as map and filter operations.
A low-level Processor API that lets you add and connect processors as well as interact directly with state stores.

Javadoc

Jar

Group Id	Artifact Id	Description / why needed
org.apache.kafka	kafka-streams	Base library for Kafka Streams. Required.
org.apache.kafka	kafka-clients	Kafka client library. Contains built-in serializers/deserializers. Required.
org.apache.avro	avro	Apache Avro library. Optional (needed only when using Avro).
io.confluent	kafka-avro-serializer	Confluent’s Avro serializer/deserializer. Optional (needed only when using Avro).

Code / Demo

Code examples that demonstrate how to implement real-time processing applications using Kafka Streams. See readme.
Demo

Documentation / Reference

Stream DSL API