Apache Kafka is a distributed, real-time streaming platform.
Three capabilities:
- Publish and subscribe to streams of records (similar to a message queue).
- Store streams of records with fault tolerance.
- Process streams of records in real time.
Application areas:
- Building real-time streaming data pipelines.
- Building real-time streaming applications.
Concepts:
- Kafka runs as a cluster, which can span multiple data centers.
- The cluster stores streams of records in categories called topics.
Each record consists of:
- key
- value
- timestamp
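The record structure above can be pictured as a small immutable value type. This is only an illustrative sketch, not Kafka's own client class: the `Record` name, the `timestamp_ms` field name, and the choice of bytes for key and value are assumptions made for the example.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical model of a record: key, value, and timestamp."""
    key: Optional[bytes]   # may be absent; assumed optional for this sketch
    value: bytes           # the payload
    # Assumed unit: milliseconds since the Unix epoch, filled in at creation.
    timestamp_ms: int = field(default_factory=lambda: int(time.time() * 1000))


record = Record(key=b"user-42", value=b'{"action": "login"}')
print(record.key, record.value, record.timestamp_ms)
```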
Core APIs
- Producer API
- Consumer API
- Streams API
- Connector API
Clients and servers communicate over TCP.
Figure source: https://kafka.apache.org/intro
A topic is a category or feed name to which records are published.
Kafka topics are always multi-subscriber: a topic can have zero, one, or many consumers.
For each topic, the Kafka cluster maintains a partitioned log.
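To make the idea of a partitioned log concrete, here is a minimal in-memory sketch. It assumes each partition is an append-only sequence in which every record gets a sequential position (an offset). The `PartitionedLog` class and its method names are hypothetical and do not correspond to any Kafka API.

```python
class PartitionedLog:
    """Toy model of one topic: a fixed number of append-only partitions."""

    def __init__(self, num_partitions):
        # One independent, ordered list per partition.
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, record):
        """Append a record to a partition and return its offset."""
        log = self.partitions[partition]
        log.append(record)
        return len(log) - 1

    def read(self, partition, offset):
        """Return all records in a partition starting at the given offset."""
        return self.partitions[partition][offset:]


topic = PartitionedLog(num_partitions=2)
topic.append(0, "first")
topic.append(0, "second")
topic.append(1, "other")
print(topic.read(0, 1))  # records in partition 0 from offset 1 onward
```

Ordering in this model holds only within a single partition, not across the topic as a whole.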
Distribution
The partitions of the log are distributed over the servers in a Kafka cluster.
Each server in the Kafka cluster can handle data and requests for a share of the partitions. Each partition is replicated for fault tolerance.
Each partition has:
- one "leader", which handles read and write requests for the partition
- zero or more "followers", which replicate the leader
Geo Replication
Kafka MirrorMaker provides geo-replication support for clusters. With MirrorMaker, messages are replicated across multiple data centers or cloud regions.
Producers
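Producers publish data to the topics of their choice. The producer is responsible for choosing which partition within the topic each record is assigned to. This can be done round-robin to balance load, or with a partition function based on the record key.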
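A partition choice of this kind might look like the sketch below: records without a key are spread round-robin, and records with a key are hashed so that the same key always lands in the same partition. The `SketchPartitioner` class is hypothetical, and `zlib.crc32` is only a stand-in hash chosen because it is deterministic and in the standard library; real Kafka clients ship their own partitioning logic.

```python
import itertools
import zlib


class SketchPartitioner:
    """Illustrative partition chooser, not the Kafka client's partitioner."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        # Endless 0, 1, ..., n-1, 0, 1, ... sequence for keyless records.
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition_for(self, key):
        if key is None:
            # No key: balance load by rotating through the partitions.
            return next(self._round_robin)
        # Keyed record: stable hash of the key modulo the partition count.
        return zlib.crc32(key) % self.num_partitions


partitioner = SketchPartitioner(num_partitions=3)
print(partitioner.partition_for(b"user-42"))  # same key, same partition each time
print(partitioner.partition_for(None))        # keyless records rotate
```

Hashing by key keeps all records for one key in a single partition, which preserves their relative order.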
Consumers
Figure source: https://kafka.apache.org/intro
Kafka for Stream Processing
Kafka as a Storage System
Kafka as a Messaging System
Traditionally, messaging has two models: queuing and publish-subscribe. In a queue, a pool of consumers reads from a server and each record goes to one of them. In publish-subscribe, each record is broadcast to all consumers. The consumer group concept in Kafka generalizes these two models.
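One way to see how consumer groups cover both models is a small simulation. It assumes that every group receives every record (publish-subscribe across groups), while within a group each partition is owned by exactly one member, so each record reaches only one member of that group (queuing inside a group). The function names, group names, and the round-robin assignment are illustrative assumptions, not Kafka's actual assignment strategy.

```python
def assign_partitions(members, num_partitions):
    """Map each partition to one member of a group, round-robin."""
    return {p: members[p % len(members)] for p in range(num_partitions)}


def deliver(records, groups, num_partitions):
    """Simulate delivery of (partition, value) records to consumer groups.

    Returns {group_name: {member_name: [values received]}}.
    """
    delivered = {}
    for group, members in groups.items():
        owner = assign_partitions(members, num_partitions)
        per_member = {member: [] for member in members}
        for partition, value in records:
            # Within a group, only the partition's owner sees the record.
            per_member[owner[partition]].append(value)
        delivered[group] = per_member
    return delivered


records = [(0, "a"), (1, "b"), (0, "c"), (1, "d")]
groups = {"billing": ["billing-1", "billing-2"], "audit": ["audit-1"]}
result = deliver(records, groups, num_partitions=2)
print(result["billing"])  # the work is split between the two members
print(result["audit"])    # the single member receives every record
```

A single group with many members behaves like a queue, and many groups with one member each behave like publish-subscribe.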
Kafka Architecture
Figure source: https://kafka.apache.org/21/documentation/streams/architecture
How to configure a Kafka environment on Ubuntu?
Prerequisites
- Up-to-date package information on the system (apt update)
- Java installed (the system's default Java version)
Download and extract the binary distribution of Kafka
Create A Topic
Send Messages To Kafka
Using Kafka Consumer
How was it invented?
Kafka was originally developed at LinkedIn as a message queue.