Introduction to Kafka Clusters

Apache Kafka is a distributed system. It partitions the messages of a topic across Kafka servers and spreads consumption over a cluster of consumer machines, while preserving message order within each partition. A Kafka cluster can grow elastically and transparently, without downtime.

 

Components of Kafka cluster

  1. Broker

    • Kafka runs as a cluster of one or more servers, each of which is called a broker.

    • Topics are created within the context of broker processes.

    • In a Kafka cluster, each broker plays a dual role: it acts as the leader for some partitions and as a follower for others. This helps balance load across the cluster.

  2. Topic

    • Kafka maintains feeds of messages in categories. The category or feed name to which messages are published is called a topic.

    • A Kafka cluster maintains a partitioned log for each topic.

    • Kafka topics are created on a Kafka broker acting as a Kafka server.

  3. Producer

    • Producers publish data to topics, choosing which partition within the topic each message goes to.

  4. Consumer

    • Consumers are applications or processes that subscribe to topics and read the messages published to them.

  5. ZooKeeper

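    • ZooKeeper serves as the coordination interface between the Kafka brokers and consumers:

      • brokers get cluster state information from it, and

      • consumers use it to track message offsets.

    • These notes describe the ZooKeeper-based setup. Newer Kafka releases keep consumer offsets in Kafka itself and can run without ZooKeeper.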
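
The relationship between topics, partitions, producers and consumer offsets can be sketched with a toy in-memory model. This is only an illustration, not the Kafka client API: `ToyTopic`, `ToyConsumer`, the topic name `orders` and the key `customer-a` are made up for the example, and `zlib.crc32` stands in for whatever hash a real client uses to pick a partition.

```python
import zlib
from collections import defaultdict


class ToyTopic:
    """In-memory stand-in for a topic: one append-only list per partition."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Stand-in for a key-hash partitioner (assumption: real clients use their own hash).
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def publish(self, key, value):
        """Append to the partition chosen from the key; return (partition, offset)."""
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1


class ToyConsumer:
    """Reads a partition in order and remembers the next offset to read."""

    def __init__(self, topic):
        self.topic = topic
        self.offsets = defaultdict(int)  # partition -> next offset to read

    def poll(self, partition):
        log = self.topic.partitions[partition]
        start = self.offsets[partition]
        self.offsets[partition] = len(log)  # advance past everything returned
        return log[start:]


topic = ToyTopic("orders", num_partitions=3)
for i in range(4):
    topic.publish("customer-a", f"event-{i}")

consumer = ToyConsumer(topic)
p = topic.partition_for("customer-a")
values = [value for _, value in consumer.poll(p)]
print(values)
```

Because every message with the same key lands in the same partition, the consumer sees those messages in the order they were published. Ordering across different partitions is not guaranteed, which is what "per-partition ordering" means.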

We have already seen Kafka terminology such as Topic, Partition, Broker, Producer and Consumer in detail.
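
The dual leader and follower role of a broker can be illustrated with a simple round-robin assignment. This is a deliberately simplified sketch: `assign_replicas` is a hypothetical helper, and Kafka's real replica placement follows additional rules that are not modelled here.

```python
from collections import Counter


def assign_replicas(broker_ids, num_partitions, replication_factor):
    """Return {partition: [leader, follower, ...]} using plain round-robin."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")
    n = len(broker_ids)
    return {
        p: [broker_ids[(p + r) % n] for r in range(replication_factor)]
        for p in range(num_partitions)
    }


assignment = assign_replicas([0, 1, 2], num_partitions=6, replication_factor=2)

# The first replica in each list is treated as the leader, the rest as followers.
leaders = Counter(replicas[0] for replicas in assignment.values())
followers = Counter(b for replicas in assignment.values() for b in replicas[1:])
print(leaders)
print(followers)
```

With three brokers and six partitions, each broker ends up leading two partitions and following two others, so the read and write load is spread evenly.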

ZooKeeper is a coordination service that allows distributed processes to coordinate with each other through a shared hierarchical namespace of data registers (znodes). You can read more about ZooKeeper in the previous notes.
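
To make the idea of a hierarchical namespace of data registers more concrete, here is a toy model of such a tree. It is not the ZooKeeper client API: `ToyZNodeTree` is hypothetical, and the `host-a:9092` style payloads are placeholders (real broker registrations hold richer data). The `/brokers/ids` path mirrors where ZooKeeper-based Kafka registers its brokers.

```python
class ToyZNodeTree:
    """Toy hierarchical namespace: slash-separated paths mapped to small byte payloads."""

    def __init__(self):
        self.nodes = {"/": b""}

    def create(self, path, data=b""):
        # A node can only be created under an existing parent.
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError(f"parent {parent} does not exist")
        if path in self.nodes:
            raise KeyError(f"{path} already exists")
        self.nodes[path] = data

    def get(self, path):
        return self.nodes[path]

    def children(self, path):
        # Direct children only: names under the prefix that contain no further slash.
        prefix = path.rstrip("/") + "/"
        return sorted(
            p[len(prefix):]
            for p in self.nodes
            if p != "/" and p.startswith(prefix) and "/" not in p[len(prefix):]
        )


tree = ToyZNodeTree()
tree.create("/brokers")
tree.create("/brokers/ids")
tree.create("/brokers/ids/0", b"host-a:9092")  # placeholder payload
tree.create("/brokers/ids/1", b"host-b:9092")  # placeholder payload
print(tree.children("/brokers/ids"))
```

Listing the children of `/brokers/ids` is, in spirit, how a process can discover which brokers are currently registered.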

 

Types of Kafka clusters

  1. Single node, single broker cluster

  2. Single node, multiple broker cluster

  3. Multiple node, multiple broker cluster

All the names are self-explanatory.

A single node, single broker cluster is the simplest type: one node runs one broker. All producers and consumers communicate with that single Kafka broker, and ZooKeeper coordinates with the broker and the consumers.

In a single node, multiple broker cluster, several brokers run on the same node, and producers and consumers can communicate with any of them. ZooKeeper coordinates with the Kafka brokers and the consumers.
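
When several brokers share one node, each broker needs its own id, its own port and its own log directory, while all of them point at the same ZooKeeper. The sketch below generates such per-broker settings. The keys `broker.id`, `listeners`, `log.dirs` and `zookeeper.connect` are standard broker settings; the helper `broker_properties`, the file names, the ports and the `/tmp` paths are assumptions chosen for illustration.

```python
def broker_properties(broker_id, port, zookeeper_connect="localhost:2181"):
    """Return per-broker settings as properties-file text."""
    settings = {
        "broker.id": broker_id,                          # must be unique per broker
        "listeners": f"PLAINTEXT://localhost:{port}",    # one port per broker on the node
        "log.dirs": f"/tmp/kafka-logs-{broker_id}",      # separate data directory per broker
        "zookeeper.connect": zookeeper_connect,          # shared by all brokers
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items())


# Three brokers on the same node, on ports 9092, 9093 and 9094 (illustrative values).
configs = {f"server-{i}.properties": broker_properties(i, 9092 + i) for i in range(3)}
for filename, text in configs.items():
    print(f"# {filename}\n{text}\n")
```

Each generated text could be saved as its own properties file and used to start one broker process.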

In a multiple node, multiple broker cluster, brokers run on several nodes, and producers and consumers can communicate with any of them. ZooKeeper coordinates with the Kafka brokers and the consumers. Note that in a multi-node cluster, Kafka must be installed on each node, and the brokers on all nodes must connect to the same ZooKeeper.
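
The requirement that all brokers connect to the same ZooKeeper can be checked with a small script before starting the cluster. This is a rough sketch: `parse_properties` and `check_cluster` are hypothetical helpers, the `example.internal` host names are placeholders, and comparing `zookeeper.connect` as a plain string is a simplification (the same ensemble written in a different order would be flagged).

```python
def parse_properties(text):
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props


def check_cluster(configs):
    """Return a list of problems found across the brokers' settings."""
    problems = []
    ids = [c.get("broker.id") for c in configs]
    if len(set(ids)) != len(ids):
        problems.append("broker.id values are not unique")
    if len({c.get("zookeeper.connect") for c in configs}) > 1:
        problems.append("brokers point at different ZooKeeper addresses")
    return problems


node_a = parse_properties("broker.id=0\nzookeeper.connect=zk1.example.internal:2181")
node_b = parse_properties("broker.id=1\nzookeeper.connect=zk1.example.internal:2181")
node_c = parse_properties("broker.id=1\nzookeeper.connect=zk2.example.internal:2181")

print(check_cluster([node_a, node_b]))
print(check_cluster([node_a, node_b, node_c]))
```

The first call reports no problems. The second reports both a duplicate `broker.id` and a mismatched ZooKeeper address.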
