Kafka replication was introduced in Kafka 0.8.
Kafka replication feature guarantees that the message will be published and consumed even in the case of broker failure.
Both producers and consumers are replication-aware.
In replication, each partition of a message has n replicas and can afford n-1 failures to guarantee message delivery.
One replica acts as the lead replica for the rest of the replicas.
-
Zookeeper keeps the information about the lead replica and the current follower in-sync replicas (ISR).
-
The lead replica maintains the list of all in-sync follower replicas.
Each replica stores its part of the message in local logs and offsets, and is periodically synced to the disk. This ensures that either a message is written to all the replicas or to none of them.
Replication modes
Synchronous replication
-
A producer first identifies the lead replica from ZooKeeper and publishes the message.
-
As soon as the message is published, it is written to the log of the lead replica and all the followers of the lead start pulling the message; by using a single channel, the order of messages is ensured.
-
Each follower replica sends an acknowledgement to the lead replica once the message is written to its respective logs.
-
Once replications are complete and all expected acknowledgements are received, the lead replica sends an acknowledgement to the producer.
-
On the consumer’s side, all the pulling of messages is done from the lead replica.
Asynchronous replication
-
As soon as a lead replica writes the message to its local log, it sends the acknowledgement to the message client and does not wait for acknowledgements from follower replicas.
-
But, as a downside, this mode does not ensure message delivery in case of a broker failure.
Failure Scenarios
-
Replication guarantees that any successfully published message will not be lost and will be consumed, even in the case of broker failures.
-
If any of the follower in-sync replicas fail, the leader drops the failed follower from its ISR list.
-
Whenever the failed follower comes back, it first truncates its log to the last checkpoint and then starts to catch up with all messages from the leader, starting from the checkpoint.
-
When fully synced with the leader, the leader adds it back to the current ISR list.
-
If the lead replica fails, a message partition is resent by the producer to the new lead broker.
-
The process of choosing the new lead replica involves all the followers’ ISRs registering themselves with Zookeeper.
-
The very first registered replica becomes the new lead replica.
-
Rest of the registered replicas become the followers of the newly elected leader.
-
Each replica registers a listener in Zookeeper so that it will be informed of any leader change.
- heartin's blog
- Log in or register to post comments
Recent comments