Kafka best practices edition → How to design a Kafka message key, and why it matters for application performance
What is a Kafka Message: A record or unit of data within Kafka. Each message has a key and a value, and optionally headers. The key is commonly used for data about the message, and the value is the body of the message.
Message Key → Can be null or contain a value that says something about the data, such as a user id, an email id, or a hash of the message.
Message Value → The actual data that needs to be sent to Kafka.
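To make the shape of a message concrete, here is a minimal sketch in plain Python. The `Record` class is a hypothetical model written for this article, not a class from any Kafka client library, and the payloads and header names are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical model of a Kafka message (not a client-library class)."""
    value: Optional[bytes]                        # the body of the message
    key: Optional[bytes] = None                   # may be null
    headers: Tuple[Tuple[str, bytes], ...] = ()   # optional metadata pairs


# A keyed message: the key says something about the data (here, an email id).
keyed = Record(
    key=b"abc@medium.com",
    value=b'{"action": "login"}',
    headers=(("source", b"web"),),
)

# An unkeyed message: the key is simply left as null.
unkeyed = Record(value=b'{"action": "heartbeat"}')

print(keyed)
print(unkeyed)
```

Real clients send the key and value as bytes after serialization, which is why the sketch uses `bytes` for both.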
Why the message key is important →
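- Deduplication: If two messages have the same key, the older one is eventually removed by log compaction, provided the topic is configured with cleanup.policy=compact and the log cleaner is running (log.cleaner.enable, which is true by default). The compacted log then holds only the latest value for each key, which helps to achieve deduplication.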
- Ordered consumption: If the consumer must process messages in order, choose a key that identifies the entity whose messages must stay in order. Say the key is an email id (like abc@medium.com): all messages produced for that email id will land in the same partition, as long as the default partitioner is used and the number of partitions does not change. Because a partition is read by only one consumer within the same group id, that consumer sees those messages in the order they were written. If the key is null or a unique value such as a counter, messages are distributed across partitions, in which case consumers cannot guarantee ordering.
- Data Skewing: If we use customer email id as key, then all data from the same customer will go to the same…
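
The effects in the list above can be tried out without a broker. The following is a minimal sketch under stated assumptions, not Kafka's actual implementation: `choose_partition` and `compact` are hypothetical helpers, `zlib.crc32` stands in for the murmur2 hash that Kafka's Java client applies to the key, null keys are spread round-robin for simplicity, and every email id other than abc@medium.com is made up.

```python
import zlib
from collections import Counter
from itertools import count

_null_key_counter = count()  # used only to spread null-key records


def choose_partition(key, num_partitions):
    """Map a record key to a partition index.

    Assumption: zlib.crc32 stands in for the murmur2 hash used by Kafka's
    Java client, and null keys are spread round-robin for simplicity.
    """
    if key is None:
        return next(_null_key_counter) % num_partitions
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def compact(log):
    """Return the latest value per key from a list of (key, value) pairs."""
    latest = {}
    for key, value in log:
        latest[key] = value  # a later record overwrites an earlier one
    return latest


NUM_PARTITIONS = 3  # hypothetical partition count

# Ordering: every record for one key maps to a single partition.
same_key = {choose_partition("abc@medium.com", NUM_PARTITIONS) for _ in range(5)}
print("partitions used by one key:", same_key)

# Null keys: records are spread over all partitions, so order is not preserved.
null_key = {choose_partition(None, NUM_PARTITIONS) for _ in range(6)}
print("partitions used by null keys:", null_key)

# Skew: one very active customer loads a single partition.
traffic = ["heavy@example.com"] * 90 + [f"user{i}@example.com" for i in range(10)]
load = Counter(choose_partition(k, NUM_PARTITIONS) for k in traffic)
print("records per partition:", dict(load))

# Deduplication: only the newest value per key survives.
log = [("abc@medium.com", "v1"), ("xyz@medium.com", "v1"), ("abc@medium.com", "v2")]
print("after compaction:", compact(log))
```

The exact partition numbers depend on the hash function, so they will differ from what a real cluster assigns. What carries over is the pattern: one key always maps to one partition, and a single hot key concentrates load there.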