Aditya
3 min read · Apr 10, 2021

Kafka best practices edition → How to design a Kafka message key, and why is it important in determining application performance?

What is a Kafka Message: A record or unit of data within Kafka. Each message has a key and a value, and optionally headers. The key is commonly used for data about the message, and the value is the body of the message.

Message Key → Can be null or contain a value that says something about the data, like a user/email id or a hash of the message, etc.

Message Value → The actual data that needs to be sent to Kafka.
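
To make the key/value split concrete, here is a minimal Python sketch of a message's shape. The `Message` class and its field names are illustrative assumptions, not part of any Kafka client API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Message:
    """Illustrative model of a Kafka message (hypothetical, not a client class)."""
    value: bytes                  # the actual data sent to Kafka
    key: Optional[bytes] = None   # may be null, or say something about the data
    headers: List[Tuple[str, bytes]] = field(default_factory=list)  # optional metadata


# A keyed message: the key identifies the user the event belongs to.
keyed = Message(key=b"abc@medium.com", value=b'{"event": "login"}')

# An unkeyed message: the key is simply left as None.
unkeyed = Message(value=b'{"event": "heartbeat"}')
```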

Why the “Key” value is important →

  1. Deduplication: If two messages carry the same key and the topic uses log compaction (cleanup.policy=compact, with log.cleaner.enable set to true on the broker), the log cleaner eventually removes the older entry and keeps only the latest value for that key. This helps to achieve deduplication, although consumers can still see both records until compaction has run.
  2. Ordered consumption: If the consumer must process messages in order, choose a key that identifies the entity whose messages must stay ordered. Say the key is an email id (like abc@medium.com): all messages produced for that email id will land in the same partition. Because a partition is read by only one consumer within a consumer group, that consumer sees those messages in the order they were written. If the key is null or a unique value like a counter, messages will be distributed across partitions, in which case consumers cannot vouch for ordering.
  3. Data Skewing: If we use customer email id as key, then all data from the same customer will go to the same…
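
The effects described in the list above can be simulated in a few lines of plain Python, with no broker involved. This is only a sketch under stated assumptions: `partition_for` and `compact` are hypothetical helpers, the partition count is made up, and CRC32 stands in for the murmur2 hash that Kafka's default partitioner applies to the serialized key.

```python
import zlib
from collections import Counter, defaultdict

NUM_PARTITIONS = 3  # assumed partition count for this sketch


def partition_for(key: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition (CRC32 is a stand-in for Kafka's murmur2)."""
    return zlib.crc32(key) % num_partitions


def compact(log):
    """Keep only the latest value per key, as log compaction eventually does."""
    latest = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)  # a newer offset overwrites the older one
    survivors = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (_offset, value) in survivors]


events = [
    (b"abc@medium.com", "login"),
    (b"xyz@medium.com", "login"),
    (b"abc@medium.com", "purchase"),
    (b"abc@medium.com", "logout"),
]

# Ordering: the same key always maps to the same partition,
# so one user's events stay in produce order within that partition.
partitions = defaultdict(list)
for key, value in events:
    partitions[partition_for(key)].append((key, value))

# Skew: a key with many messages makes its partition heavier than the rest.
load = Counter(partition_for(key) for key, _ in events)

# Deduplication: after compaction only the latest value per key remains.
compacted = compact(events)

print(dict(load))
print(compacted)
```

Here the three events for abc@medium.com share one partition and keep their order, that partition carries most of the load, and compaction leaves one record per key.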

Written by Aditya

Principal data engineer → Distributed Threat hunting security platform | aws certified solutions architect | gssp-java | Chicago-IL
