Multi-Version Concurrency Control (MVCC) based design (a hot question in distributed systems interviews)

Aditya
2 min read · May 17, 2023

What is concurrency → Ability of a program to do multiple things at once.

Why concurrency → To make a program run faster, a programmer designs it as a set of concurrent parts (i.e. processes), so that each part can run independently of the others.

What is the issue with concurrent access → When multiple processes try to access a single data item (like database rows or files, etc.), one process may read the data while another process is still writing it, and so see a partial or inconsistent state.
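The issue above can be reproduced with a small, deterministic sketch. It is only an illustration: the `accounts` dictionary and the `transfer_steps` helper are hypothetical names, and a generator stands in for a second process so that the interleaving is fixed rather than left to a scheduler.

```python
# Shared data item that two "processes" touch. Names are hypothetical.
accounts = {"a": 100, "b": 0}


def transfer_steps(amount):
    """Move `amount` from a to b in two separate writes.

    Yielding between the writes lets us interleave a reader at the
    exact point where the data is only half updated.
    """
    accounts["a"] -= amount
    yield "debited"
    accounts["b"] += amount
    yield "credited"


writer = transfer_steps(30)

next(writer)                              # writer has debited, not yet credited
mid_write_total = sum(accounts.values())  # reader runs while the write is in flight

next(writer)                              # writer finishes
final_total = sum(accounts.values())      # reader runs after the write completes

print(mid_write_total, final_total)
```

The reader that runs in the middle of the write sees money "missing" from the system, even though the transfer itself is correct once it completes.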

This led to the birth of isolation levels, locking, and the trade-offs around them, which have been actively used in designs for the past two decades. https://en.m.wikipedia.org/wiki/Isolation_(database_systems)

What is multiversion concurrency control (MVCC) → In a nutshell, MVCC is a design pattern that lets multiple processes access a data item without blocking each other and without sacrificing data consistency.

How do we control concurrency using multiple versions

  • MVCC works by maintaining multiple versions of each data item that needs concurrent access.
  • Each process operates on a specific version, representing a consistent state of the data item at a specific point in time.
  • When a transaction modifies a data item, it creates a new version of that item, rather than overwriting the existing one. This…
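The points above can be sketched as a toy in-memory store. This is a minimal illustration under stated assumptions, not how any particular database implements MVCC: the `VersionedStore` and `Version` names are hypothetical, versions are ordered by a simple logical counter, and the sketch is single-threaded with no write conflict detection and no cleanup of old versions.

```python
import itertools
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Version:
    commit_ts: int  # logical timestamp at which this version became visible
    value: Any


class VersionedStore:
    """Toy multi-version store (hypothetical; for illustration only)."""

    def __init__(self) -> None:
        self._clock = itertools.count(1)  # assumed logical clock
        self._versions: Dict[str, List[Version]] = {}
        self._last_ts = 0

    def write(self, key: str, value: Any) -> int:
        """Append a new version instead of overwriting the existing one."""
        ts = next(self._clock)
        self._versions.setdefault(key, []).append(Version(ts, value))
        self._last_ts = ts
        return ts

    def snapshot(self) -> int:
        """Return the point in time a new reader should see."""
        return self._last_ts

    def read(self, key: str, snapshot_ts: int) -> Optional[Any]:
        """Return the newest version visible at snapshot_ts, if any."""
        for version in reversed(self._versions.get(key, [])):
            if version.commit_ts <= snapshot_ts:
                return version.value
        return None


store = VersionedStore()
store.write("balance", 100)

reader_snapshot = store.snapshot()  # a reader starts here
store.write("balance", 70)          # a writer adds a newer version meanwhile

old_view = store.read("balance", reader_snapshot)   # reader keeps its version
new_view = store.read("balance", store.snapshot())  # a later reader sees the update

print(old_view, new_view)
```

The reader that took its snapshot before the second write keeps seeing the older value, while a reader that starts afterwards sees the new one. Neither had to wait for the writer.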


Aditya

Principal data engineer → Distributed threat hunting security platform | AWS Certified Solutions Architect | GSSP-Java | Chicago, IL