Welcome to Hello Engineer, your weekly guide to becoming a better software engineer! No fluff - pure engineering insights.
You can also check out: What is HTTP?
Don’t forget to check out the latest job openings at the end of the article!
The CAP theorem often confuses people in system design interviews, especially if they haven’t understood it clearly. But it’s an important concept that can really shape how you design your system.
In this post, we’ll explain what CAP theorem is, how it works, and the trade-offs involved — all in a simple way. This will help you understand it better and feel more confident in your next system design interview.
What is the CAP Theorem?
At a basic level, the CAP theorem says that a distributed system can guarantee at most two of these three properties at the same time:
Consistency – Every node has the same data. So if you update something, all future reads from any node will give you that latest value.
Availability – Every request gets a response, even if it’s not the most recent data. So the system stays responsive no matter what.
Partition Tolerance – The system keeps running even if there’s a break in communication between parts of the system (like a network issue).
Here’s the main idea that makes CAP theorem easier to understand in interviews: In real-world distributed systems, you always need partition tolerance. Network issues are bound to happen, and your system has to deal with them.
So in most cases, CAP comes down to one simple decision: When there’s a network issue, do you want your system to stay consistent or stay available?
Before we get to an example, let’s look at each of the three properties in a bit more detail.
3 Pillars of CAP
1. Consistency
Consistency means that every time you read data, you either get the most recent update — or an error if that’s not possible. All healthy nodes in the system will return the same data at the same time.
For example, if you write something to Node A, and then read from Node B, you’ll see that latest update right away.
This is really important for systems where accuracy is critical, like banking apps, where checking your balance must always show the most recent amount.
Just a heads-up — consistency in CAP theorem is not the same as the consistency in ACID databases.
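Yeah, it’s a bit confusing: in ACID, consistency means a transaction leaves the database in a valid state that satisfies its rules and constraints, while in CAP it means every node returns the same, most recent data.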
2. Availability
Availability means that every request, whether it’s a read or write, gets a response. But that response might not always have the latest data.
So even if some nodes are out of sync, the system still works and stays responsive.
This is super important for apps that need to be always up and running, like online shopping sites, where showing something is better than showing nothing.
3. Partition Tolerance
Partition Tolerance means the system keeps working even if some parts of it can’t talk to each other because of a network issue.
A network partition happens when a failure splits the system into two or more groups of servers that can’t communicate.
When this happens, the system has to choose between staying consistent or staying available.
Partition tolerance is a must for distributed systems because network problems are pretty common. A partition-tolerant system can still run smoothly, even when the network isn’t perfect.
Understand Through an Example
Let’s say you have a website with two servers, one in India and one in Europe.
Now imagine this:
User A is in India. They connect to the India server and update their display name on their profile.
That update is then sent to the Europe server so both servers stay in sync.
Later, User B (who’s in Europe) checks User A’s profile and sees the new, updated name.
Everything works smoothly because the servers can talk to each other and keep the data consistent.
But things change when there’s a network partition and the connection between the India and Europe servers goes down. Now we face an important decision:
When User B (in Europe) tries to view User A’s profile, what should we do?
Option A: Return an error, because we can’t be sure the data is the latest (this means choosing consistency).
Option B: Show the last known data, even if it might be a bit outdated (this means choosing availability).
This is the heart of the CAP theorem: when something goes wrong in the network, we have to pick between consistency and availability.
In this case, the choice is pretty obvious: it’s better to show User B the old name than to show an error. A slightly outdated profile is still useful — an error page isn’t.
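To make the two options concrete, here is a minimal Python sketch of the India/Europe example. The names ProfileServer, Unavailable, mode and link_up are hypothetical and made up for illustration. The sketch only models the read-side choice; a real system would detect partitions through timeouts and would also have to decide what to do with writes.

```python
class Unavailable(Exception):
    """Raised when a server refuses to answer rather than risk stale data."""


class ProfileServer:
    def __init__(self, region, mode):
        self.region = region
        self.mode = mode          # "CP" favors consistency, "AP" favors availability
        self.display_name = None  # last value this server knows about
        self.peer = None
        self.link_up = True       # False simulates a network partition

    def write(self, name):
        self.display_name = name
        # Replicate to the peer only while the network link is healthy.
        if self.link_up and self.peer is not None:
            self.peer.display_name = name

    def read(self):
        if self.link_up:
            return self.display_name
        if self.mode == "CP":
            # Option A: freshness cannot be confirmed, so return an error.
            raise Unavailable(f"{self.region}: cannot confirm latest data")
        # Option B: serve the last known value, which may be stale.
        return self.display_name


def demo(mode):
    india = ProfileServer("india", mode)
    europe = ProfileServer("europe", mode)
    india.peer, europe.peer = europe, india
    india.write("Alice")                      # replicated while the link is up
    india.link_up = europe.link_up = False    # the partition starts
    india.write("Alice K.")                   # Europe never receives this update
    try:
        return europe.read()
    except Unavailable:
        return "error"


print(demo("AP"))  # Europe serves the old name
print(demo("CP"))  # Europe refuses to answer
```

In AP mode User B sees the old name "Alice", and in CP mode User B gets an error, which is exactly the trade-off between Option B and Option A.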
Let’s look at a few more real-world examples where systems make similar choices.
Consistency Over Availability
Some systems just can’t afford to show outdated data. They need to stay consistent, even if that means being less available during issues.
If you prioritize consistency, your design might include:
Distributed Transactions: You make sure different parts of your system (like the database and cache) stay in sync using methods like two-phase commit. This keeps data accurate, but it also adds complexity and can slow things down for users.
Single-Node Setup: You can avoid consistency problems by using just one database. This limits how much your system can grow, but it keeps everything simple and reliable with one source of truth.
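The two-phase commit idea mentioned above can be sketched in a few lines of Python. This is a simplified, in-memory illustration only: Participant, can_commit and two_phase_commit are hypothetical names, and a real implementation would also need durable logs, timeouts and recovery when the coordinator crashes.

```python
class Participant:
    """One resource in the transaction, for example a database or a cache."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates whether the prepare step succeeds
        self.staged = None
        self.data = {}

    def prepare(self, key, value):
        # Phase 1: stage the change and vote yes or no.
        if not self.can_commit:
            return False
        self.staged = (key, value)
        return True

    def commit(self):
        # Phase 2 (all voted yes): make the staged change visible.
        key, value = self.staged
        self.data[key] = value
        self.staged = None

    def rollback(self):
        # Phase 2 (someone voted no): discard the staged change.
        self.staged = None


def two_phase_commit(participants, key, value):
    # Collect every vote before deciding, so no one commits early.
    votes = [p.prepare(key, value) for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


db, cache = Participant("db"), Participant("cache")
print(two_phase_commit([db, cache], "name", "Alice"))   # both apply the change

db2, broken = Participant("db"), Participant("cache", can_commit=False)
print(two_phase_commit([db2, broken], "name", "Bob"))   # nobody applies it
```

The extra round trip before anything becomes visible is where the added complexity and latency come from.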
Tech choices that support strong consistency:
Traditional databases like PostgreSQL or MySQL
Google Spanner
DynamoDB (when using strongly consistent reads)
Here are a few examples of systems that require consistency over availability:
Ticket Booking Systems: Let’s say User A books seat 6A on a flight. But because of a network issue, User B still sees it as available and books it too. Now two people have the same seat — that’s a big problem.
Online Shopping (Inventory): Imagine Amazon has just one toothbrush left. If the system shows it as available to multiple users during a network issue, it could be sold to more people than there are units in stock.
Financial Systems: Stock trading platforms need to show real-time data. If you see stale prices, you might place trades that don’t reflect the actual market — and that can cost people a lot of money.
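The ticket-booking case above can be simulated with a small Python sketch. It assumes two booking nodes that normally copy every booking to each other; BookingNode, BookingUnavailable and the require_peer flag are hypothetical names used only for this illustration.

```python
class BookingUnavailable(Exception):
    """Raised when a node refuses a booking it cannot confirm with its peer."""


class BookingNode:
    def __init__(self, name):
        self.name = name
        self.seats = {}       # seat number -> user who holds it
        self.peer = None
        self.link_up = True   # False simulates a network partition

    def book(self, seat, user, require_peer):
        if seat in self.seats:
            return False                      # already taken, as far as we know
        if not self.link_up:
            if require_peer:
                # Consistency first: refuse rather than risk a double booking.
                raise BookingUnavailable(f"{self.name}: cannot reach peer")
            self.seats[seat] = user           # availability first: local write only
            return True
        # Healthy network: record the booking on both nodes.
        self.seats[seat] = user
        self.peer.seats[seat] = user
        return True


def demo(require_peer):
    a, b = BookingNode("node-a"), BookingNode("node-b")
    a.peer, b.peer = b, a
    a.link_up = b.link_up = False             # the partition starts
    outcomes = []
    for node, user in ((a, "User A"), (b, "User B")):
        try:
            node.book("6A", user, require_peer)
            outcomes.append("booked")
        except BookingUnavailable:
            outcomes.append("rejected")
    # A seat is double-booked if the two nodes disagree about who holds it.
    conflicts = [s for s in a.seats if s in b.seats and a.seats[s] != b.seats[s]]
    return outcomes, conflicts


print(demo(require_peer=False))  # both bookings succeed, seat 6A is in conflict
print(demo(require_peer=True))   # both bookings are rejected, no conflict
```

With require_peer=False both users walk away with seat 6A. With require_peer=True nobody can book until the network heals, which is the price of staying consistent.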
Availability Over Consistency
Most systems can handle a bit of inconsistency and should focus on availability. In these cases, eventual consistency is good enough — meaning the system will become consistent over time, maybe after a few seconds or minutes.
If you prioritize availability, your design might include:
Multiple Replicas: You can have extra read replicas with asynchronous replication, which means they may be slightly behind but can still serve requests. This improves performance and keeps the system available — even if some data is a bit outdated.
Change Data Capture (CDC): This tracks changes in the main database and pushes updates to replicas, caches, or other systems in the background. It helps the main system stay available while the updates reach everywhere eventually.
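Asynchronous replication and change data capture share the same basic shape: the primary accepts the write right away and the change is delivered to other copies later. Here is a minimal in-memory Python sketch of that idea. Primary, Replica and ship_changes are hypothetical names, and a real CDC pipeline would read the database's own log rather than a Python queue.

```python
from collections import deque


class Primary:
    def __init__(self):
        self.data = {}
        self.change_log = deque()   # ordered stream of (key, value) changes

    def write(self, key, value):
        # The write succeeds immediately; delivery to replicas happens later.
        self.data[key] = value
        self.change_log.append((key, value))


class Replica:
    def __init__(self):
        self.data = {}

    def read(self, key):
        # May return stale or missing data until changes are shipped.
        return self.data.get(key)


def ship_changes(primary, replicas):
    """Background step: drain the log and apply each change in order."""
    while primary.change_log:
        key, value = primary.change_log.popleft()
        for replica in replicas:
            replica.data[key] = value


primary, replica = Primary(), Replica()
primary.write("hours", "9am-5pm")
print(replica.read("hours"))     # replica has not caught up yet
ship_changes(primary, [replica])
print(replica.read("hours"))     # replica now has the update
```

Between the write and the call to ship_changes, the replica is available but behind. That window is the inconsistency you accept in exchange for availability.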
Tech choices that support high availability:
Cassandra
DynamoDB (with its default eventually consistent reads, or with global tables across regions)
Redis clusters
Here are a few examples of systems that can favor availability over consistency:
Social Media: If User A changes their profile picture, it's okay if User B still sees the old one for a little while.
Streaming Platforms (like Netflix): If someone updates a movie’s description, it's not a big deal if some users still see the old version for a short time.
Review Sites (like Yelp): If a restaurant changes its hours, showing the older timing briefly is still better than showing nothing.
NOTE:
The main question to ask is: “Would it really hurt if users saw slightly outdated data for a bit?”
If the answer is yes, go with consistency. If the answer is no, availability is usually the better choice.
Different Levels of Consistency
When people talk about consistency in the CAP theorem, they usually mean strong consistency where every read shows the latest write. But there are other types of consistency too, and knowing them can help you design smarter systems depending on your needs.
Strong Consistency: Every read gives you the most recent write. This gives the strongest guarantee but usually costs the most in latency, because nodes have to coordinate before they answer. It's needed for systems like bank accounts, where accuracy is critical. (This is the type we’ve focused on so far.)
Causal Consistency: Events that are related always appear in the correct order. For example, comments on a post will always show up after the post itself, never before.
Read-Your-Own-Writes: A user can immediately see their own updates, even if others can't yet. This is common in social media, where users expect to see their changes (like profile updates) right away.
Eventual Consistency: The system will become consistent over time, but it may not be instant. This is fine for things like DNS or systems where a small delay in updates is okay. Most distributed systems use this when they prioritize availability.
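As one illustration of these levels, here is a minimal Python sketch of read-your-own-writes on top of an eventually consistent replica. It assumes one simple way to get that guarantee: send a user's reads to the primary until their writes have been replicated. Store, recent_writers and replicate are hypothetical names; real systems often use session tokens or version numbers instead.

```python
class Store:
    def __init__(self):
        self.primary = {}
        self.replica = {}             # lags behind the primary
        self.recent_writers = set()   # users with writes not yet replicated

    def write(self, user, key, value):
        self.primary[key] = value
        self.recent_writers.add(user)

    def read(self, user, key):
        # Authors of unreplicated writes read from the primary;
        # everyone else reads from the (possibly stale) replica.
        source = self.primary if user in self.recent_writers else self.replica
        return source.get(key)

    def replicate(self):
        # Background catch-up: the replica becomes consistent eventually.
        self.replica = dict(self.primary)
        self.recent_writers.clear()


store = Store()
store.write("alice", "alice:name", "Alice")
store.replicate()

store.write("alice", "alice:name", "Alice K.")
print(store.read("alice", "alice:name"))  # Alice sees her own update
print(store.read("bob", "alice:name"))    # Bob still sees the old name
store.replicate()
print(store.read("bob", "alice:name"))    # Bob sees the update after catch-up
```

Alice gets read-your-own-writes, while Bob only gets eventual consistency until the replica catches up.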
Wrapping Up!
The CAP theorem is super important. It helps shape the way you think about system design, especially in interviews, and it’s something you shouldn’t ignore.
But the good news is: it doesn’t have to be complicated.
Just ask yourself one simple question: “Does every read need to return the most recent write?” If the answer is yes, go with consistency. If the answer is no, availability is usually the better choice.
See you next week with more exciting content!
Here's Something Extra for You: Exciting Job Openings 🚀
Member of Technical Staff - 3 Distributed systems, Nutanix: Link
Software Engineer (IC2), Oracle: Link
Software Engineer 3, Google: Link
Software Engineer, GenAI, Google: Link
Software Engineer 2, Amazon: Link
Software Engineer, Coinbase: Link
Software Engineer, Gojek: Link