Consistency is a business decision
Eventual consistency is often treated as a technical risk. Yet the problem it describes existed long before computers, and ignoring it leads to fragile systems.
- Golo Roden
In German, a translation has become common that steers thinking about distributed systems in the wrong direction. "Eventual Consistency" is often rendered as "eventuell konsistent," and the German "eventuell" means "possibly." A database that is possibly consistent sounds like a system you'd better not trust. No wonder many developers instinctively reach for stronger guarantees as soon as the term is mentioned.
However, the English meaning of "eventual" is different. It means "ultimately" or "in the end." So, Eventual Consistency does not mean that consistency might occur. It means that consistency reliably occurs, just not immediately. The question is not whether, but when. And this question is not technical, but a business one. It always has been.
A better mental model than "possibly inconsistent" is that of stale data. At any given moment, some part of a system might access data that does not reflect the very latest state. The data is not wrong. It's just not current. It was correct a moment ago. It will be correct again in a moment. It's just stale. Every system has stale data somewhere. The question is not whether data can be stale, but how stale is acceptable. Milliseconds? Seconds? Minutes? Hours? The answer depends on the use case, and it's rarely "never."
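The idea of an explicit staleness budget can be made concrete. The following sketch is purely illustrative (the cache, the `fetch_fresh` stand-in, and the numeric budgets are all assumptions): the same cached value is read with a different tolerance depending on the use case, so "how stale is acceptable" becomes a parameter rather than an accident.

```python
import time

class CachedValue:
    """A value together with the time it was fetched."""
    def __init__(self, value, fetched_at):
        self.value = value
        self.fetched_at = fetched_at

def read(cache, key, max_staleness_s, fetch_fresh):
    """Return the cached value if it is young enough, else refetch."""
    entry = cache.get(key)
    now = time.monotonic()
    if entry is not None and now - entry.fetched_at <= max_staleness_s:
        return entry.value          # stale, but within the budget
    value = fetch_fresh(key)        # pay the cost of a fresh read
    cache[key] = CachedValue(value, now)
    return value

cache = {}
fetch = lambda key: 42              # stand-in for a real database query

# A product listing tolerates minutes of staleness; checkout tolerates almost none.
price_for_listing = read(cache, "price:123", max_staleness_s=300, fetch_fresh=fetch)
price_for_checkout = read(cache, "price:123", max_staleness_s=1, fetch_fresh=fetch)
```

The point of the sketch is that the tolerance is chosen per call site, which is exactly where the business decision lives.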
The Convenient Lie of Strong Consistency
There's something else that stubbornly persists in software development: the claim that a relational database is always consistent. Transactions guarantee it. ACID guarantees it. After the commit, the data is there, and everyone sees it. But this is only true within a single database instance, for a single query, at a single moment. Real systems don't work like that.
Anyone using read replicas already accepts that read operations can return stale data. Anyone using a cache accepts that the cached data might not be current at the time of reading. Anyone operating a mobile app accepts that the display on the device shows the state of the last synchronization, not the current one. And anyone rendering HTML server-side accepts that the page might already be outdated by the time it's delivered.
The moment data leaves the database, it begins to age. By the time it reaches a user's screen, it's already a snapshot of the past. The customer sees "Order confirmed," but the inventory system hasn't processed the order yet. The customer sees "3 in stock," but someone else has just put two of them in their cart. This is not a bug. This is how distributed systems work. And any system with a user interface is a distributed system, because the end device is its own node, the network is unreliable, and time passes between request and response.
In my consulting work, I regularly encounter teams who are convinced their systems are strongly consistent. Upon closer inspection, it almost always turns out that consistency ends at the database boundary. Beyond that lies a world of caches, replicas, message queues, and rendered interfaces, where data is already stale before it arrives. The system is already eventually consistent. It's just that no one says it out loud.
Consistency Before Computers
Eventual Consistency is not a phenomenon that emerged with distributed software systems. It's a fundamental problem of the physical world, and companies have found ways to deal with it for centuries.
Imagine a company with two sales offices in different cities, at a time when the telephone was the fastest means of communication. Both offices sell from the same inventory. A customer in Munich wants to buy the last unit of a product. A customer in Hamburg wants to buy the same unit. Neither sales representative knows what the other is doing at that moment.
How did companies solve this problem? Not through perfect real-time synchronization. They accepted that conflicts would occur and developed processes to handle them. They oversold and apologized. They maintained safety stocks. They called the warehouse before promising a delivery. They compensated disappointed customers. They managed risks, not consistency.
The same product, the same problem, just with pen and paper instead of databases. No technology in the world can eliminate this fundamental challenge: two people, two locations, one resource, incomplete information. The laws of nature guarantee that information takes time to spread. Perfect synchronization is not just difficult. It's impossible.
What's interesting is that these analog processes often worked remarkably well. Not because they prevented inconsistencies, but because they developed strategies for dealing with them. Accounting reconciled inventory at the end of the day. Sales called the warehouse when in doubt. And if two customers were promised the same product, there was a conversation, an apology, and a solution. The business process was prepared for conflicts because no one would have thought of declaring them impossible.
When I talk to teams about Eventual Consistency, this historical perspective often helps more than any architecture diagram. It shifts the problem from the technical corner to the business one, where it belongs. The question is not: How do we prevent inconsistency? It's: How do we deal with it?
Asking the Right Questions
When developers discuss Eventual Consistency, they often ask the wrong question. They ask: Is Eventual Consistency acceptable here? This frames consistency as a binary property, as if you either have it or you don't.
The right questions are different: How often does a conflict actually occur? If two users try to buy the last item simultaneously: Does it happen once a day? Once a month? Once a year? The frequency determines whether a conflict is a real problem or a theoretical concern. What does a conflict cost? Is the consequence a disappointed customer? A refund? Manual intervention? A legal problem? The costs determine how much effort is justified to prevent it. And what does prevention cost? Stronger consistency guarantees are not free. They require coordination, hence latency. They require locks, hence lower throughput. They require infrastructure, hence money.
These are business questions, not engineering questions. The development team can explain what is technically possible and what each option costs. But the decision about acceptable risk belongs to the business. A payment system and a social media feed have different tolerances. A patient record and a shopping cart have different requirements. The context determines the answer.
This is precisely why discussions about Eventual Consistency do not belong exclusively in technical meetings. The team may have strong opinions about technical correctness but might not know that the business department would readily accept a one-second delay if it lowers infrastructure costs. Or it might not know that a particular use case has regulatory requirements that demand stronger guarantees. The conversation needs both perspectives.
Everyday examples show how natural Eventual Consistency already is. Package tracking shows a status that was recorded hours ago. The package was scanned when it left the sorting center. It's been in transit since then, possibly already delivered. The information displayed is stale, and no one minds because an approximate overview of the progress is sufficient. Stock displays in online shops are a similar case. "Only 3 left in stock" was correct when the page was rendered. In the meantime, someone might have bought one. Someone might have put one in their cart without completing the order. The number is an indication, not a guarantee. And that's acceptable because the checkout process handles the edge case where the item is actually out of stock.
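The shop example can be sketched in a few lines. This is a hypothetical simplification, not a real shop API (the function and field names are assumptions): the number on the product page is only an indication, and the authoritative stock check happens at checkout, where a conflict is resolved as a business process rather than a bare error.

```python
def checkout(order, inventory):
    """Decrement stock at purchase time, not at page-render time."""
    available = inventory.get(order["sku"], 0)
    if available >= order["quantity"]:
        inventory[order["sku"]] = available - order["quantity"]
        return {"status": "confirmed"}
    # The displayed "3 in stock" was stale: handle the conflict in the
    # business process instead of failing the customer outright.
    return {"status": "out_of_stock", "suggestion": "offer_backorder_or_alternative"}

inventory = {"sku-1": 1}
first = checkout({"sku": "sku-1", "quantity": 1}, inventory)   # gets the last unit
second = checkout({"sku": "sku-1", "quantity": 1}, inventory)  # conflict, handled gracefully
```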
The example of airlines is particularly insightful. Airlines deliberately overbook because they know that some passengers won't show up. The booking system accepts more reservations than there are seats. If everyone does show up, the problem is resolved at the gate: compensation, rebooking, upgrade. The system is designed to accept conflicts and resolve them afterward. This is not a failure of consistency. It's a business strategy that fills seats that would otherwise remain empty on millions of flights each year.
All these systems work. Their users don't complain because the consistency window is short enough or because the business process elegantly handles exceptions. The goal was never perfect consistency. The goal was sufficient consistency.
The ATM That Kept Working
Among the many examples I use in discussions about Eventual Consistency, one is particularly insightful because it completely flips the perspective.
ATMs are normally online and connected to the bank's systems in real-time. When withdrawing money, the ATM checks the account balance, verifies coverage, and dispenses the cash. Simple and consistent. But what happens when the network connection fails and the ATM goes offline?
Most developers I ask this question respond: "The ATM must stop operating. No connection means no balance check. No check means possible overdrafts. So, shut down until the connection is restored." This is the obvious, the safe, the technically correct answer. But it's also the wrong one.
Imagine a prominent millionaire standing at an ATM wanting to withdraw 50 euros, only to be told that the ATM is out of order. The headline the next day: "Bank leaves top client in the lurch due to network problem." That's not a headline a bank wants. The reputational damage far outweighs any overdraft risk on a small withdrawal.
So, what do banks actually do? The ATM continues to operate, even offline. But with intelligent risk management. Most outages are brief: the connection drops for a few minutes and then restores itself. By the time anyone notices, the problem has already resolved itself. Most people also only withdraw money when they know they have it. No one wants the embarrassment of being rejected at an ATM with a queue behind them. This self-selection significantly reduces the overdraft risk. And in case the connection fails for longer, the bank limits the maximum withdrawal amount in offline mode.
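The offline policy described above can be expressed as a small decision rule. The sketch below is an illustration of the idea, not how any real bank implements it; the limit and all names are assumptions.

```python
OFFLINE_LIMIT = 200  # maximum payout without a balance check, in euros (assumed)

def authorize(amount, online, balance_check=None):
    """Authorize a withdrawal, accepting bounded risk when offline."""
    if online:
        # Normal path: verify the balance in real time.
        return balance_check(amount)
    # Offline path: keep serving customers, but cap the possible overdraft.
    # The bank reconciles later and, if need be, charges overdraft interest.
    return amount <= OFFLINE_LIMIT

ok_small = authorize(50, online=False)   # within the offline cap, dispensed
ok_large = authorize(500, online=False)  # refused until the network is back
```

The interesting property is that availability and risk are traded off explicitly, in one line of policy, instead of shutting the machine down.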
But the truly interesting point lies elsewhere. If someone actually overdraws their account in offline mode, the bank has two ways to frame the situation. First option: "Customer exploited a system vulnerability during a network outage." This sounds like a security incident. Second option: "Bank showed flexibility and helped the customer despite technical difficulties." This sounds like excellent customer service. And by the way, the bank charges overdraft interest. The customer who withdrew 50 euros they didn't have will pay it back, with a surcharge. The bank has turned a technical limitation into a revenue stream.
This is what business thinking about consistency means. The developer sees a consistency problem and wants to prevent it at all costs. The business side sees a risk-reward calculation and finds a solution that is better than both "always consistent" and "always available." The best answer wasn't in the engineering meeting. It was in the business meeting.
Hiding is the Real Risk
The danger lies not in Eventual Consistency. It lies in pretending it doesn't exist.
If you consider your system strongly consistent, you don't design compensation logic. If you don't expect conflicts, you don't build conflict handling. If you don't plan for stale data, you don't design a user experience that can handle it. And when the race condition eventually occurs, when the cache returns stale data at the wrong moment, when replica lag leads to visible inconsistency, there's no plan. The system fails in a way no one foresaw because no one bothered to foresee it.
Conversely, acknowledging Eventual Consistency means designing for it. You think about what happens when data is stale. You build idempotent operations that can be safely repeated. You create compensation mechanisms in case something goes wrong. You communicate uncertainty to users instead of conveying false security.
In practice, this means concrete design decisions. Instead of displaying "Order successful" when the order has merely been accepted, you display "Order being processed" and update the status once processing is complete. Instead of disabling a button after clicking and hoping for consistency, you design the operation to be idempotent so that a double-click causes no harm. Instead of showing an error when an item is out of stock between the cart and checkout, you offer an alternative. These are not technical workarounds. They are well-thought-out user experiences based on an honest assessment of the system's reality.
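The idempotency advice above can be sketched with the common idempotency-key pattern: the client generates a key once per logical order, so a retry or double-click replays the stored result instead of creating a second order. The store and all names here are illustrative assumptions, not a specific framework's API.

```python
orders_by_key = {}  # stand-in for a persistent store with a unique index on the key

def submit_order(idempotency_key, payload):
    """Safe to call repeatedly with the same key: later calls are replays."""
    if idempotency_key in orders_by_key:
        return orders_by_key[idempotency_key]  # same order, not a new one
    order = {"id": len(orders_by_key) + 1, "status": "processing", **payload}
    orders_by_key[idempotency_key] = order
    return order

first = submit_order("key-abc", {"sku": "sku-1"})
retry = submit_order("key-abc", {"sku": "sku-1"})  # double-click or network retry
```

Note that the order starts in "processing", matching the honest status display discussed above, and that no button-disabling trickery is needed once the operation itself is safe to repeat.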
The German mistranslation is accidentally profound. "Eventuell konsistent" sounds ominous because uncertainty sounds ominous. But uncertainty is the reality of distributed systems. The choice is not between security and uncertainty. It's between acknowledged and hidden uncertainty. One leads to robust systems. The other leads to surprises.
Eventual Consistency is not a limitation to overcome. It is a reality for which to design. Your systems are already eventually consistent. The question is whether you design for that reality or pretend it doesn't exist. And this is not a question an engineering team alone should answer. It's a business decision. It always has been.
(rme)