Won’t fix! – Part 1: Why software estimates are reliably wrong
Software projects regularly exceed time and budget. This is not due to incompetence, but to the nature of software and human thinking.
- Golo Roden
Some problems in software development are not bugs that can be fixed, but structural characteristics of the discipline. They are ongoing issues that have accompanied developers for decades and will continue to do so for decades to come.
This article is the first part of a series on problems that cannot be optimized away. The series title, Won’t fix, borrows the label used in GitHub repositories for issues that will not be fixed.
How long does it take to paint an Easter egg? A bit of meadow, an Easter bunny, a sun, and some clouds in the sky, plus blowing out the egg and threading through the string that is tied to a match. 10 minutes? 15 minutes? Half an hour? Or even a whole hour? Even for a task whose requirements are clear and whose process has been familiar since childhood, the answers range from 10 minutes to an hour: the highest estimate is six times the lowest, a difference of 500 percent.
In the software industry, such fluctuations are considered unacceptable. There, developers are expected to make reliable predictions about costs and timeframes for incomparably more complex tasks, whose requirements are often not even fully known. “How much will it cost, and how long will it take?” are the common questions for which customers expect a reliable answer.
However, studies in the software industry have shown a consistent picture for decades: projects take longer than planned, cost more than budgeted, and deliver less than promised. This pattern runs through all domains, all company sizes, and all methods. Neither Waterfall nor Agile nor anything in between has fundamentally changed this. On closer inspection, the fact that predictions regularly fail is less surprising than the expectation that they should be accurate.
What does “estimate” actually mean here?
Before answering the question of why software estimates are so often wrong, it's worth taking a step back: Why estimate at all? The answer is less obvious than it seems, because behind the question “How long will this take?” lie very different concerns.
Sometimes it's about a cost forecast. A company wants to know if an investment is worthwhile and needs a figure for budget planning. Sometimes it's about a time forecast, for example, because a market event dictates a deadline. And sometimes it's simply about feasibility: Is this even realizable, and if so, what is the scale of the effort involved?
This distinction is rarely made in practice, although it is crucial. Those who want to check feasibility do not need an hourly rate. Those who plan an annual budget do not need precision down to the day. And those who require a fixed delivery date are not helped by an effort estimate in story points. Added to this is another misunderstanding: an estimate is often treated as a commitment, although by its nature it is an approximation under uncertainty. Those who give an estimate communicate an expectation. Those who receive an estimate frequently understand a promise. This difference may sound subtle, but in practice it is the source of countless conflicts between development teams and their clients.
It's called development, not production
The software industry contributes to a fundamental misjudgment by talking about production efficiency, software factories, and throughput as if it were about manufacturing components on an assembly line. These terms suggest an industrial, reproducible process where estimates are possible and meaningful because the assembly line sets the pace.
However, software development is not production. The name says it all: it is development. A creative, and in a sense even artistic, process. The outstanding American computer scientist Donald E. Knuth called his life's work “The Art of Computer Programming” and, with the concept of Literate Programming, coined the idea that programs should not only work but also be readable and aesthetically pleasing. His associated question has lost none of its relevance: “When was the last time you spent a pleasant evening in a comfortable armchair reading a good program?” The fact that this question sounds absurd to most people says a lot about the industry's misunderstanding of its own work.
Software is executable knowledge. And knowledge cannot be generated on demand within a given time or financial framework. It arises through discovery, invention, and understanding. And this process cannot be timed. Ideas come when there is creative space for them, not when the project plan calls for them. As early as 1986, Fred Brooks distinguished in his essay “No Silver Bullet” between essential and accidental complexity. Accidental complexity can be reduced through better tools and methods: better editors, more powerful frameworks, more efficient build systems. Essential complexity, however, lies in the problem itself, and no tool in the world can eliminate it. It is the part of the task that remains after all technical hurdles have been removed.
Estimating software is essentially trying to predict how long it will take to understand a problem that no one has yet fully understood. That this prediction is rarely accurate is less surprising than the fact that it is constantly demanded anyway.
The journey from Berlin to Munich
A metaphor makes this tangible. Suppose someone is asked to predict how long a hike from Berlin to Munich will take. The calculation seems simple: measure the distance on the map, set an average walking speed, done. On paper, this looks convincing.
Not in reality. The map does not show elevation differences, at least not in the required detail. It doesn't show that paths might be closed, that rivers need to be crossed and a bridge needs to be found, that the weather can change, and that detours are unavoidable. Anyone who has ever hiked knows that surprises lurk even on a supposedly familiar route: a fallen tree, a construction site, a wrong turn that costs an extra kilometer. The plan and the actual route have little in common.
Software projects behave the same way. The specification is the map; the code is the route. And between the two lies the unknown terrain: libraries that behave differently than documented, requirements that turn out to be contradictory during implementation, technical debt in existing systems that no one has marked on the map.
In software development, the Cone of Uncertainty describes exactly this phenomenon. It states that the uncertainty of an estimate is greatest at the beginning of a project and only decreases in the course of the project as more is learned about the problem and its solution. At the start of a project, the actual duration can deviate from the estimated value by a factor of four, either up or down. Only when a significant portion of the work has already been done does the estimate approach reality. The paradox: Precisely at the beginning, when uncertainty is greatest, budgets are set, contracts are signed, and timelines are communicated. It's like planning a hike without ever looking at the terrain and then considering the calculated schedule binding.
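The factor of four can be made concrete with a few lines of Python. This is only a minimal sketch: the function name `estimate_range`, the 10-week example project, and the narrower factor of 1.25 for a late project phase are assumptions chosen for illustration. Only the initial factor of four comes from the text above.

```python
def estimate_range(point_estimate: float, factor: float = 4.0) -> tuple[float, float]:
    """Return the (low, high) bounds implied by an uncertainty factor.

    A factor of 4 means the real value may be a quarter of the estimate
    or four times the estimate.
    """
    if point_estimate <= 0 or factor < 1:
        raise ValueError("point_estimate must be positive and factor at least 1")
    return point_estimate / factor, point_estimate * factor


# Hypothetical project, estimated at 10 weeks.
for phase, factor in [("project start", 4.0), ("late in the project (assumed)", 1.25)]:
    low, high = estimate_range(10, factor)
    print(f"{phase}: somewhere between {low} and {high} weeks")
```

At the start, the same "10 weeks" covers everything from 2.5 to 40 weeks, which is exactly the moment at which budgets and contracts are usually fixed.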
When the mind plays along
Even if the structural difficulties are known, a second issue remains: the human mind systematically works against realistic assessments. In the 1970s, psychologists Daniel Kahneman and Amos Tversky described the Planning Fallacy, the observation that people systematically estimate projects too optimistically, even when they have experience with similar endeavors. What's remarkable about the Planning Fallacy is that knowing about its existence hardly helps: even someone who knows that their last estimate was off by a factor of three will not correct the next estimate accordingly. The Optimism Bias ensures that risks and obstacles are underestimated. The anchoring effect causes the first number mentioned to pull all subsequent estimates in its direction, regardless of how well that first number was justified.
In teams, group effects come on top of this. In Planning Poker, a common estimation method in agile teams, all participants reveal their estimates simultaneously, precisely to reduce this mutual influence. The method contains another silent admission: the values used roughly follow the Fibonacci sequence. Small tasks can be classified with fine granularity, hence the values 1, 2, 3, and 5. Large tasks, on the other hand, can only be estimated roughly. There, a 13, a 20, or a 40 is sufficient. The scale itself reflects what the industry has long known: the larger the task, the less reliable the estimate.
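How quickly the scale coarsens becomes visible when the distance between neighboring cards is computed. The following sketch assumes a commonly used deck; the values 8 and 100 do not appear in the text above and are included here as an assumption, and the names `CARDS` and `gaps` are purely illustrative.

```python
# Assumed Planning Poker deck (8 and 100 are not mentioned in the article).
CARDS = [1, 2, 3, 5, 8, 13, 20, 40, 100]


def gaps(cards: list[int]) -> list[int]:
    """Return the distance from each card to the next larger one."""
    return [larger - smaller for smaller, larger in zip(cards, cards[1:])]


for card, gap in zip(CARDS, gaps(CARDS)):
    print(f"card {card:>3}: next possible value is {gap} higher")
```

Between 1 and 2 the step is a single point, between 40 and 100 it is 60 points: the deck simply offers no way to express a precise estimate for a large task.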
Another often underestimated factor is the confusion between code complete and feature complete. Many estimates mentally end with the final commit. But a finished feature also includes debugging, troubleshooting, testing, integration, code review, documentation, and deployment. These activities typically make up the larger part of the total effort, but are frequently not included in the original estimate.
In addition, there is a phenomenon that Fred Brooks described as early as 1975 in “The Mythical Man-Month” and that is known as Brooks's Law: if a software project falls behind schedule and additional staff are brought in, it will be completed even later. The communication overhead in a team grows quadratically with team size, and onboarding new colleagues consumes capacity from those who are already under pressure. A communication channel exists between every pair of people, so a team of n people has n(n-1)/2 channels. A team of 5 people therefore has 10 such channels, a team of 10 already has 45. Each additional person brings not just one new channel but as many as there are existing team members. In other words, each additional person increases not only capacity but also coordination effort, and beyond a certain point, the effort outweighs the benefit. This insight has been known for half a century and is still regularly ignored.
Pragmatic solutions
If estimates are unreliable for structural and psychological reasons, the question arises whether there are better approaches. There are, even if none of them solves the fundamental problem.
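The channel arithmetic can be checked with a short sketch. The function name `communication_channels` is an illustrative choice; the calculation simply counts the pairs of people in a team.

```python
def communication_channels(team_size: int) -> int:
    """Count the pairwise communication channels in a team."""
    if team_size < 0:
        raise ValueError("team_size must not be negative")
    # Every pair of people forms one channel: n * (n - 1) / 2.
    return team_size * (team_size - 1) // 2


for size in (5, 10):
    print(f"{size} people: {communication_channels(size)} channels")

# The eleventh person adds one channel to each of the ten existing members.
added = communication_channels(11) - communication_channels(10)
print(f"Growing from 10 to 11 people adds {added} channels")
```

Doubling the team from 5 to 10 people doubles the capacity at best, but it more than quadruples the number of channels that need to be maintained.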
The movement under the slogan No Estimates advocates the most radical position: instead of estimating better, the question should be changed. If work is broken down into such small units that each can be completed in a few days, the question of the overall estimate loses its significance. Progress is not predicted, but measured. What was delivered yesterday is a better indicator for tomorrow than any estimate from three months ago.
For those who still need a forecast, probabilistic methods offer a middle ground. Monte Carlo simulations, for example, use historical data to generate not a single estimate, but a probability distribution: “There is an 85 percent chance we will be done in eight to twelve weeks” is a more honest statement than “It will take ten weeks.” It makes uncertainty visible instead of hiding it behind pseudo-precision. And it forces all parties involved to talk about risk tolerance, rather than about supposedly fixed deadlines.
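A minimal sketch shows what such a simulation can look like. It makes strong simplifying assumptions: the weekly throughput figures and the backlog size are invented for illustration, every backlog item is treated as equally large, and future weeks are assumed to resemble randomly drawn past weeks. The names `simulate_weeks` and `percentile` are hypothetical and not part of any particular tool.

```python
import random


def simulate_weeks(weekly_throughput, backlog_size, runs=10_000, seed=42):
    """Simulate how many weeks it takes to finish the backlog.

    Each simulated week draws a random value from the historical throughput.
    Returns the sorted list of simulated durations in weeks.
    """
    if min(weekly_throughput) < 0 or max(weekly_throughput) <= 0:
        raise ValueError("throughput must be non-negative with at least one positive value")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    durations = []
    for _ in range(runs):
        remaining, weeks = backlog_size, 0
        while remaining > 0:
            remaining -= rng.choice(weekly_throughput)
            weeks += 1
        durations.append(weeks)
    return sorted(durations)


def percentile(sorted_values, p):
    """Simple rank-based percentile: the value at position p percent of the sorted list."""
    index = min(len(sorted_values) - 1, int(p / 100 * len(sorted_values)))
    return sorted_values[index]


# Invented history: items completed in each of the last ten weeks.
history = [3, 5, 2, 4, 6, 3, 4, 1, 5, 4]
durations = simulate_weeks(history, backlog_size=40)

for p in (50, 85, 95):
    print(f"{p} percent of simulated runs finished within {percentile(durations, p)} weeks")
```

The result is deliberately not a single number. Which percentile becomes the basis for planning is a question of risk tolerance, and that is exactly the conversation the method is meant to force.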
Perhaps the most effective approach, however, is a fundamentally different way of dealing with the question. Instead of treating a project as a monolithic block that must be fully estimated and planned in advance, it is often possible to proceed iteratively: step by step, until the result is good enough or until the available budget is exhausted. If software provides value from the outset, if each iteration produces a usable result, then the question shifts from “When will everything be finished?” to “Is the next step still worthwhile?”. The decision to continue or stop is then based not on an estimate from months ago but on the actual progress made and the effort invested.
This does not work in every context, for example, when regulatory requirements necessitate a complete system or when a physical product must be delivered by a fixed deadline. But in surprisingly many cases, it is a viable approach that is at least worth an honest discussion. It shifts the risk from a major upfront decision to many small decisions along the way, where more knowledge is available at each step than at the beginning.
The belief in counting and measuring
The German typographer Paul Renner once wrote: “The belief in counting and measuring leads to the grossest errors in all arts.” The software industry has fallen prey to the fallacy that a creative process can be planned with the same methods as an industrial one. Software estimates fail not due to lack of discipline, not due to lack of experience, and not due to poor tools. They fail due to the nature of software as executable knowledge and the nature of human thinking, simultaneously and in a way that cannot simply be optimized away.
Those who accept this can deal with estimates differently. Ask different questions, choose different formats, set different expectations. Treat estimates as what they are: approximations under uncertainty that must be regularly reviewed and adjusted. And above all, stop treating estimates as promises, because that is not what they are, and that is not what they can be.
The illusion that a creative process can be planned like a production line is just one of several that the software industry has fallen for. Another, at least equally persistent, concerns freedom from defects: the notion that bugs can be completely avoided with enough care and the right tools.
Why that too is an illusion, and why even “Hello World” contains a bug, is the subject of the second part.
(mma)