How to Estimate Tasks (and Why You Are Always Wrong)
Every engineer underestimates tasks. The estimate is for the happy path - writing the code, the part you can see clearly when you look at the ticket. What the estimate doesn’t include: understanding the existing code well enough to change it safely, handling edge cases that appear once you start writing, dealing with the broken local environment, getting the PR reviewed, addressing review feedback, the merge conflict, the flaky test in CI, and the deploy that fails in staging because of a configuration difference you didn’t know about.
This isn’t a failure of effort or intelligence. It’s structural. You can’t estimate what you can’t see, and you can’t see the things you don’t know are there.
Understanding why estimates fail is the first step to making better ones.
The Planning Fallacy
The planning fallacy is the well-documented tendency to underestimate the time and cost of your own future tasks, even when you know that similar tasks have run over before. The bias is specific to your own work: people estimating someone else’s task tend to be more pessimistic. Daniel Kahneman and Amos Tversky named it in 1979, and it has been replicated many times since.
The mechanism: when you estimate a task, you imagine the best-case scenario - a smooth execution of the plan you can see. You don’t systematically account for the unknown unknowns, the interruptions, the dependencies that turn out to be broken.
The counterintuitive finding from the research: asking people to consider what might go wrong doesn’t help much. They list a few obstacles, add time for them, and still arrive at an optimistic estimate. The obstacles they imagine are the ones they can imagine - the known unknowns. The actual delays usually come from elsewhere.
Why “Just Double It” Isn’t Enough
A common heuristic: take your estimate and double it (or multiply by pi, or some other constant). This is better than nothing - it accounts for the systematic underestimation bias. But it’s blunt. The same multiplier adds two hours to a two-hour task and two weeks to a two-week task: the first is noise, the second is a significant schedule impact. A single constant also treats every task as equally uncertain, when a complex task is usually off by far more than a simple one.
More practically, the heuristic doesn’t build better estimating skill. If you multiply everything by two and it’s still wrong half the time, you haven’t learned anything about what made it wrong.
Breaking Tasks Down
The most reliable estimation technique is decomposition: break the work into specific subtasks, estimate each, and sum them up. The sum is usually more accurate than a top-down estimate of the whole.
This works for a few reasons. Small tasks have less variance. Thinking through the subtasks forces you to identify dependencies and unknowns you would have missed with a high-level estimate. The process of decomposition often reveals that you don’t understand part of the task well enough to estimate it - which is information.
For a ticket like “add user avatar upload support”:
- Research file upload limits and storage options: 2h
- Design the API endpoint and response schema: 1h
- Implement the upload endpoint: 3h
- Add image resizing/validation: 2h
- Store the avatar URL in the user profile: 1h
- Update the profile UI to show avatar: 2h
- Write tests: 2h
- Handle error cases (too large, wrong format, storage failure): 2h
- Deploy and verify in staging: 1h
Sum: 16h, or about two full working days. A naive estimate might have been “2-3 days.” The decomposition produces roughly the same answer but also reveals the work that’s easy to miss in a high-level estimate (error handling, deployment verification).
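If you want to keep the breakdown somewhere you can revisit, a minimal Python sketch is enough. The hours are the ones from the list above; the 8-hour working day and the helper names (`total_hours`, `largest_subtasks`) are assumptions made for illustration, not part of any tool.

```python
# Subtasks for "add user avatar upload support", in hours,
# copied from the breakdown in the article.
subtasks = {
    "Research file upload limits and storage options": 2,
    "Design the API endpoint and response schema": 1,
    "Implement the upload endpoint": 3,
    "Add image resizing/validation": 2,
    "Store the avatar URL in the user profile": 1,
    "Update the profile UI to show avatar": 2,
    "Write tests": 2,
    "Handle error cases": 2,
    "Deploy and verify in staging": 1,
}

HOURS_PER_DAY = 8  # assumption: one working day is 8 focused hours


def total_hours(tasks):
    """Sum the per-subtask estimates."""
    return sum(tasks.values())


def largest_subtasks(tasks, n=3):
    """Return the n biggest subtasks, the ones most worth a second look."""
    return sorted(tasks.items(), key=lambda item: item[1], reverse=True)[:n]


total = total_hours(subtasks)
print(f"Total: {total}h (~{total / HOURS_PER_DAY:.1f} days)")
for name, hours in largest_subtasks(subtasks):
    print(f"  {hours}h  {name}")
```

Listing the largest subtasks is a cheap sanity check: anything that dominates the total is a candidate for further decomposition.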
When you can’t decompose - when you don’t understand the task well enough to list the subtasks - the right response is to estimate the investigation first: “I need 2 hours to understand the existing code before I can estimate this.”
Reference Class Forecasting
There’s another evidence-based technique: instead of planning how you’ll do this task, look at how similar tasks actually went.
How long did the last three “add a new field to the user profile” tasks take? If they took 2 days, 3 days, and 1.5 days respectively, the base rate for this class of task is roughly 2 days, with a realistic range of 1.5 to 3. Start there, then adjust for specific differences.
This sounds obvious, but most engineers don’t do it. They estimate each task from scratch, in isolation, using the same planning-fallacy-prone imagination. Historical data sidesteps much of that bias, because it already includes the delays nobody planned for.
You don’t need a formal system for this. A rough mental model - “refactors in this codebase usually take longer than expected because of test coverage,” “API integrations with this vendor always have undocumented edge cases” - is enough to improve your calibration.
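If you do keep even a short record of past durations, the base rate takes only a few lines to compute. The sketch below uses the three durations from the example above (2, 3, and 1.5 days); the `base_rate` helper is a hypothetical name, and with only three data points the result is a rough anchor rather than a statistic to lean on.

```python
from statistics import mean, median

# Durations in days of past "add a new field to the user profile" tasks,
# taken from the example in the article.
past_durations_days = [2.0, 3.0, 1.5]


def base_rate(durations):
    """Summarize how a class of similar tasks actually went."""
    return {
        "min": min(durations),
        "median": median(durations),
        "mean": mean(durations),
        "max": max(durations),
    }


summary = base_rate(past_durations_days)
print(
    f"Typical: about {summary['median']:.1f} days "
    f"(range {summary['min']:.1f}-{summary['max']:.1f})"
)
```

The median is used as the headline number because one unusually bad task distorts it less than it would the mean.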
Communicating Estimates
How you phrase an estimate affects how useful it is and how much it’s trusted. “About a week” is different from “3-4 days if the database schema is straightforward, up to a week if I need to handle migrations carefully.”
The second version is more useful to everyone: it communicates your uncertainty, it identifies the key risk factor, and it gives the recipient enough information to ask a useful follow-up question.
When estimates carry uncertainty, say so explicitly. “I’d need to look at the payment service integration first - it could be 2 days or 5 days depending on what I find” is honest and useful. Committing to a specific number for something you don’t yet understand creates false confidence that leads to missed deadlines and eroded trust.
If your estimate changes after you start work, say so immediately. Don’t wait until the original estimate has passed and the stakeholder is asking where the ticket is. “I’ve discovered the authentication layer is more complex than expected - the original 2-day estimate was wrong, it’s probably 4 days” is uncomfortable to send but appreciated. Silence until the deadline is not.
What Improves Over Time
The engineers with the best estimation track records share a few habits:
They write down their estimates and compare them to actuals. Without this feedback loop, you accumulate intuition without calibrating it. With it, you learn that you consistently underestimate anything touching the payment code, or that your estimates for “small” tasks are accurate but large tasks always run over.
They separate “what I’m building” from “what I’m figuring out.” Exploration has its own estimate. Implementation has its own estimate. Conflating them produces one bad estimate instead of two honest ones.
They push back on deadlines that aren’t grounded in estimates. “We need this in two days” is a requirement. Whether it’s achievable is a separate question. The answer to that question should come from the work, not from working backward from the deadline.
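The first habit, comparing estimates to actuals, needs nothing more than a small log. As a rough sketch: the entries, the category names, and the `overrun_by_category` helper below are all hypothetical, invented to show the shape of the feedback loop rather than taken from real data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical log entries: (category, estimated_hours, actual_hours).
log = [
    ("payments", 4, 10),
    ("payments", 8, 14),
    ("ui", 2, 2),
    ("ui", 3, 4),
    ("refactor", 6, 9),
]


def overrun_by_category(entries):
    """Average actual/estimate ratio per category (1.0 means spot on)."""
    ratios = defaultdict(list)
    for category, estimated, actual in entries:
        ratios[category].append(actual / estimated)
    return {category: mean(values) for category, values in ratios.items()}


for category, ratio in sorted(overrun_by_category(log).items()):
    print(f"{category}: actual is {ratio:.2f}x the estimate")
```

Grouping by category is the point: one overall ratio gets you back to “just double it,” while a per-category ratio tells you where your estimates are actually off.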
Estimation will never be precise. The goal isn’t to get it exactly right - it’s to get better calibrated over time, to communicate uncertainty clearly, and to catch when you’ve made a bad assumption before it becomes an overdue ticket.