Calibrated probability estimates for estimating task time

There is a popular idea in software engineering that it is not possible to estimate how long a project will take. Articles like this, this, and this argue the point compellingly. However, this idea has a way of being self-fulfilling.

After reading How to Measure Anything by Douglas Hubbard, I’ve changed my mind about the feasibility of software project estimation. My overall position is one of challenging myself and others to get better at quantifying their current level of uncertainty. Let’s first go over what that has to do with estimation.

The book’s title reeks of hubris. Measure anything?! Yes. It’s a bold project, but let’s focus for a moment on the process the book spells out and how it applies to software project estimation. The book starts by defining a measurement as:

a measurement is a reduction in uncertainty, quantitatively expressed

Uncertainty is also defined: it is the existence of multiple possibilities. It can be measured by assigning probabilities to those possibilities.

So implicit in the idea of uncertainty is the idea of probability. The book takes a strong Bayesian stance. One thing that needs to be cleared up is what is meant by “there is a 90% probability of event A happening”. What if event A is something that will only happen once? For example, suppose A is “the 2018 midterm elections will result in Democrats winning a Senate majority but not the House”. That event can’t be repeated, but the probability is still meaningful, and in order to understand what it means for such a probabilistic statement to be correct, we need the idea of calibration.

An estimator is calibrated if, across all the claims it makes with p% probability, it is correct about p% of the time. So, even if the 2018 midterm elections only happen once, a calibrated estimator (e.g. Nate Silver) who makes a bunch of claims with “p% confidence” is right about p% of those claims. As a concrete example: on the eve of the 2016 election, FiveThirtyEight had Trump’s chances at about 28.6%, so if the events FiveThirtyEight gives roughly a 28% chance happen roughly 28% of the time, we can consider them calibrated.
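To make the definition concrete, here is a minimal sketch of a calibration check in Python. The function name `calibration_table` and the track record below are made up for illustration; the idea is just to group claims by their stated probability and compare that with how often they turned out true.

```python
from collections import defaultdict

def calibration_table(predictions):
    """Group (stated_probability, happened) pairs by stated probability
    and return the observed frequency of the event for each group."""
    buckets = defaultdict(list)
    for stated_p, happened in predictions:
        buckets[stated_p].append(happened)
    return {
        p: sum(outcomes) / len(outcomes)
        for p, outcomes in sorted(buckets.items())
    }

# Hypothetical track record: ten claims at 90% and ten claims at 30%.
predictions = (
    [(0.9, True)] * 9 + [(0.9, False)] * 1
    + [(0.3, True)] * 3 + [(0.3, False)] * 7
)

for stated, observed in calibration_table(predictions).items():
    print(f"stated {stated:.0%}, observed {observed:.0%}")
```

An estimator with this record would count as calibrated at both levels: the observed frequency matches the stated probability in each group. With real data the match will never be exact, and small groups tell you very little.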

What does this have to do with software project estimation? It starts with quantifying your current level of uncertainty. If you are bad at that, then Kahneman’s Planning Fallacy applies. The good news is that you can learn to calibrate your estimates of your own uncertainty. Once you are calibrated, you can make claims like “with 90% probability, this software project will be done in 1 to 3 months”, and be right 90% of the time.

Doing calibration training is a matter of taking a bunch of tests and paying Douglas Hubbard money, so I thought of a better way. I’m working on a command line tool called task that records tasks in a SQLite database, along with a 90% Confidence Interval (CI) of how long each task will take. You tell it that you are done by typing task end, and the ended field is set. Once it has gathered enough data, you can compute how far off from calibrated you are, and use that to adjust your future estimates. As you keep using the tool you should get better at making 90% CI estimates, and once your actual hit rate matches the probability level you claimed, you will emerge as a competent project estimator. You will still be wrong 10% of the time, but if your 90% CIs for how long a project takes are right 90% of the time, then your estimates are correct, and the estimation skeptics can be shown a counterexample to their claim that project ETA is unknowable.
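As a rough sketch of the bookkeeping involved, the snippet below stores tasks with a 90% CI in SQLite and computes how often the actual duration landed inside the interval. The schema is an assumption made for illustration (only the ended field is mentioned above); the column names `ci_low_hours`, `ci_high_hours`, and `started`, the `hit_rate` helper, and the sample rows are all hypothetical, not the tool’s real layout.

```python
import sqlite3

# Hypothetical schema: timestamps are seconds since the epoch,
# and `ended` stays NULL while a task is still in progress.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE tasks (
        name          TEXT,
        ci_low_hours  REAL,    -- lower bound of the 90% CI
        ci_high_hours REAL,    -- upper bound of the 90% CI
        started       INTEGER,
        ended         INTEGER
    )
""")

HOUR = 3600
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?, ?, ?)",
    [
        ("write parser",   2, 6, 0, 4 * HOUR),  # took 4h, inside 2-6
        ("fix login bug",  1, 3, 0, 5 * HOUR),  # took 5h, outside 1-3
        ("add CSV export", 3, 8, 0, 7 * HOUR),  # took 7h, inside 3-8
        ("upgrade deps",   1, 2, 0, 6 * HOUR),  # took 6h, outside 1-2
        ("refactor auth",  2, 4, 0, None),      # still in progress
    ],
)

def hit_rate(conn):
    """Fraction of finished tasks whose actual duration fell inside the CI."""
    (rate,) = conn.execute("""
        SELECT AVG((ended - started) / 3600.0
                   BETWEEN ci_low_hours AND ci_high_hours)
        FROM tasks
        WHERE ended IS NOT NULL
    """).fetchone()
    return rate

print(f"hit rate: {hit_rate(conn):.0%} (claimed: 90%)")
```

A hit rate well below 90% means the intervals are too narrow, so the fix is to widen future estimates until the two numbers agree. A handful of tasks is far too few to judge; the comparison only becomes meaningful with a larger history.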

Overall, I agree that estimation is hard, but I take that as a challenge, not as a limitation. If you can decompose a project into tasks, and give calibrated estimates of the probability of each of those tasks taking a certain amount of time, you can predict how long it will take, about 90% of the time.
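One way to go from per-task intervals to a project-level interval is a Monte Carlo simulation. That is a technique I am bringing in for illustration, not something the tool above does. The sketch assumes each task’s duration is normally distributed with the 90% CI as given, and that tasks are independent of each other. Both are strong simplifications, and the function name `simulate_project` and the task list are hypothetical.

```python
import random

# A 90% interval of a normal distribution spans about +/-1.645
# standard deviations around the mean.
Z_90 = 1.645

def simulate_project(task_intervals, trials=10_000, seed=0):
    """Return an approximate 90% CI for the total project duration.

    task_intervals: list of (low, high) 90% CIs, one per task.
    Assumes independent, normally distributed task durations.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        total = 0.0
        for low, high in task_intervals:
            mean = (low + high) / 2
            sigma = (high - low) / (2 * Z_90)
            # Durations cannot be negative, so clamp rare negative draws.
            total += max(0.0, rng.gauss(mean, sigma))
        totals.append(total)
    totals.sort()
    # 5th and 95th percentiles of the simulated totals.
    return totals[int(0.05 * trials)], totals[int(0.95 * trials) - 1]

# Hypothetical tasks, each with a 90% CI in days.
tasks = [(2, 6), (1, 3), (3, 8)]
low, high = simulate_project(tasks)
print(f"90% CI for the whole project: {low:.1f} to {high:.1f} days")
```

Under these assumptions the project interval comes out narrower than simply adding up all the lower bounds and all the upper bounds, because it is unlikely that every task lands at the same extreme at once. If tasks tend to slip together, the independence assumption breaks and the real interval will be wider.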