Scrum: Estimating story points

Story points are used to estimate user stories. This is useful because it lets us plan how many user stories we expect to complete in the next sprint. Although this is correct, there is more to it. Some time ago someone on my team proposed estimating stories in days instead of story points. In this post I will explain why I think this is a bad idea, and why sizing user stories has more benefits than being an aid to sprint planning.

The problem with time estimations

It is understandable why someone would propose estimating in days or hours: you know how long your sprint is, so if you estimate how long each task will take, you can work out which tasks will fit in the sprint. Several problems quickly arise with this method, though:

Not every person works at the same speed. Not everyone has the same knowledge level, the same experience level, or even the same way of working or motivation. So first of all, it is difficult to reach consensus when estimating, and secondly, a time estimate is unlikely to be accurate at all, because it depends on who will pick up the work. What do you do if one person estimates ‘1 day’ and another estimates ‘4 days’? Assume they both estimated accurately according to their personal speed of working. If you go with the lowest or the highest number, it is completely inaccurate when the other person picks up the work. If you go with the average, it is still quite inaccurate, no matter which of the two picks it up. On the other hand, if you estimate abstractly and relative to other items, you are much more likely to be accurate, because both people should be able to recognize and agree on the size in relation to other items: “The item is smaller than item X but bigger than item Y”.
Time is absolute. Let’s say you know a user story is “A shitload of work that will take a lot of time”. So you want to give a very high estimation, and you’re estimating in time. What do you estimate? 20 days? 23 days? You can’t really know, because humans are bad at estimating. “It is 20 story points of work” is entirely abstract and should create the incentive to decompose the story into smaller, easily digestible stories. A large number of abstract story points means ‘This user story is too big or complex, and therefore too difficult to estimate accurately’. You can pretend that “I estimate it takes 20 days” means the same thing, but then you are being dishonest, because your choice of words claims an estimate that is accurate to the day.
Time sets expectations. Imagine completing an item estimated at ‘5 days’ that actually took you 7 days. How does it feel to be late? Now imagine completing an item estimated at ‘5 points’ out of a sprint backlog of ‘20 points’. No matter how long it took you, imagine how the burndown chart looks after burning a quarter of the sprint. How does it feel to burn a large part of the sprint backlog? It probably feels a lot better. It promotes the feeling of delivering something valuable, rather than missing a deadline and underperforming.
No velocity. You cannot track velocity properly when estimating in duration. So how can you know if your team is improving? Here’s what happens if your team is improving but you estimate in time: stories begin to take less time to implement, so your estimations deflate, and you plan for more stories per sprint. The total estimated time you complete per sprint stays about the same, so it cannot show the improvement. The closest thing to velocity you can measure now is the number of stories burned, but not every story is the same size, so it doesn’t tell you anything about the continuous improvement of the team.
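
To make the ‘1 day’ versus ‘4 days’ example from the first point concrete, here is a minimal sketch. The names `days_needed` and `error_per_person` are made up for illustration, and the numbers are simply the ones from the example, assuming both people estimated their own pace correctly.

```python
# Hypothetical data: how long the same story would take each person, in days.
days_needed = {"person_a": 1.0, "person_b": 4.0}


def error_per_person(estimate_days, days_needed):
    """Return how many days off the estimate is for each possible assignee."""
    return {who: abs(estimate_days - actual) for who, actual in days_needed.items()}


lowest = min(days_needed.values())
highest = max(days_needed.values())
average = sum(days_needed.values()) / len(days_needed)

# Compare the three obvious ways of picking a single time estimate.
for label, estimate in [("lowest", lowest), ("highest", highest), ("average", average)]:
    print(label, estimate, error_per_person(estimate, days_needed))
```

Whichever single number the team picks, it is wrong by days for at least one of the two people, and the average is wrong for both.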

Estimating in ‘Complexity’ or ‘Amount of effort’?

Complexity and ‘amount of effort’ are popular concepts to use for estimating story points. They have the right idea, but I do not recommend the semantics of either of them.

The problem with Complexity. What do you do when you have a user story that is very simple (low complexity) but for some reason just takes a lot of time to do? Low estimation because it’s not complicated? Or high estimation because it takes a lot of work that can be expected to take a lot of time? A low estimation wouldn’t be accurate, because it sets the expectation that it’s a small amount of work and will be done quickly. A high estimation would not be estimating complexity anymore. However, if you do use the name ‘Complexity’, then be pragmatic and:
- Make the change easy. Refactor, optimize and/or automate the process so that it does not take a lot of time to implement a simple change. Now you can choose a low estimation. You had to perform extra work as overhead, and this will be visible as a slower burn rate for the sprint. On the upside, in future sprints where you need to make similar changes, the velocity should go up and reflect that you now have less technical debt.
- Decompose the item into small items that do not take a lot of time. For each smaller item, the low estimation will be accurate.
The problem with ‘amount of effort’. The amount of effort an item takes to complete can depend on external factors such as impediments, technical debt, experience level, etc. These are things you want to see reflected in the velocity, not in the estimations of user stories, because the product increment that you deliver with a story remains the same regardless of these types of external factors. Therefore, keep a user story’s estimation consistent. As your team removes impediments, reduces technical debt, and gains more experience, you want to see your velocity increase as empirical evidence that your team is improving.
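
The difference between consistent sizes and effort-based estimates can be sketched with a toy model. Everything here is an assumption made for illustration: a 10-day sprint, a pace expressed as days needed per point, and the helper names `points_velocity` and `time_total`.

```python
SPRINT_DAYS = 10  # assumed sprint length


def points_velocity(story_sizes):
    """Velocity when sizes stay consistent: the sum of completed story points."""
    return sum(story_sizes)


def time_total(story_sizes, days_per_point):
    """Sum of time estimates for the same stories at the team's current pace."""
    return sum(size * days_per_point for size in story_sizes)


# Hypothetical sprints: in sprint 2 the team has removed impediments
# and needs half as many days per point as in sprint 1.
sprint_1 = {"days_per_point": 2, "completed_sizes": [2, 3]}
sprint_2 = {"days_per_point": 1, "completed_sizes": [2, 3, 5]}

for name, sprint in [("sprint 1", sprint_1), ("sprint 2", sprint_2)]:
    sizes = sprint["completed_sizes"]
    print(name,
          "points:", points_velocity(sizes),
          "estimated days:", time_total(sizes, sprint["days_per_point"]))
```

In this model the point total doubles from one sprint to the next, while the estimated days add up to the sprint length both times, so only the consistent sizes make the improvement visible.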

Measure your velocity

But truthfully, what you call it is really not so important. Personally I prefer to use the word ‘size’, to avoid the problems listed above. What matters is that the whole team estimates their user stories in relation to each other, and that the estimations are detached from external factors such as impediments. This is the only way you allow yourself to measure your velocity accurately, and this is one of the core concepts of Scrum: continuous improvement based on empirical evidence. Measure, inspect and adapt. Use your velocity to identify whether your team is really improving. If you have a properly measured velocity, you can begin asking yourselves at every retrospective:

Are we going faster?
Why are(n’t) we going faster?
What can we do to go faster?
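
One way to bring the first question to a retrospective is to compare recent sprints with earlier ones. This is only a sketch: the function name `velocity_trend`, the three-sprint window and the velocity history are all assumptions, not something Scrum prescribes.

```python
from statistics import mean


def velocity_trend(velocities, window=3):
    """Mean velocity of the last `window` sprints minus the `window` before that."""
    if len(velocities) < 2 * window:
        raise ValueError("need at least 2 * window sprints of history")
    previous = mean(velocities[-2 * window:-window])
    recent = mean(velocities[-window:])
    return recent - previous


# Hypothetical story points completed per sprint, oldest first.
history = [18, 20, 19, 22, 23, 24]
change = velocity_trend(history)
print("change in average velocity:", change)
```

A positive result suggests the team is going faster. The number only says that something changed, so the second and third questions still need a conversation.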

I recommend that your team builds and maintains a reference sheet of items it is very familiar with, along with their value in story points. Always estimate in relation to these references. You will find that estimating becomes much faster when you always compare against the same references.
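
A reference sheet can be as simple as a lookup table. The sketch below is hypothetical throughout: the reference items, their sizes, the point scale and the helper `candidate_sizes` are invented for illustration. It turns a statement like “bigger than item Y but smaller than item X” into the sizes that remain possible.

```python
# Assumed point scale; pick whatever scale your team already uses.
POINT_SCALE = (1, 2, 3, 5, 8, 13, 20)

# Hypothetical reference sheet: familiar items and their agreed sizes.
REFERENCE_SHEET = {
    "Fix a typo in a label": 1,
    "Add a field to an existing form": 3,
    "New report with export": 8,
}


def candidate_sizes(bigger_than, smaller_than, sheet=REFERENCE_SHEET, scale=POINT_SCALE):
    """Return the scale values strictly between the sizes of two reference items."""
    low = sheet[bigger_than]
    high = sheet[smaller_than]
    if low >= high:
        raise ValueError("the 'bigger than' reference must be the smaller item")
    return [points for points in scale if low < points < high]


print(candidate_sizes("Add a field to an existing form", "New report with export"))
```

With these made-up references, a story that sits between the form change and the new report can only be a 5, which leaves the team little to debate.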

Summary

Story point estimation is the concept of abstractly assigning a ‘size’ value to a user story.
An estimation in story points should be relative, using other items as references.
Estimate the size without taking into account the amount of work/effort imposed by impediments, technical debt and lack of experience.
Be consistent, so that you can build a reliable measurement of velocity. Use velocity to analyze the team’s ability to continuously improve.