Key Takeaways
- Story points are a unit of measure for the overall effort, complexity, and risk of a user story — not a measure of time.
- Points are relative, not absolute: a 5-point story is roughly 2.5x the effort of a 2-point story.
- Story points are meaningful only within one team — never compare story points across different teams.
- Velocity (total points completed per sprint) enables predictable sprint planning and capacity forecasting.
Defining story points
A story point is an abstract unit of measure that expresses the overall effort required to implement a user story or any other piece of work. Rather than predicting how many hours or days a task will take, the team assigns a number that reflects the combined complexity, volume of work, and degree of uncertainty involved.
The key insight behind story points is that humans are remarkably bad at estimating absolute duration but surprisingly good at comparing relative sizes. We struggle to say “this task will take exactly 14 hours,” but we can confidently say “this task is about twice as hard as that other one we finished last sprint.” Story points leverage this cognitive strength.
The concept was popularized by Mike Cohn and has roots in Extreme Programming (XP), where the original term was “ideal days.” Over time, teams found that abstracting further, away from any time-based unit, produced more honest and more useful estimates. Today, story points are the default estimation unit for most Scrum teams around the world.
Story points are assigned during backlog refinement or sprint planning, often through a technique called planning poker. Each team member independently selects a point value, and if estimates diverge, the team discusses until they reach consensus. This process surfaces assumptions, uncovers hidden complexity, and builds shared understanding of the work ahead.
How story points work
Five core principles that make story points an effective estimation tool.
Relative sizing
Story points compare stories to each other rather than to clock time. A story worth 5 points is roughly two to three times the effort of a 2-point story. The team decides what each number means relative to their own experience.
Complexity + Effort + Risk
Points capture three dimensions at once. A story can be high-effort but low-risk, or low-effort but technically uncertain. The single number reflects the combined picture, making it easier to compare dissimilar work items.
Team-specific calibration
Story points are meaningful only within the team that assigned them. A team of senior engineers may rate a story as a 3 while a junior team rates similar work as an 8. Both are correct for their context.
Velocity tracking
Each sprint, the team totals the points of completed stories. Over time this velocity number stabilizes, giving the team a data-driven way to forecast how much they can deliver in upcoming sprints.
Predictable planning
Once velocity is established, product owners can estimate how many sprints a set of stories will take. Velocity trends enable forecasting without asking individuals to predict their personal output in hours.
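The velocity calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a standard formula: the function name `rolling_velocity`, the three-sprint window, and the sprint history are all assumptions made for the example.

```python
from statistics import mean


def rolling_velocity(completed_points, window=3):
    """Average the points completed over the most recent `window` sprints.

    `completed_points` lists the total points of finished stories per
    sprint, oldest first. The window size is an assumption; teams pick
    whatever span gives them a stable number.
    """
    if not completed_points:
        raise ValueError("need at least one completed sprint")
    recent = completed_points[-window:]
    return mean(recent)


# Hypothetical history: points completed in each of the last five sprints.
history = [21, 26, 24, 29, 25]
velocity = rolling_velocity(history, window=3)
print(velocity)  # 26, the average of 24, 29, and 25
```

Averaging only the most recent sprints keeps the number responsive to changes in team size or scope, while still smoothing out a single unusually good or bad sprint.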
Common story point scales
There is no single correct scale. Choose the one your team finds most intuitive.
Fibonacci sequence
The most popular scale: 1, 2, 3, 5, 8, 13, 21. The increasing gaps between numbers force the team to accept uncertainty at larger sizes rather than pretending to know the difference between a 14 and a 15.
Powers of 2
Uses 1, 2, 4, 8, 16, 32. The doubling pattern is easy to reason about — each step up means roughly twice the effort. Some teams find this simpler than Fibonacci because the progression is more regular.
T-shirt sizes mapped to numbers
XS = 1, S = 2, M = 3, L = 5, XL = 8, XXL = 13. T-shirt labels feel less precise, which can reduce debates. The underlying numbers still allow velocity tracking and sprint planning.
The Fibonacci sequence is the most popular choice because the gaps between numbers grow as the numbers increase. This mirrors real-world uncertainty: you can tell the difference between a 1-point task and a 2-point task, but the difference between a 13-point task and a 14-point task is meaningless. The Fibonacci scale forces you to bucket work into distinct sizes, which reduces pointless debate over small differences.
Whichever scale you choose, consistency is more important than the specific numbers. Once your team has a shared mental model of what each value represents, stick with it. Changing scales mid-project resets your velocity data and makes historical comparisons unreliable.
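To make the bucketing idea concrete, here is a small Python sketch of the three scales listed above. The helper `snap_to_scale` is hypothetical, and rounding a rough size up to the next bucket is an assumption chosen for illustration rather than a rule from any framework.

```python
FIBONACCI = [1, 2, 3, 5, 8, 13, 21]
POWERS_OF_TWO = [1, 2, 4, 8, 16, 32]
T_SHIRT = {"XS": 1, "S": 2, "M": 3, "L": 5, "XL": 8, "XXL": 13}


def snap_to_scale(rough_size, scale):
    """Round a rough relative size up to the next value on the scale."""
    for value in sorted(scale):
        if rough_size <= value:
            return value
    raise ValueError("larger than the biggest bucket; consider splitting the story")


# The gaps between neighbouring Fibonacci values grow with size,
# which is why a "14" has no bucket of its own.
gaps = [b - a for a, b in zip(FIBONACCI, FIBONACCI[1:])]
print(gaps)                              # [1, 1, 2, 3, 5, 8]
print(snap_to_scale(14, FIBONACCI))      # 21
print(snap_to_scale(6, POWERS_OF_TWO))   # 8
print(T_SHIRT["L"])                      # 5
```

In practice teams rarely compute a rough size first; the point of the sketch is only that anything between two buckets has to land in one of them.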
Story points vs hours
The most common question teams ask is: “Why not just estimate in hours?” The answer comes down to a well-studied cognitive bias called the planning fallacy. Research by Daniel Kahneman and Amos Tversky shows that people systematically underestimate the time needed for future tasks, even when they have experience with similar work. Hour estimates feel precise but are often wrong.
Story points sidestep this trap by shifting the question from “how long?” to “how big?” When a team says a story is 5 points, they are not committing to a timeline. They are saying it is roughly the same complexity as other 5-point stories they have completed. The velocity metric then translates that relative data into throughput over time — without requiring anyone to predict the future in hours.
That said, hours are not always wrong. They work well for teams doing repetitive, well-understood work where task durations are predictable. They also make sense when contracts require time-based billing or when stakeholders need hour-level granularity. For most product development teams, however, story points produce more reliable sprint plans and less estimation stress.
The worst approach is converting story points back to hours. This combines the cognitive overhead of both systems without the benefits of either. If your organization demands time estimates, use velocity-based forecasting: “Based on our velocity of 30 points per sprint, this 90-point epic will likely take three sprints.” That gives stakeholders the date range they need without forcing the team into hour-level guessing.
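A velocity-based forecast like the one quoted above is simple arithmetic. The Python sketch below turns it into a range by dividing the backlog by the best, average, and worst recent velocity. The function name `forecast_sprints` and the sample velocities are assumptions for illustration, and using the minimum and maximum as range bounds is just one simple choice.

```python
import math


def forecast_sprints(backlog_points, velocities):
    """Return (best_case, likely, worst_case) sprint counts for a backlog."""
    if not velocities or min(velocities) <= 0:
        raise ValueError("velocities must be non-empty and positive")
    average = sum(velocities) / len(velocities)
    best = math.ceil(backlog_points / max(velocities))
    likely = math.ceil(backlog_points / average)
    worst = math.ceil(backlog_points / min(velocities))
    return best, likely, worst


# Hypothetical recent sprints averaging 30 points, and a 90-point epic.
recent_velocities = [27, 30, 33]
best, likely, worst = forecast_sprints(90, recent_velocities)
print(f"Likely {likely} sprints (range {best} to {worst})")
# Likely 3 sprints (range 3 to 4)
```

Rounding up with `math.ceil` reflects that a partly finished epic still occupies a whole sprint on the calendar.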
How to assign story points
The first step is to choose a reference story. Pick a recently completed user story that the entire team understands well. Assign it a middle-range value — most teams use 3 or 5. This story becomes your anchor point. Every future estimate is a comparison: “Is this new story bigger, smaller, or about the same as our reference?”
Next, use planning poker to estimate collaboratively. Each team member independently selects a point value without seeing what others chose. Once everyone has voted, the cards are revealed simultaneously. If all votes agree (or are close), the team accepts that estimate and moves on. If there is a wide spread, the highest and lowest voters explain their reasoning, and the team votes again.
Planning poker works because it prevents anchoring bias — the tendency for the first number spoken to influence everyone else. Simultaneous voting ensures each person commits to their own assessment before hearing others. The subsequent discussion surfaces assumptions and risks that might otherwise go unnoticed.
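The reveal step of planning poker can be sketched as a small Python function. Everything here is illustrative: the name `poker_round`, the voters, and the threshold for "close" (votes within one step of each other on the scale) are assumptions, as is accepting the higher of two adjacent votes.

```python
FIBONACCI = [1, 2, 3, 5, 8, 13, 21]


def poker_round(votes, scale=FIBONACCI, max_spread=1):
    """Evaluate one simultaneous reveal.

    `votes` maps each voter to a value on the scale. Returns
    (estimate, explainers): an accepted estimate and an empty list when
    votes are close, or None plus the lowest and highest voters, who
    explain their reasoning before the team votes again.
    """
    # Compare positions on the scale, not raw numbers, so that
    # 8 vs 13 counts as one step apart just like 2 vs 3.
    positions = {name: scale.index(value) for name, value in votes.items()}
    low = min(positions, key=positions.get)
    high = max(positions, key=positions.get)
    if positions[high] - positions[low] <= max_spread:
        return scale[positions[high]], []
    return None, [low, high]


print(poker_round({"Ana": 3, "Ben": 5, "Chen": 3}))   # (5, [])
print(poker_round({"Ana": 2, "Ben": 13, "Chen": 5}))  # (None, ['Ana', 'Ben'])
```

The function takes all votes at once because the technique depends on a simultaneous reveal; collecting them one at a time in the open would reintroduce the anchoring it is meant to prevent.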
Over the first few sprints, your estimates will feel rough. That is completely normal. As the team builds a catalog of completed stories with known point values, new estimates become faster and more consistent. After three to five sprints, most teams can get through an entire backlog refinement session's worth of estimates in under an hour.
A practical tip: keep a reference sheet of three to five completed stories at different point values. When the team debates whether something is a 5 or an 8, pull up the reference stories and compare. This grounds the conversation in concrete examples rather than abstract feelings.
Common pitfalls to avoid
Story points work best when the team treats them as a planning tool, not a performance metric.
Inflating points
When velocity becomes a performance target, teams start inflating estimates to look more productive. This destroys the reliability of the metric. Velocity should be a planning tool, never a scorecard.
Cross-team comparison
Comparing velocity between teams is meaningless because each team calibrates points differently. A team with velocity 40 is not necessarily more productive than a team with velocity 20.
Converting to hours
Mapping story points back to hours defeats their purpose. The abstraction exists to free teams from time-based pressure. If stakeholders demand hours, provide date-range forecasts based on velocity instead.
Over-precision
Spending twenty minutes debating whether a story is a 5 or an 8 is a waste. If the team is split, go with the higher number and move on. The goal is rough consensus, not mathematical certainty.
The root cause of most pitfalls is treating velocity as a performance indicator. When managers use velocity to evaluate team productivity or compare teams, it creates perverse incentives. Teams inflate estimates, avoid risky work, and game the numbers. Velocity should be visible only to the team and the product owner, used solely for sprint planning and capacity forecasting.
Another common mistake is re-estimating stories after they are completed. If a 3-point story actually took much longer than expected, resist the urge to change it to an 8 retroactively. Instead, use the experience to inform future estimates. The whole point of velocity is that it absorbs estimation errors over time — as long as the team is consistent, the average smooths out the variance.
Frequently asked questions
What is a story point?
A story point is an abstract unit of measure that agile teams use to express the overall effort, complexity, and uncertainty involved in completing a user story. Rather than estimating in hours, the team assigns a relative number that compares the story to other work they have done before.
Why use story points instead of hours?
Story points remove the pressure of time-based estimation, account for varying skill levels across the team, and naturally incorporate uncertainty. Over multiple sprints, velocity (the average number of points completed) becomes a reliable planning metric without requiring precise time predictions.
How do you assign story points?
Start by choosing a well-understood story as a baseline (often assigned a 3 or 5). Then compare each new story to that reference: is it simpler, about the same, or more complex? Use planning poker to let each team member vote independently before discussing differences.
Can story points be compared across teams?
No. Story points are calibrated within a single team based on their own experience and velocity. A 5-point story for one team may represent very different work than a 5-point story for another team. Cross-team comparisons are unreliable and should be avoided.
What is a good velocity?
There is no universal number. Each team develops its own velocity over time, typically measured as a rolling average of the last three to five sprints. A new team should track their velocity for several sprints before using it for planning.