If you only quantify one thing, quantify the Cost of Delay.
Weighted Shortest Job First (WSJF) is a prioritization model used to sequence jobs (for example, Features, Capabilities, and Epics) to produce maximum economic benefit. In SAFe, WSJF is estimated as the Cost of Delay (CoD) divided by the job duration.
In a flow-based system, priorities must be continuously updated to provide the best economic outcomes. In such a system, it is the sequencing of jobs, rather than a theoretical return on investment calculated for each job individually, that produces the best results.
To that end, SAFe applies WSJF to prioritize backlogs. Backlogs are continuously prioritized based on a WSJF algorithm that uses relative user and business value, time criticality, risk reduction and/or opportunity enablement, and job size. WSJF also conveniently and automatically ignores sunk costs, a fundamental principle of Lean economics.
In Principles of Product Development Flow, Reinertsen describes a model (WSJF) for prioritizing jobs based on the Cost of Delay. Simply put, CoD is the money lost by delaying or not doing a job for a specific time. It’s a measure of the economic value of a job over time. For example, if implementing a prospective feature would be worth 100,000 per month, and there was a delay of three months, the total CoD would be 300,000. In the SAFe context, jobs are the Features, Capabilities, and Epics contained in their respective backlogs.
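The arithmetic in the example above can be sketched in a few lines. This is only an illustration: the function name `total_cost_of_delay` is hypothetical, and it assumes the value lost per month stays constant over the delay.

```python
def total_cost_of_delay(value_per_month, months_delayed):
    """Total CoD, assuming a constant value lost per month of delay."""
    return value_per_month * months_delayed


# The example from the text: 100,000 per month, delayed three months.
print(total_cost_of_delay(100_000, 3))  # 300000
```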
Jobs that can deliver the most value in the shortest duration provide the best economic return. As applied in SAFe, the WSJF model supports the economic principles of Lean product development flow:
- Taking an economic view
- Ignoring sunk costs
- Making financial choices continuously
- Using decision rules to decentralize decision-making and control
- If you only quantify one thing, quantify the Cost of Delay
Figure 1 shows the impact of applying Reinertsen’s WSJF for prioritizing jobs to be done. The blue-shaded areas illustrate the total CoD in each case. The jobs with the highest WSJF deliver the best economic outcomes. As the figure shows, ‘picking the next best job to do’ can have a dramatic financial impact.
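The effect of sequencing can be checked with a small simulation. The sketch below uses three hypothetical jobs (the names, CoD rates, and durations are invented for illustration, not taken from Figure 1) and assumes jobs are worked one at a time, with each job accruing its CoD until it is delivered.

```python
from itertools import permutations

# Hypothetical jobs: (name, cost of delay per week, duration in weeks).
JOBS = [("A", 10, 1), ("B", 3, 3), ("C", 1, 10)]


def total_delay_cost(sequence):
    """Sum the delay cost each job accrues until it is delivered."""
    elapsed = 0
    cost = 0
    for _name, cod_per_week, duration in sequence:
        elapsed += duration
        cost += cod_per_week * elapsed
    return cost


def wsjf_order(jobs):
    """Order jobs by CoD divided by duration, highest first."""
    return sorted(jobs, key=lambda job: job[1] / job[2], reverse=True)


best = wsjf_order(JOBS)
worst = list(reversed(best))
print([job[0] for job in best], total_delay_cost(best))    # ['A', 'B', 'C'] 36
print([job[0] for job in worst], total_delay_cost(worst))  # ['C', 'B', 'A'] 189

# For this data, no ordering has a lower total delay cost than the WSJF ordering.
assert total_delay_cost(best) == min(total_delay_cost(p) for p in permutations(JOBS))
```

With these made-up numbers, doing the highest-WSJF job first costs 36 units of delay, while the reverse order costs 189.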
Estimating the Cost of Delay
As described above, the calculation of WSJF assumes one can determine the CoD (numerator) in absolute financial terms per unit of time, and the job time can be estimated with some degree of accuracy. In practice, however, both numbers can be extremely difficult to estimate. With regard to the numerator, CoD is an estimate at best; it’s hard for anyone to know the actual value of a new job (a new feature) that has yet to be delivered to market. But Agile teaches us how to quickly estimate on a relative basis. Since there are many ‘jobs to be done’ in the backlog, simply use relative numbers to compare jobs.
The first task is to get the right stakeholders together and collectively estimate the Cost of Delay relative to other jobs in the backlog. Figure 2 illustrates the three primary components of CoD for any particular job.
Figure 3 shows the formula for the CoD. First, compare backlog items relative to each other using the same modified Fibonacci numbers as in ‘estimating poker.’ Then the (relative) CoD is calculated as follows:
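Figure 3 is not reproduced in this text. Based on the summing step in the instructions later in this article, the relative CoD can be written as the sum of its three components. This is a reconstruction from the surrounding text, not a copy of the figure:

```latex
\text{CoD} = \text{User-Business Value} + \text{Time Criticality} + \text{Risk Reduction and/or Opportunity Enablement}
```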
Estimating the Job Duration
The next item in the equation, the denominator of WSJF, is the job duration. This duration can also be challenging to determine, especially early on, when the available capacity and time needed for each job are unknown. In other words, before doing the work, it’s hard to know who will be working on it, how many people can be engaged, and how long it will take. However, since larger jobs take longer to complete than smaller ones, job size can be used as a good proxy for the duration. (As an analogy, if I’m the only one mowing my lawn, and the front yard is three times bigger than the back yard, it will take three times longer to cut.)
Using job size as a proxy for duration results in a straightforward calculation for comparing jobs via WSJF, as Figure 4 illustrates.
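Figure 4 is likewise not reproduced here. Consistent with the definition of WSJF as CoD divided by job duration, and with job size standing in for duration, the calculation can be written as follows (again a reconstruction from the text):

```latex
\text{WSJF} = \frac{\text{CoD}}{\text{Job Size}}
```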
A Note on Using Job Size as a Proxy for Duration
It’s important to note that job size is not a perfect proxy for job duration. Let’s consider two scenarios:
- Suppose specialty skills are readily available, enabling a large job with a high value to be delivered more quickly than expected. In that situation, it may provide more value in a shorter period. (If three people can mow my large front lawn while I do the small backyard, these jobs will have approximately the same duration but not the same value.)
- Scarce resources or dependencies might mean that a small job takes longer than a bigger one.
If either of these is the case, simply use the relative estimated duration and adjust accordingly. But rarely do we need to worry about these two exceptions. In most situations, fast, relative WSJF estimating is adequate. Since this is a flow-based system, minor errors in the selection are not that critical, as that next important job will rise to the top of the backlog soon enough.
The actual calculation and prioritization are more straightforward than the explanation that brings us to this point. Compare jobs (three features, in this example) for each CoD component and job size using a simple table or spreadsheet (Figure 5). As with estimating stories, the modified Fibonacci sequence reflects higher uncertainty when the numbers become larger. Specific instructions follow:
- Start by estimating the CoD components (user-business value, time criticality, risk reduction and/or opportunity enablement) in columns 1, 2, and 3, one column at a time, setting the smallest item to ‘1’. Then estimate the other jobs relative to that job. Note: it’s critical to ensure each column has a ‘1’ representing the smallest item, as that normalizes the parameters against each other.
- Sum each component to calculate the CoD in column 4.
- Estimate the job size in column 5; again, give the smallest job a 1.
- Calculate the WSJF in column 6 by dividing the CoD by job size.
The highest WSJF is the next most important job to do.
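The steps above can be sketched as a small script. The three features and their estimates below are hypothetical (they are not the values in Figure 5), and the `Job` class and its field names are invented for this illustration. Each estimate column contains a 1 for its smallest item, as the instructions require.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    user_business_value: int   # column 1
    time_criticality: int      # column 2
    risk_opportunity: int      # column 3: risk reduction and/or opportunity enablement
    job_size: int              # column 5

    @property
    def cost_of_delay(self):
        """Column 4: sum of the three CoD components."""
        return self.user_business_value + self.time_criticality + self.risk_opportunity

    @property
    def wsjf(self):
        """Column 6: CoD divided by job size."""
        return self.cost_of_delay / self.job_size


# Hypothetical relative estimates on the modified Fibonacci scale.
backlog = [
    Job("Feature A", 8, 1, 3, 5),
    Job("Feature B", 1, 5, 1, 1),
    Job("Feature C", 3, 13, 8, 2),
]

# Highest WSJF first: the top row is the next job to do.
ranked = sorted(backlog, key=lambda job: job.wsjf, reverse=True)
for job in ranked:
    print(f"{job.name}: CoD={job.cost_of_delay}, WSJF={job.wsjf:.1f}")
# Feature C: CoD=24, WSJF=12.0
# Feature B: CoD=7, WSJF=7.0
# Feature A: CoD=12, WSJF=2.4
```

In this made-up backlog, Feature A has the highest user-business value but ranks last, because its larger size and low time criticality pull its WSJF down.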
This model encourages splitting large jobs into smaller ones. Otherwise, critical big jobs might never get done. But that’s just Agile at work. Since the implementation is incremental, a different job will be selected whenever continuing work on a big job doesn’t rank well against its peers. Another advantage of SAFe’s WSJF model is that the specific monetary elements of CoD components are unnecessary, significantly reducing complexity and the time spent on prioritization. Instead, each job is compared relative to the other items from the same backlog. Since updated backlog estimates include only the remaining job size, frequent reprioritization means that the system will automatically ignore sunk costs. It will always pick the next best job based on current economic factors.
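The sunk-cost point can be illustrated with a minimal sketch. The job names and numbers are hypothetical, and `next_job` is an invented helper. The only inputs are the current relative CoD and the remaining job size, so effort already spent never enters the calculation.

```python
def wsjf(cost_of_delay, remaining_size):
    """WSJF based on the work still to be done, not the work already done."""
    return cost_of_delay / remaining_size


def next_job(backlog):
    """backlog maps job name -> (relative CoD, remaining relative size)."""
    return max(backlog, key=lambda name: wsjf(*backlog[name]))


# With 8 units of work remaining, the large job ranks below a newly arrived job.
print(next_job({"Migration": (20, 8), "Hotfix": (13, 3)}))  # Hotfix

# Once only 3 units remain, the same job ranks first.
print(next_job({"Migration": (20, 3), "Hotfix": (13, 3)}))  # Migration
```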
Applicability of WSJF
WSJF is a general algorithm that is particularly useful in flow-based systems where frequent reprioritization is a driver of economic value. But it doesn’t make decisions; it is simply a reasoning tool for use by stakeholders who must ultimately do so.
As presented in this article, WSJF is particularly useful for prioritizing features and capabilities in the ART and Solution Train backlogs. This is because:
- Features are the primary economic driver for trains; some investment in prioritization is warranted.
- There aren’t many features in flight at any one time; an ART backlog typically contains perhaps 100 or so features waiting for attention. It is generally straightforward to maintain rolling WSJF estimates over time.
- Since features are critical to ART performance, input from Product Management, Architects, Business Owners, affected teams, and other stakeholders is essential. It’s worth insisting on a collaborative effort to prioritize ART features.
But it’s not as well suited for use in a few other places, including:
- Team backlog prioritization – Stories are small, and there are a lot of them in flight. Here, priorities are driven by local concerns and by the priorities of the ART backlog features that spawned the stories. It just isn’t worth spending time on multi-parameter analysis and discussion. And since the stories are small and implemented in an iteration, the denominator (duration) is less of a priority determinant.
- As the sole determinant in Portfolio Epic Prioritization – The second area where WSJF is useful, but not totally adequate by itself, is in the later portfolio backlog prioritization steps. In the early kanban steps, WSJF is quite useful and is called out as such in the Portfolio Backlog article. But since these are substantial investments, a simple comparison of a ‘2 to a 5’ might have implications running in the millions or tens of millions of euros. In this case, more time should be invested in the estimates called for in the Lean business case, including value or potential monetary returns and better-informed speculation on likely duration. For more on using WSJF for Epics, see the Epic article.
Learn More
Knaster, Richard, and Dean Leffingwell. SAFe 5.0 Distilled: Achieving Business Agility with the Scaled Agile Framework. Addison-Wesley, 2020.
Leffingwell, Dean. Agile Software Requirements: Lean Requirements Practices for Teams, Programs, and the Enterprise. Addison-Wesley, 2011.
Reinertsen, Don. Principles of Product Development Flow: Second Generation Lean Product Development. Celeritas Publishing, 2009.
Last update: 15 December 2022