If you only quantify one thing, quantify the Cost of Delay.
Weighted Shortest Job First (WSJF) is a prioritization model used to sequence jobs (e.g., Features, Capabilities, and Epics) to produce maximum economic benefit. In SAFe, WSJF is estimated as the Cost of Delay (CoD) divided by job size.
In a flow-based system, updating priorities continuously provides the best economic outcomes. In such a flow context, it is job sequencing, rather than theoretical, individual job return on investment, that produces the best result.
To that end, SAFe applies WSJF to prioritize backlogs by calculating the relative Cost of Delay (CoD) and job size (a proxy for the duration). Backlog priorities are continuously updated based on relative user and business value, time factors, risk reduction and opportunity enablement, and relative job size. WSJF also conveniently and automatically ignores sunk costs, a fundamental principle of Lean economics.
Reinertsen describes a comprehensive model, called WSJF, for prioritizing jobs based on the economics of Lean product development flow. WSJF is calculated by dividing the Cost of Delay (CoD) by the duration. CoD is the money that will be lost by delaying or not doing a job for a period of time. For example, if a prospective feature would be worth $100,000 per month and its delivery were delayed by three months, the total CoD would be $300,000.
Jobs that can deliver the most value (or CoD) in the shortest duration provide the best economic return. As applied in SAFe, the model supports some additional principles of product development flow, including:
- Taking an economic view
- Ignoring sunk costs
- Making financial choices continuously
- Using decision rules to decentralize decision-making and control
- If you only quantify one thing, quantify the Cost of Delay
Figure 1 shows the impact of correctly applying Reinertsen’s WSJF. The areas shaded in blue illustrate the total CoD in each case. Doing the weighted shortest job first delivers the best economics, by a very large factor.
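The effect of sequencing can be sketched numerically. The snippet below is a minimal illustration, not Reinertsen's own calculation: the three jobs, their durations, and their CoD rates are hypothetical figures, and it assumes that jobs run one at a time and that delay cost accrues at each job's CoD rate only while the job waits to start.

```python
from typing import NamedTuple


class Job(NamedTuple):
    name: str
    duration: float      # time units needed to finish the job (hypothetical)
    cod_per_unit: float  # Cost of Delay per time unit (hypothetical)


def total_delay_cost(sequence: list[Job]) -> float:
    """Sum the delay cost each job accrues while waiting for the jobs ahead of it."""
    elapsed = 0.0
    total = 0.0
    for job in sequence:
        total += job.cod_per_unit * elapsed  # cost accrued while this job waited
        elapsed += job.duration
    return total


def wsjf_order(jobs: list[Job]) -> list[Job]:
    """Sort jobs by CoD divided by duration, highest first."""
    return sorted(jobs, key=lambda j: j.cod_per_unit / j.duration, reverse=True)


jobs = [Job("A", 1, 10), Job("B", 3, 3), Job("C", 10, 1)]

best = wsjf_order(jobs)
worst = list(reversed(best))

print([j.name for j in best], total_delay_cost(best))    # ['A', 'B', 'C'] 7.0
print([j.name for j in worst], total_delay_cost(worst))  # ['C', 'B', 'A'] 160.0
```

With these made-up numbers, the same three jobs accrue roughly 23 times more delay cost when the lowest-WSJF job goes first.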
(Note: As shown in Figure 1, Reinertsen uses the actual monetary values for the Cost of Delay and estimated length of job duration, whereas SAFe applies relative estimation using a modified Fibonacci sequence, described later in this article.)
Estimating the Cost of Delay
In SAFe, the ‘jobs’ are the Features, Capabilities, and Epics that live in their respective backlogs. However, since it can be challenging to determine the total cost of delay for things that haven’t ever been implemented, SAFe uses a relative proxy for CoD, which estimates each job’s CoD relative to other jobs in the backlog. Three primary components contribute to the CoD:
- User-business value – What is the relative value to the customer or business? Do our users prefer this over that? What is the revenue impact on our business? Is there a potential penalty or other negative effects if we delay?
- Time criticality – How does the user/business value decay over time? Is there a fixed deadline? Will they wait for us or move to another solution? Are there Milestones on the critical path impacted by this? What is the current effect on customer satisfaction?
- Risk reduction-opportunity enablement value – What else does this do for our business? Does it reduce the risk of this or a future delivery? Is there value in the information we will receive? Will this feature enable new business opportunities?
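Teams compare backlog items relative to each other using the same modified Fibonacci numbers as in ‘estimating poker.’ Then the (relative) CoD is calculated as the sum of the three components:
CoD = User-business value + Time criticality + Risk reduction-opportunity enablement value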
Estimating the Job Duration
The next item in the equation, the denominator in WSJF, is the job duration. That can also be pretty difficult to determine, especially early on, when it’s hard to say who will do the work or what capacity allocation can be applied. Fortunately, job size is a good proxy for the duration. (If I’m the only one mowing my lawn, and the front yard is three times bigger than the backyard, the front is going to take three times longer.) Using job size, we have a straightforward calculation for comparing jobs via WSJF, as Figure 3 illustrates.
Then, a simple table can be used to compare jobs (three features, in this case), as shown in Figure 4.
As with story estimating, the modified Fibonacci sequence is used as it better reflects the range of uncertainty in estimates as the size gets bigger. To use the table in Figure 4, the team estimates each feature relative to the others for each of the three components of CoD and job size. Start by reviewing one column at a time, setting the smallest item to a “one,” and then setting the others relative to that. Then calculate the CoD and divide it by job size. The job with the highest WSJF is the next most important job to do.
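The comparison table can also be expressed as a short calculation. The sketch below is illustrative only: the feature names and scores are hypothetical, the `BacklogItem` class is an invented helper, and it assumes the relative CoD is the sum of the three components and that every score comes from the modified Fibonacci scale (1, 2, 3, 5, 8, 13, 20, 40, 100), with the smallest item in each column set to 1.

```python
from dataclasses import dataclass

# Modified Fibonacci scale used for relative estimates
MODIFIED_FIBONACCI = {1, 2, 3, 5, 8, 13, 20, 40, 100}


@dataclass
class BacklogItem:
    name: str
    user_business_value: int
    time_criticality: int
    risk_reduction_opportunity_enablement: int
    job_size: int

    def __post_init__(self) -> None:
        # Reject scores that are not on the relative estimation scale
        for score in (self.user_business_value, self.time_criticality,
                      self.risk_reduction_opportunity_enablement, self.job_size):
            if score not in MODIFIED_FIBONACCI:
                raise ValueError(f"{self.name}: {score} is not on the scale")

    @property
    def cost_of_delay(self) -> int:
        # Relative CoD: sum of the three components
        return (self.user_business_value + self.time_criticality
                + self.risk_reduction_opportunity_enablement)

    @property
    def wsjf(self) -> float:
        return self.cost_of_delay / self.job_size


# Hypothetical scores; each column's smallest item is set to 1
features = [
    BacklogItem("Feature A", 8, 5, 1, 3),
    BacklogItem("Feature B", 1, 1, 2, 1),
    BacklogItem("Feature C", 5, 3, 13, 5),
]

ranked = sorted(features, key=lambda f: f.wsjf, reverse=True)
for item in ranked:
    print(f"{item.name}: CoD={item.cost_of_delay}, "
          f"size={item.job_size}, WSJF={item.wsjf:.2f}")
# Feature A: CoD=14, size=3, WSJF=4.67
# Feature C: CoD=21, size=5, WSJF=4.20
# Feature B: CoD=4, size=1, WSJF=4.00
```

Note that Feature C has the highest CoD but does not rank first, because its larger job size lowers its WSJF.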
This model encourages splitting large jobs into multiple smaller ones to compete against other smaller jobs. Otherwise, critical big jobs might never get done. But that’s just Agile at work. Since the implementation is incremental, a different job will be selected whenever a continuing job doesn’t rank well against its peers.
Another advantage of SAFe’s WSJF model is that the absolute value (money) of CoD components is not needed. Instead, teams rate the components of each item against the other items from the same backlog. Finally, as the updated backlog estimates include only the remaining job size, frequent reprioritization means that the system will automatically ignore sunk costs.
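A small sketch shows why re-estimating only the remaining work leaves sunk costs out of the ranking. The epic and its numbers are hypothetical, and subtracting completed work from a relative size is a simplifying assumption; in practice a team would re-estimate the remaining job size directly.

```python
def wsjf(cost_of_delay: float, remaining_job_size: float) -> float:
    """WSJF from relative CoD and the job size still left to do."""
    return cost_of_delay / remaining_job_size


# Hypothetical epic: originally sized 13, with 8 points of work already done.
original_size, completed = 13, 8
remaining = original_size - completed  # 5; the 8 already spent is a sunk cost

before = wsjf(cost_of_delay=20, remaining_job_size=original_size)
after = wsjf(cost_of_delay=20, remaining_job_size=remaining)
print(f"{before:.2f} -> {after:.2f}")  # 1.54 -> 4.00
```

The work already invested never enters the formula; only the CoD and the remaining size affect where the job ranks.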
Using Job Size as a Proxy for Duration
While we apply job size as a proxy for the duration, job size does not always make a good proxy. Let’s consider two scenarios:
- If the availability of specialty skills means that a larger job with a higher value may be delivered more quickly than would otherwise be the case, then it can be chosen because it provides more value in a shorter period. (If three people are available to mow my large front lawn while I do the small backyard, these jobs will have approximately the same duration but not the same value.)
- A small job may suffer from scarce resources or dependencies on other jobs, so it might take longer than a bigger job.
In these cases, jobs can be prioritized accordingly. But rarely do we need to worry about these two exceptions. In most situations, fast, relative WSJF estimating is adequate. Since this is a flow-based system, small errors in the selection are not that critical, as that next important job will rise to the top of the backlog soon enough.
Using Job Costs as a Proxy for Epic Duration
When estimated job costs are known, they may be a better proxy than estimated job size for the denominator in WSJF. These costs are often known in the later stages of the portfolio Kanban after the Lean business case has been created. As these big jobs are selected for implementation, a more refined WSJF is warranted, using estimated costs as the proxy or, better yet, an estimate of the duration.
When estimated job costs are used for the denominator of WSJF, normalizing the cost of these epics simplifies the math. To do this, give the lowest estimated epic cost a ‘1.0,’ then divide the cost of subsequent epics by the lowest value (e.g., 1.5 / 0.5 = 3.0), as shown in Figure 5 below.
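The normalization step can be sketched as follows. The epic names, estimated costs, and relative CoD scores are hypothetical, and `normalize_costs` is an illustrative helper rather than part of any SAFe tooling.

```python
def normalize_costs(costs: dict[str, float]) -> dict[str, float]:
    """Scale each cost so the lowest estimated cost becomes 1.0."""
    lowest = min(costs.values())
    return {name: cost / lowest for name, cost in costs.items()}


# Hypothetical estimated epic costs (e.g., in millions) and relative CoD scores
estimated_cost = {"Epic A": 0.5, "Epic B": 1.5, "Epic C": 1.0}
relative_cod = {"Epic A": 8, "Epic B": 13, "Epic C": 20}

normalized = normalize_costs(estimated_cost)
print(normalized)  # {'Epic A': 1.0, 'Epic B': 3.0, 'Epic C': 2.0}

# WSJF with normalized cost as the denominator, highest first
wsjf = {name: relative_cod[name] / normalized[name] for name in normalized}
for name, score in sorted(wsjf.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

Dividing by the lowest cost keeps the denominators on a small relative scale, comparable to the relative job sizes used elsewhere.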
(Note: If you have good monetary estimates for CoD, use them for the numerator for all epics being prioritized. Likewise, if you have good estimates for the duration, use those instead of a proxy variable.)
Learn More
Leffingwell, Dean. Agile Software Requirements: Lean Requirements Practices for Teams, Programs, and the Enterprise. Addison-Wesley, 2011.
Reinertsen, Don. Principles of Product Development Flow: Second Generation Lean Product Development. Celeritas Publishing, 2009.
Last update: 10 February 2021