If you only quantify one thing, quantify the Cost of Delay.

—Don Reinertsen

 

Weighted Shortest Job First

Weighted Shortest Job First (WSJF) is a prioritization model used to sequence jobs (e.g., Features, Capabilities, and Epics) to produce the maximum economic benefit.

In a flow-based system, priorities are updated continuously to provide the best economic outcomes. Job sequencing, rather than theoretical, individual job return on investment, produces the best result.

To that end, WSJF is used to prioritize backlogs by calculating the relative Cost of Delay (CoD) and job size (a proxy for the duration). Backlog priorities are continuously updated based on relative user and business value, time factors, risk reduction and opportunity enablement, and relative job size. WSJF also conveniently and automatically ignores sunk costs, a fundamental principle of Lean economics.

Details

Reinertsen describes a comprehensive model, called WSJF, for prioritizing jobs based on the economics of Lean product development flow [2]. WSJF is calculated by dividing the CoD by the duration. CoD is the money that will be lost by delaying or not doing a job for a period of time. For example, if a prospective feature would be worth $100,000 per month, and there was a delay of three months, the CoD would be $300,000.

Jobs that can deliver the most value (or CoD) in the shortest duration provide the best economic return. As applied in SAFe, the model supports some additional principles of product development flow, including:

  • Taking an economic view
  • Ignoring sunk costs
  • Making financial choices continuously
  • Using decision rules to decentralize decision-making and control
  • If you only quantify one thing, quantify the Cost of Delay

Figure 1 shows the impact of correctly applying Reinertsen’s WSJF. The areas shaded in blue illustrate the total CoD in each case. Doing the weighted shortest job first delivers the best economics.

Note: As shown in Figure 1, Reinertsen [2] uses the actual monetary values for the Cost of Delay and estimated length of job duration, whereas SAFe applies relative estimation using a modified Fibonacci sequence, described later in this article.

Figure 1. Applying the WSJF algorithm delivers the best overall economics
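
The effect illustrated in Figure 1 can be checked with a little arithmetic. The following is a minimal sketch, not Reinertsen's or SAFe's own calculation: the three jobs, their durations, and their CoD rates are hypothetical, and it assumes that jobs run one at a time and that each job accrues its CoD at a constant monthly rate until it is delivered.

```python
from itertools import permutations

# Hypothetical jobs: (name, duration in months, cost of delay in $ per month).
JOBS = [
    ("A", 1, 10_000),
    ("B", 3, 3_000),
    ("C", 10, 1_000),
]


def total_delay_cost(sequence):
    """Sum the CoD every job accrues from month 0 until it is delivered."""
    elapsed = 0
    total = 0
    for _name, duration, cod_rate in sequence:
        elapsed += duration          # month in which this job is delivered
        total += cod_rate * elapsed  # CoD accrued while waiting and in progress
    return total


def wsjf_order(jobs):
    """Sort jobs by CoD divided by duration, highest ratio first."""
    return sorted(jobs, key=lambda job: job[2] / job[1], reverse=True)


best = wsjf_order(JOBS)
worst = list(reversed(best))
cheapest = min(total_delay_cost(p) for p in permutations(JOBS))

print([name for name, _, _ in best])   # ['A', 'B', 'C']
print(total_delay_cost(best))          # 36000
print(total_delay_cost(worst))         # 189000
print(cheapest == total_delay_cost(best))  # True
```

With these numbers, the WSJF sequence has the lowest total delay cost of all six possible orderings, and the reverse sequence has the highest.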

Estimating the Cost of Delay

In SAFe, the ‘jobs’ are the Features, Capabilities, and Epics that live in their respective backlogs. However, since it can be challenging to determine the total cost of delay for things that have never been implemented, SAFe uses a proxy for CoD that is estimated relative to other jobs in the backlog. Three primary components contribute to the CoD:

  • User-business value – What is the relative value to the customer or business? Do our users prefer this over that? What is the revenue impact on our business? Is there a potential penalty or other negative effects if we delay?
  • Time criticality – How does the user/business value decay over time? Is there a fixed deadline? Will they wait for us or move to another solution? Are there Milestones on the critical path impacted by this? What is the current effect on customer satisfaction?
  • Risk reduction-opportunity enablement value – What else does this do for our business? Does it reduce the risk of this or a future delivery? Is there value in the information we will receive? Will this feature enable new business opportunities?

Teams compare backlog items relative to each other using the modified Fibonacci numbers used in ‘estimating poker.’ Then the relative CoD is calculated as follows:

Figure 2. Calculating relative CoD
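
Written out as a formula, the relative CoD in SAFe's formulation is the sum of the three component scores listed above. The abbreviations UBV, TC, and RROE are shorthand introduced here for readability and are not SAFe notation:

```latex
\[
\text{CoD} = \text{UBV} + \text{TC} + \text{RROE}
\]
```

Here UBV is the user-business value score, TC is the time criticality score, and RROE is the risk reduction-opportunity enablement score, each estimated relative to the other jobs in the same backlog.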

Estimating the Job Duration

The next factor is the job duration. That can also be pretty difficult to determine, especially early on, when it’s hard to say who will do the work or what capacity allocation can be applied. Fortunately, job size is a good proxy for the duration. (If I’m the only one mowing my lawn, and the front yard is three times bigger than the backyard, the front is going to take three times longer.) Taking job size, we have a straightforward calculation for comparing jobs via WSJF, as Figure 3 illustrates.

Figure 3. A formula for WSJF

Then we can create a simple table to compare jobs (three features, in this case), as shown in Figure 4.

Figure 4. A table for calculating WSJF

As with story estimating, the modified Fibonacci sequence is used as it better reflects the range of uncertainty in estimates as the size gets bigger. To use the table in Figure 4, the team estimates each feature relative to the others for each of the three components of CoD and job size. Start by reviewing one column at a time, setting the smallest item to a “one,” and then setting the others relative to that. Then calculate the CoD and divide it by the job size. The job with the highest WSJF is the next most important job to do.
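
A table like this is easy to reproduce in code. The sketch below is illustrative only: the feature names and scores are hypothetical rather than taken from Figure 4, the `Job` class is invented for this example, and it assumes that the relative CoD is the sum of the three component scores.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    user_business_value: int
    time_criticality: int
    risk_opportunity: int
    job_size: int

    @property
    def cost_of_delay(self) -> int:
        # Assumed here: relative CoD is the sum of the three components.
        return (self.user_business_value
                + self.time_criticality
                + self.risk_opportunity)

    @property
    def wsjf(self) -> float:
        return self.cost_of_delay / self.job_size


# Hypothetical scores on the modified Fibonacci scale (1, 2, 3, 5, 8, 13, 20, ...).
# In every column the smallest item is scored 1 and the rest are set relative to it.
backlog = [
    Job("Feature A", 8, 5, 1, 13),   # CoD 14, WSJF about 1.08
    Job("Feature B", 3, 1, 2, 1),    # CoD 6,  WSJF 6.0
    Job("Feature C", 1, 13, 5, 5),   # CoD 19, WSJF 3.8
]

# The highest WSJF is the next most important job to do.
ranked = sorted(backlog, key=lambda job: job.wsjf, reverse=True)

for job in ranked:
    print(f"{job.name}: CoD={job.cost_of_delay} size={job.job_size} WSJF={job.wsjf:.2f}")
```

In this example, Feature C has the highest CoD, yet Feature B ranks first because it is so much smaller.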

This model encourages splitting large jobs into multiple smaller ones to compete against other smaller jobs. Otherwise, critical big jobs might never get done. But that’s just Agile at work. Since the implementation is incremental, a different job will be selected whenever a continuing job doesn’t rank well against its peers.

Another advantage of SAFe’s WSJF model is that the actual value of CoD components is not needed. Instead, teams rate the components of each item against the other items from the same backlog. Finally, as the updated backlog estimates include only the remaining job size, frequent reprioritization means that the system will automatically ignore sunk costs.
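
The way reprioritization ignores sunk costs can be seen in a small sketch. The jobs and numbers below are hypothetical. The point is that the function takes only the remaining job size, so effort already spent never enters the calculation.

```python
def wsjf(cost_of_delay: float, remaining_size: float) -> float:
    """WSJF from the work still to be done; effort already spent is not an input."""
    if remaining_size <= 0:
        raise ValueError("remaining_size must be positive")
    return cost_of_delay / remaining_size


# Hypothetical Job X: relative CoD 10, originally sized at 20.
before_start = wsjf(10, 20)   # 0.5

# After 15 units of work are done, only 5 remain, so its WSJF rises.
after_progress = wsjf(10, 5)  # 2.0

# Re-rank the backlog using remaining sizes only.
backlog = {
    "Job X (in progress)": wsjf(10, 5),  # 2.0
    "Job Y": wsjf(8, 5),                 # 1.6
    "Job Z": wsjf(13, 5),                # 2.6
}
next_job = max(backlog, key=backlog.get)
print(next_job)  # Job Z
```

Job X outranks Job Y because little work remains, not because 15 units were already invested. Job Z still ranks above it, since the earlier investment carries no weight.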

Using Job Size as a Proxy for Duration

Job size does not always make a good proxy for the WSJF duration. Let’s consider two scenarios:

  • If the availability of specialty skills means that a larger job with a higher value may be delivered more quickly than would otherwise be the case, then it can be chosen because it provides more value in a shorter period. (If three people are available to mow my large front lawn while I do the small backyard, these jobs will have approximately the same duration but not the same value.)
  • A small job may depend on scarce resources or on other jobs, so it might take longer than a bigger job.

But rarely do we need to worry about these two exceptions. In most situations, fast, relative WSJF estimating is adequate. Since this is a flow-based system, small errors in the selection are not that critical, as that next important job will rise to the top of the backlog soon enough.

Using Job Costs as a Proxy for Epic Duration

When estimated job costs are known, they may be a better proxy than estimated job size for the denominator in WSJF. These costs are often known in the later stages of the portfolio Kanban after the Lean business case has been created. As these big jobs are selected for implementation, a more refined WSJF is warranted, using estimated costs as the proxy or, better yet, an estimate of the duration.

When estimated job costs are used for the denominator of WSJF, normalizing the cost of these epics simplifies the math. To do this, give the lowest estimated epic cost a ‘1.0,’ then divide the cost of subsequent epics by the lowest value (e.g., 1.5 / 0.5 = 3.0), as shown in Figure 5 below.

Figure 5. Applying normalized cost for the denominator of WSJF
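
The normalization step can be sketched as follows. The epic names, costs, and relative CoD scores are hypothetical and are not the values from Figure 5. The costs are assumed to be in a single common unit, such as millions of dollars.

```python
def normalize_costs(costs: dict[str, float]) -> dict[str, float]:
    """Divide every cost by the lowest one, so the cheapest epic becomes 1.0."""
    lowest = min(costs.values())
    return {name: cost / lowest for name, cost in costs.items()}


# Hypothetical estimated epic costs (same unit for all epics).
epic_costs = {"Epic A": 0.5, "Epic B": 1.5, "Epic C": 1.0}

# Hypothetical relative CoD scores for the same epics.
epic_cod = {"Epic A": 8, "Epic B": 13, "Epic C": 20}

normalized = normalize_costs(epic_costs)
print(normalized)  # {'Epic A': 1.0, 'Epic B': 3.0, 'Epic C': 2.0}

# Use the normalized cost as the WSJF denominator.
wsjf = {name: epic_cod[name] / normalized[name] for name in epic_costs}
ranked = sorted(wsjf, key=wsjf.get, reverse=True)
print(ranked)  # ['Epic C', 'Epic A', 'Epic B']
```

Normalizing does not change the ranking, because every denominator is divided by the same constant. It only keeps the numbers on a scale comparable to the relative CoD scores.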

Note: If you have good monetary estimates for CoD, use them for the numerator for all epics being prioritized. Likewise, if you have good estimates for the duration, use them instead of a proxy variable.


Learn More

[1] Leffingwell, Dean. Agile Software Requirements: Lean Requirements Practices for Teams, Programs, and the Enterprise. Addison-Wesley, 2011.

[2] Reinertsen, Don. Principles of Product Development Flow: Second Generation Lean Product Development. Celeritas Publishing, 2009.

Last update: 2 November 2020