If you only quantify one thing, quantify the Cost of Delay.

—Don Reinertsen

 

Weighted Shortest Job First

Weighted Shortest Job First (WSJF) is a prioritization model used to sequence jobs (e.g., Features, Capabilities, and Epics) to produce maximum economic benefit. In SAFe, WSJF is estimated as the Cost of Delay (CoD) divided by job size.

Agile Release Trains (ARTs) provide an ongoing, continuous flow of work that makes up the Enterprise’s incremental development effort. This flow avoids the overhead and delays caused by the start-stop-start nature of traditional projects, where authorizations and phase gates control the program and its economics.

While this continuous flow model speeds the delivery of value and keeps the system Lean, priorities must be updated continuously to provide the best economic outcomes. In a flow-based system, sequencing jobs well, rather than estimating a theoretical return on investment for each individual job, produces the best result. To that end, WSJF is used to prioritize backlogs by calculating the relative CoD and job size (a proxy for the duration). Applying WSJF at Program Increment boundaries keeps backlog priorities current with user and business value, time factors, risk, opportunity enablement, and effort. WSJF also automatically ignores sunk costs, a fundamental principle of Lean economics.

Details

Reinertsen describes a comprehensive model, called weighted shortest job first, for prioritizing jobs based on the economics of product development flow [2]. Calculate WSJF by dividing the CoD by the duration. Jobs that deliver the most value (that is, have the highest CoD) in the shortest time are selected first for implementation. When applied in SAFe, the model supports some additional principles of product development flow, including:

  • Taking an economic view
  • Ignoring sunk costs
  • Making financial choices continuously
  • Using decision rules to decentralize decision-making and control
  • If you only quantify one thing, quantify the Cost of Delay

Figure 1 shows the impact of correctly applying WSJF (see [2] for a full discussion).

The areas shaded in blue illustrate the total CoD in each case. Doing the weighted shortest job first delivers the best economics.
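
To make the effect of sequencing concrete, here is a minimal Python sketch. The three jobs and their numbers are hypothetical, and the sketch assumes each job accrues its CoD for every week until it is delivered; the figure may account for delay slightly differently, but the ordering conclusion is the same.

```python
from itertools import permutations

# Hypothetical jobs: name -> (cost of delay per week, duration in weeks)
JOBS = {"A": (10, 1), "B": (3, 3), "C": (1, 10)}

def total_delay_cost(order, jobs=JOBS):
    """Sum each job's CoD rate times the weeks elapsed until it is delivered."""
    elapsed = 0
    cost = 0
    for name in order:
        cod, duration = jobs[name]
        elapsed += duration          # this job finishes after all earlier ones
        cost += cod * elapsed        # delay cost accrued while waiting and running
    return cost

def wsjf_order(jobs=JOBS):
    """Sort job names by CoD / duration, highest ratio first."""
    return sorted(jobs, key=lambda n: jobs[n][0] / jobs[n][1], reverse=True)

# Compare every possible sequence against the WSJF sequence.
for order in permutations(JOBS):
    print("".join(order), total_delay_cost(order))
print("WSJF order:", "".join(wsjf_order()))
```

With these assumed numbers, the WSJF sequence (A, B, C) has the lowest total delay cost of all six orderings, and the reverse sequence (C, B, A) is the most expensive.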

Calculating the Cost of Delay

In SAFe, our jobs are the Epics, Features, and Capabilities we develop, so we need to establish both the Cost of Delay and the duration for each. Three primary elements contribute to the Cost of Delay:

  • User-business value – Do our users prefer this over that? What is the revenue impact on our business? Are there potential penalties or other adverse consequences if we delay?
  • Time criticality – How does the user/business value decay over time? Is there a fixed deadline? Will they wait for us or move to another solution? Are there Milestones on the critical path impacted by this?
  • Risk reduction-opportunity enablement value – What else does this do for our business? Does it reduce the risk of this or a future delivery? Is there value in the information we will receive? Will this feature open up new business opportunities?

Moreover, since we are in a continuous flow and should have a large enough backlog to choose from, we needn’t worry about the absolute numbers. We can just compare backlog items relative to each other using the modified Fibonacci numbers we use in ‘estimating poker.’ Then the relative CoD is calculated as follows:

Figure 2. The relative CoD
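
As a rough illustration, the relative CoD can be sketched as the sum of the three component ratings. The function name, parameter names, and the exact set of allowed scores below are assumptions made for this example, not part of the model's definition.

```python
# Assumed scoring scale: the low end of the modified Fibonacci sequence
# used in estimating poker. Adjust to whatever scale your teams use.
MODIFIED_FIBONACCI = (1, 2, 3, 5, 8, 13, 20)

def relative_cod(user_business_value, time_criticality, risk_opportunity):
    """Return the relative Cost of Delay as the sum of the three ratings."""
    for score in (user_business_value, time_criticality, risk_opportunity):
        if score not in MODIFIED_FIBONACCI:
            raise ValueError(f"{score} is not on the assumed scoring scale")
    return user_business_value + time_criticality + risk_opportunity

print(relative_cod(8, 5, 3))
```

Because the ratings are relative, the resulting number only has meaning when compared with other items scored against the same backlog.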

Duration

Next, we need to understand job duration. That can be difficult to determine, especially early on, when we might not know who is going to do the work or the capacity allocation for the teams. Fortunately, we have a ready proxy: job size. In systems with fixed resources, job size is a good proxy for the duration. (If I’m the only one mowing my lawn, and the front yard is three times bigger than the backyard, it’s going to take three times longer.) We also already know how to estimate item size in Story points (see Features). Taking job size, we have a reasonably straightforward calculation for comparing jobs via WSJF, as Figure 3 illustrates.

Figure 3. A formula for WSJF

Then, for example, we can create a simple table to compare jobs (three jobs, in this case), as shown in Figure 4.

Figure 4. A sample spreadsheet for calculating WSJF

To use the table in Figure 4, the team rates each Feature relative to others for each of the three parameters. (Note: With relative estimating, you look at one column at a time, set the smallest item to a “one,” and then set the others relative to that.) Sum the three ratings to get the CoD, then divide the CoD by job size. The job with the highest WSJF is the next most important item to do.
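
The same table can be sketched in a few lines of Python. The feature names and ratings below are hypothetical, chosen so that each column has a smallest item rated one; the class and field names are illustrative only.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    user_business_value: int
    time_criticality: int
    risk_opportunity: int
    size: int

    @property
    def cod(self):
        """Relative Cost of Delay: sum of the three ratings."""
        return self.user_business_value + self.time_criticality + self.risk_opportunity

    @property
    def wsjf(self):
        """Relative CoD divided by job size."""
        return self.cod / self.size

# Hypothetical ratings for three features.
jobs = [
    Job("Feature A", 5, 3, 1, 8),
    Job("Feature B", 1, 1, 2, 1),
    Job("Feature C", 8, 5, 5, 3),
]

# Highest WSJF first: that job is the next most important item to do.
ranked = sorted(jobs, key=lambda j: j.wsjf, reverse=True)
for job in ranked:
    print(f"{job.name:<10} CoD={job.cod:>2} size={job.size:>2} WSJF={job.wsjf:.2f}")
```

In this made-up example, Feature C ranks first even though it is not the smallest job, because its CoD is high relative to its size. Feature A, the largest job, ranks last.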

This model encourages splitting large items into multiple smaller ones that can compete against other smaller, low-risk items. But that’s just Agile at work. Since the implementation is incremental, whenever a continuing job doesn’t rank well against its peers, you have likely satisfied that particular requirement sufficiently and can move on to the next one.

As we have described, another advantage of the model is that it is not necessary to determine the absolute value of any of these numbers. Instead, you only need to rate the parameters of each item against the other items from the same backlog. Finally, because backlog estimates should include only the remaining job size, frequent reprioritization means that the system will automatically ignore sunk costs.
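
A small sketch of the sunk-cost point, using hypothetical numbers: a job originally sized at 13 points has had 8 points of work completed, so only the remaining 5 points enter the calculation at the next reprioritization. The CoD is assumed to be unchanged between the two calculations.

```python
def wsjf(cod, remaining_size):
    """WSJF computed from relative CoD and the remaining job size only."""
    return cod / remaining_size

# Hypothetical job: relative CoD of 13, original size 13, 8 points already done.
cod = 13
original_size = 13
completed = 8

before = wsjf(cod, original_size)             # priority when the job started
after = wsjf(cod, original_size - completed)  # priority at reprioritization

print(f"WSJF at start: {before:.2f}, WSJF with remaining size: {after:.2f}")
```

The effort already spent never appears in the formula. Only what is left to do affects the ranking, which is why a partly finished job tends to rise in priority as long as its CoD holds.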

A Note on Job Size as a Proxy for Duration

Job size is not always a good proxy for duration in the WSJF calculation. For example:

  • If availability of resources means that a larger job may be delivered more quickly than some other item with about equal value, then we probably know enough about the work to use estimated duration for a more accurate result. (If three people are available to mow my front lawn, while I do the back, then these items may have about the same duration, but not the same cost.)
  • A small job may have multiple dependencies on other work and may take longer than a bigger item.

But we rarely need to worry about these edge cases: if there is some small error in selection, the next most important job will make its way up soon enough.
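
The lawn example can be sketched as follows. The CoD values are hypothetical, and the sketch assumes the work divides evenly among the people available, which real jobs often do not.

```python
# Hypothetical data: equal CoD, front lawn three times the size of the back,
# three people available for the front and one for the back.
jobs = {
    "front lawn": {"cod": 5, "size": 3, "people": 3},
    "back lawn": {"cod": 5, "size": 1, "people": 1},
}

def duration(job):
    """Estimated duration, assuming work splits evenly across people."""
    return job["size"] / job["people"]

# WSJF using job size as the denominator versus estimated duration.
by_size = {name: job["cod"] / job["size"] for name, job in jobs.items()}
by_duration = {name: job["cod"] / duration(job) for name, job in jobs.items()}

print("By size:    ", by_size)
print("By duration:", by_duration)
```

Using size, the back lawn looks like the clear priority. Using estimated duration, the two jobs tie, which matches the observation that they take about the same time but do not cost the same.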


Learn More

[1] Leffingwell, Dean. Agile Software Requirements: Lean Requirements Practices for Teams, Programs, and the Enterprise. Addison-Wesley, 2011.

[2] Reinertsen, Don. Principles of Product Development Flow: Second Generation Lean Product Development. Celeritas Publishing, 2009.

Last update: 17 November 2017