[kcdc 2025] The problems that arise when focusing on predictability instead of variability

Speaker: Joel Tosi

@joelstosi@mastodon.social

Deck

For more see the table of contents


General

  • Suppose a team does 5-15 days per story. What is normal?
  • Want to see variability/stop hiding it.
  • Can’t use number of story points to guarantee delivery date

Process behavior chart

  • Shewhart chart. Goal is to differentiate variation due to common causes and special causes
  • Understand variability; don’t hide this
  • If you do nothing, a system will deliver results within a given range.
  • Stable system may have more variability than you would like, but is stable
  • If stretch goals outside range, won’t happen
  • Helps make better decisions
  • Two teams with same average have different stories and different problems
  • Can figure out max/min. Just because max happened once, doesn’t make it repeatable
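
A minimal sketch of how the expected range could be computed. It assumes the limits are the average plus or minus three standard deviations, as described in the take at the end of these notes. The team data, `process_limits`, and `signals` are hypothetical names and numbers made up for illustration; they are not from the talk.

```python
from statistics import mean, pstdev


def process_limits(cycle_times, sigmas=3.0):
    """Return (lower, center, upper) limits for a list of durations in days."""
    center = mean(cycle_times)
    spread = sigmas * pstdev(cycle_times)
    # Durations cannot be negative, so clamp the lower limit at zero.
    return max(0.0, center - spread), center, center + spread


def signals(cycle_times, limits):
    """Return the points outside the limits (candidate special causes)."""
    lower, _, upper = limits
    return [x for x in cycle_times if x < lower or x > upper]


# Hypothetical data: days per story for two teams with the same average.
team_a = [9, 10, 11, 10, 9, 11, 10, 10]
team_b = [5, 15, 6, 14, 5, 15, 7, 13]

for name, data in (("A", team_a), ("B", team_b)):
    lower, center, upper = process_limits(data)
    print(f"Team {name}: average {center:.1f}, "
          f"expected range {lower:.1f} to {upper:.1f} days")
```

Both teams average 10 days, but team B's range is far wider, so the same delivery promise carries very different risk. Calling `signals(new_points, process_limits(baseline))` flags points outside the baseline range; anything inside it is ordinary variation rather than a reason to react.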

Anti patterns

  • Always add more planning
  • Pre-planning meeting to planning
  • Tampering – make random decisions based on observable patterns without understanding why
  • Stretch goals are a lie.
  • If can’t test, adding devops doesn’t help
  • Adding points for silly activities so it looks like less variability – illusion of progress
  • Meetings as an attempt to fix variability. Some meetings give the illusion of control

Negative implications

  • Variability in different things compounds
  • Negative feedback – pressure to do more work -> start more stuff -> more branches -> stability goes down -> quality goes down -> more work (fix quality and still do the features)
  • Gets worse – add people since too much work. If can’t ship 1 thing in a month, try for 5 things in 3 months. Plan further out, make teams think of more things.
  • When we don’t see the system, we operate with the best of intention in the worst ways
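
The first point above, that variability compounds, can be illustrated with a small sketch. The stage names and durations are hypothetical, and the stages are assumed to be independent with each listed duration equally likely; real delivery stages are rarely that tidy.

```python
from itertools import product
from statistics import pstdev

# Hypothetical stages, each listing equally likely durations in days.
stages = {
    "development": [5, 10, 15],
    "testing": [2, 4, 6],
    "release": [1, 3, 5],
}


def total_durations(stages):
    """Every possible end-to-end duration, assuming independent stages."""
    return [sum(combo) for combo in product(*stages.values())]


totals = total_durations(stages)
for name, days in stages.items():
    print(f"{name}: {min(days)} to {max(days)} days, "
          f"std dev {pstdev(days):.2f}")
print(f"end to end: {min(totals)} to {max(totals)} days, "
      f"std dev {pstdev(totals):.2f}")
```

Under these assumptions the end-to-end spread is wider than that of any single stage, because the variances of the stages add up.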

Decisions that reduce options

  • ex: Orgs want autonomous interconnected teams but make decisions that prevent it
  • Want experienced people but can’t pay
  • Get junior engineers in very different time zones
  • Now can’t meet and work together

Context

  • “It depends” – variables, context
  • Context includes company, industry, experience, people. Cumulative effect of decisions company has made to that point

Illusions of Progress

  • Stories – “story for standup because takes time”
  • Backlogs – multiple backlogs, hard to trace/see big pictures
  • Branches – one person working on multiple branches. No actual progress even though committing
  • Tests – flakey/brittle
  • Scheduling – scheduling tetris. “If that team does X by that day then that team can do ….”
  • Priorities – should be a priority
  • Not measuring impact, or measuring the wrong thing – need to measure what matters

Outliers

  • Be careful of perceived outliers. Could be looking at the wrong level of the system
  • If there are a lot of outages and each has a different reason, they might be predictable if you look at them a different way.
  • Major releases
  • Security
  • Cost of delivery

Other notes

  • Reality is interconnected and non-linear
  • One choice is not an option. That’s not a strategy. Jerry Weinberg – rule of 3. If don’t have three options, haven’t thought about it enough.
  • Experiment early when cheap and easy. Minimize variability after decide.
  • To have zero variability, nothing can change: requirements perfect, codebase perfect, never change tech, can’t learn, can’t innovate, etc.

What to do

  • Stop hiding variability
  • Start measuring variability
  • You know best what to do next

https://sim.curiousduck.io – free simulator. can enter any email

My take

Good food for thought. I hadn’t heard of Shewhart charts. The answer to my question about where the max variability came from was that it is three standard deviations from the average, which assumes a normal distribution. There are alternate ways for different distributions. That’s interesting.
