[QCon 2019] Using Bets, Boards and Missions to inspire Org Wide Agility

Speaker: John Cutler @johncutlefish

For other QCon blog posts, see QCon live blog table of contents

Goal: want to be a change agent and see what works. Want teams to see more impact in their work. Want to create nudges.

General

  • People seem interested and want to try. Then there is fear and nothing happens. Then people give up.
  • People confuse their own needs, continuous improvement, and specific ideas.
  • It’s hard for everyone.
  • Some companies are healthier than others. The time to get rid of a toxic employee ranges from 12 months to forever.
  • Companies think they want a magic tool or framework.
  • Common problems: structural, culture/alignment, strategy, decision making, revenue pressure, deal closing, feature factory, busyness/high utilization, constraints.
  • Angst is easy to trigger. People want certainty, impact and coherence.
  • Coherence != agreement. Coherence means understanding.
  • Want a flow of impactful stories.
  • How to know if you’re in a feature factory: https://hackernoon.com/12-signs-youre-working-in-a-feature-factory-44a5b938d6a2

Key ideas

  • Product development is a beautiful mess
  • Efforts to simplify/standardize often backfire. If you can reflect the mess back, you become a change agent. Mirrors are beautiful. Ad-libs for bets: https://dazzling-allen-f0bcd2.netlify.com

Hacks

  • In response to a thing a team is doing, say “oh, well that’s an interesting bet.” This starts a conversation about risk and more. Bets can be of any size and risk.
  • Work ranges in the 1-3 hour/day/week/month/quarter/year/decade range. Make the nesting of work more visible: put 1-3 year bets in PowerPoint so everyone sees them. Everything a month or less lives in Jira. Everything in between is not visible. See if a developer can map their tasks to the end result in less than 2 minutes.
  • Shift words.
    • Problems/Solutions -> Opportunities/Interventions
    • Projects -> Missions
    • Experiments -> Bets
    • “Done” -> Decision point or review and measurement
    • Dependency wrangling -> Playing Tetris
    • Debt -> Drag
  • Checklist of what you need to know. It’s OK to not know as long as you’re aware. Key is for the list to be a one-pager.
  • A letter to the future
  • Make a map of work in progress when it feels high. Show the impact and why it is terrible. A safe way to talk about anxiety.
  • Use a board to show what’s next, what you’re currently focusing on, and what is in review. Also include different levels (time frames) for bets. The board covers multiple teams.
  • Weekly learning users – people whose shared learning was consumed by others in the last week
  • Broadcasted learnings – charts/dashboards/notebooks consumed by at least two people in a week
  • Consumption of learnings – total reach of broadcasted learnings

Q&A

  • What if the org isn’t ready? See if you can do it on one project.
  • Social dynamics? For the first 5-10 minutes people think they will be measured and will fail based on this. Showing what other companies do helps.

My impression

I want a double green button to vote. This was great. It was interesting, relatable, actionable, easy to understand, and applicable at many levels. So even if an org isn’t ready for all of this, smaller parts can be done.

[QCon 2019] PID loops and the art of keeping systems stable

Colm MacCarthaigh @colmmacc from Amazon

For other QCon blog posts, see QCon live blog table of contents

Control Theory

  • PID loops are from control theory
  • Feedback loop – present, observe, feedback, react
  • A hundred-year-old field
  • Different fields claimed to have invented it, then realized they had the same equations and approaches

Furnace example

  • Classic example is a furnace. Want to get the tank to a desired temperature. Measure the water temperature and react by raising/lowering the heat.
  • Could just put it over the fire until done, but then it will cool off too fast.
  • Can overheat because of lag.
  • Focus on the error – the distance from the current state to the desired state.

PID

  • P = Proportional. Make a change proportional to the error.
  • A P-only controller is not stable because it oscillates a lot.
  • I = Integral. Oscillates far less.
  • D = Derivative. Prevents oscillation.
  • In the real world, PI controllers are often sufficient.
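
To make the furnace loop concrete, here is a minimal PID controller sketch in Python (my illustration, not from the talk; the gains and the one-line thermal model are made-up assumptions):

```python
# Minimal PID controller (illustrative; gains are arbitrary).
class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measured, dt):
        error = setpoint - measured      # distance from the desired state
        self.integral += error * dt      # I term: accumulated past error
        derivative = 0.0
        if self.prev_error is not None:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        # P reacts to the current error, I removes steady-state offset, D damps change.
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Toy furnace: heat raises the water temperature, ambient losses lower it.
temp, setpoint, dt = 20.0, 70.0, 1.0
pid = PID(kp=0.5, ki=0.05, kd=0.1)
for _ in range(60):
    heat = max(0.0, pid.update(setpoint, temp, dt))  # a furnace can only heat
    temp += (heat - 0.1 * (temp - 20.0)) * dt        # crude thermal model
print(round(temp, 1))  # settles near the 70.0 setpoint
```

In this toy model, setting ki=0 makes the P-only weakness visible as a steady-state offset (it settles around 62, not 70, because a proportional term alone cannot hold the heat needed at equilibrium); with more aggressive gains or more lag it shows up as the oscillation described in the talk.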

Comes up in context of (but rarely applied)

  • Autoscaling and placement (instances, storage, network, etc.) – load follows daily or weekly patterns/cycles. ML can infer what will happen in the future. The I component forecasts what will happen next (see the sketch after this list).
  • Fairness algorithms (TCP, queues, throttling). Can scale elastically per region. Figure out the capacity of each site and its peak usage, and ensure no site is overwhelmed. CloudFront looks at how close each site is to capacity and treats that as the error.
  • Systems stability
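
The autoscaling case maps naturally onto a control loop: measured utilization is the process variable and a target utilization is the setpoint. A minimal proportional sketch (my example, not from the talk; the formula mirrors the proportional style of Kubernetes’ Horizontal Pod Autoscaler, and every name here is an assumption):

```python
import math

def desired_instances(current_instances, current_utilization, target_utilization=0.6):
    """Proportional scaling: resize the fleet so utilization approaches the target.

    current_utilization is a fleet-wide average in [0, 1]; its ratio to the
    target plays the role of the error signal in the control loop.
    """
    return max(1, math.ceil(current_instances * current_utilization / target_utilization))

print(desired_instances(10, 0.9))  # 15 -- scale out when running hot
print(desired_instances(10, 0.3))  # 5  -- scale in when underutilized
```

An “I” flavor of this would smooth utilization over a trailing window, or over a forecast of the daily cycle, instead of reacting to the instantaneous value.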

Anti-patterns

  • Open loops
    • Should check that the requested action occurs.
    • How would you know if something suddenly went wrong?
    • “Sometimes they do things, but they don’t know why. So they pressed another magic button and it fixes everything”.
    • Frequently occurs for infrequent actions or for historic reasons.
    • Close the loop by measuring everything you can think of.
    • Make infrequent operations more frequent (Ex: chaos engineering)
    • “If we have something that is happening once a year, it is doomed to failure.” Ex: 1-3 year certificate rotation; people forget or leave.
    • Declarative config is easier to formally verify
  • Power laws
    • Errors propagate.
    • Make the system more compartmentalized so a failure stays as small as possible.
    • Exponential backoff – effectively an integral. Limit retries.
    • Rate limiters – token buckets can be effective.
    • Working backpressure – AWS SDK retry strategy = token buckets + rate limiters + persistent state (a sketch follows this list).
    • Recommended an AWS article on this.
  • Liveness and lag
    • Operating on old info can be worse than operating on no info. The environment changes, so stale data can be worse than an average.
    • Temporary shocks such as a spike or a momentary outage can take time to recover from.
    • Want constant-time scaling as much as possible. Not always possible.
    • Short queues are safer.
    • LIFO is good for prioritizing recent data. Can do out-of-order backfill to catch up.
  • False functions
    • Want to move in a predictable way that you control.
    • The UNIX load metric is evil. System and network latency aren’t good either. Need to look at the underlying metrics. CPU is a good metric.
  • Edge triggering
    • Triggers only at the edge (when a threshold is crossed).
    • Good for alerting humans.
    • Bad for software, as it only kicks in at times of high stress.
    • How do you ensure “deliver exactly once”?
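
The “working backpressure” bullet under power laws is the one most directly expressible in code. Here is a minimal sketch of a retry token bucket combined with exponential backoff and jitter (my illustration of the pattern, not the actual AWS SDK implementation; all names and parameters are assumptions):

```python
import random
import time

class RetryTokenBucket:
    """Client-side backpressure: retries spend tokens, successes slowly refill them.

    Because the state persists across calls, a struggling dependency drains the
    bucket and retry storms stop instead of amplifying the outage.
    """
    def __init__(self, capacity=10.0, refill_per_success=0.5, retry_cost=1.0):
        self.capacity = capacity
        self.tokens = capacity
        self.refill_per_success = refill_per_success
        self.retry_cost = retry_cost

    def try_acquire_retry(self):
        if self.tokens < self.retry_cost:
            return False  # bucket drained: shed load instead of retrying
        self.tokens -= self.retry_cost
        return True

    def record_success(self):
        self.tokens = min(self.capacity, self.tokens + self.refill_per_success)

def call_with_backoff(operation, bucket, max_attempts=5, base=0.1, cap=5.0):
    for attempt in range(max_attempts):
        try:
            result = operation()
            bucket.record_success()
            return result
        except Exception:
            if attempt == max_attempts - 1 or not bucket.try_acquire_retry():
                raise
            # Exponential backoff with full jitter spreads retries out in time.
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

Usage would share one bucket per downstream dependency, e.g. call_with_backoff(fetch_user, bucket), where fetch_user is any flaky zero-argument callable.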

My impression

This talk was great. I encountered PID in robotics, so seeing it applied to our field was cool. All the things AWS thinks about in the environment were fascinating as well. Makes you happy as a user 🙂

[QCon 2019] ML Panel

Hien Luu @LinkedIn, Brad Miro @Google, Jeff Smith @Facebook

For other QCon blog posts, see QCon live blog table of contents

Getting Started

  • People with other strong IT skills switched over
  • Can learn from books, Coursera, Udacity, grad school
  • Look for specific applications
  • Domain is very large
  • Learn libraries, existing datasets
  • Understand where the organization is at. Ex: wanting to “do ML” vs. having a specific problem
  • Focus on how it will deliver business value

General

  • Many problems repeat, so you can get ideas from others
  • Important to have organizational alignment
  • Make sure to train on realistic data
  • Deep learning is a very successful use case of ML
  • “AI is the new electricity”
  • Limits of Moore’s law. Physical limitations with quantum
  • Research on how to get algorithms to train themselves

Tools

  • PyTorch Hub
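
Since PyTorch Hub was the one tool named, here is roughly what “pull in a pretrained model via code” looks like with it (a sketch; the choice of ResNet-18 is mine, not the panel’s, and newer torchvision releases take a weights= argument instead of pretrained=True):

```python
import torch

# Load a pretrained ResNet-18 image classifier published on PyTorch Hub.
model = torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)
model.eval()  # inference mode: freezes dropout/batch-norm behavior

# Run a dummy batch through it (1 image, 3 channels, 224x224).
with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))
print(logits.shape)  # torch.Size([1, 1000]) -- ImageNet class scores
```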

Learning resources

Q&A

  • How do you learn without a business case? How do you know what you don’t know? Many educational resources start generally; you can skip some core concepts and learn them later.
  • How do you pick good training data? Iterate on testing. Important to keep training with new data.
  • Data heuristics? How much data? How many labels?
  • How to be more agile? Use a pretrained model to start. These exist as a service or can be pulled in via code.
  • How do you know when it’s good enough? Sometimes you have to just try. Or look to those who solved similar problems.
  • Tech stack? Hardware acceleration. Libraries.
  • Fraud? Retrain with new data.

My impressions

This was a good panel with interesting responses. One panelist was missing, but it came out well.