[kcdc 2022] reduce system fragility with terraform

Speaker: Scott McAllister @stmcallister

For more, see the table of contents

Notes

  • Problem: onboarding the same thing dozens of times
  • Infrastructure as code – fast to configure/scale, consistent, reduces errors, self-documenting
  • AWS CloudFormation, Azure ARM, Terraform and Pulumi are in this space. (Pulumi has been rising and is now #2)
  • Terraform is declarative; Pulumi is imperative and uses existing programming languages

Terraform

  • Declarative
  • Open source – most people use this over enterprise
  • HCL – HashiCorp Configuration Language
  • Manage infrastructure – build, change, version, single source of truth
  • Stop making changes in the UI; Terraform will overwrite them
  • HashiCorp maintains the Terraform engine

Providers

  • HashiCorp maintains a few large providers (ex: AWS)
  • Everything else is run by the community or other companies
  • Doc example https://registry.terraform.io/providers/PagerDuty/pagerduty/latest/docs

Flow

  • Practitioner writes infrastructure as code
  • init – reads the definitions in the directory and downloads providers
  • plan – previews the changes so you are not billed before you confirm
  • apply – pushes changes to environments. Runs plan first. Type “yes” to confirm or use the -auto-approve flag
  • destroy – wipes out everything you have
  • Terraform state holds data about the config – ex: generated id. In JSON format
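
The flow above can be sketched as a small Python wrapper around the CLI. This is only an illustration under my own assumptions: the helper names `terraform_commands` and `run_workflow` are made up for this post, and the script assumes the `terraform` binary is on the PATH and the working directory already holds the .tf files.

```python
import shutil
import subprocess


def terraform_commands(auto_approve=False, destroy=False):
    """Return the CLI invocations for one pass through the workflow."""
    # -auto-approve skips the interactive "yes" prompt on apply/destroy.
    approve = ["-auto-approve"] if auto_approve else []
    commands = [
        ["terraform", "init"],             # download providers
        ["terraform", "plan"],             # preview changes only
        ["terraform", "apply", *approve],  # plans again, then applies
    ]
    if destroy:
        commands.append(["terraform", "destroy", *approve])
    return commands


def run_workflow(workdir, **kwargs):
    """Run each step in order and stop at the first failure."""
    if shutil.which("terraform") is None:
        raise RuntimeError("terraform CLI not found on PATH")
    for command in terraform_commands(**kwargs):
        subprocess.run(command, cwd=workdir, check=True)
```

Calling `run_workflow(".")` without `auto_approve=True` still stops at the “yes” prompt, which is probably what you want outside of CI.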

Good practices

  • Name a service for what it provides. Ex: “Checkout API”
  • Version control system
  • Code review
  • Automated testing
  • Put tokens in environment variables rather than hard coding them in the script
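
A minimal sketch of the last point, assuming the token lives in an environment variable named PAGERDUTY_TOKEN and that the Terraform config declares an input variable called `pagerduty_token` (both names are my assumptions, so check the provider docs). Terraform picks up input variables from environment variables prefixed with TF_VAR_, so nothing secret has to be written into the script.

```python
import os


def terraform_env(token_var="PAGERDUTY_TOKEN", environ=None):
    """Build the environment for a terraform subprocess without hard coding the token."""
    environ = os.environ if environ is None else environ
    token = environ.get(token_var)
    if not token:
        raise RuntimeError(f"{token_var} is not set")
    child_env = dict(environ)
    # Hypothetical input variable name; Terraform maps TF_VAR_<name> to var.<name>.
    child_env["TF_VAR_pagerduty_token"] = token
    return child_env
```

The returned dict can be passed as `env=` to `subprocess.run` when calling terraform.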

HCL blocks

  • resource – something you are going to manage: create if not present, etc. Type convention: providerName_endpoint. Then a unique id, like a variable name within Terraform. Ex: resource "pagerduty_user" "lisa". Reference as pagerduty_user.lisa.id
  • data – like a query. Gets data about something that already exists in the system. Reference as data.providerName_endpoint.name.id
  • required_providers – downloads binaries when you run terraform init. Recommend locking to a version or at least a major version
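
To see how the three blocks fit together, here is a sketch that builds the same structure as a Python dict and renders it in Terraform's JSON syntax (a file ending in .tf.json) instead of HCL. The user name, email addresses, output names and the "~> 2.0" version constraint are placeholders I picked, not values from the talk, and the argument names should be checked against the provider docs.

```python
import json

config = {
    "terraform": {
        "required_providers": {
            # Illustrative constraint: stay within one major version.
            "pagerduty": {"source": "PagerDuty/pagerduty", "version": "~> 2.0"}
        }
    },
    # resource: something Terraform creates and manages.
    "resource": {
        "pagerduty_user": {
            "lisa": {"name": "Lisa", "email": "lisa@example.com"}
        }
    },
    # data: a lookup of something that already exists.
    "data": {
        "pagerduty_user": {
            "existing": {"email": "someone@example.com"}
        }
    },
    # References use <type>.<name>.<attr> and data.<type>.<name>.<attr>.
    "output": {
        "lisa_id": {"value": "${pagerduty_user.lisa.id}"},
        "existing_id": {"value": "${data.pagerduty_user.existing.id}"},
    },
}


def render_tf_json(cfg):
    """Serialize the config so it could be saved as main.tf.json."""
    return json.dumps(cfg, indent=2, sort_keys=True)
```

Writing `render_tf_json(config)` to main.tf.json and running terraform init should then download the provider, assuming the registry source above is still current.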

Data types

  • strings
  • numbers
  • [list, of, data]
  • { a: b, c:d } (complex object)

Can play for free: https://github.com/PagerDuty-Samples/pd-populate-dev-account

Q&A

  • Can find syntax and logic errors in plan. Depends on provider
  • Libraries to convert to HCL. Ex: LDAP to HCL

My take

This served as both a good overview and a good review of the basics. I like that it had a lot of code in it. I’m taking the Terraform cert this month so nice timing for me to attend this talk. I really appreciate the link/API to play for free. Testing on AWS is scary :).

[kcdc 2022] calculating your cloud co2e emissions

Speaker: Joel Lord @joel__lord

Code impact

  • Data centers: 2% of global electricity demand and 3% of greenhouse gases
  • Equivalent to the airline industry
  • Planet has an SLO – limit to what we can put in it
  • Car 192g/km
  • Domestic flight 255g/km
  • SMS .014g
  • Email 3g [I googled why this is so high. It can be .3 to 26 depending on how long it takes to read/write]
  • Tweet .2g
  • Google search .2g
  • Fart – .2g

Factors

  • CPU/GPU – GPUs use more energy, but can perform more consistently – 12.4/38.2
  • RAM – always used so more RAM you have, the more energy used – .3 per GB hour
  • Disk storage – .002 – per GB hour
  • Network Transfer .027 – per GB
  • Other (cooling, lighting, etc)

Formula includes

  • Time
  • PUE (power usage effectiveness) – AWS 1.2, GCP 1.1, Azure 1.125. Only GCP publishes its number; the others were found elsewhere. By contrast, the average is 1.67 including private data farms
  • CI (carbon intensity) – region specific depending on source of power. NY 200g/kWh, Australia (lots of coal) 880g/kWh, Quebec 14g/kWh
  • Server tier – CPU, RAM, hard drive
  • Utilization – ex: Atlas CLI/management API
  • M30 on US East 1 for 24 hours – 401g CO2e (72K farts)
  • https://github.com/joellord/atlas-co2
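
Here is my rough reading of how those pieces combine: add up the energy per component over time, scale by PUE, then multiply by the carbon intensity. My notes did not capture the units, so this sketch assumes the CPU figure is average watts and the RAM, disk and network figures are watt-hours. The speaker's actual calculation is in the repo linked above and may well differ. The class and function names are mine.

```python
from dataclasses import dataclass

# Per-component factors from the notes; the Wh units are an assumption.
RAM_WH_PER_GB_HOUR = 0.3
DISK_WH_PER_GB_HOUR = 0.002
NETWORK_WH_PER_GB = 0.027


@dataclass
class Workload:
    hours: float
    cpu_watts: float   # assumed average draw of the CPU/GPU
    ram_gb: float
    disk_gb: float
    network_gb: float  # total data transferred over the period


def energy_kwh(w):
    """Energy used by the workload itself, before data center overhead."""
    watt_hours = (
        w.cpu_watts * w.hours
        + w.ram_gb * RAM_WH_PER_GB_HOUR * w.hours
        + w.disk_gb * DISK_WH_PER_GB_HOUR * w.hours
        + w.network_gb * NETWORK_WH_PER_GB
    )
    return watt_hours / 1000


def co2e_grams(w, pue, carbon_intensity_g_per_kwh):
    """Scale by PUE (cooling, lighting, etc.), then by the region's carbon intensity."""
    return energy_kwh(w) * pue * carbon_intensity_g_per_kwh
```

Because the carbon intensity is a straight multiplier, the same workload at the Australia figure (880g/kWh) comes out roughly 63 times higher than at the Quebec figure (14g/kWh).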

Notes

  • GCP has a carbon footprint dashboard to get bar charts with info for your services
  • Reduce energy use
  • Return only top results
  • Lower default quality
  • Don’t autoplay video
  • Use dark mode – less energy for dark pixels
  • Package size matters – use the JAM (JavaScript, APIs and markup) stack
  • Server-rendered pages
  • Reduce complexity – data access together should be stored together
  • Green programming language – C, Rust, C++, Java, C#, JS best. TypeScript, PHP, Ruby, Python worse
  • Migrate to the public cloud – they try to reduce costs by saving energy
  • Serverless or autoscaling
  • Cloud region matters – 80x differences
  • Latency may be ok and then can use smaller nodes
  • Pause servers when not in use – ex: weekend
  • Better monitoring – can go to lower tier if needed
  • Leverage XaaS solutions – (anything as a service)
  • Use right tool – ex: do you need a database

My take

A lot of the beginning was about the impacts of climate change. I feel like the audience knew this part and would have liked to get to the rest sooner. The comparisons were good and the number of farts was fun. I would have liked more examples of the numbers. The end on things you can do was good.

[kcdc 2022] devops, 12-factor and open source

Speaker: Justin Reock @jreock

References

  • “It’s no longer the big beating the small, but the fast beating the slow”
  • Book: The Goal – Eliyahu Goldratt. Theory of Constraints for Business Productivity. Business fiction.
  • Book: Phoenix Project. Similar to The Goal but software business fiction
  • Book: The Machine that Changed the World
  • Book: Organizational physics – the science of growing a business. Short. Why businesses fail

Constraints

  • Change focus from costs to throughput
  • Layoffs reduce costs but also decrease throughput
  • Need to both decrease costs and increase throughput
  • Cost = organizational cost
  • Inventory = Code
  • Throughput = Money
  • It doesn’t really matter what you improve as long as you are constantly improving something, because entropy makes things worse if you do nothing.

DevOps

  • Choose supported free software and an open first policy
  • Deploy in cloud/containers
  • If you use containers without 12-factor, you don’t see the benefits
  • APIs are everything now. Govern APIs
  • Build fail-fast culture. Near instant release (and therefore patch)

Problems with Closed Development

  • Slow to obtain
  • Inflexibility in growth/scaling. ex: fixed number of licenses
  • Can’t modify
  • Can’t benefit from others
  • Lose growth vs giving competitors features
  • Less oversight, less security

Path

  • Individual physical servers
  • Virtual machines
  • Containers (stripped down OS powered by one underlying OS)
  • Created ecosystem with proliferation of microservices – Kubernetes (Greek word for captain). Now can have virtual datacenter in a box

12-Factor

Series of characteristics to increase odds of success in cloud/containers. The less you do for an app, the more friction you will have going to cloud/containers.

https://12factor.net

  • Codebase – in version control, deploy often
  • Dependencies – explicitly declare and isolate
  • Config – store in env. Env variables popular again
  • Backing Services – treat as attached resources
  • Build, release, run – separate strategies
  • Processes – one or more stateless processes
  • Port binding – how to expose services
  • Concurrency – Docker gives this for free
  • Disposability – fast startup, graceful shutdown
  • Dev/prod parity – keep as similar as possible
  • Logs – push events out to central system via event streams
  • Admin processes – manage as one-offs
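
The Config factor is the easiest one to show in code. A minimal sketch, assuming an app that needs a database URL, a port and a debug flag; the variable names DATABASE_URL, PORT and DEBUG are just common conventions I picked, not something from the talk.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str  # backing service treated as an attached resource
    port: int          # port binding
    debug: bool


def load_settings(environ=None):
    """Read all deploy-specific config from the environment, never from code."""
    environ = os.environ if environ is None else environ
    try:
        database_url = environ["DATABASE_URL"]
    except KeyError:
        raise RuntimeError("DATABASE_URL must be set") from None
    return Settings(
        database_url=database_url,
        port=int(environ.get("PORT", "8080")),
        debug=environ.get("DEBUG", "false").lower() in {"1", "true", "yes"},
    )
```

The same build can then run in dev and prod with only the environment changing, which also helps with the dev/prod parity factor.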

Coding

  • About flow
  • Use left and right brained activities
  • Problem solving – hypothesis and feedback from build system.
  • Feedback feels good and keeps in state of creative flow
  • The longer you wait for a build, the less happy you are
  • Few track local build times.
  • Time is wasted waiting for and debugging local and CI builds
  • 10x developer – organizational culture matters more than individuals
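
Since few teams track local build times, here is a minimal sketch of what tracking could look like: wrap the build command, time it and keep a history. The function names and the in-memory list are my own simplifications; a real setup would more likely send this to a file or a metrics system.

```python
import subprocess
import time


def timed_build(command, history):
    """Run a build command and record how long the developer waited for it."""
    start = time.perf_counter()
    result = subprocess.run(command)
    elapsed = time.perf_counter() - start
    history.append(
        {"command": command, "seconds": elapsed, "ok": result.returncode == 0}
    )
    return elapsed


def total_wait_seconds(history):
    """Total time spent waiting on builds recorded so far."""
    return sum(entry["seconds"] for entry in history)
```

For example, `timed_build(["gradle", "build"], history)` would record one Gradle run, assuming gradle is installed.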

Benefits of faster cycle time

  • Less idle time
  • Less context switching – people can’t multitask. Also, bad for the brain to try
  • More focus
  • Build more often
  • Earlier quality checks
  • Fewer expensive downstream incidents
  • Smaller change sets
  • Fewer merge conflicts
  • More efficient troubleshooting
  • Faster mean time to recovery

Trends

  • 1970s – JIT manufacturing
  • 1980s – Business process reengineering
  • 1990s – Change management
  • 2000s – Agile, Lean Six Sigma
  • 2010s – DevOps
  • 2020+ DPE (developer productivity engineering)

DPE

  • Engineering approach to productivity
  • Acceleration and analytics tech to improve dev experience
  • build cache – Gradle has option to use. Also Gradle Enterprise brought to Maven
  • Aligns with management goals – faster TTM (time to market), reduced cost, improved quality
  • https://gradle.com/learning-center-by-objective/
  • https://gradle.com/developer-productivity-engineering/handbook/
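
The idea behind a build cache can be sketched in a few lines: hash the task and its inputs, and reuse the stored output when the hash has been seen before. This is a toy in-memory version to show the concept. It is not how Gradle implements it, and all names here are hypothetical.

```python
import hashlib


def cache_key(task_name, input_files):
    """Hash the task name plus every input path and its contents."""
    digest = hashlib.sha256(task_name.encode())
    for path in sorted(input_files):
        digest.update(path.encode())
        digest.update(b"\0")
        digest.update(input_files[path])
        digest.update(b"\0")
    return digest.hexdigest()


class BuildCache:
    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def run(self, task_name, input_files, build):
        """Return the cached output if inputs are unchanged, else build and store."""
        key = cache_key(task_name, input_files)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        output = build(input_files)
        self._store[key] = output
        return output
```

Here `input_files` maps a path to its bytes and `build` is any function that turns those inputs into an output. Changing a single byte in any input changes the key and forces a rebuild.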

My take

Good mix of current and historical examples from outside computing (I didn’t blog about the history part). I hadn’t heard of 12-Factor prior to reading the abstract for this talk. I would have liked more time on it since it is a third of the title, but the talk flowed well as is.