[kcdc 2022] lessons from reviewing a very big pull request

Speaker: Patrick McVeety-Mill @pmcvtm and @loudandabrasive

For more, see the table of contents

PR Principles

PR is to merge code. Typically involves gate checks and review

  • well organized – code resonably grouped, scope is defined and right sized, mostly related changes (not so small that annoying, not so big it is difficult to review. keep side stuff to a minimum)
  • well documented – high level description, story of the approach, issue/ticket/doc reference
  • well considered – planned before opening, open for discussion pragmatic not dogmatic
  • well reviewed – thorough (high and low level), ode efficient and matches style, tested (functionality over all else)
  • have empathy – for author, reviewers, future readers of code, users of product

Context

  • Open source software provider
  • This PR came from a partner – large changes in the past with reasonable success. Semi-rotating cast of devs
  • This change wasn’t made by the same people who learned how to do things first ti
  • Semi-collaborative – weekly syncs, shared chat room
  • Initial plan was for smaller chunks but hard to enforce
  • Doing free work for company because partner
  • Wound up being 900 lines

How approach looking at

  • Accept your fate – mentally prepare
  • Assess the scene
    • look at high evel and plan process
    • web UI doesn’t help with hundreds of files git diff —name-status main gives a list of the files and status. Redirecting to a file gives all of them
    • Dump into a spreadsheet – can sort by change type (add/modify/delete/rename, project or file name)
  • Translate into a checklist – track as go through files. Can do in markdown or Excel. Markdown lets you copy/paste into the response later
  • Look for smells – need to look at closely
    • R100 – means rename an no differences. (100% same.) Can be irectory rename or file name refactoring
    • Add/delete with similar names. Changed a lot.
    • Big chunks of adds – new features
    • ”C” s for copcying a file
    • At this point, leave feedback and ask to rename back to cut down the diff and can restore them in a follow up. This will make PR smaller
    • This got 900 file PR down to a 600 file PR
  • Disappear and review – look at each project/file. add/delete then changes.
    • git difftool -d main MyDir
    • Used BeyondCompare to see visual
    • Looking for critical differences
    • Quicky note format/style issues
    • Manually test related features before moving on
    • Don’t get stuck viewing diffs – want to understand big picture
    • Use diff tool most familiar with so not learning new tool while doing this. Your IDE may have a useful tool
    • If multiple commits, read comments to see journey
    • Use an IDE or tool that can highlight warnings – ex: unused code
    • track and note comments in your checklist
  • Give feedback
    • Be incremental – don’t wait til the end. Give feedback file by file/project by project. Can do in description with multiple posts.
    • Be specific and actionable list files/line numbers, allow author to indicate resolution, explain reasoning where can
    • For small PRs, comment inline during review (ex: GitHub). For super large PRs, crashed tab because so large. Gives top evel coment but reference the specific file names locations.
    • Prioritize with MoSCow – MUST do before merging, SHOULD do or create debt (task for later), COULD do to be nice, WON’T do even i we like (maybe an epic later). Also questions and comments.
    • Indicate nits. Still fix but clear what it is.
    • Compare against your notes to mae sure done
    • Saw ”when” – give tricky issues back to internal team. Can use epic/release branch. Can merge into there so not in main until fix everything.
    • Be diplomatic – remember everyone can see and don’t want to scare contributors away (free work, want to keep)
    • Don’t close PRs without comment

Timing

  • 1 month to merge to feature branch (1-2 weeks for first pass)
  • 2 months to finish all follow up issues

Preventative Measures

  • Formatting
    • pick a format and stick with it
    • setup auto-formatting
    • if strong arm, know it is annoying
    • report early rather than every instance
    • decide how important it is
    • .editorconfig files are cross language and cross-IDE
    • githooks
  • Repository Templates – placeholder text for PRs and issues to show what want
  • Automate checks and report back. Builds are for when code changes. Bots check things as time passes

Communication channels

  • PR interface good for tracking progress, leaving artifacts for others to look at, the original comments from squashing comits
  • PR interface not good for feature details, complex scenarios, brainstorming
  • Use shared issue tracking – just knowing the github issue is in progress for months isn’t helpful. Use a github board or trello
  • Get agreement on what PR will be
  • DIscuss offine and post result
  • Alternatives – Frequent pairing (still do PR at end), periodic review (ex: review monthly ”state of the union on the repo”), trust/testing/recovery (everyone merge to main)

My take

This was fun. It was very information heavy, but easy to follow. And I learned a few new techiques.

[kcdc 2022] level up with co-pilot

Speaker: Rizel Scarlett @blackgirlbytes

For more, see the table of contents

Notes

  • AI pair programmer
  • Not magic
  • Compare to Gmail smart compose – suggests continuations
  • Draws context from comments/code
  • Suggests lines/functions

Open AI

  • Powered by Open AI Codex – translates natural language into code
  • GPT-3 – generative pre-trained tranformer 3 – deep learning to produce human like text
  • Duolingo GPT-3 uses for grammar correction
  • Codex code based
  • Played with https://beta.openai.com/examples – Movie to Emoji and Mood to Color (many others).
  • There is also https://beta.openai.com/playground (if you login) where it can generate code from a comment. Even lets you specify a language
  • Also learned you can paste a hex code into google and have it show you the color

Copilot Labs

  • Hover over suggestion to get more
  • Experimental feature to include an explanation of what code does
  • Plugins – VS Code, JetBrains, Neovim
  • Can choose not to include public code
  • Can do some (human) language translation

Benefits

  • Code faster and clear – good at patterns, syntax (ex: regex) so don’t need to google, write better comments so copilot can give good suggestions
  • Write good docs – in markup
  • Be a better mentor – avoids nervousness of someone watching you type because don’t have to worry about syntax, helps mentor people in languages don’t know
  • Gain context for new concepts – studying for interviews (leetcode), explain new code base, create short demos in new languages as dev advocate

Tips

  • Turn off when writing initial structure. Turn on once have pattern going. Comments not useful at first.
  • Good when writing unit tests.

Cost

  • $10/month
  • Free if open source or student

My take

Rizel has a lot of energy and is very relatable. She also did “group play” with openai early. All of that helped engage the audience. I’ve read about co-pilot but it was really cool to see it and the features/benefits/use cases. I enjoyed seeing her passion for the tool and the examples. I also liked how she avoided it from devolving into an argument about the ethics of co-pilot. Rizel didn’t let the wifi problem throw her. It was unfortunate that the demo didn’t work even though other internet stuff did. [block? too bandwidth heavy?] The code to tweet was cool

For co-pilot, some looks cool. Some of the comments were longer than the code. So in real life, I imagine you wouldn’t use it for everything.

[kcdc 2022] reduce system fragility with terraform

Speaker: Scott McAllister @stmcallister

For more, see the table of contents

Notes

  • Problem: onboarding same thing dozens of time
  • Infrastructure as code – fast to configure/scale, consistent, reduce errors, self documenting
  • AWS CloudFormation, Azure ARM, Terraform and Pulumi in this space. (Pulumi has been rising and is 2)
  • Terraform is declarative, Pulumi is imperative; use existing programming language

Terraform

  • Declarative
  • Open source – most people use this oer enterprise
  • HCL – Hashicorp configuraton language
  • Manage infrastructure – build, change, version, single source of truth
  • No longer use UI; Terraform will overwrite changes
  • Hashicorp maintains Terraform engine

Providers

  • Hashicorp maintains a few large providers (ex: AWS)
  • Everything else run by community or other companies
  • Doc example https://registry.terraform.io/providers/PagerDuty/pagerduty/latest/docs

Flow

  • Practitioner writes infrastructure as code
  • init – takes definitions in directory, downloads providers
  • plan – want to do this so not billed before confirm
  • apply – changes pushed to environments. Runs plan first. Type ”yes” to confirm or use auto approve flag
  • destroy – wipe out everything have
  • Terraform state has data about config – ex; generated id. In JSON format

Good practices

  • Name service what is providing. Ex: ”Checkout API”
  • Version control system
  • Code review
  • Automated testing
  • Put tokens in environment variable rather than hard coding in script

HCL blocks

  • resource – you are going to manage it, create if not present, etc. Convention: providerName_endpoint. Then unique id – like a variable name within terraform. Ex: resource ”pagerduty_user” ”lisa”, Reference as pageruty_user.lisa.id
  • data – like a query. Get data about something that already exists in system. Reference as data.provider_user.id
  • required_providers – downloads binaries when run tf init. Recommend locking into a version or at least a major version

Data types

  • strings
  • numbers
  • [list, of, data]
  • { a: b, c:d } (complex object)

Can play for free: https://github.com/PagerDuty-Samples/pd-populate-dev-account

Q&A

  • Can find syntax and logic errors in plan. Depends on provider
  • Libraries to convert to HCL. Ex: LDAP to HCL

My take

This served as both a good overview and a good review of the basics. I like that it had a lot of code in it. I’m taking the Terraform cert this month so nice timing for me to attend this talk. I really appreciate the link/API to play for free. Testing on AWS is scary :).