[kcdc 2022] lessons from reviewing a very big pull request

Speaker: Patrick McVeety-Mill @pmcvtm and @loudandabrasive

For more, see the table of contents

PR Principles

PR is to merge code. Typically involves gate checks and review

  • well organized – code resonably grouped, scope is defined and right sized, mostly related changes (not so small that annoying, not so big it is difficult to review. keep side stuff to a minimum)
  • well documented – high level description, story of the approach, issue/ticket/doc reference
  • well considered – planned before opening, open for discussion pragmatic not dogmatic
  • well reviewed – thorough (high and low level), ode efficient and matches style, tested (functionality over all else)
  • have empathy – for author, reviewers, future readers of code, users of product

Context

  • Open source software provider
  • This PR came from a partner – large changes in the past with reasonable success. Semi-rotating cast of devs
  • This change wasn’t made by the same people who learned how to do things first ti
  • Semi-collaborative – weekly syncs, shared chat room
  • Initial plan was for smaller chunks but hard to enforce
  • Doing free work for company because partner
  • Wound up being 900 lines

How approach looking at

  • Accept your fate – mentally prepare
  • Assess the scene
    • look at high evel and plan process
    • web UI doesn’t help with hundreds of files git diff —name-status main gives a list of the files and status. Redirecting to a file gives all of them
    • Dump into a spreadsheet – can sort by change type (add/modify/delete/rename, project or file name)
  • Translate into a checklist – track as go through files. Can do in markdown or Excel. Markdown lets you copy/paste into the response later
  • Look for smells – need to look at closely
    • R100 – means rename an no differences. (100% same.) Can be irectory rename or file name refactoring
    • Add/delete with similar names. Changed a lot.
    • Big chunks of adds – new features
    • ”C” s for copcying a file
    • At this point, leave feedback and ask to rename back to cut down the diff and can restore them in a follow up. This will make PR smaller
    • This got 900 file PR down to a 600 file PR
  • Disappear and review – look at each project/file. add/delete then changes.
    • git difftool -d main MyDir
    • Used BeyondCompare to see visual
    • Looking for critical differences
    • Quicky note format/style issues
    • Manually test related features before moving on
    • Don’t get stuck viewing diffs – want to understand big picture
    • Use diff tool most familiar with so not learning new tool while doing this. Your IDE may have a useful tool
    • If multiple commits, read comments to see journey
    • Use an IDE or tool that can highlight warnings – ex: unused code
    • track and note comments in your checklist
  • Give feedback
    • Be incremental – don’t wait til the end. Give feedback file by file/project by project. Can do in description with multiple posts.
    • Be specific and actionable list files/line numbers, allow author to indicate resolution, explain reasoning where can
    • For small PRs, comment inline during review (ex: GitHub). For super large PRs, crashed tab because so large. Gives top evel coment but reference the specific file names locations.
    • Prioritize with MoSCow – MUST do before merging, SHOULD do or create debt (task for later), COULD do to be nice, WON’T do even i we like (maybe an epic later). Also questions and comments.
    • Indicate nits. Still fix but clear what it is.
    • Compare against your notes to mae sure done
    • Saw ”when” – give tricky issues back to internal team. Can use epic/release branch. Can merge into there so not in main until fix everything.
    • Be diplomatic – remember everyone can see and don’t want to scare contributors away (free work, want to keep)
    • Don’t close PRs without comment

Timing

  • 1 month to merge to feature branch (1-2 weeks for first pass)
  • 2 months to finish all follow up issues

Preventative Measures

  • Formatting
    • pick a format and stick with it
    • setup auto-formatting
    • if strong arm, know it is annoying
    • report early rather than every instance
    • decide how important it is
    • .editorconfig files are cross language and cross-IDE
    • githooks
  • Repository Templates – placeholder text for PRs and issues to show what want
  • Automate checks and report back. Builds are for when code changes. Bots check things as time passes

Communication channels

  • PR interface good for tracking progress, leaving artifacts for others to look at, the original comments from squashing comits
  • PR interface not good for feature details, complex scenarios, brainstorming
  • Use shared issue tracking – just knowing the github issue is in progress for months isn’t helpful. Use a github board or trello
  • Get agreement on what PR will be
  • DIscuss offine and post result
  • Alternatives – Frequent pairing (still do PR at end), periodic review (ex: review monthly ”state of the union on the repo”), trust/testing/recovery (everyone merge to main)

My take

This was fun. It was very information heavy, but easy to follow. And I learned a few new techiques.