Nihilist Reviewboard

Let’s talk about another concept that’s as old as the hills - code review.

“Design and code inspections to reduce errors in program development” [Fagan 1976] (pdf) introduced the notion of a structured process for reviewing programs & designs. The central argument Fagan presents is that it is possible to quantitatively review software for flaws early in the development cycle, and to iterate on development while the cost of change is low compared to the cost of iterating on software which has already been deployed to customers.

The terminology is a little archaic, but for all the elapsed time the fundamental idea holds. Code review for Fagan is as much an architectural design review as it is anything else.

This shouldn’t be terribly surprising, given some of the particular concerns Fagan’s process is designed to address. Many of those concerns are still with us, but some, such as the specifics of register restoration, reflect the paper’s age.

While the underlying goal of code review, to examine software for flaws early and often, has not changed meaningfully in the intervening decades, many of the particulars Fagan describes reflect a rigidity of process and a scale of endeavor which is no longer representative of the industry at large.

Fagan’s process is designed to prevent architecture-level mistakes through intensive review, as well as to detect “normal” bugs en masse and provide a natural workflow for iteratively searching for and fixing bugs until the artifact is deemed of sufficient quality. This is a work of process engineering optimized for code being slow & difficult to write, and for software being slow & risky to ship.

So what does a modern code review system look like? What makes for a good code review? What has changed?

With the cheapening of computer time and the advent of integrated testing systems, generative testing, continuous integration and high-level languages, many of the properties which previously required extensive, deliberate human review can now be provided automatically. Likewise, modern linters & formatters can provide extensive stylistic criticism and enforce a degree of regularity across entire codebases.

Continuous delivery systems and incremental deployment methodologies also serve to mitigate the expensive “big bang” releases which informed Fagan’s process. Continuous or near-continuous delivery means that teams can focus on shipping & testing incremental products. Artifacts don’t have to be fully baked or finalized before they are deployed.

Similarly, linters & other automatic code inspection, together with the advantages of high-level languages, make it possible both to change artifacts meaningfully much more rapidly and to automatically detect entire classes of flaws for remediation before an author even begins to engage others for review.

Ultimately, the job of every team is to deliver software. In many contexts, incomplete solutions, delivered promptly & iterated on rapidly, are superior to a fuller solution on a longer release cadence.

So. What does this mean for code reviews?

Reid’s Rules of Review

True to the style of The Elements of Style, these rules are hard, fast and have exceptions. They’re deeply motivated by the tooling & engineering context described above. If your team has missile-launching or life-ending reliability concerns, you’ll want a different approach. If you can’t trivially test, re-deploy or partially roll out your code, you’ll also want a different approach. This is just the way I want to work, and the way I think people should try to work.

1. Ensure that the artifact is approachable.

If you are not able to understand a changeset or a new artifact, let alone the process by which its author arrived at the current choice of design decisions & trade-offs, that is itself a deeply meaningful criticism: it means both that the code is unclear and that the motivating documents are lacking.

As the reviewee, the onus is on you to enable your reviewers to offer high-level criticism by removing low-level barriers to understanding.

1.1. Corollary: Write the docs.

As the reviewee, how are your reviewers supposed to understand what problem(s) you’re trying to solve or the approach you’re taking if you don’t explain it? Link the ticket(s). Write docstrings for your code. Include examples so that the what and the how are obvious. Write a meaningful documentation page explaining the entire project so that the why is captured.
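
For instance, here’s a minimal sketch in Python. The function, its behavior and the ticket number are all hypothetical; the shape is what matters: a docstring that links the ticket, explains the why, and shows a usage example.

    def normalize_phone(raw: str) -> str:
        """Normalize a US phone number to E.164 form.

        Upstream sources hand us inconsistently formatted numbers
        (see the hypothetical TICKET-1234); the dialer downstream
        requires E.164.

        Example:
            >>> normalize_phone("(555) 867-5309")
            '+15558675309'
        """
        digits = "".join(ch for ch in raw if ch.isdigit())
        return "+1" + digits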

1.2. Corollary: Write the tests.

Those examples you wrote? They should be tests.

I’m totally guilty of the “It’s correct by construction! Stop giving me this ‘tests pls’ crap!” attitude, but it’s a real anti-pattern.

As the reviewee, even to the extent that you may succeed in producing or composing diamonds, someone will eventually come along and refactor what you’ve written, and if there aren’t tests covering the current behavior, who knows what will happen then. You may even be the person who introduces that regression, and won’t you feel silly then.

Furthermore, tests help your reviewers approach your code by offering examples & demonstrations. Tests aren’t a replacement for documentation and examples, but they certainly help.
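
Concretely, the docstring example from the previous corollary can be promoted to a standalone test. A minimal pytest sketch, assuming the hypothetical normalize_phone from above lives in a module named phone:

    # test_phone.py -- pytest collects test_* functions automatically.
    from phone import normalize_phone

    def test_normalizes_formatted_number():
        # Pin the current behavior so a future refactor can't silently change it.
        assert normalize_phone("(555) 867-5309") == "+15558675309"

    def test_strips_punctuation():
        assert normalize_phone("555.867.5309") == "+15558675309"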

1.3. Corollary: Run the linter.

If you have a style guide, stick to it insofar as is reasonable. As the reviewee, if you’ve deviated from the guide to which your coworkers are accustomed, you’ve just made it harder for them to meaningfully approach and criticize the changes you’re proposing. You’ve decreased the review’s value for everyone involved.

2. Criticize the approach.

Algorithmic or strategic commentary is the most valuable thing you can offer to a coworker. Linters and automatic tooling can’t really help here. Insight about the future failings of the current choice of techniques, the benefits of other techniques and the available tools can all lead to deeply meaningful improvements in code quality and to learning among the team. This kind of review may be difficult to offer, since it requires getting inside the author’s head and understanding both the problem and the motivations which brought them to the decisions they made, but it is a real opportunity to prevent design flaws and to teach. It’s worth the effort.

3. Don’t bikeshed your reviewee.

If the code works and is close to acceptable, leave comments & accept it. The professional onus is on the reviewee to determine and make appropriate changes. It’s not worth your time or theirs to go around and around in a review cycle, with a turnaround time measured in hours, over this or that bikeshed.

3.1. Corollary for the reviewer: Style guides are something you apply to yourself, not something you do to others.

⚠⚠ EXCEPT IN THE MOST EGREGIOUS CASES ⚠⚠

If someone clearly threw the style guide out the window or didn’t run the linter, then a style guide review is appropriate. Style guides should be automatically enforced; if tooling is not available, they should be mostly aspirational.

What’s the point of wasting two or more humans’ time doing syntactic accounting for minor infractions? If it takes half an hour to review a change, maybe another hour before the reviewee can respond, half an hour or more to make the requested changes, and then the updated changeset has to be reviewed again, syntax and indentation bikesheds in code review can easily consume whole work days.

3.2. Corollary for the reviewer: Don’t talk about performance concerns unless you have metrics in hand.

Seriously. Need to push thousands of requests per second? Yeah, you may care about the performance of an inner loop somewhere. In that case performance criticisms are meaningful, and you should have performance metrics already.

Got a service that’ll see a few hundred requests at the outside? Who cares! It can probably be quintic and still get the job done.

It is far better to write inefficient code which is easy to understand & modify, ship it and iterate. If legitimate performance needs arise, code which can be understood and modified can always be refactored and optimized.

Code which is optimized early at the expense of understanding and modifiability is a huge mistake. Much as it may make the reviewee or the reviewer feel clever to find some speedup, that speedup may or may not add value, and the semantic cost of the optimizations increases the maintenance or carrying cost of the codebase as a whole.
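
When a performance question does come up, a number is cheap to get before anyone argues about it in review. Here’s a minimal sketch using Python’s standard timeit module, where naive_lookup is a hypothetical stand-in for the “inefficient but obvious” code under review:

    import timeit

    def naive_lookup(items, key):
        # The obvious version: a linear scan instead of an index.
        for k, v in items:
            if k == key:
                return v
        return None

    pairs = [(i, i * i) for i in range(500)]

    # Time 1,000 worst-case misses and report the per-call cost.
    elapsed = timeit.timeit(lambda: naive_lookup(pairs, -1), number=1_000)
    print(f"{elapsed / 1_000 * 1e6:.1f} microseconds per worst-case lookup")

For a service seeing a few hundred requests, a number like that is exactly the “who cares” case above.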

3.3. Corollary for the reviewer: Don’t be dogmatic.

There are exceptions to every rule. There are concrete time and morale costs to requesting change. Be mindful of these, and remember that you’re all in the same codebase together with the same goal of shipping.

4. Hold your coworkers accountable after the fact and ship accordingly.

If a service is broken, whoever broke it gets to be on the front lines of fixing it. The consequence is that you can trust your coworkers, and prefer shipping their code, thanks to a culture of ongoing responsibility. Your default should be to accept and ship code more or less whenever possible.

In short, don’t be this guy

^d