Why Do We Have Code Reviews Anyway?

Code Review Comment

We’ve spent the last six or so months at Sourcery focused mainly on code reviews - how can we make them faster, how can we make them more thorough, and generally how can we make them better. As we’ve gone through this process there’s been a question that’s started to gnaw at me - why do we even have code reviews in the first place? Are they actually a critical piece of the engineering process or are they, more often than not, an impediment to progress?

Usually when I talk to developers and managers they’ll tell me that reviews exist for one of a few purposes:

Make sure new code changes are making the code base healthier, higher quality, and cleaner than when it started.
Making sure new code is well designed, well architected, and that the design, architecture, and style lines up with the existing code.
Making sure the new code actually does what it’s supposed to.
Checking to see if there’s anything really bad happening in the new code (security risks, likely bug risks, etc).
Sharing new knowledge and keeping the team informed about how and why the codebase is changing in certain ways.
Being a mentoring opportunity for developers to help each other improve (most often to help junior devs learn from more senior devs - but knowledge can move in any direction).
Being a place for important discussions about decisions that were made as part of a project.

Before we look into any of these, it’s probably worth clarifying what I’m talking about when I’m saying code review here. There are lots of ways that we can review code - from pair programming, to checks in CI, to self reviewing our code before we commit it - but what I’m talking about here is the common approach among most software engineering teams of an async peer review after a pull request or merge request has been opened.

Now, let’s dive into each of these and see whether the current peer code review process actually makes sense for them.

Are these changes making the codebase healthier?

Codebase health has a number of factors - but at the end of the day it’s really a question of whether or not the changes that are being introduced are going to make future development slower (less healthy) or not. Future slowdowns can be caused by a number of different things - inconsistent code styles, poor design decisions, introducing duplication, creating confusing code, etc. Ideally we would be able to prevent most of these issues from being introduced before we get to a code review and there are different strategies we could use to tackle each of them. For example

Inconsistent code styling: Linters and static analysis tools can help to catch and fix inconsistent styles automatically in the IDE or at the pre-commit phase. Well defined and agreed upon internal style guides/contributing guides can help to set standards and expectations around code styles that can reduce the risk of stylistic issues reducing code health.
Poor design decisions: We’ll cover this more in the next section, but thorough upfront planning and pair programming can help to reduce the risk of design decisions being introduced that can damage future code health.
Introducing duplication: There are a number of static analysis tools (typically language specific) that can help flag duplicate or near duplicate code during development and help ensure you’re not introducing repetitive code.
Creating confusing code: This one doesn’t have a great automated solution today, but pairing can be a great way to flag potentially confusing code early in the development process and help fix it before it gets out of control.

One issue with all of the preventative approaches we’ve looked at above is that they’re not fail safe. We still probably need to have a downstream check on code health - and that naturally can come during a code review process. And that code review is also a natural place to have discussions and decisions about how critical we want to be about potential issues around code health. Should we be making sure the new solution that’s being introduced is refactored to an ideal state for future development, or is it in a place where it’s good enough? Or is the new change an emergency fix for a bug that absolutely needs to get merged in, even if we know it’s not an ideal solution and we’ll address cleaning it up later? These types of discussions can only really happen as part of the review process - ideally combined with some of the proactive approaches to improve code health outlined above.

Is the code well designed?

This question of code design broadly fits into two major categories - is the new code designed and architected well in isolation and does its design fit in with the design of the rest of the codebase? These two questions are incredibly difficult to do in an automated fashion and typically merit a peer review, but there are also ways we can help ensure that design is being thought about at every step of the development process and can prevent issues from actually popping up that would need to get flagged and fixed during code reviews.

The first place we can really tackle design issues is by having an established style/contribution guide for our codebase. We can create standards and expectations for how everyone on our team is looking to craft their code in different situations (this can also make discussions about design during reviews more objective and less combative). What a style guide looks like can vary a ton from team to team - these can be high level and relatively vague, just establishing core principles, or it can be extremely detailed covering things like specific code style decisions in different sections of the codebase. It all depends what will work best for your team and your project.

The planning process gives another great opportunity to establish plans for the design of the implementation. It gives a chance for people to discuss and agree on an approach that is both appropriate for the new change in isolation and also fits in within the broader codebase. You still will want to check during a review to make sure the resulting code lines up the with planned design, but 95+% of the time it should and that piece of the review should be quick and painless.

Does the code do what it’s supposed to?

This might be the most important question about any change that code is introducing. If it’s not doing what it’s supposed to do then why is the change even happening in the first place? Whether or not this determination needs to come out of an async code review really depends on the type of change that’s happening. Ideally the changes are well scoped and you can introduce tests that can ensure that everything is happening in the right way. That way you can be confident the right things are being accomplished if the tests are passing.

But, even if you do have perfect test coverage, you still probably want someone else reviewing to make sure that the tests are right and are actually covering all the relevant cases. We could certainly be in a position where we have 100% test coverage vs the code that’s been developed, but we’ve missed out on a big set of necessary functionality both in our code and in our tests.

On top of this some changes should be manually checked to make sure the actual system behaves the way it’s expected to. Maybe it’s frontend changes that you want to make sure are looking appropriate on different devices. Maybe it’s making sure external API calls and your database updates are playing nice with each other in practice. Whatever it is, there’s a pretty clear need for human review of code functionality changes that can’t really be avoided. Now, whether we want to think about that as a part of a QA process, a testing process, or code review can be debated, but it needs to be a step in the development workflow.

One other note on functionality and changes being introduced being desirable. The needs from these types of checks differ significantly between open source projects with public contributors and projects being worked on by a closed team. Within a team it is much easier to come to specific alignment about what new changes are being introduced, but with open source projects you have to contend with contributors occasionally bringing things in out of the blue. In that case it is critical that maintainers are able to review the changes and make sure they fit in with the direction and goals of the project.

Is any bad stuff being introduced?

Bad stuff is a pretty broad topic - but let’s think about everything from major security vulnerabilities to potential bugs, to logical issues, etc. We don’t want to have any of these sneaking into our code and should be looking to take every precaution to make sure they’re avoided. Thankfully there are a wide array of tools out there that can help us to flag these early in the development process. Static analysis and security analysis tools can help us catch security and logical issues early on in development. Bugs can be trickier to catch because they can be more dependent on our project, but having robust and up to date test coverage can help us catch most of these while we’re writing new code.

Unfortunately, the reality is we're not every going to be able to catch every bug that's out there. But, the time cost of a bug getting into production can be as much as 100x that of finding it earlier in the development process. So every effort we can make to catch them as early on in the process as possible is critical.

Knowledge sharing

This is my least favorite argument for why code reviews should happen. Typically the argument centres around the idea that development is done in relative isolation and reviews let other members of the team learn how and why new code was introduced in a certain way. But, I have two big issues with this theory:

It rarely actually results in knowledge being shared with the broader team. Typically you will have one or two reviewers on a pull/merge request. While that does mean that we’re potentially expanding the number of eyes on this code from 1 -> 3 it doesn’t mean that the majority of the team/full team will actually be exposed to the changes that were made or the discussion about why they were made. Typically you'll also want a product or code owner of the section of code to be kept up to date of changes - so maybe that will happen in addition to the reviewers. But still, if our goal is to improve knowledge sharing we should be having broader team discussions about changes to review what and why changes were made so that the entire team who is working with that code has that knowledge, not just a handful of individuals.
It implies that work is really being done in isolation and that changes may be arising in a way that others might disagree with or be confused by. If this is a serious risk then you can look to mitigate it by having more up front discussions about how a change should be implemented/designed/etc and can look to share knowledge and approach in a more ongoing way by collaborating/pairing/mobbing on certain changes.

Mentoring opportunities

Code reviews are a great time to give feedback, point out issues with a potential approach, suggest alternative approaches, and otherwise help a teammate learn. But, it’s far from the only time in the development process in which this can happen. If we want to make the most of mentoring opportunities we should be looking to do that throughout the development process - at the onset of a project to help a more junior developer think through different approaches, during the implementation through pair or mob programming, and again during the review phase. Otherwise we’re missing out on a variety of times to help each other learn.

Communication about the project

Ok, maybe I’m starting to sound like a broken record here, but the review process isn’t the only place and it almost certainly isn’t the best place to be having detailed communication about the project. By the time you get to a code review there has already been time spent in development so any discussion about how something was implemented could have happened earlier and saved time on something being implemented in the “wrong way”. Up front discussions during the planning process can start to dig into technical implementation details and make sure there is alignment on the broad plan rather than waiting till later. And pair programming (sensing a bit of a theme here?) gives great opportunities for ongoing communication between multiple team members during the development process to help flag and decide on approaches.

That said - I recognise that we don’t live in this platonic ideal of software development where those questions always get surfaced correctly early in the process. Sometimes something comes up during development that couldn’t have gotten flagged earlier and you do need to have a discussion about it during the code review process. But, that on it’s own doesn’t mean that reviews are a great place for handling project discussions.

So are reviews worth it?

A few weeks back the Sourcery team was having a chat about the different steps in the lifecycle of code and one of my colleagues said “if we were trying to have the highest short-term velocity possible we wouldn’t have any of these checks and processes and would just merge straight to main. But all of these processes have emerged as deliberate slowdowns with a purpose”.

He was spot on - all of these discussions, all of these areas of our code we want to check during the review process are things that need to happen to make sure we’re able to keep being effective and productive long term - even if they come with short term slowdowns. And, code reviews aren’t always the best place for these reviews and conversations. Ideally they’re happening earlier in the development process or are automated checks - and when they are then the review can be a quick final check to make sure everything looks good.

But, at the end of the day reviews are an important piece of the puzzle of software development.

Now the only question is how can we make them way better?

We’re working to build a better automated version of code reviews at Sourcery. Try it out on any of your GitHub repos or reach out to hello@sourcery.ai to learn more.