Helper functions make your code more expressive and less repetitive. Assuming that you really use them.
Dec 05, 2022
In this new series of blog posts, we're introducing some custom rules. Our goal is to give you some ideas what kind of rules might be useful for your codebase. Let's start with a simple but popular rule that we use in the Sourcery core repository.
Testing is an awesome concept, yet we often have mixed feelings and experience about our actual test suite. Tests can catch 🐛s, but they can also give you a false sense of confidence. A comprehensive test suite can enable faster changes, but it can also slow you down when you need to adjust several, seemingly unrelated tests. Tests can give you a great insight into the expected behavior. But they can also be quite confusing.
Erik Kuefler provides some great guidance in the book Software Engineering at Google on how to write tests that make software engineers more productive. The chapter about Unit Testing has a whole section about clarity. It starts with pointing out that test failures happen for 2 reasons:
When a test fails, an engineer’s first job is to identify which of these cases the failure falls into and then to diagnose the actual problem. The speed at which the engineer can do so depends on the test’s clarity. A clear test is one whose purpose for existing and reason for failing is immediately clear to the engineer diagnosing a failure.
Software Engineering at Google / Chapter 12 Unit Testing / page 239
In order to achieve clarity, Kuefler emphasizes two high-level properties. Tests must be complete and concise. Let's take a look at the definition of both:
Software Engineering at Google / Chapter 12 Unit Testing / page 240
To achieve conciseness, test helpers are very useful. They can hide boilerplate code, setting dummy values for fields that are required but irrelevant for the current test. With an expressive name, they can provide information about the test setup. For example:
use_valid_config
create_customer_with_eur_and_usd_accounts
verify_order_has_been_shipped
In his classical Refactoring book, Martin Fowler discusses rule of thumbs when to extract a function:
Fowler clearly favors the third approach.
The argument that makes most sense to me, however, is the separation between intention and implementation. If you have to spend effort looking at a fragment of code and figuring out what it’s doing, then you should extract it into a function and name the function after the “what.” Then, when you read it again, the purpose of the function leaps right out at you, and most of the time you won’t need to care about how the function fulfills its purpose (which is the body of the function).
Fowler, Martin: Refactoring 2nd edition, 2018 November / Chapter 6: A First Set of Refactorings / Extract Function
In test code, a good question to ask is: Is this relevant for the current test case?
Kuefler provides an excellent example (in Java) to demonstrate this:
Calculator calculator = new Calculator(new RoundingStrategy(),
"unused", ENABLE_COSINE_FEATURE, 0.01, calculusEngine, false);
int result = calculator.calculate(newTestCalculation());
The test is passing a lot of irrelevant information into the constructor, and the actual important parts of the test are hidden inside of a helper method. The test can be made more complete by clarifying the inputs of the helper method, and more concise by using another helper to hide the irrelevant details of constructing the calculator.
Calculator calculator = newCalculator();
int result = calculator.calculate(newCalculation(2, Operation.PLUS, 3));
Software Engineering at Google / Chapter 12 Unit Testing / page 240
In the Sourcery core repo, we have currently more than 5000 tests. The majority of those are unit tests focusing on a single rule. But we also have a big suite of integration tests, that check how multiple Sourcery rules work together.
These integration tests follow the schema:
Sourcery
refactoring and review procedure on it.To avoid duplication and make our tests more readable, we introduced a test helper for this very common schema:
def check_refactoring(
code: str,
expected: str,
)
Of course, we also have several negative tests: Tests that ensure that a specific piece of code doesn't get refactored.
For a while, we used the check_refactoring
helper for these negative test
cases as well. Our tests looked like this:
def test_no_code_change:
original_code = "print(42)"
check_refactoring(original_code, original_code)
By providing the same value to code
and expected
, we implicitly assumed that
this piece of code doesn't get refactored.
Then we recognized that the test case is more explicit with a dedicated negative helper:
def test_no_code_change:
original_code = "print(42)"
check_no_refactoring(original_code)
So, we introduced a 2nd helper check_no_refactoring
:
def check_no_refactoring(
code: str
):
"""Verify that Sourcery doesn't refactor a piece of code."""
# We run the same refactoring logic as in check_refactoring,
# and verify that the original code and the refactored code are the same.
return check_refactoring(code, code)
After introducing the check_no_refactoring
helper, we had 2 tasks:
check_refactoring
where the original and
expected code are the same with check_no_refactoring
.It turns out that a custom rule is useful for both.
- id: check_no_refactoring
description: Use the check_no_refactoring helper
pattern: check_refactoring(${code}, ${code})
This rule ensures that we use the appropriate helper in our integration tests:
check_refactoring
helper.check_no_refactoring
helper.With this rule, we could solve both tasks above:
sourcery review
with only this new rule to find all negative tests
where check_no_refactoring
should be used. We updated these tests in the
same PR where we introduced the helper..sourcery.yaml
config file. This way, we
see immediately in our IDE and also in our CI if we should pick the other
helper.This is one of the most frequent use cases of custom rules in our repo: Ensuring that a helper is used. When we encounter repetitive code, we surprisingly often learn that there's already a helper somewhere to replace this repetitive code. But people aren't aware of it. Or it didn't come to their mind when they were enthusiastically trying to finalize that a PR on a very long Friday evening.
When you create a cool, new helper, don't forget to "promote" it. Slack channels, team meetings, internal wikis are all great ways for this. A custom rule doesn't replace but complements those communication channels. It brings the helper into people's mind exactly at the moment when they should use it.
How can you create a custom rule for your new helper?
It depends on your helper, but often, the following steps work:
Let's say you have a helper print_twice
:
def print_twice(sth):
print(sth)
print(sth)
For that, you can create a rule like this:
- id: use-print-twice
description: Use the `print_twice` helper
pattern: |
print(${obj})
print(${obj})
sourcery skip
to the helper function itself.def print_twice(sth):
# The rule is about using this helper function,
# so we skip it here.
# sourcery skip: use-print-twice
print(sth)
print(sth)
For more details about sourcery skip
, consult
this tutorial in the docs
Do you have a custom rule in your project? Or did you try to create a custom rule but didn't succeed? We'd love to hear from you.
Reach out at hello@sourcery.ai or on Twitter @SourceryAI. Join the Sourcery Discord Community.