How to ensure that some names just don't appear in your codebase?
Jan 05, 2023
In Harry Potter, witches and wizards are too afraid to speak Voldemort's name. Instead, they call him "You-Know-Who" or "He Who Must Not Be Named".
Assuming there are no words that you and your teammates don't dare to utter: what are the reasons to avoid a name in your codebase?
This post won't come up with a deny-list of generally bad names. Rather, it will show some patterns and situations where a name is unclear or misleading in that specific context.
It's easy to underestimate the importance of names. Usually, changing names won't change the code's behavior, so why bother? Well, it turns out that names greatly influence how well humans can understand code. As Adam Tornhill (software engineer, psychologist, and the founder of CodeScene) puts it in Software Design X-Rays:
We use the same mental processes to understand code as those we use in everyday life beyond our keyboards (evolution wasn’t kind enough to equip our brains with a coding center).
That ease of understanding in turn influences crucial metrics of a software project: How long does it take to fix a critical bug? How long does it take to add a new feature? How probable is it that adding a new feature will introduce a bug?
In our survey last year, "inconsistent or confusing terminology" was mentioned as the main source of technical debt. More than 55% of the respondents regarded this as an issue in their codebase.
As the code evolves, some names can become out-of-sync with the current architecture. Let's say your first implementation used the command pattern, but later, you changed it. Now, it makes sense to ensure that there are no remnants of the old implementation, like variable names with a "command" suffix or execute functions.
Let's say your company started out as "Great Groceries". After some years, it expanded and got renamed to "Super Store". Now (or at least soon after the big push), it makes sense to reflect this change in the codebase as well. The longer you wait with this, the bigger the confusion. And the bigger the chance that some new team members have no idea that the company was once called "Great Groceries".
This exercise is more difficult if the old brand name exists in various ways: For example, sometimes it's GreatGroceries spelled out, sometimes it's just a gg_ prefix. See also the section about synonyms.
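To get an overview of how many spellings of the old brand are still around, a small script can help. The following is only a sketch under stated assumptions, not something Sourcery provides: the function name find_brand_variants and the regular expression are hypothetical, and the pattern only covers the two variants mentioned above.

```python
from __future__ import annotations

import re

# Hypothetical pattern: the spelled-out brand in any casing (with an optional
# underscore between the two words), or the short `gg_` prefix at a word start.
OLD_BRAND = re.compile(r"great_?groceries|\bgg_", re.IGNORECASE)


def find_brand_variants(source: str) -> list[tuple[int, str]]:
    """Return (line_number, matched_text) pairs for every old-brand spelling."""
    hits = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for match in OLD_BRAND.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


sample = """\
class GreatGroceriesClient:
    pass

gg_api_key = "..."
great_groceries_url = "https://example.com"
logging_enabled = True
"""

for line_number, text in find_brand_variants(sample):
    print(f"line {line_number}: {text}")
```

The word boundary in front of gg_ is a deliberate choice: without it, unrelated names that merely contain those letters could be reported as well.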
There are several business changes that make a renaming necessary:
In all these cases, it makes sense to ensure that your codebase keeps up with the changes.
Synonyms make human languages more vivid. In code, however, they only cause confusion. Why do we have an object called holiday_policy and a property vacation_days_left? Do "holiday" and "vacation" refer to the same concept? 🤔
If not, what's the difference and where is it documented?
In Domain-Driven Design, the very first recommendation after the introductory chapter is to create a "ubiquitous language". Communication gets much smoother if you ensure that the same terminology is used throughout the project. This includes code and documentation, but also emails and meetings.
Often, it's difficult to tell whether one name or another is a more suitable choice. But it's clear that consistency is valuable.
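Once the team has agreed on one term, checking for the other one can be automated. Here is a minimal sketch using Python's standard ast module. It assumes, purely for illustration, that the team picked "vacation" over "holiday"; the mapping DISCOURAGED_TERMS and the function find_discouraged_names are hypothetical names, not part of any tool.

```python
from __future__ import annotations

import ast

# Hypothetical team decision: "vacation" is the agreed term, "holiday" is not.
DISCOURAGED_TERMS = {"holiday": "vacation"}


def find_discouraged_names(source: str) -> list[tuple[str, str]]:
    """Return (name, preferred_term) pairs for names using a discouraged synonym."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            names.add(node.id)  # plain variables
        elif isinstance(node, ast.Attribute):
            names.add(node.attr)  # attributes such as employee.some_field
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            names.add(node.name)  # function and class names
        elif isinstance(node, ast.arg):
            names.add(node.arg)  # function arguments

    findings = []
    for name in sorted(names):
        for discouraged, preferred in DISCOURAGED_TERMS.items():
            if discouraged in name.lower():
                findings.append((name, preferred))
    return findings


sample = """\
holiday_policy = load_policy()
employee.vacation_days_left = 12
"""

print(find_discouraged_names(sample))  # prints [('holiday_policy', 'vacation')]
```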
Some examples of synonyms where you might decide to keep only one of them:
A special case is abbreviations.
Again, the recurring theme: There's no general rule when you should use an abbreviation. But consistency matters. If one field is called ordered_pieces, don't call the other one delivered_pcs.
The book Refactoring introduced the concept of "code smells" in 1999. One change in the 2018 edition of the book: a new smell, "mysterious name", was added, and it appears as the very first element of the list.
It's difficult to provide a general example. But if a name keeps popping up in the codebase and you keep asking yourself what it means, it might be worth creating a Voldemort rule for it.
Sometimes, suffixes are attached to a name without really enhancing its meaning. Sometimes, the reason is that the class does several loosely related things and should probably be split up.
Some candidates for such too general names:
Again, this is very context-dependent. If you have a sophisticated system of Manager and Coordinator classes with clear responsibilities, leave them as they are. But if that's not the case, this rule might spot some god classes.
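To get a first impression of whether such suffixes point at god classes, you could list the affected classes together with their number of methods. This is a rough sketch, assuming that a high method count is a useful hint; the tuple GENERIC_SUFFIXES and the function find_generic_classes are hypothetical, and only the two suffixes named above are included.

```python
from __future__ import annotations

import ast

# Hypothetical list; the two suffixes are the ones mentioned in the text.
GENERIC_SUFFIXES = ("Manager", "Coordinator")


def find_generic_classes(source: str) -> list[tuple[str, int]]:
    """Return (class_name, method_count) for classes with a generic suffix."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef) and node.name.endswith(GENERIC_SUFFIXES):
            # Count only methods defined directly in the class body.
            method_count = sum(
                isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef))
                for child in node.body
            )
            findings.append((node.name, method_count))
    return findings


sample = """\
class OrderManager:
    def create(self): ...
    def cancel(self): ...
    def send_email(self): ...

class Invoice:
    def total(self): ...

class ShipmentCoordinator:
    pass
"""

for class_name, method_count in find_generic_classes(sample):
    print(f"{class_name}: {method_count} methods")
```

A method count alone proves nothing, but it helps to decide which of the flagged classes deserve a closer look first.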
This is a tricky one. Some names are completely fine per se, but they are easy to mix up with another concept in the codebase.
Some tips:
An example from our own Sourcery codebase is Proposer and Proposal. Using derivations of the same root word has a big advantage: It communicates that those concepts are closely related. A disadvantage is that they look similar and they sound even more similar. During a discussion where both proposers and proposals are mentioned multiple times, it's easy to lose the thread.
To create a rule avoiding a name, you can use the Sourcery Rules Generator.
You can install it with:
pip install sourcery-rules-generator
To create "voldemort" naming rules, run the command:
sourcery-rules voldemort create
Enter the name that should be avoided. For example: annual
You'll be prompted to provide:
=>
5 rules will be generated:
rules:
  - id: no-annual-function-name
    description: Don't use the name annual
    pattern: "\ndef ${function_name}(...):\n ...\n"
    condition: function_name.contains("annual")
    tags:
      - naming
      - no-annual
  - id: no-annual-function-arg
    description: Don't use the name annual
    pattern: "\ndef ...(...,${arg_name}: ${type?} = ${default_value?},...):\n ...\n"
    condition: arg_name.contains("annual")
    tags:
      - naming
      - no-annual
  - id: no-annual-class-name
    description: Don't use the name annual
    pattern: "\nclass ${class_name}(...):\n ...\n"
    condition: class_name.contains("annual")
    tags:
      - naming
      - no-annual
  - id: no-annual-property
    description: Don't use the name annual
    pattern: '${var}: ${type}'
    condition: var.contains("annual")
    tags:
      - naming
      - no-annual
  - id: no-annual-variable
    description: Don't use the name annual
    pattern: ${var} = ${value}
    condition: var.contains("annual")
    tags:
      - naming
      - no-annual
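To make the five rules more tangible, here is a rough approximation of what they look for, written with Python's standard ast module. This is not how Sourcery evaluates the rules; it's only a sketch for building intuition. The function find_forbidden_names, the kind labels, and the case-sensitive substring check are assumptions made for this example.

```python
from __future__ import annotations

import ast


def find_forbidden_names(source: str, forbidden: str = "annual") -> list[tuple[str, str]]:
    """Return sorted (kind, name) pairs, loosely mirroring the five rules above."""
    candidates = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            candidates.append(("function-name", node.name))
        elif isinstance(node, ast.arg):
            candidates.append(("function-arg", node.arg))
        elif isinstance(node, ast.ClassDef):
            candidates.append(("class-name", node.name))
        elif isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            # Annotated names such as `annual_total: int`
            candidates.append(("property", node.target.id))
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    candidates.append(("variable", target.id))
    # Assumption of this sketch: a plain, case-sensitive substring check.
    return sorted((kind, name) for kind, name in candidates if forbidden in name)


sample = """\
class BiannualReport:
    annual_total: int

def annual_summary(annual_rate, year):
    annual_bonus = annual_rate * 2
    return annual_bonus
"""

for kind, name in find_forbidden_names(sample):
    print(f"{kind}: {name}")
```

Note that a substring check also reports BiannualReport. Whether that's desired depends on the name you want to avoid, so it's worth trying a rule on the existing codebase before enforcing it.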
The situations described in this post are all cases where creating such a Voldemort rule might be a good idea.
Good names make it much easier to read and understand a codebase. Given that code is read much more often than it's written, it's worth investing some effort into clear and consistent naming. This doesn't just include coming up with good names when you add a feature. It's also essential to keep the names up-to-date with the various business and technology changes.
Do you have some more examples? We are curious to hear your stories about awesome and horrible names.
Reach out at hello@sourcery.ai or on Twitter @SourceryAI. Join the Sourcery Discord Community.