Six more examples of ways to refactor your Python code, and why they are improvements
Mar 09, 2021
Writing clean, Pythonic code is all about making it as understandable, yet concise, as possible. This is the sixth part of a series on Python refactorings, based on those that can be done automatically by Sourcery. Here are the first, second, third, fourth and fifth parts.
The focus of this series is on why these changes are good ideas, not just on how to do them.
The most straightforward way to concatenate strings in Python is to just use the + operator:
hat_description = hat.colour + hat.type
This is perfectly fine when you are joining together small numbers of strings (though f-strings are the best choice for doing more complicated string handling).
The problem with using + or += comes when they are used to concatenate large lists of strings. For example, you might use them in a for loop like this:
all_hats = ""
for hat in my_wardrobe.hats:
    all_hats += hat.description
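This is cumbersome to read, and also isn't very performant. Strings are immutable, so in general a new string has to be created on every iteration of the for loop, copying everything built up so far, which slows things down as the list grows. (CPython can sometimes optimise += on strings in place, but this is an implementation detail that shouldn't be relied on.)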
Luckily, Python strings come with the join method to solve this problem:
all_hats = "".join(hat.description for hat in my_wardrobe.hats)
This accomplishes the same task in one line and is quite a lot faster to boot. You can also add a separator in between each string easily, without having to worry about extra separators being added at the beginning or end of the result.
all_hats = ", ".join(hat.description for hat in my_wardrobe.hats)
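To make the point about separators concrete, here is a minimal sketch comparing the two approaches. The `Hat` dataclass and the sample descriptions are hypothetical stand-ins for the contents of `my_wardrobe.hats`, which the article doesn't define.

```python
from dataclasses import dataclass


@dataclass
class Hat:
    # Hypothetical stand-in for the hat objects used in this article.
    description: str


hats = [Hat("red fedora"), Hat("green sombrero"), Hat("black bowler")]

# Building the string with += leaves a trailing separator to clean up.
with_loop = ""
for hat in hats:
    with_loop += hat.description + ", "

# str.join only places the separator between items.
with_join = ", ".join(hat.description for hat in hats)

print(repr(with_loop))
print(repr(with_join))
```

Note that join also copes with an empty collection, returning an empty string without any special-casing.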
A small tip - when you want to check whether something is one of multiple different types, instead of this:
if isinstance(hat, Bowler) or isinstance(hat, Fedora):
    self.wear(hat)
you can merge the calls into:
if isinstance(hat, (Bowler, Fedora)):
    self.wear(hat)
This is shorter while staying nice and easy to read.
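As a small runnable sketch of the merged check: the class hierarchy and the `is_formal` helper below are assumptions made up for illustration, since the article doesn't show how `Bowler` and `Fedora` are defined.

```python
class Hat:
    """Hypothetical base class for the examples."""


class Bowler(Hat):
    pass


class Fedora(Hat):
    pass


class Sombrero(Hat):
    pass


# The tuple can be named once and reused wherever the check is needed.
FORMAL_HATS = (Bowler, Fedora)


def is_formal(hat):
    # isinstance accepts a tuple of types and returns True if any match.
    return isinstance(hat, FORMAL_HATS)


print(is_formal(Bowler()))
print(is_formal(Sombrero()))
```

Giving the tuple a name is optional, but it can help when the same group of types is checked in several places.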
We often need to work out the smaller or larger of two values, and might think of using a pattern like this one to do so:
if first_hat.price < second_hat.price:
    cheapest_hat_price = first_hat.price
else:
    cheapest_hat_price = second_hat.price
A quicker way to do this in Python is to use the built-in min and max functions. This code is a shorter and clearer way to achieve the same result:
cheapest_hat_price = min(first_hat.price, second_hat.price)
The same functions offer a shortcut for when we want to put a cap or a floor on the value of a variable:
Before:
if sale_price >= 10:
    sale_price = 10
After:
sale_price = min(sale_price, 10)
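The cap and floor can also be combined. The `clamp` helper below is a hypothetical name, not something from the article or the standard library; it is just a sketch of how min and max compose, assuming `low <= high`.

```python
def clamp(value, low, high):
    """Limit value to the range [low, high]; assumes low <= high."""
    # min applies the cap, max applies the floor.
    return max(low, min(value, high))


# A cap on its own, as in the example above.
capped_price = min(12, 10)

# A floor on its own: never sell for less than 1.
floored_price = max(0.5, 1)

print(capped_price, floored_price, clamp(5, 1, 10))
```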
Sometimes in a for loop we just want some code to run a certain number of times, and don't actually make use of the loop variable.
For example take this code:
for hat in my_wardrobe.hats:
    self.shout("Hurrah!")
We have introduced a new variable, hat, which we have to note when reading the code, but actually we don't need it and could replace it with _:
for _ in my_wardrobe.hats:
    self.shout("Hurrah!")
It is a convention in Python to use _ as a throwaway name for unused variables. This means your brain can learn to safely ignore these, reducing the overhead to understand the code. Where you see this in a for loop it is immediately clear that the loop is just used to repeat a block of code and we don't care about the value being iterated over.
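The same convention applies when repeating something a fixed number of times with range, and when unpacking a tuple whose middle value isn't needed. A minimal sketch, where the `shouts` list and the `hat_record` tuple are made up for illustration:

```python
# Repeat an action three times; the counter itself is never used.
shouts = []
for _ in range(3):
    shouts.append("Hurrah!")

# Unpack only the fields we care about from a (type, colour, price) record.
hat_record = ("fedora", "red", 25)
hat_type, _, price = hat_record

print(shouts)
print(hat_type, price)
```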
A very common requirement when coding is to check whether or not a sequence (a string, list or tuple) is empty before doing some processing with it. One straightforward way of doing this is to use the len built-in function like this:
if len(my_wardrobe.hats) == 0:
    self.shout("Alarm!")
There is a shorter and more Pythonic way of doing this, however, which is one of the style recommendations given in PEP 8. This is to use the fact that in Python, empty sequences evaluate to False. This means that you can rewrite the above like this:
if not my_wardrobe.hats:
    self.shout("Alarm!")
The equivalent case for sequences with at least one element is:
if len(my_wardrobe.hats) > 0:
    self.shout("I have hats!")
which can be turned into this, since non-empty sequences evaluate to True:
if my_wardrobe.hats:
    self.shout("I have hats!")
This simplification takes a small amount of getting used to, but once you're acclimated to it the code reads very naturally.
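Here is a minimal sketch of the truthiness check in action. The `hat_alarm` function is a hypothetical wrapper around the article's two messages. One caveat worth knowing: the shortcut relies on the object being a real sequence (or another container that defines its own length), so it does not work for generators, which are always truthy even when they would yield nothing.

```python
def hat_alarm(hats):
    """Return the message the examples above would shout."""
    if not hats:
        return "Alarm!"
    return "I have hats!"


# Empty lists, tuples and strings are all falsy.
print(hat_alarm([]), hat_alarm(()), hat_alarm(""))

# Any non-empty sequence is truthy.
print(hat_alarm(["fedora"]))

# Caveat: a generator object is truthy even if it yields no items.
lazy_hats = (hat for hat in [])
print(bool(lazy_hats))
```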
Quite often while reading code you will see a pattern like this:
if isinstance(hat, Sombrero) and hat.colour == "green":
    self.wear(hat)
elif isinstance(hat, Sombrero) and hat.colour == "red":
    destroy(hat)
This duplication of one of the conditions makes things more difficult to read, and carries with it all the usual problems of code duplication. While normally we try to avoid adding nesting to the code, in this case it makes sense to lift the duplicated conditional into its own if statement:
if isinstance(hat, Sombrero):
    if hat.colour == "green":
        self.wear(hat)
    elif hat.colour == "red":
        destroy(hat)
It is now clearer at a glance that the whole if..elif chain relates only to sombreros and not other types of hat.
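To check that the two forms behave the same, here is a small sketch that returns the chosen action as a string instead of calling `self.wear` or `destroy`. The `Hat`, `Sombrero` and `Fedora` classes and both function names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hat:
    # Hypothetical hat type with just the attribute the example needs.
    colour: str


class Sombrero(Hat):
    pass


class Fedora(Hat):
    pass


def action_flat(hat):
    # Original form: the isinstance check is repeated in each branch.
    if isinstance(hat, Sombrero) and hat.colour == "green":
        return "wear"
    elif isinstance(hat, Sombrero) and hat.colour == "red":
        return "destroy"
    return None


def action_nested(hat):
    # Refactored form: the shared condition is checked once.
    if isinstance(hat, Sombrero):
        if hat.colour == "green":
            return "wear"
        elif hat.colour == "red":
            return "destroy"
    return None


samples = [Sombrero("green"), Sombrero("red"), Sombrero("blue"), Fedora("green")]
for hat in samples:
    print(hat, action_flat(hat), action_nested(hat))
```

The hoisting is only a straight swap when every branch in the chain shares the condition. If the chain had an else or a further elif meant to handle other hat types, it would need to stay outside the new outer if.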
As mentioned, each of these is a refactoring that Sourcery can automatically perform for you. We're planning on expanding this blog series out and linking them in as additional documentation, with the aim of turning Sourcery into a great resource for learning how to improve your Python skills. You can read the next part in the series here.
If you have any thoughts on how to improve Sourcery or its documentation, please do email us or hit me up on Twitter.