Python Refactorings - Part 6

Six more examples of ways to refactor your Python code, and why they are improvements

Date

Mar 09, 2021


Writing clean, Pythonic code is all about making it as understandable, yet concise, as possible. This is the sixth part of a series on Python refactorings, based on those that can be done automatically by Sourcery. Here are the first, second, third, fourth and fifth parts.

The focus of this series is on why these changes are good ideas, not just on how to do them.

Use str.join() instead of for loop

The most straightforward way to concatenate strings in Python is to just use the + operator:

hat_description = hat.colour + hat.type

This is perfectly fine when you are joining a small number of strings (though f-strings are usually the better choice for more complicated string handling).

The problem with using + or += comes when they are used to concatenate large lists of strings. For example you might use them in a for loop like this:

all_hats = ""
for hat in my_wardrobe.hats:
    all_hats += hat.description

This is cumbersome to read, and also isn't very performant. Strings are immutable, so each += may have to create a new string and copy the existing contents into it, which slows things down as the list grows.

Luckily Python strings come with the join method to solve this problem.

all_hats = "".join(hat.description for hat in my_wardrobe.hats)

This accomplishes the same task in one line and is generally faster for long lists of strings. You can also easily add a separator between each string, without having to worry about extra separators being added at the beginning or end of the result.

all_hats = ", ".join(hat.description for hat in my_wardrobe.hats)
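To make the comparison concrete, here is a minimal, self-contained sketch. The `Hat` dataclass and the sample descriptions are assumptions invented for illustration; only `str.join` itself comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class Hat:
    # Hypothetical stand-in for the hat objects used in this article.
    description: str


hats = [Hat("red fedora"), Hat("black bowler"), Hat("green sombrero")]

# Loop version: builds the result one concatenation at a time.
looped = ""
for hat in hats:
    looped += hat.description

# join version: the same result from a single call.
joined = "".join(hat.description for hat in hats)

# With a separator, join only places it *between* items.
listed = ", ".join(hat.description for hat in hats)

# Joining an empty collection simply gives an empty string.
nothing = ", ".join([])
```

One thing to watch for: `join` only accepts strings, so non-string items need converting first, for example with `str(item)`.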

Merge isinstance calls

A small tip - when you want to check whether something is one of multiple different types, instead of this:

if isinstance(hat, Bowler) or isinstance(hat, Fedora):
    self.wear(hat)

you can merge the calls into:

if isinstance(hat, (Bowler, Fedora)):
    self.wear(hat)

This is shorter while staying nice and easy to read.
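The sketch below checks that the two forms give the same answer. The class hierarchy and the two helper functions are hypothetical, made up so the example runs on its own.

```python
class Hat:
    """Hypothetical base class, for illustration only."""


class Bowler(Hat):
    pass


class Fedora(Hat):
    pass


class Sombrero(Hat):
    pass


def is_wearable_separate(hat):
    # Two isinstance calls joined with `or`.
    return isinstance(hat, Bowler) or isinstance(hat, Fedora)


def is_wearable_merged(hat):
    # One isinstance call with a tuple of candidate types.
    return isinstance(hat, (Bowler, Fedora))


hats = [Bowler(), Fedora(), Sombrero()]
separate_results = [is_wearable_separate(hat) for hat in hats]
merged_results = [is_wearable_merged(hat) for hat in hats]
```

Note that the second argument has to be a tuple of types; passing a list raises a `TypeError`.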

Replace comparison with min/max call

We often need to work out the smallest or largest of two values, and might think of using a pattern like this one to do so:

if first_hat.price < second_hat.price:
    cheapest_hat_price = first_hat.price
else:
    cheapest_hat_price = second_hat.price

A quicker way to do this in Python is to use the built-in min and max functions. This code is a shorter and clearer way to achieve the same result:

cheapest_hat_price = min(first_hat.price, second_hat.price)

The same functions offer a shortcut for when we want to put a cap (with min) or a floor (with max) on the value of a variable. For example, to cap a price at 10:

Before:

if sale_price >= 10:
    sale_price = 10

After:

sale_price = min(sale_price, 10)
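As a small sketch of both directions, the snippet below caps a value with `min`, floors one with `max`, and combines the two. The `clamp` helper and the sample numbers are assumptions for illustration, not something Sourcery generates.

```python
def clamp(value, low, high):
    """Hypothetical helper: keep value within the range [low, high]."""
    return max(low, min(value, high))


# Cap: the result is never above 10.
capped_price = min(12, 10)

# Floor: the result is never below 0.
floored_stock = max(-3, 0)

# Both at once, for values above, below and inside the range.
clamped = [clamp(price, 0, 10) for price in (12, -3, 5)]
```

When combining the two, it is worth double-checking which bound goes with which function: `min` pairs with the upper bound and `max` with the lower one.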

Replace unused for index with underscore

Sometimes in a for loop we just want some code to run a certain number of times, and don't actually make use of the index variable.

For example take this code:

for hat in my_wardrobe.hats:
    self.shout("Hurrah!")

We have introduced a new variable, hat, which we have to note when reading the code, but actually we don't need it and could replace it with _:

for _ in my_wardrobe.hats:
    self.shout("Hurrah!")

It is a convention in Python to use _ as a throwaway name for unused variables. This means your brain can learn to safely ignore these, reducing the overhead of understanding the code. Where you see this in a for loop it is immediately clear that the loop is just used to repeat a block of code and we don't care about the value being iterated over.
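Here is a runnable sketch of the same convention. The `shout` function is a hypothetical stand-in for `self.shout` that records messages instead of printing them, and the `hat_records` data is invented for the example.

```python
shouts = []


def shout(message):
    # Hypothetical stand-in for self.shout: record the message.
    shouts.append(message)


# Repeat a block a fixed number of times without naming the counter.
for _ in range(3):
    shout("Hurrah!")

# The same convention works when unpacking values we don't need.
hat_records = [("fedora", "grey", 40), ("bowler", "black", 55)]
names = [name for name, _, _ in hat_records]
```

One caveat: some codebases bind `_` to a translation function (a common `gettext` convention), in which case a different throwaway name avoids shadowing it.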

Simplify testing of sequence comparisons

A very common requirement when coding is to check whether or not a sequence (a string, list or tuple) is empty before doing some processing with it. One straightforward way of doing this is to use the len built-in function like this:

if len(my_wardrobe.hats) == 0:
    self.shout("Alarm!")

However, there is a shorter and more Pythonic way of doing this, and it is one of the style recommendations given in PEP 8: use the fact that in Python, empty sequences evaluate to False in a boolean context. This means that you can rewrite the above like this:

if not my_wardrobe.hats:
    self.shout("Alarm!")

The equivalent case for sequences with at least one element is:

if len(my_wardrobe.hats) > 0:
    self.shout("I have hats!")

which can be turned into this, since non-empty sequences evaluate to True:

if my_wardrobe.hats:
    self.shout("I have hats!")

This simplification takes a small amount of getting used to, but once you're acclimated to it the code reads very naturally.
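The following sketch shows the truthiness check across the three sequence types mentioned above. The `hat_status` helper is hypothetical and simply returns the message that the article's examples shout.

```python
def hat_status(hats):
    """Hypothetical helper using the truthiness check."""
    if not hats:
        return "Alarm!"
    return "I have hats!"


# Empty list, tuple and string are all falsy.
empty_cases = [hat_status(seq) for seq in ([], (), "")]

# Any sequence with at least one element is truthy.
full_cases = [hat_status(seq) for seq in (["fedora"], ("bowler",), "cap")]

# Caveat: None is falsy too, so it is treated like "no hats" here,
# whereas len(None) would raise a TypeError.
none_case = hat_status(None)
```

The two forms are therefore only interchangeable when the value is known to be a sequence; if `None` needs different handling, an explicit `is None` check is still required.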

Lift repeated conditional into its own if statement

Quite often while reading code you will see a pattern like this:

if isinstance(hat, Sombrero) and hat.colour == "green":
    self.wear(hat)
elif isinstance(hat, Sombrero) and hat.colour == "red":
    destroy(hat)

This duplication of one of the conditions makes things more difficult to read, and carries with it all the usual problems of code duplication. While normally we try to avoid adding nesting to the code, in this case it makes sense to lift the duplicated conditional into its own if statement:

if isinstance(hat, Sombrero):
    if hat.colour == "green":
        self.wear(hat)
    elif hat.colour == "red":
        destroy(hat)

It is now clearer at a glance that the whole if..elif chain relates only to sombreros and not other types of hat.
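To check that the two shapes behave identically, here is a self-contained sketch. The classes and the `decide_flat` / `decide_nested` functions are assumptions for illustration; they return a string naming the action rather than calling `self.wear` or `destroy`.

```python
class Hat:
    """Hypothetical hat with a colour attribute."""

    def __init__(self, colour):
        self.colour = colour


class Sombrero(Hat):
    pass


class Fedora(Hat):
    pass


def decide_flat(hat):
    # Original shape: the isinstance check is repeated in each branch.
    if isinstance(hat, Sombrero) and hat.colour == "green":
        return "wear"
    elif isinstance(hat, Sombrero) and hat.colour == "red":
        return "destroy"
    return "ignore"


def decide_nested(hat):
    # Refactored shape: the shared check is lifted into an outer if.
    if isinstance(hat, Sombrero):
        if hat.colour == "green":
            return "wear"
        elif hat.colour == "red":
            return "destroy"
    return "ignore"


hats = [Sombrero("green"), Sombrero("red"), Sombrero("blue"), Fedora("green")]
flat_results = [decide_flat(hat) for hat in hats]
nested_results = [decide_nested(hat) for hat in hats]
```

If the original chain ends with an `else` branch, more care is needed: lifting the condition would move that `else` inside the outer `if`, so it would no longer run for other types of hat.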

Conclusion

As mentioned, each of these is a refactoring that Sourcery can automatically perform for you. We're planning on expanding this blog series and linking the posts in as additional documentation, with the aim of turning Sourcery into a great resource for learning how to improve your Python skills. You can read the next part in the series here.

If you have any thoughts on how to improve Sourcery or its documentation, please do email us or hit me up on Twitter.