Six more examples of ways to refactor your Python code, and why they are improvements
Aug 12, 2020
Writing clean, Pythonic code is all about making it as understandable, yet concise, as possible. This is the third part of a series on Python refactorings, based on those that can be done automatically by Sourcery. Catch the first part here, and the second part here.
The focus of this series is on why these changes are good ideas, not just on how to do them.
Deeply nested functions can be very difficult to understand. As you read them you have to remember the conditions that hold for each level of nesting. This can be even more difficult in Python, given that there are no brackets to help define conditional blocks. An easy way to reduce nesting is to convert conditionals into guard clauses.
As an example let's look at this function:
def should_i_wear_this_hat(self, hat):
if isinstance(hat, Hat):
current_fashion = FASHION_API.get_fashion(FASHION_TYPE.HAT)
weather_outside = self.look_out_of_window()
is_stylish = self.evaluate_style(hat, current_fashion)
if weather_outside.is_raining:
print("Damn.")
return True
else:
print("Great.")
return is_stylish
else:
return False
This is quite hard to parse, given the two layers of nesting. When I get to the
else
at the bottom I need to flick back and forth a bit between that and the
if
test at the top before I've understood what's going on. This if
condition
is a check for an edge case, where something that isn't a hat is passed in.
Since we just return False
here it's a perfect place to introduce a guard
clause:
def should_i_wear_this_hat(self, hat):
if not isinstance(hat, Hat):
return False
current_fashion = get_fashion()
weather_outside = self.look_out_of_window()
is_stylish = self.evaluate_style(hat, current_fashion)
if weather_outside.is_raining:
print("Damn.")
return True
else:
print("Great.")
return is_stylish
We add a guard clause by inverting the condition and returning immediately. The edge case is now taken care of at the start of the function and I don't have to worry about it any more.
Having reduced the level of nesting, the rest of the function is now easier to read. Whether to add a guard clause is sometimes subjective. For really short functions it may not be worth doing. Where functions are longer or more complex it can often be a useful tool.
One pattern we sometimes see is a conditional where nothing happens in the main
body, and all of the action is in the else
clause.
if location == OUTSIDE:
pass
else:
take_off_hat()
In this case we can make the code shorter and more concise by swapping the main
body and the else
around. We have to make sure to invert the conditional, then
the logic from the else
clause moves into the main body.
if location != OUTSIDE:
take_off_hat()
else:
pass
We then have an else
clause which does nothing, so we can remove it.
if location != OUTSIDE:
take_off_hat()
This is easier to read, and the intent of the conditional is clearer. When reading the code I don't have to mentally invert it to understand it, since that has been done for me.
When declaring a list and filling it up with values one way that can come naturally is to declare it as empty and then append to it.
hats_i_own = []
hats_i_own.append("panama")
hats_i_own.append("baseball_cap")
hats_i_own.append("bowler")
This can be done in place, shortening the code and making the intent more explicit. Now I just need to glance at one line to see that I'm filling a variable with hats, rather than four.
hats_i_own = ["panama", "baseball_cap", "bowler"]
Doing it this way is also slightly more performant since it avoids the function
calls to append
. The same holds true for filling up other collection types
like sets and dictionaries.
The scope of local variables should always be as tightly defined as possible and practicable.
This means that:
You don't have to keep the variable in your working memory through the parts of the function where it's not needed. This cuts down on the cognitive load of understanding your code.
If code is in coherent blocks where variables are declared and used together, it makes it easier to split functions apart, which can lead to shorter, easier to understand methods.
If variables are declared far from their usage, they can become stranded. If the code where they are used is later changed or removed unused variables can be left sitting around, complicating the code unnecessarily.
Let's take another look at the earlier example.
def should_i_wear_this_hat(self, hat):
if not isinstance(hat, Hat):
return False
current_fashion = get_fashion()
weather_outside = self.look_out_of_window()
is_stylish = self.evaluate_style(hat, current_fashion)
if weather_outside.is_raining:
print("Damn.")
return True
else:
print("Great.")
return is_stylish
Here the is_stylish
variable isn't actually needed if the weather is rainy. It
could be moved inside the else
clause. This means we can also move the
current_fashion
variable, which is only used here. You do need to check that
these variables aren't used later in the function, which is easier if functions
are kept short and sweet.
Moving the assignment to current_fashion
also avoids a function call when the
weather is raining, which could lead to a performance improvement if it's an
expensive call.
def should_i_wear_this_hat(self, hat):
if not isinstance(hat, Hat):
return False
weather_outside = self.look_out_of_window()
if weather_outside.is_raining:
print("Damn.")
return True
else:
print("Great.")
current_fashion = get_fashion()
is_stylish = self.evaluate_style(hat, current_fashion)
return is_stylish
We could actually then go one step further and inline the is_stylish
variable.
This shows how small refactorings can often build on one another and lead to
further improvements.
def should_i_wear_this_hat(self, hat):
if not isinstance(hat, Hat):
return False
weather_outside = self.look_out_of_window()
if weather_outside.is_raining:
print("Damn.")
return True
else:
print("Great.")
current_fashion = get_fashion()
return self.evaluate_style(hat, current_fashion)
When iterating over a dictionary a good tip is that you can use items()
to let
you access the keys and values at the same time. This lets you transform this:
hats_by_colour = {"blue": ["panama", "baseball_cap"]}
for hat_colour in hats_by_colour:
hats = hats_by_colour[hat_colour]
if hat_colour in self.favourite_colours:
think_about_wearing(hats)
into this:
hats_by_colour = {"blue": ["panama", "baseball_cap"]}
for hat_colour, hats in hats_by_colour.items():
if hat_colour in self.favourite_colours:
think_about_wearing(hats)
This saves us the line that we used to assign to hats
, incorporating it into
the for loop. The code now reads more naturally, with a touch less duplication.
Something we often do is check whether a list or sequence has elements before we try and do something with it.
if len(list_of_hats) > 0:
hat_to_wear = choose_hat(list_of_hats)
A Pythonic way of doing this is just to use the fact that Python lists and
sequences evaluate to True
if they have elements, and False
otherwise.
This means we can write the above code more simply as:
if list_of_hats:
hat_to_wear = choose_hat(list_of_hats)
Doing it this way is a convention, set out in Python's PEP8 style guide. Once you've gotten used to doing it this way it does make the code slightly easier to read and a bit less cluttered.
As mentioned, each of these is a refactoring that Sourcery can automatically perform for you. We're planning on expanding this blog series out and linking them in as additional documentation, with the aim of turning Sourcery into a great resource for learning how to improve your Python skills. You can read the next part in the series here.
If you have any thoughts on how to improve Sourcery or its documentation please do email us or hit me up on Twitter