Six more examples of ways to refactor your Python code, and why they are improvements
Dec 18, 2020
Writing clean, Pythonic code is all about making it as understandable, yet concise, as possible. This is the fifth part of a series on Python refactorings, based on those that can be done automatically by Sourcery. Catch the first part here, and the second, third and fourth parts here, here and here.
The focus of this series is on why these changes are good ideas, not just on how to do them.
Use with when opening a file to ensure closure
A common way of opening and using files in Python is:
file = open("welcome.txt")
data = file.read()
print(data)
file.close()
However, if an exception is thrown between the file being opened and closed, the call to file.close() may end up being skipped. One way to resolve this would be to use code like this. Here the try...finally structure ensures that the file will be closed.
file = open("welcome.txt")
try:
    data = file.read()
    print(data)
finally:
    file.close()
While this has resolved the file closure issue, it is quite verbose. Here's where Python's with statement comes to the rescue. Under the hood this behaves in the same way as the try...finally example: the file is closed for you as soon as the block is exited, even where an exception has been thrown.
with open("welcome.txt") as file:
    data = file.read()
    print(data)
This code is slightly shorter and easier to read - letting you focus on the logic that matters rather than the details of file closure.
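To check the claim about exceptions, here is a minimal sketch. The temporary file, the simulated ValueError and the captured variable are illustrative assumptions, not part of the original example:

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Welcome!")
    path = tmp.name

captured = None  # hypothetical name: keeps a reference to the file object
try:
    with open(path) as file:
        captured = file
        data = file.read()
        # Simulate something going wrong before the block ends normally.
        raise ValueError("simulated failure after reading")
except ValueError:
    pass

# The with block closed the file even though an exception was raised.
print(captured.closed)  # True

os.remove(path)  # clean up the temporary file
```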
We often want to pick something from a dictionary if the key is present, or use a default value if it isn't. One way of doing this is to use a conditional statement like this one:
def pick_hat(self, available_hats: Dict[Label, Hat]):
    if self.favourite_hat in available_hats:
        hat_to_wear = available_hats[self.favourite_hat]
    else:
        hat_to_wear = NO_HAT
    return hat_to_wear
A useful shortcut is that Python dictionaries have a get() method which lets you set a default value using the second parameter. We can therefore shorten the above code to this:
def pick_hat(self, available_hats: Dict[Label, Hat]):
    hat_to_wear = available_hats.get(self.favourite_hat, NO_HAT)
    return hat_to_wear
This has slimmed the code down and removed some duplication. A point to note is that if you don't pass in a default value, get() returns None when the key is missing.
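A small sketch of how get() behaves in each case. The dictionary contents and the string used for NO_HAT are made up for illustration:

```python
NO_HAT = "no hat"  # hypothetical placeholder for the default value

# Hypothetical data: the "beanie" key is present but maps to None.
available_hats = {"fedora": "grey fedora", "beanie": None}

print(available_hats.get("fedora", NO_HAT))   # grey fedora
print(available_hats.get("top hat", NO_HAT))  # no hat
print(available_hats.get("top hat"))          # None

# The default only applies to missing keys. A key that is present
# with the value None is returned as-is.
print(available_hats.get("beanie", NO_HAT))   # None
```

The last case is worth keeping in mind: get() checks whether the key exists, not whether its value is truthy.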
In Python you can access the end of a list by using negative indices. So my_list[-1] gets the final element, my_list[-2] gets the penultimate one and so on.
This means that you can turn this:
a = [1, 2, 3]
last_element = a[len(a) - 1]
into the simpler:
a = [1, 2, 3]
last_element = a[-1]
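As a quick check that the two forms agree, here is a short sketch. The extra variables are illustrative only:

```python
a = [1, 2, 3]

# a[-n] refers to the same element as a[len(a) - n].
last_element = a[-1]
penultimate = a[-2]
print(last_element == a[len(a) - 1])  # True
print(penultimate == a[len(a) - 2])   # True

# Negative indices work on other sequences too, such as strings and tuples.
last_char = "hat"[-1]
print(last_char)  # t

# An empty list still raises IndexError, just as a[len(a) - 1] would.
empty = []
try:
    empty[-1]
    raised = False
except IndexError:
    raised = True
print(raised)  # True
```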
A common code pattern is to have some guard clauses at the start of a function, to check whether certain conditions have been fulfilled and return early or raise an exception if not.
def f(a=None):
    if a is None:
        return 42
    else:
        # some long calculations
        var = (i**2 for i in range(a))
        return sum(var)
While this is perfectly valid code, it can run into problems with excessive nesting, particularly if the rest of the function is fairly long.
Here we can take advantage of the fact that we don't need the else if the main body of the if breaks the control flow by ending with return or raise.
Rewriting the function as shown here is logically equivalent.
def f(a=None):
    if a is None:
        return 42
    # some long calculations
    var = (i**2 for i in range(a))
    return sum(var)
Using one or more guard conditions in this way means the rest of the function no longer has to be indented. In general, the less indentation we have to deal with, the easier the code is to understand.
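The same idea extends to several guards in a row, including ones that raise. A minimal sketch, using a hypothetical average_price function that is not part of the original example:

```python
def average_price(prices=None):
    """Return the mean of prices, using guard clauses for the edge cases."""
    # Guard 1: a missing argument is an error, so raise straight away.
    if prices is None:
        raise ValueError("prices must be provided")
    # Guard 2: an empty list has no meaningful average, so return early.
    if not prices:
        return 0.0
    # Main body: no else needed, so it stays at the top indentation level.
    return sum(prices) / len(prices)


print(average_price([1, 2, 3]))  # 2.0
print(average_price([]))         # 0.0
```

Each guard ends with return or raise, so the main calculation never needs an else branch however many guards are added.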
Much of programming is about adding up lists of things, and Python has the built-in sum() function to help with this.
You can rewrite for loops which sum lists in this way:
Before:
total = 0
for hat in hats:
    total += hat.price
After:
total = sum(hat.price for hat in hats)
This is much shorter, which is a definite bonus. The code also now explicitly tells you what it is trying to do - sum the price of all the hats.
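To try both versions side by side, here is a self-contained sketch. The Hat dataclass and the prices are assumptions made up for the example:

```python
from dataclasses import dataclass


@dataclass
class Hat:
    # Hypothetical stand-in for whatever the hat objects look like.
    name: str
    price: int


hats = [Hat("fedora", 30), Hat("beanie", 12), Hat("top hat", 58)]

# Before: accumulate the total in a loop.
loop_total = 0
for hat in hats:
    loop_total += hat.price

# After: let sum() consume a generator expression.
total = sum(hat.price for hat in hats)

print(loop_total, total)  # 100 100

# With nothing to add up, sum() returns its start value, which defaults to 0.
empty_total = sum(hat.price for hat in [])
print(empty_total)  # 0
```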
This is a quick way to streamline code slightly. Where a value is set on each branch of an if and then immediately returned, instead return it directly from each branch.
This means that code like this:
def f():
    if condition:
        val = 42
    else:
        val = 0
    return val
is converted into:
def f():
    if condition:
        return 42
    else:
        return 0
This has removed an unnecessary intermediate variable which we had to mentally track.
As mentioned, each of these is a refactoring that Sourcery can automatically perform for you. We're planning on expanding this blog series out and linking them in as additional documentation, with the aim of turning Sourcery into a great resource for learning how to improve your Python skills. You can read the next part in the series here.
If you have any thoughts on how to improve Sourcery or its documentation, please do email us or hit me up on Twitter.