Six more examples of ways to refactor your Python code, and why they are improvements
Oct 07, 2020
Writing clean, Pythonic code is all about making it as understandable, yet concise, as possible. This is the fourth part of a series on Python refactorings, based on those that can be done automatically by Sourcery. Catch the first part here, and the second and third parts here and here.
The focus of this series is on why these changes are good ideas, not just on how to do them.
We should always be searching out opportunities to remove duplicated code. A good place to do so is where there are multiple identical blocks inside an if..elif chain.
def process_payment(payment):
    if payment.currency == "USD":
        process_standard_payment(payment)
    elif payment.currency == "EUR":
        process_standard_payment(payment)
    else:
        process_international_payment(payment)
Here we can combine the first two blocks using or to get this:
def process_payment(payment):
    if payment.currency == "USD" or payment.currency == "EUR":
        process_standard_payment(payment)
    else:
        process_international_payment(payment)
Now if we need to change the process_standard_payment(payment) line we can do it in one place instead of two. This becomes even more important if these blocks involve multiple lines.
Let's see if we can refine the previous example a little bit further.
We often have to compare a value to one of several possible others. When written out like this we have to look through each comparison to understand it, as well as mentally processing the boolean operator.
def process_payment(payment):
    if payment.currency == "USD" or payment.currency == "EUR":
        process_standard_payment(payment)
    else:
        process_international_payment(payment)
By using the in operator and moving the values we are comparing to into a collection we can simplify things.
def process_payment(payment):
    if payment.currency in ["USD", "EUR"]:
        process_standard_payment(payment)
    else:
        process_international_payment(payment)
This has avoided a little bit of duplication, and the conditional can now be taken in and understood with one glance.
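As a minimal sketch of the same membership check, the values can also live in a named constant. The names STANDARD_CURRENCIES and payment_route below are hypothetical, and the function returns a label instead of processing a real payment so that it can run on its own. A set is used here rather than the list above; both work with in.

```python
# Hypothetical constant: the currencies handled by the standard flow.
STANDARD_CURRENCIES = {"USD", "EUR"}


def payment_route(currency):
    """Return which (hypothetical) processing route a currency would take."""
    if currency in STANDARD_CURRENCIES:
        return "standard"
    return "international"


print(payment_route("USD"))  # standard
print(payment_route("JPY"))  # international
```

Naming the collection means a new standard currency can be added in one place without touching the conditional.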
Here we find ourselves setting a variable to a value if that value evaluates to True, and otherwise using a default.
This can be written long-form as:
if input_currency:
    currency = input_currency
else:
    currency = DEFAULT_CURRENCY
or using an if expression:
currency = input_currency if input_currency else DEFAULT_CURRENCY
Both can be simplified to the following, which is a bit easier to read and avoids the duplication of input_currency.
currency = input_currency or DEFAULT_CURRENCY
It works because or evaluates its left-hand side first. If that value is truthy then currency will be set to it and the right-hand side will not be evaluated. If it is falsy (for example None or an empty string) the right-hand side will be evaluated and currency will be set to DEFAULT_CURRENCY.
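One caveat worth keeping in mind is that or falls back to the default for any falsy value, not only None. The sketch below uses hypothetical names and values (choose_currency, choose_amount, a default of 10) to show the behaviour, along with an explicit None check for cases where a falsy value such as 0 is meaningful.

```python
DEFAULT_CURRENCY = "USD"  # hypothetical default for illustration


def choose_currency(input_currency):
    # `or` returns the left operand if it is truthy, otherwise the right one.
    return input_currency or DEFAULT_CURRENCY


print(choose_currency("EUR"))  # EUR
print(choose_currency(None))   # USD
print(choose_currency(""))     # USD, because an empty string is falsy


def choose_amount(input_amount, default_amount=10):
    # When a falsy value such as 0 is meaningful, test for None explicitly.
    return default_amount if input_amount is None else input_amount


print(choose_amount(0))     # 0
print(choose_amount(None))  # 10
```

For a currency code an empty string is rarely a valid input, so the or form is a reasonable fit there.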
A pattern that is often used in Python for loops is to use range(len(list)) to generate a range of numbers that can be iterated over.
for i in range(len(currencies)):
    print(currencies[i])
If the index i is only used to do list access this code can be improved by iterating over the list directly, as in the example below.
for currency in currencies:
    print(currency)
This code is easier to understand, and a lot less cluttered. In particular being able to use a meaningful name for currency greatly improves readability.
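A further practical difference, sketched here with a hypothetical generator called currency_codes, is that direct iteration works on any iterable, whereas the range(len(...)) form needs an object that has a length and supports indexing.

```python
def currency_codes():
    """Hypothetical generator that yields currency codes one at a time."""
    yield "USD"
    yield "EUR"
    yield "GBP"


# range(len(...)) fails for a generator because it has no length.
try:
    range(len(currency_codes()))
except TypeError as error:
    print("TypeError:", error)

# Direct iteration works for lists and generators alike.
seen = []
for currency in currency_codes():
    seen.append(currency)

print(seen)  # ['USD', 'EUR', 'GBP']
```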
Sometimes when iterating over a dictionary you only need to use the dictionary keys.
for currency in currencies.keys():
    process(currency)
In this case the call to keys() is unnecessary, since the default behaviour when iterating over a dictionary is to iterate over the keys.
for currency in currencies:
    process(currency)
The code is now slightly cleaner and easier to read, and avoiding a function call will yield a (small) performance improvement.
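As a quick check of this behaviour, the sketch below uses a hypothetical dictionary of exchange rates (the values are made up for illustration) to show that iterating over the dictionary gives the same keys as calling keys(), and that items() is still the right tool when the values are needed too.

```python
# Hypothetical mapping of currency codes to exchange rates.
currencies = {"USD": 1.0, "EUR": 0.85, "GBP": 0.77}

# Iterating over the dictionary yields the same keys as .keys().
assert list(currencies) == list(currencies.keys())

# Membership tests also work on the dictionary directly.
print("EUR" in currencies)  # True

# When the values are needed as well, use .items().
for code, rate in currencies.items():
    print(code, rate)
```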
When iterating over a list you sometimes need access to a loop counter that will let you know the index of the element you are utilising. It can be tempting to just write one yourself like this:
i = 0
for currency in currencies:
    print(i, currency)
    i += 1
However, there's a built-in Python function, enumerate, that lets you generate an index directly, removing two unneeded lines of code.
for i, currency in enumerate(currencies):
    print(i, currency)
When reading this we don't have to worry about the book-keeping of the i variable, letting us focus in on the code that really matters.
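If the counter should begin somewhere other than zero, enumerate also accepts an optional start argument. A minimal sketch, using hypothetical sample data:

```python
currencies = ["USD", "EUR", "GBP"]  # hypothetical sample data

# enumerate accepts an optional start value for the counter.
numbered = [
    f"{position}. {currency}"
    for position, currency in enumerate(currencies, start=1)
]

print(numbered)  # ['1. USD', '2. EUR', '3. GBP']
```

This avoids writing i + 1 inside the loop body when producing human-readable numbering.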
As mentioned, each of these is a refactoring that Sourcery can automatically perform for you. We're planning on expanding this blog series out and linking them in as additional documentation, with the aim of turning Sourcery into a great resource for learning how to improve your Python skills. You can read the next part in the series here.
If you have any thoughts on how to improve Sourcery or its documentation please do email us or hit me up on Twitter.