Six more examples of ways to refactor your Python code, and why they are improvements
Mar 29, 2021
Writing clean, Pythonic code is all about making it as understandable, yet concise, as possible. This is the seventh part of a series on Python refactorings, based on those that can be done automatically by Sourcery. Here are parts 1, 2, 3, 4, 5 and 6.
The focus of this series is on why these changes are good ideas, not just on how to do them.
Don't Repeat Yourself (DRY) is an important tenet of writing clean, maintainable code. Duplicated code bloats the code base, making it harder to read and understand. It often also leads to bugs. Where changes are made in only some of the duplicated areas unintended behaviour will often arise.
One of the main ways to remove duplication is to extract the common areas into another method and call that.
def extraction_example():
self.speed_slider = Scale(
self.parent, from_=1, to=10, orient=HORIZONTAL, label="Speed"
)
self.speed_slider.pack()
self.speed_slider.set(DEFAULT_SPEED)
self.speed_slider.configure(background="white")
self.force_slider = Scale(
self.parent, from_=1, to=10, orient=HORIZONTAL, label="Force"
)
self.force_slider.pack()
self.force_slider.set(DEFAULT_FORCE)
self.force_slider.configure(background="white")
Here you can see that the code to create a slider is repeated twice. In each
case the Scale
object is created and then the same three methods are called on
it, with some differing arguments.
You can extract this out into a method like this:
def extraction_example():
self.speed_slider = create_slider(self, "Speed", DEFAULT_SPEED)
self.force_slider = create_slider(self, "Force", DEFAULT_FORCE)
def create_slider(self, label, default_value):
result = Scale(self.parent, from_=1, to=10, orient=HORIZONTAL, label=label)
result.pack()
result.set(default_value)
result.configure(background="white")
return result
The original method is now shorter and clearer, and the new method is focused on doing one thing - creating a slider.
If more sliders are needed later the create_slider
method can be called
instead of duplicating the code again. Also if something changes in the way
sliders are created, we only need to change the code in one place instead of two
or more.
Recognising where you can extract duplicate, or more commonly almost-duplicate, code into its own method is a key skill to learn in improving your code quality.
A pattern we often see when setting the value of boolean variables is one like this (these two code snippets are equivalent):
if hat_string.startswith("hat"):
starts_with_hat = True
else:
starts_with_hat = False
starts_with_hat = True if hat_string.startswith("hat") else False
The aim is to set starts_with_hat
to True
if hat_string.startswith('hat')
returns True
, and False
if it returns False
.
This can be done much more directly by just setting the variable straight from the function call. To do this you must be sure that it returns a boolean!
starts_with_hat = hat_string.startswith("hat")
This is much shorter and clearer.
The Pythonic way to create a list, set or dictionary from a generator is to use comprehensions.
This can be done for lists, sets and dictionaries:
squares = list(x * x for x in y)
squares = set(x * x for x in y)
squares = dict((x, x * x) for x in xs)
squares = [x * x for x in y]
squares = {x * x for x in y}
squares = {x: x * x for x in xs}
Using the comprehensions rather than the methods is slightly shorter, and the dictionary comprehension in particular is easier to read in comprehension form.
Comprehensions are also slightly more performant than calling the methods.
One pattern that we sometimes see is that filtered collections are created by copying and deleting unwanted elements.
my_hats = {"bowler": 1, "sombrero": 2, "sun_hat": 3}
for hat in my_hats.copy():
if hat not in stylish_hats:
del my_hats[hat]
Using comprehensions is a much nicer way to achieve the same aim.
my_hats = {"bowler": 1, "sombrero": 2, "sun_hat": 3}
my_hats = {hat: value for hat, value in my_hats.items() if hat in stylish_hats}
Doing it this way round lets you define what you want to be in your output
collection rather than deleting what isn't needed. This reads much more
naturally than the for
loop version.
a is not b
and a not in b
to not a is b
and not a in b
Python's PEP 8 style guide is a nice source of little improvements you can make to your code.
One tip is that it's easier to read, and reads more like English prose, if you
write not in
and is not
rather than negating the whole in
or is
expression.
if not hat in hats:
scream("Where's my hat?")
if not hat is None:
scream("I have a hat!")
if hat not in hats:
scream("Where's my hat?")
if hat is not None:
scream("I have a hat!")
Where the same for
or while
loop is defined in all branches of a
conditional, the code can be considerably shortened and clarified by hoisting.
Instead of looping inside each branch, we can have one loop and move all
branches inside the loop.
def sing_song(choice):
style = STYLES[choice]
if style == "classical":
while i_like_singing():
sing_opera()
elif style == "folk":
while i_like_singing():
sing_folk_song()
elif style == "rock":
while i_like_singing():
sing_rock_ballad()
For example the above code can be shortened to the below:
def sing_song(choice):
style = STYLES[choice]
while i_like_singing():
if style == "classical":
sing_opera()
elif style == "folk":
sing_folk_song()
elif style == "rock":
sing_rock_ballad()
By doing this we've removed two lines of duplicated code, and made the intent of the code clearer.
As mentioned, each of these is a refactoring that Sourcery can automatically perform for you. We're planning on expanding this blog series out and linking them in as additional documentation, with the aim of turning Sourcery into a great resource for learning how to improve your Python skills. You can read the next part in the series here.
If you have any thoughts on how to improve Sourcery or its documentation please do email us or hit me up on Twitter