Python Refactorings

Python code — Photo by Chris Ried on Unsplash

Writing clean, Pythonic code is all about making it as understandable, yet concise, as possible. This is the seventh part of a series on Python refactorings, based on those that can be done automatically by Sourcery. Here are parts 1, 2, 3, 4, 5 and 6.

The focus of this series is on why these changes are good ideas, not just on how to do them.

Extract duplicate code into a method

Don't Repeat Yourself (DRY) is an important tenet of writing clean, maintainable code. Duplicated code bloats the code base, making it harder to read and understand. It often also leads to bugs. Where changes are made in only some of the duplicated areas unintended behaviour will often arise.

One of the main ways to remove duplication is to extract the common areas into another method and call that.

def extraction_example():
    self.speed_slider = Scale(
        self.parent, from_=1, to=10, orient=HORIZONTAL, label="Speed"
    )
    self.speed_slider.pack()
    self.speed_slider.set(DEFAULT_SPEED)
    self.speed_slider.configure(background="white")

    self.force_slider = Scale(
        self.parent, from_=1, to=10, orient=HORIZONTAL, label="Force"
    )
    self.force_slider.pack()
    self.force_slider.set(DEFAULT_FORCE)
    self.force_slider.configure(background="white")

Here you can see that the code to create a slider is repeated twice. In each case the Scale object is created and then the same three methods are called on it, with some differing arguments.

You can extract this out into a method like this:

def extraction_example():
    self.speed_slider = create_slider(self, "Speed", DEFAULT_SPEED)
    self.force_slider = create_slider(self, "Force", DEFAULT_FORCE)


def create_slider(self, label, default_value):
    result = Scale(self.parent, from_=1, to=10, orient=HORIZONTAL, label=label)
    result.pack()
    result.set(default_value)
    result.configure(background="white")

    return result

The original method is now shorter and clearer, and the new method is focused on doing one thing - creating a slider.

If more sliders are needed later the create_slider method can be called instead of duplicating the code again. Also if something changes in the way sliders are created, we only need to change the code in one place instead of two or more.

Recognising where you can extract duplicate, or more commonly almost-duplicate, code into its own method is a key skill to learn in improving your code quality.

Simplify assignment of boolean variables

A pattern we often see when setting the value of boolean variables is one like this (these two code snippets are equivalent):

if hat_string.startswith("hat"):
    starts_with_hat = True
else:
    starts_with_hat = False

starts_with_hat = True if hat_string.startswith("hat") else False

The aim is to set starts_with_hat to True if hat_string.startswith('hat') returns True, and False if it returns False.

This can be done much more directly by just setting the variable straight from the function call. To do this you must be sure that it returns a boolean!

starts_with_hat = hat_string.startswith("hat")

This is much shorter and clearer.

Use list, set or dictionary comprehensions directly instead of calling list(), dict() or set()

The Pythonic way to create a list, set or dictionary from a generator is to use comprehensions.

This can be done for lists, sets and dictionaries:

Before:

squares = list(x * x for x in y)

squares = set(x * x for x in y)

squares = dict((x, x * x) for x in xs)

After:

squares = [x * x for x in y]

squares = {x * x for x in y}

squares = {x: x * x for x in xs}

Using the comprehensions rather than the methods is slightly shorter, and the dictionary comprehension in particular is easier to read in comprehension form.

Comprehensions are also slightly more performant than calling the methods.

Filter using comprehensions rather than copying and deleting

One pattern that we sometimes see is that filtered collections are created by copying and deleting unwanted elements.

my_hats = {"bowler": 1, "sombrero": 2, "sun_hat": 3}
for hat in my_hats.copy():
    if hat not in stylish_hats:
        del my_hats[hat]

Using comprehensions is a much nicer way to achieve the same aim.

my_hats = {"bowler": 1, "sombrero": 2, "sun_hat": 3}
my_hats = {hat: value for hat, value in my_hats.items() if hat in stylish_hats}

Doing it this way round lets you define what you want to be in your output collection rather than deleting what isn't needed. This reads much more naturally than the for loop version.

Prefer `a is not b` and `a not in b` to `not a is b` and `not a in b`

Python's PEP 8 style guide is a nice source of little improvements you can make to your code.

One tip is that it's easier to read, and reads more like English prose, if you write not in and is not rather than negating the whole in or is expression.

Before

if not hat in hats:
    scream("Where's my hat?")

if not hat is None:
    scream("I have a hat!")

After:

if hat not in hats:
    scream("Where's my hat?")

if hat is not None:
    scream("I have a hat!")

Hoist loop from if

Where the same for or while loop is defined in all branches of a conditional, the code can be considerably shortened and clarified by hoisting. Instead of looping inside each branch, we can have one loop and move all branches inside the loop.

def sing_song(choice):
    style = STYLES[choice]
    if style == "classical":
        while i_like_singing():
            sing_opera()
    elif style == "folk":
        while i_like_singing():
            sing_folk_song()
    elif style == "rock":
        while i_like_singing():
            sing_rock_ballad()

For example the above code can be shortened to the below:

def sing_song(choice):
    style = STYLES[choice]
    while i_like_singing():
        if style == "classical":
            sing_opera()
        elif style == "folk":
            sing_folk_song()
        elif style == "rock":
            sing_rock_ballad()

By doing this we've removed two lines of duplicated code, and made the intent of the code clearer.

Conclusion

As mentioned, each of these is a refactoring that Sourcery can automatically perform for you. We're planning on expanding this blog series out and linking them in as additional documentation, with the aim of turning Sourcery into a great resource for learning how to improve your Python skills. You can read the next part in the series here.

If you have any thoughts on how to improve Sourcery or its documentation please do email us or hit me up on Twitter