Building blocks of a robust CLI App and how we've implemented them with Rich in the Sourcery CLI.
Nov 02, 2022
Over the last couple of months, we've been working to make the Sourcery CLI more robust. There were a couple of different factors driving this effort.
Rich appeared on our radar when we were searching for a way to display Markdown in the terminal. We had just added the custom rules. The fields description and explanation support Markdown and often contain links, listings, etc. In the IDEs, it was straightforward to display those Markdown elements, but what about the command line? We considered implementing the Markdown functionality ourselves but recognized that it's far from our core competence. We had heard some positive reviews about Rich and decided to give it a try.
This turned out to be an excellent choice. Introducing the Markdown element for custom rules was only the first step. Over the last few months, our CLI has kept growing. We've added several building blocks that make our CLI application more robust, user-friendly, and modern: displaying feedback during long-running operations, structuring the output, and showing both detailed information and a summary. It turned out that Rich has a neat solution for all of these. In this post, we would like to share our implementation for these common CLI use cases and some lessons we've learned along the way.
While it sounds obvious that you need to show some status during long-running operations, it's often not so clear which operations fall into this category. For example, it took us a while to recognize that authentication is a tricky step. In the local environment with a local server, it happens instantaneously. Connecting to the production server is a different story.
While testing the Sourcery CLI during a train ride, Tim observed that it was unusually unresponsive. It turned out that the authentication could take multiple minutes with a wobbly internet connection. We fixed that issue by adding some timeouts and used the status method of the Rich console to indicate that the authentication is in progress. It's handy that the Status can be used with a context manager:
def _authenticate(self) -> bool:
    with self.stderr_console.status("Authenticating"):
        ...  # Authentication logic here.
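As a self-contained illustration of the same pattern, here is a minimal sketch. The `authenticate` function and the `slow_check` callback are hypothetical stand-ins, not Sourcery code; only `Console(stderr=True)` and `Console.status` come from Rich.

```python
import time
from typing import Callable

from rich.console import Console

# Status messages go to stderr so they never mix with results on stdout.
stderr_console = Console(stderr=True)


def authenticate(check_credentials: Callable[[], bool]) -> bool:
    """Run a (hypothetical) credential check while a status is displayed."""
    with stderr_console.status("Authenticating"):
        # On a terminal, the spinner is visible while this block runs.
        # Leaving the block stops it, even if an exception is raised.
        return check_credentials()


def slow_check() -> bool:
    time.sleep(0.2)  # Stand-in for a network round trip.
    return True


result = authenticate(slow_check)
```

Because the status lives in a `with` block, there is no separate "stop the spinner" call to forget on early returns or errors.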
During the crucial step, the review of the code, we went one step further: We added a Progress element to show how many files have already been processed and how many are waiting. We defined custom columns, but for many use cases the default Progress element will be fine:
import os

from rich.progress import (
    BarColumn,
    MofNCompleteColumn,
    Progress,
    TextColumn,
    TimeRemainingColumn,
)

with Progress(
    TextColumn("{task.description}"),
    BarColumn(),
    MofNCompleteColumn(),
    TimeRemainingColumn(),
    console=progress_console,
    # No progress bar if the output isn't a terminal or we run in pre-commit.
    disable=not progress_console.is_terminal or os.getenv("PRE_COMMIT") == "1",
    expand=True,
    transient=True,
) as progress:
    review_files_task = progress.add_task("Reviewing files", total=len(file_asts))
    for file_ast in file_asts:
        progress.update(
            review_files_task,
            description=f"Reviewing {file_ast.module_path}",
            refresh=True,
        )
        # File review logic called here.
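For comparison, a minimal sketch with the default columns might look like the following. The `review_all` function and the sample file names are made up for illustration; the per-file work is only a placeholder.

```python
import io

from rich.console import Console
from rich.progress import Progress


def review_all(paths: list[str], console: Console) -> int:
    """Pretend to review each path and return how many were processed."""
    reviewed = 0
    # Default columns; the bar is disabled when the console isn't a terminal.
    with Progress(
        console=console,
        disable=not console.is_terminal,
        transient=True,
    ) as progress:
        task = progress.add_task("Reviewing files", total=len(paths))
        for path in paths:
            progress.update(task, description=f"Reviewing {path}")
            reviewed += 1  # Placeholder for the real per-file work.
            progress.advance(task)  # Mark one more file as completed.
    return reviewed


# force_terminal=False mimics redirected output, so no bar is drawn here.
buffer = io.StringIO()
count = review_all(
    ["a.py", "b.py", "c.py"],
    Console(file=buffer, force_terminal=False),
)
```

Passing the console in as a parameter keeps the function easy to exercise both interactively and with redirected output.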
When introducing such elements, it makes sense to consider which factors influence the execution time. Some examples from the Sourcery CLI:
The first two principles of the Command Line Interface Guidelines are human-first design and composability. A good CLI tool is practical for both humans and automation. It can be easily called in scripts and combined with other commands.
We started to pay more attention to composability after we had added the Status and Progress elements mentioned above. Will they mess up the output if we've redirected it to a file?
With the Progress element, you can tweak various options depending on the environment:
- console: Do you display it on stdout or stderr?
- disable
- transient
For example, the Progress during sourcery review is displayed on stdout or stderr depending on whether stdout is a terminal:
progress_console = (
    self.stdout_console if self.stdout_console.is_terminal else self.stderr_console
)
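To see this selection in isolation, here is a small sketch. The helper name `pick_progress_console` is hypothetical, and `force_terminal` is used only to simulate the two situations without a real terminal.

```python
import io

from rich.console import Console


def pick_progress_console(
    stdout_console: Console, stderr_console: Console
) -> Console:
    """Prefer stdout for progress; fall back to stderr if stdout is redirected."""
    return stdout_console if stdout_console.is_terminal else stderr_console


# Simulate `tool > out.txt`: stdout goes to a file, stderr stays on the terminal.
redirected_stdout = Console(file=io.StringIO(), force_terminal=False)
terminal_stderr = Console(stderr=True, force_terminal=True)
chosen = pick_progress_console(redirected_stdout, terminal_stderr)

# Simulate an interactive session where stdout is a terminal.
terminal_stdout = Console(force_terminal=True)
chosen_interactive = pick_progress_console(terminal_stdout, terminal_stderr)
```

In the redirected case the progress display ends up on stderr, so the file receiving stdout contains only the actual results.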
One caveat is that Rich, by default, resizes the content to fit the available width. This can lead to nicer output for human users but also to unexpected behaviour if the output gets redirected. The sourcery review command has a --csv option. When it's used, the output often gets redirected into a CSV file. For this reason, we ensure that each violation is displayed on exactly one line by setting soft_wrap:
self.stdout_console.print(
    f"{file_name}:{affected_line}:0: {proposal.id()} line {line_nr_in_diff}/{diff_length}",
    # This output should be machine-readable.
    # We need to ensure that Rich doesn't introduce any line breaks.
    soft_wrap=True,
)
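The effect of soft_wrap is easy to check with a deliberately narrow console. The sketch below uses a made-up sample record and a hypothetical `render` helper; it captures the output in memory instead of writing to a real terminal.

```python
import io

from rich.console import Console

# Made-up sample record, longer than the console width used below.
record = "src/app.py:12:0: no-long-functions line 3/40"


def render(text: str, soft_wrap: bool) -> str:
    """Print `text` to a narrow, non-terminal console and return the raw output."""
    buffer = io.StringIO()
    console = Console(file=buffer, width=20, force_terminal=False)
    console.print(
        text,
        soft_wrap=soft_wrap,
        # Keep the text literal: no markup, emoji codes, or highlighting.
        markup=False,
        emoji=False,
        highlight=False,
    )
    return buffer.getvalue()


wrapped = render(record, soft_wrap=False)  # Rich wraps at 20 columns.
single = render(record, soft_wrap=True)  # The record stays on one line.
```

With the default settings the record is split across several lines, which would break tools that assume one item per line.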
In the case of the sourcery review command, the --csv option is an obvious candidate for redirected output. Even if your application doesn't contain such a feature, it's highly recommended to ensure that it works well with redirected input and output. Some common scenarios your users might try:
- grep
- wc -l (Often with the underlying assumption that each line represents one item.)
As the Command Line Interface Guidelines puts it: "Whatever software you're building, you can be absolutely certain that people will use it in ways you didn't anticipate. Your software will become a part in a larger system — your only choice is over whether it will be a well-behaved part."
Besides displaying some status, it's also helpful to show partial results as soon as you have them. This way, users get some information and can even detect some errors before the whole operation has finished. A frequent pattern to implement this:
The sourcery review command displays a violation as soon as it has been detected. This way, users can quickly see which issues were found in the first files. If they interrupt a long-running review, that information also helps them decide on a reasonable next step, e.g. run a review only for a subdirectory, exclude some noisy rules, or include only a subset of high-priority rules.
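One possible way to get this behaviour is to structure the detection as a generator and print each result as it is yielded. The sketch below is an assumption about how such a loop could look, not Sourcery's implementation; the "rule" is a toy one that flags TODO comments.

```python
from typing import Iterable, Iterator


def find_violations(files: Iterable[tuple[str, str]]) -> Iterator[str]:
    """Yield one message per violation as soon as it is detected."""
    for name, source in files:
        for line_number, line in enumerate(source.splitlines(), start=1):
            if "TODO" in line:  # Toy rule for illustration only.
                yield f"{name}:{line_number}: todo-comment"


sample_files = [
    ("a.py", "x = 1\n# TODO: remove\n"),
    ("b.py", "# TODO: rename\ny = 2\n"),
]

reported = []
for message in find_violations(sample_files):
    # Each message is printed immediately, before later files are processed.
    print(message, flush=True)
    reported.append(message)
```

If the user interrupts the run after the first file, the findings for that file are already on the screen.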
The more features you add, the higher the chance that the output becomes overwhelming. This might be difficult to notice, because individually each piece makes a lot of sense. But you need some structure to form a coherent output from these various pieces.
After adding several new features and tweaks, we recognized that the sourcery review command had become quite overwhelming. As a first step, we introduced a Console.rule that draws a line to separate the in-progress output from the summary:
self.stderr_console.rule("Overview")
Structuring the summary was relatively straightforward, as long as we added each new piece to the same Markdown element. When we introduced a separate Table showing the number of issues per rule, we needed a small workaround to ensure that their stylings match:
table = Table(title="Issues by Rule ID", title_style="bold", title_justify="left")
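Putting the rule and the table together, a summary section could be sketched as follows. The column names, the `issues_by_rule_table` helper, and the sample rule IDs are assumptions for illustration; the Table arguments are the ones shown above.

```python
import io
from collections import Counter

from rich.console import Console
from rich.table import Table


def issues_by_rule_table(rule_ids: list[str]) -> Table:
    """Build a two-column table counting how often each rule ID occurs."""
    table = Table(title="Issues by Rule ID", title_style="bold", title_justify="left")
    table.add_column("Rule ID")
    table.add_column("Issues", justify="right")
    for rule_id, count in Counter(rule_ids).most_common():
        table.add_row(rule_id, str(count))
    return table


# Capture the output in memory; a real CLI would print to stdout or stderr.
buffer = io.StringIO()
console = Console(file=buffer, width=60, force_terminal=False)
console.rule("Overview")  # Separator between in-progress output and summary.
console.print(
    issues_by_rule_table(["no-long-functions", "no-print", "no-long-functions"])
)
output = buffer.getvalue()
```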
In the output of the sourcery login command, we include some tips about possible next steps. These are displayed in a Panel to separate them from the command output:
def _show_tip(self, message: str) -> None:
    if not (is_ci_environment() or is_pre_commit()):
        self.stderr_console.print(Panel(Markdown(message), style="gray46", title="tip"))
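A standalone variant of this idea is sketched below. The `is_automated` check is hypothetical: it assumes that a non-empty `CI` variable or `PRE_COMMIT=1` marks a non-interactive run, which may differ from what `is_ci_environment()` and `is_pre_commit()` actually do.

```python
import io

from rich.console import Console
from rich.markdown import Markdown
from rich.panel import Panel


def is_automated(environ: dict[str, str]) -> bool:
    """Hypothetical check: a non-empty CI or PRE_COMMIT=1 means automation."""
    return bool(environ.get("CI")) or environ.get("PRE_COMMIT") == "1"


def show_tip(console: Console, message: str, environ: dict[str, str]) -> bool:
    """Print the tip in a panel unless running in automation.

    Returns whether the tip was shown.
    """
    if is_automated(environ):
        return False
    console.print(Panel(Markdown(message), title="tip"))
    return True


buffer = io.StringIO()
console = Console(file=buffer, width=60, force_terminal=False)
shown = show_tip(console, "Try `sourcery review` next.", environ={})
skipped = show_tip(console, "Not displayed.", environ={"CI": "true"})
```

In a real application you would pass `dict(os.environ)` and a stderr console; the environment is a parameter here only to make both branches easy to try.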
After we had added the GPSG rules, we were surprised to see that the sourcery review command took much longer than before. The difference wasn't that big for test files with dummy code, but as soon as we reviewed a small repo with real code, the execution time went up.
After some experimentation with various test cases, we found the culprit: the no-long-functions rule. In that initial version, whenever Sourcery detected a violation of this rule, we printed the whole function. Which was, by definition, long. It turned out that while the Syntax element of Rich is awesome, it isn't optimized for printing several hundred lines of code. Which is understandable, because such an output would be unreadable anyway. :-)
We revisited our more complex and function-level rules to ensure that we print only a sensible amount of code for each of them. If your application displays code coming from user input, validation and cropping are probably good ideas.
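Cropping can be as simple as limiting the number of lines before handing the code to Syntax. The following is a minimal sketch under that assumption; `crop_code`, its default limit, and the marker comment are made up for illustration.

```python
import io

from rich.console import Console
from rich.syntax import Syntax


def crop_code(code: str, max_lines: int = 10) -> str:
    """Keep at most `max_lines` lines and note how many were left out."""
    lines = code.splitlines()
    if len(lines) <= max_lines:
        return code
    omitted = len(lines) - max_lines
    return "\n".join(lines[:max_lines] + [f"# ... {omitted} more lines omitted"])


# A synthetic 301-line function standing in for long user code.
long_function = "def f():\n" + "\n".join(f"    step_{i}()" for i in range(300))
cropped = crop_code(long_function, max_lines=5)

# Only the cropped snippet is passed to Syntax for highlighting.
buffer = io.StringIO()
console = Console(file=buffer, width=80, force_terminal=False)
console.print(Syntax(cropped, "python"))
```

The marker line tells the reader that the snippet is incomplete, so the cropped output is not mistaken for the whole function.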
The terminal is supposed to be quite consistent. But the more fancy elements and custom styling you use, the more it might look different with various color schemes.
Rich supports a style option for various elements. Use that with care. :-) If you start defining hard-coded colours, you might learn that they don't look that great with, for example, the "Green on Black" color scheme. The default styling usually looks good with various color schemes.
This is probably the most important of all lessons. Introducing Rich was a major improvement benefiting both the users of the CLI and the developers working on it.
For example, we were able to get rid of a lot of custom code we used to display coloured diffs. The Syntax element is both more convenient and more reliable across various terminals and color schemes.
Try out the new & improved Sourcery CLI and let us know what you think. Have any ideas what we should improve? Rich features we should be aware of? Reach out by email at hello@sourcery.ai or on Twitter @SourceryAI.