Delivering and maintaining AI-generated code.
Jun 19, 2023
GPT and other large language models can produce huge volumes of code quickly. This allows for faster prototyping and iterative development, and for trying out multiple solutions. But it can also leave us with a larger amount of messy code to maintain.
"People have talked about technical debt for a long time, and now we have a brand new credit card here that is going to allow us to accumulate technical debt in ways we were never able to do before," said Armando Solar-Lezama, a professor at the MIT's Computer Science & Artificial Intelligence Laboratory to the Wall Street Journal.
This article explores several ways to improve the code generated by these powerful tools and to fit it into your project.
GPT can produce high-quality code, but it often needs some "encouragement" to do so. The GPT best practices described by OpenAI can help with code generation as well.
if. If the condition is unlikely to happen. => Use a try-except."
As an example, let's start with a quite general prompt:
Generate a Python function that takes a url and determines whether it's https.
from urllib.parse import urlparse

def is_https(url):
    try:
        result = urlparse(url)
        return result.scheme == "https"
    except Exception as e:
        print(f"An error occurred: {e}")
        return False
except Exception is way too broad. Let's try to improve it.
Please rewrite the code above, so that you specify which exceptions can actually occur.
And we do indeed get a better result:
from urllib.parse import urlparse

def is_https(url):
    try:
        result = urlparse(url)
        return result.scheme == "https"
    except ValueError as e:
        print(f"Invalid URL, could not parse: {e}")
        return False
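To see what the narrowed handler actually covers, here is a minimal sketch that calls the function with three inputs. The sample URLs are assumptions chosen for illustration; the last one has an unbalanced bracket, which is one case where urlparse raises ValueError.

```python
from urllib.parse import urlparse


def is_https(url):
    try:
        result = urlparse(url)
        return result.scheme == "https"
    except ValueError as e:
        print(f"Invalid URL, could not parse: {e}")
        return False


# Illustrative inputs (not from the original conversation):
print(is_https("https://example.com"))  # well-formed HTTPS URL
print(is_https("http://example.com"))   # well-formed, but not HTTPS
print(is_https("https://[::1"))         # unbalanced "[" makes urlparse raise ValueError
```

Note that other failure modes, such as passing a non-string argument, are no longer swallowed and will surface as errors, which is usually what you want.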
ChatGPT was trained on a "diverse range of internet text", with the last update in September 2021 (as stated by ChatGPT itself).
The logical consequence is that it doesn't know about libraries, language constructs, etc. that were released after September 2021. For example, the last Python release ChatGPT knows about is 3.9. Thus, it won't use any language features introduced in 3.10 or 3.11. You can work around this problem by explicitly explaining these new features in your prompts.
A less obvious consequence is that ChatGPT might prefer an older syntax or library, even if the newer one was released before the training cutoff. A plausible reason: The "diverse range of internet text" contains a ton of tutorials using the "old option", but only a few texts with the new one.
For example, if you ask ChatGPT to write a Python script communicating with an API, it usually provides an answer using the requests library. However, if you explicitly ask it to use httpx, ChatGPT provides an answer using various async features.
Just like human coders, ChatGPT tends to meet expectations better if it knows what those expectations are. As the GPT Best Practices guide in the OpenAI docs puts it:
GPTs can't read your mind.
If your project follows widespread standards, like PEP-8 in Python, chances are good that ChatGPT will generate code that fits into your project. If your project has some unusual conventions, you need to teach ChatGPT "your way" first.
If you have explicit coding standards, you can use them in your prompts:
Some examples of such prompts:
A crucial question is: What happens to the code after ChatGPT has generated it? In the next sections, we'll explore two main aspects of this:
A frequent cause of tech debt is code duplication. And a frequent cause of code duplication is that the developer didn't recognize that the functionality already exists.
Whether code has been written by a human or by our AI friend, it's important to put it in the right place. A clear project structure means:
Let's say your project includes both a something.util and a something.whatever.util. And something.util contains a function like this:
from urllib.parse import urlsplit

def is_https(url: str) -> bool:
    scheme, _, _, _, _ = urlsplit(url)
    return scheme == "https"
Now, you're looking only in something.whatever.util and don't see this function. So you ask ChatGPT:
Generate a Python function that takes a url and determines whether it's https.
Sure enough, you get this result:
from urllib.parse import urlparse

def is_https(url):
    try:
        result = urlparse(url)
        return result.scheme == "https"
    except Exception as e:
        print(f"An error occurred: {e}")
        return False
=> Now you have two functions for the same functionality, implemented in two different ways (urlsplit vs. urlparse).
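One possible way to catch such duplicates before they pile up is a small script that lists function names defined in more than one module. The sketch below is an assumption about how this could look, not part of any existing tool: find_duplicate_functions is a hypothetical helper, and it takes module sources as strings so that the example stays self-contained. It only uses Python's standard ast module.

```python
import ast
from collections import defaultdict


def find_duplicate_functions(sources: dict[str, str]) -> dict[str, list[str]]:
    """Map each function name defined in more than one module to those modules.

    `sources` maps a module name to its source code (hypothetical input format).
    """
    locations = defaultdict(set)
    for module_name, source in sources.items():
        tree = ast.parse(source)
        # Walk the whole tree so that nested functions and methods are included.
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                locations[node.name].add(module_name)
    return {
        name: sorted(modules)
        for name, modules in locations.items()
        if len(modules) > 1
    }


# The two modules from the example above, reduced to stubs.
sources = {
    "something.util": "def is_https(url):\n    return True\n",
    "something.whatever.util": (
        "def is_https(url):\n    return False\n\n"
        "def other():\n    pass\n"
    ),
}
print(find_duplicate_functions(sources))
# → {'is_https': ['something.util', 'something.whatever.util']}
```

A name-based check like this only finds functions that share a name; two functions doing the same job under different names still need a human reviewer or a clear project structure.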
The four-year research program behind the book Accelerate came to a similar conclusion. The authors identified two architectural characteristics that correlated with high performance:
- We can do most of our testing without requiring an integrated environment.
- We can and do deploy or release our application independently of other applications/services it depends on.
📖 Forsgren, Nicole - Humble, Jez - Kim, Gene: Accelerate March 2018, IT Revolution Press Chapter 5
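The first characteristic can be illustrated with the is_https function from earlier: it has no external dependencies, so it can be tested without any integrated environment. The following is a minimal sketch using the standard unittest module; the test class name and the sample URLs are assumptions made for this example.

```python
import unittest
from urllib.parse import urlsplit


def is_https(url: str) -> bool:
    scheme, _, _, _, _ = urlsplit(url)
    return scheme == "https"


class IsHttpsTest(unittest.TestCase):
    """Runs in isolation: no network, database, or other services needed."""

    def test_https_url(self):
        self.assertTrue(is_https("https://example.com/path"))

    def test_http_url(self):
        self.assertFalse(is_https("http://example.com"))

    def test_missing_scheme(self):
        self.assertFalse(is_https("example.com"))
```

Saved in a test module, this can be run with python -m unittest on a developer machine or in a CI pipeline.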
In the book Accelerate, Nicole Forsgren and her team dedicate a chapter to Technical Practices and how they influence the performance of software teams. The chapter focuses on a single concept: continuous delivery.
According to their findings, continuous delivery decreases lead time, change fail rates, and the time necessary to restore the service. It also reduces deployment pain and (probably related to that) burnout.
The research team identified key capabilities that drive continuous delivery. The practices they found beneficial include:
📖 Forsgren, Nicole - Humble, Jez - Kim, Gene: Accelerate March 2018, IT Revolution Press Chapter 4 and Appendix A
While there hasn't been similar research yet on code generated by AI tools, our prediction is that the factors above will continue to matter. In fact, as our tools generate more code faster, we'll need even more reliable systems to verify and deliver that code.
GPT and other large language models provide an amazing new way to write a lot of code quickly. The code created by our virtual friend faces the same challenges as code created by humans. It has to:
Getting better and more robust code out of these tools is an intriguing puzzle. It combines new elements like prompt engineering with established practices for continuous delivery and maintenance.