AI-assisted coding
In the last year, I've been using AI coding agents to help me write software. While they have in some cases improved my productivity, we're still quite far from having AI agents write complete applications. In this article I hope to provide some insight into the agents that I've used so far and the results they've given me. In addition, I'd like to share some tips and tricks that I've picked up along the way. Hopefully, some of these tips will also help you when you get started on integrating AI coding agents into your daily work.
This post is a follow-up to a podcast I recently participated in on the subject of AI-assisted coding. If you're interested, you can listen to the Zorya podcast on Spotify or SoundCloud.
Let's start with an overview of the agents I've been using so far and the results I've gotten.
An overview of the agents I've used
Goose
URL: https:
An AI agent that lets you use all the most popular LLMs (Large Language Models). I learned about Goose while visiting RenderATL and participating in one of the workshops. Goose is great for developing a quick prototype without having to worry too much about the quality of the underlying code. I used Goose together with Claude 3.5 Sonnet and got mixed results. My biggest frustration was that I kept hitting rate limits, which admittedly might have been an issue with my Claude subscription, but I didn't experience similar issues with other coding agents.
Pros
- Quick to get started with
- Great for prototyping
Cons
- Not code-oriented
- Limited context windows and no context summarization (at least not in the version I used)
- Generated applications are hit-or-miss
- Doesn't come with its own subscription
Cursor
URL: https:
An AI coding agent that ships as its own VSCode-based editor. Overall, I got good results with it, and I was quite pleased with the speed at which it completed some tasks. My main issue with Cursor is primarily that it comes with VSCode, and while I don't have anything against VSCode, I still prefer other editors for JVM development. I used Cursor to write big parts of the personal finance application that I talked about in the podcast.
Pros
- Code-oriented
- Fast
- Integrated IDE
- The cheapest monthly subscription didn't feel limiting
Cons
- Can get stuck in loops on more complex tasks
GitHub Copilot
URL: https:
GitHub Copilot comes as a VSCode extension but is available in JetBrains IDEs as well. I used it primarily from IntelliJ IDEA. So far I've used GitHub Copilot a lot in my professional work, and I've found it to be quite helpful. Since this was in a professional context, I limited the use of Copilot to well-defined tasks that had to get done or to tasks for which sufficient example code was available: writing extra unit tests, generating small functions, writing documentation snippets, and so on.
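To give an idea of the scope I mean by "well-defined", think of tasks on the level of this (hypothetical, illustrative) utility function, where the intent is unambiguous and the result is easy to review:

```kotlin
import java.math.BigDecimal
import java.text.DecimalFormat
import java.text.DecimalFormatSymbols
import java.util.Locale

/**
 * Formats a monetary amount for display, e.g. 1234.5 becomes "1,234.50 EUR".
 * A small, self-contained function like this is easy to review and a good fit
 * for delegating to a coding agent.
 */
fun formatAmount(amount: BigDecimal, currency: String = "EUR"): String {
    val formatter = DecimalFormat("#,##0.00", DecimalFormatSymbols(Locale.US))
    return "${formatter.format(amount)} $currency"
}
```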
Pros
- Integrated in IDE
- Pricing is reasonable compared to other coding agents
Cons
- Not as smooth as Cursor and Junie
JetBrains Junie
URL: https:
So far, Junie has been my favorite coding assistant. I've used it primarily on personal projects, but I'm quite happy with the results I've gotten so far. I've used it to write a lot of the code in the personal finance application mentioned earlier and have used it successfully on some other personal projects as well.
Pros
- Good integration with JetBrains IDEs
- Faster than Copilot, though it feels slightly less performant than Cursor
Cons
- More expensive than other coding agents
My AI-assisted coding process
When thinking about how you want to use an AI coding agent as part of your daily work, it's important to consider that LLMs are probabilistic models and that your prompt pushes the coding agent in a certain direction. The shorter your prompt, the bigger the gamble you're taking on the code the agent will generate. Short prompts can be great for experimentation and prototyping, but for writing production-quality code, you will often find yourself providing more context and writing more detailed prompts.
Typically, the bigger the task at hand, the more detailed the prompt should be and the more context you should provide for the coding agent to perform the task.
For writing unit tests for an existing class, a prompt like this can be a good starting point:
Write extra unit tests for the BalanceCalculationService.
This, however, gives the AI agent a lot of freedom on how it writes those tests. Does it need to add them to an existing test class? What naming convention and code style should be followed? Should the agent run the tests after writing the test code? What are test cases that should definitely be covered?
To avoid a lot of back-and-forth with the AI agent, it is often better to provide a more detailed prompt from the start:
Write extra unit tests for the BalanceCalculationService.
Add these unit tests to the existing BalanceCalculationServiceTest class.
The BalanceCalculationService is responsible for calculating the balance of an account. Accounts are identified by a
unique account name and can be retrieved using the AccountRepository. Transactions for an account can be retrieved
using the TransactionRepository.
The main method to test is the calculateBalance method, which takes an account name, as a String, and returns the
balance of the account as a BigDecimal.
Some test cases that should be covered include:
* Calculating the balance of an account with no transactions
* Calculating the balance of an account with a single transaction
* Calculating the balance of an account with multiple transactions
* Calculating the balance of an account with multiple transactions, resulting in a negative balance
* Calculating the balance for an account that does not exist (should result in a failure)
Add extra test cases as needed to cover scenarios that I might have missed.
Use the same coding style as in the existing test cases.
Frameworks that are being used for testing: kotest and mockk
Build the code with Gradle and fix any linting issues.
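With a prompt like this, the result I'd expect (and usually get, after a round of review) looks roughly like the sketch below. The AccountRepository and TransactionRepository methods, the constructor, and the exception type are assumptions for illustration; the real classes in the project may differ.

```kotlin
import io.kotest.assertions.throwables.shouldThrow
import io.kotest.core.spec.style.StringSpec
import io.kotest.matchers.shouldBe
import io.mockk.every
import io.mockk.mockk
import java.math.BigDecimal

class BalanceCalculationServiceTest : StringSpec({
    // Mocked collaborators; the repository interfaces are assumed for this sketch.
    val accountRepository = mockk<AccountRepository>()
    val transactionRepository = mockk<TransactionRepository>()
    val service = BalanceCalculationService(accountRepository, transactionRepository)

    "an account with no transactions has a zero balance" {
        every { accountRepository.findByName("savings") } returns Account("savings")
        every { transactionRepository.findByAccountName("savings") } returns emptyList()

        service.calculateBalance("savings") shouldBe BigDecimal.ZERO
    }

    "multiple transactions can add up to a negative balance" {
        every { accountRepository.findByName("savings") } returns Account("savings")
        every { transactionRepository.findByAccountName("savings") } returns listOf(
            Transaction(BigDecimal("10.00")),
            Transaction(BigDecimal("-25.00")),
        )

        service.calculateBalance("savings") shouldBe BigDecimal("-15.00")
    }

    "an account that does not exist results in a failure" {
        every { accountRepository.findByName("unknown") } returns null

        shouldThrow<IllegalArgumentException> {
            service.calculateBalance("unknown")
        }
    }
})
```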
For developing a completely new feature, it is usually better to first create an implementation plan in a Markdown file and then use that Markdown file as part of the prompt. AI agents typically are not very good at completing big, unbounded tasks, so it's important to split the implementation plan into clear subtasks and to ask the AI agent to implement each subtask individually.
My workflow for more complex tasks:
- Work with the AI agent to come up with an initial feature description
- Work with the AI agent to turn the feature description into an implementation plan
- Distill the implementation plan into subtasks or stories
- Ask the AI agent to implement each subtask one by one
- Provide prompts requesting changes to the code as needed
Note that while I use the agent in each of the above steps, I typically do modify the generated Markdown files manually. Coding agents tend to dream up extra features that you never wanted and never needed. Often, though, I find that the results are quite good and that I've missed some obvious corner cases or functionality.
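To make this concrete, distilling the plan for a (hypothetical) recurring-transactions feature in the finance application might result in subtasks like these, each of which then becomes its own prompt:
* Add a RecurringTransaction entity and the corresponding database migration
* Implement a RecurringTransactionRepository with basic CRUD operations
* Add a scheduled job that turns due recurring transactions into regular transactions
* Expose endpoints for creating, listing, and deleting recurring transactions
* Write unit tests for the new service and repository code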
Tips & tricks
Context window
Coding agents have a limited context window. In small conversations you will typically not run into this, but when building more complicated features, you might notice that the AI agent at some point starts summarizing the earlier parts of the conversation. The longer a conversation goes on, the more likely you are to run into weird results. Often the best solution is to start a new conversation when the agent starts freewheeling.
Small fixes throughout a codebase take a lot of time
A typical example of this is when you ask the coding agent to change something related to code style in the tests that it has written. The agent will often go test by test trying to fix the issue, every time going through a 'thinking' step to evaluate the next action to take. Sometimes this starts taking a lot of time, even for seemingly trivial changes.
Use version control
Most agents offer rollback functionality, but even so I recommend using version control on any reasonably sized project. Sometimes going back to the version you had three or four prompts ago is just a lot more convenient via version control. This becomes even more important when you run the coding agent in Brave or unrestricted mode.
Use unrestricted mode (at your own risk)
If you keep the agent in restricted mode, you're missing out on a lot of the productivity benefits: you often have to approve every command the agent wants to run. Many agents offer an allowlist, but even with one I feel like I'm missing out on a lot of the benefits. Make use of proper backups and version control, and you shouldn't have to worry too much about the agent doing something wrong.
1 + 1 = 3
As the name Large Language Model implies, LLMs are quite powerful when it comes to language and text manipulation. When it comes to mathematics, though, I've seen some interesting results: the agent failing to correctly implement a simple application of the Pythagorean theorem, or giving weird explanations for questions that involve mathematics.
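For reference, the kind of implementation I'd expect to be trivial here (and that an agent still managed to get wrong) is a one-liner:

```kotlin
import kotlin.math.hypot
import kotlin.math.sqrt

// c = sqrt(a^2 + b^2): the hypotenuse of a right triangle with legs a and b.
fun hypotenuse(a: Double, b: Double): Double = sqrt(a * a + b * b)

// The Kotlin standard library even ships a dedicated function for this.
fun hypotenuseStdlib(a: Double, b: Double): Double = hypot(a, b)
```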
Debugging CSS or other styling issues
So far I haven't had a great track record with coding agents when it comes to, for example, CSS styling issues. At some point I even tried providing images to the agent, but this costs more tokens and the results are mixed. In many cases, I think this comes back to the earlier point about mathematics and real reasoning, which is not something most of the agents excel at today.
Token budgets
When using several of the above coding agents, I ran into limits on the most basic plans when trying to develop any non-trivial application. If you really want to benefit from using a coding agent without interruptions, you might want to consider one of the more expensive plans, so you don't run out of tokens in the middle of developing a new feature.
Limited AI agent iterations
All the AI agents that I've tried so far limit in some way the number of attempts or iterations they perform on the task at hand. This is because, if left unchecked, the AI agent might iterate on the code indefinitely; and if it gets stuck in a loop, it might never finish at all. For more complicated tasks, you might have to ask the agent to 'continue' several times, and in some cases provide a new prompt listing the steps that still need to be completed.
On several occasions I had code that didn't compile, tests that didn't pass, and linting errors after the agent told me it had completed all the steps in the task.
So always double-check that all checks pass after the agent has finished.
Conclusion
AI coding agents are a great productivity booster. Used in the wrong way, however, they can also generate extra work by introducing hard-to-track bugs or requiring a lot of iterations before giving you a good result.