AI Coding is Barely Good

I've spent the last couple of months trying to get more and more comfortable working with Clojure, a functional programming language built for the JVM (and other things, but I'm starting from the basics). It's one of those languages with a nearly cult-like following: Clojure devs love to talk about how they use Clojure. It's also the first time I've used a lisp, and it's been a lot of fun getting the basics down. With whatever free time I find during the week, I've been trying to level up my knowledge so that I can build a toy web service.

Meanwhile, in my day job as a software engineer, the rise of AI tools has been impossible to ignore. You may have heard of Cursor, an editor built around LLMs so that it can collaborate with you as you build projects. It's massively popular, and I'm concerned that if I don't pick up some of these tools soon, I'll be left behind.

But... these tools are all still just okay, and so regularly wrong that I can't justify using them as an everyday tool in my life. Here are the things that need to improve to change my mind.

Dependency Roulette

LLMs have the attention span of a goldfish on espresso. Ask one to wire up a quick HTTP server and it will happily give you a working, but basic, solution using compojure. Maybe you don't know what that is, so I'll pivot to Python or JavaScript frameworks: it spits out a solution with FastAPI. A couple of messages later, while you're debugging something, the dependency is suddenly Flask. How? Is Flask really better for the exact thing I was looking at? Why would it be?
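For reference, the Clojure answer the model usually reaches for first looks something like this. This is a minimal sketch, assuming compojure and the ring Jetty adapter are on the classpath; the namespace name is my own invention:

```clojure
(ns hello.core
  (:require [compojure.core :refer [defroutes GET]]
            [compojure.route :as route]
            [ring.adapter.jetty :refer [run-jetty]]))

;; Routes are just a function from request map to response map.
(defroutes app
  (GET "/" [] "Hello, world")
  (route/not-found "Not found"))

(defn -main [& _args]
  ;; :join? false returns immediately instead of blocking the REPL.
  (run-jetty app {:port 3000 :join? false}))
```

There's nothing wrong with this answer on its own; the problem is that the next answer may quietly be built on something else entirely.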

Neither library is wrong, but it's hard to trust a tool that consistently changes its mind and doesn't tell you. I've even seen things like subtle renames of functions in code it's already written, because whatever new information it received and processed made it change its mind. Sometimes the new function or library is better, but not letting the user know about the swap is not a good standard.

The consequences at scale are pretty severe, in my opinion. With everyone using subtly different prompts and descriptions of problems, even an established codebase with well-defined choices in its dependency list will find it polluted with duplicated functionality. I've seen this at my work: clearly AI-generated code gets pushed up that includes two new dependencies duplicating each other's functionality, as well as that of something already in the repository. Aside from making such a repository difficult to maintain, it also increases the security risk: each library has its own set of vulnerabilities, and resolving security problems across all of them can be a real undertaking.

Bad at Build Tools

Simple service logic is the lifeblood of LLMs; they do about as good a job as a very junior engineer at tasks like that: well defined and confined to a constrained problem space. But one thing I spend an unreasonable amount of time working on and improving is something that LLMs really struggle with: build tools.

As I learn(ed) Clojure, I spent a huge amount of time trying to understand how projects were built. I poked around several larger, public repositories to see how things were laid out on disk and what lessons I could learn from that. I applied some of those lessons to my toy project, and then turned to a suite of LLMs to see if they could help me with one very specific build task that I want in every web service I work on: copying the database and API schemas into the repository as static files.

No LLM could do it, and they were all wrong in several fundamental ways. None of them understood Clojure's tools.build concept, particularly its isolation from everything else in the dependency list (and this despite apparently claiming to have read the documentation), which brought me an immense amount of confusion. I didn't have the knowledge at the time to immediately call out the incorrect use of the tools, but I was highly suspicious of the claims, since the documentation made it seem obvious that this wasn't possible. Clearly I had misunderstood! Unfortunately, that was not the case, and I spent a significant amount of time trying to work with different LLMs to find a solution. As a beginner to the language, this interaction was slow and increasingly frustrating as my trust in the tools deteriorated.
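For the record, the isolation the LLMs kept missing is visible right in the setup. A build.clj invoked with `clj -T:build ...` runs on its own classpath, not the project's, so the build program can't simply require your service's namespaces or lean on its dependencies. A minimal sketch of the shape of the thing (the file paths, alias layout, and pinned version here are my assumptions, not anything the LLMs or I actually produced):

```clojure
;; deps.edn — the :build alias declares its own deps. With -T (tool)
;; execution, the project's :paths and :deps are NOT on the classpath.
{:aliases
 {:build {:deps {io.github.clojure/tools.build {:mvn/version "0.10.5"}} ; use a current version
          :ns-default build}}}

;; build.clj — lives at the project root, ns must match :ns-default.
(ns build
  (:require [clojure.tools.build.api :as b]))

(defn copy-schemas
  "Copy schema files into the repo as static files.
   Run with: clj -T:build copy-schemas"
  [_]
  (b/copy-dir {:src-dirs   ["schemas"]            ; hypothetical source dir
               :target-dir "resources/schemas"}))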

Code for Me, Not for Thee

I tell junior developers and interns some version of this phrase pretty consistently: you're going to learn to stop writing homework code and start writing production code. That means well-named variables, consistent with terminology found elsewhere in the codebase. It means following the patterns others set for comments, for when to split out new modules/classes/whatever, for what the test-writing strategy should look like, and for the many other things that keep codebases maintainable even after 10, 15, or 20 years of development.

Homework code is about being correct, no matter the cost in readability or consistency. It's about following the textbook, itself a tool attempting to demonstrate concepts with a rather rigid set of constraints. This is the kind of code that LLMs are trained on, not production-grade code.

LLM code is easy for me to identify now: unnecessary comments (to an extreme), non-human separation of functionality or code "paragraphs", variable and function names that are inconsistent with the rest of the codebase. This is fine for the vibe coder just getting their project off the ground, but as someone who has had to work in a multi-million-line, 15-year-old codebase, I will say very plainly: that code will need to be rewritten.

Where to Now?

LLMs are powerful tools. Or, that's what I keep telling myself. For writing code, they're right around the level of an intern or junior developer, and for certain topics may punch a bit higher. But this is only for specific tasks: once the context window expands to include the rest of a system or project, they fall apart. Pretending they’re senior engineers is wishful thinking at best and negligent more often than not.

So, where to now? I'll continue to use LLMs as thought partners. They can (usually) tell me if something is possible, or maybe give me an alternative to my initial solution. That kind of feedback is something that I value. But for actual execution, my skepticism runs deep, and only deeper since I tried using these tools more often when learning Clojure.

Still, the future is, in some way or another, going to involve AI. I don't want to be fully left behind when that happens, so be prepared for many more mini-rants from me on the topic in the future.