February 28, 2026

An AI agent coding skeptic tries AI agent coding, in excessive detail

Source: An AI agent coding skeptic tries AI agent coding, in excessive detail

The main lesson I learnt from working on these projects is that agents work best when you have approximate knowledge of many things with enough domain expertise to know what should and should not work. Opus 4.5 is good enough to let me finally do side projects where I know precisely what I want but not necessarily how to implement it.

Max Woolf (@minimaxir), a data scientist at BuzzFeed, produces a long and thorough post putting Claude Code and Codex through their paces from the perspective of someone who hasn't been super impressed by these offerings historically. It's one of a growing number of posts by agentic coding skeptics that acknowledge that, no, really things have changed.

The real annoying thing about Opus 4.6/Codex 5.3 is that it’s impossible to publicly say “Opus 4.5 (and the models that came after it) are an order of magnitude better than coding LLMs released just months before it” without sounding like an AI hype booster clickbaiting, but it’s the counterintuitive truth to my personal frustration.

I really appreciate Max’s willingness to update his priors and post this and wholeheartedly agree with his conclusion that the discourse has become mostly toxic and unhelpful.

All we can do is keep open minds and keep experimenting.

Loading...

Loading...

An AI agent coding skeptic tries AI agent coding, in excessive detail

Send your thoughts