The agent said SUCCESS. The linter disagreed.

There’s a repair agent inside go-tool-base now. When you run gtb generate command, it doesn’t just spit out a file and wish you luck. An agent takes the generated code, builds it, runs the tests, and fixes whatever it broke, looping until the thing actually works (or until it’s tried the same fix five times and admits defeat). The whole point is that the generator hands you code that’s ready, not code that’s nearly ready and quietly now your problem.

So it stung a bit when I realised the agent had been holding itself to a lower bar than I’d hold any junior to. And I was the one who’d set the bar.

What “done” meant to the agent

The agent is a loop with real tools: it can build, test, read files, write files, tidy the module, and run golangci-lint. It works through them, and when it’s happy it replies with the word “SUCCESS” and the loop stops. On the Go side, the check is exactly that blunt:

if strings.Contains(strings.ToUpper(resp), "SUCCESS") {
    return nil
}

That’s the whole gate (agent.go). There’s no clever verification on my end that the agent actually did its homework. It does the work, it tells me it’s done, and I believe it. Which is fine, as long as the agent and I agree on what “done” means.

We didn’t.

The instruction that made lint optional

The agent decides it’s finished by following a numbered list in its system prompt. Here’s the line that did the damage:

If there are lint issues, use ‘golangci_lint’.

Read that the way the agent would. “If there are lint issues”… well, how would it know? The only way to find out is to run golangci-lint. But the instruction makes running golangci-lint the thing you do once you already know there are issues. It’s a chicken with no egg. And the SUCCESS condition at the bottom of the list never mentioned lint at all:

When the project builds successfully and tests pass, reply with “SUCCESS”.

So the agent did the sensible thing, given its orders. It built the code, ran the tests, saw both go green, and declared victory. golangci-lint was sat right there in its toolbox, unused, because nothing ever told it the job wasn’t finished until lint was clean too. I’d handed it a linter and then written a prompt that let it walk straight past it.

The galling part is that the linter was never the missing piece. The golangci_lint tool had been registered the whole time, and it even runs with --fix, so it’ll quietly clear the trivial stuff and only surface what actually needs a decision. The capability was there. The instructions just never required it.

The fix was words, not code

Here’s the part I find genuinely interesting. I didn’t add a check. There is no new gate in the Go. The fix is four lines of English:

Run ‘go_build’, ‘go_test’ and ‘golangci_lint’ in the project directory… Run all three; a clean build and passing tests do not imply clean lint.
Reply with “SUCCESS” only once ‘go_build’, ‘go_test’ AND ‘golangci_lint’ all pass with no errors and no reported issues.

That’s it. Lint moves from a remediation step you reach for once you somehow already know there’s a problem, into the gate itself. “Done” now means three green lights, not two.

It nags at me a little, that one. The reliability of an agent that writes and fixes real code came down to whether one sentence of instructions was precise enough. When your success criteria are a paragraph of prose, vagueness in that paragraph is a bug, the same as a vague type or an off-by-one. The spec just happens to be written in English, and the thing reading it is a language model that will cheerfully take the cheap reading if you leave it lying around. That’s the same lesson the goblin who wouldn’t stay dead taught me from the other direction: with these tools, what you say is what you get, and what you don’t say is fair game.

Leave it better, not just building

The Boy Scout Rule is the whole reason this blog exists, and I’d quietly exempted the robot from it. “Leave the campsite cleaner than you found it” had become “leave it building”, which is not the same thing and never was. If I’m going to put an agent in the loop precisely so it tidies up after the generator, then “tidy” has to mean what it would mean for a person on my team. Build, test and lint. No walking past the bin because nobody told you to pick it up.