
The AI provider that isn't an API

go-tool-base’s chat package puts five AI providers behind one interface. Four of them are exactly what you’d guess: HTTP calls to OpenAI, Claude, Gemini, and anything OpenAI-compatible. The fifth one isn’t an API at all. It shells out to a binary.

That sounds like a slightly mad thing to want, right up until you’ve worked somewhere the network says no.

The fifth provider shells out

The chat package speaks to five providers through one ChatClient interface. Four of them are what you’d expect: HTTP requests to OpenAI, to Claude, to Gemini, to any OpenAI-compatible endpoint. The tool author picks one in config, and the rest of the code never knows the difference.

The fifth, ProviderClaudeLocal, is different in kind. It doesn’t make an HTTP request at all. It shells out. It runs the claude CLI binary as a child process, passes the prompt in, and reads the answer back from the binary’s output.
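To make "shells out" concrete, here is a minimal sketch of the idea using Go's standard os/exec package. It is not the provider's actual code: localRunner and its fields are names invented for illustration, and it assumes the binary accepts the prompt on stdin and prints the answer on stdout.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// localRunner is a hypothetical helper for this sketch; it is not a type
// from go-tool-base.
type localRunner struct {
	Binary  string        // e.g. "claude"
	Args    []string      // flags for non-interactive use (assumed; check the CLI's --help)
	Timeout time.Duration // zero means no deadline
}

// Ask starts the binary as a child process, writes the prompt to its stdin,
// waits for it to exit, and returns what it printed on stdout.
func (r localRunner) Ask(prompt string) (string, error) {
	ctx := context.Background()
	if r.Timeout > 0 {
		var cancel context.CancelFunc
		ctx, cancel = context.WithTimeout(ctx, r.Timeout)
		defer cancel()
	}

	cmd := exec.CommandContext(ctx, r.Binary, r.Args...)
	cmd.Stdin = strings.NewReader(prompt)

	var stdout bytes.Buffer
	cmd.Stdout = &stdout

	if err := cmd.Run(); err != nil {
		return "", fmt.Errorf("running %q: %w", r.Binary, err)
	}
	return strings.TrimSpace(stdout.String()), nil
}
```

A call might look like `localRunner{Binary: "claude", Args: []string{"-p"}, Timeout: time.Minute}.Ask("Explain this error")`, assuming your installed CLI has a print-and-exit flag such as `-p`. There is no socket, no API key, and no HTTP client anywhere in it.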

That sounds like an odd thing to want until you’ve been stuck in the environment it was built for.

Why you’d want that

Picture a corporate network with its egress locked right down. Outbound HTTPS to api.anthropic.com is blocked by policy. A tool built on go-tool-base that uses AI would simply fall over there. It tries to reach the API, there’s no route, and that’s the end of the feature.

But the developer at that machine has the claude CLI installed, and has run claude login. That binary is permitted. It’s an approved, managed tool, and it has its own sanctioned path out. The direct API call is blocked; the claude command is not.

ProviderClaudeLocal is what bridges those two facts. If your tool’s AI calls go through that already-blessed binary instead of straight at the API, they work in an environment where the direct call cannot. That’s the whole reason the provider exists. It isn’t faster (a direct API call has lower latency) and it isn’t more capable. It’s for the place where the API call simply isn’t an option, and “isn’t an option” is a surprisingly common place to find yourself inside a large organisation.

What it costs, honestly

It’s worth being straight about the trade, because ProviderClaudeLocal is the reduced-capability provider.

It doesn’t do tool calling. It doesn’t do parallel tool calls. It doesn’t stream. Those need a live, structured connection to the model’s API, and a subprocess that runs once and prints an answer is not that. What it does support is plain chat and structured output, the latter through the binary’s own --json-schema flag.

So the honest positioning, and the package’s documentation says exactly this, is: prefer the API providers when you can reach them, because they’re lower latency and feature-complete. Reach for ProviderClaudeLocal when API access is restricted. You accept the narrower capability set as the price of working at all. For a tool whose AI feature is “answer a question” or “return a structured analysis”, that price is often nothing you’d even notice. For one built on an agentic tool-calling loop, it’s a real limitation, and you’d know to expect it.
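That preference order can be written down as a small decision function. This is a sketch of the reasoning only, under loud assumptions: the capabilities struct, pickProvider, and the "api" and "claude-local" labels are all invented here, and go-tool-base may handle provider selection quite differently, or leave it entirely to config.

```go
package main

import "errors"

// capabilities lists what a feature needs, or what a provider offers.
// Plain chat is the baseline, so it has no flag of its own.
type capabilities struct {
	ToolCalling      bool
	Streaming        bool
	StructuredOutput bool
}

var (
	// The API providers are described as feature-complete.
	apiCaps = capabilities{ToolCalling: true, Streaming: true, StructuredOutput: true}
	// The local CLI provider offers plain chat and structured output only.
	localCaps = capabilities{StructuredOutput: true}
)

var errNoSuitableProvider = errors.New("no reachable provider supports what this feature needs")

// covers reports whether have includes everything in need.
func covers(have, need capabilities) bool {
	return (!need.ToolCalling || have.ToolCalling) &&
		(!need.Streaming || have.Streaming) &&
		(!need.StructuredOutput || have.StructuredOutput)
}

// pickProvider prefers an API provider when it is reachable, and falls back
// to the local CLI provider only if that covers the feature's needs.
// It assumes the local binary is installed and logged in.
func pickProvider(apiReachable bool, need capabilities) (string, error) {
	if apiReachable && covers(apiCaps, need) {
		return "api", nil
	}
	if covers(localCaps, need) {
		return "claude-local", nil
	}
	return "", errNoSuitableProvider
}
```

The useful property is the last branch: a feature built on tool calling fails loudly on a locked-down network instead of quietly degrading.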

How it stays behind the same interface

Here’s the part that makes it pleasant rather than a special case to maintain. Despite being a subprocess and not an API, ProviderClaudeLocal is still a ChatClient. Your feature code calls Chat and Ask exactly the way it would for any other provider.

Everything that makes a subprocess provider awkward stays inside the provider. Spawning the binary, feeding it the prompt, parsing its output, capturing stderr and surfacing it when the binary exits non-zero, and threading multi-turn continuity through session identifiers passed back on the next call with --resume: all of that is the provider’s problem, and all of it sits behind the interface. The code in your tool that uses AI doesn’t know, and has no way to find out, that this particular provider is a child process rather than an HTTPS call.
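Two of those chores, surfacing stderr on a non-zero exit and carrying a session identifier into the next call, might look roughly like this. Treat it as a sketch and not the provider's source: the session type and helper names are hypothetical, and the JSON field names in reply are assumptions about the binary's output, not a documented contract. Only the --resume flag comes from the description above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"os/exec"
	"strings"
)

// reply is an assumed shape for the binary's JSON output. The field names
// are placeholders for this sketch.
type reply struct {
	Result    string `json:"result"`
	SessionID string `json:"session_id"`
}

// session is a hypothetical holder for per-conversation state.
type session struct {
	binary string
	args   []string // base flags, sent on every turn
	id     string   // session identifier from the previous turn, if any
}

// argsForTurn returns the base flags, plus --resume <id> once an earlier
// turn has handed back a session identifier.
func (s *session) argsForTurn() []string {
	args := append([]string(nil), s.args...) // copy, so s.args is never mutated
	if s.id != "" {
		args = append(args, "--resume", s.id)
	}
	return args
}

// parseReply decodes the binary's stdout into a reply.
func parseReply(stdout []byte) (reply, error) {
	var r reply
	if err := json.Unmarshal(stdout, &r); err != nil {
		return reply{}, fmt.Errorf("parsing binary output: %w", err)
	}
	return r, nil
}

// describeFailure folds captured stderr into the error when the binary ran
// but exited non-zero; other failures (a missing binary, say) are wrapped as-is.
func describeFailure(err error, stderr string) error {
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return fmt.Errorf("binary exited with code %d: %s",
			exitErr.ExitCode(), strings.TrimSpace(stderr))
	}
	return fmt.Errorf("could not run binary: %w", err)
}

// send runs one turn and remembers the session identifier for the next.
func (s *session) send(prompt string) (string, error) {
	cmd := exec.Command(s.binary, s.argsForTurn()...)
	cmd.Stdin = strings.NewReader(prompt)

	var stdout, stderr bytes.Buffer
	cmd.Stdout, cmd.Stderr = &stdout, &stderr

	if err := cmd.Run(); err != nil {
		return "", describeFailure(err, stderr.String())
	}
	r, err := parseReply(stdout.Bytes())
	if err != nil {
		return "", err
	}
	s.id = r.SessionID
	return r.Result, nil
}
```

None of this leaks upward. The caller sees a string and an error, which is exactly what it would see from an HTTP-backed provider.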

That’s a unified interface genuinely earning its place. It’s easy to put a uniform face on four things that already work the same way underneath. The real test of the abstraction is whether something that works in a completely different way, a subprocess instead of a socket, can still slot in without the caller changing a line. Here it can. You swap one config value, and a tool that talked to an API now talks through a binary, and nothing downstream so much as blinks.
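In code, the shape of that swap is roughly the following. This is an illustration with stubbed providers, not the package's real definitions: the ChatClient interface is trimmed to one method with an assumed signature, ProviderOpenAI and both string values are made up, and newClient and summarise are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
)

// ChatClient is a trimmed, hypothetical stand-in for the package's
// interface. Only the method name Ask comes from the article; the
// signature is assumed.
type ChatClient interface {
	Ask(ctx context.Context, prompt string) (string, error)
}

// Provider is the config value. ProviderClaudeLocal is named in the
// article; the other constant and both string values are invented here.
type Provider string

const (
	ProviderOpenAI      Provider = "openai"
	ProviderClaudeLocal Provider = "claude-local"
)

// httpClient stands in for any of the API-backed providers.
type httpClient struct{}

func (httpClient) Ask(ctx context.Context, prompt string) (string, error) {
	// Stub: a real implementation would send an HTTPS request here.
	return "[via HTTPS] " + prompt, nil
}

// localClient stands in for the subprocess-backed provider.
type localClient struct{}

func (localClient) Ask(ctx context.Context, prompt string) (string, error) {
	// Stub: a real implementation would run the CLI binary here.
	return "[via subprocess] " + prompt, nil
}

// newClient is the only place that knows which provider is which.
func newClient(p Provider) (ChatClient, error) {
	switch p {
	case ProviderOpenAI:
		return httpClient{}, nil
	case ProviderClaudeLocal:
		return localClient{}, nil
	default:
		return nil, fmt.Errorf("unknown provider %q", p)
	}
}

// summarise is feature code: it sees a ChatClient and nothing else.
func summarise(ctx context.Context, c ChatClient, text string) (string, error) {
	return c.Ask(ctx, "Summarise: "+text)
}
```

Changing the argument to newClient is the entire migration. summarise compiles and behaves the same either way, because it depends only on the interface.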

The bottom line

go-tool-base’s chat package puts five providers behind one ChatClient interface, and ProviderClaudeLocal is the one that isn’t an API. It runs the locally installed, pre-authenticated claude CLI as a subprocess.

It exists for the locked-down environment where outbound HTTPS to the AI API is blocked but the claude binary is allowed: there, AI features keep working where a direct call would fail. The trade is a narrower capability set (no tool calling, no streaming, plain chat and structured output only), so you prefer the API providers when you can reach them and fall back to this when you can’t. And because it’s still a ChatClient, all the subprocess machinery stays hidden, and your code uses it without knowing it’s there. That last part is the real test of an abstraction: a provider that works in an entirely different way still slots in unchanged.
