<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Services on PHP Boy Scout</title><link>https://blog-570662.gitlab.io/tags/services/</link><description>Recent content in Services on PHP Boy Scout</description><generator>Hugo -- gohugo.io</generator><language>en-gb</language><copyright>Matt Cockayne</copyright><lastBuildDate>Tue, 24 Mar 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://blog-570662.gitlab.io/tags/services/index.xml" rel="self" type="application/rss+xml"/><item><title>Lifecycle management for when your CLI grows up into a service</title><link>https://blog-570662.gitlab.io/lifecycle-management-for-long-running-go-services/</link><pubDate>Tue, 24 Mar 2026 00:00:00 +0000</pubDate><guid>https://blog-570662.gitlab.io/lifecycle-management-for-long-running-go-services/</guid><description>&lt;img src="https://blog-570662.gitlab.io/lifecycle-management-for-long-running-go-services/cover-lifecycle-management-for-long-running-go-services.png" alt="Featured image of post Lifecycle management for when your CLI grows up into a service" /&gt;&lt;p&gt;There&amp;rsquo;s a moment in the life of a lot of CLI tools where they stop being a CLI tool. Nobody quite decides it. It just happens. Someone needs the thing to also expose a little HTTP endpoint, or poll a queue, or run a scheduler, so it grows a &lt;code&gt;serve&lt;/code&gt; command&amp;hellip; and the honest command-line utility you wrote is suddenly a long-running service wearing a CLI as a hat.&lt;/p&gt;
&lt;p&gt;And a service needs a whole pile of production plumbing that a one-shot command never did.&lt;/p&gt;
&lt;h2 id="the-command-that-stops-being-a-command"&gt;The command that stops being a command
&lt;/h2&gt;&lt;p&gt;go-tool-base is CLI-first. It is not CLI-&lt;em&gt;only&lt;/em&gt;, and the reason is a pattern I&amp;rsquo;ve watched play out more times than I can count.&lt;/p&gt;
&lt;p&gt;A tool starts its life as an honest command-line utility. It runs, it does its thing, it exits. Then someone needs it to expose a small HTTP endpoint. Or poll a queue. Or run a scheduler. So it grows a &lt;code&gt;serve&lt;/code&gt; command, or a &lt;code&gt;run&lt;/code&gt; command, and the moment it does, the thing that was a CLI tool is now a long-running service that happens to have a CLI bolted on the front.&lt;/p&gt;
&lt;p&gt;And a long-running service needs a whole category of plumbing a one-shot command never did. It has to start things up in a sensible order. It has to shut them down &lt;em&gt;gracefully&lt;/em&gt; when someone sends a &lt;code&gt;SIGTERM&lt;/code&gt;, finishing in-flight work rather than dropping it on the floor. It has to tell an orchestrator whether it&amp;rsquo;s alive, and whether it&amp;rsquo;s ready. It has to do something sensible when one of its internal services quietly falls over at 3am.&lt;/p&gt;
&lt;p&gt;Hand-rolled, that&amp;rsquo;s a few hundred lines of goroutine choreography, channel-wrangling and signal handling that every such tool reinvents, slightly differently and slightly wrong each time. It&amp;rsquo;s the first-afternoon problem all over again, just turning up later in the project&amp;rsquo;s life. So go-tool-base ships it: &lt;code&gt;pkg/controls&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id="a-controller-and-the-things-it-controls"&gt;A controller and the things it controls
&lt;/h2&gt;&lt;p&gt;The model is small. A &lt;code&gt;Controller&lt;/code&gt; manages any number of services, each of which satisfies a &lt;code&gt;Controllable&lt;/code&gt; interface, which at heart is just a &lt;code&gt;StartFunc&lt;/code&gt; and a &lt;code&gt;StopFunc&lt;/code&gt;. An HTTP server, a background worker, a scheduler, anything with a &amp;ldquo;begin&amp;rdquo; and an &amp;ldquo;end&amp;rdquo;.&lt;/p&gt;
&lt;p&gt;You register your services with the controller and it owns their collective lifecycle. They share a common set of channels (errors, OS signals, health, control messages) so the whole set can react together. A &lt;code&gt;SIGTERM&lt;/code&gt; doesn&amp;rsquo;t get caught by one service off in a corner; it reaches the controller, and the controller takes everything down in order, each &lt;code&gt;StopFunc&lt;/code&gt; handed a context with a deadline so that one sulking service can&amp;rsquo;t wedge the whole shutdown forever.&lt;/p&gt;
&lt;p&gt;That ordering and timeout handling is the bit nobody enjoys writing and everybody needs. Centralising it means a tool that adds a second service later inherits correct coordinated shutdown for free, rather than discovering on its first production &lt;code&gt;SIGTERM&lt;/code&gt; that it only half shuts down.&lt;/p&gt;
&lt;h2 id="probes-because-something-is-usually-watching"&gt;Probes, because something is usually watching
&lt;/h2&gt;&lt;p&gt;If the service ends up in Kubernetes (and a lot of them do) the orchestrator wants to ask two different questions, and they really are different questions.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Liveness:&lt;/em&gt; are you alive, or are you wedged and in need of a kill? &lt;em&gt;Readiness:&lt;/em&gt; are you alive &lt;em&gt;and&lt;/em&gt; able to take traffic right now? A service can quite easily be live but not ready&amp;hellip; still warming a cache, still waiting on a dependency. Conflate the two and you get yourself killed during a slow startup, or sent traffic before you can actually serve it.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;controls&lt;/code&gt; keeps them separate. You attach a &lt;code&gt;WithLiveness&lt;/code&gt; probe and a &lt;code&gt;WithReadiness&lt;/code&gt; probe to a service, each just a function returning a health report, and the controller exposes them. The tool answers Kubernetes honestly, in Kubernetes&amp;rsquo; own terms, without you hand-wiring two more HTTP handlers.&lt;/p&gt;
&lt;h2 id="self-healing-but-only-if-you-ask"&gt;Self-healing, but only if you ask
&lt;/h2&gt;&lt;p&gt;The last piece is what happens when a service fails. A worker&amp;rsquo;s &lt;code&gt;StartFunc&lt;/code&gt; returns an error. Health checks start failing. In a hand-rolled setup this is where you either crash the whole process or write yourself a bespoke restart loop.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;controls&lt;/code&gt; has a supervisor that can restart a failed service for you, and the important word in that sentence is &lt;em&gt;can&lt;/em&gt;. It&amp;rsquo;s off by default. A service is only supervised if you hand it a &lt;code&gt;RestartPolicy&lt;/code&gt; at registration:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-go" data-lang="go"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nx"&gt;controls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WithRestartPolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;controls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;RestartPolicy&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;MaxRestarts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;InitialBackoff&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;MaxBackoff&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;HealthFailureThreshold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;With a policy in place, the controller restarts the service if its &lt;code&gt;StartFunc&lt;/code&gt; errors out, or if it racks up more consecutive health-check failures than the threshold allows. Restarts back off exponentially, from &lt;code&gt;InitialBackoff&lt;/code&gt; up to a &lt;code&gt;MaxBackoff&lt;/code&gt; ceiling, so a service that&amp;rsquo;s failing because its database is down doesn&amp;rsquo;t sit there hammering that database flat with a tight restart loop. &lt;code&gt;MaxRestarts&lt;/code&gt; caps the attempts, because a service that&amp;rsquo;s failed five times in a row is not going to be rescued by a sixth go, and at that point honest failure beats a thrashing pretence of health.&lt;/p&gt;
&lt;p&gt;Opt-in matters here. Automatic restarts are exactly right for a resilient daemon and exactly &lt;em&gt;wrong&lt;/em&gt; for a tool where a failure should stop the line and get a human&amp;rsquo;s attention. The framework doesn&amp;rsquo;t make that call for you. It gives you the supervisor and lets you point it at the services that genuinely want it.&lt;/p&gt;
&lt;h2 id="the-bottom-line"&gt;The bottom line
&lt;/h2&gt;&lt;p&gt;A surprising number of CLI tools become long-running services the day they grow a &lt;code&gt;serve&lt;/code&gt; command, and the day they do, they need coordinated startup, graceful ordered shutdown, real liveness and readiness probes, and a considered answer to a service falling over. That&amp;rsquo;s a few hundred lines of fiddly, easy-to-get-wrong plumbing.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;pkg/controls&lt;/code&gt; provides it: a &lt;code&gt;Controller&lt;/code&gt; over &lt;code&gt;Controllable&lt;/code&gt; services with shared channels and deadline-bounded graceful shutdown, separate Kubernetes-style liveness and readiness probes, and an opt-in supervisor that restarts failed services with exponential backoff and a restart ceiling. Your tool can start as a command and grow into a daemon without that growth turning into a rewrite.&lt;/p&gt;
&lt;p&gt;CLI-first, but not stuck there.&lt;/p&gt;</description></item></channel></rss>