[{"content":"Some bugs are interesting because they\u0026rsquo;re subtle. These two were interesting because they were the exact opposite\u0026hellip; in each case the tool had a hard rule I simply didn\u0026rsquo;t know about, and its error message couldn\u0026rsquo;t be bothered to tell me what that rule was. Both came out of building the infrastructure toolchain, both cost me a good deal more time than they had any right to, and both are the sort of thing that looks blindingly obvious the moment you know it and utterly baffling until you do.\nSo here they are, written down, partly to save you the bother and partly so I don\u0026rsquo;t go and forget them myself.\nBug one: the rule-less job that skips your merge requests The cicd gate components, in their first cut, shipped with no rules: block. They were dead simple jobs: lint, scan, validate. No conditions, because they should just always run. Obviously.\nThey ran on branch pipelines. On merge requests, they didn\u0026rsquo;t run at all! The gates that were the entire point of the components were simply absent from the one place you\u0026rsquo;d most want to see them\u0026hellip; the merge request.\nThe cause is a GitLab CI rule that\u0026rsquo;s remarkably easy to go years without ever learning: a job with no rules: block runs only on branch and tag pipelines. It does not run on merge-request pipelines. So \u0026ldquo;no conditions\u0026rdquo; doesn\u0026rsquo;t mean \u0026ldquo;runs everywhere\u0026rdquo; at all. It means \u0026ldquo;runs everywhere except a merge request\u0026rdquo;, which is about the least intuitive default I can think of.\nThe fix is faintly absurd, and that\u0026rsquo;s exactly what makes it stick. You add an unconditional rule: rules: [{ when: on_success }]. The content of that rule does precisely nothing. It always matches. What actually matters is that the job now has a rules: block at all, because merely having one is what makes a job eligible for merge-request pipelines. 
A rule whose content is meaningless, added solely so the block exists. That\u0026rsquo;s the fix. I\u0026rsquo;ll admit I stared at it for a moment.\nBug two: the import block that only works at the root The second one came from terraform-aws-security-baseline. The account-hardening module needed to adopt a resource that already existed in the account, which is exactly what OpenTofu\u0026rsquo;s import {} block is for. So an import block went into the account-hardening module, right next to the resource it was adopting. The natural home for it, surely.\nOpenTofu disagreed, and rejected it outright. The rule: an import block is only allowed in the root module. It can\u0026rsquo;t live inside a child module. A module that wants one of its own resources imported can\u0026rsquo;t declare that import itself\u0026hellip; the import has to be declared up at the root, and the root caller does the adopting.\nThe fix was to take the import block out of the module and document caller-side adoption instead. The module describes the resource, and the root configuration that calls the module is where the import actually lives.\nThe shape they share Two unrelated bugs, in two completely different tools, and the same shape sitting underneath both of them.\nIn each case the tool has a hard structural rule. Where a block is allowed to live. What makes a job eligible for a particular kind of pipeline. And in each case the error told me the tool was unhappy without telling me which rule I\u0026rsquo;d broken, so the obvious next move (debugging my own logic) was the wrong move entirely. There was nothing wrong with the logic. The thing was simply in a place the tool doesn\u0026rsquo;t allow, or missing a block the tool quietly insists on.\nThe lasting lesson here isn\u0026rsquo;t the two specific rules, useful as they are to know. It\u0026rsquo;s the reflex. 
When something that should obviously work just doesn\u0026rsquo;t, and the error is unhelpful, stop debugging your logic and start suspecting a structural rule about where something is allowed to be, or whether a thing is eligible in the first place. GitLab CI and OpenTofu both have a handful of these, and you mostly learn them the hard way, by tripping over them. Knowing the shape of the category at least means the next one costs you an hour instead of a whole afternoon.\nWorth remembering Two bugs from building the toolchain, one shape. A GitLab CI job with no rules: block runs on branches and tags but silently not on merge requests, and the fix is an unconditional rules: block whose content does nothing and whose mere existence is the entire point. An OpenTofu import block gets rejected inside a child module, because imports are only legal at the root, so the caller adopts and the module just describes.\nNeither error named the rule it was enforcing, and that\u0026rsquo;s the category to watch for. When sound logic fails against an unhelpful error, suspect a structural rule about where a thing may live or whether it\u0026rsquo;s even eligible\u0026hellip; not a bug in what you actually wrote. It\u0026rsquo;ll save you an afternoon. It certainly cost me a couple.\n","date":"2026-05-20T00:00:00Z","image":"/two-bugs-that-taught-me-the-rules/cover-two-bugs-that-taught-me-the-rules.png","permalink":"/two-bugs-that-taught-me-the-rules/","title":"Two bugs that taught me the rules"},{"content":"The genuinely dangerous moment in infrastructure-as-code isn\u0026rsquo;t the apply. It\u0026rsquo;s the gap between the plan a human read and approved, and the change that actually runs a moment later. If those two are different computations (and by default they are) then nobody really reviewed the thing that touched your account. 
The infra repo closes that gap from both ends.\nThe gap between \u0026ldquo;reviewed\u0026rdquo; and \u0026ldquo;ran\u0026rdquo; Here\u0026rsquo;s the moment in infrastructure-as-code where things go wrong.\nSomeone opens a merge request. CI runs tofu plan and the output is there to review: these three resources change, this one is destroyed. A human reads it, decides it\u0026rsquo;s correct, approves, merges. Then apply runs.\nThe trap is in what apply actually applies. If apply does its own fresh tofu plan and then applies that, the change that runs is not necessarily the change that was reviewed. State can have moved. A provider can have drifted. Someone else can have applied something in between. The reviewed plan and the applied change are two separate computations done at two different moments, and every difference between those moments is a change nobody looked at.\ninfra closes that gap from both ends.\nPlan as an artifact The first end is making the reviewed plan and the applied plan the same object.\nThe tofu-plan component runs the plan and saves it. It writes tfplan.cache, OpenTofu\u0026rsquo;s binary plan file, as a CI artifact. It also writes tfplan.json, which GitLab renders as a plan widget right in the merge request: the add, change and destroy summary, there to review without leaving the MR.\nThe tofu-apply component then does not re-plan. It applies that saved tfplan.cache. And OpenTofu itself enforces the safety net: applying a stale plan file, one captured against a state that has since moved, is rejected by the tool. So what reaches the account is provably the plan that was reviewed, or it\u0026rsquo;s nothing at all. There\u0026rsquo;s no third option where something unreviewed slips through.\nApplying is a human decision The second end is when apply runs.\ninfra is trunk-based: it dropped the develop branch and works on main. 
But a naive trunk setup auto-applies every push to main, which means there\u0026rsquo;s no human gate at all, just whatever the last merge happened to contain.\nSo the gate is built explicitly. releaser-pleaser keeps a release merge request open against main. Ordinary merges to main run plans but apply nothing. The apply happens only when a person merges the release MR. Merging it cuts a release tag, and the tag pipeline is what runs tofu-apply, against the plan banked by the latest main pipeline.\nThe effect is that the act of applying to the account is the deliberate, visible act of merging the release request. Nothing reaches the account because a commit landed. It reaches the account because a person decided a release should go out and merged it. (Which, after the accidental v2.0.0 that kicked off the whole GitLab move, is a discipline I\u0026rsquo;d freshly relearned the value of.)\nThe guard on the gate There\u0026rsquo;s one more piece, because a gate is only as good as its precondition.\nA verify-main-plan job blocks the release MR from being mergeable unless the latest main pipeline is green. You can\u0026rsquo;t cut a release, and therefore can\u0026rsquo;t apply, on top of a main whose plan didn\u0026rsquo;t even succeed. The human gate has its own gate: the thing you\u0026rsquo;re about to merge has to be standing on a known-good plan before you\u0026rsquo;re allowed to merge it.\nThe bottom line The risk in infrastructure-as-code is the gap between the plan a human reviewed and the change that runs, because a re-plan at apply time is a different computation from the one that was approved.\ninfra closes it twice over. tofu-plan saves the plan as a tfplan.cache artifact and renders it as a merge-request widget; tofu-apply applies that exact artifact, and OpenTofu rejects it outright if the state has moved underneath it. 
And applying is gated on a human merging a releaser-pleaser release request, not on a push, with a verify-main-plan check making sure that request can only be merged on top of a green plan. What gets applied is what was reviewed, when a person decided it should be.\n","date":"2026-05-18T00:00:00Z","image":"/reviewed-then-applied/cover-reviewed-then-applied.png","permalink":"/reviewed-then-applied/","title":"Reviewed, then applied"},{"content":"Once an infrastructure repo has a few concerns in it (account hardening, the security baseline, the signing stack still to come) there\u0026rsquo;s a steady pressure to split them into separate stacks with separate state, and Terragrunt is right there to help you do it. The infra repo keeps everything in one OpenTofu graph instead. The reason comes down to who enforces your dependency ordering: the engine, or you.\nThe pressure to split The infra repo\u0026rsquo;s src/ has several concerns in it, and more coming, the signing stack among them. Once a repo reaches that point, there\u0026rsquo;s a steady pressure to split: one stack per concern, each with its own state file.\nIt\u0026rsquo;s an appealing pressure. Separate stacks feel modular. Each apply touches less, so the blast radius of any one run is smaller. And Terragrunt exists, popular and well-regarded, precisely to orchestrate a fleet of separate stacks. The path is well trodden.\ninfra didn\u0026rsquo;t take it. src/ is a single OpenTofu root stack: each concern is a module block, in its own main.\u0026lt;concern\u0026gt;.tf file, all sharing one state and one graph.\nWhat one graph gives you The thing a single graph gives you is engine-enforced truth about ordering and data.\nInside one OpenTofu graph, the tool builds the full dependency DAG itself. 
When the signing stack needs a value the security baseline produced, you reference it directly, module.baseline.something, and OpenTofu guarantees two things: the baseline is created before the thing that depends on it, and the value handed across is the current one from this same apply. Ordering and data-passing aren\u0026rsquo;t things you arranged. They\u0026rsquo;re facts the engine checks and enforces, every plan, every apply.\nWhat splitting costs Split src/ into per-concern stacks with separate state, and that guarantee is the thing you spend.\nNow one stack reads another\u0026rsquo;s outputs through terraform_remote_state. That\u0026rsquo;s a lookup of a snapshot: the other stack\u0026rsquo;s last applied state, whatever it was, whenever that was. It\u0026rsquo;s not a live edge in a graph. Ordering is no longer enforced by the engine either; it becomes something you arrange yourself, in CI stage sequencing or in Terragrunt\u0026rsquo;s own dependency blocks.\nThat\u0026rsquo;s the trade, stated plainly. You give up a strong, engine-checked guarantee, and you buy back a weaker, hand-arranged imitation of it. Terragrunt is a good tool for managing that weaker world tidily. But the question worth asking first is whether you should be in the weaker world at all.\nWhen splitting is genuinely right This isn\u0026rsquo;t an argument that splitting is always wrong. Separate states genuinely earn their place when concerns have different change cadences, different access boundaries, or different teams owning them: when you actively want an apply of one to be unable to touch another, and you want different people holding different state.\ninfra has none of those. It\u0026rsquo;s a single account, a single operator, one cohesive set of concerns. The only thing splitting would buy here is a smaller per-apply blast radius, and that\u0026rsquo;s better handled by reviewing the plan before it applies, which the next post is about, than by fragmenting the dependency graph. 
So src/ stays one graph, and Terragrunt was considered and deliberately not adopted.\nIf ordering between graphs is ever needed If infra ever does genuinely need more than one stack, the plan isn\u0026rsquo;t Terragrunt. It\u0026rsquo;s to keep each stack a single strong graph internally, and to sequence the stacks with CI stages. Keep the engine-enforced guarantee where it\u0026rsquo;s strongest, inside each graph, and reach for hand-arranged ordering only at the one seam where it\u0026rsquo;s unavoidable.\nBoiling it down A multi-concern infrastructure repo feels like it should be split into per-concern stacks, and Terragrunt is right there to manage the result. infra keeps src/ as one OpenTofu graph instead.\nInside one graph, OpenTofu enforces dependency ordering and passes current values across module boundaries as checked facts. Split into separate states and that becomes a terraform_remote_state snapshot lookup plus ordering you arrange by hand: a weaker version of what you gave up. Splitting is right when concerns have different cadences, boundaries or owners; for a single-account, single-operator repo none of that applies, so the strong guarantee is worth keeping, and Terragrunt is the tool for a problem infra chose not to have.\n","date":"2026-05-17T00:00:00Z","image":"/one-graph-not-micro-stacks/cover-one-graph-not-micro-stacks.png","permalink":"/one-graph-not-micro-stacks/","title":"One graph, not micro-stacks"},{"content":"Every infrastructure repo runs the same CI: lint the OpenTofu, scan it, validate it, plan, apply. The first repo, you write that .gitlab-ci.yml by hand. The second, you copy it. By the third, you\u0026rsquo;ve got three copies of the same pipeline quietly drifting apart, which is the exact problem you\u0026rsquo;d never tolerate in application code. 
The cicd repo is the fix, and it\u0026rsquo;s just the library-first instinct pointed at the pipeline.\nThe .gitlab-ci.yml you keep copying The infrastructure repos in this series all run the same CI gate jobs: format and validate the OpenTofu, lint it, scan it for security issues and secrets, and on the deploy side, plan and apply.\nThe first repo, you write that .gitlab-ci.yml by hand. The second repo needs the same jobs, so you copy it. The third repo, you copy it again. Now there are three copies of the same pipeline, and they do what copies always do. They drift. A fix you make in one repo\u0026rsquo;s CI doesn\u0026rsquo;t reach the other two. A tightened scan rule lands in the repo you were working in and nowhere else. It\u0026rsquo;s the copy-paste problem, exactly as it shows up in application code, just written in YAML and therefore that bit easier to pretend isn\u0026rsquo;t code.\nGitLab has a feature for exactly this GitLab CI/CD Components are the answer to that problem. A component is a reusable, versioned piece of pipeline that you publish, and other projects pull in with an include: pinned to a version:\ninclude:\n  - component: gitlab.com/phpboyscout/cicd/tofu-lint@v0.5.0\nThat\u0026rsquo;s a library import, for pipeline. The component has a defined interface, a version, and a home in GitLab\u0026rsquo;s CI/CD Catalog. A consuming repo includes it instead of carrying its own copy, and when the component improves, the consumer moves a version pin rather than re-copying YAML.\nWhy a monorepo of components The cicd repo holds all of the components together: tofu-lint, tofu-security, tofu-validate, tofu-plan, tofu-apply, and more. One project, not one project per component.\nThat\u0026rsquo;s a deliberate call, and the reason is how GitLab versions things. A version is a tag, and a tag belongs to a project. A component\u0026rsquo;s version is its project\u0026rsquo;s tag. 
So a monorepo of components, versioned together as one tag stream, is the natural unit: a consumer pins @v0.5.0 and gets a known-good set of components that were tested together, rather than juggling a separate version for each one.\nAuthoring discipline A component is a file under templates/, and it opens with a spec: inputs: block: the typed inputs, their defaults, the component\u0026rsquo;s public interface.\nThe discipline that keeps the library usable is that a component must be consumer-agnostic. It never hardcodes a token, and it never names a particular consumer\u0026rsquo;s variable. Inputs have sensible defaults, and a consuming repo overrides them. A component that reaches out and assumes something about the repo including it is a component that works in one repo and surprises the next. An authoring guide in the repo keeps that consistent across everyone who adds a component.\nThe self-test you cannot fully write The cicd repo tests its own components with a self-test pipeline. It\u0026rsquo;s worth knowing where that self-test stops.\nWhen a repo tests its own components by running them in child pipelines, GitLab masks $CI_PIPELINE_SOURCE as parent_pipeline. A component\u0026rsquo;s rules:, which often branch on the pipeline source to behave differently for a merge request than for a branch or a tag, therefore can\u0026rsquo;t be exercised honestly by the self-test: the source they\u0026rsquo;d branch on has been flattened. The self-test covers what it can, and the component rules: are, in the end, validated by real consumers using them for real. That\u0026rsquo;s a genuine limit, and naming it is better than pretending the self-test proves more than it does. (It\u0026rsquo;s also, not coincidentally, the exact rules: quirk that bit me in one of the two bugs I closed the series with.)\nThe same instinct, again This blog keeps circling the same instinct. 
go-tool-base exists because the same CLI scaffolding kept getting rewritten, so it was extracted into a library. cicd is that instinct pointed at the pipeline: the same gate jobs kept getting copied between repos, so they were extracted into a versioned, included library.\nStop copy-pasting. Publish, version, include. It\u0026rsquo;s true for CLI code, and it turns out to be just as true for the YAML that builds and ships it.\nThe gist Every infrastructure repo needs the same CI, and copying the .gitlab-ci.yml between them produces copies that drift apart. GitLab CI/CD Components fix it: reusable, versioned pipeline that a repo include:s and pins, instead of carrying its own copy.\ncicd is a monorepo of those components, versioned together as one tag stream, because GitLab tags a project and a component\u0026rsquo;s version is its project\u0026rsquo;s tag. Components are authored consumer-agnostic, with typed spec: inputs: and no hardcoded assumptions, and their rules: are validated by real use because the self-test can\u0026rsquo;t see the pipeline source. It\u0026rsquo;s the library-first instinct, applied to CI: publish it once, include it everywhere, fix it in one place.\n","date":"2026-05-16T00:00:00Z","image":"/ci-you-include-not-copy/cover-ci-you-include-not-copy.png","permalink":"/ci-you-include-not-copy/","title":"CI you include, not copy"},{"content":"Every CI gate job across the infrastructure repos reaches for the same pile of tools: OpenTofu, tflint, trivy, checkov, gitleaks, terraform-docs, the AWS CLI. Installing that pile per job is both slow and quietly dangerous, because nothing pins it consistently. 
infra-tools is the obvious fix (one image, one source of truth for versions), but two of its build decisions are less obvious and worth a look: it publishes with crane instead of a second build, and it deliberately lets its own vulnerability scan fail.\nThe same pile of tools, in every repo Every infrastructure repo in this series runs the same CI gate jobs: format and validate the OpenTofu, lint it, scan it for security problems and secrets, check the docs. Those jobs need a specific set of tools, and it\u0026rsquo;s the same set in every repo.\nInstall them per job and you pay twice. You pay in time, because every pipeline downloads and installs the whole set again. And you pay in drift, because unless every repo pins every tool identically, the repos slowly diverge on which version of trivy or tflint they actually run, and a check that passes in one repo fails in another for no reason anyone can see.\nOne image, one source of truth infra-tools is the answer: a single Debian-based container image with the whole toolchain baked in. Every CI job in every repo uses it with one image: line.\nThe real value isn\u0026rsquo;t the convenience. It\u0026rsquo;s that the image is the one place tool versions are pinned. The Go-based tools are pinned in a mise.toml. checkov, which has no mise plugin, is pinned in a requirements file installed with pipx. The AWS CLI is pinned by a build argument. Three mechanisms, because the tools come from three kinds of source, but one image, and every pin wired to Renovate so a version bump arrives as a reviewable pull request. There\u0026rsquo;s exactly one answer to \u0026ldquo;what version of trivy does the toolchain use\u0026rdquo;, and it lives here.\nPublishing with crane, not a second build A build-pipeline detail that took a real bug to discover.\nThe pipeline builds the image with kaniko, which builds images without a privileged Docker daemon, something that matters a great deal on shared CI runners. 
Then it scans the image, then it publishes it.\nThe obvious way to write the publish stage is \u0026ldquo;build the image and push it\u0026rdquo;. But kaniko has no mode for \u0026ldquo;just push this tarball I already built\u0026rdquo;. A second kaniko invocation re-executes the entire Dockerfile from the top, including a second mise install, which makes a fresh round of calls to GitHub\u0026rsquo;s API to fetch tools. GitHub\u0026rsquo;s anonymous API limit is low and shared by IP, so on a CI runner that second install reliably trips a 403 rate-limit. (Yes, another 403. They do get everywhere.)\nSo the publish stage doesn\u0026rsquo;t rebuild. It uses crane to push the exact image tarball the build stage already produced. The image is built once. And because the published bytes are the same bytes the scan stage scanned, there\u0026rsquo;s no gap between \u0026ldquo;the image we checked\u0026rdquo; and \u0026ldquo;the image we shipped\u0026rdquo;.\nSoft-failing the scanner on purpose The decision that looks wrong until you see the reasoning: the pipeline scans the image with trivy, and trivy is allowed to fail without failing the pipeline.\nA vulnerability scanner that doesn\u0026rsquo;t gate the build sounds like a scanner switched off. It isn\u0026rsquo;t. It\u0026rsquo;s a scanner pointed at something it can\u0026rsquo;t helpfully gate.\nThe tools in the image are prebuilt Go binaries. trivy inspects them, reads the version of the Go runtime each was compiled with, and reports every known CVE in that Go runtime. Those findings are real, but they aren\u0026rsquo;t mine to fix. The only fix is the upstream tool rebuilding itself against a patched Go. With seven such tools in the image, at any given moment one of them is usually a little behind on its Go version.\nA hard gate would mean the image becomes unpublishable whenever any single upstream lags, over a CVE in code I don\u0026rsquo;t own and can\u0026rsquo;t patch. 
That\u0026rsquo;s not a security control; it\u0026rsquo;s a way to be unable to ship. So the scan is allow_failure. The findings stay fully visible, and the residual count is genuinely useful as a metric for how far behind upstream the toolchain has drifted. It just doesn\u0026rsquo;t block shipping an image whose only \u0026ldquo;vulnerabilities\u0026rdquo; are other people\u0026rsquo;s build timelines.\nWhat it comes down to The infrastructure repos all run the same CI gate jobs, needing the same tools, so infra-tools bakes the whole toolchain into one image and pins every version in one place, wired to Renovate.\nTwo build choices are worth copying. The publish stage uses crane to push the already-built, already-scanned tarball, because a second kaniko build would re-run mise install and hit GitHub\u0026rsquo;s anonymous rate limit, and because pushing the scanned bytes means shipping exactly what was checked. And the trivy scan is deliberately allow_failure, because it reports Go-runtime CVEs in prebuilt upstream binaries that no change to this repo can fix, so a hard gate would only make the image unshippable over someone else\u0026rsquo;s lag.\n","date":"2026-05-15T00:00:00Z","image":"/one-image-for-the-whole-toolchain/cover-one-image-for-the-whole-toolchain.png","permalink":"/one-image-for-the-whole-toolchain/","title":"One image for the whole toolchain"},{"content":"The OIDC post explained the handshake that lets a GitLab pipeline deploy to AWS with no stored key. This is the story of the first time I got it wrong, and spent an afternoon fixing the wrong thing. The error was a flat 403 from AWS, and the maddening part is that no amount of editing the IAM policy was ever going to fix it.\nA 403 on the first real run The OIDC post covered the handshake: GitLab CI mints a signed token, AWS exchanges it for short-lived credentials against a role whose trust policy names the pipeline. 
During the GitLab migration I wired exactly that up for the infra repo, including a trust policy condition meant to let merge-request pipelines run a plan.\nThe first merge request that should have triggered tofu-plan didn\u0026rsquo;t run it. The job failed, and the error from AWS was a flat AccessDenied. A 403.\nThe instinct, and why it wastes an afternoon The instinct on an IAM 403 is immediate and almost always right: the policy\u0026rsquo;s wrong, so go and edit the policy. Tighten the condition. Loosen the condition. Check the wildcard. Re-read the sub pattern character by character.\nAll of that was wasted, and it was wasted for a reason that took me far too long to see. The trust policy wasn\u0026rsquo;t matching the wrong value. It was matching a value that does not exist. No amount of editing a condition makes it match a thing that\u0026rsquo;s never present.\nWhat is actually in the token GitLab\u0026rsquo;s OIDC token has a sub claim that encodes the pipeline\u0026rsquo;s context, and part of that encoding is a ref_type. I\u0026rsquo;d assumed ref_type could be branch, tag, or mr, because a pipeline can certainly be a branch pipeline, a tag pipeline, or a merge-request pipeline. So the trust policy, for the plan job, matched a sub containing ref_type:mr.\nThat assumption was wrong. GitLab\u0026rsquo;s ref_type is branch or tag. That\u0026rsquo;s the entire set. There is no mr.\nA merge-request pipeline doesn\u0026rsquo;t run against a merge-request ref. It runs against the source branch. So its token\u0026rsquo;s sub carries ref_type:branch, like any other branch pipeline. The trust policy condition asked for ref_type:mr, GitLab never puts mr in a token, the condition was therefore never true, and every merge-request pipeline got a 403. 
Forever, until the policy stopped asking for a claim that isn\u0026rsquo;t real.\nThe fix, and the lesson worth more than the fix The fix is small once it\u0026rsquo;s visible: match ref_type:branch and narrow it down by branch name or project path instead. An afternoon of policy edits, and the actual change is one word.\nThe lesson is the part worth keeping. When an OIDC trust fails, the useful question is never \u0026ldquo;is my policy clever enough\u0026rdquo;. It\u0026rsquo;s \u0026ldquo;what\u0026rsquo;s actually in the token\u0026rdquo;. An OIDC trust policy can only ever match the claims the identity provider genuinely asserts, and the gap between what a provider asserts and what you assumed it asserts is precisely where this class of bug lives.\nSo the move, when an OIDC handshake 403s, is to get hold of a real token and decode it. Look at the actual sub, the actual claims, the actual values. Match what\u0026rsquo;s there. A 403 that survives every sensible edit to the policy is usually not a policy that\u0026rsquo;s too loose or too strict. It\u0026rsquo;s a policy matching a claim that was never going to be in the token.\nThe habit it left behind I wired an OIDC trust policy to let merge-request pipelines plan, by matching a sub claim with ref_type:mr. The first real merge request got a 403, and no edit to the policy fixed it, because GitLab\u0026rsquo;s ref_type is only ever branch or tag. A merge-request pipeline runs on a branch ref, so the mr value the policy demanded was never in any token.\nThe fix was one word. The habit it left behind is the valuable bit: when an OIDC trust fails, stop editing the policy and go and read a real token. A trust policy can only match what the provider actually asserts, and \u0026ldquo;what I assumed it asserts\u0026rdquo; is where the 403 was hiding the whole time. 
(If this shape of bug feels familiar by the end of the series, that\u0026rsquo;s not an accident: I come back to it with two more from exactly the same family.)\n","date":"2026-05-14T00:00:00Z","image":"/a-403-you-cant-fix-in-iam/cover-a-403-you-cant-fix-in-iam.png","permalink":"/a-403-you-cant-fix-in-iam/","title":"A 403 you can't fix in IAM"},{"content":"go-tool-base\u0026rsquo;s VCS support has two halves that get confused for one. One half talks to forge APIs (GitHub, GitLab) for releases and pull requests. The other talks to the .git directory on disk: clone, history, diff, status. This post is mostly about the second half, and specifically about a question that turns out to have three answers in Rust, only one of which I\u0026rsquo;d recommend: how do you actually do Git from inside a program?\nA VCS subsystem with two halves go-tool-base has a VCS subsystem, and it does two distinct jobs.\nThe first is forge APIs. GitHub and GitLab, Enterprise and nested group paths included. It authenticates, lists releases, fetches release assets, manages pull requests. The self-update machinery sits on this half, and it\u0026rsquo;s what a tool uses to ask \u0026ldquo;what\u0026rsquo;s the latest release?\u0026rdquo;\nThe second is local Git. go-tool-base also carries a RepoLike object, an abstraction over an actual Git repository on disk: clone it, read its commit history, diff two trees, check its status. This half doesn\u0026rsquo;t talk to a hosting service at all. It talks to the .git directory.\nIt would be easy to assume the second half grew out of the first. It didn\u0026rsquo;t, and where it actually came from is the part worth telling.\nA capability ahead of its consumer The RepoLike object wasn\u0026rsquo;t built for go-tool-base. It came from another project, where it had already proved itself, and it was pulled into go-tool-base on purpose, with a specific future consumer in mind: the code generator.\nThe plan is for the generator to use Git directly. 
When it scaffolds a new tool, that tool should start life as a Git repository, with a git init and an initial commit. When you later regenerate, the generator should diff the regenerated template output against your working tree to detect drift, the same idea as respecting your edits. Both of those are local Git operations, not API calls, so the generator needs a repository abstraction to call into.\nThat wiring isn\u0026rsquo;t finished yet. The generator doesn\u0026rsquo;t drive RepoLike today. But the capability is in place, deliberately, ahead of the consumer that will use it, because the alternative is bolting Git support on later under deadline pressure, and that\u0026rsquo;s how you end up with the wrong abstraction.\nSo when rust-tool-base was built, a repository abstraction was never in question. The Rust port carries the same capability for the same reason: a Repo type with init, open, clone, walk, diff, blame, status, commit, fetch and checkout, present and ready for the generator to wire into. The open question was never whether to have it. It was how to do Git from inside a Rust program, and there are three answers, only one of which is any good.\nThree ways to do Git, and the one worth picking Shell out to git. Run the git binary as a subprocess and parse its output. It works until it doesn\u0026rsquo;t. The binary might not be installed. It might be a different version with different output. Its output is formatted for humans and changes between releases, so parsing it is a standing liability. You\u0026rsquo;ve made an undeclared dependency on a program you don\u0026rsquo;t ship.\nLink libgit2. libgit2 is the C library that reimplements Git as something you can call from code, and git2 is the Rust binding to it. It\u0026rsquo;s solid and widely used. But it\u0026rsquo;s a C dependency, which means a C toolchain in the build, and it\u0026rsquo;s consistently the single biggest source of cross-compilation pain in the Rust Git ecosystem. 
The musl builds, the Windows builds, the static linking: libgit2 is where they tend to break.\nUse gix. gix is a reimplementation of Git in pure Rust. No C library, no subprocess. It\u0026rsquo;s just Rust code, and it compiles and cross-compiles like any other crate, because that\u0026rsquo;s all it is. It\u0026rsquo;s also generally faster, and being pure Rust it fits the no-unsafe-in-first-party-code story far more comfortably than dragging a C library along.\nrtb-vcs is gix-first. The Repo type is built on it. There\u0026rsquo;s no git binary dependency, and there\u0026rsquo;s no libgit2 in a default build.\ngix is still maturing, and a few write paths, push in particular, aren\u0026rsquo;t ready in it yet. For those, git2 stays available as an opt-in fallback behind a Cargo feature. Off by default, so the libgit2 C dependency and its cross-compile cost only land in builds that explicitly ask for push support. The common case, a tool that clones, reads history, diffs and commits, pays none of it. (Which is exactly the feature-flag story from a couple of weeks back, doing real work.)\nRepo is a foundation, not a façade One design decision is worth calling out, because it came straight from a go-tool-base lesson.\nIt would have been easy to build Repo as a narrow façade exposing exactly what the scaffolder and the release-notes feature need today, and nothing else. That was rejected on purpose. go-tool-base\u0026rsquo;s RepoLike is itself the cautionary tale: it arrived from another project, settled into a sensible abstraction, and is already lined up to carry a consumer, the generator, that wasn\u0026rsquo;t driving its design when it was first written. A repository abstraction gets used by code that doesn\u0026rsquo;t exist yet. 
Build one as a narrow façade around today\u0026rsquo;s needs and you\u0026rsquo;ve guaranteed a rewrite the first time a downstream tool wants something slightly different.\nSo rtb-vcs\u0026rsquo;s Repo is built as a foundation: a sensible, reasonably complete vocabulary of Git operations that a tool author can compose richer behaviour on, without re-importing gix directly and re-deriving the framework\u0026rsquo;s auth and concurrency conventions. The errors back this up. gix\u0026rsquo;s error types aren\u0026rsquo;t leaked through the public API; they\u0026rsquo;re wrapped in semantic RepoError variants, so the backend could be swapped, gix to git2, or to something else entirely, without breaking a single downstream caller.\nStepping back go-tool-base\u0026rsquo;s VCS support has two halves: forge-API calls for releases and pull requests, and a RepoLike object for local Git operations. The repo half arrived from another project and is wired in ahead of its intended consumer, the code generator, which will use it to initialise repositories for scaffolded tools and to diff regenerated output for drift.\nrust-tool-base carries the same capability on purpose. Its Repo type is built on gix, a pure-Rust Git implementation, so there\u0026rsquo;s no dependency on an installed git binary and no libgit2 C library in a default build, which keeps cross-compilation clean. git2 stays an opt-in fallback for the few write paths gix can\u0026rsquo;t do yet. And Repo is built as a foundation for downstream tools, with the backend wrapped behind its own error type so it can be replaced without breaking callers.\n","date":"2026-05-13T00:00:00Z","image":"/pure-rust-git-no-git-binary/cover-pure-rust-git-no-git-binary.png","permalink":"/pure-rust-git-no-git-binary/","title":"Pure-Rust Git, no git binary"},{"content":"Turning on GuardDuty and Security Hub gives you threat detection. It also gives you a firehose. 
And an alert system that dutifully forwards everything in that firehose isn\u0026rsquo;t monitoring, it\u0026rsquo;s a very efficient way of training your team to ignore alerts. So the alerts module\u0026rsquo;s real job isn\u0026rsquo;t detection at all. It\u0026rsquo;s deciding what\u0026rsquo;s actually worth interrupting a human for, and the interesting part is everything it deliberately throws away.\nDetection is the easy half Switching on threat detection in an AWS account is a few resources. GuardDuty, Security Hub with its standards, IAM Access Analyzer: the security baseline does exactly that. From then on, the account is generating findings.\nAnd it generates a lot of them. Plenty are low-severity, informational, or simply the normal texture of a cloud account. If you wire every finding to an email or a pager, you haven\u0026rsquo;t built monitoring. You\u0026rsquo;ve built noise. And noise has a specific failure mode: people stop reading it, and the one finding that genuinely mattered scrolls past unread alongside two hundred that didn\u0026rsquo;t.\nSo the valuable work isn\u0026rsquo;t detection. It\u0026rsquo;s routing: deciding what\u0026rsquo;s worth interrupting a human for, and letting the rest sit quietly in a console for whenever someone reviews it.\nForward the severe, leave the rest The alerts module routes findings with EventBridge rules into an SNS topic that emails out. The rules are deliberately picky. GuardDuty findings are forwarded only at severity 7 and above. Security Hub findings are forwarded only at HIGH and CRITICAL.\nEverything below those thresholds isn\u0026rsquo;t discarded. It\u0026rsquo;s still in GuardDuty and Security Hub, where someone doing a review will see it. It just doesn\u0026rsquo;t get to interrupt anyone\u0026rsquo;s day. 
The threshold is the line between \u0026ldquo;look at this now\u0026rdquo; and \u0026ldquo;look at this sometime\u0026rdquo;.\nThe duplicate you would otherwise send twice Here\u0026rsquo;s the subtle one, and it\u0026rsquo;s the kind of thing you only find by looking closely at where findings come from.\nSecurity Hub is an aggregator. It pulls findings in from other services, GuardDuty among them. So a single GuardDuty finding can show up in two places: in GuardDuty itself, and again in Security Hub as an aggregated copy.\nA rule on GuardDuty findings and a rule on Security Hub HIGH/CRITICAL findings would therefore both fire for the same underlying GuardDuty finding. One event, two emails. Do that across an account and a meaningful fraction of your alert volume is just the same findings counted twice, which is its own kind of noise.\nSo the Security Hub rule explicitly excludes findings whose ProductName is GuardDuty, with an anything-but match. GuardDuty findings come through the GuardDuty rule. The Security Hub rule handles everything Security Hub adds that GuardDuty didn\u0026rsquo;t already report. One finding, one alert, regardless of how many services it passed through.\nTwo tripwires on the root account Findings are about threats the detectors recognise. The module adds two alarms about something simpler: the root account doing anything at all.\nOne CloudWatch alarm fires on a root console sign-in. The other fires on any root API call that isn\u0026rsquo;t a console login. In a well-run AWS account, the root user does almost nothing after initial setup: day-to-day work happens through roles. So root activity isn\u0026rsquo;t a \u0026ldquo;finding\u0026rdquo; to be assessed for severity. It\u0026rsquo;s a tripwire. 
Any of it, in an account that should be silent, is worth an immediate look, and the two alarms say so directly.\nWhy a quiet alert stream matters here This is monitoring for the account that\u0026rsquo;s going to hold the release-signing key, and that raises the stakes on getting the routing right.\nIf a key-bearing account ever does come under attack, the alert that says so has to be seen. An alert stream that\u0026rsquo;s mostly noise and duplicates is, functionally, no alerting at all, because the people who\u0026rsquo;d act on it have long since tuned it out. Routing the stream down to \u0026ldquo;severe, deduplicated, plus root tripwires\u0026rdquo; is what keeps it something a human will still read on the day it finally matters.\nThe short version GuardDuty and Security Hub make detection easy. The hard, valuable part is routing: forwarding what deserves to interrupt someone and leaving the rest in a console.\nThe alerts module forwards GuardDuty at severity 7-plus and Security Hub at HIGH/CRITICAL, and it drops the duplicate that aggregation creates by excluding GuardDuty-sourced findings from the Security Hub rule, so one finding is one alert. Two CloudWatch alarms act as tripwires on root-account activity, which should be near-zero. For the account that will hold the signing key, a quiet, trustworthy alert stream isn\u0026rsquo;t a nicety. It\u0026rsquo;s the difference between monitoring and theatre.\n","date":"2026-05-12T00:00:00Z","image":"/routing-security-findings-without-the-noise/cover-routing-security-findings-without-the-noise.png","permalink":"/routing-security-findings-without-the-noise/","title":"Routing security findings without the noise"},{"content":"A botched version bump made me stop and actually look at where go-tool-base lived, and I didn\u0026rsquo;t much like what I saw. 
GitHub had spent months quietly falling over, and when Mitchell Hashimoto (GitHub user #1299, no less) publicly walked Ghostty off the platform, it stopped feeling like just my problem. I\u0026rsquo;ve been a GitLab fan for years, so the move was less a leap and more an overdue nudge. This is the why, not the how.\nIt started with a wrong number Every migration has a trigger, and mine was embarrassingly small. A commit landed on main carrying a BREAKING CHANGE: footer it didn\u0026rsquo;t really deserve. Semantic-release did exactly what it\u0026rsquo;s told to do with that footer: it cut a major version. go-tool-base lurched from the v1 line straight to v2.0.0, and a chain of things that keyed off the version went sideways with it.\nIt was fixable. It wasn\u0026rsquo;t a disaster. But it was the kind of small, stupid breakage that makes you stop and actually look at your setup instead of just patching it and moving on. And when I looked, the version bump wasn\u0026rsquo;t the thing that bothered me. It was everything around it.\nThe platform had been quietly failing I\u0026rsquo;d been losing time to GitHub for months. Not dramatically. No single outage you\u0026rsquo;d write home about, just a steady drip of Actions queues that wouldn\u0026rsquo;t drain, pull requests that wouldn\u0026rsquo;t merge, the occasional morning where the thing simply wasn\u0026rsquo;t there. You absorb it. You re-run the job. You make a coffee and try again. You tell yourself it\u0026rsquo;s a blip.\nThe trouble with a steady drip is that you stop counting it. It becomes weather.\nThe canary left the mine Then, in late April, Mitchell Hashimoto (co-founder of HashiCorp, creator of Vagrant, Terraform and the Ghostty terminal) published Ghostty Is Leaving GitHub, and The Register picked it up a day later under the headline \u0026ldquo;GitHub \u0026rsquo;no longer a place for serious work\u0026rsquo;\u0026rdquo;.\nThis is not a man with a casual relationship to GitHub. 
He\u0026rsquo;s, by his own account, user #1299, joined February 2008. He called it \u0026ldquo;the place that has made me the most happy\u0026rdquo;. And he still wrote this:\nThis is no longer a place for serious work if it just blocks you out for hours per day, every day.\nThe detail that landed hardest for me wasn\u0026rsquo;t a quote, it was a habit. He\u0026rsquo;d kept a journal for a month, marking an \u0026ldquo;X\u0026rdquo; on every day a GitHub outage had cost him working time. Almost every day had an X. Reading that, I realised I\u0026rsquo;d been having the same month. I\u0026rsquo;d just never been disciplined enough to write it down. He\u0026rsquo;d turned my vague \u0026ldquo;it\u0026rsquo;s been flaky lately\u0026rdquo; into a row of crosses on a calendar.\nI want to ship software and it doesn\u0026rsquo;t want me to ship software.\nWhen the person who\u0026rsquo;s been on the platform for eighteen years and loves it says that out loud, it stops being your private grumble. It\u0026rsquo;s the canary, and the canary has stopped singing.\nWhy GitLab, and not just \u0026ldquo;somewhere else\u0026rdquo; Being annoyed at GitHub is a reason to leave. It is not, on its own, a reason to pick a destination. The destination has to be a positive choice.\nFor me GitLab was an easy one, because I\u0026rsquo;ve been a fan for years. Long enough, in fact, to have also been a reliable grumbler about their pricing tiers, which is how you know it\u0026rsquo;s a real relationship and not a honeymoon. What I\u0026rsquo;ve always rated is the model: GitLab treats source hosting, CI/CD, the package registry, releases and Pages as one integrated product, not a marketplace of bolted-on parts you assemble yourself.\nThat integration is the actual prize. On the old setup, \u0026ldquo;CI\u0026rdquo; meant a folder of separate GitHub Actions workflow files, each pinned, each its own little world. 
On GitLab it\u0026rsquo;s a single .gitlab-ci.yml pipeline with proper stages (lint, test, security, docs, release) and the release stage talks to the built-in package registry and Pages without me wiring up a single external credential. The CI job that builds the project can authenticate to the things the project needs because they\u0026rsquo;re the same platform.\nThere\u0026rsquo;s a second-order benefit too. A migration is a rare licence to fix things you\u0026rsquo;d never otherwise touch. Moving gave me the cover to reset go-tool-base\u0026rsquo;s versioning cleanly (back to a sensible v0.x line, the accidental v2.0.0 left behind as a cautionary tale) and to move the module path to its new home in one deliberate change rather than a thousand apologetic ones.\nWhat I\u0026rsquo;m not going to claim I\u0026rsquo;m not going to tell you GitHub is finished, or that GitLab never has a bad day, because it does, everyone does. This isn\u0026rsquo;t a teardown. GitHub gave go-tool-base a perfectly good home for its first year, and the archived mirror is still sitting there, read-only, pointing anyone who finds it at the new place.\nWhat changed is simpler than a grand verdict. The friction crossed a line, someone I respect said the quiet part loudly enough that I couldn\u0026rsquo;t keep filing it under \u0026ldquo;weather\u0026rdquo;, and the place I\u0026rsquo;d have moved to anyway was sitting right there with a better model. Sometimes the prudent move and the move you secretly wanted turn out to be the same move, and you just need a wrong version number to give you permission.\nBoiling it down go-tool-base moved from GitHub to GitLab in May 2026. The proximate cause was a self-inflicted version-bump mess; the real cause was months of GitHub unreliability that I\u0026rsquo;d stopped consciously noticing until Mitchell Hashimoto\u0026rsquo;s very public departure named it for me. 
GitLab was a positive pick, not just an escape hatch: its integrated CI/CD, registry, releases and Pages are one product rather than a kit, and that integration is genuinely worth having. The migration also bought a clean versioning restart as a bonus.\nIf you\u0026rsquo;ve been absorbing a steady drip of friction and telling yourself it\u0026rsquo;s normal: try the calendar trick. Mark the X\u0026rsquo;s for a month. The page will tell you something you already half-know.\n","date":"2026-05-11T00:00:00Z","image":"/why-we-left-github-for-gitlab/cover-why-we-left-github-for-gitlab.png","permalink":"/why-we-left-github-for-gitlab/","title":"Why go-tool-base left GitHub for GitLab"},{"content":"There are well-known community module libraries for AWS: Cloud Posse, the terraform-aws-modules collection, plenty more. Both terraform-aws-bootstrap and terraform-aws-security-baseline use almost none of them. Every sub-module is hand-rolled from raw AWS resources, and before you accuse me of not-invented-here syndrome (a perfectly fair first guess), hear me out, because the same evaluation kept landing the same way for a real reason.\nThe promise of a wrapper module The community module ecosystem makes an appealing offer. Don\u0026rsquo;t write raw aws_s3_bucket and aws_s3_bucket_policy and aws_s3_bucket_public_access_block and the rest. Call a tested, popular module, pass it a handful of inputs, and get a correct, well-configured bucket. Less code in your repo, and the code you don\u0026rsquo;t write has been exercised by thousands of other users.\nFor a lot of infrastructure that\u0026rsquo;s a genuinely good deal, and I take it often. For the two infrastructure modules in this series, I took it almost never. Every sub-module is built from raw AWS resources. That wasn\u0026rsquo;t a reflex. It was the same evaluation, made over and over, landing the same way.\nWhat kept going wrong For each place a wrapper module could have fitted, I looked at the wrapper. 
And the recurring finding was one of two things. Either using the wrapper correctly, with all the overrides my posture needed, came to more configuration than the raw resources would have. Or the wrapper\u0026rsquo;s abstraction leaked the instant I needed something it hadn\u0026rsquo;t anticipated, and I was now writing code to fight it.\nThe CloudTrail bucket, concretely The clearest example is the bucket that holds CloudTrail logs.\nThere are popular modules that set up CloudTrail and bundle an S3 bucket for the logs. Convenient. But that bundled bucket isn\u0026rsquo;t the bucket I want. It doesn\u0026rsquo;t carry lifecycle { prevent_destroy = true }, and its bucket policy is weaker than the one the state bucket taught me to want: TLS-only, SSE-KMS-only, wrong-key-denied.\nSo to use the wrapper I had two options. Accept a weaker audit-log bucket than the rest of the account, which rather defeats the point of an audit log. Or fight the wrapper: disable its bucket, create my own, wire it back in. Fighting the wrapper is more work than simply writing the fifty-odd lines of raw aws_s3_bucket plus policy that give me exactly the posture I\u0026rsquo;d already designed once. The wrapper didn\u0026rsquo;t save code. It added a negotiation.\nA wrapper is a deal, and deals have terms This isn\u0026rsquo;t an argument that community modules are bad. It\u0026rsquo;s an argument about when the deal is good.\nA wrapper module is a good deal while its abstraction holds: while what it assumes you want matches what you want. The moment you need something it didn\u0026rsquo;t anticipate, the deal inverts. Now you\u0026rsquo;re working against the abstraction, and an abstraction you\u0026rsquo;re fighting costs more than no abstraction at all. 
(Regular readers will recognise that line from the LangChain argument; it\u0026rsquo;s the same principle in a very different language.)\nInfrastructure that holds signing keys is precisely the case where you need to control the specifics: every encryption setting, every lifecycle rule, every line of every bucket policy. That\u0026rsquo;s a domain where wrapper abstractions leak fast, because the whole job is the details the wrapper smoothed over.\nThe cost, paid on purpose Hand-rolling isn\u0026rsquo;t free. It\u0026rsquo;s more lines of HCL in the repo, up front, than a one-line module call.\nWhat those lines buy is worth the price for this kind of infrastructure. There\u0026rsquo;s no transitive module-version churn to track. There\u0026rsquo;s no abstraction between me and the resource when something behaves oddly. And every line is one I can read, and defend, in a security review, because I wrote it and it says exactly what it does. For a foundation that will hold the most sensitive key in the system, \u0026ldquo;readable and mine\u0026rdquo; beats \u0026ldquo;short and someone else\u0026rsquo;s\u0026rdquo;.\nThat\u0026rsquo;s a deliberate trade, not a universal rule. For an internal tool on a deadline, reach for the wrapper. For the security-critical base of everything else, the raw resources won every time I checked.\nTo sum up The community module ecosystem offers less code that more people have tested, and for plenty of infrastructure that\u0026rsquo;s the right call. For terraform-aws-bootstrap and terraform-aws-security-baseline it almost never was, because each wrapper turned out to be more configuration than the raw resources once my posture was accounted for, or it leaked the moment I needed a specific.\nThe CloudTrail log bucket is the pattern in miniature: the bundled bucket lacked prevent_destroy and a strong policy, so using the wrapper meant either a weaker bucket or fighting the module. 
A wrapper is a good deal while its abstraction holds and a bad one the moment you fight it, and security-critical foundation infrastructure is all specifics. Hand-rolling cost more lines and bought code I can read and defend. For this, that was the trade worth making.\n","date":"2026-05-10T00:00:00Z","image":"/why-i-hand-rolled-every-module/cover-why-i-hand-rolled-every-module.png","permalink":"/why-i-hand-rolled-every-module/","title":"Why I hand-rolled every module"},{"content":"Bootstrapping the account got it ready: somewhere to store state, an identity to deploy as, enough for the next tofu apply to run. Ready is not the same as safe. An account with no audit trail, nothing watching it, and no considered way for a human to get in is fine for experimenting and absolutely not where you put the most sensitive key in the system. So before the signing key goes anywhere near it, the account gets a security baseline.\nReady is not the same as safe The bootstrap post ended with an account that was ready: it had somewhere to store state and a CI identity to deploy as. The next tofu apply could run.\nReady is not safe. That account still has no audit trail, so nobody could tell you afterwards what happened in it. It has no threat detection, so nothing is watching. Its defaults are AWS\u0026rsquo;s defaults, which are not a security posture. There\u0026rsquo;s no considered way for a human to get in. An account in that condition is fine for experimenting. 
It\u0026rsquo;s not somewhere you put the most sensitive key in the whole system.\nSo before the signing key is anywhere near it, the account gets a security baseline.\nThe baseline, in one downstream stack terraform-aws-security-baseline is that baseline, and it\u0026rsquo;s exactly the downstream stack the bootstrap post promised: applied through the automation role bootstrap created, not bootstrapped specially.\nIt\u0026rsquo;s six sub-modules, each behind an enable_* toggle: account-hardening (IAM password policy, account-wide S3 public-access blocking, default EBS encryption), audit-logging (a multi-region CloudTrail with log-file validation), aws-config, threat-detection (GuardDuty, Security Hub, IAM Access Analyzer), alerts, and operator-role. Together they turn a bare account into one that records what happens, watches for trouble, and controls who gets in.\nMost of those are the expected baseline. The operator role is the one worth slowing down on, because it\u0026rsquo;s built backwards from how people usually think about an admin role.\nThe operator role, and the inversion InfraAdmin is the human way into the account: the role a person assumes to do operator work. Two things define it.\nThe trust policy decides who may assume it. It trusts only the account root principal, and it requires multi-factor authentication: the assume call must carry aws:MultiFactorAuthPresent, and aws:MultiFactorAuthAge bounds how recently that MFA was performed. No MFA, no role. So far this is a careful but ordinary admin role.\nThe inversion is a second, separate inline policy, and it\u0026rsquo;s almost entirely Deny. It denies, using NotAction, anything where aws:RequestedRegion falls outside an allowed set of regions. The role\u0026rsquo;s power comes from an admin grant. This inline policy fences that power.\nThat\u0026rsquo;s the part worth holding onto. People picture an admin role as a list of what it can do. 
This one is better understood by what it cannot: it cannot act outside its permitted regions, full stop. A fat-fingered command, or a compromised session, cannot quietly spin resources up in some region nobody\u0026rsquo;s watching. The fence is as much the point of the role as the grant is.\nThe carve-out, because honesty There\u0026rsquo;s a fiddly detail, and it\u0026rsquo;s the kind of thing that makes the region fence real rather than theoretical.\nSome AWS services are global. IAM, CloudFront, Route 53 and friends have no region, and they don\u0026rsquo;t honour aws:RequestedRegion. A naive region-deny would therefore deny calls to IAM, and you\u0026rsquo;d lock yourself out of the very service you manage access with. (A close cousin of the kind of self-inflicted lockout I\u0026rsquo;ll come back to in a later post.)\nSo the Deny carries explicit carve-outs for the global services. It isn\u0026rsquo;t elegant, and it can\u0026rsquo;t be: the global-versus-regional split is just a fact of AWS, and a correct region fence has to account for it. The carve-out list is the honest cost of the control working.\nHarden the room, then move the keys in There\u0026rsquo;s an order to all of this, and the order is the argument.\nThe account that will hold the signing key has to be audited before the key arrives, so that from day one every call against it is in CloudTrail. It has to be watched before the key arrives, so GuardDuty is already looking. It has to be access-controlled before the key arrives, so the only human path in is MFA-gated and region-fenced.\nYou don\u0026rsquo;t move something valuable into a room and then think about locks. You build the room, fit the locks, check they work, and then move the valuable thing in. The security baseline is fitting the locks. The signing key comes later, into a room already built for it.\nWorth remembering Bootstrapping an account makes it ready for the next deploy. It does not make it safe to hold anything that matters. 
terraform-aws-security-baseline is the downstream stack that closes that gap: audit logging, AWS Config, threat detection, account hardening, and an operator role, applied through the CI role bootstrap created.\nThe operator role is the piece to study. It\u0026rsquo;s MFA-gated on the way in, and then fenced by a separate, almost-all-Deny inline policy that confines it to permitted regions, with carve-outs for the global services that have no region. An admin role defined as much by its fence as its grant. Harden the room first; the keys move in afterwards.\n","date":"2026-05-09T00:00:00Z","image":"/hardening-the-account-that-will-hold-the-keys/cover-hardening-the-account-that-will-hold-the-keys.png","permalink":"/hardening-the-account-that-will-hold-the-keys/","title":"Hardening the account that will hold the keys"},{"content":"A long-lived AWS access key, sitting in a CI system, is just about the single credential I\u0026rsquo;d most like to be rid of. It\u0026rsquo;s powerful, it never expires unless someone remembers to rotate it (nobody remembers to rotate it), and it lives in one of the most attractive targets in the whole supply chain. For infrastructure that\u0026rsquo;s eventually going to hold a release-signing key, it\u0026rsquo;s exactly the wrong place to start. So the phpboyscout infrastructure has no AWS access key in CI at all. None.\nThe access key you don\u0026rsquo;t want A CI pipeline that runs tofu apply against AWS needs AWS credentials. The traditional way to give it some is an IAM user with an access key pair, pasted into the CI system as a masked variable.\nLook at what that key is. It\u0026rsquo;s long-lived: it works until someone remembers to rotate it, and rotating it is a chore, so mostly nobody does. It\u0026rsquo;s powerful: it can apply infrastructure, so it can do nearly anything. And it\u0026rsquo;s sitting in a CI system, which is one of the most attractive targets in your whole supply chain. 
You\u0026rsquo;ve taken your highest-value credential and stored a permanent copy of it in a place built for running automated jobs.\nFor infrastructure that\u0026rsquo;s going to hold a release-signing key, that\u0026rsquo;s precisely the wrong starting point. So the phpboyscout infrastructure has no AWS access key in CI at all. Not a well-guarded one. None.\nFederation instead of a stored secret The replacement is OIDC federation, and the shape of it is worth walking through, because it\u0026rsquo;s genuinely different from \u0026ldquo;a secret, but better\u0026rdquo;.\nA modern CI platform can mint an OIDC token. GitLab does this with an id_tokens: block: at job time, GitLab issues a short-lived JSON Web Token, signed by GitLab, that asserts a set of facts. This is project X. This is pipeline Y. This is running on ref Z, of this type.\nAWS can consume that. The sts:AssumeRoleWithWebIdentity call takes such a token and, if it satisfies an IAM role\u0026rsquo;s trust policy, returns short-lived AWS credentials for that role. The trust policy is where the control lives: it names GitLab as a trusted token issuer, and it constrains the token\u0026rsquo;s sub claim so that only the specific project, and the specific refs, you intend can assume the role.\nPut it together: the pipeline asks GitLab for a token, hands it to AWS, and gets back credentials that last about an hour and are scoped to one role. Nothing long-lived is stored anywhere. 
The credential exists only for the job that needs it, and it can\u0026rsquo;t be stolen from a CI variable store, because it was never in one.\nTwo halves of one handshake That handshake is built by two of the repos in this series, each owning one side.\nterraform-aws-bootstrap builds the AWS half, in its automation-iam module: it registers GitLab as an OIDC identity provider in the account, and it creates the automation role with the trust policy that decides which pipelines may assume it.\nThe CI components build the consuming half: the id_tokens: block that asks GitLab for the JWT, and then simply letting the AWS provider\u0026rsquo;s own credential chain perform the exchange. The pipeline doesn\u0026rsquo;t call sts by hand. It presents the token; the SDK does the rest.\nThe gotcha: don\u0026rsquo;t set a profile There\u0026rsquo;s one quiet way to break this, and a stack can look completely correct while doing it.\nThe AWS SDK finds credentials by walking a chain of sources in order. The web-identity path, the one that uses the OIDC token, is one link in that chain. It triggers off environment variables the CI sets up automatically.\nBut if the aws provider block has a hardcoded profile = \u0026quot;...\u0026quot;, the SDK takes the profile link of the chain instead, and never reaches the web-identity link. A profile line is the sort of thing that ends up in a provider block from someone\u0026rsquo;s local development setup, where it\u0026rsquo;s exactly right. Committed and run in CI, it silently short-circuits the federation. The pipeline either fails to find credentials, or finds the wrong ones.\nThe rule is simple once you know it: the provider block that runs in CI must not name a profile. Leave the chain free to find the web identity. 
It\u0026rsquo;s the kind of bug that teaches you to be precise about which link of the credential chain you\u0026rsquo;re actually relying on.\nThe bottom line Giving CI an AWS access key means storing your most powerful, longest-lived credential in one of your most exposed systems. OIDC federation removes it entirely. The CI platform mints a short-lived signed token, AWS exchanges it via AssumeRoleWithWebIdentity for hour-long credentials against a role whose trust policy names the exact pipeline, and nothing permanent is stored.\nterraform-aws-bootstrap builds the AWS side, the identity provider and the trust policy; the CI components build the consuming side, the token request. The one trap is a hardcoded profile in the provider block, which short-circuits the SDK\u0026rsquo;s credential chain before it reaches the web-identity path. Get that right, and a pipeline deploys to AWS as a verifiable, short-lived identity, with no key to steal.\n","date":"2026-05-08T00:00:00Z","image":"/no-access-keys-in-ci/cover-no-access-keys-in-ci.png","permalink":"/no-access-keys-in-ci/","title":"No access keys in CI"},{"content":"A while ago I worked out where a CLI should keep your API key: env var, OS keychain, or, grudgingly, a literal in the config file. That answers where the secret lives. It says nothing about what happens to it once it\u0026rsquo;s loaded and sitting in your process memory, which is the half where secrets actually tend to leak. Rust, it turns out, can do something about that half that Go simply can\u0026rsquo;t.\nWhat go-tool-base already settled A while back I wrote about where a CLI should keep your API keys. The answer go-tool-base settled on was three storage modes, in a fixed precedence: an environment variable reference (the recommended default), the OS keychain (opt-in), or a literal value in the config file (legacy, and refused outright when CI=true).\nrust-tool-base keeps that design unchanged. 
Same three modes, same precedence, same refusal of literal secrets in CI. A tool embeds a CredentialRef in its typed config, and a Resolver walks env, then keychain, then literal, then a well-known fallback variable, first hit wins. That part is a straight carry-over, because where to keep the secret was design, and design survives the port.\nBut storage is only half the life of a secret. The other half is what happens to it once it\u0026rsquo;s resolved and sitting in your process memory. That\u0026rsquo;s where Rust can do something Go can\u0026rsquo;t, and rust-tool-base takes the opening.\nThe two ways a secret leaks after you\u0026rsquo;ve loaded it You\u0026rsquo;ve resolved the API key. It\u0026rsquo;s a value in memory now. Two very ordinary things can leak it from there, and neither involves your storage being wrong.\nThe first is the log line. Somewhere a developer writes a debug print of a config struct, or an error includes the struct that holds the key, or a panic dumps it. The secret is a string like any other string, so it renders like any other string, straight into a log aggregator that a lot of people can read.\nThe second is the leftover bytes. The key sat in a heap allocation. The variable goes out of scope, the allocation is freed, and on most runtimes \u0026ldquo;freed\u0026rdquo; just means \u0026ldquo;returned to the allocator\u0026rdquo;. The bytes are still there until something else writes over them. A core dump taken in that window contains your key. So does the next allocation that happens to land on that memory and gets logged before it\u0026rsquo;s overwritten.\nA Go string can\u0026rsquo;t really defend against either. Go strings are immutable, so you can\u0026rsquo;t zero one in place; the runtime copies them freely, so you can\u0026rsquo;t even track every copy; and there\u0026rsquo;s no compile-time barrier stopping anyone printing one. 
You can be disciplined, but discipline is all you\u0026rsquo;ve got.\nSecretString closes both rust-tool-base routes every secret through secrecy::SecretString, and the crate is explicit that taking a plain \u0026amp;str or String for a secret is a type error, not a style preference.\nFor the log line, SecretString has its own Debug implementation, and it prints [REDACTED]. Always. A config struct holding a SecretString can be debug-printed, put in an error, caught in a panic, and the secret field shows up as [REDACTED] every single time. You don\u0026rsquo;t have to remember not to log it. The type already won\u0026rsquo;t.\nFor the leftover bytes, SecretString zeroes its memory when it\u0026rsquo;s dropped. When the value goes out of scope, before the allocation is handed back, the bytes are overwritten. The window where a freed allocation still holds your key is closed. A core dump taken afterwards finds zeroes.\nThere\u0026rsquo;s a third leak SecretString blocks that\u0026rsquo;s easy to miss. It deliberately doesn\u0026rsquo;t implement Serialize. You cannot serialise a SecretString. That sounds like an inconvenience until you see what it prevents: a tool that loads config, changes one setting, and writes the whole struct back would, with an ordinary string, faithfully write the resolved secret to disk in plain text. Because SecretString can\u0026rsquo;t be serialised, CredentialRef can\u0026rsquo;t be either, and that accident is structurally impossible. Writing a secret back is a deliberate, separate path, never a side effect of saving config.\nWhen code genuinely needs the raw value, to drop it into an Authorization header, it calls expose_secret(). The name is the point. Getting at the plaintext is one explicit, greppable, reviewable call, and everywhere else the secret stays wrapped.\nDiscipline versus the type system The honest framing is this. None of these leaks are exotic. 
Logging a struct, a core dump after a free, re-saving a config file: they\u0026rsquo;re all routine, and they\u0026rsquo;re all how real credentials end up somewhere they shouldn\u0026rsquo;t.\ngo-tool-base\u0026rsquo;s storage design is good, and rust-tool-base kept it. But in Go, not leaking the secret once it\u0026rsquo;s in memory comes down to every developer being careful every time. In Rust, SecretString makes the type system carry it. The redaction, the zeroing, the un-serialisability aren\u0026rsquo;t things you remember to do. They\u0026rsquo;re things the secret does to itself because of what it is. That\u0026rsquo;s the part Go structurally can\u0026rsquo;t match, and it\u0026rsquo;s why the port didn\u0026rsquo;t just copy the storage modes across, it tightened the handling underneath them.\nThe gist go-tool-base settled where a CLI keeps a secret: env var, keychain, or literal, in a fixed precedence. rust-tool-base keeps that design and hardens what happens once the secret is loaded.\nEvery secret is a secrecy::SecretString. It debug-prints as [REDACTED], so it can\u0026rsquo;t fall into a log by accident. Its memory is zeroed on drop, so it doesn\u0026rsquo;t survive in freed heap. It isn\u0026rsquo;t serialisable, so it can\u0026rsquo;t be written back to config by a blanket save. Getting the plaintext is one explicit expose_secret() call. Go can only ask developers to be careful with a secret in memory; Rust lets the type be careful for them.\n","date":"2026-05-06T00:00:00Z","image":"/secrets-that-scrub-themselves-from-ram/cover-secrets-that-scrub-themselves-from-ram.png","permalink":"/secrets-that-scrub-themselves-from-ram/","title":"Secrets that scrub themselves from RAM"},{"content":"Here\u0026rsquo;s a puzzle that every infrastructure-as-code setup hits exactly once, right at the very beginning, and then never again. An OpenTofu stack stores its state in a backend. 
The bootstrap stack I wrote about last time has a particular job, and part of that job is to create the backend that remote state lives in. So where does the bootstrap stack store its own state, on the very first run, before it\u0026rsquo;s built the place state is supposed to go?\nWhere does the state of the thing that makes the state store live? That\u0026rsquo;s the puzzle, and it\u0026rsquo;s a real ordering deadlock rather than a riddle.\nAn OpenTofu stack keeps a state file, and for anything shared that state file lives in a remote backend: on AWS, an S3 bucket. Fine. But the bootstrap stack has a particular job, and part of that job is to create the S3 bucket that remote state lives in.\nSo walk through the first run. Bootstrap has never been applied. The state bucket doesn\u0026rsquo;t exist, because creating it is what bootstrap is for. Bootstrap needs somewhere to store its own state. The only place that would make sense is the bucket it\u0026rsquo;s about to create, which isn\u0026rsquo;t there yet. The thing that builds the state store can\u0026rsquo;t store its state in the state store.\nRun local, then migrate The way out is a two-step that OpenTofu supports directly.\nBootstrap starts configured with a local backend: backend \u0026quot;local\u0026quot; {}. State is just a file on the operator\u0026rsquo;s machine. With that in place, the first tofu apply runs. It creates the S3 bucket and the KMS key, and records all of it in the local state file.\nNow the bucket exists. So the backend configuration is rewritten to point at it: an s3 backend block naming the new bucket. Then tofu init -migrate-state. OpenTofu sees the backend has changed, picks up the local state file, and copies it into the S3 bucket. From that point on, bootstrap\u0026rsquo;s own state lives in the bucket that bootstrap created. The egg has laid the chicken.\nThe local backend was a scaffold. 
It existed for exactly one apply, to break the ordering deadlock, and then the state moved off it and it was never used again.\nIt happened twice The infra repo actually did this migration twice, and the second time is the proof that the pattern is general rather than a one-off trick.\nThe first migration was the one above: local to S3, at the very start. The second came later, during the move from GitHub to GitLab. GitLab offers a managed HTTP state backend, and infra chose to use it. So the backend block was rewritten again, this time from s3 to http, and tofu init -migrate-state ran again, copying the state from the S3 bucket to GitLab\u0026rsquo;s backend.\nThe same move, twice, against three different backends. That\u0026rsquo;s the useful lesson hiding in the chicken-and-egg story. State is portable. The backend is just where you currently keep it, not a property of the stack itself, and moving it is a routine, supported operation rather than surgery.\nWhy this is the honest answer, not a hack It\u0026rsquo;s easy to look at \u0026ldquo;apply once with a local backend, then migrate\u0026rdquo; and feel it\u0026rsquo;s a bit of a smell, a workaround for something that should have been cleaner.\nIt isn\u0026rsquo;t. It\u0026rsquo;s the honest answer to a real ordering problem, and the alternatives are worse.\nThe obvious alternative is to create the state bucket by hand, in the console, before running bootstrap at all. But then the most important bucket in the account is unmanaged. It exists outside every OpenTofu graph, nobody\u0026rsquo;s code describes it, its encryption and policy and prevent_destroy are whatever someone clicked that day, and it drifts. The local-then-migrate dance avoids exactly that. The bucket is created by bootstrap, described in code, and tracked in bootstrap\u0026rsquo;s own state from its very first apply. It\u0026rsquo;s managed from birth.\nThe chicken-and-egg isn\u0026rsquo;t a flaw to be embarrassed about. 
It\u0026rsquo;s just the shape of the problem when a stack has to build its own foundations, and OpenTofu\u0026rsquo;s -migrate-state is the supported tool for exactly that shape.\nPulling it together Every OpenTofu stack needs a backend to store state, and the bootstrap stack\u0026rsquo;s job is to create the backend, so on its first run the bucket it needs doesn\u0026rsquo;t yet exist.\nThe resolution is to run bootstrap once with a local backend, let that apply create the bucket and key, then rewrite the backend configuration and tofu init -migrate-state the state into the bucket bootstrap just made. The infra repo did it twice, local to S3 and later S3 to GitLab, which shows the real point: state is portable, and the backend is just where you keep it. Doing it this way, rather than hand-creating the bucket, is what keeps that critical bucket managed in code from its very first day.\n","date":"2026-05-06T00:00:00Z","image":"/the-chicken-and-egg-of-remote-state/cover-the-chicken-and-egg-of-remote-state.png","permalink":"/the-chicken-and-egg-of-remote-state/","title":"The chicken-and-egg of remote state"},{"content":"OpenTofu\u0026rsquo;s remote state file is, quietly, the most sensitive thing in an infrastructure repo. It\u0026rsquo;s a plain JSON document listing every resource you manage, every ID, and, depending on your providers, the odd secret in clear text. So the S3 bucket that holds it can\u0026rsquo;t just be a bucket. It has to actively defend itself, on three separate fronts.\nThe most sensitive file in the repo OpenTofu, like Terraform, keeps a state file: a JSON document recording every resource the stack manages, its real-world ID, and its attributes. It\u0026rsquo;s how the tool knows what already exists. It\u0026rsquo;s also, quietly, the most sensitive file in the whole repo. It can hold resource identifiers an attacker would value, and depending on the providers in play it can hold secret values in clear text.\nThree bad things can happen to it. 
It can be deleted, and now the tool has forgotten everything it manages. It can be read by someone who shouldn\u0026rsquo;t. It can be corrupted by two runs writing at once. The bucket that holds remote state has to defend against all three, and terraform-aws-bootstrap\u0026rsquo;s state-backend module is built around doing exactly that.\nThe DynamoDB lock table is gone Start with the corruption problem, because the answer changed recently.\nThe long-standing pattern for remote state on AWS was an S3 bucket plus a DynamoDB table. S3 held the state; the DynamoDB table held a lock, so two apply runs couldn\u0026rsquo;t write at once. Everyone who\u0026rsquo;s done Terraform on AWS has provisioned that table, probably more times than they\u0026rsquo;d care to count.\nOpenTofu 1.10 made it unnecessary. The S3 backend gained use_lockfile, which does the locking with a small lock object in the same bucket, using S3\u0026rsquo;s conditional-write support. No separate table. The state backend is now genuinely one bucket and one key, with the lock living beside the state. It\u0026rsquo;s one fewer resource to create, one fewer thing to pay for, and one fewer moving part to reason about. The module takes the new path, and the DynamoDB table simply isn\u0026rsquo;t there.\nA bucket you can\u0026rsquo;t delete by accident Deletion is guarded with lifecycle { prevent_destroy = true } on the bucket. With that set, OpenTofu refuses to produce a plan that would destroy the bucket. A stray tofu destroy, a refactor that drops the resource, an accidental rename: all of them fail loudly instead of quietly taking the state bucket with them.\nThis is also why the state-backend module is hand-rolled from raw aws_s3_bucket resources rather than wrapping a community module like terraform-aws-modules/s3-bucket. prevent_destroy has to sit on the actual resource, and a lifecycle block isn\u0026rsquo;t something you can pass into a wrapper module as an input. 
Hand-rolling the bucket keeps prevent_destroy somewhere you can put it and, just as importantly, somewhere the next reader can see it. (There\u0026rsquo;s a whole post coming on why I hand-rolled every module; this is one of the reasons in miniature.)\nReject anything encrypted wrong Confidentiality is the subtle one, because the obvious control isn\u0026rsquo;t enough.\nThe bucket has a default encryption configuration: server-side encryption with the customer-managed KMS key. But default encryption is a default. A client making a PutObject call can override it per request, asking for plain AES256 or a different KMS key, and S3 will honour the override.\nSo the module doesn\u0026rsquo;t rely on the default. The bucket policy explicitly denies the upload it doesn\u0026rsquo;t want. It denies any request not over TLS. It denies any PutObject that isn\u0026rsquo;t using SSE-KMS. And it denies any PutObject that names the wrong KMS key. The default encryption config says \u0026ldquo;this is what you get if you don\u0026rsquo;t ask\u0026rdquo;; the bucket policy says \u0026ldquo;and you\u0026rsquo;re not allowed to ask for anything else\u0026rdquo;. State can only ever land encrypted, in transit and at rest, under the one key the module controls.\nOne small companion setting: bucket_key_enabled. With per-object SSE-KMS, every object operation is also a KMS API call, which costs money and can throttle. An S3 Bucket Key collapses those into far fewer KMS calls, cutting per-object KMS traffic by well over ninety per cent. It\u0026rsquo;s a one-line setting the module turns on and most people forget exists.\nIn short Remote state is the most sensitive file an infrastructure repo has, and the bucket that holds it has to defend against deletion, disclosure and corruption.\nterraform-aws-bootstrap\u0026rsquo;s state backend handles corruption with OpenTofu 1.10\u0026rsquo;s use_lockfile, dropping the old DynamoDB lock table entirely. 
It guards deletion with prevent_destroy, which is also why the bucket is hand-rolled rather than wrapped. And it guards confidentiality with a bucket policy that denies non-TLS traffic and denies any upload not encrypted with the right KMS key, because default encryption is only a default and a client can override it. The state bucket isn\u0026rsquo;t just a place to put state. It\u0026rsquo;s built to refuse every wrong thing that could happen to it.\n","date":"2026-05-02T00:00:00Z","image":"/a-state-bucket-that-defends-itself/cover-a-state-bucket-that-defends-itself.png","permalink":"/a-state-bucket-that-defends-itself/","title":"A state bucket that defends itself"},{"content":"If your CLI tool talks to an AI model, you don\u0026rsquo;t want to hard-wire one vendor. So you reach for a single client interface over several providers, which is the right call. The trap is the next step: build that interface on only what every provider has in common, and you quietly throw away the very features that made you want a particular provider in the first place. rust-tool-base\u0026rsquo;s rtb-ai refuses to make that trade.\nThe pull toward one interface If your CLI tool talks to an AI model, hard-wiring one vendor is a poor bet. One user has an Anthropic key, another an OpenAI key. Someone\u0026rsquo;s on Gemini. Someone runs Ollama locally because their data can\u0026rsquo;t leave the building. Someone points at an OpenAI-compatible endpoint from a provider you\u0026rsquo;ve never heard of. You don\u0026rsquo;t want a separate code path for each, so you want one AiClient that all of them slot behind.\nrtb-ai gets that unification from the genai crate, which already speaks to Anthropic, OpenAI, Gemini, Ollama and OpenAI-compatible endpoints. One interface, five providers, the tool author picks one in config. The Go sibling makes the same bet: go-tool-base\u0026rsquo;s chat package also unifies several providers, behind an interface deliberately kept to four methods. 
So far this is the obvious design, and if it were the whole design there\u0026rsquo;d be nothing to write about.\nWhat \u0026ldquo;unified\u0026rdquo; quietly costs you Here\u0026rsquo;s the catch in any unified interface. It can only expose what every provider behind it has in common.\nThe common subset is plain chat. Messages go in, text comes out, optionally streamed token by token. That\u0026rsquo;s real and it\u0026rsquo;s useful and every provider does it. But the common subset is also the floor, and the features that make a particular provider worth choosing are almost never on the floor. They\u0026rsquo;re the things only that provider does.\nAnthropic is the sharp example, because it has three features that matter and not one of them is common-subset.\nPrompt caching. You can mark the stable parts of a request, the system prompt and the tool list, as cacheable. The provider keeps them warm, and on the next turn you aren\u0026rsquo;t billed to re-send and re-process text that didn\u0026rsquo;t change. On a long agent loop, where the same large system prompt rides along on every single turn, that\u0026rsquo;s a substantial saving in both cost and latency.\nExtended thinking. The model works through a hard problem in a visible, budgeted reasoning pass before it commits to an answer, and you can see that reasoning.\nCitations. Structured references back to source material in the response.\nA client built strictly on the common subset can\u0026rsquo;t express any of those. It has no field for them, because four of the five providers wouldn\u0026rsquo;t know what to do with the field. So a purely lowest-common-denominator client would \u0026ldquo;support\u0026rdquo; Anthropic and then use it badly, leaving its best features unreachable. Support as a checkbox, not as the point.\nThe escape hatch rtb-ai\u0026rsquo;s answer is to not choose. 
It runs two implementations under one interface.\nFor OpenAI, Gemini, Ollama and OpenAI-compatible endpoints, calls route through genai, the unified path. For Anthropic, every method drops to a direct reqwest implementation straight against the Messages API. Same AiClient on the surface, a different implementation underneath, selected by which provider the config names.\nAnd the request type has deliberate room for the difference:\npub struct ChatRequest { pub system: Option\u0026lt;String\u0026gt;, pub messages: Vec\u0026lt;Message\u0026gt;, pub temperature: Option\u0026lt;f32\u0026gt;, pub max_tokens: Option\u0026lt;u32\u0026gt;, /// Anthropic-only: enables prompt caching at every stable point. /// Ignored on non-Anthropic providers. pub cache_control: bool, /// Anthropic-only: extended-thinking budget. `None` disables. /// Ignored on non-Anthropic providers. pub thinking: Option\u0026lt;ThinkingMode\u0026gt;, } Set cache_control and the Anthropic-direct path inserts cache breakpoints at the three stable points: the system prompt, the tool list, and the first message. Set thinking and it adds the thinking block, and streaming surfaces a separate ThinkingToken event so you can show the reasoning apart from the answer. On a non-Anthropic provider, both fields are simply ignored. The interface carries them; only the implementation that understands them acts on them.\nA hatch, not a leak It\u0026rsquo;s worth being precise about why this isn\u0026rsquo;t the thing it superficially resembles, which is a leaky abstraction.\nA leaky abstraction is one where implementation details bleed through that you didn\u0026rsquo;t intend and can\u0026rsquo;t reason about. The abstraction quietly fails to abstract, and you\u0026rsquo;re left guessing which provider you\u0026rsquo;re really talking to.\nThis is the opposite of that. The two Anthropic-only fields aren\u0026rsquo;t a leak. 
They\u0026rsquo;re named, documented as Anthropic-only, inert everywhere else, and right there in the public type for anyone to see. The interface is uniform for the common case and deliberately, visibly non-uniform at exactly the points where uniformity would have cost you the good features. You opt into provider-specifics by setting a field. You stay fully portable by leaving it at its default. Nothing bleeds; you decide.\nThe same design line explains what does stay in the unified path. Structured output, chat_structured::\u0026lt;T\u0026gt;, sends a JSON Schema derived from your Rust type with the request and validates the reply against it before handing you a typed T. That\u0026rsquo;s a portability win that costs nothing across providers, so it belongs in the common interface. The split isn\u0026rsquo;t \u0026ldquo;Anthropic versus the rest\u0026rdquo;. It\u0026rsquo;s \u0026ldquo;features that are free to unify go in the unified path; features that aren\u0026rsquo;t get a designed door\u0026rdquo;. Prompt caching and extended thinking get the door, because flattening them away would be the expensive kind of convenient.\nTo sum up A CLI tool that integrates AI wants one client over several providers, and a unified interface can only expose what those providers share. The shared floor is plain chat, and the features worth choosing a provider for, like Anthropic\u0026rsquo;s prompt caching, extended thinking and citations, are never on the floor.\nrtb-ai keeps both. genai provides the unified path across five providers; an Anthropic-direct reqwest path drops below the abstraction for the features genai can\u0026rsquo;t reach, and ChatRequest carries the Anthropic-only fields openly, ignored elsewhere. Uniform where uniformity is free, with a designed escape hatch where it isn\u0026rsquo;t. 
That\u0026rsquo;s the difference between supporting a provider and actually using it.\n","date":"2026-05-02T00:00:00Z","image":"/supporting-a-provider-or-actually-using-it/cover-supporting-a-provider-or-actually-using-it.png","permalink":"/supporting-a-provider-or-actually-using-it/","title":"Supporting a provider, or actually using it"},{"content":"In the porting post I said go-tool-base\u0026rsquo;s error handler was one of the bits that didn\u0026rsquo;t survive the move to Rust, and promised to come back to it. Here\u0026rsquo;s the come-back. The short version is that Rust hands you, for free, the single consistent error exit that go-tool-base had to build a whole component to get.\nWhat go-tool-base built A while ago I wrote about error handling in go-tool-base. The core of it: an error should carry a hint, a separate field of human guidance telling the user what to do next, kept apart from the error\u0026rsquo;s identity so code can still match on it.\nThe other half of that post was about consistency. Every go-tool-base command returns its errors the idiomatic Cobra way, and they all funnel into one Execute() wrapper at the root, which routes every error through one ErrorHandler. One door out. Presentation decided in exactly one place, so no command can render a failure differently from its neighbour.\nThat handler is a real object. It exists, it\u0026rsquo;s wired in, it\u0026rsquo;s the thing every error passes through. Building it was a deliberate piece of work, and it was the right call for Go.\nWhen I rebuilt this in Rust, the handler didn\u0026rsquo;t survive the move. Not because consistency stopped mattering. Because Rust gives you the single exit for free, and an object to enforce it would just be re-implementing something the language already does for you.\nThe shape of a Rust error Start with the type. 
In rust-tool-base every crate defines its own error enum, and every one of them derives two traits:\n#[derive(Debug, thiserror::Error, miette::Diagnostic)] pub enum ConfigError { #[error(\u0026#34;config file not found at {path}\u0026#34;)] #[diagnostic( code(rtb::config::not_found), help(\u0026#34;run `mytool init` to create one, or set MYTOOL_CONFIG\u0026#34;), )] NotFound { path: PathBuf }, // ... } thiserror::Error makes it a proper error type. miette::Diagnostic is the interesting one. A Diagnostic is an error that also carries the things you\u0026rsquo;d want when presenting it: a stable code, a severity, a help string, and optionally source labels pointing at spans of input. The help line is the same idea as go-tool-base\u0026rsquo;s hint, the recovery step, except here it\u0026rsquo;s an attribute on the variant rather than a field threaded through a wrapper.\nSo the guidance lives on the error, structured, from the moment the error is created.\nThere is no handler, there\u0026rsquo;s a convention Here\u0026rsquo;s where Rust does the work go-tool-base\u0026rsquo;s handler was built to do.\nA rust-tool-base main looks like this:\n#[tokio::main] async fn main() -\u0026gt; miette::Result\u0026lt;()\u0026gt; { rtb::cli::Application::builder() .metadata(/* ... */) .version(VersionInfo::from_env()) .build()? .run() .await } main returns miette::Result\u0026lt;()\u0026gt;. Every command\u0026rsquo;s run returns a Result too. In between, errors propagate with the ? operator: a function that hits an error returns it upward, immediately, and the caller does the same, all the way to main. Nobody writes a \u0026ldquo;check this error\u0026rdquo; call. ? is the propagation.\nAnd when an error reaches main and main returns it, something has to render it for the user. That something is a report hook. 
rust-tool-base installs one at startup, and from then on any Diagnostic that exits main is rendered through it: the code, the severity, the help text, the source labels, with colour. One renderer, installed once.\nLook at what that adds up to. Every error in the program flows to one place, main. It\u0026rsquo;s rendered by one thing, the hook. Presentation is decided in exactly one location and no command can deviate from it. That\u0026rsquo;s precisely the property go-tool-base\u0026rsquo;s ErrorHandler was built to guarantee. The difference is that nobody built it. The single exit is just where ? propagation ends, and the single renderer is one hook. The language\u0026rsquo;s own convention for returning errors from main is the funnel.\nErrors are values, all the way The thing that took me a moment to fully trust is that there\u0026rsquo;s no funnel to maintain, because there\u0026rsquo;s no funnel as an object. go-tool-base\u0026rsquo;s handler is a component: it can drift, it has to be kept in the path, a command could in principle be wired to bypass it. The Rust version cannot be bypassed, because bypassing it would mean a command not returning its error, and an error you don\u0026rsquo;t return is a compile-time warning at best and dead-obvious wrong code at worst.\nSo the model is just: errors are values, you return them, ? carries them up, main hands the last one to the hook. The consistency isn\u0026rsquo;t enforced by a guard. It\u0026rsquo;s the only thing the shape of the language really lets you do.\ngo-tool-base reaches a single, consistent error exit by building one and routing everything through it. rust-tool-base reaches the same exit by having errors be ordinary return values and letting them fall out of main. Same outcome. One of them is a component you own; the other is a convention you inherit.\nWorth remembering go-tool-base funnels every error through one ErrorHandler so presentation stays consistent. 
That handler is a deliberately built component, and it\u0026rsquo;s the right design in Go.\nrust-tool-base has no handler. Every crate\u0026rsquo;s error type derives miette::Diagnostic, carrying its code, severity and help text. Errors propagate with ? to main, which returns miette::Result, and a framework-installed hook renders whatever comes out. The single consistent exit is the end of ? propagation, and the single renderer is one hook. The funnel go-tool-base built by hand is, in Rust, just the language\u0026rsquo;s return-from-main convention.\n","date":"2026-05-01T00:00:00Z","image":"/errors-without-an-error-handler/cover-errors-without-an-error-handler.png","permalink":"/errors-without-an-error-handler/","title":"Errors without an error handler"},{"content":"A brand-new AWS account is a slightly nerve-wracking thing. It can do almost anything, it\u0026rsquo;s hardened against almost nothing, and the list of stuff you ought to set up before you trust it with anything real is long. The natural instinct is to write one big \u0026ldquo;set up the account\u0026rdquo; module that does the whole list in a single apply. I want to talk you out of that, because the bootstrap module I\u0026rsquo;m happiest with does almost nothing, on purpose.\nThe first-apply problem A brand-new AWS account is not ready for anything serious. Before you\u0026rsquo;d responsibly run real infrastructure into it, you want an account baseline: a password policy, account-wide S3 public-access blocking, default EBS encryption, CloudTrail, AWS Config, GuardDuty, alerting, a sensible human operator role. It\u0026rsquo;s a long list, and all of it matters.\nThe instinct, faced with that list, is to write one big \u0026ldquo;set up the account\u0026rdquo; module and have it do everything. 
One tofu apply, a fully prepared account, done.\nThat instinct is worth resisting, and terraform-aws-bootstrap resists it deliberately.\nThree things, and a hard line terraform-aws-bootstrap does three things:\nstate-backend, an S3 bucket and a customer-managed KMS key to hold remote Terraform state. automation-iam, an OIDC identity provider and an IAM role that CI assumes to apply everything else. nuke-config, which renders an aws-nuke configuration scoped to the account, for tearing a throwaway account back down. That\u0026rsquo;s the whole module. Account hardening, CloudTrail, AWS Config, GuardDuty, the operator role, the alerting: none of it is in here. And it\u0026rsquo;s not absent by accident. The README has a section headed \u0026ldquo;what\u0026rsquo;s deliberately NOT in scope\u0026rdquo; that lists those exclusions out loud. The boundary is written down, because the boundary is the design.\nWhy the line is exactly there The reason the line sits where it does is the most useful idea in the module.\nEverything bootstrap excludes belongs in a separate stack, applied through the automation role bootstrap creates. Bootstrap\u0026rsquo;s only job is to get the account to the point where the next tofu apply can run properly: somewhere to store state, and an identity to run as. Once those two things exist, hardening the account isn\u0026rsquo;t a special bootstrapping act. It\u0026rsquo;s just another apply, done the normal way: in CI, reviewed, versioned, deployed through the role.\nSo the account baseline doesn\u0026rsquo;t need to be bundled into the bootstrap. It needs to be downstream of it. Bootstrap builds the on-ramp; it doesn\u0026rsquo;t also have to be the motorway.\nA narrow module stays re-runnable There\u0026rsquo;s a practical payoff to the narrowness, and it\u0026rsquo;s about fear.\nBootstrap is the one stack that can\u0026rsquo;t be applied through CI, because it\u0026rsquo;s what creates the CI identity in the first place. 
It runs locally, by a human, rarely. That\u0026rsquo;s exactly the kind of operation you want to be small, boring, and safe to repeat.\nA bootstrap module that also did account hardening would be a large, stateful thing managing dozens of resources. Re-running it would be a held-breath operation. Keeping it to three concerns keeps it the opposite: a small stack you can read top to bottom, re-run without anxiety, and reason about completely. The narrowness isn\u0026rsquo;t minimalism for its own sake. It\u0026rsquo;s what keeps the one human-applied stack trustworthy.\nThe boundary is the feature It\u0026rsquo;s tempting to judge a module by how much it does. A bootstrap module is the case where that\u0026rsquo;s exactly backwards. Its value is in how cleanly it stops.\nterraform-aws-bootstrap does the bare minimum to make an account ready for the next apply, writes down everything it refuses to do, and hands off to a downstream stack for all of it. The next post follows the trickiest of its three jobs: the state backend has a genuine chicken-and-egg problem, because it has to store Terraform state in a bucket Terraform hasn\u0026rsquo;t created yet.\nWhere this leaves us A fresh AWS account needs a long list of things before it\u0026rsquo;s safe, and the obvious move is one big module that does the lot. terraform-aws-bootstrap deliberately does only three: a state backend, a CI identity, and an account-scrub config. Everything else is written down as out of scope.\nThe boundary is the design. The excluded work belongs in a downstream stack applied through the CI role bootstrap creates, so hardening is just a normal reviewed apply rather than a bootstrapping special case. And keeping the one human-run, locally-applied stack small is what keeps it safe to re-run. 
A bootstrap module is judged by where it stops.\n","date":"2026-05-01T00:00:00Z","image":"/the-bootstrap-that-does-almost-nothing/cover-the-bootstrap-that-does-almost-nothing.png","permalink":"/the-bootstrap-that-does-almost-nothing/","title":"The bootstrap that does almost nothing"},{"content":"go-tool-base has feature flags: switches that decide which built-in commands are live in a given run. rust-tool-base has those too. But it also has a second, completely separate kind of flag, and the difference between them is one of those distinctions that\u0026rsquo;s obvious the moment you see it and dangerously easy to conflate before you do. One decides what a command does. The other decides whether a chunk of code is in the binary at all.\nA workspace of crates Before the flags, the shape that makes them possible. go-tool-base is one Go module with packages under pkg/. rust-tool-base is a Cargo workspace of seventeen crates: rtb-app, rtb-config, rtb-cli, rtb-vcs, rtb-ai, rtb-mcp, rtb-docs, rtb-telemetry, and so on, with an umbrella crate called rtb that re-exports the public surface.\nThat isn\u0026rsquo;t tidiness for its own sake. Each subsystem being a separately compilable crate is what gives you a unit you can include or exclude wholesale. Hold onto that, because it\u0026rsquo;s the hinge for everything below.\nThe flag go-tool-base already has go-tool-base has feature flags, and I\u0026rsquo;d describe them as runtime flags. A tool built on it can enable or disable built-in commands:\nprops.SetFeatures( props.Disable(props.InitCmd), props.Enable(props.AiCmd), ) At startup the framework resolves that set and decides which commands are reachable for this run. The init command might be present in the binary but switched off; the ai command might be switched on. It\u0026rsquo;s about the user-facing surface: which commands exist for someone typing --help.\nrust-tool-base keeps this idea. 
A command carries a CommandSpec with an optional feature field, and the runtime decides whether a feature-gated command is reachable. Same purpose: shape the surface per invocation.\nIf that were the whole story, there\u0026rsquo;d be nothing to write. The reason there\u0026rsquo;s a post is the other kind of flag, which Rust makes available and Go really doesn\u0026rsquo;t.\nThe flag Rust adds Cargo features are a compile-time mechanism. The rtb umbrella crate declares them like this:\n[features] default = [\u0026#34;cli\u0026#34;, \u0026#34;update\u0026#34;, \u0026#34;docs\u0026#34;, \u0026#34;mcp\u0026#34;, \u0026#34;credentials\u0026#34;, \u0026#34;tui\u0026#34;] cli = [\u0026#34;dep:rtb-cli\u0026#34;] update = [\u0026#34;dep:rtb-update\u0026#34;] ai = [\u0026#34;dep:rtb-ai\u0026#34;, \u0026#34;rtb-docs?/ai\u0026#34;] vcs = [\u0026#34;dep:rtb-vcs\u0026#34;] telemetry = [\u0026#34;dep:rtb-telemetry\u0026#34;] full = [\u0026#34;cli\u0026#34;, \u0026#34;update\u0026#34;, \u0026#34;docs\u0026#34;, \u0026#34;mcp\u0026#34;, \u0026#34;ai\u0026#34;, \u0026#34;credentials\u0026#34;, \u0026#34;tui\u0026#34;, \u0026#34;telemetry\u0026#34;, \u0026#34;vcs\u0026#34;] Each subsystem is an optional crate dependency, and a feature switches it on. This is a different kind of switch entirely, and the difference is the whole point.\nA runtime flag decides what a command does while the program runs. The code is in the binary either way; the flag just gates it.\nA Cargo feature decides what\u0026rsquo;s in the binary in the first place. Build a tool without the vcs feature and rtb-vcs is not compiled. Its dependencies are not compiled. gix, the pure-Rust Git implementation rtb-vcs pulls in, roughly two and a half megabytes of it, is not compiled and not linked. It isn\u0026rsquo;t switched off in the binary. It was never in the binary. 
The compiler never even saw it.\nThat\u0026rsquo;s something a runtime flag cannot do, because by the time anything runs, the binary already exists with everything in it.\nTwo axes, kept separate So rust-tool-base has two flag systems answering two genuinely different questions.\nCargo features answer: what is this binary made of? They\u0026rsquo;re decided when you build the tool, in Cargo.toml. They control compilation, binary size, dependency surface, and compile time. A tool that never touches Git builds without vcs and is smaller, faster to compile, and has a smaller dependency tree to audit. A tool that wants everything turns on full.\nRuntime feature flags answer: what can the user do in this run? They\u0026rsquo;re decided as the program starts. They control which commands appear, which paths are reachable.\nThese could have been mashed into one mechanism, and it would have been a mistake. The app-context design notes are blunt about it: feature gating doesn\u0026rsquo;t belong on the per-command context object, because a feature-gated command \u0026ldquo;either exists or doesn\u0026rsquo;t\u0026rdquo; rather than changing its behaviour mid-run. Compile-time composition is one decision, made by the person building the tool. Runtime gating is another, made per invocation. Conflating them would mean you couldn\u0026rsquo;t reason cleanly about either.\nThe Go version of this had to be hand-built This isn\u0026rsquo;t a thing Go simply lacks. I wrote a whole post about how go-tool-base keeps its optional keychain dependency out of binaries that don\u0026rsquo;t want it, using a blank import and the linker\u0026rsquo;s dead-code elimination. It works. But it was a piece of deliberate engineering for one dependency, and getting it right took care.\nCargo features make that same outcome a first-class, declarative thing, and not for one dependency but for every subsystem the framework has. You don\u0026rsquo;t engineer the exclusion. 
You name a feature and leave it off. The crate, and its whole subtree, stays out. Rust\u0026rsquo;s build system was designed for exactly this, and rust-tool-base leans on it across the entire workspace rather than hand-rolling it once.\nWhere this leaves us go-tool-base has runtime feature flags: they decide, per invocation, which built-in commands are reachable. rust-tool-base keeps that, and adds a second kind that Rust makes available.\nCargo features decide what the binary is compiled from. Each of the framework\u0026rsquo;s seventeen crates is an optional dependency, and a feature switched off means that crate and its entire dependency subtree are never compiled or linked. A runtime flag gates what code does; a Cargo feature gates whether code is there at all. Two axes, two questions, deliberately kept as separate systems.\n","date":"2026-04-30T00:00:00Z","image":"/two-kinds-of-feature-flag/cover-two-kinds-of-feature-flag.png","permalink":"/two-kinds-of-feature-flag/","title":"Two kinds of feature flag"},{"content":"\u0026ldquo;It\u0026rsquo;s written in Rust\u0026rdquo; gets thrown around as if it were a memory-safety guarantee. It mostly isn\u0026rsquo;t. Rust is memory-safe by default, which is a wonderful thing, but the unsafe keyword exists precisely so any crate, any module, can step outside that default when it needs to. So \u0026ldquo;written in Rust\u0026rdquo; really means \u0026ldquo;mostly safe, probably\u0026rdquo;. rust-tool-base makes the stronger claim about its own code, and gets the compiler to enforce it.\nSafe by default is not the same as safe People reach for Rust because of memory safety, and the reputation is earned. Write ordinary Rust and the compiler will not let you have a use-after-free, a data race, or a buffer overrun. That\u0026rsquo;s the default, and it\u0026rsquo;s a very good default.\nBut it\u0026rsquo;s a default, and defaults can be turned off. 
Rust has an unsafe keyword precisely so that, when you genuinely need to, you can dereference a raw pointer, call into C, or tell the compiler you\u0026rsquo;ve upheld an invariant it can\u0026rsquo;t check itself. Inside an unsafe block, the guarantees are yours to maintain, not the compiler\u0026rsquo;s to enforce.\nThat keyword has to exist. Some of the most foundational crates in the ecosystem are built on it, carefully. But it means a fact worth being precise about: a project being \u0026ldquo;written in Rust\u0026rdquo; tells you its code is mostly safe. It does not tell you the project\u0026rsquo;s own code contains no unsafe. Those are different claims, and only the second one is a guarantee.\nrust-tool-base makes the second claim about its own code, and has the compiler back it up.\nforbid, not just deny The mechanism is one line at the top of every crate:\n#![forbid(unsafe_code)] unsafe_code is a lint, and Rust lints have levels. The interesting choice is forbid rather than deny, because the two are not the same strength.\ndeny makes the lint an error. But it\u0026rsquo;s an error a downstream module can locally override. Anyone can write #[allow(unsafe_code)] on a function or a block and the deny is lifted right there. As a policy, deny is \u0026ldquo;don\u0026rsquo;t do this unless you really mean to\u0026rdquo;, and \u0026ldquo;unless you really mean to\u0026rdquo; is a door.\nforbid is the strict one. It makes the lint an error and it makes that error impossible to override from inside the crate. A module cannot #[allow] its way back out. Once a crate root says #![forbid(unsafe_code)], there\u0026rsquo;s no unsafe anywhere in that crate, and no local exception can be carved out. The compiler simply refuses.\nSo every rust-tool-base crate that ships in a built tool forbids unsafe at its root. Not \u0026ldquo;discourages\u0026rdquo;. 
Cannot contain it.\nThe one honest subtlety There\u0026rsquo;s a wrinkle, and it\u0026rsquo;s worth showing rather than hiding, because it\u0026rsquo;s where the design got specific.\nThe workspace sets unsafe_code = \u0026quot;deny\u0026quot; as the baseline for everything, including test files. But test code occasionally has a real need for unsafe. In the 2024 edition, std::env::set_var became unsafe, because mutating the process environment isn\u0026rsquo;t thread-safe, and a test that exercises environment-driven configuration has to call it.\nSo the split is deliberate. The workspace-wide level is deny, which a test file can locally #[allow] when it genuinely needs that one environment call. But every production lib.rs and main.rs additionally carries #![forbid(unsafe_code)], and forbid cannot be relaxed. Test scaffolding gets a controlled, visible exception for a specific standard-library call. Shipping code gets none. The guarantee that matters, \u0026ldquo;the code in the binary contains no unsafe\u0026rdquo;, holds, and the place it\u0026rsquo;s slightly loosened is exactly the place that never reaches a user.\nWhat the guarantee is actually worth Two things, one for users and one for reviewers.\nFor users: an entire family of bug is ruled out of first-party code mechanically. Use-after-free, double-free, data races on shared memory, reading off the end of a buffer. These are the classic memory-safety vulnerabilities, and in a crate that forbids unsafe they cannot originate, because the constructs that produce them cannot be written. That\u0026rsquo;s not careful coding. It\u0026rsquo;s the compiler refusing to build anything else.\nFor reviewers: the cost of an unsafe block is mostly the review burden it carries. Every one is a spot where a human has to check, by hand, that an invariant holds, and has to re-check it whenever nearby code changes. A crate that forbids unsafe has zero of those. 
There\u0026rsquo;s no unsafe block to audit, ever, because the compiler guarantees there isn\u0026rsquo;t one.\nI\u0026rsquo;ll be straight about the boundary: this is a promise about rust-tool-base\u0026rsquo;s own code. Its dependencies are another matter, and some of them do contain unsafe, correctly. Keeping that side honest is a different job, done by vetting the dependency tree and gating it in CI. Within first-party code, though, the guarantee is real, and there\u0026rsquo;s no Go equivalent to it. Go has an unsafe package, but nothing that lets a codebase prove, to the compiler, that it never touches it.\nThe bottom line Rust is memory-safe by default, but the unsafe keyword exists so that default can be set aside. \u0026ldquo;Written in Rust\u0026rdquo; therefore does not by itself mean a project\u0026rsquo;s own code contains no unsafe.\nrust-tool-base makes the stronger claim about its own code. Every crate root carries #![forbid(unsafe_code)], and forbid, unlike deny, cannot be overridden from inside the crate. Test files get a narrow, visible deny-level exception for the one standard-library call that needs it; shipping code gets none. The payoff is a whole class of memory-safety bug ruled out of first-party code by construction, and not one unsafe block left for a reviewer to audit.\n","date":"2026-04-28T00:00:00Z","image":"/a-framework-that-contains-no-unsafe/cover-a-framework-that-contains-no-unsafe.png","permalink":"/a-framework-that-contains-no-unsafe/","title":"A framework that contains no unsafe"},{"content":"A config file changes. Someone edits a setting, rotates a credential, flips a feature flag. How does the running process find out? For most processes the answer is blunt: it doesn\u0026rsquo;t, until you restart it. For a short-lived CLI that\u0026rsquo;s completely fine. For a long-running service, \u0026ldquo;just restart it\u0026rdquo; is a much bigger ask than it sounds.\nThe default answer is a restart Configuration lives in a file. 
The file changes: someone edits a setting, rotates a credential, flips a feature flag. How does the running process find out?\nOverwhelmingly, the honest answer is that it doesn\u0026rsquo;t. A process reads its config once, at startup, and that snapshot is frozen for the life of the process. Change the file and nothing happens until you restart, at which point a fresh process reads the fresh file.\nFor a short-lived CLI invocation that\u0026rsquo;s completely fine. It reads config, does its job, exits, and the next invocation reads whatever the file says then. But the same frameworks are also used to build long-running services, and for a service \u0026ldquo;just restart it\u0026rdquo; is not the small thing it sounds like.\nWhat a restart actually costs Restarting a long-running service means every open connection drops. Any in-flight request is lost, or has to be retried by whoever sent it. Caches that took real time to warm are cold again. There\u0026rsquo;s a window, short but real, where the service simply isn\u0026rsquo;t serving.\nIf the thing you changed was a log level, or a feature flag, or a timeout, you\u0026rsquo;ve paid a disruption wildly out of proportion to the change. And the calculation only gets worse as the service gets more important, because the services you least want to bounce on a whim are exactly the ones that matter most.\nHot-reload: re-read in place Hot-reload is the alternative, and both go-tool-base and rust-tool-base support it.\nThe process doesn\u0026rsquo;t read config once and freeze it. It watches the config file. When the file changes, it re-reads it, re-applies it, and carries on running. No new process, no dropped connections, no cold start. The change lands in the live process.\nThe shape is the same in both frameworks:\nA file watcher notices the config file changed. Underneath, this is the operating system\u0026rsquo;s own file-notification facility, inotify on Linux and its equivalents elsewhere. 
rust-tool-base reaches it through the notify crate; go-tool-base, through the watcher built into Viper. A debounce step waits for the writes to settle. Saving a file is often several separate operations, and you don\u0026rsquo;t want to reload three times for one edit. The config is re-parsed from disk. The new config is swapped in atomically. Observers are notified, so the subsystems that care can react. Steps four and five are the ones worth slowing down on, because they\u0026rsquo;re where a naive hot-reload quietly goes wrong.\nThe two details that make it safe The atomic swap. You do not mutate the live config object in place. A reader on another thread, partway through reading it, would see a torn mix of old and new values, and that\u0026rsquo;s a genuinely nasty class of bug. Instead the process builds a new, complete config value and swaps the pointer to it in a single atomic operation. Any reader sees either the entire old config or the entire new one, never a blend. rust-tool-base does this with arc-swap; go-tool-base does the equivalent. Reads stay cheap and lock-free, and an update is one pointer swap.\nThe observer notification. Re-reading the file isn\u0026rsquo;t the end of the job. Some subsystems have to do something when config changes: a connection pool resizes, a logger changes level, a rate limiter takes a new ceiling. So a hot-reload system has to let those subsystems subscribe. rust-tool-base hands observers a watch::Receiver, a channel that always holds the latest value; go-tool-base exposes an Observable interface. A subsystem subscribes once and reacts every time config changes, for the life of the process.\nWhere this earns its keep: a Kubernetes pod Hot-reload is a nicety on a developer\u0026rsquo;s laptop. 
Inside a Kubernetes pod it becomes genuinely valuable, and the reason is a neat fit between how Kubernetes delivers config and how a file watcher works.\nIn Kubernetes you don\u0026rsquo;t usually bake configuration into the container image. It lives in ConfigMap and Secret objects, and the clean way to consume them is to mount them as volumes. Mount a ConfigMap as a volume and each key becomes a file in the pod\u0026rsquo;s filesystem.\nHere\u0026rsquo;s the part that connects to everything above. When you update that ConfigMap or Secret, Kubernetes does not restart your pod. The kubelet notices the object changed and rewrites the projected files inside the still-running pod. The files on disk change underneath a process that never stopped.\nThat file rewrite is exactly the event a hot-reload watcher exists to catch. So the whole chain becomes:\nYou kubectl apply an updated ConfigMap, or rotate a Secret. The kubelet updates the projected files inside the pod. The framework\u0026rsquo;s file watcher sees the write. The config is re-parsed, swapped in atomically, and observers are notified. The new configuration is live, and the pod never cycled. You\u0026rsquo;ve changed a running service, in a running pod, with no rollout, nothing terminated and recreated, no dropped traffic. Rotate a database credential, raise a log level to debug an incident in progress, flip a feature flag: all of it live. For a service where a restart is the very thing you\u0026rsquo;re trying hard to avoid, the kind of long-running service these frameworks are built for, that\u0026rsquo;s the difference between a config change being routine and being an event.\nThe honest caveats Two things, so this doesn\u0026rsquo;t read as magic.\nFirst, not everything can be hot-reloaded. Some configuration genuinely needs a restart: the port a server binds to, the size of a thread pool, anything wired up exactly once at process start. 
Hot-reload covers the large category of settings a subsystem can re-read and re-apply; it doesn\u0026rsquo;t abolish restarts. A config system worth its salt is clear about which settings are live and which are not.\nSecond, a Kubernetes gotcha that catches people out. The in-place file update happens for ConfigMaps and Secrets mounted as volumes. Consume the same ConfigMap as environment variables instead, and those are fixed when the container starts and never update, short of a restart. If you want hot-reload in a pod, mount config and secrets as files, not env vars. And even with volumes the update isn\u0026rsquo;t instant: the kubelet syncs on a period, around a minute by default, so a reload is \u0026ldquo;within a minute or so\u0026rdquo;, not \u0026ldquo;the moment you hit apply\u0026rdquo;.\nWhat it comes down to A config file changes, and the default way to pick it up is to restart the process. For a long-running service that restart costs dropped connections, lost work and a cold start, often for a change as small as a log level.\ngo-tool-base and rust-tool-base both support hot-reload instead: a file watcher catches the change, the config is re-parsed and swapped in atomically so no reader sees torn state, and observers are notified so subsystems can react, all in a live process. The setting where it pays off most is a Kubernetes pod, where ConfigMaps and Secrets mounted as volumes are rewritten in place by the kubelet and the watcher catches that write directly. 
Mount them as volumes rather than env vars, allow for the kubelet\u0026rsquo;s sync delay, accept that some settings still need a restart, and within those limits \u0026ldquo;the config changed\u0026rdquo; stops meaning \u0026ldquo;cycle the pod\u0026rdquo;.\n","date":"2026-04-27T00:00:00Z","image":"/reloading-config-without-a-restart/cover-reloading-config-without-a-restart.png","permalink":"/reloading-config-without-a-restart/","title":"Reloading config without a restart"},{"content":"I left a door open a couple of posts ago, and it\u0026rsquo;s been quietly bothering me ever since. When I wrote about verifying your own downloads, I was honest that a checksum sitting next to the binary only catches accidents. Anyone who can compromise the release platform can swap the binary and the checksum together, and the tool will happily verify one fake against the other.\nClosing that gap needs a signature. And a signature, it turns out, needs a surprising amount of infrastructure standing behind it. This is the first post about building that.\nThe door the last post left open A while back I wrote about verifying your own downloads: go-tool-base\u0026rsquo;s self-update command now checks the SHA-256 of every binary it downloads against the release\u0026rsquo;s published checksums.txt before installing it.\nThat post was honest about its own ceiling. A checksum file hosted next to the binary it describes shares a trust root with that binary. Both come from the same release, on the same platform. Corruption, truncation, a CDN serving a stale object: a same-origin checksum catches all of those, because they\u0026rsquo;re accidents and the checksum wasn\u0026rsquo;t part of the accident. What it can\u0026rsquo;t catch is an attacker who\u0026rsquo;s compromised the release platform itself. 
Someone who can replace the binary can replace checksums.txt in the same breath, and the tool will cheerfully verify the malicious download against the malicious checksum and call it good.\nThe post named the fix and then deferred it: a signature whose trust root sits somewhere the release platform can\u0026rsquo;t reach. \u0026ldquo;That\u0026rsquo;s the next phase of this work.\u0026rdquo; This series is that phase.\nWhat a signature actually needs It\u0026rsquo;s worth being precise about why a signature helps where a checksum doesn\u0026rsquo;t, because it\u0026rsquo;s easy to wave the word \u0026ldquo;signature\u0026rdquo; around and assume it settles everything.\nA signature closes the gap only under two conditions. The verifying key, the public half, must reach the user by a path the release platform doesn\u0026rsquo;t control. And the signing key, the private half, must live somewhere the release platform can\u0026rsquo;t reach.\nThe second condition is the one people skip. If the signing key sits in the same CI system that builds the release, you\u0026rsquo;ve gained almost nothing. An attacker who owns the CI owns the key, and a key they own will sign whatever they hand it. The signature verifies perfectly and means precisely nothing. A signature is only worth the distance between the signing key and the thing being signed. Put them in the same place and the distance is zero.\nSo the signing key has to live in a different security domain from the release pipeline. Not a different folder. A different account, with a different blast radius, that the release platform has no standing access to.\n\u0026ldquo;Just sign the binary\u0026rdquo; is not a small feature That reframes a line item that sounds tiny. 
\u0026ldquo;Sign the release binary\u0026rdquo; unpacks into a list:\nthere must be a private signing key; it must live outside the release platform, in its own security domain; it must be access-controlled, audited, and protected from exfiltration; only the release pipeline may ask it to sign, and only by proving a short-lived, federated identity, never by holding a copy of the key. That\u0026rsquo;s not a feature you bolt onto a CLI. That\u0026rsquo;s infrastructure.\nThe shape of it: a cloud account, with the key held in a managed key service so the private key material never exists as a file on a disk that anyone, me included, can copy. The release pipeline authenticates to that account as itself, briefly, and asks the key service to produce a signature. The key never moves.\nBut an account you\u0026rsquo;re going to trust with a signing key is itself something you have to get right first. An account with a weak baseline, no audit trail, and long-lived credentials lying around is not a safe home for the most security-sensitive key in the whole system. Before the key can move in, the house has to be built and the locks have to actually work.\nWhat this series builds So this turned into a rather longer project than \u0026ldquo;add a signature\u0026rdquo;, and the series follows it in order.\nIt starts with bootstrapping a fresh AWS account: the deliberately minimal first tofu apply, and the remote state backend that has a genuine chicken-and-egg problem. Then the credential question, which is the heart of it: how a CI pipeline deploys to AWS with no stored access key at all. Then hardening the account, so it\u0026rsquo;s genuinely safe to hold something valuable. Then the discipline of deploying changes to it: plans reviewed before they\u0026rsquo;re applied. Then the shared tooling that makes all of it repeatable.\nEvery one of those pieces exists for the same reason. 
The signing key needs somewhere to live, and somewhere safe is not a default you\u0026rsquo;re handed. It\u0026rsquo;s a thing you build, deliberately, before you have anything worth protecting in it.\nThe series ends where the verifying-downloads post pointed: a signing service whose key the release platform can\u0026rsquo;t touch, so a self-updating tool can finally verify that the binary it\u0026rsquo;s about to become is genuinely the one I published.\nThe upshot go-tool-base\u0026rsquo;s self-update verifies downloads against a checksum, and a same-origin checksum stops accidents but not a compromise of the release platform. The fix is a signature, and a signature is only worth the distance between its signing key and the release pipeline.\nHolding that key safely means a private key that never leaves a managed key service, in a separate cloud account, reached only by a short-lived federated identity. That\u0026rsquo;s infrastructure, and a safe account is something you build before you trust it with anything. The rest of this series builds it, piece by piece, right up to the signing service itself.\n","date":"2026-04-26T00:00:00Z","image":"/a-signing-key-needs-somewhere-to-live/cover-a-signing-key-needs-somewhere-to-live.png","permalink":"/a-signing-key-needs-somewhere-to-live/","title":"A signing key needs somewhere to live"},{"content":"A vulnerability scanner gives you a yes or a no. Is there a known advisory on a path you actually use? Yes, or no. That\u0026rsquo;s genuinely useful, and you should run one. But it\u0026rsquo;s a snapshot, taken on the day you ask, and supply-chain risk in a framework is a bigger and more ongoing thing than a single yes-or-no can capture.\nSo rust-tool-base treats its whole dependency tree as something to have a policy about, not something to scan and forget.\nA scanner answers one question When I had go-tool-base security-audited, part of the routine was running a vulnerability scanner over the dependencies. 
Go has a good one. It looks at your dependency graph, cross-references known advisories, and tells you whether any of them reach code you actually call.\nThat\u0026rsquo;s useful and you should do it. But notice the shape of what it gives back: essentially a yes or a no. Either there\u0026rsquo;s a known vulnerability on a reachable path or there isn\u0026rsquo;t. It answers one question, on the day you ask it.\nSupply-chain risk in a framework is broader than that one question, because a framework drags its entire dependency tree into every tool built on it. rust-tool-base treats the whole tree as something to have a policy about, and the tool for that is cargo-deny.\nA gate, not a scan cargo-deny reads a deny.toml and checks the dependency graph against four kinds of rule.\nLicences. There\u0026rsquo;s an allowlist: MIT, Apache-2.0, the BSD variants, ISC, a handful of others. Every transitive crate\u0026rsquo;s licence has to be on it. A dependency that pulls in something copyleft, or something with no licence at all, fails the build. You find out the first time it enters the tree, not during a release scramble when someone finally reads the legal implications.\nAdvisories. It checks the RustSec advisory database, and yanked crates are set to deny, so a dependency that\u0026rsquo;s been pulled from the registry stops CI.\nBans. Wildcard version requirements (version = \u0026quot;*\u0026quot;) are denied outright, because a dependency that floats to whatever\u0026rsquo;s newest is a supply-chain hole by construction. Duplicate versions of the same crate get surfaced too.\nSources. Crates may only come from the official registry. An unknown registry or a stray git dependency is denied. Nothing sneaks in from a URL.\nThat\u0026rsquo;s a gate. 
It encodes, as rules in a file, what the project will and won\u0026rsquo;t accept into its dependency tree, and it enforces them on every build instead of once per audit.\nThe honest part is the waiver list Here\u0026rsquo;s the thing every real project runs into. Sooner or later there\u0026rsquo;s an advisory you genuinely can\u0026rsquo;t fix this week. It\u0026rsquo;s against a crate three levels down your tree. The fix needs an upstream release that hasn\u0026rsquo;t happened. The crate is scheduled to be reworked two milestones from now anyway. The gate is going to fail, and the work to satisfy it honestly isn\u0026rsquo;t available to you yet.\nThe lazy response is a blanket ignore: silence the advisory, move on, forget. Now your gate has a hole in it that nobody remembers opening.\nrust-tool-base\u0026rsquo;s deny.toml does something better. Every waiver in the ignore list is a documented record. Each one carries a comment that names the crate, traces the exact dependency path that reaches it, gives the reason, and names the condition that lifts it:\nignore = [ # `instant` - reached via async-openai -\u0026gt; backoff -\u0026gt; rtb-ai (v0.3). \u0026#34;RUSTSEC-2024-0384\u0026#34;, # `paste` - reached via ratatui -\u0026gt; rtb-docs (v0.2) / rtb-tui (v0.4). \u0026#34;RUSTSEC-2024-0436\u0026#34;, # ... ] The file states the policy out loud: \u0026ldquo;Every waiver points at a deferred stub crate that will be reworked before its ship milestone. Lift each waiver when the owning crate lands its v0.1.\u0026rdquo;\nSome waivers go further and carry a structured reason field, so the why travels with the entry rather than living only in a comment above it:\n{ id = \u0026#34;RUSTSEC-2025-0140\u0026#34;, reason = \u0026#34;gix-date via gix is a stub dependency; rtb-vcs v0.5 will upgrade\u0026#34; }, Read that list and you don\u0026rsquo;t see a project that quietly stopped caring about seven advisories. 
You see seven advisories the project knows about, can trace, and has tied to a specific milestone. The waiver has an expiry condition. When rtb-vcs reaches v0.5, that gix entry is meant to come out, and the comment is the reminder that it should.\nWhy this is the bit to copy A gate that can\u0026rsquo;t be relaxed is a gate people route around. They\u0026rsquo;ll find the broadest possible ignore and use it, because the alternative is being blocked on someone else\u0026rsquo;s release. The pressure to do that is real, and it\u0026rsquo;s not unreasonable.\nSo the design that actually holds up isn\u0026rsquo;t a stricter gate. It\u0026rsquo;s a gate with an honest, structured escape hatch: you can waive an advisory, but a waiver costs you a documented record with a dependency path and an expiry condition. That price is small enough that nobody routes around it, and high enough that waivers don\u0026rsquo;t accumulate silently. The ignore list stays readable, and every line in it is something you could defend out loud.\nSupply-chain hygiene framed this way isn\u0026rsquo;t an audit you survive once a year. It\u0026rsquo;s bookkeeping: a ledger of what you accepted, why, and when each exception is due to close. Which, now I write it down, is just the Boy Scout rule again, pointed at a dependency tree. Leave it tidier than you found it, and write down the bits you couldn\u0026rsquo;t tidy yet.\nWhere this leaves us A vulnerability scanner answers one question on one day. cargo-deny is a standing policy gate: licences against an allowlist, advisories and yanked crates denied, wildcard versions banned, sources restricted to the official registry, enforced on every build.\nThe part of rust-tool-base\u0026rsquo;s setup worth copying is the waiver list. Every advisory that can\u0026rsquo;t be fixed yet is recorded with its crate, its dependency path, its reason and the milestone that removes it. 
A waiver is a dated note, not a shrug, and that\u0026rsquo;s what keeps the gate honest enough that nobody actually wants to bypass it.\n","date":"2026-04-26T00:00:00Z","image":"/waivers-with-an-expiry-date/cover-waivers-with-an-expiry-date.png","permalink":"/waivers-with-an-expiry-date/","title":"Waivers with an expiry date"},{"content":"go-tool-base configures things with functional options, and if you forget a required one, the best case is a runtime failure and the worst case is an empty value sailing silently into everything downstream. Most builder patterns share the same hole. rust-tool-base closes it in a way I find genuinely delightful: the .build() method simply doesn\u0026rsquo;t exist until you\u0026rsquo;ve set every required field.\nWhen is a required field actually required Every framework has constructors with a mix of required and optional inputs. An Application in rust-tool-base needs tool metadata and a version. It optionally takes a custom config type, extra commands, feature toggles. The metadata needs a name and a summary; a description and a help channel are optional.\nThe interesting question is when \u0026ldquo;required\u0026rdquo; gets enforced. There are really only two moments available: when the program runs, or when it compiles. Most APIs pick the first without ever framing it as a choice.\nHow go-tool-base does it go-tool-base uses functional options, the standard Go pattern:\ntool := props.New( props.WithName(\u0026#34;mytool\u0026#34;), props.WithVersion(version), ) New takes a variadic list of options and applies them. It\u0026rsquo;s flexible and it reads well. But look at what the type actually says. New accepts zero or more options. The signature is satisfied by passing nothing at all. If WithName is required, nothing in the type system knows that. 
Forget it and the code compiles cleanly, and you find out when the program runs, or worse, when it doesn\u0026rsquo;t visibly fail but quietly carries an empty name into everything downstream.\nA plain builder is no better here. builder.name(\u0026quot;mytool\u0026quot;).build() and builder.build() are both perfectly valid calls as far as the compiler is concerned. The builder hopes you set the name. It can check at the end and return an error, but that check still happens at runtime.\nIn every one of these the required-ness of a field is a fact that lives in documentation and in the author\u0026rsquo;s head, not in the code.\nTypestate: putting \u0026ldquo;required\u0026rdquo; in the type rust-tool-base builds these with bon, and the pattern it generates is a typestate builder. The idea is that the builder\u0026rsquo;s type changes as you call it, and that type tracks which required fields you\u0026rsquo;ve set so far.\nlet metadata = ToolMetadata::builder() .name(\u0026#34;mytool\u0026#34;) .summary(\u0026#34;my CLI tool\u0026#34;) .build(); ToolMetadata::builder() returns a builder in a state that records \u0026ldquo;name not set, summary not set\u0026rdquo;. Calling .name(...) consumes that builder and returns a different type, one whose state records \u0026ldquo;name set\u0026rdquo;. Calling .summary(...) does the same for the summary.\nThe part that matters is .build(). It isn\u0026rsquo;t a method on the builder in general. It only exists on the builder type that represents \u0026ldquo;every required field has been set\u0026rdquo;. So this:\nlet metadata = ToolMetadata::builder() .summary(\u0026#34;my CLI tool\u0026#34;) .build(); doesn\u0026rsquo;t compile. Not because a runtime check fired, but because in the state \u0026ldquo;name not set\u0026rdquo; there\u0026rsquo;s no .build() method to call in the first place. The compiler stops you, and the error points straight at the missing .name(...).\nOptional fields stay optional. 
You can call .description(...) or skip it, and .build() is reachable either way, because the description was never part of the state that gates it. The required and the optional are genuinely different in the type, which is exactly the distinction the functional-options version could only keep in a comment.\nApplication::builder() works the same way. It won\u0026rsquo;t produce an Application until it has metadata and a version, and \u0026ldquo;won\u0026rsquo;t\u0026rdquo; there means the method is absent, not that a check returns Err.\nWhy the moment matters Moving the check from run time to compile time changes who finds the mistake, and when.\nA runtime check finds it when that code path executes, which might be in a test, might be in CI, might be on a user\u0026rsquo;s machine at the worst possible moment. A compile-time check finds it the instant you write it, in the editor, before anything has run at all. The same mistake, caught at the cheapest possible point instead of one of the expensive ones.\nIt also changes what the API documents about itself. A functional-options constructor can\u0026rsquo;t tell you, from its signature alone, which options you must pass. A typestate builder can, because the set of methods available to you at each step is the documentation. You literally cannot reach .build() without having been walked past every required field on the way.\nThis is one of those places where Rust\u0026rsquo;s type system earns its reputation. The builder isn\u0026rsquo;t more careful than the Go version. It\u0026rsquo;s that \u0026ldquo;this field is required\u0026rdquo; stopped being a convention and became something the compiler enforces. (Another entry, if you\u0026rsquo;re keeping score from the porting post, in the column of outcomes that survived while the Go mechanism got left behind.)\nThe short version Required fields have to be enforced somewhere. 
Functional options and ordinary builders enforce them at runtime, if at all, because .build() is always callable and the type system never learns which inputs were mandatory.\nrust-tool-base uses typestate builders generated by bon. The builder\u0026rsquo;s type changes as you set fields, and .build() only exists once every required field is present. Forgetting one is a compile error that names the missing call, not a runtime surprise. The required-versus-optional distinction stops being a comment and becomes part of the type.\n","date":"2026-04-25T00:00:00Z","image":"/a-builder-that-wont-compile-if-you-forget-a-field/cover-a-builder-that-wont-compile-if-you-forget-a-field.png","permalink":"/a-builder-that-wont-compile-if-you-forget-a-field/","title":"A builder that won't compile if you forget a field"},{"content":"I ended the last post promising to show how a Rust command registers itself when the language flatly refuses to run any of your code before main(). This is that post, and it\u0026rsquo;s a lovely example of reaching the same outcome by a completely different road.\nThe outcome I wanted to keep is self-registration.\nWhat self-registration buys A command in go-tool-base lives in its own file, and that file puts the command into the framework itself. There\u0026rsquo;s no central list of commands to keep in sync. You add a file, the command appears. You delete the file, it\u0026rsquo;s gone. Nothing else changes.\nThat property is worth protecting. The alternative, a hand-maintained registry that every new command has to be threaded into, is exactly the sort of central file that turns into a merge-conflict magnet and quietly falls out of date. So when go-tool-base moved to Rust, self-registration was firmly in the column of things that had to survive.\nThe way Go did it was not.\nHow Go does it A Go package can declare an init() function, and the runtime guarantees every init() runs before main() starts. 
A go-tool-base command file uses this to append itself to a package-level slice:\nfunc init() { registry.Register(\u0026amp;DeployCommand{}) } By the time main() runs, every command file\u0026rsquo;s init() has already fired and the registry slice is populated. It\u0026rsquo;s a tidy trick, and it leans entirely on a Go feature: code that executes before main().\nRust doesn\u0026rsquo;t have that Rust has no init(). There\u0026rsquo;s no language-blessed phase that runs your code before main(). This is a deliberate decision, not an oversight. Code running before main() across many files has no well-defined order, and a startup phase whose ordering you can\u0026rsquo;t see is a classic source of subtle, miserable bugs. Rust closed that door on purpose.\nWhich leaves a real question. If nothing runs before main(), how does a command file insert itself into a registry without a central list editing it in?\nDistributed slices The answer is a crate called linkme, and the mechanism is the linker rather than a runtime phase.\nYou declare a slice the framework will collect into:\n#[distributed_slice] pub static BUILTIN_COMMANDS: [fn() -\u0026gt; Box\u0026lt;dyn Command\u0026gt;]; A command file then contributes one entry to it:\nstruct Greet; impl Command for Greet { /* ... */ } #[distributed_slice(BUILTIN_COMMANDS)] fn register_greet() -\u0026gt; Box\u0026lt;dyn Command\u0026gt; { Box::new(Greet) } Here\u0026rsquo;s the part that makes it work. The #[distributed_slice] attribute doesn\u0026rsquo;t generate any code that runs at startup. It places each entry into a dedicated section of the compiled object file. When the linker builds the final binary, it gathers everything in that section and lays it out as one contiguous array. BUILTIN_COMMANDS is that array.\nSo by the time the program exists as a binary on disk, the registry is already assembled. main() doesn\u0026rsquo;t build it. No init() builds it. 
The linker built it, statically, as part of producing the executable. At runtime the framework iterates a slice that was complete before the process ever started.\nWhat you get from it The outcome is the one Go\u0026rsquo;s init() gave, and then a bit more.\nA command still lives in one file and still self-registers. Adding a command is still adding a file. There\u0026rsquo;s still no central list.\nBut there\u0026rsquo;s no startup phase to reason about, because there isn\u0026rsquo;t one. There\u0026rsquo;s no global mutable slice being appended to as init()s fire, because nothing is appended at runtime; the slice is immutable and finished. There\u0026rsquo;s no ordering question, because the linker isn\u0026rsquo;t running your code, it\u0026rsquo;s collecting data. And it costs nothing at runtime: assembling the registry happened at link time, so program start just reads it.\nIt\u0026rsquo;s the same idea go-tool-base had, expressed by the tool Rust actually gives you. Go reaches the registry through a controlled phase before main(). Rust reaches it without any phase at all, because the linker did the assembly while the binary was still being built. Two roads, one destination\u0026hellip; which, if you\u0026rsquo;ve been following along, is becoming the whole theme of the Rust side of this project.\nIn short Self-registration, where a command file inserts itself into the framework with no central list, is a property worth keeping. go-tool-base achieves it with a package-level init(), leaning on Go\u0026rsquo;s guarantee that such functions run before main().\nRust has no equivalent and wants none, because code running before main() has no clear ordering. rust-tool-base uses linkme distributed slices instead: each command is placed into a dedicated linker section, and the linker assembles them into one contiguous, immutable slice as it builds the binary. The registry is complete before the program runs. 
Same outcome as Go\u0026rsquo;s init(), with no life before main required.\n","date":"2026-04-24T00:00:00Z","image":"/registering-commands-without-life-before-main/cover-registering-commands-without-life-before-main.png","permalink":"/registering-commands-without-life-before-main/","title":"Registering commands without life before main"},{"content":"Way back in the introduction I promised I\u0026rsquo;d come back to the self-update integrity checks. Here we are. And the honest starting point is a slightly uncomfortable admission: for a good long while, go-tool-base\u0026rsquo;s update command was the most trusting line of code in the entire tool.\nThe most trusting line of code in the tool Self-update is a lovely feature. The user runs yourtool update, the tool fetches the latest release, swaps itself out, and they\u0026rsquo;re current. go-tool-base has had this since early on, wired to GitHub, GitLab, Bitbucket, Gitea and a few others.\nBut look closely at what that feature actually does. It reaches out to the internet, pulls down a file, and then replaces the executable that\u0026rsquo;s currently running with that file. The next time the user invokes the tool, they\u0026rsquo;re running whatever those bytes turned out to be.\nThe original implementation downloaded the release asset over HTTPS and extracted it. HTTPS gets you transport security: the bytes weren\u0026rsquo;t tampered with in flight. It tells you nothing about whether the bytes were right when they left, or whether they\u0026rsquo;re even the bytes you meant to fetch. A truncated download, a CDN cache serving a mangled object, a release asset that got swapped after the fact\u0026hellip; HTTPS waves all of those straight through. 
For the one operation in the whole tool that replaces the binary, \u0026ldquo;we didn\u0026rsquo;t check\u0026rdquo; is an uncomfortable place to be sitting.\nGoReleaser already does half the job The good news is that the build side was already producing exactly what I needed. GoReleaser, which builds go-tool-base\u0026rsquo;s releases, generates a checksums.txt for every release: one SHA-256 per published artefact, the same format sha256sum emits. It was sitting right there as a release asset and nothing was reading it.\nSo Phase 1 of the integrity work is exactly that: read it.\nWhen update downloads the platform binary, it now also fetches checksums.txt from the same release, looks up the entry for the asset it just pulled, and compares the SHA-256 of the downloaded bytes against the expected hash before anything gets extracted or installed. Mismatch, and the update aborts before it has so much as touched the installed binary. The hash comparison runs in constant time, which is more defence-in-depth than strictly necessary here, but it costs nothing and means every hash comparison in the codebase is the same and reassuringly audit-boring.\nFail open, or fail closed? The interesting design question wasn\u0026rsquo;t the hashing. It was: what do you do when there is no checksums.txt?\nPlenty of older releases predate this feature. A release might have been cut by hand without GoReleaser. If go-tool-base flatly refused to update whenever a manifest was missing, the very act of shipping this feature would brick the update path for every existing tool the moment they upgraded into it. That\u0026rsquo;s a cure worse than the disease.\nSo the default is fail-open: no manifest, log a clear warning, proceed. It matches how the existing offline-update path already behaved with its optional .sha256 sidecar, and it keeps upgrades working.\nFail-open as a default is not the same as fail-open being right for everyone, though. 
A security-sensitive tool should be able to say \u0026ldquo;no manifest, no update, full stop\u0026rdquo;. Two ways to get there:\nTool authors flip a compile-time switch (setup.DefaultRequireChecksum = true in main()) and their binary ships fail-closed from day one. End users override either way through config (update.require_checksum) or an environment variable. go-tool-base itself ships with the strict setting turned on, because a tool whose entire job is being a careful framework should hold itself to the stricter bar.\nThe honest caveat Here\u0026rsquo;s the part I want to be straight about, because security features oversell themselves constantly.\nA checksum hosted next to the binary it describes protects you from accidents. Corruption, truncation, a CDN serving stale junk, a release asset that got partially clobbered. It does not protect you from a determined attacker who\u0026rsquo;s compromised the release platform itself. If someone can replace the binary, they can replace checksums.txt in the same breath, and your tool will cheerfully verify a malicious download against a malicious manifest and pronounce it good.\nThat\u0026rsquo;s not a flaw in the implementation. It\u0026rsquo;s the inherent ceiling of same-origin integrity: the manifest and the artefact share a trust root, so they fall together. Closing that gap needs a signature whose trust root is somewhere the release platform can\u0026rsquo;t reach, a key the attacker doesn\u0026rsquo;t have. That\u0026rsquo;s the next phase of this work, and it\u0026rsquo;s a bigger piece: GPG-signing the manifest, with the public half both embedded in the binary and published independently so a single platform compromise isn\u0026rsquo;t enough.\nPhase 1 is the floor, not the ceiling. 
But it\u0026rsquo;s a floor worth having, because the overwhelming majority of real-world \u0026ldquo;the download was wrong\u0026rdquo; incidents are accidents, not attacks, and accidents are exactly what a same-origin checksum catches.\nPulling it together The update command is the most trusting code in a self-updating tool: it fetches bytes from the internet and then becomes them. go-tool-base now verifies the SHA-256 of every self-update download against the release\u0026rsquo;s own checksums.txt before installing. It fails open by default so shipping the feature doesn\u0026rsquo;t strand anyone on an un-updatable version, fails closed for tool authors who ask (go-tool-base itself does), and stays honest that a same-origin checksum stops accidents, not a platform compromise.\nVerifying your own downloads is a low bar. The point is that the previous height of that bar was zero.\n","date":"2026-04-24T00:00:00Z","image":"/verifying-your-own-downloads/cover-verifying-your-own-downloads.png","permalink":"/verifying-your-own-downloads/","title":"Verifying your own downloads: how I solved it for self-updating CLI tools"},{"content":"Rebuilding go-tool-base in Rust turned out to be the most honest design review I\u0026rsquo;ve ever sat through, and I didn\u0026rsquo;t have to do anything except keep going. Porting a framework into a language with completely different idioms forces a separation you can\u0026rsquo;t fake: the parts that survive the move are design, and the parts that don\u0026rsquo;t are just habit.\nTwo columns When you port a system between languages that don\u0026rsquo;t share idioms, every piece of it sorts itself into one of two columns, without you having to make the call.\nIn the first column is the outcome a piece of the design produces: every command receives the framework\u0026rsquo;s services, configuration is layered with a fixed precedence, commands register themselves, errors carry guidance to the user. 
In the second column is the mechanism that produced that outcome in the original language.\nThings in the first column survive the port. You rebuild them, differently, because the tool genuinely needs them. Things in the second column do not survive. You find their replacement, and the Go version turns out to have been one valid implementation of an idea, not the idea itself. Doing this for go-tool-base, mechanism by mechanism, was more honest about my own design than any amount of sitting and staring at it would have been.\nThe container go-tool-base hands every command a Props struct. It carries the logger, the config, the assets, the filesystem handle. Some of it is reached through loosely-typed accessors. It works well, and I wrote a whole post about it.\nThe outcome is column one: a command should receive one object, and that object should carry the framework\u0026rsquo;s services so the command doesn\u0026rsquo;t go assembling them itself. That survived. RTB hands every command an App.\nThe loosely-typed accessors were column two. In Rust an App is a plain struct with concrete fields, each one an Arc\u0026lt;T\u0026gt; so a clone is a few atomic increments rather than a deep copy. Nothing is keyed by string. Nothing is fetched by name and asserted to a type. The thing the container is for survived; the way Go expressed it did not.\nRegistration A go-tool-base command self-registers using a package-level init() function, which Go runs before main() and which appends the command to a global slice.\nThe outcome, column one, is that a command lives in its own file and inserts itself into the framework with no central list to edit. That\u0026rsquo;s genuinely worth keeping.\nThe init() mechanism is column two, and Rust doesn\u0026rsquo;t even offer it: Rust deliberately has no code that runs before main(). The replacement is link-time registration through distributed slices, which gets its own post next. 
Same outcome, no global mutable state, assembled by the linker rather than by a startup phase.\nConfiguration go-tool-base layers configuration with a precedence: flags over environment over file over defaults. Some of it is read back through key lookups.\nThe layering and the precedence are column one. They survived exactly. RTB layers config with the same ordering.\nThe key lookups were column two. In Rust the merged configuration is deserialised into your own serde struct, so a config value is a typed field you access like any other field, and a typo is a compile error instead of a missing key at runtime. The precedence survived; reading values back out of a string-keyed bag did not.\nThe error path go-tool-base routes every error through one handler so presentation is consistent, which I also wrote up.\nOne consistent exit for errors is column one. It survived. What didn\u0026rsquo;t survive was the handler: RTB has no error-handler object at all, because Rust\u0026rsquo;s own return-from-main convention plus a report hook does the job the handler was built to do. That one has its own post too.\nWhat the exercise was actually worth Every mechanism told the same story. The container, the registration, the config access, the error path, the cancellation signal that go-tool-base carries on a context.Context and RTB carries on a CancellationToken. In every case the thing it achieved walked across to Rust untouched, and the Go code that achieved it was left behind.\nThat\u0026rsquo;s the useful result. Before this port I couldn\u0026rsquo;t have told you, for any given pattern in go-tool-base, whether it was load-bearing design or just the idiomatic Go way to write it that day. Now I can, because each one was forced to prove itself by being rebuilt from nothing in a language that flatly wouldn\u0026rsquo;t accept the original. Whatever survived was real. 
Whatever I had to replace was always replaceable, which means it was never really the point.\nThe upshot Porting a framework into a language with different idioms separates design from habit for free. The outcome a pattern produces is design, and it survives the move. The mechanism that produced it is idiom, and it gets left behind for the new language\u0026rsquo;s equivalent.\ngo-tool-base\u0026rsquo;s Props bag, its init() registration, its key-based config access and its error handler were all idiom. The single context object, self-registration, layered precedence and a consistent error exit were all design, and all four came through to RTB intact. The next three posts take the most interesting replacements one at a time, starting with how a Rust command registers itself when the language won\u0026rsquo;t run anything before main.\n","date":"2026-04-23T00:00:00Z","image":"/what-survives-a-port/cover-what-survives-a-port.png","permalink":"/what-survives-a-port/","title":"What survives a port, and what doesn't"},{"content":"I built go-tool-base because I was sick of rebuilding the same CLI scaffolding every time I started a new Go tool. You\u0026rsquo;d think that would have taught me a lesson about doing things more than once. Apparently not, because I\u0026rsquo;ve now started building rust-tool-base: the same idea, the same itch, for Rust.\nIn my defence, there\u0026rsquo;s method in it.\nThe same itch, a different language go-tool-base exists because I kept writing the same couple of hundred lines of wiring every time I started a new Go CLI. Config loading, logging setup, an update check, an error path, a help system. None of it was the tool. All of it had to be there before the tool could be.\nLately I\u0026rsquo;ve been learning Rust, and two things collided. The first is how I tend to learn a language. 
I\u0026rsquo;ve always picked them up reasonably quickly, and the way I do it isn\u0026rsquo;t with a tutorial that builds a toy, it\u0026rsquo;s by rebuilding something whose shape I already know cold, so that every decision is about the language rather than the problem. The second is that every time I started a Rust CLI of any size, I hit the very same gap I\u0026rsquo;d already filled once in Go.\nSo rather than learn Rust on a throwaway, I decided to learn it by building rust-tool-base: the same idea, the same niche, for Rust.\nThe gap in Rust The Rust ecosystem has a well-earned reputation for sharp, focused crates and a deliberate shortage of big opinionated frameworks. clap for argument parsing, figment for layered config, tracing for logging, miette for errors, ratatui for terminal UI, reqwest and tokio underneath. Each of them is genuinely best-in-class.\nWhat nobody hands you is the assembly. Wiring those into one coherent product, and then adding self-update, AI integration, an MCP server, embedded documentation, credential handling, telemetry and a scaffolder, is real work, and it\u0026rsquo;s the same work on every project.\nThe closest existing neighbours stop short of it. cli-batteries is a thin preamble: argument parsing plus a logging subscriber plus panic and signal handling. starbase has a proper session and lifecycle model but is CLI-agnostic and shaped around the moonrepo tooling it came from. cargo-dist and cargo-release are about release packaging, not the runtime. Good tools, all of them, but none is the opinionated, full-lifecycle, scaffolded base that go-tool-base is in the Go world. That space is empty, and rust-tool-base is built to fill it.\nWhy it is not a port The obvious way to build this would be to open go-tool-base and translate it file by file. I\u0026rsquo;m not doing that, and the reason matters enough that it\u0026rsquo;s the rule the whole project is built around.\ngo-tool-base is full of Go. 
It leans on a Props struct that carries the framework\u0026rsquo;s services in loosely-typed fields. It configures things with functional options. It registers commands using package-level init(). It threads a context.Context through every call. Those are all good, idiomatic Go. Transliterated into Rust they\u0026rsquo;d become code that argues with the compiler on every single line, because Rust has its own answers to every one of those problems and they are emphatically not the Go answers.\nSo rust-tool-base reaches the same outcomes by Rust\u0026rsquo;s means. Commands still self-register, but through link-time machinery instead of init(). There\u0026rsquo;s still one context object per command, but it\u0026rsquo;s strongly typed rather than a loosely-keyed bag. Configuration is still layered, but it lands in your own typed struct instead of a string-keyed lookup. Same philosophy, same shape of product, an entirely different ecosystem underneath. The README says it plainly: it\u0026rsquo;s a sibling, not a port.\nWhy do it twice at all Three reasons, and they reinforce each other.\nThe first is plain usefulness. The next time I want a Rust CLI tool, I want the same head start go-tool-base already gives me in Go.\nThe second is the learning. Rebuilding a system I understand forces me to meet Rust\u0026rsquo;s idioms where they actually bite, not where a tutorial gently stages them. You learn ownership properly when a real design is pushing back at you.\nThe third is the one I didn\u0026rsquo;t expect, and it\u0026rsquo;s the subject of the next post. Building the same framework twice, in two languages, turns out to be the cleanest way to find out which of your original decisions were genuine design and which were merely idiom. The design survives the move. The idiom does not. 
Sorting one from the other has been the most interesting part so far.\nBoiling it down rust-tool-base is the Rust sibling of go-tool-base: the same batteries-included, scaffolded, opinionated CLI framework, aimed at the same gap, which in Rust is the gap between a pile of excellent crates and a coherent product.\nIt\u0026rsquo;s not a port. Transliterating Go idioms into Rust produces code that fights the language, so RTB reaches the same outcomes through Rust\u0026rsquo;s own mechanisms instead. The posts after this one walk through the specific cases: how commands register, how the builder works, how errors are reported, and a few things RTB can do that the Go version structurally can\u0026rsquo;t. First, though, the thing the exercise taught me about my own design.\n","date":"2026-04-22T00:00:00Z","image":"/rust-tool-base-the-same-idea/cover-rust-tool-base-the-same-idea.png","permalink":"/rust-tool-base-the-same-idea/","title":"rust-tool-base: the same idea, in a language that argues back"},{"content":"go-tool-base can stash your credentials in the OS keychain, which most people building on it are perfectly happy about. But some of them ship into regulated and air-gapped environments where the binary isn\u0026rsquo;t permitted to contain keychain or session-bus code at all\u0026hellip; not dormant, not unused, simply not there.\nSo I had a feature most users want and a minority must be able to provably not have. The way I ended up solving it is one of my favourite little bits of honest Go.\nA feature some users have to be able to not have go-tool-base needs somewhere to keep secrets: AI provider keys, VCS tokens, the occasional app password. The best home for those on a developer\u0026rsquo;s machine is the operating system\u0026rsquo;s own keychain. macOS Keychain, GNOME Keyring or KWallet on Linux via the Secret Service, Windows Credential Manager. So I wanted go-tool-base to support all three. 
(This is the keychain mode I mentioned back in the credentials post, finally getting the explanation I promised it.)\nThe Go library for that is go-keyring, and it\u0026rsquo;s good. The catch is what it drags in behind it. On Linux it talks to the Secret Service over D-Bus, which means godbus. On Windows it pulls wincred. Perfectly reasonable dependencies for a desktop tool.\nNow here\u0026rsquo;s the constraint that made this interesting. Some of the people building tools on go-tool-base don\u0026rsquo;t ship to developer laptops. They ship into regulated sectors and air-gapped deployments where a security review will scan the binary, enumerate every dependency, and ask pointed questions about anything that does inter-process communication. For those builds, \u0026ldquo;the keychain code is there but we never call it\u0026rdquo; is not an acceptable answer. The reviewer\u0026rsquo;s position, and it\u0026rsquo;s a fair one, is that code which isn\u0026rsquo;t in the binary cannot be a finding.\nSo I had a feature that most users want, and a minority of users must be able to provably not have. Same framework, same release.\nWhy I didn\u0026rsquo;t reach for a build tag The obvious Go answer is a build tag. Compile with -tags keychain to get it, leave the tag off to not. I started down that road. I even spent a while on an inverted version, a nokeychain tag, on the theory that the regulated build should be the one that has to ask, so a forgotten flag fails safe.\nIt works. It also isn\u0026rsquo;t very nice. Build tags are invisible at the call site. Nothing in the source tells you that a file only exists in some builds. The two worlds drift, because the tagged-out path isn\u0026rsquo;t compiled in your normal editor session and quietly rots. 
And the ergonomics for a downstream consumer are poor: every tool built on go-tool-base would have to know the right magic incantation and thread it through their own release pipeline correctly, forever.\nI tried a second approach too: pull the keychain backend out into a completely separate Go module. That genuinely solves the dependency question (a module you don\u0026rsquo;t require can\u0026rsquo;t contribute to your go.sum). But a separate module for one backend is clunky. Separate versioning, separate release, separate repo, all for a single file\u0026rsquo;s worth of behaviour. It felt like using a shipping container to post a letter.\nThe shape that actually fits: a registry and an init() The version I\u0026rsquo;m happy with leans on two boring, well-worn Go mechanisms and lets them do something quietly clever together.\nFirst, pkg/credentials defines a Backend interface and a registry. By default the registry holds a stub backend that politely returns \u0026ldquo;unsupported\u0026rdquo; for everything. The framework only ever talks to the registered backend, whatever that happens to be.\nSecond, the keychain implementation lives in its own package, pkg/credentials/keychain, still inside the same module, no separate release to manage. That package has an init() that registers its go-keyring-backed backend:\n//nolint:gochecknoinits // registration via import is the whole point func init() { credentials.RegisterBackend(Backend{}) } And go-keyring, godbus, wincred, the whole IPC dependency chain, are only imported by that package.\nNow the trick. To switch keychain support on, you import the package. You don\u0026rsquo;t have to use anything from it. A blank import is enough, because a blank import still runs the package\u0026rsquo;s init():\n// cmd/gtb/keychain.go - the entire file. package main import _ \u0026#34;gitlab.com/phpboyscout/go-tool-base/pkg/credentials/keychain\u0026#34; That single line is the on/off switch for the shipped gtb binary. 
The blank import means init() runs, the keychain backend registers itself, and credential operations start routing through the OS keychain. No flag, no tag, no config.\nThe part that makes it provable Here\u0026rsquo;s why this beats the build tag, and it comes down to one guarantee in the Go toolchain: the linker only includes packages that are actually imported.\nIf cmd/gtb/keychain.go exists, the keychain package is in the import graph, so go-keyring, godbus and wincred are linked in. Delete that one file and rebuild, and the keychain package is no longer reachable from main. The linker performs dead-code elimination, and the entire go-keyring chain is gone. Not dormant. Not present-but-unused. Absent from the binary.\nThat\u0026rsquo;s the bit a regulated build needs. It isn\u0026rsquo;t a promise that the code won\u0026rsquo;t run. It\u0026rsquo;s a structural fact that the code isn\u0026rsquo;t there, and you can hand a security reviewer an SBOM that proves it. go-keyring won\u0026rsquo;t appear, because it genuinely isn\u0026rsquo;t linked.\nFor a downstream tool built on go-tool-base the story is the same, and just as cheap. Want keychain support? Add the one-line blank import to your own cmd package. Must ship keychain-free? Don\u0026rsquo;t. Your binary\u0026rsquo;s dependency graph follows your import graph, exactly as Go always promised it would. The default (no import) is the locked-down one, which is the right way round for a safety property.\nWhy I like this more than I expected to Build tags hide a decision in the compiler invocation. This pattern puts the decision in the source, as an import, where it\u0026rsquo;s greppable, obvious in code review, and impossible to get subtly wrong. There\u0026rsquo;s a real file called keychain.go whose entire content is one import, and it reads as exactly what it is: a switch.\nIt\u0026rsquo;s also just honest Go. No reflection, no plugin loader, no clever runtime. 
A registry, an init(), and the linker doing the one job it\u0026rsquo;s always done. The cleverness, such as it is, is in the arrangement, not in any individual piece.\nStepping back go-tool-base needed OS keychain support for the many, and a way to provably exclude it for the few. Build tags could express the toggle but hid it in the build invocation and rotted in the dark. A separate module solved the dependency question but was far too much machinery for one backend.\nPutting the keychain backend in its own package, activated by a blank import _ that fires its init(), gets you both: a one-line, in-source, code-reviewable switch, and, because the linker only links what\u0026rsquo;s imported, a build with the import omitted that contains none of the keychain dependency chain. Provable absence, not promised disuse.\nIf you\u0026rsquo;re carrying an optional dependency that some of your users need gone rather than merely idle, this is the pattern. Let the import graph be the feature flag.\n","date":"2026-04-22T00:00:00Z","image":"/the-blank-import-that-keeps-a-dependency-out-of-your-binary/cover-the-blank-import-that-keeps-a-dependency-out-of-your-binary.png","permalink":"/the-blank-import-that-keeps-a-dependency-out-of-your-binary/","title":"The blank import that keeps a dependency out of your binary"},{"content":"Your CLI tool needs the user\u0026rsquo;s API key. It has to come from somewhere, and it has to survive between runs, so the obvious move is to ask once and write it into the config file. One tidy api_key: line. Job done.\nIt works beautifully on the first afternoon. And then, months later, it\u0026rsquo;s quietly become a liability nobody actually decided to create.\nThe config file that quietly becomes a liability Your CLI tool needs the user\u0026rsquo;s API key. It has to come from somewhere, and it has to survive between invocations, so the obvious move is to ask once and write it into the tool\u0026rsquo;s config file. 
~/.config/yourtool/config.yaml, a nice api_key: line, done.\nIt works on the first afternoon. It keeps working. And then, slowly, it becomes a problem nobody decided to create.\nThe config file gets committed to a dotfiles repo. It gets caught in a tar of someone\u0026rsquo;s home directory that lands in a backup bucket. It scrolls past in a screen share. It sits, world-readable, on a shared build box. None of these are exotic. They\u0026rsquo;re just a Tuesday. The plaintext key was fine right up until the file went somewhere the key shouldn\u0026rsquo;t, and config files go places.\nI didn\u0026rsquo;t want go-tool-base handing every tool built on it that same slow-motion liability by default. So credential handling got rebuilt around a simple idea: the config file should usually hold a reference to the secret, not the secret itself.\nThree modes, and which one you get go-tool-base supports three ways to store a credential.\nEnvironment-variable reference, the default. The config records the name of an environment variable, not its value:\nanthropic: api: env: ANTHROPIC_API_KEY The secret itself lives in your shell profile, your direnv setup, or your CI platform\u0026rsquo;s secret store, wherever you already keep that sort of thing. The config file now contains nothing sensitive at all. You can commit it, back it up, paste it into a bug report. The reference is inert on its own.\nOS keychain, opt-in. The config holds a \u0026lt;service\u0026gt;/\u0026lt;account\u0026gt; reference and the actual secret goes into the operating system\u0026rsquo;s keychain: macOS Keychain, GNOME Keyring or KWallet via the Secret Service, Windows Credential Manager.\nanthropic: api: keychain: mytool/anthropic.api This one is opt-in by design, because the keychain backend carries dependencies that some deployments simply aren\u0026rsquo;t allowed to ship. 
(That opt-in mechanism turned out to be an interesting little problem all of its own, and it gets its own post in a couple of days.)\nLiteral value, legacy and grudging. The old behaviour. The secret sits in the config in plaintext:\nanthropic: api: key: sk-ant-... It still works, because breaking every existing tool\u0026rsquo;s config on an upgrade would be its own kind of vandalism. But it\u0026rsquo;s the last resort, it\u0026rsquo;s documented as the last resort, and the setup wizard puts a warning in front of you when you pick it.\nThe one place literal mode is not allowed There\u0026rsquo;s a single hard \u0026ldquo;no\u0026rdquo; in all of this. If go-tool-base detects it\u0026rsquo;s running in CI (CI=true, which most major CI platforms set) the setup flow refuses to write a literal credential and exits non-zero.\nThe reasoning is that a plaintext secret written during a CI run is a plaintext secret written onto an ephemeral, often shared, frequently-logged machine, by an automated process that no human is watching. That\u0026rsquo;s the exact situation where the slow-motion liability becomes a fast one. CI environments inject secrets as environment variables already; there\u0026rsquo;s no good reason for a tool to be writing one to disk there, so go-tool-base simply won\u0026rsquo;t.\nHow it decides at runtime A credential can be configured more than one way at once. You might have an env reference and an old literal key still lurking. So resolution follows a fixed precedence, highest to lowest:\nThe *.env reference. If that env var is set, use it. Otherwise the *.keychain reference. If a keychain entry resolves, use it. Otherwise the literal *.key / *.value, the legacy path. Otherwise a well-known fallback env var (ANTHROPIC_API_KEY and friends), so a tool still picks up the ecosystem-standard variable with no config at all. The useful property here is that adding a more secure mode transparently wins. 
Drop an env reference next to an old literal key and the next run uses the env var. You can migrate a credential to a better home without first removing it from its worse one, which makes the migration safe to do incrementally instead of as one nervous big-bang edit.\nThe tool tells on itself A precedence rule is no use if nobody knows their config still has a plaintext key three layers down. So the built-in doctor command grew a check for exactly that. Run doctor, and if any literal credential is sitting in your config it reports a warning, names the offending keys (the key names, never the values) and points you at how to migrate.\nIt\u0026rsquo;s not an error. Literal mode is still legal. But the tool will quietly keep reminding you that you left the campsite messier than you could have, until you go and tidy it. (Old Scout habits die hard, and they\u0026rsquo;ve leaked all the way into the framework.)\nThe gist A CLI tool that writes your API key into a plaintext config file isn\u0026rsquo;t doing anything wrong, exactly. It\u0026rsquo;s just handing you a liability that activates later, when the file travels somewhere the key shouldn\u0026rsquo;t. go-tool-base\u0026rsquo;s answer is three storage modes: an env-var reference by default, the OS keychain on request, and a plaintext literal only as a documented last resort that CI environments can\u0026rsquo;t use at all. Runtime resolution runs in a fixed precedence so a more secure mode always wins, which makes migrating a credential safe to do gradually. And doctor keeps an eye on the config so a stray plaintext secret doesn\u0026rsquo;t get to hide forever.\nThe secret should live in a secret store. 
The config file should just know its name.\n","date":"2026-04-20T00:00:00Z","image":"/where-should-a-cli-keep-your-api-keys/cover-where-should-a-cli-keep-your-api-keys.png","permalink":"/where-should-a-cli-keep-your-api-keys/","title":"Where should a CLI keep your API keys?"},{"content":"When a real security audit lands back in your inbox, the temptation is to read it as a shopping list of unrelated mistakes. Fix one, fix the next, tick them off, move on. I did exactly that the first time. The second time, I noticed something far more useful: the findings weren\u0026rsquo;t scattered at all. They clustered. Almost every one was the same sentence with the nouns swapped out.\nFindings cluster, they don\u0026rsquo;t scatter When you get a real security audit back, the instinct is to read it as a list of unrelated mistakes. Finding 1, unrelated to Finding 2, unrelated to Finding 3. Triage each, fix each, move on.\nThat\u0026rsquo;s not what the go-tool-base audits looked like once I stopped reading them as a list. The findings clustered. Strip away the specifics and almost every one was the same sentence with the nouns swapped: untrusted input reaches a powerful operation, and nothing checks it in between.\nThat reframe is worth more than any individual fix, because it turns \u0026ldquo;we patched some bugs\u0026rdquo; into \u0026ldquo;we know where to look next time\u0026rdquo;. A framework\u0026rsquo;s attack surface isn\u0026rsquo;t spread evenly. It\u0026rsquo;s concentrated at the boundaries: the handful of points where data from outside (a config file, a command-line flag, something typed into a TUI, an HTTP response) flows into machinery that can be made to misbehave. Audit the boundaries and you\u0026rsquo;ve audited most of the risk. Three examples make the pattern obvious.\nBoundary one: a regex compiler Somewhere in the tool, a user-supplied string gets compiled into a regular expression. 
A search pattern typed into the docs browser, a filter from a config file. Feeding user input to regexp.Compile feels harmless. It\u0026rsquo;s just pattern matching, after all.\nIt isn\u0026rsquo;t quite harmless. A regular expression is a tiny program, and some tiny programs are catastrophically slow. In a backtracking engine, a pattern with the wrong kind of nested repetition can take exponential time to evaluate against a modestly hostile input. That\u0026rsquo;s the class of bug known as ReDoS. Go\u0026rsquo;s regexp package is RE2-based and guarantees matching in time linear in the input, which rules out the exponential case, but it doesn\u0026rsquo;t make an arbitrarily large pattern cheap to compile or run. A user, or something feeding the user\u0026rsquo;s config, hands you a pathological pattern and your tool burns a whole core on what looked for all the world like a search box.\nThe fix isn\u0026rsquo;t to ban user-supplied regexes. It\u0026rsquo;s to stop treating \u0026ldquo;compile this string\u0026rdquo; as free. go-tool-base routes any regex whose pattern came from outside the binary through a regexutil.CompileBounded helper. It caps the pattern length and puts a hard timeout on compilation. A pattern known at build time can still use plain regexp.MustCompile, because that isn\u0026rsquo;t a boundary, it\u0026rsquo;s a constant. The discipline only applies where the input genuinely crosses in.\nBoundary two: a URL opener The tool needs to open a URL in the user\u0026rsquo;s browser, a docs link or an OAuth flow. Under the hood that\u0026rsquo;s the OS handler: xdg-open, or open, or rundll32.\nNow ask where the URL came from. If any part of it is influenced by config, by a server response, by user input, then \u0026ldquo;open this URL\u0026rdquo; has quietly become \u0026ldquo;ask the operating system to do something with an attacker-influenced string\u0026rdquo;. A file:// URL. A javascript: URL. Something with control characters smuggled into it. The browser-open was never the dangerous part. The unvalidated string was.\nSo go-tool-base funnels every URL-open through one package, pkg/browser, and that package is a gate. 
It enforces an allowlist of schemes (https, http, mailto, and nothing else), bounds the length, and rejects control characters before the OS ever sees the string. The rule that makes it stick is that nothing else is allowed to call the OS handler directly. One door, and the door has a lock. A scattered capability with no chokepoint can\u0026rsquo;t be secured; a capability that has a chokepoint can. (You\u0026rsquo;ll have spotted the \u0026ldquo;one door out\u0026rdquo; idea by now\u0026hellip; it\u0026rsquo;s the same instinct as the single error handler, pointed at security instead of consistency.)\nBoundary three: a log sink This one\u0026rsquo;s the sneakiest, because it runs the wrong way round. The first two boundaries are about dangerous input coming in. This one is about sensitive data leaking out.\nThe tool handles credentials. It also logs, emits telemetry, and reports errors, and all three of those are exit boundaries: places where strings leave the process for somewhere more persistent and more public, like a log aggregator, an analytics backend, an error tracker. If a token ever ends up in a string that flows to one of those, you haven\u0026rsquo;t logged an event, you\u0026rsquo;ve published a secret.\nThe defence is pkg/redact. Any free-form string heading for an observability surface goes through it first, and it strips the usual suspects: credentials in URL userinfo, sensitive query parameters, Authorization headers, the well-known provider key prefixes (sk-, ghp_, AIza and friends), long opaque tokens. The places most likely to leak, command arguments and error messages in telemetry, get it applied automatically rather than relying on every caller to remember.\nSame pattern as the other two. A boundary, and something standing on it checking what goes through.\nThe unglamorous part None of these fixes is clever. There\u0026rsquo;s no exploit demo, no neat trick to show off. Bound a length. Check a scheme against an allowlist. 
Run a string through a redactor. The work was almost entirely in noticing the boundary existed, and then making sure everything routes through the one checked path instead of dotting raw calls all over the codebase.\nThat\u0026rsquo;s the actual lesson of a security audit, and it\u0026rsquo;s why the cluster reframe matters. The value wasn\u0026rsquo;t the dozen-or-so individual fixes. It was learning that the next risk will be at a boundary too, the next place untrusted input meets a powerful operation with nothing in between, and that the job is to find those points and put a single, mandatory, checked door on each.\nTo sum up A security audit of a CLI framework reads like a list of unrelated bugs and isn\u0026rsquo;t one. go-tool-base\u0026rsquo;s findings nearly all reduced to the same shape: untrusted input reaching a powerful operation unchecked. A regex compiler that needed a length and time bound (regexutil.CompileBounded). A URL opener that needed a scheme allowlist and a single chokepoint (pkg/browser). Log and telemetry sinks that needed credentials redacted on the way out (pkg/redact).\nThe fixes were structural and dull, which is exactly right. Find your boundaries (config, flags, TUI input, network responses, log and telemetry sinks), give each one a single mandatory checked path, and you\u0026rsquo;ve spent your audit effort where the risk actually lives.\n","date":"2026-04-17T00:00:00Z","image":"/every-finding-was-the-same-shape/cover-every-finding-was-the-same-shape.png","permalink":"/every-finding-was-the-same-shape/","title":"I had the framework audited: every finding was the same shape"},{"content":"I\u0026rsquo;m going to tell you about a bug go-tool-base shipped, because it\u0026rsquo;s one of those bugs that\u0026rsquo;s so reasonable-looking you\u0026rsquo;ll find it in textbooks, conference talks, and an awful lot of otherwise excellent Go code. We had it too. 
It passed every test on my laptop, every single time, and then quietly fell over on CI while blaming an innocent bystander.\nIt\u0026rsquo;s the classic Go trick for mocking a dependency, and it races.\nA pattern that looks completely reasonable Here\u0026rsquo;s a thing you need to do constantly in Go tests: stop a function from really shelling out. It calls exec.LookPath to find a binary, or exec.Command to run one, and your test very much does not want it touching the real $PATH or spawning a real process.\nThe Go community has a well-worn answer. Hoist the function into a package-level variable, call that, and let tests reassign it:\n// production code var execLookPath = exec.LookPath func findTool() (string, error) { return execLookPath(\u0026#34;sometool\u0026#34;) } // test func TestFindTool(t *testing.T) { old := execLookPath defer func() { execLookPath = old }() execLookPath = func(string) (string, error) { return \u0026#34;/fake/path\u0026#34;, nil } // ...assert... } It\u0026rsquo;s tidy. No interface to thread through, no constructor to change. You\u0026rsquo;ll find it in a great deal of Go code, including some very respectable Go code indeed. go-tool-base had it too.\nAnd it works. It works on your machine, it works in code review, it works the first hundred times CI runs it. Which is precisely what makes it dangerous, because it\u0026rsquo;s wrong, and it\u0026rsquo;s just been biding its time.\nAdd one line and it detonates Go\u0026rsquo;s t.Parallel() is more or less free performance. Mark your tests with it and the runner overlaps them instead of plodding through one at a time. On a package with a few hundred tests it\u0026rsquo;s a real, worthwhile speed-up, so naturally you reach for it.\nNow picture two tests, both using the pattern above, both marked t.Parallel(). They run concurrently. Test A assigns its fake to execLookPath. Test B assigns its fake to execLookPath. Test A reads execLookPath expecting its own fake. 
Two goroutines, one variable, writes and reads with nothing synchronising them. That\u0026rsquo;s a textbook data race, and the textbook is right: the behaviour is undefined. Test A might see B\u0026rsquo;s fake. The deferred restore might land in the wrong order and leave the variable pointing at a fake after both tests have finished, poisoning a third one for good measure.\nThe truly nasty part is the intermittency. Whether the race actually bites depends on goroutine scheduling, which depends on machine load and core count. Your laptop running eight tests at once might never lose the coin-toss. A CI runner under load, scheduling differently, loses it and fails a test that has nothing obviously to do with the change in the commit. You re-run the pipeline, it passes, everyone shrugs and moves on. A test suite that fails one run in twenty trains your team to ignore it, and an ignored CI failure is worse than no CI at all.\nI can tell you this one from direct, slightly embarrassed experience, because go-tool-base shipped exactly this bug and CI caught it the honest way: green on the laptop, red on the runner, with the failure cheerfully pointing at innocent bystander tests rather than the global that was actually the culprit. go test -race will name it for you if you crank the parallelism up high enough to lose the toss reliably\u0026hellip; but you have to go looking, and you only go looking once it\u0026rsquo;s already ruined an afternoon.\nThe fix isn\u0026rsquo;t synchronisation, it\u0026rsquo;s structure The instinct is to slap a mutex around the variable. Resist it. A mutex makes the race defined, but it doesn\u0026rsquo;t make the design any good. You\u0026rsquo;ve still got global mutable state, you\u0026rsquo;ve just queued the fight instead of cancelling it. 
And tests that serialise on a shared lock aren\u0026rsquo;t really parallel any more, so you\u0026rsquo;ve also handed back the speed-up you came for in the first place.\nThe real fix is to not have a shared variable at all. The dependency was always an input to the code; the package-level var was just a way of avoiding saying so out loud. So say it. Inject it.\nA struct field:\ntype Finder struct { lookPath func(string) (string, error) // defaults to exec.LookPath } func (f *Finder) find() (string, error) { return f.lookPath(\u0026#34;sometool\u0026#34;) } Or a functional option, if you\u0026rsquo;d rather keep the zero value clean. Either way, each test constructs its own Finder with its own fake. There\u0026rsquo;s no shared variable, so there\u0026rsquo;s no race, and t.Parallel() is free again because the tests genuinely don\u0026rsquo;t touch each other.\ngo-tool-base wrote this straight into its standing rules: no package-level mocking hooks, full stop. Dependencies come in through struct fields, functional options, or config fields. (The same injection discipline that makes Props so testable, applied one rung further down.) And to stop everyone hand-rolling the same exec fakes, there\u0026rsquo;s a small internal package, internal/exectest, with ready-made LookPath and CommandContext doubles you construct per-test. The pattern is gone, and the door it came in through is shut.\nThe rule worth taking away A package-level variable that tests reassign is shared mutable state. It reads as a harmless convenience because in a single-threaded test run it behaves like one. t.Parallel() is the thing that reveals it was never harmless, only unobserved.\nThe general lesson is older than Go: if a value is an input to your code, make it an input. Smuggling it in as a global is borrowing test-time convenience against a debt that comes due, with interest, the day someone wants their tests to run in parallel. Pay cash. 
Inject the dependency.\nWorth remembering Mocking via a reassignable package-level variable is a beloved Go shortcut and a latent data race. It survives because single-threaded test runs hide it; t.Parallel() exposes it as intermittent, bystander-blaming CI flake that\u0026rsquo;s miserable to trace. A mutex only makes the bad design defined. The fix is structural: inject the dependency as a struct field or functional option, so each test owns its own double and there\u0026rsquo;s no shared state to race over. go-tool-base banned the global-hook pattern outright and ships internal/exectest so nobody\u0026rsquo;s tempted back to it.\nIf a piece of code depends on something, let it say so in its signature. Your future self, staring at a CI failure that flatly refuses to reproduce, will thank you.\n","date":"2026-04-16T00:00:00Z","image":"/the-test-mocking-pattern-that-races/cover-the-test-mocking-pattern-that-races.png","permalink":"/the-test-mocking-pattern-that-races/","title":"The test-mocking pattern that races"},{"content":"\u0026ldquo;You can\u0026rsquo;t test code that calls an AI.\u0026rdquo; I\u0026rsquo;ve heard it said with great confidence, and it\u0026rsquo;s half right, which is the most dangerous kind of right. You genuinely can\u0026rsquo;t assert on what a non-deterministic model says. But the model isn\u0026rsquo;t your code, and the bits sitting either side of it most certainly are.\n\u0026ldquo;You can\u0026rsquo;t test AI code\u0026rdquo; It\u0026rsquo;s a fair worry. Your command calls an LLM. The LLM returns something slightly different every run. A test that asserts response == \u0026quot;...\u0026quot; is broken before you\u0026rsquo;ve finished typing it. So the conclusion arrives quickly: the AI path can\u0026rsquo;t be tested, leave it uncovered.\nWhich is a shame, because the AI call is usually the riskiest line in the whole command.\nThe conclusion is also wrong. 
It mistakes \u0026ldquo;I can\u0026rsquo;t test the model\u0026rdquo; for \u0026ldquo;I can\u0026rsquo;t test my code\u0026rdquo;. The model is not your code. Your code is the two pieces sitting on either side of it.\nYour code is a prompt and a handler Strip the command down to what it actually does:\nIt builds a prompt. It assembles a system prompt, the user\u0026rsquo;s input, perhaps some context, and sends it. The model does something. This is not your code. It takes the response and does something with it. It parses it, branches on it, prints it, stores it. Steps one and three are entirely yours, and entirely deterministic. The same inputs build the same prompt and handle the same response the same way, every single time. That\u0026rsquo;s testable. Step two is the only part that isn\u0026rsquo;t, and step two was never yours to test in the first place.\nSo the job is to pin step two to a known value, and then test one and three properly.\nTest the prompt: snapshot it Step one produces a prompt, and a prompt is just a string, which means you can pin it.\nBoth frameworks lean on snapshot testing here. go-tool-base uses a golden-file approach: the prompt your code generates is recorded to a file, and the test re-generates it and compares against that file. rust-tool-base does the same with insta, snapshotting the request body the client would send.\nThe reason this matters is that the prompt is load-bearing and quietly easy to break. You refactor how context gets assembled. Without noticing, you\u0026rsquo;ve changed the wording, or the ordering, or dropped a line the model was leaning on. Nothing fails to compile. The behaviour just drifts, silently.\nA snapshot test catches exactly that. It fails, it shows you the diff between the old prompt and the new one, and it makes you stop and make a decision. Was this change intended? If yes, you accept the new snapshot and move on. If no, you\u0026rsquo;ve just caught a bug before it shipped. 
Either way the prompt never changes by accident, which for AI code is most of the battle.\nTest the handler: mock the response Step three needs a response to handle, and in a unit test you don\u0026rsquo;t get that response from the real model. You supply it.\ngo-tool-base ships generated mocks for the ChatClient interface. A test builds a mock client, tells it \u0026ldquo;when Ask is called, return this canned value\u0026rdquo;, and runs the command against it:\nmockClient := mock_chat.NewMockChatClient(t) mockClient.EXPECT(). Ask(mock.Anything, mock.Anything, mock.AnythingOfType(\u0026#34;*main.Analysis\u0026#34;)). RunAndReturn(func(_ context.Context, _ string, target any) error { *(target.(*Analysis)) = Analysis{Severity: \u0026#34;critical\u0026#34;} return nil }) Because the interface is only four methods, that mock is trivial to set up and complete by construction. rust-tool-base takes the same idea one layer down: HTTP-bound tests use wiremock, which stands up a fake server returning a canned response body. The client makes a real HTTP request; it just goes to a fake endpoint the test controls.\nEither way, step two is now fixed to a value you chose, which makes step three deterministic. And that unlocks the tests that actually matter: given a malformed response, does the command fail gracefully? Given a rate-limit error, an empty answer, a field missing? Those are the cases a live model almost never hands you on demand, and a mock hands you every time, on the first run.\nThis is, incidentally, the same discipline as the test-mocking work elsewhere in the framework: the dependency is injected, so the test gets to decide what it does.\nWhat you deliberately don\u0026rsquo;t test One honest boundary. None of this tests whether the model gives good answers. 
That question is real, but it\u0026rsquo;s a different activity (evaluations, run as their own suite) and not something to mix into the unit tests.\nThe unit suite\u0026rsquo;s job is your code: that it builds a sound prompt, and that it handles every shape of response correctly, including the ugly ones. Keep that well away from \u0026ldquo;is the model clever today\u0026rdquo;. A unit test that depends on the model being clever is a unit test that fails when the weather changes, and a flaky test just teaches people to ignore the whole suite.\nWhat it comes down to Code that calls an LLM is testable; the model is not, and those are different statements. Your code is a prompt builder and a response handler, both deterministic, with the model sat in between.\ngo-tool-base and rust-tool-base converge on the same approach. Snapshot the prompt, with golden files or insta, so a refactor can\u0026rsquo;t change what you send without a test noticing. Mock the response, with generated ChatClient mocks or a wiremock server, so tests run with no network and you can feed in the malformed and error cases a real model won\u0026rsquo;t reliably produce. Leave \u0026ldquo;are the answers any good\u0026rdquo; to a separate evaluation suite. Test the two halves you own, and the non-determinism in the middle stops being an excuse to leave the riskiest line uncovered.\n","date":"2026-04-08T00:00:00Z","image":"/testing-code-that-calls-an-llm/cover-testing-code-that-calls-an-llm.png","permalink":"/testing-code-that-calls-an-llm/","title":"Testing code that calls an LLM: yes, you actually can"},{"content":"go-tool-base\u0026rsquo;s chat package puts five AI providers behind one interface. Four of them are exactly what you\u0026rsquo;d guess: HTTP calls to OpenAI, Claude, Gemini, and anything OpenAI-compatible. The fifth one isn\u0026rsquo;t an API at all. 
It shells out to a binary.\nThat sounds like a slightly mad thing to want, right up until you\u0026rsquo;ve worked somewhere the network says no.\nThe fifth provider shells out The chat package speaks to five providers through one ChatClient interface. Four of them are what you\u0026rsquo;d expect: HTTP requests to OpenAI, to Claude, to Gemini, to any OpenAI-compatible endpoint. The tool author picks one in config, and the rest of the code never knows the difference.\nThe fifth, ProviderClaudeLocal, is different in kind. It doesn\u0026rsquo;t make an HTTP request at all. It shells out. It runs the claude CLI binary as a child process, passes the prompt in, and reads the answer back from the binary\u0026rsquo;s output.\nThat sounds like an odd thing to want until you\u0026rsquo;ve been stuck in the environment it was built for.\nWhy you\u0026rsquo;d want that Picture a corporate network with its egress locked right down. Outbound HTTPS to api.anthropic.com is blocked by policy. A tool built on go-tool-base that uses AI would simply fall over there. It tries to reach the API, there\u0026rsquo;s no route, and that\u0026rsquo;s the end of the feature.\nBut the developer at that machine has the claude CLI installed, and has run claude login. That binary is permitted. It\u0026rsquo;s an approved, managed tool, and it has its own sanctioned path out. The direct API call is blocked; the claude command is not.\nProviderClaudeLocal is what bridges those two facts. If your tool\u0026rsquo;s AI calls go through that already-blessed binary instead of straight at the API, they work, in an environment where the direct call cannot. That\u0026rsquo;s the whole reason the provider exists. It isn\u0026rsquo;t faster (a real API call has lower latency) and it isn\u0026rsquo;t more capable. 
It\u0026rsquo;s for the place where the API call simply isn\u0026rsquo;t an option, and \u0026ldquo;isn\u0026rsquo;t an option\u0026rdquo; is a surprisingly common place to find yourself inside a large organisation.\nWhat it costs, honestly It\u0026rsquo;s worth being straight about the trade, because ProviderClaudeLocal is the reduced-capability provider.\nIt doesn\u0026rsquo;t do tool calling. It doesn\u0026rsquo;t do parallel tools. It doesn\u0026rsquo;t stream. Those need a live, structured connection to the model\u0026rsquo;s API, and a subprocess that runs once and prints an answer is not that. What it does support is plain chat and structured output, the latter through the binary\u0026rsquo;s own --json-schema flag.\nSo the honest positioning, and the package\u0026rsquo;s documentation says exactly this, is: prefer the API providers when you can reach them, because they\u0026rsquo;re lower latency and feature-complete. Reach for ProviderClaudeLocal when API access is restricted. You accept the narrower capability set as the price of working at all. For a tool whose AI feature is \u0026ldquo;answer a question\u0026rdquo; or \u0026ldquo;return a structured analysis\u0026rdquo;, that price is often nothing you\u0026rsquo;d even notice. For one built on an agentic tool-calling loop, it\u0026rsquo;s a real limitation, and you\u0026rsquo;d know to expect it.\nHow it stays behind the same interface Here\u0026rsquo;s the part that makes it pleasant rather than a special case to maintain. Despite being a subprocess and not an API, ProviderClaudeLocal is still a ChatClient. Your feature code calls Chat and Ask exactly the way it would for any other provider.\nEverything that makes a subprocess provider awkward stays inside the provider. 
Spawning the binary, feeding it the prompt, parsing its output, capturing stderr and surfacing it when the binary exits non-zero, and threading multi-turn continuity through session identifiers passed back on the next call with --resume: all of that is the provider\u0026rsquo;s problem, and all of it sits behind the interface. The code in your tool that uses AI doesn\u0026rsquo;t know, and has no way to find out, that this particular provider is a child process rather than an HTTPS call.\nThat\u0026rsquo;s a unified interface genuinely earning its place. It\u0026rsquo;s easy to put a uniform face on four things that already work the same way underneath. The real test of the abstraction is whether something that works in a completely different way, a subprocess instead of a socket, can still slot in without the caller changing a line. Here it can. You swap one config value, and a tool that talked to an API now talks through a binary, and nothing downstream so much as blinks.\nThe bottom line go-tool-base\u0026rsquo;s chat package puts five providers behind one ChatClient interface, and ProviderClaudeLocal is the one that isn\u0026rsquo;t an API. It runs the locally installed, pre-authenticated claude CLI as a subprocess.\nIt exists for the locked-down environment where outbound HTTPS to the AI API is blocked but the claude binary is allowed: there, AI features keep working where a direct call would fail. The trade is a narrower capability set (no tool calling, no streaming, plain chat and structured output only) so you prefer the API providers when you can reach them and fall back to this when you can\u0026rsquo;t. And because it\u0026rsquo;s still a ChatClient, all the subprocess machinery stays hidden, and your code uses it without knowing it\u0026rsquo;s there. 
That last part is the real test of an abstraction: a provider that works in an entirely different way still slots in unchanged.\n","date":"2026-04-06T00:00:00Z","image":"/the-ai-provider-that-isnt-an-api/cover-the-ai-provider-that-isnt-an-api.png","permalink":"/the-ai-provider-that-isnt-an-api/","title":"The AI provider that isn't an API"},{"content":"An AI conversation is, fundamentally, its own history. The model\u0026rsquo;s next answer depends on everything said so far. And a CLI tool, by its very nature, forgets everything the moment it exits. Put those two facts together and you get the problem: run an AI command, exit, run it again, and you\u0026rsquo;re talking to someone who\u0026rsquo;s never met you.\nA CLI forgets everything A long-running service keeps its state in memory for as long as it runs. A CLI tool doesn\u0026rsquo;t get that luxury. It starts, does one thing, exits. The next invocation is a brand-new process with no memory of the last one.\nFor most commands that\u0026rsquo;s exactly right, and you wouldn\u0026rsquo;t want it any other way. But an AI conversation is a different kind of beast, because a conversation is its history. The model\u0026rsquo;s next answer depends on everything said so far. Run an AI command, exit, run it again, and you\u0026rsquo;ve started a fresh conversation with someone who\u0026rsquo;s never met you. For an interactive assistant, or any AI workflow that unfolds across several invocations, that\u0026rsquo;s plainly the wrong behaviour. The user expects to pick up where they left off.\nSave and restore The chat package handles this through a PersistentChatClient interface. Like streaming, it\u0026rsquo;s an optional capability discovered with a type assertion, sitting beside the four-method core rather than bloating it. 
A client that supports persistence also satisfies this interface:\nif pc, ok := client.(chat.PersistentChatClient); ok {\n    snapshot, err := pc.Save()\n    // store the snapshot somewhere\n}\nA snapshot is a serialisable value that captures the conversation. You store it. Next run, you load it, Restore it onto a fresh client, re-register your tools, and call Chat again. \u0026ldquo;Where were we?\u0026rdquo; works, because the model is handed back the whole history.\nA snapshot is opinionated about what it carries The interesting part is what a snapshot does and doesn\u0026rsquo;t contain, because that\u0026rsquo;s a series of deliberate decisions.\nIt carries the messages, the system prompt, the model name, and tool metadata: the names, descriptions and parameter schemas of the tools that were registered.\nIt does not carry tool handlers. Handlers are code, not data; you can\u0026rsquo;t serialise a function meaningfully, so after a restore you re-register them with SetTools. The snapshot remembers that a tool called read_file existed and what its shape was; it doesn\u0026rsquo;t try to remember the Go function behind it.\nAnd it does not carry API tokens. This is the one to dwell on. A snapshot is a file. A file gets synced, backed up, copied between machines, attached to a support ticket by a user trying to be helpful. A snapshot that carried the API key would be a credential leak the moment it left the laptop it was made on. So the snapshot never contains a token, at all. On restore, the client picks the credential up again the ordinary way, from the environment or the keychain. The conversation and the secret are kept in separate places on purpose, and only one of them is ever in the file.\nEncrypted at rest, if you want it The package ships a FileStore that writes snapshots as JSON files, with 0600 permissions in a 0700 directory, and it can encrypt them. 
Pass WithEncryption a 32-byte key and snapshots are written with AES-256-GCM.\nThat option exists because a conversation can hold sensitive content even when it holds no credential. The log a user pasted in for analysis, the source file they asked the model to review, the internal details tucked into their questions: none of that is an API key, and all of it might be something you\u0026rsquo;d rather not have sitting in plain JSON in a backup somewhere. Encryption at rest covers it.\nThe FileStore is also careful about the snapshot identifiers it\u0026rsquo;s handed. An ID has to be a canonical UUID, and the resolved file path is checked to lie inside the store directory, so a snapshot ID arriving from an untrusted source (a CLI flag, a request payload) can\u0026rsquo;t be bent into a path-traversal that reads or writes somewhere it shouldn\u0026rsquo;t. Persisting conversations adds a small filesystem surface, and the store treats it as exactly that.\nThe short version A CLI tool forgets everything between invocations, which is correct for most commands and wrong for an AI conversation, because a conversation is its history.\ngo-tool-base\u0026rsquo;s chat package lets you persist one. PersistentChatClient saves a snapshot you can store and restore later, picking the conversation back up where it ended. The snapshot is deliberate about its contents: messages, system prompt and tool metadata yes; tool handlers no, because they\u0026rsquo;re code you re-register; API tokens never, because a snapshot is a file and a file travels. The built-in FileStore can encrypt snapshots at rest with AES-256-GCM and validates snapshot IDs against path traversal. 
Resumable conversations, without the conversation file turning into a place secrets leak from.\n","date":"2026-04-04T00:00:00Z","image":"/ai-conversations-you-can-resume/cover-ai-conversations-you-can-resume.png","permalink":"/ai-conversations-you-can-resume/","title":"AI conversations you can resume"},{"content":"Most AI code generation works on a charming little principle I\u0026rsquo;ll call generate-and-hope. The model writes the code, the model stops at the closing brace, and whether the thing actually compiles is left as an exercise for you. For a snippet you paste into an editor, fine. For a whole generated command, that\u0026rsquo;s just outsourcing the disappointment.\ngo-tool-base does something I\u0026rsquo;m rather happier with: the AI has to make the build pass before it\u0026rsquo;s allowed to claim it\u0026rsquo;s done.\nGenerate and hope The usual shape of AI code generation is this. You ask for code, the model produces it, and the model\u0026rsquo;s job ends at the closing brace. Whether it compiles, whether the tests pass, whether the imports even resolve, none of that has been checked. The model produced something that looks right. You find out whether it is right when you build it.\nFor a snippet you paste into an editor, that\u0026rsquo;s perfectly fine. The compiler tells you in a second. But go-tool-base\u0026rsquo;s generator, driven by gtb generate command --script or --prompt, produces a whole command: the implementation, its tests, the lot. \u0026ldquo;Generate and hope\u0026rdquo; at that scale means handing the user a project that may or may not build, and quietly making them the one who finds out which.\nDrafting is only step one So the generator doesn\u0026rsquo;t stop at drafting. Writing the first version of the implementation and its tests is step one of two. Step two is an autonomous repair agent.\nOnce the draft is on the filesystem, a separate agent takes over. 
It\u0026rsquo;s an LLM running in a loop, but a loop aimed at one narrow, checkable job: make this project build and pass its tests. It isn\u0026rsquo;t asked to be creative. It\u0026rsquo;s asked to get to green.\nA fixed set of tools, and no shell The agent is not handed a shell. It\u0026rsquo;s given a fixed, defined set of tools and nothing else. Three of them let it explore and edit the project: list_dir, read_file, write_file. Four of them let it verify the project:\ngo_build runs the build and captures the compiler errors.\ngo_test runs the tests and captures the failures.\ngo_get resolves a missing dependency.\ngolangci_lint runs the project\u0026rsquo;s linter.\nThat restriction is the design, not a limitation of it. The agent can\u0026rsquo;t delete arbitrary files, can\u0026rsquo;t reach the network, can\u0026rsquo;t run anything that isn\u0026rsquo;t on the list. It has exactly what it needs to make code compile and nothing it would need to do damage. Its file writes are confined to the project directory by an explicit path check, so even write_file can\u0026rsquo;t go wandering up into /etc. A coding agent you\u0026rsquo;d actually let near a filesystem is one whose abilities are an allowlist, not a denylist. (I keep coming back to that principle through this series\u0026hellip; safety as a boundary you draw, not a behaviour you hope for.)\nThe loop The repair loop is a ReAct loop, the same reason-act-observe shape as the tool-calling loop, only this time pointed at a goal:\n1. The draft is on disk.\n2. Verify: run go_build and go_test.\n3. If verification failed, read the error logs, the compiler error or the failing test.\n4. Reason about the cause: an undefined variable, a missing import, a wrong signature.\n5. Act: call write_file to patch the code, or go_get to add the dependency.\n6. Loop.\nSteps two to five repeat until the project is green, or the agent hits its step limit, which defaults to 15. 
What makes this work is treating the error output as feedback rather than as a failure to log and walk away from. A compiler error is the single most useful sentence you can hand a model that\u0026rsquo;s trying to fix code. It says what\u0026rsquo;s wrong, and usually where. The loop feeds it straight back in, and the model fixes against it.\nVerification changes what \u0026ldquo;done\u0026rdquo; means Here\u0026rsquo;s the real shift, and the agent\u0026rsquo;s own documentation puts it well: the agent \u0026ldquo;doesn\u0026rsquo;t just say it fixed a bug; it uses a Test tool to verify the fix before reporting success.\u0026rdquo;\nA generate-and-hope model reports success when it finishes writing. It has no idea whether the code works, and it isn\u0026rsquo;t really claiming otherwise. \u0026ldquo;Done\u0026rdquo; means \u0026ldquo;I produced text\u0026rdquo;. The repair agent reports success when go_build and go_test actually pass. \u0026ldquo;Done\u0026rdquo; means \u0026ldquo;the build is green\u0026rdquo;. Those are two completely different claims, and only the second is worth anything to the person who asked for the command.\nThat\u0026rsquo;s the line between an AI that\u0026rsquo;s a creative writer and an AI that\u0026rsquo;s a collaborator you can hand a task to. And when the agent can\u0026rsquo;t reach green, when it spends its whole step budget and the project is still broken, the generator fails safely: it leaves the best-attempt code in place, commented out so the project still compiles, and tells the user what to finish by hand. There\u0026rsquo;s also an --agentless flag for anyone who\u0026rsquo;d rather have a plain single-shot retry than the multi-step agent. The default, though, is the agent, because the default should be code that\u0026rsquo;s been checked.\nWhere this leaves us Most AI code generation generates and hopes: the model writes code and the user discovers whether it works. 
For a whole generated command, that pushes a may-or-may-not-build project onto the user.\ngo-tool-base\u0026rsquo;s generator drafts the command and then hands it to an autonomous repair agent. The agent has a fixed set of tools (explore and edit the project, build it, test it, lint it, fetch dependencies) and no shell at all, with file writes confined to the project directory. It runs a ReAct loop, reading each error and patching against it, until the build is green or it exhausts its steps. The point is what \u0026ldquo;done\u0026rdquo; comes to mean: not \u0026ldquo;the model finished writing\u0026rdquo;, but \u0026ldquo;the build passes\u0026rdquo;. Only one of those is a claim worth trusting.\n","date":"2026-04-02T00:00:00Z","image":"/an-ai-agent-that-has-to-make-the-build-pass/cover-an-ai-agent-that-has-to-make-the-build-pass.png","permalink":"/an-ai-agent-that-has-to-make-the-build-pass/","title":"An AI agent that has to make the build pass"},{"content":"Ask an LLM a question and it hands you back prose. Lovely to read, miserable to program against. You wanted the one number buried in the middle of it, and now you\u0026rsquo;re writing a regular expression to fish a word out of three well-written paragraphs that phrase themselves slightly differently every single time you run them.\nThere\u0026rsquo;s a much better way, and it\u0026rsquo;s the difference between forever interpreting an LLM and actually building on one.\nThe problem with a paragraph You ask an LLM to analyse a log file and tell you the severity of what it found and a suggested fix. It comes back with three well-written paragraphs. Somewhere in there is the word \u0026ldquo;critical\u0026rdquo;, and somewhere is the fix.\nYour program now has to extract those two facts from prose, and prose has no contract. The next run, the model phrases it differently. It leads with a caveat. It says \u0026ldquo;severe\u0026rdquo; where last time it said \u0026ldquo;critical\u0026rdquo;. 
It puts the fix first. Anything that worked by finding \u0026ldquo;critical\u0026rdquo; in the text is now quietly wrong, and you didn\u0026rsquo;t change a line. Parsing free text for structured facts is a game you lose slowly.\nWhat you actually wanted was never a paragraph. It was a value: a thing with a severity field and a fix field, that you can branch on and store and pass around like any other.\nAsk for the struct, not the prose go-tool-base\u0026rsquo;s chat package draws the line with two methods. Chat gives you text. Ask gives you a struct.\nYou define the Go type you want back:\ntype Analysis struct {\n    Severity string `json:\u0026#34;severity\u0026#34;`\n    Fix      string `json:\u0026#34;fix\u0026#34;`\n}\nvar result Analysis\nerr := client.Ask(ctx, \u0026#34;Analyse this log file: \u0026#34;+logText, \u0026amp;result)\nThe framework generates a JSON Schema from that struct, sends it to the model as the required response format, and unmarshals the reply straight into result. You never lay a finger on the prose. You get result.Severity and result.Fix, typed, ready to use. If you want the model\u0026rsquo;s answer to drive a switch statement, this is the method that lets it.\nThe struct is the schema is the contract The detail that makes this hold up over time: you don\u0026rsquo;t write the schema. The struct is the schema.\nThe framework derives the JSON Schema from your type. In go-tool-base that\u0026rsquo;s GenerateSchema[T](); in rust-tool-base the schema comes from your Rust type through schemars. (Yes, there\u0026rsquo;s a Rust sibling now. I\u0026rsquo;ll introduce it properly in a few weeks, but it keeps gatecrashing these posts because the two frameworks deliberately share ideas.) Either way there\u0026rsquo;s one definition, your type, and the schema is just a projection of it.\nThat matters, because otherwise two things have to agree. There\u0026rsquo;s the schema you tell the model to obey, and there\u0026rsquo;s the type you unmarshal the answer into. 
Hand-write the schema and those two can drift: add a field to the struct, forget to add it to the schema, and the model is never told to produce it, so it silently never appears. Deriving the schema from the type collapses the two into one. They can\u0026rsquo;t disagree, because there\u0026rsquo;s only one of them.\nBoth frameworks, with one extra step in Rust go-tool-base does this with Ask and a ResponseSchema set on the client config. rust-tool-base does it with chat_structured::\u0026lt;T\u0026gt;, where T is any type that\u0026rsquo;s both deserialisable and JsonSchema.\nrust-tool-base adds one step worth calling out. Before it deserialises the model\u0026rsquo;s reply into your T, it validates the raw response against the schema with a JSON Schema validator. That splits the failure into two distinct, named cases: the response didn\u0026rsquo;t match the schema, or it matched the schema but still wouldn\u0026rsquo;t deserialise. A model that returns subtly wrong JSON fails loudly and specifically, with an error that tells you which of those happened, instead of quietly handing you a zero-valued struct that you end up debugging an hour later.\nWhen you\u0026rsquo;d reach for it The line is simple, and it\u0026rsquo;s about who reads the answer.\nIf a human reads the answer, prose is right. Chat, free text, let the model write well. A summary, an explanation, an interactive reply: leave all of those as prose.\nIf a program consumes the answer, you want a value. Classification, extraction, a code review scored out of a hundred with a list of issues, a yes-or-no with reasons: anything where the next thing that happens is your code branching on the result. 
There, Ask and chat_structured turn the LLM from something you have to interpret into something that returns a value, and a typed value is a thing you can actually build on.\nTo sum up An LLM returns prose by default, and prose has no contract, so a program that picks structured facts out of it breaks the moment the model rephrases.\nStructured output asks for the value instead. You define a struct, the framework derives a JSON Schema from it, the model is constrained to that shape, and you get a typed result. go-tool-base\u0026rsquo;s Ask and rust-tool-base\u0026rsquo;s chat_structured both work this way, with the schema derived from your type so the schema and the type can\u0026rsquo;t drift; rust-tool-base additionally validates the response against the schema before deserialising. Use it whenever the answer feeds code rather than a human. It\u0026rsquo;s one of the four methods that make up go-tool-base\u0026rsquo;s small chat interface, and it\u0026rsquo;s the one that makes an LLM safe to program against.\n","date":"2026-03-31T00:00:00Z","image":"/stop-regexing-the-llms-prose/cover-stop-regexing-the-llms-prose.png","permalink":"/stop-regexing-the-llms-prose/","title":"Stop regex-ing the LLM's prose"},{"content":"Usage telemetry is genuinely useful. Knowing which commands people actually run, where the errors cluster, whether anyone ever touched the feature you spent a fortnight on\u0026hellip; that\u0026rsquo;s the stuff that makes you a better maintainer. Wanting it is completely legitimate.\nThe trouble is that the usual way of getting it, on by default and quietly hoovering up everything, is a small betrayal of the people who installed your tool to get a job done. I wasn\u0026rsquo;t willing to build that, so go-tool-base\u0026rsquo;s telemetry starts from a different question.\nThe data you want, and the line you shouldn\u0026rsquo;t cross If you maintain a tool, you want to know how it\u0026rsquo;s actually used. 
Which commands matter and which are dead weight. Where the error rate spikes. Whether anyone touched the feature you spent that fortnight on. That information makes you a better maintainer, and, to say it again, wanting it is completely legitimate.\nThe trouble is the standard way of getting it. Telemetry on by default. An opt-out buried three levels down in a settings file nobody reads. And once it\u0026rsquo;s running, it quietly collects far more than it ever admitted to: the arguments people passed, the paths they were working in, an IP address for good measure.\nEvery one of those is a small betrayal of someone who installed your tool to get a job done, not to become a data point. And the cost when users notice isn\u0026rsquo;t a slap on the wrist. It\u0026rsquo;s trust, and trust in a developer tool does not grow back quickly. A tool that surprises you once with what it was quietly collecting is a tool you uninstall and warn your colleagues about.\nSo go-tool-base\u0026rsquo;s telemetry started from a different question. Not \u0026ldquo;how do we collect the most data\u0026rdquo; but \u0026ldquo;how do we collect useful data without ever putting the user in a position they didn\u0026rsquo;t choose\u0026rdquo;.\nRule one: it is off until you say otherwise The foundation is the simplest possible rule, and it\u0026rsquo;s absolute. Telemetry is never enabled by default. A freshly installed tool built on go-tool-base sends nothing. Not a heartbeat, not a ping, nothing at all.\nIt only starts collecting when the user makes an explicit, visible choice to let it. Three honest doors: they run telemetry enable, they say yes to a clear prompt during init, or they set TELEMETRY_ENABLED themselves. All three are deliberate acts. None of them is a pre-ticked box or a default they have to discover and then undo.\nThis is opt-in, and the distinction from a well-hidden opt-out is the entire point. 
Opt-out telemetry treats consent as something to be assumed and grudgingly reversed. Opt-in treats it as something that has to be given. Only one of those is actually consent.\nRule two: no personally identifiable information, full stop Consent to \u0026ldquo;some telemetry\u0026rdquo; is not consent to \u0026ldquo;any telemetry\u0026rdquo;, so the second rule constrains what can ever be collected, even from a user who\u0026rsquo;s opted in.\nNo personally identifiable information. The framework does not record command arguments (they routinely contain paths, hostnames, the occasional secret someone\u0026rsquo;s pasted in). It does not record file contents. It does not record IP addresses.\nIt does need some notion of \u0026ldquo;distinct installations\u0026rdquo; for the numbers to mean anything, so it derives a machine ID from a handful of system signals and runs it through SHA-256. What leaves the machine is a hash. It tells you \u0026ldquo;this is the same install as last week\u0026rdquo; and tells you precisely nothing about whose install it is, and the hash can\u0026rsquo;t be walked backwards into the signals it came from.\nThe events themselves are deliberately thin. Which command ran, roughly how long it took, whether it errored. The shape of usage, not a transcript of it.\nRule three: the author picks the destination Even with consent given and PII excluded, there\u0026rsquo;s a third question: where does the data actually go? go-tool-base doesn\u0026rsquo;t answer that for you, because it can\u0026rsquo;t. A corporate internal tool, an open-source CLI and an air-gapped utility have completely different right answers.\nSo the backend is the tool author\u0026rsquo;s choice. The framework ships several (a noop backend, stdout, a file, plain HTTP, and OpenTelemetry over OTLP) and supports custom ones. The noop backend matters more than it looks: it lets a tool wire up the whole telemetry surface, commands and all, while sending data precisely nowhere. 
A perfectly reasonable, fully supported configuration.\nPluggable backends also mean the data never has to touch any infrastructure I run. It goes where the tool\u0026rsquo;s author decides, on their terms. The framework provides the plumbing and stays well out of the destination.\nAnd a way back out One last thing, because it\u0026rsquo;s the part that makes the opt-in real rather than decorative. A user who opted in can opt straight back out, and the package includes a GDPR-aligned deletion path, so \u0026ldquo;stop, and remove what you have\u0026rdquo; is an actual supported request rather than a polite fiction.\nConsent you can\u0026rsquo;t withdraw isn\u0026rsquo;t consent. It\u0026rsquo;s a one-way door with a friendly sign on it. The deletion path is what keeps the front door an actual door.\nThe bottom line Telemetry is genuinely useful to a maintainer and genuinely dangerous to the trust of the people running the tool, and the usual implementation (on by default, opt-out buried, collecting everything) spends that trust recklessly. go-tool-base\u0026rsquo;s telemetry holds three lines: never enabled without an explicit user action, never collecting personally identifiable information even once enabled, and always sending data to a destination the tool\u0026rsquo;s author chose, up to and including nowhere. A real deletion path makes the opt-in something you can take back.\nYou can have your usage numbers. You just have to ask for them, the way you would for anything else that wasn\u0026rsquo;t yours to begin with.\n","date":"2026-03-30T00:00:00Z","image":"/telemetry-that-asks-first/cover-telemetry-that-asks-first.png","permalink":"/telemetry-that-asks-first/","title":"Telemetry that asks first"},{"content":"An AI that can only produce text can describe your system. An AI that can call your Go functions can actually operate it. 
That gap, between describing and doing, is the difference between a chatbot and something genuinely useful, and crossing it comes down to one fiddly mechanism: tool-calling, and the loop that drives it.\nTalking about the system versus operating it Wire an AI provider into a CLI command and you get something that can talk. Ask it a question, get a paragraph back. Useful, up to a point.\nBut notice the ceiling. An AI that can only generate text can describe things. It can tell you what it would do. What it can\u0026rsquo;t do is look at the actual current state of your system, or take a real action, because it has no hands. It\u0026rsquo;s reasoning in a vacuum about a world it can\u0026rsquo;t reach out and touch.\nThe thing that gives it hands is tool-calling. You hand the AI a set of functions it\u0026rsquo;s allowed to call. Now, mid-conversation, it can decide it needs to read that file before it can answer, or run that query, or check that status, and actually go and do it, and then reason about the real result. The AI stops describing your system and starts operating it.\nThe loop is the hard part Tool-calling has a shape, and the shape is a loop. The literature calls it ReAct: Reason, Act, Observe.\nThe AI reasons about the prompt and decides whether it needs a tool. If it does, it acts, asking for a specific tool with specific arguments. Your code runs the tool and feeds the result back. The AI observes that result. Round again. Reason about the new information, maybe call another tool, maybe several. Keep going until the AI has what it needs and produces a final text answer with no more tool calls. Conceptually simple. 
Tedious and error-prone to implement by hand every single time: parsing the model\u0026rsquo;s tool-call requests, dispatching to the right function, marshalling arguments in and results out, feeding observations back in the exact format the provider expects, knowing when to stop, and not looping forever if the model gets itself stuck.\nThat orchestration is pure plumbing, and it\u0026rsquo;s identical for every tool and every command. So you can probably guess what\u0026rsquo;s coming: go-tool-base\u0026rsquo;s chat package owns it. You don\u0026rsquo;t write the loop. You write the tools.\nDefining a tool A chat.Tool is four things: a name, a description, a parameter schema, and a handler. The description is what the AI reads to decide whether to use the tool, so it\u0026rsquo;s worth writing well. The schema describes the arguments, and you don\u0026rsquo;t hand-write it. You write a tagged Go struct and let it generate:\ntype ReadFileParams struct { Path string `json:\u0026#34;path\u0026#34; jsonschema_description:\u0026#34;Relative path to the file\u0026#34;` } The struct is the contract. The framework derives the JSON Schema the AI is given straight from those tags, so the schema and the Go type the handler receives can\u0026rsquo;t drift apart, because they share a single source. The handler is then just an ordinary Go function that takes those parameters and returns a result.\nYou register your tools with SetTools, call Chat, and that\u0026rsquo;s the whole of your involvement. The framework runs the ReAct loop and Chat returns the AI\u0026rsquo;s final text answer once the loop settles.\nTwo details that show it was built for real use A couple of decisions in the loop tell you it\u0026rsquo;s meant for production, not a demo.\nTool errors don\u0026rsquo;t abort the conversation. When a handler returns an error, the framework doesn\u0026rsquo;t crash the loop. It hands the error back to the AI as a string, as just another observation. 
That\u0026rsquo;s deliberate, and it\u0026rsquo;s right. A real agent should be able to call a tool, watch it fail, and react: try different arguments, take a different route, or tell the user it couldn\u0026rsquo;t manage it. A loop that aborted on the first tool error would be far more brittle than the model driving it.\nThe loop is bounded. There\u0026rsquo;s a MaxSteps limit, default 20. An AI that gets confused could otherwise call tools forever, and a CLI command that never returns is a worse failure than a wrong answer. The cap guarantees the command terminates. The agent gets room to genuinely work a problem across many steps, but not infinite room to flail about in.\nThere\u0026rsquo;s also parallel tool execution: when the model asks for several tools in a single step (three independent file reads, say) the framework runs them concurrently rather than one after another, because there\u0026rsquo;s no reason to make the AI sit and wait out a sequence of things that don\u0026rsquo;t depend on each other.\nBoiling it down A text-only AI can describe your system; an AI that can call your functions can operate it. Bridging that gap means tool-calling, and tool-calling means the ReAct loop (reason, act, observe, repeat) whose orchestration is fiddly, identical every time, and not a problem worth solving twice.\ngo-tool-base\u0026rsquo;s chat package runs the loop for you. You define chat.Tool values (name, description, a tagged parameter struct that generates its own schema, a handler), call SetTools and Chat, and get the final answer. Tool errors go back to the AI as observations so it can recover, and a MaxSteps cap guarantees the command always terminates. You write Go functions. 
The framework turns them into things an agent can reach for.\n","date":"2026-03-29T00:00:00Z","image":"/letting-the-ai-call-your-go-functions/cover-letting-the-ai-call-your-go-functions.png","permalink":"/letting-the-ai-call-your-go-functions/","title":"Letting the AI call your Go functions"},{"content":"Let me describe the actual lifecycle of a user meeting your CLI tool, because it\u0026rsquo;s a bit humbling. They run it. It doesn\u0026rsquo;t quite do what they expected. They run it again with --help. They get a wall of monospaced flag descriptions, skim it, don\u0026rsquo;t find the thing they wanted, and either give up or go and ask a human who already knows.\nYour documentation might be magnificent. It doesn\u0026rsquo;t matter, because the user never reached it.\nThe manual loses on location, not quality That\u0026rsquo;s the lifecycle, and notice exactly where it breaks. The documentation might be excellent. It might answer their precise question in full. It doesn\u0026rsquo;t matter, because it\u0026rsquo;s on a website, in another window, behind a search box, and the user is here, in the terminal, mid-task. The docs lost not on quality but on location. They simply weren\u0026rsquo;t where the work was.\ngo-tool-base\u0026rsquo;s answer starts with a decision about location: the documentation gets embedded into the binary itself. Your docs/ folder ships inside the tool, the same way its default config does. Wherever the tool is installed, the docs are right there alongside it, no network, no browser. That embedding is what makes everything else possible, and there are two things built on top of it.\nA browser, in the terminal The first is the docs command, and it\u0026rsquo;s not --help with extra steps. It launches a proper Terminal User Interface, built on Bubble Tea.\nIt has a sidebar, structured from the project\u0026rsquo;s own zensical.toml or mkdocs.yml, so the docs are a navigable tree rather than one flat scroll. 
Markdown renders with real formatting through Glamour (colour, tables, lists, headings) instead of collapsing into monospaced soup. There\u0026rsquo;s live search across every page, regex included.\nCompared with man and --help, the difference isn\u0026rsquo;t a nicer coat of paint. man gives you linear scrolling and grep; this gives you a structured tree, rich rendering and real search. It\u0026rsquo;s the documentation experience a modern developer expects, except it followed the tool into the terminal instead of demanding the user leave it.\nA documentation assistant that won\u0026rsquo;t make things up The second thing built on the embedded docs is the one I find genuinely transformative: docs ask.\nThe user doesn\u0026rsquo;t navigate anything. They just ask:\nmytool docs ask \u0026#34;how do I point this at a self-hosted server?\u0026#34; and get a direct, specific answer. Under the hood, the framework collates the tool\u0026rsquo;s embedded markdown and hands it to the configured AI provider (Claude, OpenAI, Gemini, Claude Local, any OpenAI-compatible endpoint) as the context for the question.\nNow, \u0026ldquo;an AI answers questions about my tool\u0026rdquo; should immediately make you nervous, and the correct thing to be nervous about is hallucination. An AI that confidently invents a flag that doesn\u0026rsquo;t exist, or describes behaviour the tool simply doesn\u0026rsquo;t have, is worse than no assistant at all, because the user trusts it.\nThis is where embedding the docs pays off a second time, and it\u0026rsquo;s why I keep stressing that the corpus is closed. The model is instructed to answer only from the tool\u0026rsquo;s actual documentation, and the context it\u0026rsquo;s handed is exactly that documentation and nothing else. It isn\u0026rsquo;t drawing on a vague memory of similar tools from its training data. It\u0026rsquo;s answering from this tool\u0026rsquo;s real, shipped, version-matched docs. 
The corpus is small, closed and authoritative, which is the combination that keeps the answers honest. \u0026ldquo;Zero hallucination by design\u0026rdquo; isn\u0026rsquo;t a slogan about the model. It\u0026rsquo;s a property of bounding what the model is allowed to look at, which is the same instinct I leaned on with the mcp command: the safety comes from the boundary you drew, not from trusting the AI to behave itself.\nThere\u0026rsquo;s a nice second-order effect, too. The answer is always about the version of the tool the user actually has, because the docs were embedded into that build. No mismatch between a website documenting the latest release and the slightly older binary sitting on the user\u0026rsquo;s machine.\nThe upshot Documentation usually loses to --help not on quality but on location: it\u0026rsquo;s in a browser, and the user is in the terminal. go-tool-base embeds the docs into the binary and surfaces them two ways: a docs command that\u0026rsquo;s a real TUI browser with a sidebar, rich markdown and search, and docs ask, which answers natural-language questions using the embedded docs as context.\nBecause that context is the tool\u0026rsquo;s own closed, shipped documentation and the model is told to use nothing else, the assistant stays grounded, and it\u0026rsquo;s always describing the exact version the user is holding. The fix for unread documentation was never to write more of it. It was to put it where the work happens and let it answer back.\n","date":"2026-03-29T00:00:00Z","image":"/nobody-reads-the-manual/cover-nobody-reads-the-manual.png","permalink":"/nobody-reads-the-manual/","title":"Nobody reads the manual"},{"content":"I have a slightly complicated relationship with BDD. I\u0026rsquo;ve watched it turn a tangled test suite into something the whole team could read and reason about, and I\u0026rsquo;ve watched it turn a perfectly good unit test into a paragraph of ceremonial English that nobody benefits from. 
So when go-tool-base brought in Cucumber-style BDD, the interesting decision wasn\u0026rsquo;t adopting it. It was being ruthless about where not to.\nTwo tests that hurt for different reasons Most of go-tool-base\u0026rsquo;s tests are ordinary table-driven Go tests, and they\u0026rsquo;re absolutely fine. A function, a slice of input/expected pairs, a loop. Nobody needs Gherkin to understand a parser test.\nBut two areas were genuinely painful, and they were painful in the same way: the test had become harder to understand than the thing it was testing.\nThe first was pkg/controls, the service-lifecycle package. It runs a small state machine (Unknown, Running, Stopping, Stopped) with signal handling, health monitoring, restart policies and graceful shutdown all woven through it. The integration tests for graceful shutdown had grown to over three hundred lines of imperative goroutine and channel coordination. They worked. But reviewing them was a slog, and a test you can\u0026rsquo;t review with confidence is a test you can\u0026rsquo;t trust when it fails. The behaviour being checked, \u0026ldquo;when a shutdown signal arrives mid-startup, the controller stops cleanly\u0026rdquo;, was a simple sentence buried under a heap of synchronisation scaffolding.\nThe second was the CLI itself. init, update, doctor are user workflows. \u0026ldquo;Given a config file with a custom value, when I run init, then the custom value survives the merge.\u0026rdquo; That\u0026rsquo;s already a Given/When/Then; it just happened to be written out as Go.\nGodog, and the line I drew Godog is the official Go implementation of Cucumber. You write .feature files in plain Gherkin and bind each step to a Go function. 
The shutdown scenario stops being three hundred lines of channels and becomes this:\nScenario: graceful shutdown completes within the deadline Given a controller with two registered services When a shutdown signal is received Then both services stop in registration order And the controller reports a clean shutdown The goroutine choreography doesn\u0026rsquo;t vanish, of course. It moves into the step definitions, written once and reused. What changes is that the scenario is now readable by someone who\u0026rsquo;s never opened the file before, including someone from an ops team who\u0026rsquo;ll never write a line of Go but absolutely has opinions about how shutdown should behave.\nHere\u0026rsquo;s the part I want to dwell on, because it\u0026rsquo;s the part most BDD adoptions get wrong. The first design decision written down for this work was: strategic, not universal. Use Godog only where BDD adds clarity. Keep table-driven Go tests as the baseline everywhere else.\nThat sounds obvious written down. It is not obvious in practice, because BDD has a gravitational pull. Once a team has feature files, there\u0026rsquo;s a powerful urge to express everything as feature files, for consistency. And that\u0026rsquo;s how you end up with Gherkin scenarios for a pure function (Given the number 2, When I double it, Then I get 4) which is pure ceremony. You\u0026rsquo;ve wrapped a one-line table test in a paragraph of English and a step-definition indirection, and made it actively worse.\nThe honest test for whether BDD belongs is this: is this test a narrative, or is it a matrix?\nA matrix is the same logic with many input/output pairs. That\u0026rsquo;s a table-driven test, that\u0026rsquo;s most unit tests, and Gherkin actively harms them. A narrative is a sequence of steps where the ordering and the state between steps is the thing under test, and that\u0026rsquo;s where Gherkin pays for itself. Lifecycle transitions are narratives. 
A user running three commands in sequence is a narrative. Doubling a number is not.\ngo-tool-base drew that line and stuck to it. Feature files live in features/ at the project root, where a non-Go developer can find and read them. Step definitions live in test/e2e/, kept well away from the unit tests. And the unit tests stayed exactly what they were, because they were already the right tool.\nMade to fit, not bolted on A couple of smaller decisions kept the BDD layer from feeling like a foreign object.\nIt runs under go test. There\u0026rsquo;s no separate Cucumber runner to install or remember. A godog.TestSuite is invoked from an ordinary TestFeatures(t *testing.T), so the BDD scenarios run in the same go test ./... as everything else. CI didn\u0026rsquo;t need a new concept bolted onto it.\nAnd the CLI end-to-end tests build the gtb binary once and reuse it across every scenario. Compiling a binary per scenario would make the suite slow enough that people would quietly start skipping it, and a test suite people skip is just decoration. Build once, test many.\nStepping back go-tool-base brought in Godog for BDD, but the decision worth writing about is the restraint. BDD was applied to exactly two things: the service-lifecycle state machine, where a 300-line goroutine tangle became a four-line scenario anyone can review, and CLI workflows, which are Given/When/Then by their very nature. Everywhere else, table-driven Go tests remained the baseline, because wrapping a matrix test in Gherkin makes it worse, not better.\nThe useful rule: BDD fits a narrative, ordered steps with meaningful state in between, and fights a matrix. Adopt it as a scalpel for the narratives. 
Resist the pull to turn it into a religion.\n","date":"2026-03-28T00:00:00Z","image":"/bdd-where-it-earns-its-place/cover-bdd-where-it-earns-its-place.png","permalink":"/bdd-where-it-earns-its-place/","title":"BDD where it earns its place, and nowhere else"},{"content":"The moment you decide a CLI tool should talk to an LLM, there\u0026rsquo;s a strong gravitational pull towards reaching for LangChain, or one of its many relatives. It\u0026rsquo;s the obvious move. It\u0026rsquo;s also, for most CLI work, a bit like hiring a removals firm to carry a single box up the stairs.\nLet me explain why go-tool-base went the other way, and what \u0026ldquo;the other way\u0026rdquo; actually looks like.\nThe instinct, and why it overshoots When you add AI to a tool, the instinct is to reach for the big general-purpose framework. LangChain and its relatives are capable, and they exist for a real need: orchestrating complex multi-step AI applications, with retrieval pipelines, memory stores, chains of calls, whole fleets of agents.\nNow look at what a CLI tool actually needs from an LLM. It needs to send a prompt and get text back. Sometimes it wants structured data back instead of prose. Sometimes it wants to let the model call a few of the tool\u0026rsquo;s own functions. That\u0026rsquo;s pretty much the whole list.\nPulling in a framework built to orchestrate retrieval and agent swarms in order to do that is a poor trade. You take on a large new vocabulary of concepts, a wide dependency surface, and a great deal of abstraction you\u0026rsquo;ll never touch, all to perform three or four operations. The framework isn\u0026rsquo;t wrong. It\u0026rsquo;s just answering a far bigger question than the one a CLI tool is asking.\nWhat go-tool-base chose instead go-tool-base didn\u0026rsquo;t reach for a framework. 
The decision is on the record in its own design notes: before a single line was written, LangChain Go, go-openai, Vercel\u0026rsquo;s AI SDK and around ten other options were evaluated, and not one of them matched what a CLI framework actually needs. So the chat package was built deliberately small.\nHow small? The entire core ChatClient interface is four methods:\ntype ChatClient interface { Add(prompt string) error Chat(ctx context.Context, prompt string) (string, error) Ask(question string, target any) error SetTools(tools []Tool) error } Add appends a message to the conversation. Chat sends a prompt and returns text. Ask sends a prompt and returns a typed Go struct, the model\u0026rsquo;s answer unmarshalled straight into a value you defined. SetTools hands the model a set of your own functions it\u0026rsquo;s allowed to call. That\u0026rsquo;s the whole surface. Downstream code that uses AI never holds anything larger than this, and never has to know which provider is behind it.\nThe package\u0026rsquo;s own documentation has a word for this: right-sized. Large enough to solve genuine provider-abstraction complexity, small enough that the full interface fits on a single screen.\n\u0026ldquo;Thin\u0026rdquo; is not the same as \u0026ldquo;does little\u0026rdquo; This is the part worth being precise about, because \u0026ldquo;four methods\u0026rdquo; can sound like \u0026ldquo;barely does anything\u0026rdquo;, and that\u0026rsquo;s the wrong read entirely.\nBehind those four methods sits genuinely awkward work. Five providers (OpenAI, Claude, Gemini, a locally installed claude binary, and any OpenAI-compatible endpoint) each with a different wire API, all normalised behind the one interface. A tool-calling loop. Structured output via JSON Schema, made to behave consistently across providers that each express it differently. Error normalisation. Token chunking.\nThe point of a thin abstraction is not that there\u0026rsquo;s little underneath it. 
It\u0026rsquo;s that the interface stays small while the implementation quietly absorbs the complexity. Four methods on the surface; five provider integrations and a tool-calling loop below the waterline. The thinness is a property of what the caller sees, not of what the package does. A reach-for-LangChain decision gets that backwards: it exposes the caller to all the machinery, whether or not the caller will ever need it.\nThe core stays small even as features grow There\u0026rsquo;s a neat detail in how chat keeps the interface from creeping. The package also supports streaming responses and conversation persistence, both of which are real features with real surface area. Neither of them is in the four-method core.\nInstead they\u0026rsquo;re separate, optional interfaces. A streaming-capable client also satisfies StreamingChatClient; a persistable one also satisfies PersistentChatClient. Code that wants those capabilities does a type assertion to ask for them, and code that doesn\u0026rsquo;t simply never sees them. So the common path stays four methods forever. New capabilities arrive as opt-in interfaces alongside the core, not as new methods bolted onto it. The thing that fits on one screen keeps fitting on one screen.\nExtensible without forking, testable without a network Two more properties keep the package small without making it limiting.\nIt\u0026rsquo;s extensible. The provider list isn\u0026rsquo;t closed. A RegisterProvider call lets any package contribute a new provider, and chat.New will route to it. You add a backend without forking pkg/chat or sending a patch upstream.\nAnd it\u0026rsquo;s testable. The package ships generated mocks. A downstream tool\u0026rsquo;s AI features can be tested against a mock ChatClient returning canned responses, with no network, no API key, and no flakiness. Because the interface is four methods, that mock is trivial to set up and complete by construction. 
A sprawling framework interface is a sprawling thing to fake; a four-method one is not. (I\u0026rsquo;ll come back to testing AI code properly in a later post, because it deserves a whole article of its own.)\nThe right size When a CLI tool needs AI, the instinct is a large framework like LangChain. For orchestrating retrieval pipelines and agent swarms, that\u0026rsquo;s exactly the right tool. For sending a prompt, getting a struct back, and letting the model call a few functions, it\u0026rsquo;s enormous overkill.\ngo-tool-base\u0026rsquo;s chat package is the deliberate alternative, chosen only after LangChain Go and a dozen others were weighed up and rejected. Its core ChatClient interface is four methods. Underneath sit five normalised providers, a tool-calling loop, structured output and error handling, but the caller sees four methods and never learns which provider is active. Streaming and persistence are opt-in interfaces beside the core, not additions to it. It extends without forking and tests without a network. Right-sized: the complexity is real, but it lives under the interface rather than in it.\n","date":"2026-03-27T00:00:00Z","image":"/an-ai-interface-that-fits-on-one-screen/cover-an-ai-interface-that-fits-on-one-screen.png","permalink":"/an-ai-interface-that-fits-on-one-screen/","title":"An AI interface that fits on one screen"},{"content":"Run a command in your favourite CLI tool and look at what comes back. Colour. Neatly aligned columns. A friendly little summary sentence. Lovely\u0026hellip; if you happen to be a human with eyes.\nBut a good half of any tool\u0026rsquo;s users aren\u0026rsquo;t people at all. They\u0026rsquo;re scripts, CI pipelines, bits of automation. And that pretty output you\u0026rsquo;re so proud of is, to them, actively hostile.\nYour tool has two audiences and only serves one I made more or less this same point about AI assistants when I argued that your CLI is already an AI tool. The machines are users too. 
Here it isn\u0026rsquo;t an AI doing the calling, it\u0026rsquo;s a humble shell script, but the principle is identical.\nRun a CLI command and look at what comes back. Colour. Aligned columns. A friendly summary sentence. It\u0026rsquo;s designed for a person reading a terminal, and for a person reading a terminal it\u0026rsquo;s great.\nNow picture the other half of your users. A deploy script that needs to know which version is installed. A CI job that runs doctor and wants to fail the build on one specific check. A bit of automation gluing your tool to three others. None of them have eyes. They have parsers.\nSo what do they do with your beautiful human output? They butcher it. They grep for a keyword, awk out the third field, sed off a prefix. It works in the demo. Then someone rewords a status line, or adds a column, or the colour codes shift, and every script downstream breaks at once. Silently, too, because a broken grep returns nothing rather than an error. You changed a sentence and quietly took out somebody\u0026rsquo;s pipeline without ever knowing.\nThe human-readable output was never the contract. It just got used as one, because it was the only output there was.\nGive the machines their own channel The fix is not to make the human output more parseable. That\u0026rsquo;s a trap. You\u0026rsquo;d be constraining prose meant for people in order to satisfy programs, and end up serving neither of them well. The fix is to give programs their own output format, declared and stable, kept well away from the prose.\nSo every command built with go-tool-base gets a --output flag. Leave it alone and you get the friendly human rendering. Pass --output json and you get something a parser can actually rely on.\nAnd not just some JSON. JSON with a fixed shape.\nOne envelope, every command The temptation with JSON output is to let each command emit whatever structure happens to suit it. Don\u0026rsquo;t. 
A consumer scripting against five of your commands then has to learn five shapes, and \u0026ldquo;where\u0026rsquo;s the actual payload?\u0026rdquo; has a different answer every single time.\ngo-tool-base wraps every command\u0026rsquo;s JSON in one standard Response envelope:\n{ \u0026#34;status\u0026#34;: \u0026#34;success\u0026#34;, \u0026#34;command\u0026#34;: \u0026#34;deploy\u0026#34;, \u0026#34;data\u0026#34;: { \u0026#34;environment\u0026#34;: \u0026#34;production\u0026#34;, \u0026#34;version\u0026#34;: \u0026#34;1.4.0\u0026#34;, \u0026#34;replicas\u0026#34;: 3 } } status says how it went. command says what produced it. data holds the command-specific payload, and only the payload. Every built-in command (version, doctor, update, init) emits exactly this shape. So does every command you write, because pkg/output hands you the envelope rather than letting you freelance:\nformat, _ := cmd.Flags().GetString(\u0026#34;output\u0026#34;) w := output.NewWriter(os.Stdout, output.Format(format)) return w.Write(output.Response{ Status: output.StatusSuccess, Command: \u0026#34;deploy\u0026#34;, Data: result, }) The consumer-side payoff is the whole point. A script can check .status without ever touching .data. It can pull .data.version and know the field is there because it\u0026rsquo;s typed, not scraped. It learns the envelope once, and every command in your tool, and every tool built on the framework, honours it. The contract is explicit, versioned, and the same everywhere, which is precisely what the abused human output never was.\nThe human output gets to relax There\u0026rsquo;s a quiet second benefit, and it\u0026rsquo;s my favourite kind: the sort you get for free. Once programs have their own reliable channel, the human output is freed. It no longer has to stay accidentally parseable. You can reword a status line, add colour, restructure a table, make it genuinely nicer to read, and not break a single script, because no script is reading it any more. 
They\u0026rsquo;re all over on --output json, where the real contract lives.\nTwo audiences, two formats, each one actually suited to its reader. That\u0026rsquo;s the deal a CLI tool ought to be offering, and most of them don\u0026rsquo;t.\nIn short A CLI tool that only emits human-readable output is only half-built, because half its users are programs that end up grep-ing prose and shattering the moment that prose changes. go-tool-base gives every command a --output json flag and one standard Response envelope (status, command, data) used identically by every built-in command and by anything you write through pkg/output. Machines get a stable, explicit, learn-it-once contract; humans get output that\u0026rsquo;s now free to be properly readable, because nothing fragile depends on its wording any more.\nIf your tool will ever be called by another program (and it will), give that program a front door. Don\u0026rsquo;t make it climb in through the window.\n","date":"2026-03-25T00:00:00Z","image":"/half-your-users-dont-have-eyes/cover-half-your-users-dont-have-eyes.png","permalink":"/half-your-users-dont-have-eyes/","title":"Half your users don't have eyes"},{"content":"There\u0026rsquo;s a moment in the life of a lot of CLI tools where they stop being a CLI tool. Nobody quite decides it. It just happens. Someone needs the thing to also expose a little HTTP endpoint, or poll a queue, or run a scheduler, so it grows a serve command\u0026hellip; and the honest command-line utility you wrote is suddenly a long-running service wearing a CLI as a hat.\nAnd a service needs a whole pile of production plumbing that a one-shot command never did.\nThe command that stops being a command go-tool-base is CLI-first. It is not CLI-only, and the reason is a pattern I\u0026rsquo;ve watched play out more times than I can count.\nA tool starts its life as an honest command-line utility. It runs, it does its thing, it exits. Then someone needs it to expose a small HTTP endpoint. 
Or poll a queue. Or run a scheduler. So it grows a serve command, or a run command, and the moment it does, the thing that was a CLI tool is now a long-running service that happens to have a CLI bolted on the front.\nAnd a long-running service needs a whole category of plumbing a one-shot command never did. It has to start things up in a sensible order. It has to shut them down gracefully when someone sends a SIGTERM, finishing in-flight work rather than dropping it on the floor. It has to tell an orchestrator whether it\u0026rsquo;s alive, and whether it\u0026rsquo;s ready. It has to do something sensible when one of its internal services quietly falls over at 3am.\nHand-rolled, that\u0026rsquo;s a few hundred lines of goroutine choreography, channel-wrangling and signal handling that every such tool reinvents, slightly differently and slightly wrong each time. It\u0026rsquo;s the first-afternoon problem all over again, just turning up later in the project\u0026rsquo;s life. So go-tool-base ships it: pkg/controls.\nA controller and the things it controls The model is small. A Controller manages any number of services, each of which satisfies a Controllable interface, which at heart is just a StartFunc and a StopFunc. An HTTP server, a background worker, a scheduler, anything with a \u0026ldquo;begin\u0026rdquo; and an \u0026ldquo;end\u0026rdquo;.\nYou register your services with the controller and it owns their collective lifecycle. They share a common set of channels (errors, OS signals, health, control messages) so the whole set can react together. A SIGTERM doesn\u0026rsquo;t get caught by one service off in a corner; it reaches the controller, and the controller takes everything down in order, each StopFunc handed a context with a deadline so that one sulking service can\u0026rsquo;t wedge the whole shutdown forever.\nThat ordering and timeout handling is the bit nobody enjoys writing and everybody needs. 
Centralising it means a tool that adds a second service later inherits correct coordinated shutdown for free, rather than discovering on its first production SIGTERM that it only half shuts down.\nProbes, because something is usually watching If the service ends up in Kubernetes (and a lot of them do) the orchestrator wants to ask two different questions, and they really are different questions.\nLiveness: are you alive, or are you wedged and in need of a kill? Readiness: are you alive and able to take traffic right now? A service can quite easily be live but not ready\u0026hellip; still warming a cache, still waiting on a dependency. Conflate the two and you get yourself killed during a slow startup, or sent traffic before you can actually serve it.\ncontrols keeps them separate. You attach a WithLiveness probe and a WithReadiness probe to a service, each just a function returning a health report, and the controller exposes them. The tool answers Kubernetes honestly, in Kubernetes\u0026rsquo; own terms, without you hand-wiring two more HTTP handlers.\nSelf-healing, but only if you ask The last piece is what happens when a service fails. A worker\u0026rsquo;s StartFunc returns an error. Health checks start failing. In a hand-rolled setup this is where you either crash the whole process or write yourself a bespoke restart loop.\ncontrols has a supervisor that can restart a failed service for you, and the important word in that sentence is can. It\u0026rsquo;s off by default. A service is only supervised if you hand it a RestartPolicy at registration:\ncontrols.WithRestartPolicy(controls.RestartPolicy{ MaxRestarts: 5, InitialBackoff: time.Second, MaxBackoff: 30 * time.Second, HealthFailureThreshold: 3, }) With a policy in place, the controller restarts the service if its StartFunc errors out, or if it racks up more consecutive health-check failures than the threshold allows. 
Restarts back off exponentially, from InitialBackoff up to a MaxBackoff ceiling, so a service that\u0026rsquo;s failing because its database is down doesn\u0026rsquo;t sit there hammering that database flat with a tight restart loop. MaxRestarts caps the attempts, because a service that\u0026rsquo;s failed five times in a row is not going to be rescued by a sixth go, and at that point honest failure beats a thrashing pretence of health.\nOpt-in matters here. Automatic restarts are exactly right for a resilient daemon and exactly wrong for a tool where a failure should stop the line and get a human\u0026rsquo;s attention. The framework doesn\u0026rsquo;t make that call for you. It gives you the supervisor and lets you point it at the services that genuinely want it.\nThe bottom line A surprising number of CLI tools become long-running services the day they grow a serve command, and the day they do, they need coordinated startup, graceful ordered shutdown, real liveness and readiness probes, and a considered answer to a service falling over. That\u0026rsquo;s a few hundred lines of fiddly, easy-to-get-wrong plumbing.\npkg/controls provides it: a Controller over Controllable services with shared channels and deadline-bounded graceful shutdown, separate Kubernetes-style liveness and readiness probes, and an opt-in supervisor that restarts failed services with exponential backoff and a restart ceiling. 
Your tool can start as a command and grow into a daemon without that growth turning into a rewrite.\nCLI-first, but not stuck there.\n","date":"2026-03-24T00:00:00Z","image":"/lifecycle-management-for-long-running-go-services/cover-lifecycle-management-for-long-running-go-services.png","permalink":"/lifecycle-management-for-long-running-go-services/","title":"Lifecycle management for when your CLI grows up into a service"},{"content":"Every CLI tool past a certain size grows a category of logic that doesn\u0026rsquo;t really belong to any one command, and yet has to happen for loads of them. Timing. An auth check. Panic recovery, so a crash becomes a clean error instead of a stack-trace all over someone\u0026rsquo;s terminal. A log line saying the command started and how it finished.\nWeb frameworks sorted this out years ago. CLIs, for some reason, mostly still copy-paste it around.\nThe logic that belongs to no single command That category of logic doesn\u0026rsquo;t belong to any one command, yet needs to happen for many of them. Time how long the command took. Check the user is authenticated before a command that needs it. Recover from a panic so a crash becomes a clean error rather than a stack-trace vomited across the screen. Log that the command started and how it ended.\nNone of that is the command\u0026rsquo;s job. The deploy command\u0026rsquo;s job is to deploy. But timing and recovery and auth still have to happen around it, and around build, and around sync.\nPut that logic inside each command\u0026rsquo;s RunE and you\u0026rsquo;ve copied the same six lines into thirty functions, which means thirty places to fix when the logging format changes and thirty chances to forget one of them. Cross-cutting concerns copied by hand don\u0026rsquo;t stay consistent. They drift, every time.\nWeb frameworks already solved this This is not a new problem. 
It\u0026rsquo;s about the oldest problem in web frameworks, and they settled on an answer a long time ago: middleware. Gin has it, Echo has it, every HTTP stack you\u0026rsquo;ve ever touched has it. A middleware is a wrapper that sits around a handler, runs its cross-cutting logic, and calls through to the handler in the middle.\nA CLI command is, structurally, just a handler too. So go-tool-base brings the same pattern to the Cobra command tree, with the same functional Chain shape:\ntype Middleware func( next func(cmd *cobra.Command, args []string) error, ) func(cmd *cobra.Command, args []string) error A middleware receives the next handler in the chain and returns a new handler that wraps it. You compose a stack of them, and each command\u0026rsquo;s real RunE runs in the middle of the onion. Write the timing logic once, as one middleware, and every command in the chain is timed. Change the log format once and all thirty commands change with it, because there was only ever one copy. (The \u0026ldquo;write it once, in a place where everyone inherits it\u0026rdquo; drum again, which I will keep banging until the series runs out.)\n\u0026ldquo;But Cobra already has PreRun\u0026rdquo; It does, and this is the objection worth answering properly, because Cobra ships PersistentPreRun and PreRun hooks and they look, at a glance, like they cover this.\nThey don\u0026rsquo;t, and the reason is structural. A PreRun hook is a thing that happens before the command. That\u0026rsquo;s all it is. It can\u0026rsquo;t run anything after. It can\u0026rsquo;t wrap the command in a defer. It can\u0026rsquo;t catch a panic the command throws. It can\u0026rsquo;t measure how long the command took, because measuring a duration needs a start point and an end point, and the hook only owns the start.\nA middleware wraps the entire execution. 
Because it\u0026rsquo;s a function that calls next() in its own body, it straddles the command:\nfunc TimingMiddleware(next HandlerFunc) HandlerFunc { return func(cmd *cobra.Command, args []string) error { start := time.Now() err := next(cmd, args) // the command runs here log.Debug(\u0026#34;command finished\u0026#34;, \u0026#34;took\u0026#34;, time.Since(start)) return err } } Before, after, and around. A recovery middleware can defer a function that calls recover(), something a PreRun hook structurally cannot do. An auth middleware can check a condition and return an error instead of calling next() at all, refusing to let the command run in the first place. A plain PreRun hook can\u0026rsquo;t veto the command; it runs, and then the command runs regardless. (The error-returning PreRunE variant can abort the run, but it still can\u0026rsquo;t wrap what comes after.)\nPreRun is a notification that the command is about to happen. Middleware is control over whether and how it happens. For genuine cross-cutting concerns you need the second thing, not the first.\nTo sum up Timing, auth, recovery and logging are cross-cutting concerns: necessary for many commands, owned by none. Hand-copied into every RunE, they drift out of sync. Web frameworks fixed this with middleware years ago, and a CLI command is structurally just another handler.\ngo-tool-base brings the functional Chain middleware pattern to the Cobra command tree. A middleware wraps a command\u0026rsquo;s whole execution, so it acts before and after and can decide whether the command runs at all\u0026hellip; strictly more than Cobra\u0026rsquo;s PreRun hooks, which only fire beforehand and can\u0026rsquo;t wrap, recover or time the command, and can veto it only through the error-returning PreRunE variants. 
Write the concern once, wrap the chain, and every command inherits it consistently.\n","date":"2026-03-24T00:00:00Z","image":"/middleware-for-cli-commands-not-just-web-servers/cover-middleware-for-cli-commands-not-just-web-servers.png","permalink":"/middleware-for-cli-commands-not-just-web-servers/","title":"Middleware for CLI commands, not just web servers"},{"content":"The same tool, in two different lives, wants two completely different kinds of log.\nOn my laptop I want logs I can actually read: colour, alignment, friendly timestamps. The very same tool running as a daemon in a container wants none of that. It wants structured JSON, one object a line, ready for a log aggregator to swallow. And in a test I want the logger to shut up entirely. The interesting question is what it costs you to move between the three.\nThe same tool wants different logs On a developer\u0026rsquo;s machine the tool is a CLI. You want logs that are pleasant to read in a terminal: colour, alignment, human-friendly timestamps. The charmbracelet logger does that beautifully.\nThen the very same tool grows a serve command and gets deployed as a daemon in a container. Now coloured terminal output is worse than useless. The log aggregator wants structured JSON, one object per line, machine-parseable. slog does that.\nAnd in tests you want neither. You want the logger to exist, satisfy the interface, and stay completely silent.\nThat\u0026rsquo;s three different logging backends, wanted by one tool across three different lives. So what does switching between them actually cost?\nWhat it costs depends on what your packages imported If your packages import a concrete logger, if pkg/config and pkg/setup and twenty others each have import \u0026quot;github.com/charmbracelet/log\u0026quot; and take a *log.Logger, then the backend is welded into the entire codebase. Switching to JSON for the container build means editing the import and the parameter type in every single one of those packages. 
The backend has leaked. A detail that should have been one decision has become a property of a hundred files.\ngo-tool-base doesn\u0026rsquo;t let it leak. Every package in the framework accepts a logger.Logger, an interface, and nothing else. No package anywhere imports a concrete logging library. A package states, in its types, \u0026ldquo;I need something I can log through\u0026rdquo;, and stops right there. It has no idea, and no way to find out, what\u0026rsquo;s actually on the other end.\n// what every package depends on type Logger interface { Debug(msg string, args ...any) Info(msg string, args ...any) Warn(msg string, args ...any) Error(msg string, args ...any) // ... } The backend gets chosen once, at the top, when the tool builds its Props. It travels down to every package as the interface, through the Props container. The packages underneath never see the concrete type, so the concrete type can change without a single one of them noticing. (There\u0026rsquo;s that \u0026ldquo;decide it once, in one place\u0026rdquo; theme again. I did warn you it runs through everything.)\nThree backends, and the swap is one line go-tool-base ships three implementations of that interface:\ncharmbracelet (logger.NewCharm(w, opts...)). Coloured, styled, for humans at a terminal. The CLI default. slog JSON, a slog-backed backend emitting structured JSON, for daemons and containers feeding a log aggregator. noop, which does precisely nothing, for tests that want a real Logger and total silence. Switching the tool from a friendly CLI logger to container-ready JSON is a change to the one line in main() that constructs the logger. That\u0026rsquo;s the lot. pkg/config doesn\u0026rsquo;t change. pkg/setup doesn\u0026rsquo;t change. None of the twenty packages change, because none of them ever knew which backend they had. 
The decision was always one line; the interface is what kept it one line.\nThe noop backend deserves its own mention, because it\u0026rsquo;s the one people underrate. A test for a command shouldn\u0026rsquo;t be spraying log output all over the test run, but the command still needs a non-nil Logger to function. logger.NewNoop() gives you exactly that: interface satisfied, output binned, test quiet. And because it\u0026rsquo;s just another implementation of the same interface, no test needs any special logging machinery. It passes a different backend, exactly the way the container build does.\nThe general shape There\u0026rsquo;s nothing exotic going on here. It\u0026rsquo;s \u0026ldquo;depend on interfaces, not implementations\u0026rdquo;, which every Go developer has had drilled into them at some point. The bit worth holding onto is where the rule actually pays out, and it\u0026rsquo;s at the seams between a stable core and a detail you know full well you\u0026rsquo;ll want to vary.\nA logging backend is exactly such a detail. You will want it different in a terminal, in a container, and in a test. So the thing your code depends on has to be the interface, and the concrete backend has to be chosen at one well-known point and nowhere else. Get that boundary right and \u0026ldquo;we need JSON logs in production\u0026rdquo; is a one-line change. Get it wrong and it\u0026rsquo;s a refactor and a bad afternoon.\nWhat it comes down to One tool legitimately wants three different logging backends across its life: coloured output in a terminal, structured JSON in a container, silence in a test. The cost of moving between them is decided entirely by whether your packages imported a concrete logger or an interface.\ngo-tool-base\u0026rsquo;s packages depend only on logger.Logger, never a backend. Three implementations ship (charmbracelet, slog JSON, noop) and the backend is chosen once, in main(), then carried everywhere as the interface through Props. 
Switching is one line at the top, because the detail was never allowed to leak into the hundred files below it.\n","date":"2026-03-23T00:00:00Z","image":"/a-logging-interface-that-doesnt-leak-its-backend/cover-a-logging-interface-that-doesnt-leak-its-backend.png","permalink":"/a-logging-interface-that-doesnt-leak-its-backend/","title":"A logging interface that doesn't leak its backend"},{"content":"Here\u0026rsquo;s an error message I\u0026rsquo;ve been on the receiving end of more times than I\u0026rsquo;d care to count:\nerror: failed to read config file True. Also completely useless! I now know something is broken and I haven\u0026rsquo;t the faintest idea what to do about it. Which file? Why couldn\u0026rsquo;t it be read? Should I create it, run some init command, fix a permission, set an environment variable? The message states the problem and then abandons me at it, rather like a sat-nav cheerfully announcing \u0026ldquo;you have arrived\u0026rdquo; in the middle of a motorway.\nA message is not a fix The instinct, the moment you notice this, is to go and write a better message:\nerror: failed to read config file at ~/.config/mytool/config.yaml. Run \u0026#39;mytool init\u0026#39; to create one, or set MYTOOL_CONFIG to point at an existing file. Better for the human, no question. But look at what you\u0026rsquo;ve just done to the error as a value. The recovery advice is now welded into the error string. Any code that wants to ask \u0026ldquo;is this the config-missing error?\u0026rdquo; is reduced to substring-matching English prose. Reword the advice and you break the check. So you\u0026rsquo;ve helped the user and quietly sabotaged the program at the same time, because you\u0026rsquo;ve made one poor little string do two completely incompatible jobs\u0026hellip; being a stable identity for code, and being friendly guidance for people.\nWhy I changed error libraries go-tool-base started out on github.com/go-errors/errors. 
It\u0026rsquo;s a perfectly fine library and it gave us stack traces. What it didn\u0026rsquo;t give us was any way to attach human guidance to an error without shoving it into the message string. So the codebase did exactly the daft thing I just described: multi-line suggestion text baked straight into errors.Errorf calls, user-facing content and programmatic identity all mashed into one value.\nThat\u0026rsquo;s the whole reason for the migration to github.com/cockroachdb/errors. Not novelty, and not because I fancied a weekend of find-and-replace. One specific capability: cockroachdb/errors lets you attach a hint to an error as a separate, structured field.\nreturn errors.WithHint( errors.New(\u0026#34;failed to read config file\u0026#34;), \u0026#34;Run \u0026#39;mytool init\u0026#39; to create one, or set MYTOOL_CONFIG to point at an existing file.\u0026#34;, ) Now there are two things, cleanly apart. errors.New(\u0026quot;failed to read config file\u0026quot;) is the identity\u0026hellip; stable, matchable, the program\u0026rsquo;s handle on the error. The hint is the guidance\u0026hellip; for the human, and rewordable as much as you like without breaking a single check, because no check ever looks at it. errors.Is and errors.As work properly through every wrapper layer, so code matches on identity and never has to read prose.\nThe migration brought a few other things worth having. Stack traces print with a plain %+v instead of a type assertion. Errors can carry structured, machine-readable metadata. Multiple errors from concurrent work can be combined as a first-class value. But the hint is the one that actually changed the user\u0026rsquo;s day, because the hint is the recovery step, stored where it belongs.\nOne door out, and it knows where the help is Separating the hint is only half of it. 
The other half is making sure those hints actually reach the user, every time, and that comes down to having a single way out.\nEvery go-tool-base command returns its errors the idiomatic Cobra way, through RunE. They all funnel into one Execute() wrapper at the root, which routes every error (runtime failure, flag parse error, pre-run failure) through one ErrorHandler. One door out. So error presentation gets decided in exactly one place, and no command can render an error differently from the command sat next to it.\nAnd because there\u0026rsquo;s one handler, it can pull off something the individual commands never could. The framework knows your tool\u0026rsquo;s metadata, including its configured support channel, be it a Slack workspace or a Teams channel. So the error handler can finish a fatal error not just with the what and the recovery hint, but with where to go if the hint didn\u0026rsquo;t help:\nerror: failed to read config file hint: Run \u0026#39;mytool init\u0026#39; to create one, or set MYTOOL_CONFIG. Still stuck? Ask in #mytool-support on Slack. The user is never left at a dead end. The error tells them what broke, the hint tells them the most likely fix, and if that\u0026rsquo;s still not enough the handler tells them which door to go and knock on. A failure becomes a signpost instead of a full stop.\nThe short version An error that only reports what went wrong leaves the user stranded, and the obvious fix (writing the recovery advice into the message) quietly wrecks the error as a value, because now your code has to substring-match prose just to work out what it\u0026rsquo;s looking at.\ngo-tool-base moved from go-errors to cockroachdb/errors to get hints: a structured, separate field for human guidance that leaves the error\u0026rsquo;s identity clean for errors.Is and errors.As. 
Every command\u0026rsquo;s errors leave through one Execute() wrapper and one ErrorHandler, so presentation stays consistent, and because that handler knows the tool\u0026rsquo;s support channel it can point a stuck user at real help.\nState the problem for the program. Give the fix to the human. And for pity\u0026rsquo;s sake, keep the two in different fields.\n","date":"2026-03-22T00:00:00Z","image":"/errors-that-tell-the-user-what-to-do-next/cover-errors-that-tell-the-user-what-to-do-next.png","permalink":"/errors-that-tell-the-user-what-to-do-next/","title":"Errors that tell the user what to do next"},{"content":"Go\u0026rsquo;s embed package is one of those features that makes you slightly giddy the first time you use it. One //go:embed directive and your default config, your templates, your docs are all baked into the binary. The tool just works the moment it\u0026rsquo;s installed, with nothing external to lose or forget to ship.\nAnd then you go and build something modular on top of it, and you discover the catch nobody warned you about.\nembed.FS is an island An embed.FS has a property that\u0026rsquo;s easy to miss until it bites: it\u0026rsquo;s local to the package that declared it. The //go:embed directive can only see files at or below its own source file. So in any project bigger than a toy, you don\u0026rsquo;t have an embedded filesystem. You have many. The root package embeds one. Each feature, each subcommand that ships its own templates or defaults, embeds another. They\u0026rsquo;re islands, one per package, and Go gives you no native way to make them behave as a whole.\nFor most files that\u0026rsquo;s perfectly fine. A feature\u0026rsquo;s templates can stay on the feature\u0026rsquo;s island; nothing else needs them.\nIt stops being fine the moment features need to contribute to something shared.\nThe shared-config problem Here\u0026rsquo;s the case that forces the issue. 
A go-tool-base tool has a global config.yaml of defaults, embedded at the root. Now you add a feature, and that feature has its own configuration keys, with their own sensible defaults.\nWhere do those defaults go?\nThe naive answer is: edit the root config.yaml and add the feature\u0026rsquo;s section. And that\u0026rsquo;s a genuinely bad answer, because it inverts the dependency. The root config now has to know about every feature. Add a feature, edit the centre. Remove one, edit the centre again. The central file becomes a pinch point that every feature has to reach into, and a modular architecture where you can\u0026rsquo;t add a module without editing the core isn\u0026rsquo;t really modular at all\u0026hellip; it just has more files.\nWhat you actually want is for the feature to ship its own slice of default config, on its own island, and for the global config the tool reads to somehow already contain it. The feature contributes; the centre doesn\u0026rsquo;t budge.\nprops.Assets: merge the islands That\u0026rsquo;s the job of props.Assets. (Yes, it lives on Props, the load-bearing container I keep going on about. Most of the good stuff does.) It\u0026rsquo;s a layer that implements the standard fs.FS interface, and into it you Register each embed.FS under a name:\n// root main.go Assets: props.NewAssets(props.AssetMap{\u0026#34;root\u0026#34;: \u0026amp;assets}), // a feature\u0026#39;s command constructor //go:embed assets/* var assets embed.FS func NewCmdFeature(p *props.Props) *cobra.Command { p.Assets.Register(\u0026#34;feature\u0026#34;, \u0026amp;assets) // ... } Now Props carries one Assets value that represents all the islands as a single filesystem. The root\u0026rsquo;s files and every registered feature\u0026rsquo;s files, addressable through one fs.FS. Each registration is named, so the islands stay individually identifiable, but they read as one.\nThat alone solves the addressing problem. 
The genuinely clever part is what happens for structured files.\nOpening a file that exists in several places When you Open a path through props.Assets and that path has a structured extension (.yaml, .yml, .json, .csv) it doesn\u0026rsquo;t simply return the first match it stumbles across. It does this:\nDiscovery. It finds every instance of that path, across every registered filesystem. Parsing. It unmarshals each one. Merging. It deep-merges the parsed data, using mergo. Re-serialisation. It hands you back a single fs.File whose contents are the combined, merged result. So picture the shared-config problem again, only solved this time. The root ships a config.yaml with the base defaults. Each feature ships a config.yaml on its own island carrying only its own keys. Nobody edits anybody else\u0026rsquo;s file. When the init command opens config.yaml through props.Assets, it doesn\u0026rsquo;t get the root\u0026rsquo;s copy. It gets the deep-merge of the root\u0026rsquo;s copy and every registered feature\u0026rsquo;s copy: one config.yaml that contains every default in the tool, assembled at runtime from contributions that never knew about each other.\nA feature contributes its defaults simply by existing and registering. The centre never changes. That\u0026rsquo;s the modular property the naive approach couldn\u0026rsquo;t give you, and it generalises well beyond config\u0026hellip; the same merge applies to a shared commands.csv, or any structured file features want to add rows or keys to.\nThere\u0026rsquo;s also a Mount method for attaching an arbitrary fs.FS at a virtual path, which is handy for surfacing something external (a temp directory, say) as part of the same tree. But the structured merge is the feature that really earns Assets its place.\nBoiling it down embed.FS is per-package by design, so a modular CLI ends up with many embedded filesystems, one island per feature. Most of the time that\u0026rsquo;s fine. 
It fails specifically when features need to contribute to a shared resource like the global config.yaml, because the naive fix forces every feature to reach in and edit a central file.\nprops.Assets merges all the registered islands into a single fs.FS, and for structured files it goes further: opening a .yaml, .json or .csv discovers every copy across every island, deep-merges them, and returns the combined whole. A feature drops its own defaults onto its own island, registers, and the merged config the tool reads already includes them. Contribution without coupling, which is rather the whole point of being modular in the first place.\n","date":"2026-03-21T00:00:00Z","image":"/many-embedded-filesystems-one-merged-view/cover-many-embedded-filesystems-one-merged-view.png","permalink":"/many-embedded-filesystems-one-merged-view/","title":"Many embedded filesystems, one merged view"},{"content":"I name-dropped Props back in the introduction and then rather glossed over it, which was a bit unfair of me, because it\u0026rsquo;s the single most important design decision in the whole framework. So let\u0026rsquo;s give it the attention it actually deserves.\nAnd the best place to start, oddly enough, is the name.\nStart with the name The container at the centre of go-tool-base is called Props, and the name is doing real work, so we\u0026rsquo;ll start there.\nIt is not short for \u0026ldquo;properties\u0026rdquo;, though it does hold a few. A prop is the heavy timber or steel beam that stops a structure quietly collapsing in on itself. And for anyone who follows the rugby: a prop is the position in the scrum, the broad-shouldered forward whose entire job is to provide structural support so everyone else can get on with the game.\nThat\u0026rsquo;s the design brief, in a single word. Props is not where the clever, flashy work happens. It scores no tries. 
It\u0026rsquo;s the unglamorous, load-bearing thing that holds the framework up so that your actual command logic gets to be the interesting part. Understand the name and you understand what the struct is for.\nWhat it carries Props is the single object passed to every command constructor in a go-tool-base tool. It holds the dependencies a command might need:\nTool, metadata about the CLI (name, summary, release source). Logger, the logging abstraction. Config, the loaded configuration container. FS, a filesystem abstraction (afero), so a command never touches the real disk directly. Assets, the embedded-resource manager. Version, build information. ErrorHandler, the centralised error reporter. A command constructor\u0026rsquo;s signature is, accordingly, boring on purpose:\nfunc NewCmdExample(p *props.Props) *cobra.Command { ... } One parameter. Everything the command could possibly need is reachable through it. No globals, no init()-time wiring, no twelve-argument constructor that quietly grows a thirteenth argument next month.\nWhy a struct, and not context.Context Here\u0026rsquo;s the design decision I actually want to defend, because it\u0026rsquo;s the one Go developers tend to raise an eyebrow at. Go already has a well-known way to carry things through a call tree: context.Context. So why not just put the logger and the config in the context and pass that around?\nBecause context.Context carries its values as interface{}, and that\u0026rsquo;s the wrong trade for dependencies.\nPull a dependency out of a context and you get this:\nl := ctx.Value(\u0026#34;logger\u0026#34;).(logger.Logger) // a runtime type assertion That one line has two separate ways to hurt you. The key is a bare string, so a typo compiles perfectly happily and then fails at runtime. The type assertion is unchecked, so if the wrong thing is sitting under that key, your tool panics in front of a user. Neither failure is visible to the compiler. Neither is visible to your IDE. 
You find out when it breaks, which is to say at the worst possible time.\nPull the same dependency out of Props and you get this:\np.Logger.Info(\u0026#34;starting\u0026#34;) // a field access p.Logger is a typed field. If it doesn\u0026rsquo;t exist, or you\u0026rsquo;ve used it wrong, the code simply doesn\u0026rsquo;t compile. Your IDE autocompletes it. Refactor the Logger interface and every misuse lights up at build time. There\u0026rsquo;s no runtime type assertion, because there\u0026rsquo;s no interface{} to assert from in the first place.\ncontext.Context is the right tool for what it was designed for: cancellation, deadlines, request-scoped signals that genuinely cross API boundaries. It\u0026rsquo;s the wrong tool for \u0026ldquo;here are my program\u0026rsquo;s services\u0026rdquo;, because it trades away the compiler\u0026rsquo;s help for a flexibility you really don\u0026rsquo;t want here. Dependencies should be declared, somewhere the compiler checks them. Props is that somewhere.\nWhat you get back for it That one decision pays out in three currencies.\nTestability. A command is now a pure function of its Props. To test it, you build a Props with the doubles you want (an in-memory FS instead of the real disk, a no-op Logger, a config you\u0026rsquo;ve populated by hand) and call the constructor. No global state to reset between tests, no monkey-patching, no init() order to puzzle over. The dependency is an argument, so the test just passes a different one.\nConsistency. Cross-cutting changes have exactly one place to happen. When the global --debug flag flips the log level, it does so on the Logger inside Props, and because every command reads its logger from the same Props, every command gets the new level. No command can drift, because none of them owns its own copy.\nExtensibility. Adding a new framework-wide service is just adding a field to one struct. 
Every command can immediately reach it; none of them needed touching to make it reachable.\nTo sum up Props is the dependency-injection container at the heart of go-tool-base: one struct, passed to every command, holding the logger, config, filesystem, assets, error handler and tool metadata. It\u0026rsquo;s a concrete struct rather than a context.Context payload entirely on purpose, because dependencies belong somewhere the compiler can check them, not behind a string key and a hopeful runtime type assertion. That single choice buys you testability, consistency and easy extension.\nThe name says it best, really. Props doesn\u0026rsquo;t score the tries. It\u0026rsquo;s the broad-shouldered thing in the scrum that stops the whole framework folding, so the rest of your code is free to go and play.\n","date":"2026-03-21T00:00:00Z","image":"/props-the-container-that-does-the-heavy-lifting/cover-props.png","permalink":"/props-the-container-that-does-the-heavy-lifting/","title":"Props: the container that does the heavy lifting"},{"content":"Here\u0026rsquo;s a question that sounds trivial and really isn\u0026rsquo;t: where, exactly, does a CLI tool\u0026rsquo;s structure live? Not the logic of each command\u0026hellip; the structure. Which commands exist, what they\u0026rsquo;re called, which flags they take, what\u0026rsquo;s nested under what.\nI\u0026rsquo;d never properly thought to ask it until go-tool-base forced me to, and the honest answer turned out to be a little bit embarrassing.\nWhere does a CLI\u0026rsquo;s structure actually live? Picture a CLI tool with twenty commands, some nested under others. In a typical project, where does its structure live? The honest answer is \u0026ldquo;smeared across the codebase\u0026rdquo;. It\u0026rsquo;s in twenty cmd.go files. It\u0026rsquo;s in the AddCommand calls that stitch them together. It\u0026rsquo;s in the flag registrations. 
To understand the shape of the tool you have to read all of it and assemble the picture in your head, because the picture exists nowhere as a single thing you can point at.\nThat\u0026rsquo;s a strange state of affairs for the single most important design fact about a CLI. The command tree is the tool\u0026rsquo;s interface, it\u0026rsquo;s the thing users actually touch, and yet it hasn\u0026rsquo;t got a home.\nThe manifest gives it one go-tool-base\u0026rsquo;s generator gives that structure a home: .gtb/manifest.yaml. The manifest is a single readable file describing the command tree. Every command, its name, its short description, its flags, its place in the hierarchy, whether it carries assets or an initialiser. The shape of the whole tool, in one place you can open and read top to bottom.\nAnd the manifest isn\u0026rsquo;t documentation about the project. It\u0026rsquo;s the thing the project\u0026rsquo;s wiring is generated from. When you run regenerate project, the generator reads the manifest and rebuilds the boilerplate to match it: the command registration, the AddCommand wiring, the flag definitions. The manifest is the source of truth, and the Go wiring is its output.\nDesign-first, when you want it This unlocks a way of working that the smeared-across-the-codebase approach simply can\u0026rsquo;t offer. You can design the interface first, in the manifest, and let the code follow.\nWant to rename a command? Edit one line in the manifest, run regenerate, and the rename propagates through every wiring file that ever mentioned it. Want to move a subcommand under a different parent? Change its place in the manifest hierarchy and regenerate. Want to add a flag to three related commands? 
Add it in the manifest, in three obvious places, and regenerate, instead of going on a little hunting expedition for three flag-registration blocks scattered across the tree.\nYou\u0026rsquo;re editing the tool\u0026rsquo;s interface as a design, in the file whose entire job is to hold that design, and the generator does the mechanical donkey-work of making the code reflect it. The thing you change is the thing that describes the structure. The code is downstream.\nIf that shape sounds familiar, it should. It\u0026rsquo;s the same instinct behind spec-driven and test-driven development: write down what the thing should be before you assemble how it works, and keep that statement of intent as a first-class, living artefact rather than a comment that quietly rots in a corner. The manifest is a spec for your command tree, and regenerate is what keeps the implementation honest to it.\nIt doesn\u0026rsquo;t trap you There\u0026rsquo;s an obvious worry about any generated-from-a-manifest system: am I now locked into editing the manifest? What if I just want to open a Go file and write some Go like a normal person?\nYou can. The generator is careful not to own everything. It owns the wiring (the registration and the structural boilerplate) and it leaves your command logic well alone. The RunE function where your command actually does its work is yours; the manifest hasn\u0026rsquo;t got an opinion about it. And the generator tracks the files it produces by content hash, so if you do hand-edit something it generated, regeneration notices and asks before overwriting rather than steamrolling you. That mechanism turned out interesting enough to get its own post.\nSo the manifest is an option, not a cage. Design-first via the manifest when that suits the change. Drop into Go directly when that suits it better. 
The two stay in sync because regeneration reconciles them, not because one of them has been forbidden.\nPulling it together A CLI\u0026rsquo;s command tree is its most important design surface, and in most projects it has no single home\u0026hellip; it gets reconstructed in your head from twenty scattered files every time you need to reason about it. go-tool-base gives it one: .gtb/manifest.yaml, a readable description of the whole tree that the generator rebuilds the wiring code from. Edit the manifest, run regenerate, and the boilerplate follows.\nIt makes CLI structure something you design in one place, in the spirit of spec-driven development, while still leaving you free to write Go directly when that\u0026rsquo;s the better tool for the job. The manifest is the spec for your interface. The generator just keeps the code faithful to it.\n","date":"2026-03-20T00:00:00Z","image":"/design-your-whole-cli-in-one-file/cover-design-your-whole-cli-in-one-file.png","permalink":"/design-your-whole-cli-in-one-file/","title":"Design your whole CLI in one file"},{"content":"When I introduced go-tool-base I made a passing promise to come back to \u0026ldquo;the generator that won\u0026rsquo;t clobber your edits\u0026rdquo;. This is me keeping it, partly because it\u0026rsquo;s the feature I\u0026rsquo;m quietly most proud of, and partly because it took the most head-scratching of anything to get right.\nThe problem it solves is one that every code generator runs into eventually, usually the hard way and usually at the worst possible moment.\nThe generator\u0026rsquo;s awkward second act A project generator has an easy first act. gtb generate skeleton, and you\u0026rsquo;ve got a complete, wired, idiomatic Go CLI project. Everyone\u0026rsquo;s happy, me included.\nThe second act is the hard one. The framework moves on. A convention changes, a new built-in capability appears, the recommended CI shape shifts. 
Your project, scaffolded three months ago, is now subtly out of date, and you\u0026rsquo;d quite like the generator to drag it back up to spec.\nExcept by now it isn\u0026rsquo;t a fresh scaffold. It\u0026rsquo;s your project. You tuned the CI workflow. You rewrote the justfile. You added a stanza to the Dockerfile that took an afternoon and a fair bit of swearing to get right. The generated files and your edited files are one and the same files.\nA naive generator handles this with breathtaking confidence: it regenerates everything from the template and overwrites the lot. Run it once, lose your afternoon. You learn that lesson exactly once and then never run regeneration again, which means the upkeep feature you were sold is dead on arrival. A scaffold you can\u0026rsquo;t safely re-run is just a one-shot cp with extra steps.\nWhat the generator needs to know The thing standing between \u0026ldquo;safe to overwrite\u0026rdquo; and \u0026ldquo;absolutely do not\u0026rdquo; is a single fact: has this file changed since the generator last wrote it?\nIf it hasn\u0026rsquo;t, the file is still pristine boilerplate and the generator owns it. Overwrite away. If it has, a human has been in there, and the generator must not touch it without asking first.\nThe generator can\u0026rsquo;t just eyeball that, of course. It needs a record. So every time gtb generate writes a file, it computes a SHA-256 of the content and stores it in the project\u0026rsquo;s manifest, .gtb/manifest.yaml, as a Hashes map of relative path to hash. The manifest is the generator\u0026rsquo;s memory of the exact bytes it last produced.\nRegeneration becomes a three-way decision With that record in hand, regeneration stops being \u0026ldquo;overwrite everything\u0026rdquo; and becomes a per-file decision with three branches.\nThe file doesn\u0026rsquo;t exist. Easy. Write it, store its hash.\nThe file exists and its current hash matches the manifest. 
It\u0026rsquo;s byte-for-byte what the generator last wrote, so nobody has touched it. The generator owns it outright, regenerates from the template and updates the stored hash. No prompt, no fuss. This is the common case, and it\u0026rsquo;s silent precisely because it\u0026rsquo;s safe.\nThe file exists and its hash does not match. Someone has been in there since generation. The generator stops and asks. It will not silently overwrite your hard-won afternoon. You decide: take the new version, or keep yours.\nThe detail I\u0026rsquo;m genuinely fond of is what happens when you decline. Declining is non-fatal. Generation carries on with the rest of the files, and the manifest keeps the file\u0026rsquo;s stored hash rather than dropping it. That matters more than it looks, because it means the file stays tracked. Next time you regenerate, the generator can still tell that file has been modified, and still asks. Skipping a file once doesn\u0026rsquo;t quietly evict it from the generator\u0026rsquo;s awareness forever. It stays a known, watched, customised file across every future run.\nWhen you want it to stop asking Per-file prompting is the right default, but for files you\u0026rsquo;ve permanently taken ownership of, being asked on every single regeneration is just noise. If you\u0026rsquo;ve rewritten the CI workflows wholesale and you are never, ever going back to the generated version, you don\u0026rsquo;t want a prompt. You want the generator to leave them well alone and not bring it up again.\nThat\u0026rsquo;s what .gtb/ignore is for. It sits next to the manifest and takes gitignore-style patterns:\n# I own the CI workflows now .github/workflows/** # ...except the release workflow, keep that managed !.github/workflows/release.yml # and my build config justfile Dockerfile Anything matching is skipped during regeneration with no prompt at all. Patterns evaluate top to bottom and later ones win, so the negation (!) 
behaves the way you\u0026rsquo;d expect from .gitignore: exclude a whole directory, then claw one file back.\nIt\u0026rsquo;s a deliberate escalation ladder. Unmodified files are handled silently. Modified files get a prompt. Files you\u0026rsquo;ve formally claimed get total silence. Each rung asks for less of your attention than the last, and you choose how far up to climb, file by file.\nStepping back A generator earns its keep twice: once when it scaffolds your project, and then continuously, every time it drags that project back up to the framework\u0026rsquo;s current shape. The second job is worth nothing if regeneration flattens your customisations, because you\u0026rsquo;ll simply stop running it, and who could blame you.\ngo-tool-base\u0026rsquo;s generator gets around that by remembering. It hashes every file it writes into .gtb/manifest.yaml, and on regeneration it re-hashes before overwriting: unchanged files it owns and updates silently, changed files it stops and asks about, and .gtb/ignore lets you mark files as permanently yours. Skipped files stay tracked, so the generator never loses sight of what you\u0026rsquo;ve made your own.\nThe point of a scaffold isn\u0026rsquo;t the first five minutes. It\u0026rsquo;s that you can still run it in month three without holding your breath.\n","date":"2026-03-20T00:00:00Z","image":"/scaffolding-that-respects-your-edits/cover-scaffolding-that-respects-your-edits.png","permalink":"/scaffolding-that-respects-your-edits/","title":"Scaffolding that respects your edits"},{"content":"\u0026ldquo;Make it work with AI\u0026rdquo; has become one of those requests that lands on a developer\u0026rsquo;s desk with a thud and not much further detail attached. 
My instinct, the first time, was to brace for a big lump of integration work\u0026hellip; a bespoke adapter for this assistant, another for that one, a treadmill of little wrappers stretching off into the distance.\nTurns out I\u0026rsquo;d already done most of the work. So have you, if your CLI tool is any good. Let me explain what I mean.\nYou already described your capabilities Stop and think for a second about what a well-built CLI tool actually is. It\u0026rsquo;s a set of named operations, each with a human-readable description, each taking a set of typed, named, documented parameters. You wrote all of that already, because a CLI without it is unusable by people.\nNow look at what an AI assistant needs in order to call a tool. A set of named operations. A description of each, so it knows when to reach for them. A typed parameter schema for each, so it knows how to call them.\nIt\u0026rsquo;s the same list! A good CLI is already, structurally, a description of a set of capabilities. The information an AI agent needs isn\u0026rsquo;t extra work you have to go and do. It\u0026rsquo;s work you finished the moment your --help output was any good.\nThe only thing missing is a translator. Something that takes \u0026ldquo;this is a CLI\u0026rdquo; and presents it as \u0026ldquo;this is a set of tools an AI can call\u0026rdquo;.\nMCP is that translator, and it\u0026rsquo;s a standard The temptation, when you want your tool to be AI-usable, is to sit down and write an integration. A little adapter for Claude Desktop. Another for Cursor. Another for whatever turns up next month. Each one a bespoke wrapper, each one a thing to maintain, and the list never stops growing because new assistants keep appearing. That\u0026rsquo;s the treadmill I was bracing for.\nThe Model Context Protocol exists to kill that list. MCP is an open standard for how an AI model discovers and calls local tools. Implement it once and your tool works with every assistant that speaks it. 
Write once, not once-per-client.\nSo go-tool-base implements it once, in the framework, for everyone. (That\u0026rsquo;s rather the theme of this whole series, if you hadn\u0026rsquo;t spotted it yet\u0026hellip; do the annoying thing once, properly, in a place where every tool inherits it.)\nThe mcp command, and the mapping it does for free Every tool built on go-tool-base inherits a built-in mcp command. Run it:\nmytool mcp and the tool starts a JSON-RPC server over standard I/O, speaking MCP. That\u0026rsquo;s the whole user-facing surface. One command.\nBehind it, the framework walks your Cobra command tree and maps it straight onto MCP tool definitions:\nEach command becomes a tool. Each command\u0026rsquo;s short description becomes the tool\u0026rsquo;s description, the text the AI reads to decide whether this is the tool it wants. Each command\u0026rsquo;s flags and arguments become the tool\u0026rsquo;s JSON Schema parameters. There\u0026rsquo;s no second schema to write and then keep in sync (and we all know how well \u0026ldquo;keep these two things aligned by hand\u0026rdquo; tends to go). The command tree is the schema. Add a new command to your CLI and it\u0026rsquo;s a new tool for the agent, automatically, with the description and flags you already gave it. Nobody has to remember to update an MCP manifest, because there\u0026rsquo;s no separate MCP manifest to forget about.\nConfiguring an assistant to use it On the assistant\u0026rsquo;s side it\u0026rsquo;s just as undramatic. You tell your AI client (Claude Desktop, Cursor, anything MCP-aware) to launch mytool mcp. From then on the assistant:\nStarts your tool in MCP mode when it boots. Discovers every command as a callable tool. Calls the right one, with the right parameters, when a user\u0026rsquo;s request needs it. 
Your CLI tool has quietly become something the AI can pick up and use, mid-conversation, on its own initiative.\nThe safety property worth noticing Now, \u0026ldquo;let an AI run things on my machine\u0026rdquo; is rightly a sentence that makes people nervous. It makes me nervous, and I built the thing. So it\u0026rsquo;s worth noticing the constraint sitting quietly in this design.\nThe AI can only call what you defined. The tools it sees are exactly the commands in your tree, and the parameters it can pass are exactly the flags and arguments you declared, validated against the JSON Schema generated from them.\nIt can\u0026rsquo;t invent a command. It can\u0026rsquo;t pass a parameter you never defined. The boundary of what the agent can do is the boundary of what your CLI does, and you drew that boundary already, back when you built the tool. Exposing the CLI over MCP doesn\u0026rsquo;t widen the surface one inch. It just makes the existing surface reachable. The AI isn\u0026rsquo;t running things. It\u0026rsquo;s running your commands, the ones you wrote, tested and shipped, and nothing else.\nThe gist A CLI tool, built properly, is already a structured description of a set of capabilities: named operations, descriptions, typed parameters. Which is also exactly what an AI agent needs in order to call a tool. The gap between the two is only a translator, and writing a bespoke one per assistant is a treadmill you don\u0026rsquo;t need to step onto.\ngo-tool-base puts the translator in the framework. Every tool gets an mcp command that serves the command tree over the Model Context Protocol\u0026hellip; commands become tools, descriptions become descriptions, flags become JSON Schema parameters, with no second schema to maintain. Point any MCP-aware assistant at it and your CLI is an agent-callable tool, bounded to exactly the commands you shipped.\nYou did the hard part when you built a good CLI. 
MCP just opens the door you\u0026rsquo;d already framed.\n","date":"2026-03-19T00:00:00Z","image":"/your-cli-is-already-an-ai-tool/cover-your-cli-is-already-an-ai-tool.png","permalink":"/your-cli-is-already-an-ai-tool/","title":"Your CLI is already an AI tool"},{"content":"If you\u0026rsquo;ve written more than two or three command-line tools in Go, you\u0026rsquo;ll recognise the shape of the first afternoon. I certainly do! You reach for Cobra for the command tree, Viper for config, and then you start the part nobody ever puts in the README\u0026hellip; the plumbing.\nWhere does config live? A file, an env var, an embedded default? In what order do they override each other? How does the tool tell the user there\u0026rsquo;s a newer version, and how does it actually update itself? What does logging look like, and is it the same logging the next tool will use? And how do you wire all of that into each command without every command reaching into a pile of globals?\nNone of it is hard. That\u0026rsquo;s the problem! It\u0026rsquo;s not hard, it\u0026rsquo;s just there, every single time, and every single time I\u0026rsquo;d find myself reinventing it slightly differently to the last time. Different override precedence here. A subtly different update flow there. Logging that didn\u0026rsquo;t quite match the tool I\u0026rsquo;d written three months earlier. Each new tool was a fresh re-litigation of decisions I\u0026rsquo;d already made and then promptly forgotten.\nNow, I\u0026rsquo;ve banged on about the Boy Scout rule for years (leave the codebase better than you found it), but it has an uncomfortable corollary. If you keep turning up to the same campsite and finding it in the same mess, at some point the honest thing to do is to stop tidying it and go and build a better campsite.\nFirst, just packages So I started pulling the recurring pieces out into their own packages. Nothing grand. 
A config package that did the hierarchical merge the way I always ended up doing it anyway. A version package that knew how to compare semver and spot a development build. A setup package that handled first-run bootstrap and self-updating from a release. They lived as separate repos, and if you go digging through my GitHub history you can still find the scruffy ancestors of them scattered about.\nSeparate packages was the right first move. It forced each piece to stand on its own and earn its keep on a real project before I trusted it on the next one. A package that\u0026rsquo;s only ever been used in the repo it was born in hasn\u0026rsquo;t really been tested\u0026hellip; it\u0026rsquo;s just been agreed with.\nBut separate packages come with a tax. Each one has its own release cadence, its own changelog, its own CI. Worse, they have to agree with each other at the seams, and when they\u0026rsquo;re versioned independently those seams drift. I\u0026rsquo;d bump the config package, and the setup package that depended on it would quietly need a matching bump, and the tool that used both would need telling about both. I\u0026rsquo;d traded \u0026ldquo;reinvent the wheel\u0026rdquo; for \u0026ldquo;keep a dozen wheels in sync\u0026rdquo;, and I\u0026rsquo;m really not convinced that\u0026rsquo;s a better deal.\nThen, one library Once the packages had been used enough (used in anger, on real tools, by people who weren\u0026rsquo;t me) the shape of them stopped moving. The interfaces settled. The arguments about precedence and defaults were over, because the answers had survived contact with reality.\nThat\u0026rsquo;s the point where separate packages stop being a virtue and start being friction. So I forged them into one and called it go-tool-base. 
One module, one version number, one changelog, and one set of seams that are now internal and can\u0026rsquo;t drift, because they ship together.\nThe heart of it is a dependency-injection container, a Props struct, that holds the things every command needs: the logger, the config, the embedded assets, the filesystem handle, the error handler, the tool\u0026rsquo;s own metadata. Commands are handed Props explicitly rather than reaching for globals, which means a command is just a function of its inputs and is therefore trivially testable. That one decision has quietly paid for itself on every tool I\u0026rsquo;ve built since.\nAround that container sits all the stuff I was so tired of rewriting: hierarchical config, structured logging, version checking, self-update from GitHub or GitLab releases, an interactive TUI documentation browser, AI integration, service lifecycle management. A new tool inherits the lot and gets to spend its first afternoon on the thing that\u0026rsquo;s actually novel\u0026hellip; its own logic.\nFinally, a generator A library still leaves you staring at a blank main.go. You still have to know the conventions, wire the container, lay out the directories, register the commands. All knowable, but all boilerplate. And boilerplate is exactly the enemy I set out to kill in the first place.\nSo go-tool-base ships a generator. gtb generate skeleton scaffolds a complete, working, idiomatic project: directory layout, the wired Props container, the command tree, CI, the whole lot. gtb generate command adds a new command and registers it for you. The generator also handles upkeep: when the framework\u0026rsquo;s conventions move, it can regenerate the scaffolding of an existing project without trampling all over the code you\u0026rsquo;ve written on top. (That last bit turned out to be a properly interesting problem in its own right, and a future post.)\nThe goal is blunt. Creating a CLI tool should be about the tool, not the scaffolding. 
The first afternoon should be spent on the part that\u0026rsquo;s actually worth writing.\nOne thing I was careful about There\u0026rsquo;s a nasty failure mode with \u0026ldquo;batteries-included\u0026rdquo; frameworks: the day you outgrow them, they hold you hostage. You either stay inside the framework\u0026rsquo;s worldview forever, or you face a rewrite. I\u0026rsquo;ve been burned by that before and I had no intention of inflicting it on anyone else.\nSo go-tool-base generates idiomatic, standard-library-compliant Go. There\u0026rsquo;s no magic runtime you can\u0026rsquo;t see, no clever code you couldn\u0026rsquo;t have written by hand. If you ever outgrow the framework the generated code stands on its own and you walk away with a perfectly normal Go project. A framework should be a starting point you\u0026rsquo;re glad you took, not a room you can\u0026rsquo;t get out of.\nWhere this leaves me go-tool-base exists because I was spending the first afternoon of every Go CLI tool rebuilding the same plumbing, and rebuilding it slightly wrong relative to last time. It started life as separate packages so each piece could earn its place on real projects; once they\u0026rsquo;d stopped moving I forged them into a single library so the seams couldn\u0026rsquo;t drift; and then I wrapped a generator around it so a new tool starts as a working project rather than a blank file.\nIt\u0026rsquo;s a framework for the unglamorous 80% (config, versioning, updates, logging, lifecycle) so you can spend your time on the 20% that\u0026rsquo;s actually yours.\nOver the coming posts I\u0026rsquo;ll dig into the individual pieces\u0026hellip; the generator that won\u0026rsquo;t clobber your edits, the credential handling, the self-update integrity checks, and a few Go techniques I\u0026rsquo;m rather pleased with along the way. 
Stay tuned!\n","date":"2026-03-18T00:00:00Z","image":"/introducing-go-tool-base/cover-introducing-go-tool-base.png","permalink":"/introducing-go-tool-base/","title":"go-tool-base: I got tired of reinventing the wheel"},{"content":"I like Mediawiki. It is a simple tool capable of doing a lot, and it can be very flexible and easy to customise. However, it\u0026rsquo;s not always the right solution! I had a situation where we needed to migrate away from using it for a combination of security and usability reasons. So I thought it would be good to document it.\nAfter reviewing a few things it was decided to move things over to the company\u0026rsquo;s existing O365 SharePoint as a new site. This sounded laborious at first, but actually turned out to be pretty straightforward.\nWe start with getting data out of Mediawiki. Thankfully we only wanted the most recent revision and not the full history of a page. We use PostgreSQL as a back-end so it was reasonably straightforward to figure out how to extract the data in a sensible query.\nSELECT page_id as id, page.page_title as title, pagecontent.old_text as content, page_touched as edited FROM mediawiki.page LEFT JOIN mediawiki.slots ON page.page_latest = slots.slot_revision_id LEFT JOIN mediawiki.content ON content.content_id = slots.slot_content_id LEFT JOIN mediawiki.pagecontent ON pagecontent.old_id = CAST(OVERLAY(content.content_address placing \u0026#39;\u0026#39; from 1 for 3) as integer) ORDER BY page_touched DESC; It took a little sleuthing to realise that the slots table was pivotal in extracting the latest page version. With the right join and a little mangling of the content_address field from the content table, to remove the \u0026ldquo;tt:\u0026rdquo; prefix from the value and convert it to an integer, we now have a result set of all the page names and the latest revision of each page. 
I also included the date each page was last touched, because this was a live system migration and it helped me to ensure content remained in sync while both systems were still in play.\nOnce I had the query it was a simple job of writing an \u0026ldquo;Exporter\u0026rdquo; in Go to extract the data and write it to files. I\u0026rsquo;ll chuck a snippet of code at the bottom of the post for you.\nMediawiki uses wikitext as a format so I needed to convert it to something more widely understood. Having used Pandoc successfully in the past I plumped for this, as I knew it would handle a lot of options and was simple to use to convert to the markdown_mmd format.\nI installed it via the Ubuntu apt package available on my system and wired this in as a hacky exec command into my script\u0026hellip; and voila! I had hard copies of all the Mediawiki pages on my system in both wikitext and markdown_mmd format.\nWhy markdown_mmd I hear you ask\u0026hellip; mainly because it gave me the cleanest conversion for use with the new markdown web page widget for SharePoint\u0026rsquo;s modern interface.\nNow that we have the files we can do a little munging and parsing to convert URLs into the format needed for the new location in SharePoint. This is easily done with a bit of regex pattern matching, which I won\u0026rsquo;t go into as yours will be very different from mine\u0026hellip; suffice to say looking for \u0026quot;wikilink\u0026quot; in my regex helped massively in finding all the occurrences I needed to update. 
I used sed, but you could use whatever tool you like or add it into your version of the exporter.\n\u0026#39;SysAdmin/(.+) \u0026#34;wikilink\u0026#34;\u0026#39; and with a little back-referencing to substitute the values we need to keep, it\u0026rsquo;s all good.\nNext came the import of the data into SharePoint, but that is a post for another day.\npackage data import ( \u0026#34;bytes\u0026#34; \u0026#34;fmt\u0026#34; \u0026#34;github.com/jmoiron/sqlx\u0026#34; \u0026#34;github.com/rs/zerolog/log\u0026#34; \u0026#34;io/ioutil\u0026#34; \u0026#34;os\u0026#34; \u0026#34;os/exec\u0026#34; \u0026#34;path/filepath\u0026#34; \u0026#34;time\u0026#34; \u0026#34;wiki-export/src/util\u0026#34; ) type Page struct { Id int Title string Content string Edited time.Time } type Exporter struct { Config util.ExporterConfig DB *sqlx.DB } func (l *Exporter) Export() { stmt := ` SELECT page_id as id, page.page_title as title, pagecontent.old_text as content, page_touched as edited FROM mediawiki.page LEFT JOIN mediawiki.slots ON page.page_latest = slots.slot_revision_id LEFT JOIN mediawiki.content ON content.content_id = slots.slot_content_id LEFT JOIN mediawiki.pagecontent ON pagecontent.old_id = CAST(OVERLAY(content.content_address placing \u0026#39;\u0026#39; from 1 for 3) as integer) ORDER BY page_touched DESC ;` page := Page{} rows, err := l.DB.Queryx(stmt) util.CheckErr(err) for rows.Next() { util.CheckErr(rows.StructScan(\u0026amp;page)) wikiFilename := fmt.Sprintf(\u0026#34;%s.mediawiki\u0026#34;,filepath.Base(page.Title)) mdFilename := fmt.Sprintf(\u0026#34;%s.md\u0026#34;,filepath.Base(page.Title)) path := filepath.Dir(page.Title) wikiDir := fmt.Sprintf(\u0026#34;%s/mediawiki\u0026#34;,l.Config.TargetDir) mdDir := fmt.Sprintf(\u0026#34;%s/%s\u0026#34;,l.Config.TargetDir, l.Config.TargetFormat) if path != \u0026#34;.\u0026#34; { wikiDir = fmt.Sprintf(\u0026#34;%s/mediawiki/%s\u0026#34;,l.Config.TargetDir , path) mdDir = fmt.Sprintf(\u0026#34;%s/md/%s\u0026#34;,l.Config.TargetDir , 
path) } util.CheckErr(os.MkdirAll(wikiDir, 0777)) util.CheckErr(os.MkdirAll(mdDir, 0777)) wikiTarget := fmt.Sprintf(\u0026#34;%s/%s\u0026#34;, wikiDir, wikiFilename) mdTarget := fmt.Sprintf(\u0026#34;%s/%s\u0026#34;, mdDir, mdFilename) log.Debug().Msgf(\u0026#34;%s =\u0026gt; %s -\u0026gt; %s\u0026#34;, page.Title, wikiTarget, mdTarget) c := []byte(page.Content) util.CheckErr(ioutil.WriteFile(wikiTarget, c, 0777)) cmd := exec.Command(\u0026#34;pandoc\u0026#34;, \u0026#34;-f\u0026#34;,\u0026#34;mediawiki\u0026#34;, \u0026#34;-t\u0026#34;, l.Config.TargetFormat, wikiTarget) var errorBuffer bytes.Buffer var outputBuffer bytes.Buffer cmd.Stdout = \u0026amp;outputBuffer cmd.Stderr = \u0026amp;errorBuffer err := cmd.Run() if err != nil { log.Err(err).Msgf(\u0026#34;ERROR: %s\u0026#34;, errorBuffer.String()) util.CheckErr(err) } util.CheckErr(ioutil.WriteFile(mdTarget, outputBuffer.Bytes(), 0777)) } } ","date":"2020-08-19T00:00:00Z","permalink":"/migrating-away-from-mediawiki-and-how-to-export-its-data/","title":"Migrating away from Mediawiki and how to export its data"},{"content":"Recently there has been an uptake in the use of Neo4j by the Data Scientists. This is a good thing! They want to use the right tool for the job. However, we need to run it inside our k8s cluster as a portable, readable data source that has been dynamically populated from a pile of data in a combination of PostgreSQL and MongoDB.\nThis isn\u0026rsquo;t a problem for them working locally: they install and spin up a local copy of Neo4j and can interact with it quite happily. They even realised that they can generate CSVs from PostgreSQL and MongoDB and then import them, blindingly fast, into Neo4j using the neo4j-admin tool that comes bundled. Fantastic!\nAt least until they come to want to run their Neo instance inside our k8s cluster. 
That\u0026rsquo;s where I step in and turn them aside from creating their own custom neo4j image with a bespoke entry point that loads all the data for them in some crazy threaded bash scripting!\n\u0026ldquo;No, No, No!\u0026rdquo; I tell them. \u0026ldquo;It\u0026rsquo;s far easier to just add an init container to your pod that will preload the data before Neo starts up\u0026rdquo;.\nInit containers, if you haven\u0026rsquo;t come across them before, are a special type of container that live inside a k8s pod and are set to run BEFORE your main container runs. In this case it means we can easily sequence a bash script to run the neo4j-admin import before Neo4j is even started. And here is how we did it!\nThe script The data scientists had been using Neo4j 3.5.x locally because they had a need for the graph algorithms plugin (https://github.com/neo4j-contrib/neo4j-graph-algorithms) which, at the time they were looking, didn\u0026rsquo;t support Neo4j 4.x. The plugin is now deprecated and its replacement (https://github.com/neo4j/graph-data-science) thankfully supports 3.5.x and 4.x.\nAs Neo4j 4.x introduces a lot of new features and improves performance, I recommended we switch to using that. This meant a refactor of their bash script for neo4j-admin, as there are some very subtle differences and a few caveats to work with. 
This is what they came up with\n#!/bin/bash DBNAME=\u0026#34;neo4j\u0026#34; if [ \u0026#34;$#\u0026#34; -eq 1 ]; then DBNAME=$1 fi # extract data from SQL python3 extract_data.py # remove old db for rebuild rm -rf \u0026#34;/data/databases/$DBNAME\u0026#34; neo4j-admin import \\ --database=$DBNAME \\ --delimiter=\u0026#34;|\u0026#34; \\ --nodes=Protein=${NODE_DIR}/nodes_protein_header.csv,${DATA_DIR}/nodes_proteins.csv \\ --nodes=UniProtKB=${NODE_DIR}/nodes_uniprot_header.csv,${DATA_DIR}/nodes_uniprot.csv \\ --relationships=HAS_AMINO_ACID_SEQUENCE=${EDGE_DIR}/edges_protein_sequence_header.csv,${DATA_DIR}/edges_protein_sequence.csv \\ --relationships=HAS_AMINO_ACID_SEQUENCE=${EDGE_DIR}/edges_chembl_protein_biotherapeutic_molregno_header.csv,${DATA_DIR}/edges_chembl_protein_biotherapeutic_molregno.csv \\ --skip-bad-relationships=true \\ --skip-duplicate-nodes=true The import command here is significantly shorter for example purposes, as the original is about 120 lines long. As you can see it\u0026rsquo;s pretty straightforward. They had another script, extract_data.py, that I won\u0026rsquo;t bore you with; suffice to say that it pulled out all the data they wanted from PostgreSQL and MongoDB and saved it to disk as CSV files in the relevant directories.\nGreat, it worked on their local version!\nThe Dockerfile FROM neo4j:latest ENV NEO4JLABS_PLUGINS [\u0026#34;graph-data-science\u0026#34;] RUN apt update \u0026amp;\u0026amp; apt install -y python3 WORKDIR /srv COPY src /srv/src COPY headers /srv/headers The plan is always to keep it simple. We have one image that we can run for both the init container and the main container. 
This docker file gives a vanilla neo4j instance with python and our scripts for extracting the data loaded into it\nThe k8s Manifest apiVersion: v1 kind: Pod metadata: name: neo4j spec: containers: - name: neo4j env: - name: NEO4J_AUTH value: neo4j/password image: registry.example.com/phpboyscout/rnd_graph:latest imagePullPolicy: Always volumeMounts: - mountPath: /data name: neo4j subPath: data initContainers: - name: importer args: - neo4j_import.sh command: - /bin/bash env: - name: DATA_DIR value: /import/data - name: HEADER_DIR value: /srv/headers image: registry.example.com/phpboyscout/rnd_graph:latest imagePullPolicy: Always stdin: true workingDir: /srv/src volumeMounts: - mountPath: /data name: neo4j subPath: data - mountPath: /import name: neo4j subPath: import - name: neo4j persistentVolumeClaim: claimName: neo4j Now we can pull it all together with our k8s manifest. From here you can see that we have our default neo4j container that we pass in our default authentication details to and an init container that runs our import.sh script. Both containers have access to a shared volume for the /import and /data folders.\nAnd now we get to\u0026hellip;\nTroubleshooting So right off the bat it didn\u0026rsquo;t work! No surprises there but here are a few things that caused us some issues and how we resolved them.\nDatabase offline At first glance everything seemed to work. Until we tried to connect to the neo4j database with the default UI, at which point we were presented with the error message\nDatabase \u0026#34;neo4j\u0026#34; is unavailable, its status is \u0026#34;offline.\u0026#34; This took a little sleuthing and shelling into the neo4j container to take a look at the /var/debug.log file which gives significantly more useful information about whats going on with the server. 
First we were getting stack traces that contained messages like\nComponent \u0026#39;org.neo4j.kernel.impl.transaction.log.files.TransactionLogFiles@59d6a4d1\u0026#39; was successfully initialized, but failed to start. Please see the attached cause exception \u0026#34;/data/transactions/neo4j/neostore.transaction.db.0\u0026#34; From experience this sounded like a permissions issue and lo and behold, checking the files on the filesystem showed that because the import script was run as root the database files were owned by root. We resolved this by adding:-\nchown -R neo4j:neo4j /data/ to the bottom of the import script. Next we were presented with an error that looked like\n2020-07-14 16:56:33.919+0000 WARN [o.n.k.d.Database] [neo4j] Exception occurred while starting the database. Trying to stop already started components. Mismatching store id. This one seems like it would be an obvious one to google and I did come up with a few pages that seemed to describe what was happening to me but gave some varied solutions, from starting and stopping the server and running neo4j-admin unbind in between to deleting various files. It seemed very strange because we did test this with the 3.5.17 version of Neo and it worked fine.\nThe solution we needed was to wipe the slate clean properly. The line in our script to remove the previous build of the db\n# remove old db for rebuild rm -rf \u0026#34;/data/databases/$DBNAME\u0026#34; just didn\u0026rsquo;t cut it. 
It turns out that because the 4.x version of Neo4j supports multiple databases the import command writes additional information to the system database and transactions database in the form of some identifiers for each database, BUT if you don\u0026rsquo;t do something to clear that value for the database you are building, it won\u0026rsquo;t match up when the server starts and you get a declaration of Mismatching store id\nI\u0026rsquo;m not sure if the developers are aware of this flaw, so in the meantime we have to expand our cleanup to:\n# clean up for fresh import rm -rf /data/databases/* rm -rf /data/transactions/* removing the neo4j, system and store_lock databases and transaction logs from the data store. This solved the problem: the server was able to start and we could connect to the neo4j database successfully.\nIt\u0026rsquo;s not an ideal solution; I can foresee definite situations we will have to work around when we get to a point where multiple databases may be needed and are built separately and independently from each other, but it will suffice for now.\nMalloc(): Error message goes here Once it was up and running we noticed that we were getting lots of restarts on the main neo4j container. A quick look at the stdout log showed each restart ending with something that looked like\nmalloc(): corrupted top size Instantly this looks like an issue with memory sizing inside the container for the JVM. Thankfully the team at Neo4j have accounted for this and give you a nice tool in the form of\nneo4j-admin memrec which interrogates the databases and gives some sensible values you can set in the output which in our case looked like\n# Memory settings recommendation from neo4j-admin memrec: # # Assuming the system is dedicated to running Neo4j and has 376.6GiB of memory, # we recommend a heap size of around 31g, and a page cache of around 331500m, # and that about 22400m is left for the operating system, and the native memory # needed by Lucene and Netty. 
# # Tip: If the indexing storage use is high, e.g. there are many indexes or most # data indexed, then it might advantageous to leave more memory for the # operating system. # # Tip: Depending on the workload type you may want to increase the amount # of off-heap memory available for storing transaction state. # For instance, in case of large write-intensive transactions # increasing it can lower GC overhead and thus improve performance. # On the other hand, if vast majority of transactions are small or read-only # then you can decrease it and increase page cache instead. # # Tip: The more concurrent transactions your workload has and the more updates # they do, the more heap memory you will need. However, don\u0026#39;t allocate more # than 31g of heap, since this will disable pointer compression, also known as # \u0026#34;compressed oops\u0026#34;, in the JVM and make less effective use of the heap. # # Tip: Setting the initial and the max heap size to the same value means the # JVM will never need to change the heap size. Changing the heap size otherwise # involves a full GC, which is desirable to avoid. # # Based on the above, the following memory settings are recommended: dbms.memory.heap.initial_size=31g dbms.memory.heap.max_size=31g dbms.memory.pagecache.size=331500m # # It is also recommended turning out-of-memory errors into full crashes, # instead of allowing a partially crashed database to continue running: #dbms.jvm.additional=-XX:+ExitOnOutOfMemoryError # # The numbers below have been derived based on your current databases located at: \u0026#39;/var/lib/neo4j/data/databases\u0026#39;. # They can be used as an input into more detailed memory analysis. # Total size of lucene indexes in all databases: 0k # Total size of data and native indexes in all databases: 17300m So how to get these values into the container\u0026hellip; Thankfully this is handled for you in the form of Environment Variables you can pass into the docker image. 
A bit of a google and I found this little snippet which is a goldmine for telling us how to translate settings into environment variables.\n# Env variable naming convention: # - prefix NEO4J_ # - double underscore char \u0026#39;__\u0026#39; instead of single underscore \u0026#39;_\u0026#39; char in the setting name # - underscore char \u0026#39;_\u0026#39; instead of dot \u0026#39;.\u0026#39; char in the setting name # Example: # NEO4J_dbms_tx__log_rotation_retention__policy env variable to set # dbms.tx_log.rotation.retention_policy setting As for getting the variables into the container, you could do this from the pod and inject it in. In this case, because the data we are going to be using is reasonably stable and tested, we decided to stick them into the Dockerfile with the ENV directive.\nENV NEO4J_dbms_memory_heap_initial__size 31g ENV NEO4J_dbms_memory_heap_max__size 31g ENV NEO4J_dbms_memory_pagecache_size 331500m And so far we haven\u0026rsquo;t had a restart yet!\n","date":"2020-07-15T00:00:00Z","image":"/pre-populating-neo4j-using-kubernetes-init-containers-and-neo4j-admin-import/maxresdefault.jpg","permalink":"/pre-populating-neo4j-using-kubernetes-init-containers-and-neo4j-admin-import/","title":"Pre-populating Neo4J using Kubernetes Init Containers and neo4j-admin import"},{"content":"I\u0026rsquo;m a Dungeon Master! I don\u0026rsquo;t mean that in the S\u0026amp;M sense! As in the game Dungeons \u0026amp; Dragons (https://dnd.wizards.com), where I run a weekly game as well as take part in a couple of campaigns as a player. It\u0026rsquo;s a lot of fun and something I would definitely recommend you have a go at if you are so inclined.\nThere is a vast amount of tooling \u0026amp; tech out there that allows you to play remotely such as Virtual Table Tops, Character builders, online resources, etc. 
One such tool that gets used quite often is a chat service called Discord (https://discord.com) It\u0026rsquo;s really useful and allows you to easily be part of and manage communities of people\u0026hellip;. Think IRC \u0026amp; Slack, but more up to date than IRC and less \u0026ldquo;workish\u0026rdquo; than Slack.\nAs part of my online games I like being able to have ambient music to match the surroundings the players are traveling through, as well as some active elements thrown in for good measure. This is possible in a few different ways using discord but the way I want to set it up can be somewhat frustrating to set up. Let me explain:\nI have taken a shine to two tools in particular\u0026hellip; Syrinscape (https://syrinscape.com) and Table Top Audio (https://tabletopaudio.com). The former being a windows app with an nice interactive mixing UI that allows you to combine and generate unique sounds, the latter being a lovely web service that has some fantastic loop-able ambient background tracks all 100% free.\nI am wanting to be able to pipe the audio from these two services into my Discord server so that I can make use of the fantastic audio they offer. This is the journey of how I managed to get this working, partly as a reminder for me if I ever need to do this again and also to help others that may be looking to do the same.\nMy Setup I\u0026rsquo;ve been a big fan of Ubuntu for a number of years, but since 20.04 I\u0026rsquo;ve found that the shine I\u0026rsquo;ve had for it has waned significantly. I wont go into the why and wherefore of it but I\u0026rsquo;m now running the excellent Pop_OS! from System76 (https://pop.system76.com) its an Ubuntu variant but with the bits I dislike removed. 
So assume that anything I\u0026rsquo;m doing is compatible with Ubuntu 20.04.\nThe Requirements The ideal solution should see me being able to have a single instance of discord running that allows me to still use my mic to be able to talk, and to have my selected background playing with the ability to control the volumes of both the mic and the background independently.\nFinding A Solution A lot of googling led me to realise that there isn\u0026rsquo;t a perfect solution to fit my brief. The hardest part being not actually knowing what to google and a lot of the terminology being somewhat foreign to me as I\u0026rsquo;m not much of an audio engineer. However I finally stumbled upon a blog post by Emma Anderson dated June 2016 and thankfully it gives me a lot of the heavy lifting that I needed along with some explanation of what I\u0026rsquo;m trying to achieve, though I\u0026rsquo;m hopefully going to be more verbose here in what this all means and how it works.\nPulseAudio The first thing we need to do is make sure the packages for pulseaudio and pavucontrol are installed. These will allow us to manipulate the way we capture sound and redirect it to the appropriate inputs and outputs.\napt install pulseaudio pavucontrol Virtual Input \u0026amp; Virtual Mic What we are going to try to achieve, is to create two new elements inside of Pulseaudio;\na Virtual input that we can channel the applications creating our background sounds which will allow us to control the volume independently. a Virtual Microphone that we can channel our both our normal microphone and the new Virtual input into. By creating these elements we can then use the pavucontrol tool to select what needs to be redirected where. 
So let\u0026rsquo;s get started.\npactl load-module module-null-sink sink_name=VirtualInput pacmd update-sink-proplist VirtualInput device.description=VirtualInput pacmd update-source-proplist VirtualInput.monitor device.description=VirtualInput.monitor Here we have three commands. The first will create our new Virtual Input as what is referred to as a \u0026ldquo;null sink\u0026rdquo;. This on its own is not really very useful for us as we also need what is referred to as a \u0026ldquo;source\u0026rdquo;; thankfully when we run this command it also creates a new \u0026ldquo;source\u0026rdquo; for us.\nOn its own that should be more than enough, but running the 2nd \u0026amp; 3rd commands makes our lives a lot easier because they apply some very useful labels to both the newly created sink and source. In this case VirtualInput for the sink and VirtualInput.monitor for the source. Having these in place makes it a lot simpler to configure things with pavucontrol.\nNext we need to create our Virtual Mic using some very familiar looking commands.\nVM=$(pactl load-module module-null-sink sink_name=VirtualMic) pacmd update-sink-proplist VirtualMic device.description=VirtualMic pacmd update-source-proplist VirtualMic.monitor device.description=VirtualMic.monitor Again, we have now created a new pair of sink and source with some nice easy to recognise labels that we will use when we start working with pavucontrol.\nThe next piece to our puzzle is creating the components that will let us define a connection from the VirtualInput and our physical microphone to the newly created VirtualMic. We do this with two identical commands;\npactl load-module module-loopback sink=VirtualMic pactl load-module module-loopback sink=VirtualMic We now have most of the elements that we need to configure everything to work.\nListening to my own Ambience Before we can start wiring it all together we need to ensure we can also listen back to our own ambience. 
This involves us creating one more \u0026ldquo;loopback\u0026rdquo; module that points to the speakers we are wanting to listen to. Let\u0026rsquo;s find out what our options are by running;\npacmd list-sinks | awk \u0026#39;/index:/ {print $0}; /name:/ {print $0}; /device.description/ {print $0}\u0026#39; This lists all of the available \u0026ldquo;sinks\u0026rdquo; that we can use. On my daily driver laptop I get;\n* index: 1 name: \u0026lt;alsa_output.pci-0000_00_1f.3.analog-stereo\u0026gt; device.description = \u0026#34;Built-in Audio Analogue Stereo\u0026#34; This tells us the \u0026ldquo;index\u0026rdquo; for the device, its name and also some kind of description. The important bit for us here is the name as we will need that to create our new \u0026ldquo;loopback\u0026rdquo; with the command;\npactl load-module module-loopback sink=alsa_output.pci-0000_00_1f.3.analog-stereo This creates the last piece for our puzzle!\nConnecting it all together I\u0026rsquo;m now going to assume you have logged yourself into the Discord client and fired up your copy of Syrinscape\u0026hellip; but you could just as easily swap these out for something else of your choice.\nNow we can start pavucontrol either from the command line or you can look for it in your applications menu. Once it loads you will hopefully be presented with something that looks like;\nFor this next step I am specifically starting on the \u0026ldquo;Recording\u0026rdquo; tab of pavucontrol; this is to allow us to set up what is going to be captured. I have updated the drop-down at the bottom left to show \u0026ldquo;All Streams\u0026rdquo; as this will make it quicker to configure… Starting at the top we have two entries for;\nLoopback to VirtualMic from: These are the result of the first two \u0026ldquo;loopback\u0026rdquo; modules we created with the pactl command we ran previously. 
They are going to allow us to capture the audio streams from our physical microphone, mine here is the TONOR TC-777 and our newly created VirtualMic.\nfollowed by a single entry for;\nLoopback to Built-in Audio Analogue Stereo from: which is the last \u0026ldquo;loopback\u0026rdquo; module that we created to let us hear our own ambience. Having this set to our VirtualInput means that anything we pipe into our VirtualInput will also come out of our speakers.\nand finally;\nWEBRTC VoiceEngine: Once you connect to a voice channel in discord this will appear and it allows us to specify which of our devices it should be reading the audio feed from. For our purposes we have this set to our VirtualMic so that we can have our mixed audio feeds.\nNow that recording is configured we can sort out our playback.\nHere we can see the \u0026ldquo;Playback\u0026rdquo; tab of pavucontrol, again set to show \u0026ldquo;All Streams\u0026rdquo;. This time I\u0026rsquo;m going to run through the elements here starting from the bottom of the list and working my way up\u0026hellip;\nWEBRTC VoiceEngine: This again is our connection to a Discord voice channel, as you can see I have this set to play back all of its output via Built-in Audio Analogue Stereo which is how my Operating system has labelled my physical speakers.\nSyrinscape.exe: This is the Syrinscape application, which I run through PlayOnLinux (https://www.playonlinux.com) and will use to generate all of my lovely ambient sounds. This is set to play all of its \u0026ldquo;audio stream\u0026rdquo; on our VirtualInput.\nThe next two items in our list, providing you have configured the Recording tab first, should look as in the image. 
Changing the \u0026ldquo;Loopback to VirtualMic\u0026rdquo; entries on the Recording tab will change the labels of these two entries.\nLoopback of VirtualInput.monitor on: we have two of these entries, and this is where we can tell them to pipe all of the audio we are now capturing on our VirtualInput. In this case we want it to go to two places: our VirtualMic, so that it can be sent to our Discord audio channel, and also our Built-in Audio Analogue Stereo speakers.\nLoopback of Built-in Audio Analogue Stereo on: is where we now direct the input from our physical microphone and feed that straight into our VirtualMic.\nThe other entries in the list here are for firefox and the system itself and are not relevant to what we are trying to achieve.\nWinner Winner Chicken Dinner That\u0026rsquo;s effectively all we need to do\u0026hellip; From here on in anything you play via the Syrinscape app will be merged with your microphone input and passed to Discord. You can then use the volume sliders in pavucontrol to adjust the levels of all the inputs to suit your own personal preference.\nThough I will make a few small suggestions about how to configure your discord settings. 
You shouldn\u0026rsquo;t need to make any adjustments to the input and output devices, which should now be set to Default as your \u0026ldquo;Input Device\u0026rdquo;. If you change this it will override the changes we have made and you will need to go back to the Recording tab of pavucontrol and switch WEBRTC VoiceEngine back to VirtualMic, but\u0026hellip;\nI would recommend disabling automatic input sensitivity and lowering the sensitivity slider all the way down to -100dB\u0026hellip; this is to allow for the potential low and subtle tones and ambient elements you may want to play\u0026hellip; be warned though, it makes it very very easy for a low quality microphone (such as the Built-in Audio Analogue Stereo microphone found on my laptop) to pick up other noises such as your system\u0026rsquo;s fans, mouse clicks and typing. A simple way to combat this is to get a reasonable quality external cardioid condenser microphone which eliminates a lot of this unwanted background.\nOne last thing That should be it for now\u0026hellip; I\u0026rsquo;ll leave you with one final thing. This is a simple little bash script I threw together that I can run in a terminal to create all the components and, if I want, it will then clean them all up and remove them. 
If you really want you could set it up as a permanent implementation, but I\u0026rsquo;ll let you google for that solution!\n#!/bin/bash LB1=\u0026#34;\u0026#34; listenback() { echo \u0026#34;\u0026#34; echo \u0026#34;Listing all possible output devices\u0026#34; pacmd list-sinks | awk \u0026#39;/index:/ {print $0}; /name:/ {print $0}; /device\\.description/ {print $0}\u0026#39; echo \u0026#34;\u0026#34; echo \u0026#34;Please enter the name of the output device to create a loopback for (leave blank to skip): \u0026#34; read S1 if [ \u0026#34;$S1\u0026#34; != \u0026#34;\u0026#34; ]; then echo \u0026#34; * Creating Loopback for \u0026#39;$S1\u0026#39;\u0026#34; LB1=$(pactl load-module module-loopback sink=\u0026#34;$S1\u0026#34;) fi } cleanup() { while true; do read -p \u0026#34;Finished? do you want to clean up and remove modules [Yn]: \u0026#34; yn case $yn in [Yy]* ) return 0;; [Nn]* ) return 1;; * ) echo \u0026#34;Please answer yes or no.\u0026#34;;; esac done } listenback echo \u0026#34; * Creating VirtualInput\u0026#34; VI=$(pactl load-module module-null-sink sink_name=VirtualInput) pacmd update-sink-proplist VirtualInput device.description=VirtualInput pacmd update-source-proplist VirtualInput.monitor device.description=VirtualInput.monitor echo \u0026#34; * Creating VirtualMic\u0026#34; VM=$(pactl load-module module-null-sink sink_name=VirtualMic) pacmd update-sink-proplist VirtualMic device.description=VirtualMic pacmd update-source-proplist VirtualMic.monitor device.description=VirtualMic.monitor echo \u0026#34; * Creating loopbacks for VirtualMic\u0026#34; VML1=$(pactl load-module module-loopback sink=VirtualMic) VML2=$(pactl load-module module-loopback sink=VirtualMic) echo \u0026#34;All modules have been loaded and configured! 
Run pavucontrol to configure your devices.\u0026#34; if cleanup; then pactl unload-module \u0026#34;$VML2\u0026#34; pactl unload-module \u0026#34;$VML1\u0026#34; pactl unload-module \u0026#34;$VM\u0026#34; pactl unload-module \u0026#34;$VI\u0026#34; if [ \u0026#34;$LB1\u0026#34; != \u0026#34;\u0026#34; ]; then pactl unload-module \u0026#34;$LB1\u0026#34; fi echo \u0026#34;All modules have been unloaded\u0026#34; else if [ \u0026#34;$LB1\u0026#34; != \u0026#34;\u0026#34; ] then echo \u0026#34;Modules $LB1, $VI, $VM, $VML1 \u0026amp; $VML2 remain loaded\u0026#34; else echo \u0026#34;Modules $VI, $VM, $VML1, \u0026amp; $VML2 remain loaded\u0026#34; fi fi ","date":"2020-06-30T00:00:00Z","image":"/adding-ambient-sounds-to-your-discord-server-on-linux/tfOnZwZBwA-e1593539639171.jpg","permalink":"/adding-ambient-sounds-to-your-discord-server-on-linux/","title":"Adding Ambient Sounds to your Discord Server On LInux"},{"content":"Encryption is king nowadays with everyone having mobile devices. We have a significant number of people on laptops that travel around and also workstations that live in open plan offices. This means we encrypt all of our disks\u0026hellip; just in case. 99% of the time it\u0026rsquo;s super simple to do as most OS installers give you the option to do it, and some now even enforce it as a default option. This post however is about adding an additional disk to the system and making it automatically mount on system startup.\nSo let me set the scene, we have a data-scientist that\u0026rsquo;s running out of disk space for a task they are running on their Ubuntu 18.04 Workstation. 
At some point in the past the workstation had its HDD upgraded to a shiny new SSD, and the old 4Tb spinning disk, which they want to use for this very specific task, was left in the chassis.\nNow this workstation has been through a couple of data-scientists over the last 12 months and unfortunately the LUKS password that had been set up for the old spinning disk has gone walkabouts\u0026hellip; so the plan is as follows\nflatten the old disk and set up a new partition using the whole disk generate a new secure encryption key set up LUKS encryption on the new partition use Ext4 as a filesystem enable auto decryption of the disk add the new partition to the fstab to mount on system startup N.B Assume that we are running everything as the root user\nFlatten the disk As we can\u0026rsquo;t recover anything we are going to flatten the disk using parted (apt install parted to install) to allow us to create a partition greater than 2Tb, but first we are going to identify the disk we are working with\u0026hellip; I tend to favour using either fdisk -l or as a more concise option lsblk -p which gives us an easy to interpret overview something like:\n/dev/sda 8:0 0 1.8T 0 disk ├─/dev/sda1 8:1 0 512M 0 part /boot/efi ├─/dev/sda2 8:2 0 732M 0 part /boot └─/dev/sda3 8:3 0 1.8T 0 part └─/dev/mapper/sda3_crypt 253:0 0 1.8T 0 crypt ├─/dev/mapper/ubuntu--vg-root 253:1 0 1.8T 0 lvm / └─/dev/mapper/ubuntu--vg-swap_1 253:2 0 976M 0 lvm [SWAP] /dev/sdb 8:16 0 3.7T 0 disk I can tell from this that we are looking at using the disk that is currently at /dev/sdb and it\u0026rsquo;s showing as being 3.7Tb in size.\nGreat\u0026hellip; now to set up our new partition using the command parted /dev/sdb which gives us an interactive shell to work with (you can see the prompts in the output below are prefixed with (parted))\nGNU Parted 3.2 Using /dev/sdb Welcome to GNU Parted! Type \u0026#39;help\u0026#39; to view a list of commands. 
(parted) mklabel gpt The command mklabel gpt will wipe the partition table for /dev/sdb and give us a clean slate to work from\n(parted) unit TB We now set parted to think in Terabytes as the default reference size using the command above.\n(parted) mkpart primary 0.00TB 3.70TB Now we get to create the actual partition. You can see from the command above that we are using the command mkpart and telling it to create a primary partition type.\n(parted) print Model: ATA WDC WD4005FZBX-0 (scsi) Disk /dev/sdb: 4.00TB Sector size (logical/physical): 512B/4096B Partition Table: gpt Disk Flags: Number Start End Size File system Name Flags 1 0.00TB 4.00TB 4.00TB primary We can check everything went smoothly using the print command which gives us confirmation that a new primary partition is present. We can now leave parted with a simple:\n(parted) quit And we can now use fdisk -l or lsblk -p to see that we now have a partition waiting for us at /dev/sdb1.\n/dev/sda 8:0 0 1.8T 0 disk ├─/dev/sda1 8:1 0 512M 0 part /boot/efi ├─/dev/sda2 8:2 0 732M 0 part /boot └─/dev/sda3 8:3 0 1.8T 0 part └─/dev/mapper/sda3_crypt 253:0 0 1.8T 0 crypt ├─/dev/mapper/ubuntu--vg-root 253:1 0 1.8T 0 lvm / └─/dev/mapper/ubuntu--vg-swap_1 253:2 0 976M 0 lvm [SWAP] /dev/sdb 8:16 0 3.7T 0 disk └─/dev/sdb1 8:17 0 3.7T 0 part Generating an encryption key Our disk is now ready for use, but not yet encrypted, so our next step is to create a key that can be used when we encrypt the disk. As we are going to be mounting it automatically we want to use a keyfile to store the key. You can of course create a key by mashing the keys on the keyboard, but I tend to prefer letting something else do the hard part for me.\nFirst we create somewhere to store the key\u0026hellip; I opted for,\nmkdir -p /etc/crypt/keys But feel free to put it wherever you want just as long as it\u0026rsquo;s only accessible by the root user. 
Next we generate the keyfile using the command:\ndd bs=512 count=4 if=/dev/urandom of=/etc/crypt/keys/sdb1 iflag=fullblock Here I am using /dev/urandom as my randomness generator, but you could use any valid generator of your choice. With this set of parameters dd will read the stream of \u0026ldquo;randomness\u0026rdquo; and write 2048 bytes to our keyfile at /etc/crypt/keys/sdb1. If you want to be a little more complex about the size and shape of your key then have a look at https://man7.org/linux/man-pages/man1/dd.1.html\nEncrypting the Disk Hopefully cryptsetup will already be installed because you encrypted your root disk at installation, but if not you can run apt install cryptsetup to get going.\nThe command to do the encryption is actually very simple.\ncryptsetup luksFormat /dev/sdb1 /etc/crypt/keys/sdb1 You can see that using the cryptsetup tool we are asking it to execute the command luksFormat, but while it says format in the command this is a little misleading as it doesn\u0026rsquo;t actually format the disk but just rewrites a portion of bytes at the beginning of the partition to enable encryption. We then tell it the partition we want encrypting, here it\u0026rsquo;s /dev/sdb1, and finally we pass in the keyfile we just generated and saved at /etc/crypt/keys/sdb1. If you omit the keyfile it will still encrypt the disk but will prompt you to enter the key manually.\nAs soon as you press enter you will be warned of the danger of what you are doing\u0026hellip; so double check you are encrypting the right partition and follow the instructions that should look something like:\nWARNING! ======== This will overwrite data on /dev/sdb1 irrevocably. Are you sure? (Type uppercase yes): YES Command successful. And that\u0026rsquo;s it\u0026hellip; the disk is encrypted and ready to use. There are a few ways you can now work with the disk. 
The quickest and easiest is to just decrypt the disk manually using cryptsetup to open the disk.\ncryptsetup open /dev/sdb1 sdb1_crypt -d /etc/crypt/keys/sdb1 Here we open /dev/sdb1 and give it a new name of sdb1_crypt and we unlock it using the -d argument to tell it the keyfile we generated before.\nThat is the disk decrypted and ready to roll\u0026hellip; you can now use fdisk -l or lsblk -p to confirm that it is now available at /dev/mapper/sdb1_crypt.\n/dev/sda 8:0 0 1.8T 0 disk ├─/dev/sda1 8:1 0 512M 0 part /boot/efi ├─/dev/sda2 8:2 0 732M 0 part /boot └─/dev/sda3 8:3 0 1.8T 0 part └─/dev/mapper/sda3_crypt 253:0 0 1.8T 0 crypt ├─/dev/mapper/ubuntu--vg-root 253:1 0 1.8T 0 lvm / └─/dev/mapper/ubuntu--vg-swap_1 253:2 0 976M 0 lvm [SWAP] /dev/sdb 8:16 0 3.7T 0 disk └─/dev/sdb1 8:17 0 3.7T 0 part └─/dev/mapper/sdb1_crypt 253:3 0 3.7T 0 crypt /mnt/4tb-1 This tells us that the newly decrypted disk is now available at /dev/mapper/sdb1_crypt and is a volume of 3.7Tb\u0026hellip; Exactly what we were hoping for!\nAll finished with your encrypted disk\u0026hellip; you can just as easily close it again using:\ncryptsetup close sdb1_crypt Setting up the Filesystem Ok, we have an encrypted partition, we can decrypt it but we can\u0026rsquo;t mount it yet as we don\u0026rsquo;t have a file system to work with. 
Let\u0026rsquo;s take care of that real quick by opening up the partition again.\ncryptsetup open /dev/sdb1 sdb1_crypt -d /etc/crypt/keys/sdb1 And now that its available we are going to set up an ext4 filesystem using the command mkfs.ext4 /dev/mapper/sdb1_crypt which, all going according to plan, should look something like:\nmke2fs 1.44.1 (24-Mar-2018) Creating filesystem with 976753664 4k blocks and 244195328 inodes Filesystem UUID: d797be67-c53e-49d3-897e-c624b21a22d3 Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968, 102400000, 214990848, 512000000, 550731776, 644972544 Allocating group tables: done Writing inode tables: done Creating journal (262144 blocks): done Writing superblocks and filesystem accounting information: done we are now good to go\u0026hellip; lets try mounting the filesystem with\nmount -t ext4 /dev/mapper/sdb1_crypt /mnt I\u0026rsquo;m just mounting straight to /mnt but obviously this can be any folder you want. If the command worked we can easily confirm it with a quick df -h:\nFilesystem 1K-blocks Used Available Use% Mounted on udev 32846708 0 32846708 0% /dev tmpfs 6578140 2308 6575832 1% /run /dev/mapper/ubuntu--vg-root 1919562064 993193376 828790500 55% / tmpfs 32890688 200 32890488 1% /dev/shm tmpfs 5120 4 5116 1% /run/lock tmpfs 32890688 0 32890688 0% /sys/fs/cgroup /dev/sda2 721392 276068 392860 42% /boot /dev/sda1 523248 6232 517016 2% /boot/efi /dev/mapper/sdb1_crypt 3844637680 0 3844637680 1% /mnt Excellent\u0026hellip; you can now start working with your new partition\u0026hellip; however lets un-mount and close the drive quickly with a\numount /mnt \u0026amp;\u0026amp; cryptsetup close sdb1_crypt And then we can move onto\u0026hellip;\nAutomatic Decryption This is a lot simpler that you may realise\u0026hellip; all we need to do is add a new line to the file /etc/crypttab! 
But first we need one last piece of information we don\u0026rsquo;t yet have, but we can easily get with the command\nsudo cryptsetup luksDump /dev/sdb1 | grep \u0026#34;UUID\u0026#34; This uses luksDump to get information about the encrypted partition and then uses grep to pick out the UUID property, which we will need to identify the partition in the next step.\nNow in your editor of choice add the following line to /etc/crypttab, replacing the spoof UUID here with the one we just found.\nsdb1_crypt UUID=11111111-2222-3333-4444-555555555555 /etc/crypt/keys/sdb1 luks Here we give the decrypted volume a unique name, under which it will be made available at the appropriate /dev/mapper/* location. We also specify the UUID to identify the partition to decrypt\u0026hellip; we could use the path /dev/sdb1 but using the UUID is more explicit and prevents any confusion if another partition happens to present itself as /dev/sdb1 at some point in the future. Third we have the path to our newly generated keyfile and finally we have the encryption mode that we are using, which here is luks. For more info on crypttab have a look at https://www.freedesktop.org/software/systemd/man/crypttab.html\nWe can now test that auto decryption is working using:\ncryptdisks_start sdb1_crypt which if successful should have an output like:\n* Starting crypto disk... * sdb1_crypt: INSECURE MODE FOR /etc/crypt/keys/sdb1, see /usr/share/doc/cryptsetup/README.Debian. * sdb1_crypt (starting).. * sdb1_crypt (started)... 
All that\u0026rsquo;s left to do now is set up\nAuto-mount the filesystem Hopefully we now are on really familiar ground\u0026hellip; we can now treat /dev/mapper/sdb1_crypt as a bog standard ext4 partition that can be mounted via the /etc/fstab by adding the line:\n/dev/mapper/sdb1_crypt /mnt ext4 defaults 0 2 As you can see it\u0026rsquo;s pretty ordinary, exactly as you would expect, obviously swapping out /mnt with the location of your choice to mount the filesystem. If you are not wholly familiar with fstab then it\u0026rsquo;s definitely worth having a look at https://help.ubuntu.com/community/Fstab as it gives a good overview for those who are new to it\u0026hellip;\nFinally we can check that it all works with:\nmount -a And at this point I am pretty sure I can hear the fat lady singing\u0026hellip;\n","date":"2020-06-29T00:00:00Z","image":"/encrypting-additional-drives-with-luks-on-linux/encryption-encoding-hashing.jpg","permalink":"/encrypting-additional-drives-with-luks-on-linux/","title":"Encrypting additional drives with LUKS on Linux"},{"content":"We have a mix of different setups that the Software Engineers and Data Scientists use to get their work done. There are some using just Linux on laptops, some on MacBooks and some on the various versions of Windows.\nFor those not using Linux as their primary OS we have a bunch of Desktops that run Ubuntu 18.04+ for them to connect to. SSH can do quite a lot but a few of the team work remotely and in house we prefer RDP for that kind of thing rather than VNC.\nWe have had some issues with connections in the past so this post exists to remind me how next time I need to set it up. First we need to install the xRDP server package.\nsudo apt install xrdp Next we need to ensure that we have the right ports open on the workstation. 
If like me you also use UFW to manage your firewall rules then open port 3389 using\u0026hellip;\nsudo ufw allow 3389 The one issue left is that you will get an annoying pop-up when you log in about a colour management profile needing to be set up and asking you to provide your password. Even then you may still get some annoying crash pop-ups.\nI found a really good solution to this at http://c-nergy.be/blog/?p=12043 which I\u0026rsquo;ve cribbed and paraphrased below\nCreate the file /etc/polkit-1/localauthority/50-local.d/45-allow-colord.pkla (using your editor of choice and sudo) and add the following contents\n[Allow Colord all Users] Identity=unix-user:* Action=org.freedesktop.color-manager.create-device;org.freedesktop.color-manager.create-profile;org.freedesktop.color-manager.delete-device;org.freedesktop.color-manager.delete-profile;org.freedesktop.color-manager.modify-device;org.freedesktop.color-manager.modify-profile ResultAny=no ResultInactive=no ResultActive=yes We now need to clear any crash dumps from the workstation\nsudo rm /var/crash/* You should then be good to connect using whatever RDP client you prefer\u0026hellip; I like Remmina myself but each to their own.\n","date":"2019-05-20T00:00:00Z","permalink":"/connecting-to-ubuntu-18-04-using-rdp/","title":"Connecting to Ubuntu 18.04+ using RDP"},{"content":"Recruiting people is effing hard!\nThat\u0026rsquo;s all\u0026hellip; I\u0026rsquo;ll get back to reading through CVs now and let you get on with your day!\nMedicines Discovery Catapult is, at the time of writing, recruiting Software Engineers, and as \u0026ldquo;Head of\u0026rdquo; it falls to me to start filtering through the CVs that land in my inbox. But what a mess! 
I get all sorts, from 1-page masterpieces that look amazing and all glossy, but tell me nothing about the individual to 10 pages of war and peace that have so much in it spanning 20 years of experience that 70% is actually completely irrelevant to the job they have applied for.\nSo now I get to say that this is what the perfect portrait of a CV should look like\u0026hellip; I\u0026rsquo;m sorry to disappoint but I don\u0026rsquo;t think there is such a thing as a \u0026ldquo;perfect\u0026rdquo; CV. It\u0026rsquo;s far too subjective and open to interpretation. Instead, I\u0026rsquo;m gonna rip apart my own CV and explain why I chose to write it like I have and justify reasons why I would expect to see similar things on the CVs I am forced to read.\nIf you want the \u0026ldquo;TL;DR\u0026rdquo; version you can find a copy of my CV at https://docs.google.com/document/d/1MM_6nXIVU_wbrvkhdGL5xaJEwc46L7JblUqUot_g1ec but if you have gotten this far you may as well stick around and read the rest. I promise it won\u0026rsquo;t take long.\nThe Basics OK! let\u0026rsquo;s start with some general observations I follow with regard to my own CV. Then we can get into the nitty-gritty of how bad mine is!\nRecruiters Love them or loathe them they are a part of the recruitment ecosystem and they will help you get hired. When you send them a copy of your CV bear in mind that they will most likely try to squeeze it into a format that will include their own branding and a cover sheet with some pertinent details on it. They may also then start redacting information off your CV in an effort to anonymise you. It is also possible that they will use the same CV for multiple roles meaning you will need to insist that they send your tailored CV if you are really bothered about winning a specific position.\nMake sure to discuss this with your recruiter before having them put you forward for a role you are keen on.\nKeeping Up to Date Design your CV so that it is easy to tweak and update! 
Keeping your CV up to date is essential, and I don\u0026rsquo;t mean just adding your last role when you finally get fed up with your current employer abusing your good nature and rage quit!\nKeeping your CV in top form means that you need to review the entire thing whenever you make an update. As your career progresses you will find that your perspective will change on what previous roles consisted of and how they would impact/reflect on the position you now desire.\nYou will also find that an up-to-date CV is easier to tailor to any specific job you might be applying for\nPrettiness I know his name is Riccardo, but I\u0026rsquo;m struggling to focus on anything else\nHaving a pretty CV is great! and if creativity is relevant to the vacancy you are hoping to fill then go for it\u0026hellip; make it gorgeous. It is however very subjective so make sure that you understand what your potential employer is looking for. Sometimes less is more, especially considering they may be scanning through hundreds of CVs. If they have to spend half an hour looking for something specific, then it\u0026rsquo;s gotten lost in all that creativity. If you want to see some gorgeous-looking CVs take a look at https://weare.guru/creative-cvs/ all of them are creative and beautiful\u0026hellip;\nBut\u0026hellip; they could take 20 minutes, or more, for someone to read and extract the right information. Which will waste a lot of time for your potential new boss. Plus we are talking about technical CVs and I am not the most creative of individuals, so keeping it clean and well formatted with consistent fonts usage and sizing\nLength I\u0026rsquo;ve heard a lot of people say different things about how long your CV should be. 1 side of A4\u0026hellip; 2 sides of A4 but only if it\u0026rsquo;s printed double-sided, even \u0026ldquo;length doesn\u0026rsquo;t matter\u0026rdquo; because the more information you put in the better.\nI would recommend taking a middle-of-the-road approach. 
Keeping things concise is paramount\u0026hellip; but if you need 3 or 4 pages then that\u0026rsquo;s OK\u0026hellip; as long as you make the content captivating and interesting for the reader, that is what matters\nPDF This one feels like it should be a \u0026ldquo;no-brainer\u0026rdquo;\u0026hellip; Make sure to submit your CV as a PDF!\nThere are two very good reasons behind this. First, it ensures that the reader will view it exactly as you intended. If you send it over as a Word, Google, OpenOffice or other such document, then you are not guaranteed the reader will be using the exact same tools. I for one don\u0026rsquo;t use Word and an awful lot of presentation is lost because someone used a fancy Word feature or has a font that I can\u0026rsquo;t get hold of.\nSecond, while it is not impossible, it discourages recruiters from tampering with it and spoiling all your hard work. I have known only a few recruiters in my time that are willing to learn (or pay for) a good PDF editing suite. It is possible for them to alter things, but they tend to have to jump through hoops.\nBeginnings And so we have the start of my CV. Looks pretty boring, doesn\u0026rsquo;t it? Black text on a white background. Nothing fancy!\nWe have some basic contact details that can be used to contact me and a very short profile statement. That is all I feel an employer needs to see before we get into the next section of my CV. On the surface, it doesn\u0026rsquo;t actually say much but let\u0026rsquo;s scratch a little deeper.\nFont I have specifically chosen a font that I think looks clean and professional. With a nice easy typeface, it becomes easy for an employer to scan the CV. 
I keep my CV in Google Docs so I went with Raleway, which I think is nice, clean, professional and easy to read.\nI\u0026rsquo;ve also chosen consistent font sizes and spacing;\n16 for the main title 14 for headings 12 for subheadings 11 for everything else Limiting Liability Middle-aged, liberal, heterosexual white male programmer Swipe right to Hire Me!!! Please!\nIt may sound odd but I put as little personal information in my CV as possible. Hence why just my name and contact details exist. I don\u0026rsquo;t mention my age, gender, race, driver status or political preferences. People come with inherent biases, that is just a fact of life. so putting as little as possible negates triggering these biases.\nI\u0026rsquo;ve seen people put all sorts of information into CVs\u0026hellip; a personal bugbear is photos. As they can introduce a massive amount of opinion in the eyes of the beholder. Think Tinder but for recruitment!\nSo unless specifically asked for I would strongly recommend keeping it to a minimum and using the space it would consume in selling the things that matter.\nProfile statement 1 paragraph! that\u0026rsquo;s all I needed to say regarding what is effectively a personal statement. This is not a UCAS application and I don\u0026rsquo;t need to detail a million things about myself. What is to come next will be the real sales pitch\nMy profile statement focuses specifically on what my future employer is going to get from me if they hire me. Passion! It is a statement of intent, specifically written to demonstrate that I will strive to bring my \u0026ldquo;A game\u0026rdquo; to anything I do in the future and also that I intend to encourage those around me to do the same. In effect, it is the opening line to what is a sales pitch.\nI could expand on this to state other goals and ambitions but that would just detract from the main objective of the CV. 
Plus we will get an opportunity to elaborate later on in the recruitment process.\nSkills Now we come to the meat and bones of the CV. This is the headline! the bit that you will tailor the most in order to impress whoever is scheduling those interviews and guarantee you a chance to shine. If you can make this captivating enough for the person you need to impress they will then be more than happy to read the rest\nMy CV has a LOT of skills listed. This is mainly because I\u0026rsquo;m a show off more than anything else and in reality, I will tailor this heavily to suit whatever role I\u0026rsquo;m applying for. For example, if it doesn\u0026rsquo;t have a requirement for managerial skills I would drop that section entirely. Depending on the required attributes and skills asked for I would happily add/remove bits to any section of my skills.\nWith different types of skills, I will define them differently. As you can see I am quite generic with Managerial skills. A lot of people understand these and are looking for confidence in your ability to do certain types of tasks. Technical skills define different aspects of what I do that are not specifically tied to languages. It can be very difficult to be curt and to the point here. I favour labelling a technology area and then listing a few of the most prominent or recent items from my repertoire. 
A bad example on my CV is databases\u0026hellip; I can use quite a few as you can see, but in reality, I\u0026rsquo;ve been greedy by listing MySQL, MariaDB \u0026amp; Percona\u0026hellip; they are all effectively connected to each other, but my need to show off is far too much for me to resist.\nExperience When I first started out as a Dev I had a hiring manager who once told me that most of the time the main thing he wanted to see was;\ndid the applicant have the skills he wanted what was the candidates\u0026rsquo; personal assessment of their ability how much genuine commercial experience do they have Three really simple things that stuck with me. So the next time I came to write my CV I added a table for Skills which had only a few things on it back then but has grown massively over the years. I\u0026rsquo;ve tweaked and tailored it over the years, adding and removing things that were relevant as needed.\nBy quite clearly labelling my own perception of my skills, I give the person reading it an indication of my potential value. It allows them to tailor the interviews and technical tests to fit me as an individual. It also acts as a double-edged sword as the higher I grade myself the more chance of falling flat on my face when I get asked a question I can\u0026rsquo;t answer\u0026hellip; So it\u0026rsquo;s always better to try and be accurate, but also not to be timid, there are always other competitors in the recruitment race.\nI favour using really simple words to identify my skill level; beginner, intermediate, expert and expert+. It makes it simple for the reader to gauge and also allows me to mix the terms around\nStating how many years of commercial experience you have lets the reader understand your commercial experience. The number of times I\u0026rsquo;ve had recruiters do a typical keyword match and see that I mention c# once on my CV and then try to wave a job advert under my nose. I did .NET for 6 months 10 odd years ago. 
I legitimately can claim that experience, but it should not be the core on which I base my next adventure.\nAs a useful side effect, by stating a duration in years prompts me to review and update my CV regularly to keep it up to date.\nPrevious Employment By this point, I\u0026rsquo;m hoping that I\u0026rsquo;ve captured the readers\u0026rsquo; imagination. That they now envision a Development God has graced them with a CV worthy of filling any role they have\u0026hellip;\nAnd then come back down to earth with a bump! Next up in the firing line is Work Experience. Here I define some of the previous positions I have held in my illustrious career in somewhat chronological order. I say some because I have a general rule of limiting what I put on the CV to either 10 years or 10 positions, whichever comes first. This hasn\u0026rsquo;t been a hard and fast rule over the years, and it\u0026rsquo;s fine to flex in order to suit the roles I\u0026rsquo;ve gone for. There is no reason whatsoever though to go so far back in time as to describe my time as a pot-washer when I was 17.\nIn some cases, I\u0026rsquo;ve also omitted some things from my CV. Somewhere in and around 2012 - 2015; I happened to found and run a co-working space in Manchester City Centre. It was an interesting venture that I\u0026rsquo;m really proud of and am glad to say is still running even though I\u0026rsquo;m no longer a part of it. But in truth, it does not add anything of value to my CV for the positions I plan to go for in the future.\nTo the point I don\u0026rsquo;t like big sprawling paragraphs of text if you haven\u0026rsquo;t already got the gist. So for me, bullet points are the way forward. I try to keep it fairly obvious and detail achievements and document things I have done, and not the tools I have used.\nI try to explain why I joined the company and what my purpose was. I highlight important successes and demonstrate improvements I made to the company. 
Short succinct sentences should be chosen to illustrate aspects of your accomplishments that actually relate to your future goals in your new position.\nBy doing it this way you also make it easier to tweak and change things without having to restructure a whole piece of prose in the future.\nThat said my own CV has a glaring exception! When I was doing freelance work I was not able to accurately describe everything I was doing due to some pesky Non-Disclosure Agreements. So instead I have a simple paragraph providing a positive high-level explanation of what benefits I brought to my clients.\nFormatting There is a lot of information condensed into this section which could make it hard to read if left unformatted. I stuck to the font sizes I had chosen previously and decided to use a more subtle combination of indentation, italics, underline and bold to make it more pleasing to the eye and easier to scan.\nEach company name acts as a subheading with a font size of 12. I take a little liberty and include on the same line the dates that I was with them. This gives a clear timeline of events that a manager can then refer to quickly when they need it, such as in an interview.\nIndenting everything under a subheading makes it easier for the reader to separate out the content easily. I include an address for the company. I am not actually sure why if I\u0026rsquo;m honest, it\u0026rsquo;s just something I\u0026rsquo;ve always done. I put this in italics mainly because it\u0026rsquo;s an aside to the core of the information.\nThe job title will come next, in bold of course to help it stand out, followed immediately by the relevant bullet points. My last role obviously plays the most prominent part as it\u0026rsquo;s my headliner. In the case of Medicines Discovery Catapult, 
I\u0026rsquo;ve held two roles, \u0026ldquo;DevOps Engineer\u0026rdquo; and \u0026ldquo;Head of Software Engineering\u0026rdquo; so I break these into their own sections within this piece of experience. Prior to that, I was with Wakelet and here I merge the two roles I held there in order to conserve space.\nEducation The bulk of the hard work is now done! We have put in the sales pitch and hopefully, we are close to being invited to an interview. Time to put in some supporting information\nThis may come as a surprise to some people but I didn\u0026rsquo;t do the whole University thing\u0026hellip; I mean I lived in a University city and frequented the student union bars, but was never actually enrolled in a course. To this effect, I bolstered my CV when I started out by going and obtaining some (now significantly outdated) professional certifications.\nRegardless of my lack of educational achievements I would always recommend keeping it simple unless it is your first position and you have no work experience (writing a CV in that situation probably needs to be another blog post entirely). List them in chronological order with some dates and a summary of what you obtained\u0026hellip; no one needs to know that I completely fluked getting a GCSE in Art.\nWere I to have a Degree I would obviously have the institution and dates in there along with my final marks. I would also look at listing the modules I completed that are relevant to my career, providing I did a Computer Sciences degree and not something like Biology. Again, it\u0026rsquo;s all about keeping to the point and providing specific information to bolster everything else you may have done.\nWrapping up Time to finish off with a little bit of something personal. I don\u0026rsquo;t want to spend too much time here but I want to show that there is more to who I am than just work. I decided to keep it simple, a simple list of things that I enjoy doing in my spare time (When I have any, being a father of 3). 
The idea here is that these can become conversation pieces with the people that may be interviewing you. I\u0026rsquo;ve ended up a number of times having interviews where I talk about Scouts.\nThe final flourish here should be something simple but gives someone a helping hand at learning more about you if they feel so inclined. I include a link to my blog and my GitHub account, but you could include anything that may be relevant.\nLast but not least\u0026hellip; References are always available on request. My referees are varied and have changed over time. Adding them to your CV doesn\u0026rsquo;t actually impart any other information that could get you hired. Name-dropping is not the right way to get a job.\nConclusion OK! so it\u0026rsquo;s not a pretty CV by any stretch of the imagination. It\u0026rsquo;s a bit on the long side in its full un-tailored, raw form, though not as long as it could be if I wasn\u0026rsquo;t being diligent in how I want to present myself. But this is the format I\u0026rsquo;ve used as my CV for at least 15 years now and I would say that I\u0026rsquo;ve been really successful in getting interviews out of it. I would say I get at least an 80% success rate of conversions from seeing my CV to a first-stage interview (watch me now jinx myself for the future).\nA CV will never get you the job! It\u0026rsquo;s all down to you excelling in an interview situation and proving how awesome you are and that you can do everything you say you can on your CV. All it is meant to do is get your foot in the door. 
Hopefully, this breakdown of my CV will help you to take a look at your own CV and work on ways to improve your chances.\n","date":"2019-05-15T00:00:00Z","image":"/technical-cv-writing/writing-a-cv.jpg","permalink":"/technical-cv-writing/","title":"Technical CV writing is hard"},{"content":"I love Ubuntu\u0026hellip; I\u0026rsquo;m pretty fond of Dell kit too!\nSo I was rather chuffed when I started working at Medicines Discovery Catapult because they let me have both. When you look at my desk it looks like it could be an advert for Dell. Laptop, monitors, dock, keyboard and mouse\u0026hellip;. it\u0026rsquo;s great when you have a corporate account with a Dell reseller\nHowever while I\u0026rsquo;ve had a lot of success with the D3000 DisplayLink dock on Ubuntu I found that I\u0026rsquo;m now having to deal with the upgraded D6000\u0026hellip; which doesn\u0026rsquo;t play very nicely with the more recent versions of Ubuntu (we are talking 18.04 and later)\nI kept finding that after a random amount of time the D6000 would seem to power down\u0026hellip; I would lose the screens, audio, networking and USB, and the only way I could fix it was to unplug it from the laptop and plug it back in. Not ideal, especially if I\u0026rsquo;m in the middle of a video call or debugging something on the net\nBeing the kind of techie I am my first port of call was checking my logs\u0026hellip; but I couldn\u0026rsquo;t see anything that would cause this random disconnect. So off to Google I went\u0026hellip; eventually I found a lot of information telling me it was part of power management causing things to start powering down\u0026hellip; In this case it implied that it was something trying to suspend USB\u0026hellip; which sounded really plausible!\nSo a little more research suggested that I should be using laptop mode tools to disable the ability for USB to be suspended. 
I gave it a go, though I was dubious as in my mind I shouldn\u0026rsquo;t have needed to install an additional package (albeit a great one for tweaking your power management on a laptop running Linux)\nAlas no joy! And I had too much to do to start debugging in depth and ripping apart other people\u0026rsquo;s code to figure it out.\nWhat did I do? you ask. Well, I just put up with it for a few weeks, but gradually it began to grate on my nerves. However there was that one day where it didn\u0026rsquo;t turn off\u0026hellip; and that left me perplexed\u0026hellip; I checked if any updates had been applied in my last apt update \u0026amp;\u0026amp; apt upgrade \u0026hellip; nothing\u0026hellip;. it then dawned on me that I had plugged the headset I used for conference calling into the audio in/out on the dock instead of directly into the laptop.\nNow I had a little more information I was able to deduce (with Google\u0026rsquo;s help) that the laptop was actually suspending USB, but that the trigger was actually pulseaudio. At this point it becomes really easy to solve the problem.\nSolution Edit /etc/pulse/default.pa using your preferred editor (and sudo)\nFind the line\n### Automatically suspend sinks/sources that become idle for too long load-module module-suspend-on-idle And comment it out and save!\nLastly, because it\u0026rsquo;s run as a user service you need to restart the Pulse Audio daemon using the command\nsystemctl --user restart pulseaudio.service or you could just log out and back in again\n","date":"2019-05-14T00:00:00Z","image":"/dell-displaylink-d6000-ubuntu-18-04-issues/20190514_124153.png","permalink":"/dell-displaylink-d6000-ubuntu-18-04-issues/","title":"Dell DisplayLink D6000 \u0026 Ubuntu 18.04+ Issues"},{"content":"So\u0026hellip; my last post was a good 2 years ago now\u0026hellip;. Hi how have you been?\nIt\u0026rsquo;s been a very busy couple of years with a lot of stuff shifting in my personal life meaning things inevitably take a back seat. 
However it\u0026rsquo;s been long enough that I needed to give myself a new start and see about blogging again and getting back into speaking again.\nStill PHPBoyScout? Since 2016 I\u0026rsquo;ve jumped around in my career a lot! This has exposed me to a lot of new languages and tech, all of which has been awesome but it now means that the moniker of PHPBoyScout probably isn\u0026rsquo;t all that accurate any more. That said I still have a love for PHP and my first thought when presented with a new challenge is still \u0026ldquo;How would I approach this using PHP?\u0026rdquo;, so I think that means I can keep my twitter handle for a little longer rather than re-brand myself :-D\nI\u0026rsquo;ve always had a penchant for picking up languages pretty quickly and have been an advocate of the idea that as long as you can think in the right way then languages can be learnt easily enough.\nEmployed and Employable October 2017 saw me take on a new role with a company called Medicines Discovery Catapult. They are a grant-funded not-for-profit organisation that is specifically focused on shaking up the medicine discovery pipeline by helping SMEs \u0026amp; CROs innovate and collaborate.\nMy role is very broad as I appear to have fallen into the role of Head of Software Engineering very quickly. Though I started as a DevOps Engineer it has evolved very quickly. We work with a wide array of languages and tools and nothing is off the table if it helps to solve the problems we are working on. Currently we actively have code being written in Python, Node, Go \u0026amp; Scala (I live in hope that I\u0026rsquo;ll be able to bring PHP into that mix eventually). 
Vast data sets and AI are the name of the game with my team of engineers helping to support the data scientists and Informaticians.\nAt the time of writing we are also hiring so if you are curious take a look at https://md.catapult.org.uk/about/careers/ and see if you like the sound of what we are doing and the challenge on offer.\nMoving forward The plan moving forward is to post about the tech we are working with, talk about the types of challenges we are facing and also maybe restart my public speaking career. Stay tuned for more and if you don\u0026rsquo;t hear from me soon then give me a nudge\n","date":"2019-02-26T00:00:00Z","permalink":"/a-reboot-and-a-legacy-moniker/","title":"A reboot and a legacy moniker"},{"content":"The first of my new round of talk abstracts! In all honesty this isn\u0026rsquo;t a talk but something more that came out of a very drunken Saturday night at #phpbnl19. There were a bunch of us sat talking and somehow the topic of D\u0026amp;D came up which sent my mind racing with this idea\u0026hellip; By the time 2am rolled around I had a fully formed idea along with some willing players to help with the idea. Now I just need to find a conference willing to take a chance on it! If you know a conference that would be interested let me know\nDescription/Abstract Our heroes have just completed our latest quest! Having successfully delivered the latest iteration of their project they are approached by the aged and holy sage \u0026ldquo;PeeEm\u0026rdquo; with a new quest! How quickly can they implement the sacred and forgotten art of \u0026ldquo;Lo-ging\u0026rdquo; into the codebase\nDo they accept? If they do will they be able to defeat the trials and tribulations that await them? 
Will they find treasure and glory, or suffer defeat at the hands of the vile and depraved Stakeholder?\nWhat will people learn The only way to find out if our heroes will complete this epic quest will be to join us and see if the dice of fate will be kind to them!\nAlong the way we will learn some truths about feature implementation and how our heroes handle the challenges that lie ahead. And hopefully gain enough experience to level up!\nAdditional Information This is an extremely unique talk! It takes the form of a live Dungeons \u0026amp; Dragons game. I will be on stage playing the part of Dungeon Master looking to guide our team of adventurers through the process of delivering a new feature for a project.\nOur heroes currently consist of a heroic bard who will be inspiring our heroes, and audience, with ballads of past glories. A warlock with the demonic power to fork and merge code like no other in existence\u0026hellip;. Our sorcerer has the innate and wild magic of Fire(base)! Finally our team is held together with the support of our cleric, worshipping the ancient god Rasmus.\nThe outcome of this adventure will genuinely be determined by the roll of the dice! It will be a game of 5th edition D\u0026amp;D that we can probably fit into an hour\u0026hellip; but 2 or more would be better and far more fun.\nAudience participation is expected! Cosplay is hoped for! I will, of course, be in full scout uniform!\n","date":"2019-02-26T00:00:00Z","image":"/project-slayer-the-critical-path/c7c3a029d172b33287003d26a0c693f9.png","permalink":"/project-slayer-the-critical-path/","title":"Project Slayer: The Critical Path"},{"content":"As part of the attempt to develop my profile as a speaker, I\u0026rsquo;ve realised that I sometimes need to explain a few of my current talk abstracts a bit too much. 
This is mainly due to my lack of experience writing them and that the majority of my current talk ideas cover large topics that are not as technical as I would like.\nMy favourite so far is one title \u0026ldquo;Python explains why your project failed\u0026rdquo;. This is a tongue in cheek talk which aims to poke fun at the Developer, PM and of course the client!\nThe TL;DR This talk has yet to be accepted by anyone\u0026hellip; but will be eventually I hope. In the mean time I wanted to share some of the funny thoughts and comparisons I\u0026rsquo;ve had coming up with the content for the talk. I plan on doing this by writing a series of blog posts one for each topic or sketch that features in the talk.\nThe Abstract Python is fantastic! If you haven’t seen it you really need to. Its simple, elegant, powerful and gives you an amazing perspective on what we do as Developers, it is also hilariously funny…. Yes Funny!!!!\nWait! You thought I was talking about Python the programming language didn\u0026rsquo;t you? I\u0026rsquo;m sorry to tell you but we are actually talking about the most awesome of British comedy acts\u0026hellip; Monty Python.\nThroughout this talk I will take you through the development life-cycle of a project and use the Comedy of Monty Python to illustrate both the Good and the Bad (mainly the bad) aspects of our industry. 
All the way from Client introduction, requirements gathers, spec writing, team selection, planning and scoping all the way through Development to Testing, Delivery and Support!\nThe Delivery This is a little harder to explain as I\u0026rsquo;ve not given the talk (yet) and I don\u0026rsquo;t think I could ever match the delivery better than the Pythons themselves.\nHowever I have been known to dress for the occasion, so it\u0026rsquo;s quite possible that you may find me standing on stage in a red cassock at some point.\nI\u0026rsquo;ve also managed to convince @phpcodemonkey, who is as big a Python fan as myself, that this talk should really be performed as a 2 man show, rather than me monologuing at a room full of people.\nThe Sketches Due to the prolific variety of skits and sketches that Pythons created I found it extremely hard to select the few needed to fill a single talk. I have however managed to select a few and will change them around from time to time to suit the audience. I\u0026rsquo;ve provided a short list of a few of my favourites Sketches and a couple of words describing what they explain:\nSpanish inquisition - Client Indecision Dead parrot sketch - Stubborn Project Managers Ministry of silly (array) walks - Tool Selection Brian\u0026rsquo;s Latin Lesson - Planning and Preparation We demand a shrubbery - Demanding the Impossible Black knight - Solution fixation / Code Blindness Camelot Song - Stakeholder Morale The People\u0026rsquo;s Front - Team Fragmentation The Silly Job Interview - Stakeholder Communications Four Yorkshireman - Rockstar Developers Argument clinic - Product Delivery Architects sketch - Taking Shortcuts and Cutting Corners The Finale These are just a few of the potential topics I will be looking to cover in the coming posts, but while your reading them I want you to remember 
to\u0026hellip;\nhttps://youtu.be/WlBiLNN1NhQ\n","date":"2016-02-07T00:00:00Z","image":"/monty-python-explains-project-failed/9780563558200.jpg","permalink":"/monty-python-explains-project-failed/","title":"Monty Python explains why your project failed!"},{"content":"https://www.youtube.com/watch?v=Tt0lnauF5lI\nJust before the Christmas period I was lucky enough to be able to give my \u0026ldquo;Are you a good Code Scout?\u0026rdquo; talk as a lightning talk for NomadPHP. Here is the video that was recorded from it.\n","date":"2016-01-06T00:00:00Z","permalink":"/good-code-scout-nomadphp-lightning-talk-video/","title":"Are you a Good Code Scout? - NomadPHP lightning talk video"},{"content":"If you\u0026rsquo;re anything like me you have a large number of email aliases that you use with Gmail, which is great. However, I use Evolution as a mail client more often than not when using Gnome3 as a desktop.\nIt\u0026rsquo;s very easy to set up Evolution to create separate outbound email accounts that you can use for handling all of your aliases. Unfortunately, it doesn\u0026rsquo;t yet support OAuth2 as an authentication mechanism for any account that is not set up using the built-in Gnome Online Accounts integration.\nThis is a real pain as Google have disabled the more common \u0026lsquo;plain\u0026rsquo; and \u0026lsquo;login\u0026rsquo; authentication mechanisms for use with an SMTP-only account. 
Meaning that any time that you try to connect to smtp.gmail.com:587 with STARTTLS you will get some form of error message to the effect of \u0026ldquo;Bad Authentication\u0026rdquo;.\nHopefully I\u0026rsquo;ll find a workaround at some point in the near future or Evolution will add the facility to enable OAuth2 as an available authentication mechanism.\nIn the mean time there is a workaround if you visit https://www.google.com/settings/security/lesssecureapps you can enable these less secure authentication mechanisms allowing you to once again connect and send email via email addresses using SMTP\n","date":"2016-01-06T00:00:00Z","permalink":"/using-gmail-aliases-with-evolution/","title":"Using Gmail aliases with Evolution"},{"content":"Never one to shy away from coming up with a metaphor for explaining something technical I found myself having to come up with one on the spot for PSR-7 and Middleware while at the recent PHPNW15 Conference.\nNormally my brain will come up with something completely inappropriate but this time round I found I quite liked the imagery that came to mind.\nIf you would like to find out more of the specifics about PSR-7 you can take a look at http://www.php-fig.org/psr/psr-7/ which will make a better job of explaining it than I could ever do.\nNow on to the metaphor\nImagine a house on fire, a bizarre way to start I know but bear with me. The nearest well with water that can put out the fire is 500 meters away! We then have a human chain stretching between the well and the house with a bucket going back and forth between trying to put the fire out. So lets break this down, the house represents the internet, or more specifically you and your browser. The fact you are on fire means that you are desperately needing water to quench the flames. 
At this point you send an empty bucket which represents your \u0026ldquo;request\u0026rdquo;, along the human chain, which in itself represents the application, to the well.\nAt the start of the chain the bucket is pretty normal, it\u0026rsquo;s a bucket of course, its round, made of wood with a rope handle, lets say it has a small leak in it.\nAs it travels down the chain it\u0026rsquo;s passed from person to person, everyone in it has the opportunity to do something with the bucket, or not as the case may be and could just pass it to the next person in the chain. Others may attempt to fix the leak in the bucket, someone may choose to replace it with a metal bucket, change the handle or make it bigger. Regardless of what may be done to the bucket in essence it remains a bucket.\nInexorably the bucket will continue to move down the chain to the well. When it reaches the well it changes state because now it has been filled with water. All of the interaction with the bucket thus far, mean that what happens at the well could vary depending on the changes have been made . If its been made bigger, for example, it could be filled with significantly more water, if swapped for a metal one it could imply that the bucket descends the well to get the water quicker because its heavier. Either way it is filled with water and begins its journey back towards the house.\nAgain it passes through the hands of each person in the chain, but now that its state has changed it now has the opportunity to be modified again. Someone may empty some water out as there is too much in the bucket, others may say that there is not enough and send it back down the line towards the well to be refilled. Either way the bucket continues to change hands over and over until it reaches the house and the contents are thrown on the fire to complete the request for water.\nDuring this whole time the human chain could have been in flux. 
Some people may have swapped places, left the chain, added to the chain, some extraordinary people may have played leapfrog in the chain and appeared to handle the bucket more than once. Regardless of these changes the chain remains and continues to pass the bucket from one person to the another as long as the requests for water keep coming.\nThis, in the simplest possible form, explains PSR-7 and the concept of Middleware.\nThe bucket remains a bucket because PSR-7 says that is what is needed to complete the request for water, it also defines how you should interact with it regardless of what modifications have been made. If the bucket cant be used according to how PSR-7 describes a bucket to be, then the middleware can\u0026rsquo;t complete the request.\nEvery person in the human chain can be classed as a piece of middleware all the way from the house to the well and back again. If at any point someone enters the chain that doesn\u0026rsquo;t agree that the bucket is a bucket or doesn\u0026rsquo;t know how to handle it, then the it is dropped on the ground and the request fails.\n","date":"2015-10-08T00:00:00Z","image":"/metaphor-psr7-middleware/fire-bucket-brigade.jpg","permalink":"/metaphor-psr7-middleware/","title":"A metaphor about PSR-7 and Middleware for non-developers"},{"content":"One of the most prominent things I\u0026rsquo;ve been asked about regarding my promoting being a Good Code Scout, is where can we get the badges?\nFollowing on from a number of questions and subsequent tweets about it\nhttps://twitter.com/stuherbert/status/650591775732670466\nWell\u0026hellip; I\u0026rsquo;ve decided that (providing I can get permission from all the right people) I\u0026rsquo;ll start creating a range of Badges \u0026amp; Stickers for you to earn as a Good Code Scout.\nSo if your interested in having some stickers or badges let me know using the form below and if I get enough interest I will most definitely get some made up for you.\nUpdate Unfortunately 
I have no more stickers left. Though I\u0026rsquo;ll be working on diversifying some of the designs in the near future, I won\u0026rsquo;t be looking at ordering any more for a little while yet. Keep your eyes on Twitter as I\u0026rsquo;ll most likely post there when they are available again.\n","date":"2015-10-04T00:00:00Z","image":"/badges-and-stickers/Screen-Shot-2015-10-04-at-12.03.38.png","permalink":"/badges-and-stickers/","title":"Badges \u0026 Stickers"},{"content":"So I attended the PHPNW15 conference this weekend and what a weekend. I\u0026rsquo;ve been an attendee of the conference for a number of years and have always enjoyed it immensely. However, this year turned out to be something special.\nFollowing on from my first ever appearance as the PHP Boy Scout I decided to submit to the Unconference at PHPNW15.\nIt was a good talk, an extension of the previous lightning talk I\u0026rsquo;d given, and felt really good to give. Unbeknownst to me, however, there was mischief afoot. Normally the Unconference talks are rated by the organisers and the one that they select as the best gets a guaranteed slot in next year\u0026rsquo;s PHPNW conference. All of which I had genuinely either no idea about or had forgotten had happened in previous conferences.\nAs you may have guessed from the fact this post exists, I ended up winning that slot.\nHowever\u0026hellip;. 
it appeared that a Speaker had taken ill at the last minute and couldn\u0026rsquo;t make it, at which point I was told, a mere 3 minutes before it was announced, that I was also going to be given the hangover slot on track three for the Sunday sessions!!!!!\nSuffice to say I had an interesting evening, to say the least, preparing my talk for \u0026ldquo;the big time\u0026rdquo;.\nAmazingly I felt really calm about everything, and even had a good chuckle about managing to find some props to help break the ice!\nEverything is ready, I\u0026rsquo;ve practiced, knowing my talk was going to be a bit short\u0026hellip; but that was ok considering the short notice, and I had plenty of anecdotes I could use as filler. I\u0026rsquo;m sat there waiting for the moment I have to put my head above the parapet and all of a sudden\u0026hellip;\u0026hellip;..\nnothing\nMy mind goes blank!\nThe long and the short is that I survived, and the feedback I have had has been amazing and I\u0026rsquo;ll be taking all of it on board to make sure that next time it\u0026rsquo;s even better!\nThe recordings should be available in the near future so when they are I will share a link so you can judge how it went for yourselves. In the meantime I\u0026rsquo;ve published the revised slide deck for you on slideshare.net/phpboyscout/are-you-a-good-scout-phpnw15-track-3\nI\u0026rsquo;m hoping that I can now find some opportunities to practice for my slot at #phpnw16\n[slideshare id=53508432\u0026amp;doc=areyouagoodscout-151004073451-lva1-app6892]\n","date":"2015-10-04T00:00:00Z","permalink":"/wow-phpnw15-conference/","title":"Wow... What a Conference"},{"content":"Recently I\u0026rsquo;ve had a lot of people asking me what a PHP Scout is! I thought it would be a good opportunity to explain.\nTo understand what a PHP Scout is it helps to know a little of the background basics of Scouting in general. Knowing this also makes things easier later on, when we draw some direct parallels. 
If you would like to investigate more about the history of Scouting you can find a good starting point at http://scouts.org.uk/about-us/history/.\nFor now I\u0026rsquo;m going to give a tl;dr version;\nScouting started in 1908 as a movement for training young people to encourage them to develop physically, mentally and spiritually by Robert Baden-Powell. Over the next 100+ years it has evolved to encompass people of all ages, races, colours and creeds to get involved and try to be the best they can be.\nThe primary ethos of the movement today is to bring Everyday Adventure to young people and this is achieved through a comprehensive programme scheme that is designed to touch on all aspects of that young persons development. This is then rewarded in a variety of ways with the primary reward being the experience itself, the awarding of badges also strengthens then sense of achievement and desire to work towards the next goal.\nAll members of the Scouting movement are required to make and frequently renew a promise:\nOn my honour, I promise that I will do my best to do my duty to {insert deity/monarchy here}, to help other people and to keep the Scout Law.\nThe key part here is I will do my best. Scouts are continually encouraged to improve themselves in everything they do.\nThe Boy Scout Rule Lets start with something easy! There is a pretty common piece of guidance that gets bandied about in a lot of different circles that is normally referred to as \u0026ldquo;The Boy Scout Rule\u0026rdquo; which promotes leaving things better than you found it. It came about as common practice for scouts to always try to leave a campsite cleaner and tidier than when they arrived so that its in a good state for the next group.\nThis is quite generic but can easily be made very specific to us as programmers:\nLeave the codebase better than you found it.\nSo what do I mean by this? 
Ultimately I mean that regardless of the state of the code you are working on you should always try to find a way to improve it.\nThis can be something as simple as:\nrefactoring the code to make it more readable\nadding some docblock to explain a file/class/method/function/variable\ncreate a Readme file or add some documentation\nremove obsolete code, old backup files, stray files, unused components\nfix a failing test\nwrite a new test even\nIt\u0026rsquo;s not an exhaustive list at all but it gives you an idea of what kind of things you can be doing to improve your codebase. Any good Scout group leaving a campsite would also make sure to put out the fire and close the gate on their way out. Which is exactly what you should be doing by making sure all your Acceptance, Functional, Integration \u0026amp; Unit tests pass and writing a good commit message.\nRight tool for the job In every activity that a Scout takes part in they are always taught the correct way to work with their tools and equipment, such as how to use a penknife properly. They are then encouraged to explore different ways of using those to achieve their goals. This is no different for a PHP Scout: by knowing how to use their languages and tools properly they can then use them to maximum effect.\nSelf Development As a child you assimilate massive amounts of information every day that helps you to grow and develop. This is creatively harnessed by Scouts through a variety of different activities that are designed to help them learn new skills that can help them grow as people.\nNow that we are older our brains don\u0026rsquo;t have the same capacity to soak up that volume of information. But that doesn\u0026rsquo;t mean we shouldn\u0026rsquo;t be trying! A good PHP Scout will continually strive to push the boundaries of what they know, to pick up new skills that can be used to make them more capable. 
This can be learning a new technique, language or tool, be it via formal training, conferences, social events or even just a good Google.\nGranted the Scouts are rewarded with some cool badges, but I\u0026rsquo;m sure it\u0026rsquo;s only a matter of time before some entrepreneurial PHP Scout decides to start creating some achievement badges of their own (see http://phpboyscout.uk/php-scout-membership-badge)\nHelping Others We\u0026rsquo;ve all heard the adage of a Scout helping someone across the street. It\u0026rsquo;s a somewhat stereotypical example but extremely apt as it highlights that they are encouraged to take into consideration other people\u0026rsquo;s needs and to provide assistance wherever possible. Modern Scouting however goes far beyond helping someone avoid getting run over on a road.\nBy encouraging Scouts to help individuals, communities and groups we make them more considerate of the needs of others as well as developing their sense of self. A fantastic example of this is the 2015/16 initiative A Million Hands which promotes finding ways to identify the needs of others and to take action to provide aid.\nThis is a fantastic trait to be teaching children and is something that any good PHP Scout would applaud, and would then go forward to do the same things but with the development community. This can be something as simple as:\nhelping a colleague at work (without being told to do so by your boss)\norganising an event with a local user group\ncontributing to an open source project\nAll very simple stuff to do and all it takes up is a little of your time! Where is the reward? I hear some of you ask! I would say that the act itself is its own reward, and in reality that is true, as when helping others you generate some very positive Karma which will eventually be paid back when the day comes that you yourself need some help. 
You should (hopefully) also have an opportunity to maybe learn something new and improve your ability to communicate, a soft skill yes, but essential to your growth if you are striving to be better than you are now.\nProblem Solving The Scout motto is a very simple two words\u0026hellip; \u0026ldquo;Be Prepared\u0026rdquo;, but be prepared to do what? It\u0026rsquo;s quite open ended really; it could be anything at all! I like to think that it\u0026rsquo;s nearly impossible to be equipped with every skill and tool needed to meet any and every task you will encounter through life, though being a Scout does try to help arm you with as many as possible.\nYet as a PHP Scout we should always \u0026ldquo;Be Prepared\u0026rdquo; to solve problems. If we are doing our jobs right we should be looking to solve problems through the solutions we provide every day. Quite often I talk to developers and hear them make pigeonholing statements like\nI am a WordPress developer\nand then complain that they are bored at work or that they can\u0026rsquo;t get jobs working with anything other than their chosen platform. Now this infuriates me, as a PHP Scout would never do this. When asked they profess loud and clear\nI am a Problem Solver\nand are prepared to prove it by knowing more than one or two platforms or frameworks or even programming languages. This can be encouraged by actively seeking \u0026ldquo;problems\u0026rdquo; that you can solve with tools and techniques you are not familiar with.\nThis is echoed throughout the challenges that are presented to Scouts, where they are tasked with solving a practical problem such as putting up a tent without any instructions, finding the best way to light a fire with two sticks and a bit of kindling, or working out how to cross a stream with only a few bits of wood and rope. 
In solving these types of problems the Scouts not only receive the obvious of shelter, food, heat etc but they also become more prepared for the next time a similar scenario presents itself.\nTeam work No man is an island as the saying goes and the same goes for being a Scout. By being organised into lodges,packs \u0026amp; patrols they have a ready made team to work with and the only way they can progress is to work together. They may not like the people in their team (and as a Scout Leader I will quite readily admit to putting Scouts into groups with others they may clash with).\nWe may all have teams that we work with as part of our Jobs, and a PHP Scout will take the opportunity to work with as many different combinations of teams as possible both in and outside of the workplace. By diversifying the people you have to interact with you develop a broader understanding about the problems you may be trying to solve.\nThis can then be expanded upon as mentioned previously by then branching out into the community and working with user groups and opensource projects.\nTo Summarise The ethos behind the Scouting movement is a solid foundation not only for children aged 7-18 but for everyone. By being a PHP Scout you strive to keep improving your ability to create great code, solve problems, work with others and in doing so become a better developer.\nAs with all Scouts they are Hard Working, Determined, Ingenious \u0026amp; Tenacious and so is a PHP Scout.\n","date":"2015-10-02T00:00:00Z","permalink":"/php-scout/","title":"What is a PHP Scout"},{"content":"As the PHP Boy Scout I\u0026rsquo;m having some badges made and I wanted to introduce the all new PHP Scout Membership Badge.\nThis badge shows that you are more than just a PHP Developer but also a good PHP Scout. 
This means that you have all the qualities it takes to be a PHP Scout and will :\nalways leave the codebase better than you found it help other PHP developers be good Scouts get involved with your local User Group \u0026amp; PHP Community Contribute to at least one open source project If you would like to find out how to get hold of a PHP Scout Membership Badge then fill out the form below.\n[contact-form to='matt@phpboyscout.uk\u0026rsquo; subject=\u0026lsquo;Someone wants a Membership Badge\u0026rsquo;][contact-field label=\u0026lsquo;Name\u0026rsquo; type=\u0026lsquo;name\u0026rsquo; required=\u0026lsquo;1\u0026rsquo;/][contact-field label=\u0026lsquo;Email\u0026rsquo; type=\u0026lsquo;email\u0026rsquo; required=\u0026lsquo;1\u0026rsquo;/][contact-field label=\u0026lsquo;I will always leave the codebase better than I found it\u0026rsquo; type=\u0026lsquo;checkbox\u0026rsquo; required=\u0026lsquo;1\u0026rsquo;/][contact-field label=\u0026lsquo;I will help other PHP developers\u0026rsquo; type=\u0026lsquo;checkbox\u0026rsquo; required=\u0026lsquo;1\u0026rsquo;/][contact-field label=\u0026lsquo;I will get involved with my local User Group\u0026rsquo; type=\u0026lsquo;checkbox\u0026rsquo; required=\u0026lsquo;1\u0026rsquo;/][contact-field label=\u0026lsquo;I will contribute to an open source project\u0026rsquo; type=\u0026lsquo;checkbox\u0026rsquo; required=\u0026lsquo;1\u0026rsquo;/][/contact-form]\n","date":"2015-08-11T00:00:00Z","image":"/php-scout-membership-badge/elephpant.png","permalink":"/php-scout-membership-badge/","title":"The PHP Scout Membership Badge"},{"content":"So it\u0026rsquo;s finally happened!\nI stood up in front of a group of developers and gave a lightning talk about how Scouting Principles should be applied to every day development.\n[slideshare id=51344270\u0026amp;doc=areyouagoodscout-150806121954-lva1-app6892]\nThe amazing thing is that I didn\u0026rsquo;t get any rotten tomatoes thrown at me! quite the contrary in fact. 
Even with me doing the talk in full Scout uniform.\nNow to see about finding some more places to speak and actually fleshing out the talk into something that can last a full hour and not just shy of 5 minutes.\n","date":"2015-08-06T00:00:00Z","image":"/public-appearance-phpboyscout/20111036639_d7c8ec153d_z.jpg","permalink":"/public-appearance-phpboyscout/","title":"My first ever public appearance as PHPBoyScout"},{"content":"Over all the time that I\u0026rsquo;ve been a developer I\u0026rsquo;ve had people telling me that I should get in front of an audience and speak. However I\u0026rsquo;ve always suffered from a rather bad case of \u0026lsquo;Imposter Syndrome\u0026rsquo; which meant my automatic response to those kind of statements has always been\u0026hellip; I don\u0026rsquo;t really know enough about any one topic.\nThis is very true, I\u0026rsquo;ve spent a lot of my career learning a really broad swathe of technologies and techniques so I can turn my hand to any task that\u0026rsquo;s been presented to me so far. Even so people continue to try convince me that it would be a worthwhile pursuit.\nNow that I work at Magma Digital I find that I\u0026rsquo;m often talking with @phpcodemonkey about all sorts of things and the topic of creating a talk came up while we were enjoying the most excellent PHP South Coast Conference. He knows I\u0026rsquo;ve been a Scout Leader for around 4 years now, and he suggested that I do a talk on the \u0026lsquo;Boy Scout Rule\u0026rsquo;.\nI don\u0026rsquo;t know if he was serious or not at the time but it set my mind racing! This is a topic that I actually know quite a lot about!\nSo I\u0026rsquo;m now going to leave behind the Dev in Charge and have now rebranded as the PHP Boy Scout. 
I\u0026rsquo;ve already managed to pull together the basis of a talk on how Scouting principles can be used in conjunction with what we do as Developers and have a few other ideas that I\u0026rsquo;m going to work on over the next few weeks.\nFingers crossed I will be better at doing this kind of thing than I suspect I will be\u0026hellip; but nothing ventured and nothing gained!\n","date":"2015-08-05T00:00:00Z","permalink":"/goodbye-dev-charge/","title":"Goodbye Dev in Charge"},{"content":"I\u0026rsquo;ve been a Scout Leader for a few years now and the District I work within have very little by way of internet presence. As a bit of a pet project I started building a simple Scout based website for them to use.\nIts nothing too fancy, I created a simple module and theme for the Silverstripe CMS and have now put it into a GitHub Repository to share with the wider scouting community.\nI chose Silverstripe because of the speed with which I could develop something usable as well as providing a super simple management interface that can be handled by users of all skill levels.\nThe module itself extends some very common extensions available for the CMS and makes them scout focused. 
Features include.\nCustomisable theme Multi tiered Event Calendars Customisable Group/Section Pages Dynamic Forms A reliable News/Blog system These are just a few of the most obvious features and hopefully I will continue to add more.\nI\u0026rsquo;m also offering to help any Scout Groups/Districts/Counties if they are wanting to use these modules and get their sites built and up and running for them free of charge.\nIf you want to take a look at the code and have a play yourselves you can find it online at\nhttps://github.com/phpboyscout/silverstripe-scouts https://github.com/phpboyscout/silverstripe-scouts-theme If you want to get in touch or would like more information about having a website built for you please fill in the form below.\n[contact-form to='matt@phpboyscout.uk\u0026rsquo; subject=\u0026lsquo;Request for Scouts Website\u0026rsquo;][contact-field label=\u0026lsquo;Name\u0026rsquo; type=\u0026lsquo;name\u0026rsquo; required=\u0026lsquo;1\u0026rsquo;/][contact-field label=\u0026lsquo;Email\u0026rsquo; type=\u0026lsquo;email\u0026rsquo; required=\u0026lsquo;1\u0026rsquo;/][contact-field label=\u0026lsquo;Scout Group/District/County\u0026rsquo; type=\u0026lsquo;url\u0026rsquo; required=\u0026lsquo;1\u0026rsquo;/][/contact-form]\n","date":"2015-01-12T00:00:00Z","image":"/free-open-source-website-scouts/scouts-snapshot.png","permalink":"/free-open-source-website-scouts/","title":"Free Open Source Website for Scouts"},{"content":"So\u0026hellip; Its been a long time since I posted anything of any relevance. This is due to having been super busy with my previous company Zucchi.\nHowever that has all changed now! After three and a half years of running my own company I have decided that its not for me. 
I gave it my all, but in the end I was becoming too much of a Salesman and I missed getting stuck in with code.\nI\u0026rsquo;ve now moved on and have joined the fantastic team at Magma Digital who have been leaders in PHP software development for somewhere in the region of 14 years as well as heavily involved in the PHP community having been a essential part of the PHPNW user group and conference.\nThis means I should be able to pick up where I left off all those years ago and start being more active again.\nSee you soon\n","date":"2014-12-02T00:00:00Z","permalink":"/time-change/","title":"Its time for a change"},{"content":"Despite having been around for a while and having been through a couple of revisions, its support across browsers can vary greatly. From \u0026ldquo;Candidate Recommendation\u0026rdquo; on Chrome/Opera, \u0026ldquo;legacy flexbox\u0026rdquo; on Firefox and no support at all on IE9 and earlier.\nMaking flexbox work consistently across browsers was a challenge for us on a recent project, but I have found a solution that seems to work quite well.\nBelow is an SCSS @mixin that will attempt to handle compatibility between CR and legacy cross browsers flexbox.\n@mixin flex($content: flex-start, $items: stretch, $direction: row, $wrap: wrap) { $packLegacy: $content; @if $packLegacy == flex-start { $packLegacy: start; } @else if $packLegacy == flex-end { $packLegacy: end; } $alignLegacy: $items; @if $alignLegacy ==flex-start { $alignLegacy: start; } @else if $alignLegacy == flex-end { $alignLegacy: end; } $oritentLegacy: $direction; $directionLegacy: normal; @if $oritentLegacy == row { $oritentLegacy: horizontal; } @else if $oritentLegacy == column { $oritentLegacy: vertical; } /** SAFARI **/ display: -webkit-box; -webkit-box-orient: $oritentLegacy; -webkit-box-pack: $packLegacy; -webkit-box-align: $alignLegacy; /** FIREFOX LEGACY **/ display: -moz-box; -moz-box-orient: $oritentLegacy; -moz-box-direction: $directionLegacy; -moz-box-pack: $packLegacy; 
-moz-box-align: $alignLegacy; /** LEGACY **/ display: box; box-orient: $oritentLegacy; box-direction: $directionLegacy; box-pack: $packLegacy; box-align: $alignLegacy; /** IE 10+ **/ display: -ms-flexbox; -ms-flex-wrap: $wrap; -ms-flex-direction: $direction; -ms-justify-content: $content; -ms-align-items: $items; /** CHROME **/ display: -webkit-flex; -webkit-flex-wrap: $wrap; -webkit-flex-direction: $direction; -webkit-justify-content: $content; -webkit-align-items: $items; /** NATIVE **/ display: flex; flex-wrap: $wrap; flex-direction: $direction; justify-content: $content; align-items: $items; } //@mixin flex @mixin flexItem($width) { -webkit-box-flex: $width; -moz-box-flex: $width; box-flex: $width; -ms-flex: $width; -webkit-flex: $width; flex: $width; min-height: 0; } Firefox however only half supports flexbox (all revisions) and to get around this I would recommend using Modernizr as this will add the class \u0026ldquo;no-flexbox\u0026rdquo; to the tag. This provides us with a simple work around that allows non flexbox supporting browsers render correctly by using specifically crafted and targeted CSS for non-flexbox browsers\nI found that IE9 support could be implemented using the flexie javascript plugin. 
In IE8 Modernizr will add the class \u0026ldquo;no-flexboxlegacy\u0026rdquo; which can again allow you to create targeted CSS that won\u0026rsquo;t affect your Flexbox layout.\nFor a great overview of the \u0026ldquo;CR\u0026rdquo; of flexbox, CSS Tricks has an amazingly comprehensive coverage of the functionality here http://css-tricks.com/snippets/css/a-guide-to-flexbox/\n","date":"2013-08-16T00:00:00Z","permalink":"/flexbox-cross-browser/","title":"Flexbox cross browser"},{"content":"We wanted to create a Route to our custom Products Controller in our products module for SilverStripe 3.1, such as: \u0026ldquo;http://www.examplesite.com/products/\u0026rdquo;\nHowever, looking at the Controller Documentation it was not clear how to create a route without an Action being supplied. In our example above the action is not specified, as we just want to use \u0026lsquo;view\u0026rsquo;.\nSolution:\nCreate a /_config/routes.yml file containing the following:\n--- Name: productsroutes After: \u0026#39;framework/routes#coreroutes\u0026#39; --- Director: rules: \u0026#39;product\u0026#39;: \u0026#39;Product_Controller\u0026#39; --- The above will route any URL that starts with \u0026ldquo;/product\u0026rdquo; to our Product_Controller. Note that everything after the rule, so after \u0026ldquo;/product\u0026rdquo;, is used in the next bit for matching.\nNow we need to add private static $url_handlers to Product_Controller to match our path, so in this example we need to match \u0026ldquo;$Slug!\u0026rdquo; which will match \u0026ldquo;\u0026rdquo;. Note the ! means the slug is required. Of course we want to direct this to a specific action, in this case \u0026ldquo;view\u0026rdquo;, which gives us:\nprivate static $url_handlers = array( \u0026#39;$Slug!\u0026#39; =\u0026gt; \u0026#39;view\u0026#39;, ); Now just add \u0026ldquo;view\u0026rdquo; to $allowed_actions and add the \u0026ldquo;view\u0026rdquo; function. 
This gives the final Product_Controller as follows:\nclass Product_Controller extends Page_Controller { private static $url_handlers = array( \u0026#39;$Slug!\u0026#39; =\u0026gt; \u0026#39;view\u0026#39;, ); private static $allowed_actions = array(\u0026#39;view\u0026#39;); public function view(SS_HTTPRequest $request) { // Your action code goes here return $this-\u0026gt;render(); } } Handy note:\nYou can put ?debug_request=1 on the end of your URL to see how it determines which Controller to use.\n","date":"2013-07-31T00:00:00Z","permalink":"/creating-custom-routes-silverstripe/","title":"Creating Custom Routes in Silverstripe 3.1"},{"content":"While working with Silverstripe we found ourselves having to run \u0026ldquo;?flush=1\u0026rdquo; a lot to clear the Cache. To switch it off, while you work, add the following to your mysite/_config.php:\nSS_Cache::set_cache_lifetime(\u0026#39;default\u0026#39;, -1, 100); ","date":"2013-07-31T00:00:00Z","permalink":"/disabling-cache-silverstripe/","title":"Disabling Cache in Silverstripe 3.1"},{"content":"We recently tried to use composer to set up SilverStripe 3.1, but ended up with a dependency nightmare. In order to work around this we decided to make use of Git submodules.\nFirst set up your Git repository and run:\ngit init Next set up a site directory for the code inside your Git repository. Then navigate to SilverStripe Installer in your browser and Download a copy. Extract files, and copy contents to site folder. Now we need to add the CMS and Framework. Navigate in a browser to the Git Hub repositories for CMS and Framework. Now copy the HTTPS clone URL for each project and run the following, to add these as Git sub modules.\ngit submodule add https://github.com/silverstripe/silverstripe-framework.git site/framework git submodule add https://github.com/silverstripe/silverstripe-cms.git \u0026lt;path-to-site\u0026gt;site/cms Now delete mysite/_config.php and load the site. 
Follow the normal install instructions displayed and you will have a running version of SilverStripe 3.1\n","date":"2013-07-29T00:00:00Z","permalink":"/set-up-silverstripe-3-1-using-only-git/","title":"Set up SilverStripe 3.1 using only Git (No Composer)"},{"content":"We recently ran into problem using Doctrine 2 connecting to a Rackspace Cloud Database using the MySqli Driver.\nProblem:\nWe have a long running PHP script that can sometimes run for hours at a time whilst processing information. This script requires a connection to a database, but has long periods of inactivity where there is no actual interaction with MySQL. By default MySQL uses the \u0026ldquo;wait_timeout\u0026rdquo; setting which states, how long an inactive connection can exist before it is killed. This is normally fine with web pages requests, as it is usually a short lived request. Unfortunately you do not have the ability to alter this setting when using Rackspaces Cloud Database.\nSolution:\nWhen using the MySQLi extension you can create a connection in \u0026ldquo;interactive mode\u0026rdquo; by passing the \u0026ldquo;MYSQLI_CLIENT_INTERACTIVE\u0026rdquo; flag, which will then use the \u0026ldquo;interactive_timeout\u0026rdquo; setting. On Rackspace this is set to 8 hours!\nAnnoyingly Doctrine does not allow you to pass any flags to the MySQLi Connection. So we overrode Doctrine\\DBAL\\Driver\\Connection with our own Driver which then allows us to pass a \u0026ldquo;flags\u0026rdquo; parameter through.\nFeel free to look at some of the other helpful features in we have added to Doctrine 2 here: ZucchiDoctrine\n","date":"2013-07-26T00:00:00Z","permalink":"/mysql-client-interactive-with-doctrine-on-rackspace/","title":"Enabling MYSQL_CLIENT_INTERACTIVE with Doctrine 2 on Rackspace Cloud Database"},{"content":"Recently we have revisited using Zend Server for some of our projects and decided to give the new version 6 a chance to prove itself.\nOverall its a big improvement over version 5. 
There are still some things that are extremely annoying but we have decided that we can overlook them.\nHowever there is one thing that we couldn\u0026rsquo;t do without. By default you will find that a number of PECL extensions will not install out of the box (at least this is what we experience using the Debian based install).\nTo fix this you will need to make sure you install the additional packages in ubuntu\nphp-5.4-source-zend-server or php-5.3-source-zend-server depending on the php version you are using autoconf build-essential Once this is done you should now be able to install extensions from PECL without too much hassle.\n","date":"2013-05-13T00:00:00Z","permalink":"/installing-pecl-extensions-zend-server-6/","title":"Installing PECL extensions for Zend Server 6"},{"content":"If you ever find yourself using MySQL via command line and end up with something like this:\nAnd thought there must be another way, well here it is: Use \\G instead of ; at the end of your select command.\nFor example:\nselect * from CHARACTER_SETS\\G Below is an image of the output from this select:\nHappy Querying!\n","date":"2013-04-24T00:00:00Z","permalink":"/better-output-mysql-command-line/","title":"Better Output for MySQL Select Command Using \\G"},{"content":"We recently had the need to create a queuing system to replace an implementation of RabbitMQ that was being used on a previous project. The reasoning behind this is that the requirements of the project required a very custom implementation of a queuing system that would drastically alter in architecture as the project grew and RabbitMQ just wasn\u0026rsquo;t going to fit the bill. However to start with we required something super simple and efficient that could be expanded and developed as required. 
After a little investigation and a lot of recommendation from others we decided to use ZeroMQ as our transport layer for that very reason, as we could build something which could span across multiple servers and was fast.\nThe diagram above helps describe our basic queuing system. We have a queue daemon that is continuously listening for connections and two clients, one that populates the queue and the other that retrieves from the queue.\nThe Clients Each client is written in PHP and uses a 0mq socket to communicate with a service, in this case our queue service. We used a SOCKET_REQ type of socket in order to have a request/response communication with our queue service.\nfunction client_socket(\\ZMQContext $context) { // SOCKET_REQ used to create a client that sends requests to and receive from a service $client = new \\ZMQSocket($context,\\ZMQ::SOCKET_REQ); $client-\u0026gt;connect(\u0026#34;tcp://localhost:5555\u0026#34;); // SOCKOPT_LINGER = 0 Configure socket to not wait at close time $client-\u0026gt;setSockOpt(\\ZMQ::SOCKOPT_LINGER, 0); return $client; } public function injectIntoQueue() { $context = new \\ZMQContext(); $client = $this-\u0026gt;client_socket($context); $msg = \u0026#34;This is a message\u0026#34;; $retries_left = 3; $read = $write = array(); while ($retries_left) { // We send a request, then we wait to get a reply $client-\u0026gt;send($msg); $expect_reply = true; while ($expect_reply) { // Poll socket for a reply, with timeout $poll = new \\ZMQPoll(); $poll-\u0026gt;add($client, \\ZMQ::POLL_IN); $events = $poll-\u0026gt;poll($read, $write, 2500); // If we got a reply, process it if ($events \u0026gt; 0) { // We got a reply from the server, must match sequence $reply = $client-\u0026gt;recv(); if (intval($reply) == $msg) { $retries_left = 0; $expect_reply = false; } } elseif (--$retries_left == 0) { break; } else { // Old socket will be confused; close it and open a new one $client = $this-\u0026gt;client_socket($context); // Send request 
again, on new socket $client-\u0026gt;send($msg); } } } } As you can see from the code above, we have a three-strike rule. The reasoning behind this is that if the client fails to get a reply from the queue service three times, we can stop trying to inject into the queue and move on to the next item. As we ultimately intend to adopt the Lazy Pirate pattern we have made it so that if the socket times out, we can then create a new socket and retry. Without this, as the architecture becomes more complicated we may end up in a situation where the old socket is left in an inconsistent state, thus the recommended solution is to create a new socket. Once the client has sent its message to the queue, we poll for a response (i.e. the message we sent, echoed back). Once we have a response that is valid, meaning that the queue has been populated, we can stop polling until the next message.\nThe Frontend Client\nprotected function getFromQueue() { $context = new \\ZMQContext(); $worker = new \\ZMQSocket($context, \\ZMQ::SOCKET_REQ); $read = $write = array(); // Connect to the frontend port of the queue service $worker-\u0026gt;connect(\u0026#34;tcp://localhost:5556\u0026#34;); // Tell queue we\u0026#39;re ready for work $worker-\u0026gt;send(\u0026#34;ready\u0026#34;); $reply = $worker-\u0026gt;recv(); return $reply; } Our Frontend Client is much simpler as it is part of a process that is being continually updated, therefore it doesn\u0026rsquo;t need the same connection retries as the Backend Client. 
We simply send \u0026ldquo;ready\u0026rdquo; to the queue system and if the queue is populated it will return us the first item.\nThe Queuing Service This is a continuously running executable created using c++.\nzmq::context_t context(1); zmq::socket_t frontend (context, ZMQ_ROUTER); zmq::socket_t backend (context, ZMQ_ROUTER); backend.bind(\u0026#34;tcp://*:5555\u0026#34;); frontend.bind(\u0026#34;tcp://*:5556\u0026#34;); We create two sockets of type ZMQ_ROUTER which is an advanced pattern used for extending request/reply sockets. This means when we improve our queuing system we will be able to route packets to specific recipients using an address in the message.\nAfter creating our sockets, we initialise them\n// Initialize poll set zmq::pollitem_t items [] = { { frontend, 0, ZMQ_POLLIN, 0 }, { backend, 0, ZMQ_POLLIN, 0 } }; //poll the sockets - this seems to poll both sockets at the same time zmq::poll (items, 2, -1); Backend Handler If we get a message from the backend, we check the contents to see if it contains purge at which point we empty the queue, otherwise we push the msg contents onto the queue. Finally we send the message back to the backend to show that the message has been received.\n//receive msg from client if (items [1].revents \u0026amp; ZMQ_POLLIN) { //get message from client zmq::message_t message(0); string client_addr = s_recv (backend); string empty = s_recv (backend); assert (empty.size() == 0); string msg = s_recv (backend); //allow the backend to purge the queue if(msg == \u0026#34;purge\u0026#34;) { while (!queue.empty()) queue.pop(); } else { queue.push(msg); } //send response back to the backend s_sendmore (backend, client_addr); s_sendmore (backend, \u0026#34;\u0026#34;); s_send (backend, msg); } Frontend Handler If we get the \u0026ldquo;ready\u0026rdquo; message from the frontend client, we pop a message off the queue and return it to the frontend client. 
If the queue is empty we send an \u0026ldquo;empty\u0026rdquo; message back instead.\n// Handle activity on frontend if (items [0].revents \u0026amp; ZMQ_POLLIN) { //get message from worker zmq::message_t message(0); string worker_addr = s_recv (frontend); string empty = s_recv (frontend); assert (empty.size() == 0); string msg = s_recv (frontend); string queueMsg; if(msg == \u0026#34;ready\u0026#34;) { if(queue.size() \u0026gt; 0) { queueMsg = queue.front(); queue.pop(); } else { queueMsg = \u0026#34;empty\u0026#34;; } } //send reply to worker with contents of queue s_sendmore (frontend, worker_addr); s_sendmore (frontend, \u0026#34;\u0026#34;); s_send (frontend, queueMsg); } That is our complete queuing system using ZeroMQ with PHP and C++.\nSummary Using the above has allowed us to create a very simple in-memory queue daemon that we can use to quickly pass data from one system to another. On the whole it works well and we are looking to expand on it in the near future to increase both its functionality and scalability.\nYou can find the queueing daemon (christened as \u0026ldquo;ZuQ\u0026rdquo;) on GitHub @ https://github.com/zucchi/ZuQ\n","date":"2013-03-19T00:00:00Z","permalink":"/introducing-zuq/","title":"Introducing ZuQ - A Simple ZeroMQ Queuing Daemon"},{"content":"A few of our projects recently called for a distributed file-system that provided high availability and redundancy. After a tip off from a fellow techie and a quick browse around the net it seemed that a solution called GlusterFS ticked all the boxes for what we were wanting.\nHowever setting it up turned out not to be as trivial as I had originally anticipated. 
I\u0026rsquo;m going to try and put down the process we have evolved for setting it up on Ubuntu in the cloud.\nA couple of things to clear up first.\nWe are using Rackspace for our cloud but beyond the setup of the servers it should still be relevant. There are a number of ways to interact with Rackspace\u0026rsquo;s setup but for this we are going to use the cloud control panel. We use Ubuntu as our preferred server which means that our config tends to be all over the place compared to other guides. You will need to set up a minimum of 2 servers and a separate block storage device for each. We have set up and broken a few different variations of gluster setup so far and make no guarantees that the setup in this blog is infallible but it\u0026rsquo;s the best we have so far. Setting up the hardware First things first. We are going to need to set up some servers.\nFeel free to create any size server you want. Just make sure to select Ubuntu 12.10 (or whatever version you may have that is newer).\nYou will also need to define a new network to work with. We use this to isolate the traffic between the nodes of our new gluster.\nYou can create a new network when creating the first of your servers. On the creation page under the networks heading you can find a \u0026ldquo;Create Network\u0026rdquo; button.\nHopefully this should be quite self-explanatory. Now when you create subsequent servers you will then have the option to attach your new network (\u0026ldquo;GlusterNet\u0026rdquo; in my example).\nOnce the two starting nodes have been created you need to add some additional block storage to store your data on. Make sure that you create blocks that have sufficient capacity for your needs. Something else to consider is using High Performance SSD storage. 
Its a little on the pricy side but well worth the expense if you are trying to eak out every ounce of performance from the implementation.\nYou will then need to attach one to each of your servers.\nOnce attached you will be able to see the details of the block mount point from the block storage details page.\nMake a note of the mount point (in this case \u0026ldquo;/dev/xvdb\u0026rdquo;) as we will need that in a minute.\nPrepare the Server Now that we have a the hardware ready we can shell into a server to set it up.\nFirst you need to shell into your server and update its OS as the images provided by most cloud supplier tends not to have the latest patches and updates. In our case it\u0026rsquo;s as simple as:\napt-get update apt-get upgrade Once that\u0026rsquo;s done we then need to prepare the Block Storage device ( henceforth refered to as a \u0026ldquo;brick\u0026rdquo;)\nif you run fdisk -l you should see that an entry that looks something like\nDisk /dev/xvdb: 107.4 GB, 107374182400 bytes 255 heads, 63 sectors/track, 13054 cylinders, total 209715200 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x00000000 Disk /dev/xvdb doesn\u0026#39;t contain a valid partition table This indicates that our brick needs a partition table and formatting. We can achieve this be doing the following\nDevice contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel Building a new DOS disklabel with disk identifier 0xe7da4288. Changes will remain in memory only, until you decide to write them. After that, of course, the previous content won\u0026#39;t be recoverable. 
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite) Command (m for help): n Partition type: p primary (0 primary, 0 extended, 4 free) e extended Select (default p): p Partition number (1-4, default 1): 1 First sector (2048-209715199, default 2048): Using default value 2048 Last sector, +sectors or +size{K,M,G} (2048-209715199, default 209715199): Using default value 209715199 Command (m for help): w The partition table has been altered! Calling ioctl() to re-read partition table. Syncing disks. I\u0026rsquo;ve highlighted the prompts and my responses. All we are doing here is creating a default partition table that has a single partition which uses up the whole disk.\nNow running fdisk -l should give us something that looks like\nDisk /dev/xvdb: 107.4 GB, 107374182400 bytes 43 heads, 44 sectors/track, 110843 cylinders, total 209715200 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0xe7da4288 Device Boot Start End Blocks Id System /dev/xvdb1 2048 209715199 104856576 83 Linux As you can now see we have a valid device of /dev/xvdb1 that we can mount. However we need to create a valid filesystem on the new brick before we can mount it. I have been doing this with Ext4 rather than XFS (which is the filesystem recommended by gluster). This is mainly down to the fact that when I tried using XFS I kept getting some issues with performance and access. I\u0026rsquo;m sure that with further investigation I could resolve this but as of yet I haven\u0026rsquo;t had the chance to. So far though I have had zero issues using Ext4. 
To create the filesystem we run:\nmkfs.ext4 -j /dev/xvdb1 Next, create a folder to mount to, easily done by executing:\nmkdir -p /glusterfs/brick Finally, the simplest way to mount the device is via your /etc/fstab by adding the line\n/dev/xvdb1 /glusterfs/brick ext4 defaults 1 2 and running mount -a as root (this also means that it mounts automatically on boot).\nNext we need to install the latest gluster version. At the time of writing this was v3.3.1. You can find a version to suit your OS at http://www.gluster.org/download. If you are using Ubuntu you can do the following\napt-get install software-properties-common add-apt-repository ppa:semiosis/ubuntu-glusterfs-3.3 apt-get update apt-get install glusterfs-server glusterfs-client By this point you will have a single working server. To continue, you\u0026rsquo;re going to need to set up your second server, ready to create your new volume.\nOnce you have your second (or third, fourth, etc) set up, it\u0026rsquo;s a good idea to add a reference to each one of them to your /etc/hosts file. This is not strictly necessary and you can just use the IP addresses of each server but it saves you having to remember each IP and makes each server easier to identify.\nRemember that we are going to be working with the new network interface you created earlier (i.e. \u0026ldquo;GlusterNet\u0026rdquo;). To get the IP of your GlusterNet interface a quick ifconfig will show you an interface with an IP that matches the CIDR from earlier. In my case I now have 2 IPs of 192.168.3.1 \u0026amp; 192.168.3.2.\nSo now I add the following lines to my /etc/hosts file\n192.168.3.1 gluster1 192.168.3.2 gluster2 Creating our volume Now that the servers are prepared we can play with the gluster tool. This tool is a life saver in getting everything configured quickly and you can easily get a list of what it\u0026rsquo;s capable of by running gluster help. 
Now I\u0026rsquo;m not going to take you through every command and option and would recommend reading the gluster manual to learn more.\nWhat this tool actually does is help generate and manipulate all the required config that is then stored at /var/lib/glusterd/.\nFirstly we need to tell gluster that we have a pool of servers that will communicate with each other. Gluster refers to these as peers. To do this you need to run gluster peer probe gluster2 on each server for each other server that will be used, replacing \u0026ldquo;gluster2\u0026rdquo; with the names you defined in your /etc/hosts file. This will then create the appropriate files at /var/lib/glusterd/peers/\nNow that all our peers have been defined we can get to actually creating the new distributed volume. This however requires a little consideration as there are some decisions you need to make.\nIf we take a look at the help for creating a new volume we can see that we need to decide on what options to use\nvolume create \u0026lt;NEW-VOLNAME\u0026gt; [stripe \u0026lt;COUNT\u0026gt;] [replica \u0026lt;COUNT\u0026gt;] [transport \u0026lt;tcp|rdma|tcp,rdma\u0026gt;] \u0026lt;NEW-BRICK\u0026gt; \u0026lt;NEW-VOLNAME\u0026gt; - what are we going to name our volume. [stripe \u0026lt;COUNT\u0026gt;] [replica \u0026lt;COUNT\u0026gt;] - are we going to create a striped or replicated volume and how many \u0026ldquo;bricks\u0026rdquo; are we going to create this volume with. [transport \u0026lt;tcp|rdma|tcp,rdma\u0026gt;] - what transport protocol do you want the peers to communicate with. \u0026lt;NEW-BRICK\u0026gt; - which servers/bricks do you want to use. 
For more information on how to create your volume and what all the options mean have a look at these links\nhttp://gluster.org/community/documentation/index.php/Gluster_3.2:_Configuring_Distributed_Replicated_Volumes\nhttp://gluster.org/community/documentation/index.php/Gluster_3.2:_Configuring_Distributed_Striped_Volumes\nFor our purposes we are going to run\ngluster volume create myvolume replica 2 transport tcp gluster1:/glusterfs/brick gluster2:/glusterfs/brick This now creates a new volume that spans both of our servers. You can confirm that this is the case by running gluster volume info and you should get something that looks like\nVolume Name: myvolume Type: Replicate Volume ID: d3dd24fd-9482-44c3-9503-24291fad8193 Status: Created Number of Bricks: 1 x 2 = 2 Transport-type: tcp Bricks: Brick1: gluster1:/glusterfs/brick Brick2: gluster2:/glusterfs/brick Running this on both servers should give you the same results.\nWhat you will now find is that the gluster command has created a plethora of files at /var/lib/glusterd/vols/myvolume/. As you work with gluster more and more you will find yourself drawn to these files as they control all the different aspects of how the volume works and performs. Most importantly we will need some information from these files when we come to configure a client to mount the volume.\nAll that is left to do now is start the volume which can be easily done with a quick gluster volume start myvolume\nAt this point we have now completed setting up our volume but we need to add some security. I would strongly recommend setting up a firewall using ufw to control access to the server. The easiest way to do this is to allow all traffic on your \u0026ldquo;GlusterNet\u0026rdquo; network interface as only the servers you attach to that network will have access. You can find a guide to using ufw at https://help.ubuntu.com/12.10/serverguide/firewall.html.\nMounting a Client Now that we have a working volume we need to add some clients. 
To do this you will need to create a new server as above that is attached to the \u0026ldquo;GlusterNet\u0026rdquo; network but without the block storage (unless you really want it that is).\nMake sure to add your gluster definitions to your /etc/hosts file\nOnce you have your new client server ready we can install the gluster client\napt-get install software-properties-common add-apt-repository ppa:semiosis/ubuntu-glusterfs-3.3 apt-get update apt-get install glusterfs-client I\u0026rsquo;ve seen a number of different guides that tell you to install glusterfs-server as well but I have as yet had no need to as it all works without it.\nNow there are a lot of ways that you can mount your new Gluster volume. I have tried a few and have had varying results. What I have found is that the best way is to create a volume file. To do this we create a new file at /etc/glusterfs.vol.\nvolume gluster1 type protocol/client option transport-type tcp option remote-host gluster1 option remote-subvolume /glusterfs/brick option username \u0026lt;username\u0026gt; option password \u0026lt;password\u0026gt; end-volume volume gluster2 type protocol/client option transport-type tcp option remote-host gluster2 option remote-subvolume /glusterfs/brick option username \u0026lt;username\u0026gt; option password \u0026lt;password\u0026gt; end-volume volume replicate type cluster/replicate subvolumes gluster1 gluster2 end-volume volume writebehind type performance/write-behind option cache-size 1MB subvolumes replicate end-volume volume cache type performance/io-cache option cache-size 400MB subvolumes writebehind end-volume What you will notice is that there is a \u0026lt;username\u0026gt; and \u0026lt;password\u0026gt; required for this to work. You can find these details on one of your peer servers in the file /var/lib/glusterd/vols/myvolume/trusted-myvolume-fuse.vol.\nThis /etc/glusterfs.vol file is basically going to inform the gluster-client software about how to connect to the gluster volume and all the available nodes to connect to. 
This provides us with some level of fail-over so should one node become unavailable the gluster client will seamlessly switch to a different one. It also allows us to define additional \u0026ldquo;translators\u0026rdquo; such as the performance/io-cache one that you can see here. I would strongly recommend reading through the available translators to see which may be useful to you.\nNow one of the main issues you will find with Ubuntu is that it will fail on boot if you try to add this mount to your fstab. To get around this you can use Upstart. Create the following file at /etc/init/glusterfs-mount.conf, making sure to change \u0026lt;interface\u0026gt; to the interface for your GlusterNet network (i.e. eth0 or eth1 or eth2, you get the idea)\nauthor \u0026#34;Matt Cockayne\u0026#34; description \u0026#34;Mount GlusterFS after networking available\u0026#34; start on net-device-up IFACE=\u0026lt;interface\u0026gt; stop on stopping network stop on starting shutdown script mount -t glusterfs /etc/glusterfs.vol /glusterfs end script As you can see we are using a straight mount command. The magic is that this will not be executed until the start clause validates which in this case is not until the network interface for \u0026ldquo;GlusterNet\u0026rdquo; is up and running properly. You will also see that we are mounting the /etc/glusterfs.vol file to /glusterfs (remember to create this folder to mount to) rather than mounting a network path as you might when mounting an NFS share.\nIf you wanted you could also add more to your upstart script to handle clean un-mounting of gluster thus allowing you to then use the service glusterfs-mount (start|stop|restart) commands\nA quick reboot of the client server should confirm that it boots successfully and you will now end up with your volume mounted at /glusterfs. You can now test this by creating a new file. I tend to create an empty file at /glusterfs/mounted just so I have a quick reference that the folder is mounted. 
Once that\u0026rsquo;s created, if you now go and take a look at /glusterfs/brick on your \u0026ldquo;peers\u0026rdquo; you should see that there is now a file called \u0026ldquo;mounted\u0026rdquo; sat there looking all smug that it worked.\nCaveats Some important things for you to be made aware of\nNever write directly to a brick. Make sure to write to the volume only through a configured client. Beware of split-brain. http://community.gluster.org/q/what-is-split-brain-in-glusterfs-and-how-can-i-cause-it/ http://www.gluster.org/2012/06/healing-split-brain/ RTFM - Read The F***ing Manual. Gluster is big and complex and there is a lot for you to understand. You can download a copy of the manual from here ","date":"2013-03-15T00:00:00Z","permalink":"/gluster-licious/","title":"Glorious Gluster - How to setup GlusterFS on Rackspace Cloud and Ubuntu 12.10"},{"content":"tl;dr\u0026gt; I make a terrible assumption about Zend Optimizer+ and am corrected by Dominic in the comments;\nTerrible post title I know but it\u0026rsquo;s the best I could come up with.\nI\u0026rsquo;ve just come up for air after spending the majority of the day debugging some issues on our current development sandbox.\nNow our sandbox tends to be quite bleeding edge in some circumstances and as such we run a fair few bits of unstable code. On the sandbox in question we have been running PHP 5.4.11 and unfortunately we have struggled to get APC working with it just the way we need it to. The lack of APC tends to make this sandbox quite slow.\nWe recently saw that Zend have open-sourced their OptimizerPlus extension (https://github.com/zend-dev/ZendOptimizerPlus) and that it was compatible with 5.4\u0026hellip;. Fantastic, or so we thought.\nSo I added the new OptimizerPlus to the sandbox and everything was going swimmingly. That was until we had to run one of the utility scripts that we use to rebuild some of our data structures. 
These scripts make use of different parts of both Zend Framework and Doctrine which tend to rely on some heavy DocBlock annotations.\nNow having used both APC and Zend Server, knowing that they don\u0026rsquo;t affect this kind of functionality, I had expected that OptimizerPlus would be fine\u0026hellip;. Wrongo. It took me a good few hours of head scratching trying to figure out what had happened.\nIt turns out that OptimizerPlus suffers from the same flaws that eAccelerator does and strips Docblocks when caching the bytecode. This results in Reflection returning false when you call methods such as `getDocComment()`.\nAll in all it\u0026rsquo;s not the end of the world; I just disable OptimizerPlus and have to wait till I can get APC working. Not my ideal scenario but I can live with it.\nSomething that does concern me is that there is currently an RFC that has gone to vote (https://wiki.php.net/rfc/optimizerplus) about integrating OptimizerPlus into the PHP 5.5 distribution. While this is great I do worry how many other things may break and whether they will be picked up and fixed for the 5.5 release.\nUpdate: Since writing this post the RFC has finished being voted upon and has been approved. You can expect to see Optimizer Plus appearing bundled with PHP soon.\nUpdate (15th Mar 13): Thanks to Dominic\u0026rsquo;s comment I now know that you can tell Optimizer+ to retain your Docblocks by setting your config using\nzend_optimizerplus.save_comments (default \u0026#34;1\u0026#34;) If disabled, all PHPDoc comments are dropped from the code to reduce the size of the optimized code. Disabling \u0026#34;Doc Comments\u0026#34; may break some existing applications and frameworks (e.g. Doctrine, ZF2, PHPUnit) zend_optimizerplus.load_comments (default \u0026#34;1\u0026#34;) If disabled, PHPDoc comments are not loaded from SHM, so \u0026#34;Doc Comments\u0026#34; may be always stored (save_comments=1), but not loaded by applications that don\u0026#39;t need them anyway. 
That\u0026rsquo;ll teach me to write a blog post without investigating more first.\n","date":"2013-03-01T00:00:00Z","permalink":"/docblock-docblock-wherefore-art/","title":"Docblock, Oh Docblock, wherefore art thou Docblock (hint: Zend Optimizer Plus lost them)"},{"content":"We have been using Redmine for quite a long time and a few months ago attempted to upgrade from 1.3 to 2.something. Unfortunately I (quite typically) borked the installation and since then it\u0026rsquo;s been hobbling along after my attempts to fix it left it crippled.\nYesterday it finally gave up the fight and my attempts to resurrect the installation were futile. After a quick funeral (the eulogy was very touching), and a wake in a nearby emporium of alcoholic beverages to commiserate our loss, I set about trying to figure out what to do next.\nAlternatives Now while Redmine is a worthy tool and has always managed to do what I needed in the past, recently it\u0026rsquo;s just not cut the mustard. I\u0026rsquo;ve kept toying with the idea of creating our own project management system but as with all in-house projects that we dream up it\u0026rsquo;s just never going to happen.\nAfter a quick google around, our options are either to go for a hosted solution (not possible as we have some very specific requirements regarding our SCM that mean we have to host our own repos for client work) or Redmine (or ChiliProject).\nYes we looked at a number of other management tools and of them all Redmine is still the closest to what we needed.\nInstallation So I spin up a new server instance of Ubuntu 12.10 on the cloud and get to work installing the latest version.\nAs root I then run through these steps (you should assume that ALL of these steps require you to be root and files should be owned by root)\n# update/upgrade base installation of ubuntu packages apt-get update \u0026amp;\u0026amp; apt-get upgrade # install the requisite scm tools that we use apt-get install git-core subversion mercurial cvs # set up ruby apt-get install ruby rubygems 
libruby ruby-dev # set up apache \u0026amp; mysql apt-get install apache2 libapache2-mod-passenger mysql-server mysql-client libmysqlclient-dev # install imagemagick and the magick wand apt-get install imagemagick libmagickcore-dev libmagickwand5 libmagickwand-dev # create our user and database in mysql # replace uniquePassword with your own password mysql -u root -p -e \u0026#34;create user \u0026#39;redmine\u0026#39;@\u0026#39;localhost\u0026#39; identified by \u0026#39;uniquePassword\u0026#39;\u0026#34; mysql -u root -p -e \u0026#34;create database redmine\u0026#34; mysql -u root -p -e \u0026#34;grant all on redmine.* to \u0026#39;redmine\u0026#39;@\u0026#39;localhost\u0026#39;\u0026#34; mysql -u root -p -e \u0026#34;flush privileges\u0026#34; # clone redmine code to target location cd /usr/local/share git clone git://github.com/redmine/redmine.git # set apache as the owner of redmine chown -R www-data:www-data redmine # move into our new redmine folder cd redmine # set up your database configuration cp config/database.yml.example config/database.yml vim config/database.yml production: adapter: mysql2 database: redmine host: localhost username: redmine password: uniquePassword # install bundler gem gem install bundler # use bundler to set up redmine installation and without specified dependencies bundle install --without development test postgresql sqlite # set up our secret token rake generate_secret_token # set up our database and load default configuration RAILS_ENV=production rake db:migrate RAILS_ENV=production rake redmine:load_default_data # edit /etc/apache2/sites-available/default \u0026lt;VirtualHost *:80\u0026gt; ServerAdmin webmaster@localhost ServerName mysite.co.uk ServerAlias www.mysite.co.uk DocumentRoot /usr/local/share/redmine/public \u0026lt;Directory /\u0026gt; Options FollowSymLinks AllowOverride None \u0026lt;/Directory\u0026gt; \u0026lt;Directory /usr/local/share/redmine/public\u0026gt; Options Indexes FollowSymLinks MultiViews 
AllowOverride All Order allow,deny allow from all \u0026lt;/Directory\u0026gt; ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/ \u0026lt;Directory \u0026#34;/usr/lib/cgi-bin\u0026#34;\u0026gt; AllowOverride None Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch Order allow,deny Allow from all \u0026lt;/Directory\u0026gt; ErrorLog ${APACHE_LOG_DIR}/error.log # Possible values include: debug, info, notice, warn, error, crit, # alert, emerg. LogLevel warn CustomLog ${APACHE_LOG_DIR}/access.log combined \u0026lt;/VirtualHost\u0026gt; # restart apache service apache2 restart That should be enough for you to have a working installation of redmine ready for you to use/customise\nAdditional Config We typically have additional steps that we would configure for our own installation.\n# add plugin assets folder mkdir /usr/local/share/redmine/public/plugin_assets chown www-data:www-data /usr/local/share/redmine/public/plugin_assets # enable some additional apache modules a2enmod rewrite # disable mod ssl a2dismod ssl # install gnutls apt-get install libapache2-mod-gnutls # install ssl certificate bundle and key (this assumes that you have already copied the key and bundle to ~/) mv ~/my_certificate.bnd /etc/ssl/certs/my_certificate.bnd chmod 0644 /etc/ssl/certs/my_certificate.bnd mv ~/my_certificate.crt /etc/ssl/private/my_certificate.key chmod 0600 /etc/ssl/private/my_certificate.key # now configure your /etc/apache2/sites-available/default-tls \u0026lt;IfModule mod_gnutls.c\u0026gt; \u0026lt;VirtualHost _default_:443\u0026gt; ServerAdmin webmaster@localhost ServerName mysite.co.uk ServerAlias www.mysite.co.uk DocumentRoot /usr/local/share/redmine/public \u0026lt;Directory /\u0026gt; Options FollowSymLinks AllowOverride None \u0026lt;/Directory\u0026gt; \u0026lt;Directory /usr/local/share/redmine/public\u0026gt; Options Indexes FollowSymLinks MultiViews AllowOverride All Order allow,deny allow from all \u0026lt;/Directory\u0026gt; ErrorLog ${APACHE_LOG_DIR}/error.log # Possible 
values include: debug, info, notice, warn, error, crit, alert, emerg. LogLevel warn CustomLog ${APACHE_LOG_DIR}/ssl_access.log combined GnuTLSEnable On GnuTLSCertificateFile /etc/ssl/certs/my_certificate.bnd GnuTLSKeyFile /etc/ssl/private/my_certificate.key GnuTLSPriorities NORMAL:!DHE-RSA:!DHE-DSS:!AES-256-CBC:%COMPAT \u0026lt;/VirtualHost\u0026gt; \u0026lt;/IfModule\u0026gt; # Add some Rails / Passenger specific config to /etc/apache2/sites-available/default-tls RailsEnv production PassengerDefaultUser www-data PassengerSpawnMethod smart PassengerPoolIdleTime 300 PassengerMaxRequests 5000 PassengerStatThrottleRate 5 PassengerHighPerformance On # change your /etc/apache2/sites-available/default to redirect to ssl \u0026lt;VirtualHost *:80\u0026gt; ServerAdmin sysadmin@zucchi.co.uk ServerName mysite.co.uk ServerAlias www.mysite.co.uk RewriteEngine On RewriteCond %{HTTPS} off RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} Options FollowSymLinks AllowOverride None ErrorLog ${APACHE_LOG_DIR}/error.log LogLevel warn CustomLog ${APACHE_LOG_DIR}/access.log combined \u0026lt;/VirtualHost\u0026gt; # enable your new default-tls vhost and restart apache a2ensite default-tls service apache2 restart # setup \u0026amp;amp; configure email # when prompted select \u0026#34;internet site\u0026#34; and enter the domain you are hosting redmine from i.e. 
mysite.co.uk) apt-get install postfix # create config file and uncomment the production settings for sendmail cp /usr/local/share/redmine/config/configuration.yml.example /usr/local/share/redmine/config/configuration.yml vim /usr/local/share/redmine/config/configuration.yml production: email_delivery: delivery_method: :sendmail service apache2 restart #install pixel cookers theme cos we like it git clone git://github.com/pixel-cookers/RedmineThemePixelCookers.git /usr/local/share/redmine/public/themes/pixel-cookers ","date":"2013-02-23T00:00:00Z","permalink":"/redmine-install-died-we-cried/","title":"Our Redmine install died, We all cried!"},{"content":"This was a head scratcher when I ran into it yesterday and I thought I would share my solution to the following scenario:\nI need to debug a PHP command-line script, located on a remote LAMP virtual web server running in VirtualBox with a Shared Folder, using local PHPStorm 5.0.\nThe solution:\nYou first must set PHPStorm to use remote file paths. To set these go to the following:\nPHPStorm -\u0026gt; Preferences -\u0026gt; PHP -\u0026gt; Servers\nThis gives the following display:\nReplace the Name, Host and Absolute path on the server to match your own settings. Note: keep the Name and Host the same for ease.\nNext add some breakpoints in PHPStorm and set it to listen for any debug connections using the listener icon:\nNow log in to your Remote Server via SSH etc.\nYou now need to change settings for Xdebug in either xdebug.ini or php.ini depending on how you installed it. You also need to know the IP of the local machine. This can be permanently set in the Network Settings of your VM in VirtualBox, so you will never have to change it. 
In my example the local machine running PHPStorm is:\n192.168.56.1\nNow edit the ini file that contains your Xdebug settings and set the following:\nxdebug.remote_host = 192.168.56.1 xdebug.remote_connect_back = 0 xdebug.remote_port = 9000 xdebug.remote_handler = dbgp xdebug.remote_mode = req xdebug.remote_enable = 1 xdebug.idekey = phpstorm1 Be aware you might have to change the remote_host and the idekey based on your own environment. To better understand what each option does, see Xdebug Settings\nFinally, when running the script you must set the following variables:\nPHP_IDE_CONFIG=\u0026#34;serverName=dev.example.com\u0026#34; PHP_IDE_CONFIG will tell PHPStorm how to map the Remote File Paths to what it sees Locally. Again replace the URL with the Name/Host you set in PHPStorm. Note: You can export this, if your system is only running one site; mine is not.\nYou can run this inline with your script:\nPHP_IDE_CONFIG=\u0026#34;serverName=dev.example.com\u0026#34; ./testscript.sh This should send you to PHPStorm where you earlier placed breakpoints.\nHappy Debugging!\n","date":"2013-02-06T00:00:00Z","permalink":"/debug-cli-remote-server/","title":"Debug PHP CLI on Remote Server with Xdebug and PHPStorm"},{"content":"About NRPE NRPE (Nagios Remote Plugin Executor) is a useful tool that allows you to execute scripts on remote servers and return the output for ingestion by some form of monitoring software.\nSetup We currently have our own instance of Icinga running to monitor our servers and have recently started to offer access to it for our clients.\nThe majority of our servers (and our clients servers if we set them up) use one variant or another of Ubuntu. 
This means we can very quickly get our servers connected to a Nagios/Icinga instance.\nFirst things first we need to install the nrpe server and all the associated plugins\napt-get install nagios-nrpe-server \\ nagios-plugins-basic \\ nagios-plugins \\ nagios-plugins-extra Next we need to edit the main nrpe config file, found at /etc/nagios/nrpe.cfg. What you\u0026rsquo;re looking for are the lines\n# ALLOWED HOST ADDRESSES # This is an optional comma-delimited list of IP address or hostnames # that are allowed to talk to the NRPE daemon. # # Note: The daemon only does rudimentary checking of the client\u0026#39;s IP # address. I would highly recommend adding entries in your /etc/hosts.allow # file to allow only the specified host to connect to the port # you are running this daemon on. # # NOTE: This option is ignored if NRPE is running under either inetd or xinetd allowed_hosts=127.0.0.1 # COMMAND ARGUMENT PROCESSING # This option determines whether or not the NRPE daemon will allow clients # to specify arguments to commands that are executed. This option only works # if the daemon was configured with the --enable-command-args configure script # option. # # *** ENABLING THIS OPTION IS A SECURITY RISK! *** # Read the SECURITY file for information on some of the security implications # of enabling this variable. # # Values: 0=do not allow arguments, 1=allow command arguments dont_blame_nrpe=0 You will want to change allowed_hosts to the IP of your Nagios/Icinga instance and set the dont_blame_nrpe value to 1. Feel free to take a look round the rest of the file. It\u0026rsquo;s all quite interesting and generally well documented. 
Be careful what you change though in case something breaks.\nYou will also want to look for some lines that are referred to as \u0026ldquo;COMMAND DEFINITIONS\u0026rdquo; and look something like this\ncommand[check_users]=/usr/lib/nagios/plugins/check_users -w 5 -c 10 command[check_load]=/usr/lib/nagios/plugins/check_load -w 15,10,5 -c 30,25,20 command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/hda1 command[check_zombie_procs]=/usr/lib/nagios/plugins/check_procs -w 5 -c 10 -s Z command[check_total_procs]=/usr/lib/nagios/plugins/check_procs -w 150 -c 200 You can go ahead and comment these out as we will be adding our own definitions shortly. The main reason for removing these is that we will be configuring some specific scripts for our own use later that allow you to configure your requirements and thresholds from within your Nagios/Icinga config.\nConfiguration of Monitoring Server Once this is complete you can now configure a new \u0026ldquo;check command\u0026rdquo; for use with your Nagios/Icinga server.\ndefine command { command_name check_nrpe command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ } define command { command_name check_nrpe_command_args command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a $ARG2$ } Here you can see that we have set up 2 different check commands. The first is a simple command requiring only one argument, $ARG1$, which is the name of the command we want to run on the remote server. The second command is almost identical except that it takes a second argument which allows you to input a series of \u0026ldquo;arguments\u0026rdquo; to be passed to the command on your remote server. Each argument should be separated by a space.\nNow that you have these you can then configure your hosts and services to make use of them. 
I would recommend having a trawl through the Nagios/Icinga sites \u0026amp; documentation to find out how to create a config that suits you.\nConfiguration of Remote Server Now that we have our monitoring server ready it\u0026rsquo;s time to add the commands we want to run to the remote server.\nTo do this your /etc/nagios/nrpe.cfg should hopefully have a line in it that looks like\ninclude=/etc/nagios/nrpe_local.cfg If it doesn\u0026rsquo;t have a line like that then add it and edit the `/etc/nagios/nrpe_local.cfg` file to look a little like this\ncommand[check_apt]=/usr/lib/nagios/plugins/check_apt command[check_users]=/usr/lib/nagios/plugins/check_users -w $ARG1$ -c $ARG2$ command[check_load]=/usr/lib/nagios/plugins/check_load -w $ARG1$ -c $ARG2$ command[check_disk]=/usr/lib/nagios/plugins/check_disk -w $ARG1$ -c $ARG2$ -p /dev/sda1 command[check_procs]=/usr/lib/nagios/plugins/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$ command[check_zombie_procs]=/usr/lib/nagios/plugins/check_procs -w 5 -c 10 -s Z command[check_total_procs]=/usr/lib/nagios/plugins/check_procs -w 150 -c 200 These are a few simple commands that I tend to use most often. These translate to your \u0026ldquo;check_nrpe\u0026rdquo; commands like so\n$ARG1$ = everything inside the square brackets [ ] $ARG2$ = each of the $ARG?$ keys as a single string separated by a space Once that\u0026rsquo;s done you should be able to restart your nrpe server with `/etc/init.d/nagios-nrpe-server restart`\nIt really is that simple. Do bear in mind that because you can pass arbitrary arguments into nrpe this way you could leave yourself vulnerable to a bit of maliciousness, so it\u0026rsquo;s a good idea to make sure your firewall restricts port 5666 (the default port) to IPs you trust.\n","date":"2013-02-06T00:00:00Z","permalink":"/quick-dirty-setup-nrpe-ubuntu/","title":"Quick and easy setup of and connection to NRPE on Ubuntu"},{"content":"I\u0026rsquo;ve decided that I need to up my game when it comes to webservers. 
However I\u0026rsquo;m not yet ready to switch to Nginx or one of the other webservers out in the wild as I need something up and running rapidly.\nGranted the numbers are definitely against Apache in a lot of benchmarks but historically I\u0026rsquo;ve always had a good experience and the entry level makes it much more appropriate for me to stick with it.\nHowever Apache 2.2 is rather long in the tooth; thankfully 2.4 has been out for a while now. The problem I have is that I tend to favour Ubuntu as a platform and there is no sign of a 2.4 version appearing on the horizon anytime soon as they are waiting for it to be implemented upstream in Debian before including it in Ubuntu.\nNow there are PPAs available out there but I\u0026rsquo;m not overly happy using them (especially on production environments), so the only option is to compile.\nFirst thing is to install all the dependencies we are going to need. Thankfully Ubuntu has a nice and simple way of handling this.\napt-get build-dep apache2 We can then download the source code and start the compilation.\nSo from the root of our new copy of the source we need to run our configure.\n./configure --prefix=/usr/local/apache2 \\ --enable-mods-shared=all \\ --enable-http \\ --enable-deflate \\ --enable-expires \\ --enable-slotmem-shm \\ --enable-headers \\ --enable-rewrite \\ --enable-proxy \\ --enable-proxy-balancer \\ --enable-proxy-http \\ --enable-proxy-fcgi \\ --enable-mime-magic \\ --enable-log-debug \\ --with-mpm=event You will notice that I\u0026rsquo;m installing it using the event mpm. 
Hopefully I\u0026rsquo;ll be covering more about the event mpm in the future.\nNext we need to run make\nmake \u0026amp;\u0026amp; make install Once that\u0026rsquo;s complete you should be able to run\n/usr/local/apache2/bin/apachectl start and get the \u0026ldquo;it works\u0026rdquo; message through your web browser when accessing the server IP.\nDon\u0026rsquo;t forget to configure Apache to suit your specific requirements.\nSomething that will come up is how to start Apache on boot. Seeing as Ubuntu uses Upstart it makes sense to utilise it for controlling Apache.\nSo in the file `/etc/init/apache.conf` we need to put\n# apache2 - http server # # Apache is a web server that responds to HTTP and HTTPS requests. # Required-Start: $local_fs $remote_fs $network $syslog # Required-Stop: $local_fs $remote_fs $network $syslog author \u0026#34;Matt Cockayne \u0026lt;matt@zucchi.co.uk\u0026gt;\u0026#34; description \u0026#34;Apache 2.4 HTTP Server\u0026#34; start on runlevel [2345] stop on runlevel [!2345] console output pre-start script mkdir -p /var/run/apache2 || true install -d -o www-data /var/lock/apache2 || true # ssl_scache shouldn\u0026#39;t be here if we\u0026#39;re just starting up. # (this is bad if there are several apache2 instances running) rm -f /var/run/apache2/*ssl_scache* || true end script # Give up if restart occurs 10 times in 30 seconds. respawn limit 10 30 respawn script if test -f /usr/local/apache2/bin/envvars; then . 
/usr/local/apache2/bin/envvars fi ULIMIT_MAX_FILES=\u0026#34;ulimit -S -n `ulimit -H -n`\u0026#34; if [ \u0026#34;x$ULIMIT_MAX_FILES\u0026#34; != \u0026#34;x\u0026#34; ] ; then $ULIMIT_MAX_FILES fi /usr/local/apache2/bin/httpd -k start -D FOREGROUND end script This is a rather simple upstart script and I will be looking to update it at some point\u0026hellip; but it works\nOnce that\u0026rsquo;s done you should find that on reboot Apache will start and take advantage of all the management features of upstart including attempting to respawn Apache should it end unexpectedly. You should also be able to then use the following commands to control Apache.\n# how to start apache start apache # or initctl start apache # how to stop apache stop apache # or initctl stop apache # how to restart apache restart apache # or initctl restart apache # check the status of apache status apache # or initctl status apache I generally tend to avoid using the apachectl script found at /usr/local/apache2/bin/apachectl once upstart takes control.\n","date":"2012-11-06T00:00:00Z","permalink":"/compiling-apache-2-4-ubuntu-12-04/","title":"Compiling Apache 2.4 on Ubuntu 12.04"},{"content":"So recently I\u0026rsquo;ve been working with PHP 5.4 a LOT. Unfortunately Ubuntu (my main dev environment) is behind the times. So I\u0026rsquo;m resorting to compiling PHP manually.\nNot as daunting as it may first appear. The really tricky part is working out your dependencies and `configure` script.\nHence the reason for this post as a reminder for myself and others that may want to do a quick compile. (I would recommend that if you\u0026rsquo;re compiling for a production/live environment you make sure you understand what it is you\u0026rsquo;re compiling before just using what\u0026rsquo;s here)\nSo where to start. Dependencies first I think\nUbuntu allows you to install dependencies for building source with `apt-get build-dep`. 
We will use this and install any extras we may need.\napt-get install \\ libxml2 \\ libxml2-dev \\ libssl-dev \\ pkg-config \\ curl \\ libcurl4-nss-dev \\ enchant \\ libenchant-dev \\ libjpeg8 \\ libjpeg8-dev \\ libpng12-0 \\ libpng12-dev \\ libvpx1 \\ libvpx-dev \\ libfreetype6 \\ libfreetype6-dev \\ libt1-5 \\ libt1-dev \\ libgmp10 \\ libgmp-dev \\ libicu48 \\ libicu-dev \\ mcrypt \\ libmcrypt4 \\ libmcrypt-dev \\ libpspell-dev \\ libedit2 \\ libedit-dev \\ libsnmp15 \\ libsnmp-dev \\ libxslt1.1 \\ libxslt1-dev And now the configure\n./configure \\ --prefix=/usr/local/php \\ --with-apxs2=/usr/local/apache2/bin/apxs \\ --enable-fpm \\ --with-fpm-user=www-data \\ --with-fpm-group=www-data \\ --with-config-file-path=/usr/local/php/conf \\ --with-config-file-scan-dir=/usr/local/php/conf.d \\ --enable-debug \\ --with-openssl \\ --with-kerberos \\ --with-zlib \\ --enable-calendar \\ --with-curl \\ --with-curlwrappers \\ --with-enchant \\ --enable-exif \\ --enable-ftp \\ --with-gd \\ --with-jpeg-dir=/usr \\ --with-png-dir=/usr \\ --with-vpx-dir=/usr \\ --with-freetype-dir=/usr \\ --with-t1lib \\ --enable-exif \\ --enable-gd-native-ttf \\ --enable-gd-jis-conv \\ --with-gettext \\ --with-gmp \\ --with-mhash \\ --enable-intl \\ --enable-mbstring \\ --with-mcrypt \\ --with-mysql \\ --with-mysqli \\ --enable-pcntl \\ --with-pdo-mysql \\ --with-pdo-pgsql \\ --with-pgsql \\ --with-pspell \\ --with-libedit \\ --with-readline \\ --enable-shmop \\ --with-snmp \\ --enable-soap \\ --enable-sockets \\ --enable-sysvmsg \\ --enable-sysvshm \\ --with-xsl \\ --enable-zip \\ --with-pear \\ --enable-zend-signals \\ --enable-maintainer-zts Once these are done then we follow the standard make process. 
Notice we are also running make test\u0026hellip; very important as it gives more data for the developers to work with.\nmake \u0026amp;\u0026amp; make test \u0026amp;\u0026amp; make install The next thing is configuring your php.ini file as the install doesn\u0026rsquo;t have one yet, so we copy either the production or development default from the source code to the new conf dir as php.ini and edit it to suit our needs.\ncp {php-source-dir}/php.ini-(development|production) /usr/local/php/conf/php.ini That\u0026rsquo;s it. All ready to roll\u0026hellip; almost. This installation is the one I use with a webserver so you will want to add the appropriate directives to Apache.\nLoadModule php5_module modules/libphp5.so AddHandler php5-script .php AddType text/html .php ","date":"2012-11-06T00:00:00Z","permalink":"/compiling-php-5-4-ubuntu-12-04/","title":"Compiling PHP 5.4 on Ubuntu 12.04"},{"content":"Rsync is a great tool but can be a pain if you have to jump through hoops to connect via ssh such as connecting via a different port.\nA simple solution is to use the -e flag (also known as --rsh=COMMAND). This flag allows you to manually define the ssh command to use when connecting\nrsync -e \u0026#39;ssh -p2020\u0026#39; -rav ./* user@server: Will allow me to connect to a server with SSH listening on port 2020\n","date":"2012-07-31T00:00:00Z","permalink":"/rsync-custom-ssh-commands/","title":"Rsync and custom SSH commands"},{"content":"This morning I woke up to an email telling me that my Nexus7 that I had ordered 3 weeks ago was\u0026hellip; \u0026ldquo;out for delivery\u0026rdquo;.\nI couldn\u0026rsquo;t contain my excitement. I sat patiently waiting by my door. Finally 11 o\u0026rsquo;clock rolls around and there is a knock. I\u0026rsquo;m handed a brown parcel and hand over the obligatory signature. I close the door behind me and carefully place the box on the desk. 
I contemplate teasing myself and seeing how long I can hold out before opening it.\nThat lasted about 20 seconds!!!!\nIn fact\u0026hellip; this video says it all\nhttp://www.youtube.com/watch?v=Xijcwbg8CGQ\nI\u0026rsquo;m not gonna bore you with how awesome it is (and it is awesome). I will however point out a few obvious foibles with it though (not that they would have ever stopped me from buying it).\nNo way to expand storage Lack of Flash (may not seem important, but until everyone else catches up there is loads of content I cant use i.e. BBC iPlayer) A number of apps (games specifically) that I have run on my phone are not yet supported Google Now feels a little clunky at times and struggles with some of the regional British accents MTP doesn\u0026rsquo;t appear to work out of the box with my Linux OS (I\u0026rsquo;m sure this will be remedied soon) No obvious way to directly access the front facing camera (easily remedied with an app from MoDaCo) ","date":"2012-07-19T00:00:00Z","image":"/nice-nexus7/tablet-n7-features-ushome-family.png","permalink":"/nice-nexus7/","title":"Nice New Nexus7"},{"content":"If you want to register custom view helpers with a module you can do so by using the service location built into the Skeleton Application and creating a module config that looks something like.\nreturn array( \u0026#39;view_helpers\u0026#39; =\u0026gt; array( \u0026#39;invokables\u0026#39; =\u0026gt; array( // generic view helpers \u0026#39;truncate\u0026#39; =\u0026gt; \u0026#39;Zucchi\\View\\Helper\\Truncate\u0026#39;, // form based view helpers \u0026#39;bootstrapForm\u0026#39; =\u0026gt; \u0026#39;Zucchi\\Form\\View\\Helper\\BootstrapForm\u0026#39;, \u0026#39;bootstrapRow\u0026#39; =\u0026gt; \u0026#39;Zucchi\\Form\\View\\Helper\\BootstrapRow\u0026#39;, \u0026#39;bootstrapCollection\u0026#39; =\u0026gt; \u0026#39;Zucchi\\Form\\View\\Helper\\BootstrapCollection\u0026#39;, ), ), ); 
","date":"2012-07-18T00:00:00Z","permalink":"/registering-custom-view-helpers-zf2/","title":"Registering custom view helpers in ZF2"},{"content":"So\u0026hellip;\nWith the release of beta 5 for Zend Framework 2 I thought it time for me to tidy up and fix a few modules I created back at beta 3.\nNow I\u0026rsquo;m a big fan of Twitter Bootstrap CSS framework as I\u0026rsquo;m sure a lot of other people are as well. Seeing that the Zend Skeleton Application comes with bootstrap already included it was easy enough to set up my forms using the old ZF Forms found in ZF1.\nHowever a brand spanking new Forms component has been rolled out with ZF2. The long and the short of this new component meant that I had the opportunity to hand roll a new way of making my forms work with Twitter Bootstrap.\nSo, a little tinkering, a quick pull request to ZF2 to allow the definition of arbitrary options and I came up with some useful View Helpers that can be dropped into a project and used.\nYou can find them at https://github.com/zucchi/Zucchi.\nSo how to use them. 
Lets start by creating a new form (we\u0026rsquo;ll keep it simple for now)\nclass MyForm extends Form { public function __construct() { parent::__construct(\u0026#39;myform\u0026#39;); $this-\u0026gt;add(array( \u0026#39;name\u0026#39; =\u0026gt; \u0026#39;price\u0026#39;, \u0026#39;attributes\u0026#39; =\u0026gt; array( \u0026#39;type\u0026#39; =\u0026gt; \u0026#39;text\u0026#39;, \u0026#39;required\u0026#39; =\u0026gt; \u0026#39;required\u0026#39;, \u0026#39;placeholder\u0026#39; =\u0026gt; \u0026#39;0.99\u0026#39;, ), \u0026#39;options\u0026#39; =\u0026gt; array( \u0026#39;label\u0026#39; \u0026#39;bootstrap\u0026#39; =\u0026gt; array( \u0026#39;help\u0026#39; =\u0026gt; array( \u0026#39;style\u0026#39; =\u0026gt; \u0026#39;block\u0026#39; \u0026#39;content\u0026#39; =\u0026gt; \u0026#39;The price you wish to use\u0026#39; ), \u0026#39;prepend\u0026#39; =\u0026gt; array(\u0026#39;$\u0026#39;), \u0026#39;append\u0026#39; =\u0026gt; array(\u0026#39;¢\u0026#39;), ), ), ); $actions = new Collection(\u0026#39;actions\u0026#39;); $actions-\u0026gt;setAttribute(\u0026#39;class\u0026#39;, \u0026#39;form-actions\u0026#39;); $actions-\u0026gt;add(array( \u0026#39;name\u0026#39; =\u0026gt; \u0026#39;submit\u0026#39;, \u0026#39;attributes\u0026#39; =\u0026gt; array( \u0026#39;type\u0026#39; =\u0026gt; \u0026#39;submit\u0026#39;, \u0026#39;value\u0026#39; =\u0026gt; \u0026#39;Save\u0026#39;, \u0026#39;class\u0026#39; =\u0026gt; \u0026#39;btn btn-primary\u0026#39; ), \u0026#39;options\u0026#39; =\u0026gt; array( \u0026#39;bootstrap\u0026#39; =\u0026gt; array( \u0026#39;style\u0026#39; =\u0026gt; \u0026#39;inline\u0026#39;, ), ), )); $actions-\u0026gt;add(array( \u0026#39;name\u0026#39; =\u0026gt; \u0026#39;reset\u0026#39;, \u0026#39;attributes\u0026#39; =\u0026gt; array( \u0026#39;type\u0026#39; =\u0026gt; \u0026#39;reset\u0026#39;, \u0026#39;value\u0026#39; =\u0026gt; \u0026#39;reset\u0026#39;, \u0026#39;class\u0026#39; =\u0026gt; \u0026#39;btn\u0026#39; ), 
\u0026#39;options\u0026#39; =\u0026gt; array( \u0026#39;bootstrap\u0026#39; =\u0026gt; array( \u0026#39;style\u0026#39; =\u0026gt; \u0026#39;inline\u0026#39;, ), ), )); $this-\u0026gt;add($actions); } } You\u0026rsquo;ll notice that I have highlighted some lines. Thanks to the ability to set arbitrary options we can define a \u0026ldquo;bootstrap\u0026rdquo; option which we can then use to pass data into our new bootstrap view helpers. You can also see that I have added a save and reset button to a collection. I\u0026rsquo;ll explain that later.\nSo what next\u0026hellip; Rather than go into the mechanics of how to work with forms I\u0026rsquo;ll refer you to the ZF documentation and this excellent blog post\nWe then pick up by looking at your view, and the helpers I have created.\nBootstrapForm($form, $formStyle) One of the few things I miss from the ZF1 implementation of Forms is the self-rendering aspect! So what did I decide to do? That\u0026rsquo;s right, I created a view helper to render everything in one command.\nThe $this-\u0026gt;bootstrapForm() helper takes two parameters. The first is quite obviously the form. The second is the style of form. This is directly related to the form types that can be found at http://twitter.github.com/bootstrap/base-css.html#forms. You can use any of \u0026lsquo;vertical\u0026rsquo;, \u0026lsquo;inline\u0026rsquo;, \u0026lsquo;search\u0026rsquo; \u0026amp; \u0026lsquo;horizontal\u0026rsquo;. If you don\u0026rsquo;t specify a formStyle then it will default to \u0026lsquo;vertical\u0026rsquo;\nCaveat: This helper will iterate through all of the directly associated elements and render them first. Only after the direct elements have been generated will it then move onto Collections or Fieldsets (as soon as I work out how I\u0026rsquo;ll fix this).\nBootstrapRow($element, $formStyle) This is a straightforward modification of the FormRow helper that comes bundled with the new component.\nWe have a few differences now though. 
We have a second parameter as with the BootstrapForm view helper and the output is generated using sprintf and a set of templates that mimic the structures of the different form styles from bootstrap.\nThis helper can be used by itself to generate an element row and is used by the BootstrapForm helper\nWe can also now take advantage of the \u0026ldquo;bootstrap\u0026rdquo; options we set earlier.\nBootstrap Options style\nThe style of form element to use regardless of what style may be passed into the view helper (you can see an example of this in the buttons from the MyForm example above)\nhelp\nThis works in the same way as \u0026ldquo;description\u0026rdquo; did in ZF1 but allows you to define it either as a string or as an array with the keys \u0026ldquo;style\u0026rdquo; (either \u0026lsquo;inline\u0026rsquo; or \u0026lsquo;block\u0026rsquo;) and \u0026ldquo;content\u0026rdquo;, which should be self-explanatory\nprepend\nTakes advantage of Bootstrap\u0026rsquo;s ability to prepend blocks to an input field. This can be defined as a single string, or an array of strings to allow you to add multiple blocks should you want to\nappend\nTakes advantage of Bootstrap\u0026rsquo;s ability to append blocks to an input field. This can be defined as a single string, or an array of strings to allow you to add multiple blocks should you want to\nThese options get evaluated and spat out from the new renderBootstrapOptions() method as part of the \u0026ldquo;render\u0026rdquo;.\nBootstrapCollection($element, $style, $wrap) Again this is a direct rip off of the FormCollection helper found in the ZF2 Form component with a few modifications. The main difference is that it makes use of the BootstrapRow helper and has methods and properties to allow the setting of the form style to use.\nYou can see from the MyForm example above that we set a Collection called \u0026lsquo;actions\u0026rsquo;. This is a pretty standard way of grouping elements together. 
You can also see that we set a class for the Collection which may look familiar to those that have used Twitter Bootstrap for a while.\nWhat our helper will then do is wrap the buttons in a div with the appropriate class attached. If you were to define a label for the Collection/Fieldset you would then also find that the fieldset and legend tags are spat out with our \u0026lt;div class=\u0026quot;form-actions\u0026quot;\u0026gt; sandwiched between them and the elements.\nResult So here is what we now get when we use MyForm with our helpers.\n$this-\u0026gt;bootstrapForm($form, \u0026#39;horizontal\u0026#39;); Should now look something like this\nHow you can use it As of right now you can get the library from its repo on github @ https://github.com/zucchi/Zucchi and it can be found on packagist for use with composer\nEdit: The bootstrap stuff has moved to a new location as a separate ZF2 module. You can find it @ https://github.com/zucchi/ZucchiBootstrap or @ packagist for use with composer\n","date":"2012-07-17T00:00:00Z","permalink":"/bootstrapping-zf2-forms/","title":"Bootstrapping ZF2 Forms"},{"content":"I recently had to do some load testing for a site that would allow me to test in excess of 100k requests in a 60 second period\u0026hellip;\nSo I decided to do some testing using JMeter as it seemed like a suitable tool for doing what I needed and I had used it for some simpler testing in the past.\nAfter a little fumbling around I managed to get a test plan designed that would simulate 10k users actually navigating the site and adding to a cart etc, with a number of various interactions. It wasn\u0026rsquo;t perfect but it would correctly simulate over 100k requests.\nSo feeling quite pleased with myself I started the test from my laptop. Now I\u0026rsquo;m not a big gamer, I\u0026rsquo;m known to play a little World of Warcraft from time to time but that\u0026rsquo;s about it.
So when it comes to computing power I tend to opt for battery life over sheer grunt.\nSuffice to say, my laptop fell flat on its face, and even if it hadn\u0026rsquo;t, it turns out that the connection I was using just wasn\u0026rsquo;t up to the task of handling that much traffic adequately.\nSo plan B\u0026hellip;\nI quickly fired up the largest AWS instance available and got a copy of jmeter installed. A little tinkering with my test plan and some googling on how to run jmeter without a gui and a quick\n./jmeter -n -t test-plan.jmx\nand it appeared to be running.\n(Please bear in mind that I\u0026rsquo;m being overly kind\u0026hellip; it took a LOT of tinkering and twice as much Googling to work out how to get the test results out so I could actually get some idea of WTF was happening during the test)\nSo\u0026hellip; client \u0026ldquo;happy\u0026rdquo;\u0026hellip; I decided to go and find a better way to do my load testing in the future.\nSticking with JMeter I managed to find this gem of a page\nhttp://jmeter.apache.org/usermanual/remote-test.html\ntl;dr \u0026gt; use your local install of jmeter to trigger tests to run on one or more remote \u0026ldquo;nodes\u0026rdquo; and then have all the results sent to your local install.\nSo I set to work!\nBuilding a Node First I need to set up an AWS instance that we can use and duplicate so I can quickly build a cluster of nodes on demand. I\u0026rsquo;m a big fan of Ubuntu so I spin up a micro instance of 12.04 server. Next I shell into the instance and install the default Java runtime from apt\napt-get install openjdk-7-jre\nYes I know there are other more appropriate runtimes, but I don\u0026rsquo;t really care\u0026hellip; I just need it to work and it does.\nNext I grab a copy of the latest stable from http://jmeter.apache.org/download_jmeter.cgi and un-tar it to /usr/local/jmeter\n(N.B.
JMeter is available through apt but I had issues with that version and you need to make sure that both your local version and all the nodes run the same version of jmeter)\nWe can now test that the install is working by running /usr/local/jmeter/bin/jmeter-server and you should get some output that looks similar to\nCreated remote object: UnicastServerRef [liveRef: [endpoint:[10.???.???.???:38939](local),objID:[46522b57:138381f1023:-7fff, 2635011707874933136]]] Which tells us that the server is running.\nBUT unfortunately it\u0026rsquo;s not going to work just yet. Because we are using Amazon\u0026rsquo;s EC2 we are going to be relying on their NAT for routing. Out of the box JMeter just won\u0026rsquo;t work properly.\nHowever there is something we can do to combat this. We can set the parameter RMI_HOST_DEF that the /usr/local/jmeter/bin/jmeter-server script will include in starting the server.\nexport RMI_HOST_DEF=-Djava.rmi.server.hostname=$(wget http://169.254.169.254/latest/meta-data/public-hostname -q -O -) I\u0026rsquo;ll explain what we are doing here.
Amazon have been quite clever by providing a meta-data endpoint that you can poll from within your instance to get key pieces of data\u0026hellip; Including the public DNS record.\nWe can use this endpoint and using wget pipe that into the RMI_HOST_DEF param (ensuring that we prepend -D) and then export that so it becomes available to the /usr/local/jmeter/bin/jmeter-server script.\nNow to get the server to start on boot.\nA quick upstart script should solve this\n# Upstart script to initialise jmeter-server description \u0026#34;JMeter Server\u0026#34; author \u0026#34;Dev in Charge \u0026#34; start on started networking stop on stopping networking stop on stopping shutdown console output script # get the current public DNS record export RMI_HOST_DEF=-Djava.rmi.server.hostname=$(wget http://169.254.169.254/latest/meta-data/public-hostname -q -O -) # start jmeter in server mode /usr/local/jmeter/bin/jmeter-server end script Saving this to /etc/init/jmeter-server.conf will mean that it will auto-start jmeter-server on boot and allow you to manually control the process using start jmeter-server and stop jmeter-server\nAnd that\u0026rsquo;s it\u0026hellip; instance configured\nAll you need to do now is save the instance as an AMI and you have an on-demand image for spinning up a cluster of remote JMeter servers for you to play with.\nConfiguring your local installation Now that the server side is working we need to configure our local installation to allow it to connect.\nFirst things first however, make sure you are using the same version of JMeter as you are running on the server.\nWe need to edit the jmeter.properties file that can be found in the bin folder of the installation you downloaded. Look for the parameter remote_hosts. This needs to be set with the public DNS of the remote server(s) you\u0026rsquo;re connecting to. For example\nremote_hosts=ec2-176-34-164-170.eu-west-1.compute.amazonaws.com,ec2-123-34-456-789.eu-west-1.compute.amazonaws.com That\u0026rsquo;s your local version configured.
You will now be able to tell your local version to run tests on any or all of your specified remotes.\nHowever if you\u0026rsquo;re like me, you work behind a router/firewall. If so this isn\u0026rsquo;t the end of the story. When you send a test plan to a remote from your local install it will also send the IP address of your local machine for it to send the results back to. JMeter does this by looking up where your current hostname resolves to. In my circumstance it resolved to 127.0.1.1. The reason it did this is down to the fact my system\u0026rsquo;s hosts file had the line\n127.0.1.1 devincharge.local To resolve this I had to change it to my external IP address\n89.345.871.79 devincharge.local And set up port forwarding from my router to my local machine for all ports from 1024 to 65535. Now, you can if you want use specific ports so you don\u0026rsquo;t have to port forward everything from your router, but I\u0026rsquo;ll leave that for you to look up as there are plenty of resources on how to do this for you to google and I\u0026rsquo;ve waffled on for far too long already.\nHappy testing\n","date":"2012-06-30T00:00:00Z","permalink":"/loaded-testing/","title":"Loaded Testing"}]