<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Oidc on PHP Boy Scout</title><link>https://blog-570662.gitlab.io/tags/oidc/</link><description>Recent content in Oidc on PHP Boy Scout</description><generator>Hugo -- gohugo.io</generator><language>en-gb</language><copyright>Matt Cockayne</copyright><lastBuildDate>Thu, 14 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://blog-570662.gitlab.io/tags/oidc/index.xml" rel="self" type="application/rss+xml"/><item><title>A 403 you can't fix in IAM</title><link>https://blog-570662.gitlab.io/a-403-you-cant-fix-in-iam/</link><pubDate>Thu, 14 May 2026 00:00:00 +0000</pubDate><guid>https://blog-570662.gitlab.io/a-403-you-cant-fix-in-iam/</guid><description>&lt;img src="https://blog-570662.gitlab.io/a-403-you-cant-fix-in-iam/cover-a-403-you-cant-fix-in-iam.png" alt="Featured image of post A 403 you can't fix in IAM" /&gt;&lt;p&gt;&lt;a class="link" href="https://blog-570662.gitlab.io/no-access-keys-in-ci/" &gt;The OIDC post&lt;/a&gt; explained the handshake that lets a GitLab pipeline deploy to AWS with no stored key. This is the story of the first time I got it wrong, and spent an afternoon fixing the wrong thing. The error was a flat 403 from AWS, and the maddening part is that no amount of editing the IAM policy was ever going to fix it.&lt;/p&gt;
&lt;h2 id="a-403-on-the-first-real-run"&gt;A 403 on the first real run
&lt;/h2&gt;&lt;p&gt;The OIDC post covered the handshake: GitLab CI mints a signed token, AWS exchanges it for short-lived credentials against a role whose trust policy names the pipeline. During the GitLab migration I wired exactly that up for the &lt;code&gt;infra&lt;/code&gt; repo, including a trust policy condition meant to let merge-request pipelines run a plan.&lt;/p&gt;
&lt;p&gt;The first merge request that should have triggered &lt;code&gt;tofu-plan&lt;/code&gt; didn&amp;rsquo;t run it. The job failed, and the error from AWS was a flat &lt;code&gt;AccessDenied&lt;/code&gt;. A 403.&lt;/p&gt;
&lt;h2 id="the-instinct-and-why-it-wastes-an-afternoon"&gt;The instinct, and why it wastes an afternoon
&lt;/h2&gt;&lt;p&gt;The instinct on an IAM 403 is immediate and almost always right: the policy&amp;rsquo;s wrong, so go and edit the policy. Tighten the condition. Loosen the condition. Check the wildcard. Re-read the &lt;code&gt;sub&lt;/code&gt; pattern character by character.&lt;/p&gt;
&lt;p&gt;All of that was wasted, and it was wasted for a reason that took me far too long to see. The trust policy wasn&amp;rsquo;t asking for the &lt;em&gt;wrong&lt;/em&gt; value. It was asking for a value that &lt;em&gt;does not exist&lt;/em&gt;. No amount of editing a condition makes it match a thing that&amp;rsquo;s never present.&lt;/p&gt;
&lt;h2 id="what-is-actually-in-the-token"&gt;What is actually in the token
&lt;/h2&gt;&lt;p&gt;GitLab&amp;rsquo;s OIDC token has a &lt;code&gt;sub&lt;/code&gt; claim that encodes the pipeline&amp;rsquo;s context, and part of that encoding is a &lt;code&gt;ref_type&lt;/code&gt;. I&amp;rsquo;d assumed &lt;code&gt;ref_type&lt;/code&gt; could be &lt;code&gt;branch&lt;/code&gt;, &lt;code&gt;tag&lt;/code&gt;, or &lt;code&gt;mr&lt;/code&gt;, because a pipeline can certainly be a branch pipeline, a tag pipeline, or a merge-request pipeline. So the trust policy, for the plan job, matched a &lt;code&gt;sub&lt;/code&gt; containing &lt;code&gt;ref_type:mr&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;That assumption was wrong. GitLab&amp;rsquo;s &lt;code&gt;ref_type&lt;/code&gt; is &lt;code&gt;branch&lt;/code&gt; or &lt;code&gt;tag&lt;/code&gt;. That&amp;rsquo;s the entire set. There is no &lt;code&gt;mr&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;A merge-request pipeline doesn&amp;rsquo;t run against a merge-request ref. It runs against the source &lt;em&gt;branch&lt;/em&gt;. So its token&amp;rsquo;s &lt;code&gt;sub&lt;/code&gt; carries &lt;code&gt;ref_type:branch&lt;/code&gt;, like any other branch pipeline. The trust policy condition asked for &lt;code&gt;ref_type:mr&lt;/code&gt;, GitLab never puts &lt;code&gt;mr&lt;/code&gt; in a token, the condition was therefore never true, and every merge-request pipeline got a 403. Forever, until the policy stopped asking for a claim that isn&amp;rsquo;t real.&lt;/p&gt;
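&lt;p&gt;To make that concrete, here is roughly what the relevant part of a decoded merge-request token looks like. Treat it as a sketch: the claim names are the ones GitLab documents for its ID tokens, but the project path and branch name are hypothetical placeholders, and a real token carries more claims than this. The merge-request part shows up in &lt;code&gt;pipeline_source&lt;/code&gt;, not in &lt;code&gt;ref_type&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-json"&gt;{
  "iss": "https://gitlab.com",
  "sub": "project_path:example-group/infra:ref_type:branch:ref:add-bucket",
  "project_path": "example-group/infra",
  "ref": "add-bucket",
  "ref_type": "branch",
  "pipeline_source": "merge_request_event"
}&lt;/code&gt;&lt;/pre&gt;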
&lt;h2 id="the-fix-and-the-lesson-worth-more-than-the-fix"&gt;The fix, and the lesson worth more than the fix
&lt;/h2&gt;&lt;p&gt;The fix is small once it&amp;rsquo;s visible: match &lt;code&gt;ref_type:branch&lt;/code&gt; and narrow it down by branch name or project path instead. An afternoon of policy edits, and the actual change is one word.&lt;/p&gt;
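&lt;p&gt;As a sketch of what the corrected condition might look like, assuming the identity provider is registered for &lt;code&gt;gitlab.com&lt;/code&gt; (which is where the &lt;code&gt;gitlab.com:sub&lt;/code&gt; condition key comes from) and using a hypothetical project path. This is only the &lt;code&gt;Condition&lt;/code&gt; fragment of the trust policy statement, not the real policy from the &lt;code&gt;infra&lt;/code&gt; repo.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-json"&gt;{
  "Condition": {
    "StringLike": {
      "gitlab.com:sub": "project_path:example-group/infra:ref_type:branch:ref:*"
    }
  }
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The trailing &lt;code&gt;ref:*&lt;/code&gt; admits any branch in that one project, which is what a plan job on merge requests needs. A role that can apply would want a tighter pattern, such as a single named branch.&lt;/p&gt;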
&lt;p&gt;The lesson is the part worth keeping. When an OIDC trust fails, the useful question is never &amp;ldquo;is my policy clever enough&amp;rdquo;. It&amp;rsquo;s &amp;ldquo;what&amp;rsquo;s &lt;em&gt;actually in the token&lt;/em&gt;&amp;rdquo;. An OIDC trust policy can only ever match the claims the identity provider genuinely asserts, and the gap between what a provider asserts and what you &lt;em&gt;assumed&lt;/em&gt; it asserts is precisely where this class of bug lives.&lt;/p&gt;
&lt;p&gt;So the move, when an OIDC handshake 403s, is to get hold of a real token and decode it. Look at the actual &lt;code&gt;sub&lt;/code&gt;, the actual claims, the actual values. Match what&amp;rsquo;s there. A 403 that survives every sensible edit to the policy is usually not a policy that&amp;rsquo;s too loose or too strict. It&amp;rsquo;s a policy matching a claim that was never going to be in the token.&lt;/p&gt;
&lt;h2 id="the-habit-it-left-behind"&gt;The habit it left behind
&lt;/h2&gt;&lt;p&gt;I wired an OIDC trust policy to let merge-request pipelines plan, by matching a &lt;code&gt;sub&lt;/code&gt; claim with &lt;code&gt;ref_type:mr&lt;/code&gt;. The first real merge request got a 403, and no edit to the policy fixed it, because GitLab&amp;rsquo;s &lt;code&gt;ref_type&lt;/code&gt; is only ever &lt;code&gt;branch&lt;/code&gt; or &lt;code&gt;tag&lt;/code&gt;. A merge-request pipeline runs on a branch ref, so the &lt;code&gt;mr&lt;/code&gt; value the policy demanded was never in any token.&lt;/p&gt;
&lt;p&gt;The fix was one word. The habit it left behind is the valuable bit: when an OIDC trust fails, stop editing the policy and go and read a real token. A trust policy can only match what the provider actually asserts, and &amp;ldquo;what I assumed it asserts&amp;rdquo; is where the 403 was hiding the whole time. (If this shape of bug feels familiar by the end of the series, that&amp;rsquo;s not an accident: I &lt;a class="link" href="https://blog-570662.gitlab.io/two-bugs-that-taught-me-the-rules/" &gt;come back to it&lt;/a&gt; with two more from exactly the same family.)&lt;/p&gt;</description></item><item><title>No access keys in CI</title><link>https://blog-570662.gitlab.io/no-access-keys-in-ci/</link><pubDate>Fri, 08 May 2026 00:00:00 +0000</pubDate><guid>https://blog-570662.gitlab.io/no-access-keys-in-ci/</guid><description>&lt;img src="https://blog-570662.gitlab.io/no-access-keys-in-ci/cover-no-access-keys-in-ci.png" alt="Featured image of post No access keys in CI" /&gt;&lt;p&gt;A long-lived AWS access key, sitting in a CI system, is just about the single credential I&amp;rsquo;d most like to be rid of. It&amp;rsquo;s powerful, it never expires unless someone remembers to rotate it (nobody remembers to rotate it), and it lives in one of the most attractive targets in the whole supply chain. For infrastructure that&amp;rsquo;s eventually going to hold a release-signing key, it&amp;rsquo;s exactly the wrong place to start. So the &lt;code&gt;phpboyscout&lt;/code&gt; infrastructure has no AWS access key in CI at all. None.&lt;/p&gt;
&lt;h2 id="the-access-key-you-dont-want"&gt;The access key you don&amp;rsquo;t want
&lt;/h2&gt;&lt;p&gt;A CI pipeline that runs &lt;code&gt;tofu apply&lt;/code&gt; against AWS needs AWS credentials. The traditional way to give it some is an IAM user with an access key pair, pasted into the CI system as a masked variable.&lt;/p&gt;
&lt;p&gt;Look at what that key is. It&amp;rsquo;s long-lived: it works until someone remembers to rotate it, and rotating it is a chore, so mostly nobody does. It&amp;rsquo;s powerful: it can apply infrastructure, so it can do nearly anything. And it&amp;rsquo;s sitting in a CI system, which is one of the most attractive targets in your whole supply chain. You&amp;rsquo;ve taken your highest-value credential and stored a permanent copy of it in a place built for running automated jobs.&lt;/p&gt;
&lt;p&gt;For infrastructure that&amp;rsquo;s going to hold a release-signing key, that&amp;rsquo;s precisely the wrong starting point. So the &lt;code&gt;phpboyscout&lt;/code&gt; infrastructure has no AWS access key in CI at all. Not a well-guarded one. None.&lt;/p&gt;
&lt;h2 id="federation-instead-of-a-stored-secret"&gt;Federation instead of a stored secret
&lt;/h2&gt;&lt;p&gt;The replacement is OIDC federation, and the shape of it is worth walking through, because it&amp;rsquo;s genuinely different from &amp;ldquo;a secret, but better&amp;rdquo;.&lt;/p&gt;
&lt;p&gt;A modern CI platform can mint an OIDC token. GitLab does this with an &lt;code&gt;id_tokens:&lt;/code&gt; block: at job time, GitLab issues a short-lived JSON Web Token, signed by GitLab, that asserts a set of facts. This is project X. This is pipeline Y. This is running on ref Z, of this type.&lt;/p&gt;
&lt;p&gt;AWS can consume that. The &lt;code&gt;sts:AssumeRoleWithWebIdentity&lt;/code&gt; call takes such a token and, if it satisfies an IAM role&amp;rsquo;s trust policy, returns short-lived AWS credentials for that role. The trust policy is where the control lives: it names GitLab as a trusted token issuer, and it constrains the token&amp;rsquo;s &lt;code&gt;sub&lt;/code&gt; claim so that only the specific project, and the specific refs, you intend can assume the role.&lt;/p&gt;
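&lt;p&gt;A minimal trust policy of that shape might look like the following. It&amp;rsquo;s an illustration under assumptions, not the policy the bootstrap module actually generates: the account ID, the project path and the branch are placeholders, and the &lt;code&gt;gitlab.com&lt;/code&gt; prefix assumes the hosted GitLab issuer.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-json"&gt;{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/gitlab.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "gitlab.com:sub": "project_path:example-group/infra:ref_type:branch:ref:main"
        }
      }
    }
  ]
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;Principal&lt;/code&gt; names the trusted issuer, and the &lt;code&gt;sub&lt;/code&gt; condition pins the role to one project and one ref.&lt;/p&gt;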
&lt;p&gt;Put it together: the pipeline asks GitLab for a token, hands it to AWS, and gets back credentials that last about an hour and are scoped to one role. Nothing long-lived is stored anywhere. The credential exists only for the job that needs it, and it can&amp;rsquo;t be stolen from a CI variable store, because it was never in one.&lt;/p&gt;
&lt;h2 id="two-halves-of-one-handshake"&gt;Two halves of one handshake
&lt;/h2&gt;&lt;p&gt;That handshake is built by two of the repos in this series, each owning one side.&lt;/p&gt;
&lt;p&gt;&lt;a class="link" href="https://blog-570662.gitlab.io/the-bootstrap-that-does-almost-nothing/" &gt;&lt;code&gt;terraform-aws-bootstrap&lt;/code&gt;&lt;/a&gt; builds the AWS half, in its &lt;code&gt;automation-iam&lt;/code&gt; module: it registers GitLab as an OIDC identity provider in the account, and it creates the automation role with the trust policy that decides which pipelines may assume it.&lt;/p&gt;
&lt;p&gt;The CI components build the consuming half: an &lt;code&gt;id_tokens:&lt;/code&gt; block asks GitLab for the JWT, and the AWS provider&amp;rsquo;s own credential chain then performs the exchange. The pipeline doesn&amp;rsquo;t call &lt;code&gt;sts&lt;/code&gt; by hand. It presents the token; the SDK does the rest.&lt;/p&gt;
&lt;h2 id="the-gotcha-dont-set-a-profile"&gt;The gotcha: don&amp;rsquo;t set a profile
&lt;/h2&gt;&lt;p&gt;There&amp;rsquo;s one quiet way to break this, and a stack can look completely correct while doing it.&lt;/p&gt;
&lt;p&gt;The AWS SDK finds credentials by walking a chain of sources in order. The web-identity path, the one that uses the OIDC token, is one link in that chain. It triggers off environment variables the CI sets up automatically, typically &lt;code&gt;AWS_ROLE_ARN&lt;/code&gt; and &lt;code&gt;AWS_WEB_IDENTITY_TOKEN_FILE&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;But if the &lt;code&gt;aws&lt;/code&gt; provider block has a hardcoded &lt;code&gt;profile = &amp;quot;...&amp;quot;&lt;/code&gt;, the SDK takes the &lt;em&gt;profile&lt;/em&gt; link of the chain instead, and never reaches the web-identity link. A &lt;code&gt;profile&lt;/code&gt; line is the sort of thing that ends up in a provider block from someone&amp;rsquo;s local development setup, where it&amp;rsquo;s exactly right. Committed and run in CI, it silently short-circuits the federation. The pipeline either fails to find credentials, or finds the wrong ones.&lt;/p&gt;
&lt;p&gt;The rule is simple once you know it: the provider block that runs in CI must not name a &lt;code&gt;profile&lt;/code&gt;. Leave the chain free to find the web identity. It&amp;rsquo;s the kind of bug that teaches you to be precise about &lt;em&gt;which&lt;/em&gt; link of the credential chain you&amp;rsquo;re actually relying on.&lt;/p&gt;
&lt;h2 id="the-bottom-line"&gt;The bottom line
&lt;/h2&gt;&lt;p&gt;Giving CI an AWS access key means storing your most powerful, longest-lived credential in one of your most exposed systems. OIDC federation removes it entirely. The CI platform mints a short-lived signed token, AWS exchanges it via &lt;code&gt;AssumeRoleWithWebIdentity&lt;/code&gt; for hour-long credentials against a role whose trust policy names the exact pipeline, and nothing permanent is stored.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;terraform-aws-bootstrap&lt;/code&gt; builds the AWS side, the identity provider and the trust policy; the CI components build the consuming side, the token request. The one trap is a hardcoded &lt;code&gt;profile&lt;/code&gt; in the provider block, which short-circuits the SDK&amp;rsquo;s credential chain before it reaches the web-identity path. Get that right, and a pipeline deploys to AWS as a verifiable, short-lived identity, with no key to steal.&lt;/p&gt;</description></item></channel></rss>