<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Image-Processing on PHP Boy Scout</title><link>https://blog-570662.gitlab.io/tags/image-processing/</link><description>Recent content in Image-Processing on PHP Boy Scout</description><generator>Hugo -- gohugo.io</generator><language>en-gb</language><copyright>Matt Cockayne</copyright><lastBuildDate>Wed, 01 Jul 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://blog-570662.gitlab.io/tags/image-processing/index.xml" rel="self" type="application/rss+xml"/><item><title>There's no AI in my photo culler</title><link>https://blog-570662.gitlab.io/no-ai-in-my-photo-culler/</link><pubDate>Wed, 01 Jul 2026 00:00:00 +0000</pubDate><guid>https://blog-570662.gitlab.io/no-ai-in-my-photo-culler/</guid><description>&lt;img src="https://blog-570662.gitlab.io/no-ai-in-my-photo-culler/cover-no-ai-in-my-photo-culler.png" alt="Featured image of post There's no AI in my photo culler" /&gt;&lt;p&gt;Before a wedding photographer can edit a single frame, there&amp;rsquo;s the cull: sitting down with three or four thousand photos from the day and deciding which are even worth keeping. The blurry ones, the ones where the flash fired into a mirror, the same moment shot eight times in a burst where only one frame is sharp. It&amp;rsquo;s mechanical, it&amp;rsquo;s exhausting, and it&amp;rsquo;s the first job krites does for Hailey.&lt;/p&gt;
&lt;p&gt;Every culling tool I looked at before building it leads with the same word. AI. AI culling, AI selects, trained on millions of weddings. So when I sat down to write krites&amp;rsquo; first pass, I assumed I&amp;rsquo;d be wiring up a model too. For the part that does the most work, it turns out, I didn&amp;rsquo;t need one.&lt;/p&gt;
&lt;p&gt;The shipped culler doesn&amp;rsquo;t load a single weight. It&amp;rsquo;s arithmetic, the sort a calculator could do if you were patient enough, and that&amp;rsquo;s a deliberate choice rather than a corner I cut. Here&amp;rsquo;s what&amp;rsquo;s actually under it.&lt;/p&gt;
&lt;h2 id="blur-is-the-variance-of-a-laplacian"&gt;Blur is the variance of a Laplacian
&lt;/h2&gt;&lt;p&gt;The first question for any frame is whether it&amp;rsquo;s in focus. You can answer it without knowing anything about what&amp;rsquo;s in the photo.&lt;/p&gt;
&lt;p&gt;A Laplacian is an edge detector. Run it over an image and it lights up wherever the brightness changes sharply: the crisp boundary between a dark suit and a white shirt, the line of an eyelash. A photo in focus is full of those sharp transitions; a soft or motion-blurred one has smeared them all into gentle gradients. So if you measure how much the edge response &lt;em&gt;varies&lt;/em&gt; across the frame, a sharp photo gives you a wide spread of values and a blurry one gives you a narrow, lifeless one. The size of that spread, a single number, is the focus score.&lt;/p&gt;
&lt;p&gt;In krites it&amp;rsquo;s a 3×3 kernel over the frame&amp;rsquo;s luma (the brightness channel, Rec. 601 weights), and the score is the variance of the response:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-go" data-lang="go"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nx"&gt;lap&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;:=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;luma&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="nx"&gt;y&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;w&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;luma&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="nx"&gt;y&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;w&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;luma&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;y&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;w&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;luma&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;y&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;w&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;lapCenter&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Sum the responses, sum their squares, and the variance falls out as &lt;code&gt;sumSq/n - mean*mean&lt;/code&gt;. No training data, no inference, and the same pixels always give the same answer. (&lt;a class="link" href="https://gitlab.com/phpboyscout/krites/-/blob/fe863ae/pkg/analyze/quality/quality.go#L89-L118" target="_blank" rel="noopener"
 &gt;&lt;code&gt;quality.go&lt;/code&gt;&lt;/a&gt;.)&lt;/p&gt;
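&lt;p&gt;To make the arithmetic concrete, here is a minimal, self-contained sketch of the whole focus score; it is not the krites source. It assumes an 8-bit luma plane stored row by row, takes &lt;code&gt;lapCenter&lt;/code&gt; to be 4 (the centre weight of a four-neighbour kernel) and &lt;code&gt;c&lt;/code&gt; to be the centre pixel, and skips the one-pixel border. The function name &lt;code&gt;laplacianVariance&lt;/code&gt; is made up for the example.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-go" data-lang="go"&gt;
```go
package main

import "fmt"

// lapCenter is the centre weight of the kernel. The value 4 is an assumption
// that matches the four neighbours summed in laplacianVariance.
const lapCenter = 4

// laplacianVariance returns the variance of the Laplacian response over the
// interior of a w-by-h luma plane stored row by row. Border pixels are
// skipped because they lack a full set of neighbours.
func laplacianVariance(luma []uint8, w, h int) float64 {
	if 3 > w || 3 > h || len(luma) != w*h {
		return 0 // too small (or malformed) to have any interior pixels
	}
	var sum, sumSq float64
	n := 0
	for y := 1; h-1 > y; y++ {
		for x := 1; w-1 > x; x++ {
			c := int(luma[y*w+x])
			lap := int(luma[(y-1)*w+x]) + int(luma[(y+1)*w+x]) +
				int(luma[y*w+x-1]) + int(luma[y*w+x+1]) - lapCenter*c
			sum += float64(lap)
			sumSq += float64(lap) * float64(lap)
			n++
		}
	}
	mean := sum / float64(n)
	return sumSq/float64(n) - mean*mean
}

func main() {
	const w, h = 4, 4
	flat := make([]uint8, w*h)    // uniform grey: no edges at all
	checker := make([]uint8, w*h) // alternating black and white: all edges
	for i := range flat {
		flat[i] = 128
		if (i/w+i%w)%2 == 0 {
			checker[i] = 255
		}
	}
	fmt.Println("flat:", laplacianVariance(flat, w, h))
	fmt.Println("checker:", laplacianVariance(checker, w, h))
}
```
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;A uniform grey frame has no edges and scores zero, while the checkerboard, which is nothing but edges, scores far higher.&lt;/p&gt;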
&lt;h2 id="exposure-is-a-histogram"&gt;Exposure is a histogram
&lt;/h2&gt;&lt;p&gt;The second question is whether the exposure is salvageable. If a third of the frame is pure white, the highlights are blown and there&amp;rsquo;s no detail to bring back; if it&amp;rsquo;s mostly pure black, the shadows are crushed the same way.&lt;/p&gt;
&lt;p&gt;That&amp;rsquo;s just counting. Walk the luma plane once, tally how many pixels sit at or above a near-white threshold and how many at or below a near-black one, divide by the total, and you&amp;rsquo;ve got two fractions: the blown-highlight proportion and the crushed-shadow proportion. A photographer cares about those two numbers directly, and a &lt;code&gt;for&lt;/code&gt; loop produces them (&lt;a class="link" href="https://gitlab.com/phpboyscout/krites/-/blob/fe863ae/pkg/analyze/quality/quality.go#L120-L140" target="_blank" rel="noopener"
 &gt;&lt;code&gt;quality.go&lt;/code&gt;&lt;/a&gt;).&lt;/p&gt;
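&lt;p&gt;The counting can be sketched in a few lines. The near-white and near-black cut-offs (250 and 5 here) are illustrative placeholders rather than the values krites uses, and &lt;code&gt;clipFractions&lt;/code&gt; is a hypothetical name.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-go" data-lang="go"&gt;
```go
package main

import "fmt"

// Illustrative cut-offs on an 8-bit luma scale; both values are assumptions.
const (
	nearWhite = 250
	nearBlack = 5
)

// clipFractions walks the luma plane once and returns the proportion of
// pixels at or above nearWhite and the proportion at or below nearBlack.
func clipFractions(luma []uint8) (highlights, shadows float64) {
	if len(luma) == 0 {
		return 0, 0
	}
	var blown, crushed int
	for _, v := range luma {
		if v >= nearWhite {
			blown++
		}
		if nearBlack >= v {
			crushed++
		}
	}
	n := float64(len(luma))
	return float64(blown) / n, float64(crushed) / n
}

func main() {
	// Four pixels: one blown, two crushed, one mid-grey.
	hi, lo := clipFractions([]uint8{255, 0, 3, 128})
	fmt.Printf("highlights %.2f, shadows %.2f\n", hi, lo)
}
```
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;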
&lt;h2 id="two-photos-are-the-same-when-sixty-four-bits-agree"&gt;Two photos are the same when sixty-four bits agree
&lt;/h2&gt;&lt;p&gt;Then there are the bursts. A photographer holds the shutter through the first kiss and gets twelve nearly-identical frames; you want the sharpest one and the rest out of the way. To do that the tool has to know which frames are &amp;ldquo;the same shot&amp;rdquo;, and again you don&amp;rsquo;t need to understand the photo to tell.&lt;/p&gt;
&lt;p&gt;The trick is a perceptual hash, a difference hash to be exact. Shrink the image right down to a nine-by-eight grey thumbnail, then for each row note simply whether each cell is brighter than the one to its right. Nine cells across give eight comparisons per row, and eight rows make sixty-four yes/no answers, packed into a sixty-four-bit number. The result is a fingerprint of the picture&amp;rsquo;s broad light-and-dark structure that survives a resize, a small reframe or a touch of noise:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-go" data-lang="go"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;grey&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;y&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;hashW&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;grey&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;y&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;hashW&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;h&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;|=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;Hash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;bit&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Two fingerprints are compared by counting the bits that differ between them, the Hamming distance. On 64-bit integers that is an XOR followed by a population count (&lt;code&gt;bits.OnesCount64&lt;/code&gt;, which the Go compiler can lower to a hardware popcount instruction where one is available). A small distance means the frames look alike. krites only clusters &lt;em&gt;consecutive&lt;/em&gt; frames within that distance, so a run of similar shots merges into a burst but two unrelated photos that happen to rhyme don&amp;rsquo;t (&lt;a class="link" href="https://gitlab.com/phpboyscout/krites/-/blob/fe863ae/pkg/analyze/dedup/dedup.go#L37-L89" target="_blank" rel="noopener"
 &gt;&lt;code&gt;dedup.go&lt;/code&gt;&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Best-of-burst is then the dullest line of code in the project: keep the sharpest frame in the cluster, demote the others from &lt;em&gt;keep&lt;/em&gt; to &lt;em&gt;maybe&lt;/em&gt;, and write down why.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-go" data-lang="go"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nx"&gt;fv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Reasons&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Reasons&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;#34;near-duplicate of &amp;#34;&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="nx"&gt;bestFrame&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="s"&gt;&amp;#34; (kept the sharper frame)&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="signals-in-a-verdict-out"&gt;Signals in, a verdict out
&lt;/h2&gt;&lt;p&gt;None of those measurements decide anything on their own. A focus score of 50 is rejectable on one shoot and fine on another, because the numbers scale with resolution and content. So the signals feed a &lt;em&gt;profile&lt;/em&gt;, a small set of thresholds, and the profile turns them into a ruling: below the hard focus gate it&amp;rsquo;s a reject, below a softer floor it&amp;rsquo;s a maybe, blown past the exposure gates it&amp;rsquo;s a reject, otherwise it&amp;rsquo;s a keep. Every verdict carries its reasons in plain words, &amp;ldquo;out of focus (sharpness 32 below 50)&amp;rdquo;, because krites proposes and the human disposes (&lt;a class="link" href="https://gitlab.com/phpboyscout/krites/-/blob/fe863ae/pkg/cull/cull.go#L71-L108" target="_blank" rel="noopener"
 &gt;&lt;code&gt;cull.go&lt;/code&gt;&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;The seed thresholds for a wedding are just a starting point, written to config on &lt;code&gt;krites init&lt;/code&gt; and tuned from there:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-go" data-lang="go"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nx"&gt;seedMinSharpness&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;// below this: rejected as out of focus&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nx"&gt;seedSoftSharpness&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;150&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;// below this (but &amp;gt;= min): demoted to maybe&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nx"&gt;seedMaxHighlights&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;// blown-highlight fraction above this: rejected&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nx"&gt;seedMaxShadows&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.30&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;// crushed-shadow fraction above this: rejected&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nx"&gt;seedDedupDistance&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;// consecutive frames within this Hamming distance: same burst&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Keeping the thresholds visible is the whole point. &amp;ldquo;Suitable for a wedding album&amp;rdquo; is Hailey&amp;rsquo;s definition, not mine and not a model&amp;rsquo;s, and a number in a config file is something she can move (&lt;a class="link" href="https://gitlab.com/phpboyscout/krites/-/blob/fe863ae/pkg/cull/profile.go#L9-L29" target="_blank" rel="noopener"
 &gt;&lt;code&gt;profile.go&lt;/code&gt;&lt;/a&gt;).&lt;/p&gt;
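&lt;p&gt;Put together, the ruling could be sketched like this, using the seed values from the listing above. The &lt;code&gt;Profile&lt;/code&gt;, &lt;code&gt;Signals&lt;/code&gt; and &lt;code&gt;Verdict&lt;/code&gt; types, the &lt;code&gt;judge&lt;/code&gt; function and every reason string except the out-of-focus one are hypothetical, and letting a reject outrank a maybe is an assumption about how the rules combine.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-go" data-lang="go"&gt;
```go
package main

import "fmt"

// Profile is a hypothetical container for the thresholds.
type Profile struct {
	MinSharpness  float64 // hard focus gate
	SoftSharpness float64 // softer focus floor
	MaxHighlights float64 // largest tolerated blown-highlight fraction
	MaxShadows    float64 // largest tolerated crushed-shadow fraction
}

// Signals holds the measurements for one frame.
type Signals struct {
	Sharpness  float64
	Highlights float64
	Shadows    float64
}

// Verdict is the ruling plus the plain-words reasons behind it.
type Verdict struct {
	Ruling  string // "keep", "maybe" or "reject"
	Reasons []string
}

// judge applies the profile to the signals. A reject outranks a maybe, and
// every rule that fires records its reason.
func judge(s Signals, p Profile) Verdict {
	v := Verdict{Ruling: "keep"}
	switch {
	case p.MinSharpness > s.Sharpness:
		v.Ruling = "reject"
		v.Reasons = append(v.Reasons,
			fmt.Sprintf("out of focus (sharpness %.0f below %.0f)", s.Sharpness, p.MinSharpness))
	case p.SoftSharpness > s.Sharpness:
		v.Ruling = "maybe"
		v.Reasons = append(v.Reasons,
			fmt.Sprintf("soft focus (sharpness %.0f below %.0f)", s.Sharpness, p.SoftSharpness))
	}
	if s.Highlights > p.MaxHighlights {
		v.Ruling = "reject"
		v.Reasons = append(v.Reasons, "highlights blown")
	}
	if s.Shadows > p.MaxShadows {
		v.Ruling = "reject"
		v.Reasons = append(v.Reasons, "shadows crushed")
	}
	return v
}

func main() {
	// Seed values from the post; the frames are invented.
	wedding := Profile{MinSharpness: 50, SoftSharpness: 150, MaxHighlights: 0.10, MaxShadows: 0.30}
	frames := []Signals{
		{Sharpness: 32, Highlights: 0.01, Shadows: 0.02},
		{Sharpness: 120, Highlights: 0.01, Shadows: 0.02},
		{Sharpness: 400, Highlights: 0.25, Shadows: 0.02},
		{Sharpness: 400, Highlights: 0.01, Shadows: 0.02},
	}
	for _, s := range frames {
		v := judge(s, wedding)
		fmt.Println(v.Ruling, v.Reasons)
	}
}
```
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Because the thresholds arrive as plain data, retuning for a different shoot means passing a different &lt;code&gt;Profile&lt;/code&gt; and leaving the logic alone.&lt;/p&gt;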
&lt;h2 id="where-the-models-do-belong"&gt;Where the models do belong
&lt;/h2&gt;&lt;p&gt;I&amp;rsquo;m not claiming AI has no place in this. Some of what a wedding photographer culls on genuinely needs a model: is this person mid-blink, is anyone actually looking at the camera, is the composition any good. Those are coming, and they&amp;rsquo;ll be model-backed when they do. The deliberate bit is that they sit &lt;em&gt;outside&lt;/em&gt; this deterministic core, behind an interface, opt-in. The maths that does the heavy lifting of the first pass never imports a model.&lt;/p&gt;
&lt;p&gt;That separation buys three things you lose the moment a neural net touches the hot path. It&amp;rsquo;s reproducible: the same frames in the same order always cull the same way, so a verdict is debuggable and a regression is catchable. It&amp;rsquo;s quick enough to run over four thousand frames on a laptop with no GPU. And it stays honest about what it knows, because a threshold you can read is a threshold you can argue with, which a confidence score from a black box never quite is.&lt;/p&gt;
&lt;p&gt;&amp;ldquo;AI culling&amp;rdquo; makes for a better headline. But blur really is just a number, a duplicate really is just sixty-four bits, and the grim, mechanical first pass that stands between a photographer and their best photos comes down to arithmetic.&lt;/p&gt;</description></item></channel></rss>