There's no AI in my photo culler

Before a wedding photographer can edit a single frame, there’s the cull: sitting down with three or four thousand photos from the day and deciding which are even worth keeping. The blurry ones, the ones where the flash fired into a mirror, the same moment shot eight times in a burst where only one frame is sharp. It’s mechanical, it’s exhausting, and it’s the first job krites does for Hailey.

Every culling tool I looked at before building it leads with the same word. AI. AI culling, AI selects, trained on millions of weddings. So when I sat down to write krites’ first pass, I assumed I’d be wiring up a model too. For the part that does the most work, it turns out, I didn’t need one.

The shipped culler doesn’t load a single weight. It’s arithmetic, the sort a calculator could do if you were patient enough, and that’s a deliberate choice rather than a corner I cut. Here’s what’s actually under it.

Blur is the variance of a Laplacian

The first question for any frame is whether it’s in focus. You can answer it without knowing anything about what’s in the photo.

A Laplacian is an edge detector. Run it over an image and it lights up wherever the brightness changes sharply: the crisp boundary between a dark suit and a white shirt, the line of an eyelash. A photo in focus is full of those sharp transitions; a soft or motion-blurred one has smeared them all into gentle gradients. So if you measure how much the edge response varies across the frame, a sharp photo gives you a wide spread of values and a blurry one gives you a narrow, lifeless one. The size of that spread, boiled down to a single number, is the focus score.

In krites it’s a 3×3 kernel over the frame’s luma (the brightness channel, Rec. 601 weights), and the score is the variance of the response:

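// c is the centre pixel, int(luma[y*w+x]); lapCenter is its kernel weight
// (4 for a four-neighbour Laplacian). Both are defined outside this excerpt.
lap := int(luma[(y-1)*w+x]) + int(luma[(y+1)*w+x]) +
    int(luma[y*w+x-1]) + int(luma[y*w+x+1]) - lapCenter*c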

Sum the responses, sum their squares, and the variance falls out as sumSq/n - mean*mean, where n is the number of pixels visited and mean is the plain sum divided by n. No training data, no inference, and the same pixels always give the same answer (quality.go).
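To make the arithmetic concrete, here is a minimal sketch of the whole measurement. It is not the code from quality.go: the function name laplacianVariance, the []uint8 luma layout and the decision to skip border pixels are all assumptions made for illustration.

```go
package main

import "fmt"

// lapCenter is the centre weight of the four-neighbour Laplacian kernel.
const lapCenter = 4

// laplacianVariance returns the variance of the Laplacian response over the
// interior of a w×h luma plane (row-major, one byte per pixel). Border pixels
// are skipped because they lack a full neighbourhood (an assumption here).
func laplacianVariance(luma []uint8, w, h int) float64 {
	if w < 3 || h < 3 || len(luma) < w*h {
		return 0
	}
	var sum, sumSq float64
	n := 0
	for y := 1; y < h-1; y++ {
		for x := 1; x < w-1; x++ {
			c := int(luma[y*w+x])
			lap := int(luma[(y-1)*w+x]) + int(luma[(y+1)*w+x]) +
				int(luma[y*w+x-1]) + int(luma[y*w+x+1]) - lapCenter*c
			sum += float64(lap)
			sumSq += float64(lap) * float64(lap)
			n++
		}
	}
	mean := sum / float64(n)
	return sumSq/float64(n) - mean*mean
}

func main() {
	const w, h = 8, 8
	// flat is uniform grey (no edges); checker alternates black and white.
	flat := make([]uint8, w*h)
	checker := make([]uint8, w*h)
	for i := range flat {
		flat[i] = 128
		if (i%w+i/w)%2 == 0 {
			checker[i] = 255
		}
	}
	fmt.Println("flat:   ", laplacianVariance(flat, w, h))
	fmt.Println("checker:", laplacianVariance(checker, w, h))
}
```

The uniform frame scores zero and the checkerboard scores very high, which is the whole idea in miniature: edges push the number up, smooth gradients pull it down.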

Exposure is a histogram

The second question is whether the exposure is salvageable. If a third of the frame is pure white, the highlights are blown and there’s no detail to bring back; if it’s mostly pure black, the shadows are crushed the same way.

That’s just counting. Walk the luma plane once, tally how many pixels sit at or above a near-white threshold and how many at or below a near-black one, divide by the total, and you’ve got two fractions: the blown-highlight proportion and the crushed-shadow proportion. A photographer cares about those two numbers directly, and a for loop produces them (quality.go).
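As a rough sketch of that loop: the cut-offs below (250 and 5) and the name clippedFractions are made up for this example, not taken from quality.go, where the real thresholds may differ.

```go
package main

import "fmt"

// Hypothetical cut-offs for "near white" and "near black" on an 8-bit scale.
const (
	nearWhite = 250
	nearBlack = 5
)

// clippedFractions walks the luma plane once and returns the share of pixels
// at or above nearWhite and the share at or below nearBlack.
func clippedFractions(luma []uint8) (highlights, shadows float64) {
	if len(luma) == 0 {
		return 0, 0
	}
	var white, black int
	for _, v := range luma {
		switch {
		case v >= nearWhite:
			white++
		case v <= nearBlack:
			black++
		}
	}
	total := float64(len(luma))
	return float64(white) / total, float64(black) / total
}

func main() {
	// Ten pixels: three blown, two crushed, five mid-tones.
	luma := []uint8{255, 252, 250, 0, 3, 128, 90, 200, 60, 170}
	hi, lo := clippedFractions(luma)
	fmt.Printf("highlights %.2f, shadows %.2f\n", hi, lo)
}
```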

Two photos are the same when sixty-four bits agree

Then there are the bursts. A photographer holds the shutter through the first kiss and gets twelve nearly-identical frames; you want the sharpest one and the rest out of the way. To do that the tool has to know which frames are “the same shot”, and again you don’t need to understand the photo to tell.

The trick is a perceptual hash, a difference hash to be exact. Shrink the image right down to a grey thumbnail nine cells wide and eight tall, then in each row note simply whether each cell is brighter than the one to its right. Nine cells give eight comparisons per row, so eight rows make sixty-four yes/no answers, packed into a sixty-four-bit number: a fingerprint of the picture’s broad light-and-dark structure that survives a resize, a small reframe or a touch of noise:

if grey[y*hashW+x] > grey[y*hashW+x+1] {
    h |= Hash(1) << bit
}

Two fingerprints are compared by counting the bits that differ between them, the Hamming distance. On 64-bit integers that is an XOR followed by a population count (bits.OnesCount64), which modern CPUs typically do in a single instruction. A small distance means the frames look alike. krites clusters only consecutive frames within that distance, so a run of similar shots merges into a burst but two unrelated photos that happen to rhyme don’t (dedup.go).
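Put end to end, the fingerprint and the comparison might look like the sketch below. The Hash type, hashW and the comparison come from the excerpt above; hashH, the box-average thumbnail and the function names are assumptions for illustration, and dedup.go may well downscale differently.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Hash is a 64-bit difference hash.
type Hash uint64

const (
	hashW = 9 // thumbnail width: nine cells give eight comparisons per row
	hashH = 8 // thumbnail height
)

// thumbnail box-averages a w×h luma plane down to hashW×hashH.
// It assumes w >= hashW and h >= hashH so every box covers a pixel.
func thumbnail(luma []uint8, w, h int) [hashW * hashH]uint8 {
	var grey [hashW * hashH]uint8
	for ty := 0; ty < hashH; ty++ {
		y0, y1 := ty*h/hashH, (ty+1)*h/hashH
		for tx := 0; tx < hashW; tx++ {
			x0, x1 := tx*w/hashW, (tx+1)*w/hashW
			sum, n := 0, 0
			for y := y0; y < y1; y++ {
				for x := x0; x < x1; x++ {
					sum += int(luma[y*w+x])
					n++
				}
			}
			grey[ty*hashW+tx] = uint8(sum / n)
		}
	}
	return grey
}

// dHash sets one bit per cell that is brighter than its right-hand neighbour.
func dHash(luma []uint8, w, h int) Hash {
	grey := thumbnail(luma, w, h)
	var hash Hash
	bit := 0
	for y := 0; y < hashH; y++ {
		for x := 0; x < hashW-1; x++ {
			if grey[y*hashW+x] > grey[y*hashW+x+1] {
				hash |= Hash(1) << bit
			}
			bit++
		}
	}
	return hash
}

// distance is the Hamming distance: XOR leaves a 1 wherever the two hashes
// disagree, and OnesCount64 counts those bits.
func distance(a, b Hash) int {
	return bits.OnesCount64(uint64(a ^ b))
}

func main() {
	const w, h = 18, 16
	falling := make([]uint8, w*h) // bright on the left, dark on the right
	rising := make([]uint8, w*h)  // the mirror image
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			falling[y*w+x] = uint8(255 - x*10)
			rising[y*w+x] = uint8(x * 10)
		}
	}
	a, b := dHash(falling, w, h), dHash(rising, w, h)
	fmt.Printf("%016x vs %016x, distance %d\n", uint64(a), uint64(b), distance(a, b))
}
```

The two gradients are deliberate opposites, so every comparison flips and the distance is the maximum possible; frames from a real burst would sit at the other end of that range.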

Best-of-burst is then the dullest line of code in the project: keep the sharpest frame in the cluster, demote the others from keep to maybe, and write down why.

fv.Reasons = append(fv.Reasons, "near-duplicate of "+bestFrame+" (kept the sharper frame)")
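Here is one way the clustering and the demotion could fit together. This is a sketch under assumptions, not a copy of dedup.go: the Frame struct and demoteBursts are hypothetical, each frame is compared with the one immediately before it, and only frames that are still a keep compete for best-of-burst.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Frame is a hypothetical record of one photo's measurements.
type Frame struct {
	Name      string
	Hash      uint64
	Sharpness float64
	Verdict   string // "keep", "maybe" or "reject"
	Reasons   []string
}

// demoteBursts groups consecutive frames whose hashes differ by at most
// maxDist bits, then demotes every "keep" in a group except the sharpest.
func demoteBursts(frames []Frame, maxDist int) {
	for start := 0; start < len(frames); {
		// Extend the run while each frame resembles the one before it.
		end := start + 1
		for end < len(frames) &&
			bits.OnesCount64(frames[end-1].Hash^frames[end].Hash) <= maxDist {
			end++
		}
		// Pick the sharpest frame that is still a keep, if there is one.
		best := -1
		for i := start; i < end; i++ {
			if frames[i].Verdict != "keep" {
				continue
			}
			if best < 0 || frames[i].Sharpness > frames[best].Sharpness {
				best = i
			}
		}
		// Demote the other keeps and record why.
		for i := start; i < end; i++ {
			if i != best && frames[i].Verdict == "keep" {
				frames[i].Verdict = "maybe"
				frames[i].Reasons = append(frames[i].Reasons,
					"near-duplicate of "+frames[best].Name+" (kept the sharper frame)")
			}
		}
		start = end
	}
}

func main() {
	frames := []Frame{
		{Name: "kiss-1", Hash: 0b00, Sharpness: 100, Verdict: "keep"},
		{Name: "kiss-2", Hash: 0b01, Sharpness: 300, Verdict: "keep"},
		{Name: "kiss-3", Hash: 0b11, Sharpness: 200, Verdict: "keep"},
		{Name: "cake", Hash: ^uint64(0), Sharpness: 150, Verdict: "keep"},
	}
	demoteBursts(frames, 8)
	for _, f := range frames {
		fmt.Println(f.Name, f.Verdict, f.Reasons)
	}
}
```

Because only neighbours are compared, the three kiss frames chain into one burst while the cake shot, dozens of bits away, starts a cluster of its own.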

Signals in, a verdict out

None of those measurements decide anything on their own. A focus score of 50 is rejectable on one shoot and fine on another, because the numbers scale with resolution and content. So the signals feed a profile, a small set of thresholds, and the profile turns them into a ruling: below the hard focus gate it’s a reject, below a softer floor it’s a maybe, blown past the exposure gates it’s a reject, otherwise it’s a keep. Every verdict carries its reasons in plain words, “out of focus (sharpness 32 below 50)”, because krites proposes and the human disposes (cull.go).

The seed thresholds for a wedding are just a starting point, written to config on krites init and tuned from there:

seedMinSharpness  = 50  // below this: rejected as out of focus
seedSoftSharpness = 150 // below this (but >= min): demoted to maybe
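seedMaxHighlights = 0.10 // blown-highlight fraction above this: rejected
seedMaxShadows    = 0.30 // crushed-shadow fraction above this: rejected
seedDedupDistance = 8    // max differing hash bits between consecutive frames in a burst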

Keeping the thresholds visible is the whole point. “Suitable for a wedding album” is Hailey’s definition, not mine and not a model’s, and a number in a config file is something she can move (profile.go).
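The rules and seed values above could be wired together roughly as follows. Treat it as a sketch: the Profile and Signals structs, the judge function and every reason string except the out-of-focus one are hypothetical, and cull.go may order or word things differently.

```go
package main

import "fmt"

// Profile holds the thresholds; the field names are illustrative.
type Profile struct {
	MinSharpness  float64 // hard focus gate
	SoftSharpness float64 // softer focus floor
	MaxHighlights float64 // largest tolerated blown-highlight fraction
	MaxShadows    float64 // largest tolerated crushed-shadow fraction
}

// Signals are the per-frame measurements fed into the profile.
type Signals struct {
	Sharpness, Highlights, Shadows float64
}

// judge turns measurements into a verdict plus plain-language reasons.
func judge(s Signals, p Profile) (verdict string, reasons []string) {
	verdict = "keep"
	switch {
	case s.Sharpness < p.MinSharpness:
		verdict = "reject"
		reasons = append(reasons, fmt.Sprintf(
			"out of focus (sharpness %.0f below %.0f)", s.Sharpness, p.MinSharpness))
	case s.Sharpness < p.SoftSharpness:
		verdict = "maybe"
		reasons = append(reasons, fmt.Sprintf(
			"soft focus (sharpness %.0f below %.0f)", s.Sharpness, p.SoftSharpness))
	}
	if s.Highlights > p.MaxHighlights {
		verdict = "reject"
		reasons = append(reasons, fmt.Sprintf(
			"blown highlights (%.0f%% of frame, limit %.0f%%)",
			s.Highlights*100, p.MaxHighlights*100))
	}
	if s.Shadows > p.MaxShadows {
		verdict = "reject"
		reasons = append(reasons, fmt.Sprintf(
			"crushed shadows (%.0f%% of frame, limit %.0f%%)",
			s.Shadows*100, p.MaxShadows*100))
	}
	return verdict, reasons
}

func main() {
	// The wedding seed values quoted above.
	wedding := Profile{MinSharpness: 50, SoftSharpness: 150, MaxHighlights: 0.10, MaxShadows: 0.30}
	for _, s := range []Signals{
		{Sharpness: 32, Highlights: 0.01, Shadows: 0.05},
		{Sharpness: 100, Highlights: 0.01, Shadows: 0.05},
		{Sharpness: 400, Highlights: 0.20, Shadows: 0.05},
		{Sharpness: 400, Highlights: 0.01, Shadows: 0.05},
	} {
		verdict, reasons := judge(s, wedding)
		fmt.Println(verdict, reasons)
	}
}
```

Nothing in judge knows what a wedding is; swap in a different Profile and the same measurements yield a different ruling.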

Where the models do belong

I’m not claiming AI has no place in this. Some of what a wedding photographer culls on genuinely needs a model: is this person mid-blink, is anyone actually looking at the camera, is the composition any good. Those are coming, and they’ll be model-backed when they do. The deliberate bit is that they sit outside this deterministic core, behind an interface, opt-in. The maths that does the heavy lifting of the first pass never imports a model.
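What “behind an interface, opt-in” could look like is sketched below. None of these names exist in krites as far as this post says: Signal, Extras and stubBlink are all hypothetical, and the stub contains no model at all. It is only there to show the shape of the boundary.

```go
package main

import "fmt"

// Signal is a hypothetical extension point for model-backed checks such as
// blink or gaze detection. The deterministic core would not implement it.
type Signal interface {
	Name() string
	// Assess returns a reason to demote the frame, or "" for no objection.
	Assess(framePath string) (reason string, err error)
}

// Extras holds the opt-in signals. The zero value has none enabled.
type Extras struct {
	signals []Signal
}

// Enable registers a signal; until it is called, no extra check runs.
func (e *Extras) Enable(s Signal) { e.signals = append(e.signals, s) }

// Reasons asks each enabled signal about one frame and collects objections.
// It stops at the first error so a failing check is surfaced, not hidden.
func (e *Extras) Reasons(framePath string) ([]string, error) {
	var out []string
	for _, s := range e.signals {
		reason, err := s.Assess(framePath)
		if err != nil {
			return out, fmt.Errorf("%s: %w", s.Name(), err)
		}
		if reason != "" {
			out = append(out, s.Name()+": "+reason)
		}
	}
	return out, nil
}

// stubBlink stands in for a real detector. It flags every frame, which is
// enough to exercise the plumbing.
type stubBlink struct{}

func (stubBlink) Name() string                  { return "blink" }
func (stubBlink) Assess(string) (string, error) { return "eyes closed", nil }

func main() {
	var extras Extras
	reasons, err := extras.Reasons("IMG_0001.jpg")
	fmt.Println(reasons, err) // nothing enabled, so nothing to report

	extras.Enable(stubBlink{})
	reasons, err = extras.Reasons("IMG_0001.jpg")
	fmt.Println(reasons, err)
}
```

The point of the shape is the default: with nothing enabled, the cull is exactly the arithmetic described above.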

That separation buys three things you lose the moment a neural net touches the hot path. It’s reproducible: the same frames in the same order always cull the same way, so a verdict is debuggable and a regression is catchable. It’s quick enough to run over four thousand frames on a laptop with no GPU. And it stays honest about what it knows, because a threshold you can read is a threshold you can argue with, which a confidence score from a black box never quite is.

“AI culling” makes for a better headline. But blur really is just a number, a duplicate really is just sixty-four bits, and the grim, mechanical first pass that stands between a photographer and their best photos comes down to arithmetic.