Process · AI Engineering for Frontend

I run multi-agent pipelines that ship production WebGL.

Five specialists. Four human gates I sit at personally. One brief in. One pull request out.

The pipeline I built at CraftedKit takes a hero brief — palette, motion budget, industry, restraint tier — and routes it through five specialist agents. Research, design, build, QA, and an orchestrator. I sit at four hard gates: mission approval, two creative reviews, and ship. Nothing moves between agents without my read. The harness enforces style and safety at every Write and Edit, so the output that lands on my desk is already past the slop threshold. The point of the system is not to remove me from the loop. It is to put me only where my judgment is the binding constraint.
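
To make the shape concrete, here is a minimal sketch of that layout as data, in TypeScript. The names and fields (Stage, Gate, MissionBrief, the gate-to-stage mapping) are illustrative assumptions, not the actual harness config.

    // Illustrative only: the stage and gate layout described above, as data.
    // Names, fields, and the gate-to-stage mapping are hypothetical.

    type Stage = "research" | "design" | "build" | "qa";

    interface Gate {
      name: string;   // a human checkpoint; nothing passes without a read
      after: Stage;   // where in the flow the gate sits
    }

    // The brief that enters at the top of the pipeline.
    interface MissionBrief {
      palette: string[];
      motionBudgetMs: number;   // total animation the hero is allowed to spend
      industry: string;
      restraintTier: 1 | 2 | 3;
    }

    // One brief in, one pull request out. The orchestrator agent routes work
    // between the four stages; the four gates are where I sit.
    const gates: Gate[] = [
      { name: "Mission Approval", after: "research" },
      { name: "Creative Review A", after: "design" },
      { name: "Creative Review B", after: "build" },
      { name: "Ship", after: "qa" },
    ];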

Section 02

Where agents pull their weight.

  • 01

    Reference research

    Pull 5+ live sites from a brief. Annotate steal-from, avoid, and motion budget per reference. Cuts the inspiration-gathering pass from a half-day to a coffee break. The annotation shape is sketched after this list.

  • 02

    Component scaffolding under tight rules

    Given approved direction, palette, and motion budget, agents produce R3F + GLSL components that compile, render, and pass smoke tests. Output is small enough to QA by hand. One such scaffold is sketched after this list.

  • 03

    Style enforcement

    Banned patterns blocked at PostToolUse. Wildcard Three imports, console.log calls, suppressed lint rules — agents physically cannot ship slop because the harness rejects it. The check itself is sketched after this list.

  • 04

    Repetitive operations

    Asset conversion, video compression, file moves, mission scaffolding. Deterministic, well-defined work. Native agent territory.

  • 05

    QA and audit

    Bundle size, FPS validation, motion-budget claims. Agents run the audit before the PR opens. Faster than manual, more consistent than spot-checks. A bundle-size gate from that audit is sketched after this list.
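
For item 01, a minimal sketch of what one annotated reference could look like. The interface and field names are hypothetical; the real annotation lives wherever the research agent writes it.

    // Illustrative shape for one annotated reference; field names are hypothetical.
    interface ReferenceNote {
      url: string;
      stealFrom: string[];     // moments worth borrowing
      avoid: string[];         // patterns to keep out of the build
      motionBudgetMs: number;  // how much motion the reference actually spends
      notes?: string;
    }

    // The research agent returns five or more of these per brief.
    type ReferencePass = ReferenceNote[];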
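For item 02, a sketch of the kind of scaffold the build agent hands back, assuming @react-three/fiber and three. The component name, props, and shader are hypothetical; the point is the size. One small component, motion scaled by the budget from the brief, nothing the QA pass cannot read by hand.

    // Illustrative scaffold only: the kind of small R3F + GLSL component the
    // build agent might produce. Component name, props, and shader are hypothetical.
    import { useMemo, useRef } from "react";
    import { useFrame } from "@react-three/fiber";
    import { Color, ShaderMaterial } from "three";

    const vertexShader = /* glsl */ `
      varying vec2 vUv;
      void main() {
        vUv = uv;
        gl_Position = projectionMatrix * modelViewMatrix * vec4(position, 1.0);
      }
    `;

    const fragmentShader = /* glsl */ `
      uniform float uTime;
      uniform vec3 uColor;
      varying vec2 vUv;
      void main() {
        // A slow vertical sweep; amplitude stays small so the motion reads as restraint.
        float sweep = 0.85 + 0.15 * sin(uTime + vUv.y * 3.0);
        gl_FragColor = vec4(uColor * sweep, 1.0);
      }
    `;

    export function HeroPanel({ color = "#d8c9a3", motionBudget = 1 }: {
      color?: string;
      motionBudget?: number; // 0 freezes the shader, 1 spends the full budget
    }) {
      const material = useRef<ShaderMaterial>(null);

      // Uniforms are created once per color so re-renders do not reset the clock.
      const uniforms = useMemo(
        () => ({ uTime: { value: 0 }, uColor: { value: new Color(color) } }),
        [color]
      );

      // Advance the shader clock, scaled by the motion budget from the brief.
      useFrame((_, delta) => {
        if (material.current) material.current.uniforms.uTime.value += delta * motionBudget;
      });

      return (
        <mesh>
          <planeGeometry args={[2, 1]} />
          <shaderMaterial
            ref={material}
            uniforms={uniforms}
            vertexShader={vertexShader}
            fragmentShader={fragmentShader}
          />
        </mesh>
      );
    }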
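For item 03, a sketch of what the PostToolUse check could look like as a small Node-run script. The payload field names and the blocking exit code are assumptions about the hook contract, not its documentation; swap in whatever your harness actually passes on stdin.

    // Illustrative PostToolUse check. The payload fields (tool_input.file_path,
    // tool_input.content) and the blocking exit code are assumptions.
    interface HookPayload {
      tool_name: string;
      tool_input?: { file_path?: string; content?: string };
    }

    const BANNED: Array<{ pattern: RegExp; reason: string }> = [
      { pattern: /import\s+\*\s+as\s+\w+\s+from\s+['"]three['"]/, reason: "wildcard Three import" },
      { pattern: /console\.log\(/, reason: "console.log left in source" },
      { pattern: /eslint-disable/, reason: "suppressed lint rule" },
    ];

    let raw = "";
    process.stdin.on("data", (chunk) => (raw += chunk));
    process.stdin.on("end", () => {
      const payload = JSON.parse(raw) as HookPayload;
      const content = payload.tool_input?.content ?? "";
      const hits = BANNED.filter((rule) => rule.pattern.test(content));
      if (hits.length > 0) {
        // Reasons go to stderr; a blocking exit code makes the harness reject the write.
        console.error(hits.map((hit) => `blocked: ${hit.reason}`).join("\n"));
        process.exit(2);
      }
      process.exit(0);
    });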
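For item 05, the simplest piece of that audit, a bundle-size gate. The output path and the budget number are hypothetical; FPS and motion-budget validation need a headless browser pass and are not shown here.

    // Illustrative bundle-size gate; path and limit are hypothetical defaults.
    import { statSync } from "node:fs";

    const BUNDLE_PATH = process.argv[2] ?? "dist/hero.js";
    const LIMIT_KB = Number(process.argv[3] ?? 250);

    const sizeKb = statSync(BUNDLE_PATH).size / 1024;
    if (sizeKb > LIMIT_KB) {
      console.error(`bundle ${BUNDLE_PATH} is ${sizeKb.toFixed(1)} kB, budget is ${LIMIT_KB} kB`);
      process.exit(1);
    }
    console.log(`bundle ${BUNDLE_PATH} is ${sizeKb.toFixed(1)} kB (within ${LIMIT_KB} kB budget)`);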

Section 03

Where humans still own it.

  • 01

    Taste calls on direction

    Picking 1 of 3 references requires brand fit, cultural read, trend timing. Agents over-fit to averages. I make the call at Mission Approval — every mission, every time.

  • 02

    Mobile legibility

    Does this read on a 375px screen at 14px in motion? Has to be tested on a real phone. Agents miss this. I sit at Creative Review B with the build on my desk and in my pocket.

  • 03

    Compounding architecture

    Naming conventions, module boundaries, registry shapes. Agents pick fine-looking answers that compound badly over 100 components. The bones are mine.

  • 04

    Deep stack debugging

    Race conditions, GPU vendor quirks, Apple Silicon Metal vs ANGLE, shader precision drift across mobile. Agents flounder. Humans pattern-match.

  • 05

    The "is this any good" gut read

    Agents will tell you everything looks great. They cannot replace the gut read of "this is unmissable" versus "this is competent." That is the human gate, full stop.

Section 04

Hire the engineer or hire the studio.