Process · AI Engineering for Frontend

I run multi-agent pipelines that ship production WebGL.

Five specialists. Four human gates I sit at personally. One brief in. One pull request out.

The pipeline I built at CraftedKit takes a hero brief — palette, motion budget, industry, restraint tier — and routes it through five specialist agents. Research, design, build, QA, and an orchestrator. I sit at four hard gates: mission approval, two creative reviews, and ship. Nothing moves between agents without my read. The harness enforces style and safety at every Write and Edit, so the output that lands on my desk is already past the slop threshold. The point of the system is not to remove me from the loop. It is to put me only where my judgment is the binding constraint.
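
To make the shape concrete, here is a minimal sketch of that layout as data, in TypeScript. The names and fields (Stage, Gate, MissionBrief, the gate-to-stage mapping) are illustrative assumptions, not the actual harness config.

    // Illustrative only: the stage and gate layout described above, as data.
    // Names, fields, and the gate-to-stage mapping are hypothetical.

    type Stage = "research" | "design" | "build" | "qa";

    interface Gate {
      name: string;   // a human checkpoint; nothing passes without a read
      after: Stage;   // where in the flow the gate sits
    }

    // The brief that enters at the top of the pipeline.
    interface MissionBrief {
      palette: string[];
      motionBudgetMs: number;   // total animation the hero is allowed to spend
      industry: string;
      restraintTier: 1 | 2 | 3;
    }

    // One brief in, one pull request out. The orchestrator agent routes work
    // between the four stages; the four gates are where I sit.
    const gates: Gate[] = [
      { name: "Mission Approval", after: "research" },
      { name: "Creative Review A", after: "design" },
      { name: "Creative Review B", after: "build" },
      { name: "Ship", after: "qa" },
    ];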

Section 02

Where agents pull their weight.

  • 01

    Reference research

    Pull 5+ live sites from a brief. Annotate steal-from, avoid, and motion budget per reference. Cuts the inspiration-gathering pass from a half-day to a coffee break. The annotation shape is sketched after this list.

  • 02

    Component scaffolding under tight rules

    Given approved direction, palette, and motion budget, agents produce R3F + GLSL components that compile, render, and pass smoke tests. Output is small enough to QA by hand. One such scaffold is sketched after this list.

  • 03

    Style enforcement

    Banned patterns blocked at PostToolUse. Wildcard Three imports, console.log calls, suppressed lint rules — agents physically cannot ship slop because the harness rejects it. The check itself is sketched after this list.

  • 04

    Repetitive operations

    Asset conversion, video compression, file moves, mission scaffolding. Deterministic, well-defined work. Native agent territory.

  • 05

    QA and audit

    Bundle size, FPS validation, motion-budget claims. Agents run the audit before the PR opens. Faster than manual, more consistent than spot-checks. A bundle-size gate from that audit is sketched after this list.
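
For item 01, a minimal sketch of what one annotated reference could look like. The interface and field names are hypothetical; the real annotation lives wherever the research agent writes it.

    // Illustrative shape for one annotated reference; field names are hypothetical.
    interface ReferenceNote {
      url: string;
      stealFrom: string[];     // moments worth borrowing
      avoid: string[];         // patterns to keep out of the build
      motionBudgetMs: number;  // how much motion the reference actually spends
      notes?: string;
    }

    // The research agent returns five or more of these per brief.
    type ReferencePass = ReferenceNote[];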
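For item 02, a sketch of the kind of scaffold the build agent hands back, assuming @react-three/fiber and three. The component name, props, and shader are hypothetical; the point is the size. One small component, motion scaled by the budget from the brief, nothing the QA pass cannot read by hand.

    // Illustrative scaffold only: the kind of small R3F + GLSL component the
    // build agent might produce. Component name, props, and shader are hypothetical.
    import { useMemo, useRef } from "react";
    import { useFrame } from "@react-three/fiber";
    import { Color, ShaderMaterial } from "three";

    const vertexShader = /* glsl */ `
      varying vec2 vUv;
      void main() {
        vUv = uv;
        gl_Position = projectionMatrix * modelViewMatrix * vec4(position, 1.0);
      }
    `;

    const fragmentShader = /* glsl */ `
      uniform float uTime;
      uniform vec3 uColor;
      varying vec2 vUv;
      void main() {
        // A slow vertical sweep; amplitude stays small so the motion reads as restraint.
        float sweep = 0.85 + 0.15 * sin(uTime + vUv.y * 3.0);
        gl_FragColor = vec4(uColor * sweep, 1.0);
      }
    `;

    export function HeroPanel({ color = "#d8c9a3", motionBudget = 1 }: {
      color?: string;
      motionBudget?: number; // 0 freezes the shader, 1 spends the full budget
    }) {
      const material = useRef<ShaderMaterial>(null);

      // Uniforms are created once per color so re-renders do not reset the clock.
      const uniforms = useMemo(
        () => ({ uTime: { value: 0 }, uColor: { value: new Color(color) } }),
        [color]
      );

      // Advance the shader clock, scaled by the motion budget from the brief.
      useFrame((_, delta) => {
        if (material.current) material.current.uniforms.uTime.value += delta * motionBudget;
      });

      return (
        <mesh>
          <planeGeometry args={[2, 1]} />
          <shaderMaterial
            ref={material}
            uniforms={uniforms}
            vertexShader={vertexShader}
            fragmentShader={fragmentShader}
          />
        </mesh>
      );
    }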
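For item 03, a sketch of what the PostToolUse check could look like as a small Node-run script. The payload field names and the blocking exit code are assumptions about the hook contract, not its documentation; swap in whatever your harness actually passes on stdin.

    // Illustrative PostToolUse check. The payload fields (tool_input.file_path,
    // tool_input.content) and the blocking exit code are assumptions.
    interface HookPayload {
      tool_name: string;
      tool_input?: { file_path?: string; content?: string };
    }

    const BANNED: Array<{ pattern: RegExp; reason: string }> = [
      { pattern: /import\s+\*\s+as\s+\w+\s+from\s+['"]three['"]/, reason: "wildcard Three import" },
      { pattern: /console\.log\(/, reason: "console.log left in source" },
      { pattern: /eslint-disable/, reason: "suppressed lint rule" },
    ];

    let raw = "";
    process.stdin.on("data", (chunk) => (raw += chunk));
    process.stdin.on("end", () => {
      const payload = JSON.parse(raw) as HookPayload;
      const content = payload.tool_input?.content ?? "";
      const hits = BANNED.filter((rule) => rule.pattern.test(content));
      if (hits.length > 0) {
        // Reasons go to stderr; a blocking exit code makes the harness reject the write.
        console.error(hits.map((hit) => `blocked: ${hit.reason}`).join("\n"));
        process.exit(2);
      }
      process.exit(0);
    });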
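For item 05, the simplest piece of that audit, a bundle-size gate. The output path and the budget number are hypothetical; FPS and motion-budget validation need a headless browser pass and are not shown here.

    // Illustrative bundle-size gate; path and limit are hypothetical defaults.
    import { statSync } from "node:fs";

    const BUNDLE_PATH = process.argv[2] ?? "dist/hero.js";
    const LIMIT_KB = Number(process.argv[3] ?? 250);

    const sizeKb = statSync(BUNDLE_PATH).size / 1024;
    if (sizeKb > LIMIT_KB) {
      console.error(`bundle ${BUNDLE_PATH} is ${sizeKb.toFixed(1)} kB, budget is ${LIMIT_KB} kB`);
      process.exit(1);
    }
    console.log(`bundle ${BUNDLE_PATH} is ${sizeKb.toFixed(1)} kB (within ${LIMIT_KB} kB budget)`);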

Section 03

Where humans still own it.

  • 01

    Taste calls on direction

    Picking 1 of 3 references requires brand fit, cultural read, trend timing. Agents over-fit to averages. I make the call at Mission Approval — every mission, every time.

  • 02

    Mobile legibility

    Does this read on a 375px screen at 14px in motion? Has to be tested on a real phone. Agents miss this. I sit at Creative Review B with the build on my desk and in my pocket.

  • 03

    Compounding architecture

    Naming conventions, module boundaries, registry shapes. Agents pick fine-looking answers that compound badly over 100 components. The bones are mine.

  • 04

    Deep stack debugging

    Race conditions, GPU vendor quirks, Apple Silicon Metal vs ANGLE, shader precision drift across mobile. Agents flounder. Humans pattern-match.

  • 05

    The "is this any good" gut read

    Agents will tell you everything looks great. They cannot replace the gut read of "this is unmissable" versus "this is competent." That is the human gate, full stop.

Section 04

Hire the engineer or hire the studio.