System teardown · Case 01 deep dive

The screen builder, end to end.

Companion piece to Case 01. The walkthrough below is mocked into a neutral domain — a document review workflow — for confidentiality. The architecture, phases, and outputs mirror what we run inside the real product.

Context layer

Eight folders feed the agent its working knowledge, governed by a single CLAUDE.md that defines retrieval rules, scope, and which folder owns what kind of decision.

Eight folders, one rulebook
Folders the agent reads, governed by CLAUDE.md
design-reviews/
design-working-sessions/
user-feedback/
design-system/
competitor-breakdowns/
current-ui-screens/
modernized-ui-screens/
scraped-customer-implementations/

The last folder solved a real visibility gap. We couldn't see how customers were configuring our white-label deployments, so I had the agent build a compliant scraper (run inside our legal team's guardrails) to surface deployment patterns and emerging feature requests we'd otherwise miss. That folder fed every subsequent design decision.
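To make the "scope" part of the rulebook concrete, here is a minimal sketch of a retrieval guard. The folder names come from the list above; the `in_scope` helper and the idea of enforcing scope in code are assumptions for illustration, since the actual rules live as prose in CLAUDE.md.

```python
from pathlib import PurePosixPath

# Folder names are taken from the list above; everything else is hypothetical.
CONTEXT_FOLDERS = frozenset({
    "design-reviews",
    "design-working-sessions",
    "user-feedback",
    "design-system",
    "competitor-breakdowns",
    "current-ui-screens",
    "modernized-ui-screens",
    "scraped-customer-implementations",
})


def in_scope(path: str) -> bool:
    """Return True if a relative file path sits inside one of the eight folders."""
    parts = PurePosixPath(path).parts
    # Reject empty paths, absolute paths, and parent-directory escapes.
    if not parts or parts[0] == "/" or ".." in parts:
        return False
    # Require at least one path component below the top-level folder.
    return len(parts) > 1 and parts[0] in CONTEXT_FOLDERS
```

A guard like this would let a run refuse any read that falls outside the eight folders before the agent sees the content.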

The six-phase pipeline

Each run moves through six phases. Outputs from earlier phases are cited explicitly by later phases, so every artifact carries provenance back to the problem brief that spawned it.

  1. Audit current product

    Reads current UI screens, the live production site, and recent meeting context.

  2. Best-practice research

    Pulls competitor patterns: navigation, hierarchy, density, empty/error states, mobile UX. Flags components missing from our library.

  3. Design rationale + build

    Uses the screen-builder rules, component snippets, and the CSS token system. Builds HTML from tokens and approved components only.

  4. Accessibility verification

    WCAG 2.2 audit before delivery. Surfaces violations with remediation guidance.

  5. Packaged delivery

    Final HTML + assets, ready for dev review and Figma round-trip.

  6. Decision log + learned rules

    Every choice ties back to phase 1–2 evidence. Misuses become hard rules the next run can't violate.
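The provenance idea can be sketched as a small citation graph. This is an illustration under assumptions, not the product's data model: the `Artifact` class, the `reaches_brief` helper, and treating the problem brief as phase 0 are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    phase: int  # 1..6 for the phases above; 0 stands in for the problem brief
    name: str
    cites: list["Artifact"] = field(default_factory=list)


def reaches_brief(artifact: Artifact) -> bool:
    """True if following citations from this artifact eventually reaches phase 0."""
    if artifact.phase == 0:
        return True
    # A citation only counts when it points at a strictly earlier phase,
    # which also guarantees the recursion terminates.
    return any(
        cited.phase < artifact.phase and reaches_brief(cited)
        for cited in artifact.cites
    )


brief = Artifact(0, "problem brief")
audit = Artifact(1, "current-product audit", [brief])
research = Artifact(2, "best-practice research", [brief])
build = Artifact(3, "design rationale + build", [audit, research])
orphan = Artifact(3, "screen with no citations")

print(reaches_brief(build))   # True
print(reaches_brief(orphan))  # False
```

An artifact that fails this check has no traceable evidence behind it, which is the condition the decision log is meant to rule out.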

The pipeline on one page

The full diagram, end to end. Problem framing on the left, post-ship learning loop on the right. Every artifact in the middle has a citation back to the brief that spawned it.

8-phase AI-augmented design pipeline, full diagram
Phases 0 through 7, with checkpoints and proposed reinforcements.

The feedback loop

The pipeline above describes a single run. The compounding value comes from what happens between runs.

Per-product memory
The system gets harder to break with every run

After delivery, designers review the screens and write feedback into a lessons.md file — one per product. The screen builder reads that file at the start of every subsequent run. Misuses, edge cases we missed, components used out of context, accessibility gaps caught in review — all of it accrues into a per-product memory the agent has to honor on the next pass.

The compounding effect is the point. Run one is competent. Run twenty is shaped by twenty rounds of designer judgment. The agent isn't getting smarter — the codified design intent around it is.
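A minimal sketch of the read-at-start step, assuming each product keeps its lessons.md in its own directory and that each lesson is written as a "- " bullet. Both the file layout and the function names are assumptions; the source only says the file exists, is per product, and is read at the start of every run.

```python
from pathlib import Path


def parse_lessons(text: str) -> list[str]:
    """Collect bullet lines ("- ...") as rules, skipping headings and prose."""
    rules = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("- "):
            rules.append(stripped[2:].strip())
    return rules


def load_lessons(product_dir: Path) -> list[str]:
    """Read <product_dir>/lessons.md; a product with no file yet has no rules."""
    path = product_dir / "lessons.md"
    if not path.exists():
        return []
    return parse_lessons(path.read_text(encoding="utf-8"))


sample = """# Lessons

- Disabled rows must keep their status pill at full opacity
Some free-form reviewer commentary.
"""
print(parse_lessons(sample))
# ['Disabled rows must keep their status pill at full opacity']
```

The returned rules would be placed in front of the run's context, so each run starts with every earlier round of designer feedback already loaded.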

One run, output side

Below: the agent generated three states for a Document Review screen — drawing only from approved components and the documented edge cases.

Output side · one run
Document review screen — three documented edge cases

The agent generated each variant from the same component vocabulary. Edge cases came from documented scenarios, not improvisation.
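The "approved vocabulary only" constraint lends itself to a mechanical check. The sketch below scans generated markup for CSS custom properties that are not on an allowlist. The three token names are the ones quoted in the decision log excerpt; the `unknown_tokens` helper, the regex, and the sample markup are hypothetical, and a real token list would be much longer.

```python
import re

# Matches the custom-property name inside var(...), e.g. var(--status-x, red).
TOKEN_USE = re.compile(r"var\(\s*(--[A-Za-z0-9_-]+)")

# Only the tokens named in the decision log; a stand-in for the full token system.
APPROVED_TOKENS = frozenset({
    "--status-pending-amber-300",
    "--status-active-blue-400",
    "--status-approved-green-400",
})


def unknown_tokens(markup: str, approved: frozenset[str] = APPROVED_TOKENS) -> list[str]:
    """Return the sorted custom properties used in markup that are not approved."""
    return sorted(set(TOKEN_USE.findall(markup)) - approved)


markup = (
    '<span style="color: var(--status-pending-amber-300);'
    ' background: var(--brand-pink-500)">Pending</span>'
)
print(unknown_tokens(markup))  # ['--brand-pink-500']
```

An empty result corresponds to the "No new tokens introduced" line in the decision log; anything else would block delivery.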

Happy path
app.example/reviews
Document Review
All · Mine · Team
Empty state
app.example/reviews
Document Review
Nothing to review yet
When teammates assign you a document, it lands here.
View team queue
Permissions: read-only
app.example/reviews
Read-only access — comments are visible to assigned reviewers only.
Document Review
Decision log excerpt · phase 06
  • Component: Used Banner / informational for the read-only state. Source: competitor-breakdowns/permissions-patterns.md, lines 14–22.
  • Token: Status pills use --status-pending-amber-300, --status-active-blue-400, --status-approved-green-400. No new tokens introduced.
  • A11y: Empty-state CTA reaches 4.7:1 contrast on dark surface. WCAG 2.2 AA · pass.
  • Rule learned: Disabled rows must keep their status pill at full opacity; readability beats greyscale uniformity. Added to screen-builder-rules.md §3.4.
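For reference, a contrast figure like the 4.7:1 in the A11y entry comes from the WCAG relative-luminance formula. The sketch below implements that published formula; the function names are my own, and the colours are generic examples, not the actual palette behind the logged result.

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light, per the WCAG definition."""
    c = channel / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a #rrggbb colour (0.0 for black, 1.0 for white)."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio: (lighter + 0.05) / (darker + 0.05)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """AA requires 4.5:1 for normal text and 3:1 for large text."""
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)


print(round(contrast_ratio("#ffffff", "#000000"), 2))  # 21.0
```

A 4.7:1 result clears the 4.5:1 AA threshold for normal-size text, which is why the log records a pass.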