I Asked AI to Build a Design System. It Broke Its Own Rules in One Prompt.

Quick answer: I asked Figma Make to generate a design system, then build a dashboard using it. The design system defined 4 colors, one font weight, and zero spacing tokens. The dashboard introduced new colors, swapped to system fonts, used wrong tokens, and hardcoded values the system never defined. One element, four violations. The AI broke its own design system in a single prompt.

Design systems exist to keep products consistent. But what happens when the system itself is AI-generated, and the product using it is also AI-generated, in the same session, by the same tool?

Prompt 1: Generate a Design System

Figma Make returned a full page: color palette, typography scale, buttons with six variants, form elements, feedback components, data display with stats cards. It looked comprehensive at first glance. Look closer, though: 4 colors total with no state variants, every heading at Medium weight, the same 1.5 line-height everywhere, zero spacing tokens, and no interactive states defined.

Prompt 2: Make It Real

A more specific prompt asked for text/border/background colors, a spacing scale from 4px to 64px, elevation tokens, a dark mode palette, hover/focus/active states, and border radius tokens. Figma Make delivered the structure. But on publish, the color palette section broke: background color swatches rendered invisible or went missing entirely. The design system could not survive its own deployment.
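For reference, the token categories that prompt asked for can be sketched as a small set of named values. Every token name and value below is an illustrative assumption, not Figma Make's actual output:

```typescript
// Illustrative token set covering the categories from the second prompt:
// a 4px-to-64px spacing scale, border radius tokens, and an elevation token.
// Names and values are assumptions for this sketch, not the generated system.
const spacingSteps = [4, 8, 12, 16, 24, 32, 48, 64];

const tokens: Record<string, string> = {
  ...Object.fromEntries(
    spacingSteps.map((px, i) => [`--space-${i + 1}`, `${px}px`] as [string, string])
  ),
  "--radius-sm": "4px",
  "--radius-md": "8px",
  "--elevation-1": "0 1px 2px rgb(0 0 0 / 0.08)",
};

console.log(tokens["--space-1"], tokens["--space-8"]); // "4px" "64px"
```

A scale defined this explicitly is what makes the later audit mechanical: any computed value not in the set is, by definition, off-system.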

Prompt 3: Build a Product With It

The prompt: "Build me a dashboard for this SaaS product using this design system." Figma Make generated a full dashboard with sidebar nav, stats cards, charts, data table. It even reused the exact same placeholder numbers from the design system's stats cards. It did not generate new data. It copy-pasted the demo data and wrapped a layout around it.

The Audit: What the Design System Says vs. What the Dashboard Does

OverlayQA's element inspector revealed the gap between the defined system and the built product.

Colors It Invented

Pastel icon backgrounds on stats cards, a purple-to-pink chart gradient, four new donut chart colors, and green/orange status badges. None from the defined palette.

Typography It Ignored

The design system defined every heading as Medium weight (500). Dashboard stat numbers render as Bold or Semibold. The AI promoted font weights without updating the system.

The Element Inspector Findings

The "completed" badge showed four violations in a single element: font-family ui-sans-serif, a system font rather than the defined font; background #10B981 mapped to --chart-5, a chart token pressed into service as a status color; color #FFFFFF with no token reference, hardcoded white; and padding 2px 8px, where 2px sits outside the 4px spacing scale.
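That padding finding is mechanically checkable. A minimal sketch, assuming the scale from Prompt 2 runs 4px to 64px in the steps below (the exact steps are my assumption, not the system's published values):

```typescript
// Flag computed spacing values that fall outside an assumed 4px-to-64px scale.
// Zero is allowed; anything else must land on a scale step.
const SPACING_SCALE = new Set([4, 8, 12, 16, 24, 32, 48, 64]);

function offScale(padding: string): number[] {
  return padding
    .split(/\s+/)
    .map((v) => parseFloat(v))
    .filter((n) => n !== 0 && !SPACING_SCALE.has(n));
}

console.log(offScale("2px 8px")); // the badge's 2px is off-scale
```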

The chart subtitle used --muted-foreground correctly but dropped from Medium (500) to Regular (400) weight. The AI sometimes references its own tokens and sometimes ignores them. It has access to the system but does not consistently use it.

The Missing Icon

OverlayQA flagged a missing icon in the Active Users stats card. Four identical cards, three with icons, one without. A visual bug a human reviewer might miss because the cards look structurally the same at a glance.

Why AI Design Systems Drift From Themselves

Traditional design system drift happens over months. AI design system drift happens in a single session, between two consecutive prompts, by the same tool. Three patterns explain why: no enforcement layer (tokens exist as documentation, not constraints), context window limitations (tokens get approximated instead of looked up), and pattern matching over rule following (the AI generates what looks like a dashboard from training data rather than following the rules it just defined).

A Design System Without Enforcement Is a Mood Board

The AI generated a design system with a Token Reference table and Implementation Guide. Then it built a product that ignores most of the tokens. A design system is only as good as its enforcement. If you are using AI to generate design systems, the output needs automated, element-level inspection that checks computed CSS values against defined tokens.
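What that element-level check could look like, as a sketch: a hypothetical token map plus a pass over an element's computed values, flagging any color the system never defined. The function name and token values are mine for illustration, not OverlayQA's API or the real system's palette:

```typescript
// Hypothetical enforcement pass: flag any computed hex color that matches
// no defined token. Token names and values here are illustrative only.
const TOKENS: Record<string, string> = {
  "--chart-5": "#10B981",
  "--muted-foreground": "#6B7280",
};

type ComputedStyle = Record<string, string>;

function auditColors(style: ComputedStyle, tokens: Record<string, string>): string[] {
  const defined = new Set(Object.values(tokens).map((v) => v.toLowerCase()));
  const violations: string[] = [];
  for (const [prop, value] of Object.entries(style)) {
    if (value.startsWith("#") && !defined.has(value.toLowerCase())) {
      violations.push(`${prop}: ${value} matches no defined token`);
    }
  }
  return violations;
}

// The "completed" badge: #10B981 resolves to a token (its semantic misuse as
// a status color is a separate check), but hardcoded #FFFFFF gets flagged.
const badge = { "background-color": "#10B981", color: "#FFFFFF" };
console.log(auditColors(badge, TOKENS));
```

A check like this only catches values with no token at all; catching a chart token used as a status color would additionally require mapping tokens to allowed roles.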

Frequently Asked Questions

Can AI tools generate a usable design system?

AI tools can generate the visual structure of a design system but the output tends to be shallow. In this test, Figma Make produced only 4 colors with no state variants, uniform heading weights, and zero spacing tokens.

Does the AI use its own design system when building a product?

Inconsistently. Some tokens were correctly referenced. Others were misapplied (a chart token used as a status color) or ignored entirely (hardcoded white, 2px padding outside the defined spacing scale).

How is AI design system drift different from normal drift?

Traditional drift accumulates over months across multiple developers. AI drift happens in a single session, between two consecutive prompts, by the same tool.

How do you detect AI design system drift?

Element-level CSS inspection. Click an element on the published page, check its computed values against the design system's defined tokens. OverlayQA surfaces token mappings and flags values that do not match.

What should you do after AI generates a design system?

Treat it as a starting point, not a finished system. Verify token completeness, check that published output displays correctly, then audit every page built with the system against its defined tokens.
