Bolt, Figma Make, and Lovable Ship Fast — With ~160 Bugs Per App
Quick answer: AI app builders like Bolt, Figma Make, and Lovable average ~160 issues per generated app. The most common problems are misaligned layouts, broken CSS, ignored design tokens, and accessibility violations.
AI app builders promise production-ready code in minutes. Lovable, Bolt, and Figma Make can generate a full application from a single prompt. But "generated" is not the same as "shipped." When you run the output through a structured QA pass, the numbers tell a different story: an average of ~160 issues per app, ranging from subtle layout drift to hard accessibility failures.
The Promise vs. the Reality
AI-generated apps look impressive in demos. The layout appears correct, components render, and interactions mostly work. But visual quality requires more than "mostly works." It requires precision — and precision is exactly where current AI code generation falls short. The generated code compiles and runs, but the visual output drifts from what a designer would approve.
What We Found: ~160 Issues Per App
We ran structured QA audits on apps generated by Lovable, Bolt, and Figma Make. Each app was evaluated against standard design quality criteria: layout accuracy, responsive behavior, typography consistency, color and contrast compliance, interactive states, and accessibility. The average across all tools was approximately 160 issues per generated application.
40-50% Are Layout Problems
Layout issues account for the largest category — between 40% and 50% of all findings. These include misaligned elements, inconsistent spacing, broken flex/grid behavior at certain breakpoints, and containers that overflow or collapse unexpectedly. AI builders generate layout code that works at one viewport size but breaks at others.
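This kind of breakage is mechanically checkable. Below is a minimal sketch of the core test, assuming you have already collected each element's bounding box at a given viewport width (in a browser you would gather these with `getBoundingClientRect()`); the `Box` shape and `findOverflows` name are illustrative, not from any of the tools above.

```typescript
// Sketch: flag elements whose bounding box extends past the viewport.
// In a real audit you would collect boxes at each breakpoint and rerun
// this check per viewport width.
interface Box {
  selector: string; // how to locate the element, for reporting
  left: number;     // x-offset in CSS pixels
  width: number;    // rendered width in CSS pixels
}

function findOverflows(boxes: Box[], viewportWidth: number): string[] {
  return boxes
    .filter((b) => b.left < 0 || b.left + b.width > viewportWidth)
    .map((b) => b.selector);
}

// Example: a hero image sized for desktop overflows a 320px viewport.
const boxes: Box[] = [
  { selector: ".nav", left: 0, width: 320 },
  { selector: ".hero img", left: 0, width: 480 },
];
console.log(findOverflows(boxes, 320)); // [".hero img"]
```

Running the same check at several widths is what turns "works at one viewport size" into a verifiable claim rather than a demo impression.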
Broken CSS and Ignored Design Tokens
Generated code frequently uses hardcoded pixel values instead of design tokens. Font sizes, spacing, and colors are inlined rather than referenced from a system. This means every component is a one-off — there's no design system consistency even when the prompt references one. CSS issues include incorrect stacking contexts, missing overflow handling, and specificity conflicts.
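One way to surface these one-offs is a crude scan of generated CSS for literal values that should be token references. A rough sketch follows; the regexes catch common cases (raw hex colors and pixel lengths), not every CSS value form, and a design-system-aware audit would instead compare each value against the project's actual token definitions.

```typescript
// Rough sketch: flag hardcoded hex colors and px lengths in CSS text.
// Values referenced via custom properties (var(--token)) pass untouched.
function findHardcodedValues(css: string): string[] {
  const hexColors = css.match(/#[0-9a-fA-F]{3,8}\b/g) ?? [];
  const pxLengths = css.match(/\b\d+(\.\d+)?px\b/g) ?? [];
  return [...hexColors, ...pxLengths];
}

const generated = `
.card { color: #333333; padding: 17px; }
.card--tokenized { color: var(--text-primary); padding: var(--space-4); }
`;
console.log(findHardcodedValues(generated)); // ["#333333", "17px"]
```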
Accessibility Violations
Every generated app we tested had accessibility issues: missing alt text, insufficient color contrast, missing form labels, incorrect heading hierarchy, and interactive elements that can't be reached by keyboard. These aren't edge cases — they're baseline WCAG 2.1 AA requirements that AI builders consistently miss.
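The contrast failures in particular are cheap to verify. This sketch implements the WCAG 2.1 contrast-ratio formula (relative luminance of each color, then the ratio of the lighter to the darker); body text needs at least 4.5:1 for AA.

```typescript
// WCAG 2.1 relative luminance: linearize each sRGB channel, then
// weight by the standard coefficients.
function channel(v: number): number {
  const c = v / 255;
  return c <= 0.03928 ? c / 12.92 : ((c + 0.055) / 1.055) ** 2.4;
}

function luminance(r: number, g: number, b: number): number {
  return 0.2126 * channel(r) + 0.7152 * channel(g) + 0.0722 * channel(b);
}

function contrastRatio(
  fg: [number, number, number],
  bg: [number, number, number]
): number {
  const [l1, l2] = [luminance(...fg), luminance(...bg)].sort((a, b) => b - a);
  return (l1 + 0.05) / (l2 + 0.05);
}

// Black on white is the maximum possible contrast, 21:1.
console.log(Math.round(contrastRatio([0, 0, 0], [255, 255, 255]))); // 21
// A common AI-builder miss: light gray text on white fails AA.
console.log(contrastRatio([170, 170, 170], [255, 255, 255]) >= 4.5); // false
```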
Tool-by-Tool Findings
Each AI builder has distinct patterns in the types of issues it generates. Lovable tends to produce cleaner component structure but struggles with responsive edge cases. Bolt generates more complete features but with higher CSS specificity conflicts and hardcoded values. Figma Make produces layout-faithful output at the reference viewport but breaks down at other sizes and consistently ignores interaction states.
Why AI Builders Produce These Issues
AI code generation optimizes for "looks right at first glance." The models are trained on code that compiles and renders, not code that passes a structured design review. They lack context about design systems, responsive requirements, accessibility standards, and the interaction states that real products need. The result is code that appears complete but isn't production-ready.
The 7-Point QA Checklist for AI-Generated Apps
Before shipping any AI-generated app, run through this checklist. These checks catch the most common issues.
- Layout scan. Resize the browser from 320px to 1440px. Watch for overflow, collapsed containers, misaligned elements, and broken grid/flex behavior. This single step catches 40-50% of all issues.
- Responsive test. Check three breakpoints: mobile (375px), tablet (768px), and desktop (1280px). Verify navigation, cards, and form layouts adapt correctly at each size.
- Typography. Confirm font family, weight, size, and line-height match the design spec. AI builders frequently substitute similar-looking but incorrect values.
- Color and contrast. Verify primary, secondary, and text colors against the design system. Run a contrast check on text against its background — WCAG AA requires 4.5:1 for body text.
- Interactive states. Hover every button, focus every input, click every link. Check for missing hover states, focus indicators, active states, and disabled styling.
- Accessibility. Tab through the entire page. Verify all images have alt text, all form inputs have labels, and heading levels are sequential (h1, h2, h3 — no skipping).
- Empty and error states. What happens when a list has zero items? When a form submission fails? When an image fails to load? AI builders almost never generate these states.
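Several of these checks can be scripted rather than eyeballed. As one example, the heading-hierarchy rule from the accessibility step reduces to a scan for skipped levels. A sketch over an already-extracted sequence of heading levels (in a browser you would collect them with `document.querySelectorAll("h1,h2,h3,h4,h5,h6")`):

```typescript
// Sketch: return the indexes where a heading jumps more than one
// level deeper than its predecessor (e.g. h1 straight to h3).
function findSkippedHeadings(levels: number[]): number[] {
  const issues: number[] = [];
  for (let i = 1; i < levels.length; i++) {
    if (levels[i] > levels[i - 1] + 1) issues.push(i);
  }
  return issues;
}

console.log(findSkippedHeadings([1, 2, 3, 2, 3])); // [] — sequential is fine
console.log(findSkippedHeadings([1, 3, 4]));       // [1] — h1 jumps to h3
```

Moving back up the hierarchy (h3 back to h2) is allowed; only skips on the way down are flagged.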
From 160 Issues to Shippable
The point isn't that AI app builders are bad — they're remarkably capable for first drafts. The point is that first drafts need QA. The seven-point checklist above takes a few minutes and catches the majority of issues. For teams shipping AI-generated code regularly, a structured design QA step turns "generated" into "production-ready" and prevents visual bugs from reaching users.