QA for Vibe Coders: The Visual Bugs AI Won't Catch

Quick answer: Vibe coding QA is the process of checking AI-generated apps for visual bugs, accessibility failures, and design drift before shipping. In automated testing, AI code tools produce approximately 160 issues per app. A structured QA pass catches the problems that AI cannot see in its own output.

AI code tools like Bolt.new, Lovable, Cursor, and v0 can scaffold a functional app from a natural language prompt in minutes. But every tool in the vibe coding ecosystem optimizes for scaffolding speed, not for the last 30% of polish that separates a prototype from a product. That last 30% is QA territory.

What Is Vibe Coding QA?

Vibe coding is the practice of building software by describing what you want in natural language and letting an AI write the code. Vibe coding QA is the quality assurance process specifically adapted for AI-generated output. It shares the same foundations as traditional design QA, but the priority order shifts because AI-generated apps have different failure patterns.

Why AI-Generated Code Needs More QA, Not Less

Jason Arbon ran 1,000+ automated checks against apps built with Lovable, Bolt.new, and similar tools. The result: approximately 160 issues per app. Bolt.new and Lovable produced statistically equivalent results (p=0.7199). A SmartBear survey found that 68% of teams say faster AI-assisted development creates testing bottlenecks.

The Fix Loop Tax

The most expensive consequence of skipping QA is the fix loop. You prompt the AI to fix a bug. It rewrites the file. The fix works but breaks two other things. Each cycle burns 3-5 million tokens. One developer reported using 20 million tokens trying to fix a single authentication bug.

The Visual Bugs AI Won't Catch

1. Responsive Layout Failures

AI tools default to Tailwind breakpoints and generate layouts that break between them. Container overflow, flex wrap chaos, hidden content, and touch target compression are the most common symptoms.

2. Design System Drift

AI code tools are trained on ShadCN and Tailwind defaults. When your design uses different values, the output drifts toward training data. Inconsistent spacing, color substitution, typography mismatch, and component variant drift result.

3. Accessibility Gaps

AI-generated code routinely fails basic accessibility checks: missing alt text, color contrast failures, no focus indicators, heading hierarchy breaks, and missing form labels. For teams subject to ADA Title II compliance, these failures create legal exposure.
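Some of these gaps are cheap to detect mechanically. As a minimal sketch using only Python's standard library, the check below flags `<img>` tags with no `alt` attribute in rendered HTML (the class name and sample markup are illustrative; a fuller pass would also cover contrast, focus indicators, and heading order):

```python
# Sketch: flag <img> tags that lack an alt attribute entirely.
# Note: alt="" is left alone, since empty alt is valid for decorative images.
from html.parser import HTMLParser

class AltTextChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.issues = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attrs = dict(attrs)
            if "alt" not in attrs:
                self.issues.append(f"<img src={attrs.get('src', '?')!r}> missing alt attribute")

checker = AltTextChecker()
checker.feed('<img src="hero.png"><img src="logo.png" alt="Company logo">')
print(checker.issues)  # flags hero.png only
```

A check like this catches the omission class, but it cannot judge whether the alt text that is present actually describes the image; that part stays manual.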

4. Missing Interaction States

AI generates the default state of every component but frequently skips hover/focus states, loading states, error states, empty states, and disabled states.

The Vibe Coding QA Checklist

  1. Layout & Spacing. Container overflow, flex/grid alignment, z-index stacking, consistent margins/padding.
  2. Typography. Font family, size, weight, line-height, letter-spacing against spec.
  3. Color & Contrast. Hex values on primary elements, hover/focus state contrast (WCAG AA 4.5:1).
  4. Responsive. Check 375px, 768px, and 1440px, then resize continuously to catch gaps between breakpoints.
  5. Forms & Interactions. Validation states, input types, submit behavior, keyboard navigation.
  6. Navigation & Routing. Every link works, active states present, back button behavior, deep linking.
  7. Accessibility. Alt text, heading hierarchy, focus indicators, form labels, ARIA attributes.
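The WCAG AA 4.5:1 threshold in the color check is pure arithmetic, so it can be scripted rather than eyeballed. This is a minimal sketch of the WCAG 2.x relative-luminance and contrast-ratio formulas (function names are my own):

```python
# WCAG 2.x contrast ratio between two hex colors.
def _channel(c: int) -> float:
    """Linearize one sRGB channel (0-255) per the WCAG formula."""
    c = c / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def luminance(hex_color: str) -> float:
    r, g, b = (int(hex_color.lstrip("#")[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _channel(r) + 0.7152 * _channel(g) + 0.0722 * _channel(b)

def contrast_ratio(fg: str, bg: str) -> float:
    lighter, darker = sorted((luminance(fg), luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

print(round(contrast_ratio("#000000", "#ffffff"), 1))  # 21.0, the maximum
print(contrast_ratio("#767676", "#ffffff") >= 4.5)     # True: passes AA for body text
```

Run any suspect foreground/background pair through this before filing it as a contrast bug; anything below 4.5:1 for body text (3:1 for large text) fails AA.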

A Repeatable QA Workflow

  1. Screenshot at three viewports (375px, 768px, 1440px).
  2. Run the 7-category checklist. Document every issue. Do not fix anything yet.
  3. Categorize by severity: broken functionality, visual bugs, polish items.
  4. Batch fixes by file. One prompt per file with all issues listed together.
  5. Re-QA after every batch. AI fixes frequently introduce new issues.
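Step 4, batching by file, can be sketched as a simple grouping pass over your issue inventory so each file yields one consolidated fix prompt (the issue tuples and prompt wording here are illustrative, not a prescribed format):

```python
# Sketch: group a QA issue inventory by file, one fix prompt per file (step 4).
from collections import defaultdict

issues = [
    ("src/Header.tsx", "logo overflows container below 375px"),
    ("src/Form.tsx", "submit button missing disabled state"),
    ("src/Header.tsx", "nav links have no focus indicator"),
]

by_file = defaultdict(list)
for path, description in issues:
    by_file[path].append(description)

for path, descriptions in by_file.items():
    bullet_list = "\n".join(f"- {d}" for d in descriptions)
    print(f"Fix all of the following issues in {path}:\n{bullet_list}\n")
```

Grouping this way is what makes the re-QA step tractable: each batch touches one file, so when a fix breaks something new, you know where to look.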

Does the AI Tool Matter?

Not as much as you would think. The Arbon study found statistically equivalent bug counts between Bolt.new and Lovable. The specific bugs differ by tool, but the categories are the same across Bolt.new, Lovable, Cursor, v0, and Figma Make. For tool-specific guidance, see How to QA a Bolt.new or Lovable App.

Frequently Asked Questions

Is vibe coding QA different from regular QA?

The checklist categories are the same but the priority order changes. AI-generated apps have higher rates of responsive layout failures, design system drift, and accessibility omissions.

How many bugs should I expect in a vibe-coded app?

Approximately 160 issues per app based on automated testing across 1,000+ checks, consistent across AI tools.

Can AI test its own code?

AI can run unit tests and catch functional regressions. It cannot visually verify that the output looks correct, meets accessibility standards, or matches design intent.

What is the fastest way to QA a vibe-coded app?

Use the 7-category checklist combined with a capture tool like OverlayQA that records CSS values, viewport dimensions, and element selectors in one click.

Do I need a design file to QA a vibe-coded app?

No. You can QA any live URL without a design file. A design file helps when comparing exact values against a spec. OverlayQA works with or without a connected Figma file.

Should I QA before or after prompting fixes?

Before. Always do a full QA pass and inventory all issues first. Batching fixes by file is 60-80% cheaper than fixing one at a time.

Related Resources