🩹Vibe Code Fix

The interactive checklist for shipping AI-generated code without blowing up prod.

25 things to verify before you ship code that Claude Code, Cursor, Copilot, or v0 wrote. Free, interactive, saves progress locally.

The short version

Vibe coding — the workflow where you prompt an AI to write most of your code — ships roughly 41% more security bugs than hand-written code when reviewers trust the output without verification (Stanford/UIUC, 2024). This 25-point checklist covers the failure modes that actually bite in production: hallucinated imports, missing auth checks, fake API calls, and the silent refactors that nuke a working feature while you skim the diff.

Hallucinations & Correctness

Did the AI make stuff up?

  • Critical

    LLMs love inventing npm packages and module paths that sound real. Run the code once — a `Cannot find module` error is the cheapest bug you'll ever catch.

  • Critical

    The AI may call `client.users.getByEmail()` when the real method is `client.getUserByEmail()`. These slip past TypeScript if the client is typed loosely.

  • High

    AI training data is a blur of versions. It may write React 18 hooks against React 19, or pre-v4 Tailwind syntax in a v4 project.

  • Critical

    When you paste a file back and ask for a change, the AI sometimes deletes unrelated functions, comments, or error-handling branches it didn't think were important.

  • Medium

    AI-generated code is prone to near-duplicates: two validation functions that look identical but differ by one character. One gets fixed, the other rots.
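The hallucinated-method item above has a cheap structural fix: type the client precisely instead of `any`, so a made-up method chain becomes a compile error instead of a runtime crash. A minimal sketch — the `Client` interface and method names here are illustrative, not a real SDK:

```typescript
// Hypothetical SDK client, typed precisely.
interface Client {
  getUserByEmail(email: string): { email: string };
}

const realClient: Client = {
  getUserByEmail: (email) => ({ email }),
};

// Loosely typed: `any` lets a hallucinated method chain compile...
const loose: any = realClient;
// loose.users.getByEmail("a@b.co"); // ...then crash at runtime: `users` is undefined

// Precisely typed: the same hallucination fails at compile time instead:
// realClient.users.getByEmail("a@b.co");
// ^ Property 'users' does not exist on type 'Client'
```

The difference only shows up if the client is actually typed — which is exactly why loosely typed wrappers let these slip through.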

Security

Can a stranger break in?

  • Critical

    AI eagerly inlines `OPENAI_API_KEY` or database passwords into React components. Anyone can open devtools and steal them.

  • Critical

    A surprisingly common pattern: the frontend hides the Delete button, but the `/api/delete` route has zero auth check. Anyone with the URL can call it.

  • Critical

    AI loves template literals. `` `SELECT * FROM users WHERE email = '${email}'` `` is a SQL injection waiting for its first apostrophe.

  • High

    The AI 'fixes' a file upload error by making the S3 bucket public. Now your user uploads are indexed by Google.

  • High

    To make the frontend 'just work', the AI opens CORS to the whole world. Now any site can hit your API with the user's cookies.

  • Medium

    `console.log(req.body)` includes the login form. Your log provider now has a plaintext password store.
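The SQL-injection item is easy to demonstrate without a database. A sketch of why the template literal is dangerous, and what the parameterized form looks like (the commented `pool.query` line assumes node-postgres; other drivers use `?` placeholders):

```typescript
// The vulnerable pattern: user input interpolated straight into SQL.
function vulnerableQuery(email: string): string {
  return `SELECT * FROM users WHERE email = '${email}'`;
}

// An attacker-controlled "email" rewrites the query:
const evil = "x' OR '1'='1";
const query = vulnerableQuery(evil);
// query is now: SELECT * FROM users WHERE email = 'x' OR '1'='1'
// — which matches every row in the table.

// The fix: let the driver handle quoting via placeholders.
// (node-postgres shown; not executed here)
// await pool.query("SELECT * FROM users WHERE email = $1", [email]);
```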

Edge Cases & UX

What happens when things go wrong?

  • High

    AI writes the happy path first. `.map()` on an array that might be undefined, `.toLowerCase()` on a value that might be null — classic production crash.

  • Medium

    `items.slice(page * pageSize, (page + 1) * pageSize)` vs `items.slice((page - 1) * pageSize, page * pageSize)` — both 'look right', but only one matches whether your `page` is 0- or 1-indexed. Pick the wrong one and the first or last page silently disappears.

  • Medium

    `new Date().toISOString()` is fine. `new Date('2026-04-10')` parses a date-only string as UTC midnight, so rendering it in any timezone behind UTC shows April 9: the classic off-by-one-day bug.

  • High

    Your OpenAI call, your email provider, your payments API — they all rate-limit. The AI rarely writes retry-with-backoff, so the first burst of traffic looks like a total outage.

  • Medium

    Button click → nothing visible for 2 seconds → user clicks again → duplicate submission. A missing loading state is a data corruption bug dressed up as a UX bug.

  • High

    The AI wrote `try { ... } catch (e) { console.error(e); }`. The error shows up in devtools. The user sees nothing happen and assumes the app is broken.
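The rate-limit item above usually wants a small retry wrapper. A sketch of retry-with-exponential-backoff — not tuned for any particular API, just the generic shape:

```typescript
// Generic retry-with-backoff wrapper. Wrap the flaky call; failed
// attempts wait baseMs, 2*baseMs, 4*baseMs... plus a little jitter.
async function withRetry<T>(
  fn: () => Promise<T>,
  retries = 3,
  baseMs = 200,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries) throw err; // out of retries: surface the error
      const delayMs = baseMs * 2 ** attempt + Math.random() * 50; // jitter
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

// Usage: withRetry(() => fetch("https://api.example.com/...").then(r => r.json()))
```

In production you'd also inspect the error before retrying — a 429 or 5xx is worth a retry, a 401 is not.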

Performance

Will it survive 100 users?

  • High

    The classic: loop over users, query each user's posts inside the loop. 50 users = 51 queries. The AI writes this constantly because it's the simplest-looking code.

  • Medium

    React component recomputes a filtered sorted list on every keystroke. At 10 items it's invisible. At 1,000 it freezes the page.

  • Medium

    AI imports `lodash` for one function, `moment` instead of `date-fns`, or pulls in a massive icon library when you needed one icon.

  • Medium

    Server route reads a 50MB file synchronously before responding. Each request blocks every other request on the same worker.
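The N+1 item above has a standard fix: fetch everything in one query, then group in memory. A sketch where an in-memory array stands in for the result of a single batched query (in SQL, something like `SELECT * FROM posts WHERE user_id = ANY($1)` — one round trip instead of 51):

```typescript
interface Post {
  userId: number;
  title: string;
}

// Group one batched result set by user, instead of querying per user.
function groupPostsByUser(rows: Post[]): Map<number, Post[]> {
  const byUser = new Map<number, Post[]>();
  for (const row of rows) {
    const bucket = byUser.get(row.userId) ?? [];
    bucket.push(row);
    byUser.set(row.userId, bucket);
  }
  return byUser;
}

// Stand-in for the rows a single batched query would return:
const rows: Post[] = [
  { userId: 1, title: "first" },
  { userId: 2, title: "hello" },
  { userId: 1, title: "second" },
];
const postsByUser = groupPostsByUser(rows);
// postsByUser.get(1) holds both of user 1's posts, from one "query"
```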

Ship-Readiness

Is it actually production-ready?

  • High

    Vibe-coded apps usually have zero tests. You don't need 100% coverage on day one — but one end-to-end test that proves the core flow works means you'll know the moment you break it.

  • High

    Deploy succeeds, server starts, first request crashes because `STRIPE_SECRET_KEY` is undefined. You find out from a user.

  • Medium

    Without error tracking, your first signal that prod is on fire is a tweet. Free tier of Sentry or Highlight takes 5 minutes to set up.

  • Critical

    The AI created a `.env` with your real keys, you added it to git, pushed to a public repo. GitHub's secret scanner usually catches this — but only after a bot has already grabbed it.
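The undefined-env-var item above is worth a few lines of boot-time code: crash with a clear message at startup instead of on the first request. A sketch — the variable names are examples, list whatever your app actually needs:

```typescript
// Fail fast at boot: verify required env vars before serving traffic.
function assertEnv(
  env: Record<string, string | undefined>,
  required: string[],
): void {
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
}

// Call once at startup, before the server starts listening:
// assertEnv(process.env, ["STRIPE_SECRET_KEY", "DATABASE_URL"]);
```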

Why this checklist exists

You're writing 10x more code with Claude Code, Cursor, or Copilot — and the failure modes changed. It's not 'I wrote a typo'. It's 'the AI confidently called an API that doesn't exist'. The bugs look like working code until they hit prod. This list is every failure mode I've personally shipped or fished out of someone else's vibe-coded repo, organized by blast radius.

How to use it

1. Pick a change to ship

A feature, a bug fix, a refactor. One logical unit.

2. Run the checklist

Click each item to see what to look for. Tick it when you've verified. Progress saves automatically.

3. Ship when you hit Grade A or higher

Criticals are weighted 5x, highs 3x, mediums 1x. Grade A means the important stuff is handled.
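The weighting works out to a simple weighted percentage. A sketch using the 5x/3x/1x weights stated above — the letter-grade cutoffs are the checklist's own internals, so this only computes the percentage:

```typescript
type Severity = "critical" | "high" | "medium";
const WEIGHT: Record<Severity, number> = { critical: 5, high: 3, medium: 1 };

interface Item {
  severity: Severity;
  done: boolean;
}

// Weighted progress: points earned over points possible, as a percentage.
function weightedProgress(items: Item[]): number {
  const total = items.reduce((sum, i) => sum + WEIGHT[i.severity], 0);
  const earned = items.reduce(
    (sum, i) => sum + (i.done ? WEIGHT[i.severity] : 0),
    0,
  );
  return total === 0 ? 0 : Math.round((earned / total) * 100);
}
```

So ticking one critical is worth five mediums — which is the point: the checkboxes that stop a breach count far more than the nice-to-haves.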
