Same Feature, Two Approaches: Vibe Coding vs Spec Coding
Vibe coding is intoxicating. You describe what you want in plain language, the AI writes the code, and ten minutes later you have a working endpoint. I was sold — until I shipped a refund feature that way and spent the next two weeks patching bugs that a 90-minute spec would have prevented entirely. This article walks through both paths side by side, using the exact same requirement, so you can see where the gap opens up.
The requirement
An e-commerce platform needs an order refund feature. The product manager's brief is straightforward:
- Support full and partial refunds
- Call the payment gateway (Stripe-style) to reverse the charge
- Track refund status: pending, processing, succeeded, failed
- Customer support agents trigger refunds through an internal tool
Simple enough, right? Both approaches start here. What happens next is where they diverge — dramatically.
Path A: Vibe coding
The vibe coding approach starts with a prompt:
"Build me an order refund API in Node.js. Support full and partial refunds. Call a payment gateway to reverse the charge. Track refund status. Use Express and Postgres."
Sixty seconds later, the AI produces a clean RefundController with createRefund and getRefundStatus endpoints. It validates that the order exists, checks the refund amount against the order total, calls a paymentGateway.refund() method, and saves the result. The code looks professional. The happy path works.
Ship it.
Bug #1: The double refund
A support agent clicks the refund button, the page hangs for a second, they click again. Two refunds go through for the same order. The original code has no idempotency check.
Fix prompt: "Add a check to prevent duplicate refunds for the same order."
The AI adds a database query: if an identical refund (same order, same amount) already exists, reject the request. This works, until it doesn't.
Bug #2: Partial refund overflow
A customer bought a $200 order. Support issues a $50 partial refund, then another $80, then another $100. Total refunded: $230 — more than the order. The duplicate check only looks for exact duplicates, not cumulative amounts.
Fix prompt: "Track cumulative refund amounts and reject refunds that would exceed the order total."
The AI adds a SUM(amount) query. But it's not wrapped in a transaction with the insert, so two concurrent partial refunds can both pass the check.
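To make that race window concrete, here is a minimal TypeScript sketch that replays the interleaving by hand. The in-memory array stands in for the refunds table, and every name in it (`refundedSoFar`, `passesCheck`, `insertRefund`) is hypothetical rather than the code the AI actually generated.

```typescript
// Hypothetical in-memory stand-in for the refunds table (amounts in cents).
type Refund = { orderId: string; amountCents: number };

const ORDER_TOTAL_CENTS = 20000; // the $200 order
const refunds: Refund[] = [{ orderId: "order-1", amountCents: 5000 }];

// Rough equivalent of SELECT SUM(amount) for one order.
function refundedSoFar(orderId: string): number {
  return refunds
    .filter((r) => r.orderId === orderId)
    .reduce((sum, r) => sum + r.amountCents, 0);
}

// Step 1 of the patched handler: the cumulative check.
function passesCheck(orderId: string, amountCents: number): boolean {
  return refundedSoFar(orderId) + amountCents <= ORDER_TOTAL_CENTS;
}

// Step 2: the insert, which runs separately from the check.
function insertRefund(orderId: string, amountCents: number): void {
  refunds.push({ orderId, amountCents });
}

// Two $100 requests, interleaved the way concurrent requests can be:
// both run the check before either one inserts.
const requestAOk = passesCheck("order-1", 10000); // sees 5000 refunded
const requestBOk = passesCheck("order-1", 10000); // still sees 5000 refunded
if (requestAOk) insertRefund("order-1", 10000);
if (requestBOk) insertRefund("order-1", 10000);

// 25000 cents refunded on a 20000-cent order.
const totalRefundedCents = refundedSoFar("order-1");
```

Each request is individually valid; the over-refund only appears because the check and the insert are not one atomic step.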
Bug #3: Gateway timeout
The payment gateway times out. The refund record is created in the database with status "processing," but the gateway response never arrives. The refund is stuck. Support can see it's "processing" but can't retry it — the duplicate check blocks them. Did the money actually leave? Nobody knows.
Fix prompt: "Add retry logic for gateway timeouts and a way to check refund status with the gateway."
The AI adds a retry loop with no exponential backoff, no idempotency key on the gateway call, and no timeout on the retry itself. The retry can now create a duplicate refund on the gateway side.
Bug #4: Race condition
Two support agents process refunds for the same order at the same time. Both pass the cumulative amount check (the first refund hasn't been committed yet), both hit the gateway, both succeed. The customer gets refunded twice.
Fix prompt: "Add locking to prevent concurrent refund processing for the same order."
At this point, the code has been patched four times. Each patch was a reasonable fix in isolation, but the overall architecture is a patchwork. There's no clear state machine, no documented invariants, no test coverage for the interaction between patches.
The real cost
The first version took 10 minutes. The four patches took two weeks — including investigation time, testing, customer support escalations, and one manual reconciliation of gateway records. The "fast" approach wasn't fast. It just front-loaded the dopamine and back-loaded the pain.
Path B: Spec coding
Same requirement. Same AI. Different starting point: we write the spec first.
The spec
# Feature: Order Refund Processing

## Goal
Process refunds for e-commerce orders safely, ensuring no over-refund, no duplicate processing, and correct gateway reconciliation.

## Non-Goals
- Customer self-service refund portal (future phase)
- Refund reason analytics and reporting
- Automated refund approval rules

## State Machine
pending → processing → succeeded
pending → processing → failed → pending (retry)

Only one refund may be in "processing" state per order at any time.

## Acceptance Criteria
Given an order with total $200 and $0 previously refunded
When a support agent requests a $50 refund
Then a refund record is created with status "pending"
And the gateway is called with an idempotency key
And on gateway success, status moves to "succeeded"
And the refundable balance is now $150.

Given an order with total $200 and $150 already refunded
When a support agent requests a $75 refund
Then the request is rejected with "exceeds refundable balance"
And no gateway call is made.

Given a refund in "processing" state
When another refund request arrives for the same order
Then the request is rejected with "refund already in progress"
And no gateway call is made.

Given a refund in "processing" state
When the gateway times out
Then the refund status remains "processing"
And a background job retries with exponential backoff
And the retry uses the same idempotency key
And after 3 failures, status moves to "failed"
And an alert is sent to the payments team.

## Edge Cases
- Concurrent requests: Use SELECT FOR UPDATE on the order row before checking refundable balance
- Idempotency: Each refund attempt gets a UUID; gateway calls include this as the idempotency key
- Partial refund precision: All amounts in cents (integer), no floating-point
- Gateway reconciliation: Nightly job compares local refund records against gateway settlement report

## Rollback Plan
- Feature flag: refund_processing_v2
- Rollback disables new refund creation; in-flight refunds continue processing via the background job
- No database migration rollback needed (additive schema only)
This spec took 90 minutes to write and 30 minutes to review with the team. Every edge case that bit us in Path A? It's answered here — before a single line of code was written.
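One way to keep code honest against the spec's state machine is to encode it as a transition table. The sketch below is an illustration, not the generated implementation; `canTransition` and `transition` are hypothetical names, and treating "succeeded" as terminal is my reading of the spec.

```typescript
type RefundStatus = "pending" | "processing" | "succeeded" | "failed";

// Allowed transitions, copied from the spec's state machine.
const ALLOWED_TRANSITIONS: Record<RefundStatus, RefundStatus[]> = {
  pending: ["processing"],
  processing: ["succeeded", "failed"],
  failed: ["pending"], // the retry path
  succeeded: [], // assumed terminal: the spec lists no way out
};

function canTransition(from: RefundStatus, to: RefundStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}

// Throws instead of silently writing an illegal status.
function transition(from: RefundStatus, to: RefundStatus): RefundStatus {
  if (!canTransition(from, to)) {
    throw new Error("Illegal refund transition: " + from + " -> " + to);
  }
  return to;
}
```

Routing every status write through a function like `transition` means a skipped state (for example pending straight to succeeded) fails loudly in tests rather than surfacing as a stuck refund.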
The implementation
Now we prompt the same AI — but with the spec as context:
"Implement the refund feature described in this spec. Follow the state machine exactly. Use SELECT FOR UPDATE for concurrency control. Include the idempotency key in all gateway calls. All amounts in cents." [paste spec]
The output is structurally different. The AI generates:
- A processRefund function wrapped in a database transaction with SELECT FOR UPDATE
- Cumulative refund balance check inside the transaction
- An idempotency key (UUID) generated at refund creation and passed to every gateway call
- A background retry job with exponential backoff, capped at 3 attempts
- Proper state transitions that match the spec's state machine
- Input validation rejecting amounts that would exceed the refundable balance
Same AI. Same capability. Dramatically different output — because the input was dramatically different. The AI didn't get smarter; it got better constraints.
The same scenarios, pre-handled
Double refund? The SELECT FOR UPDATE lock prevents concurrent processing. The idempotency key prevents duplicate refunds at the gateway. Both are in the first version, not patched in later.
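The idempotency half of that answer can be sketched with a fake gateway that remembers the result for each key. `FakeGateway` and its fields are hypothetical, and a real payment gateway's idempotency behavior (key lifetime, parameter matching) will differ in detail, so treat this as a model of the idea only.

```typescript
type GatewayRefund = { gatewayRefundId: string; amountCents: number };

// Hypothetical in-memory gateway: stores one result per idempotency key.
class FakeGateway {
  private seen: Record<string, GatewayRefund> = {};
  private counter = 0;
  public moneyMovedCents = 0;

  refund(idempotencyKey: string, amountCents: number): GatewayRefund {
    const previous: GatewayRefund | undefined = this.seen[idempotencyKey];
    if (previous !== undefined) {
      return previous; // same key: replay the stored result, move no money
    }
    this.counter += 1;
    this.moneyMovedCents += amountCents;
    const result: GatewayRefund = {
      gatewayRefundId: "gr_" + this.counter,
      amountCents,
    };
    this.seen[idempotencyKey] = result;
    return result;
  }
}

const gateway = new FakeGateway();
// In the spec this key is a UUID generated when the refund is created.
const attemptKey = "refund-attempt-1";

const firstCall = gateway.refund(attemptKey, 5000);
const retryCall = gateway.refund(attemptKey, 5000); // e.g. a retry after a timeout
```

Both calls return the same gateway refund, and the money moves once. The key has to be created with the refund record and reused on every retry; generating a fresh key per call would defeat the mechanism.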
Partial refund overflow? The cumulative balance check runs inside the same transaction as the insert. No race window.
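The balance check itself is small once amounts are integer cents. The function below is a sketch of the pure decision only; in the implementation described above it would run inside the transaction, after the row lock is taken. `checkRefundable` and the "invalid amount" rejection are my own additions for illustration; the spec only names "exceeds refundable balance".

```typescript
type RefundDecision =
  | { outcome: "accepted"; remainingCents: number }
  | { outcome: "rejected"; reason: string };

// All amounts are integer cents, per the spec's precision rule.
function checkRefundable(
  orderTotalCents: number,
  alreadyRefundedCents: number,
  requestedCents: number
): RefundDecision {
  // Assumption: non-integer or non-positive amounts are rejected up front.
  if (Math.floor(requestedCents) !== requestedCents || requestedCents <= 0) {
    return { outcome: "rejected", reason: "invalid amount" };
  }
  const refundableCents = orderTotalCents - alreadyRefundedCents;
  if (requestedCents > refundableCents) {
    return { outcome: "rejected", reason: "exceeds refundable balance" };
  }
  return {
    outcome: "accepted",
    remainingCents: refundableCents - requestedCents,
  };
}

// The two balance scenarios from the acceptance criteria:
const firstScenario = checkRefundable(20000, 0, 5000); // $50 of $200
const secondScenario = checkRefundable(20000, 15000, 7500); // $75 with $50 left
```

The first scenario is accepted with 15000 cents remaining; the second is rejected because only 5000 cents are still refundable.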
Gateway timeout? The background job retries with the same idempotency key. The gateway treats it as a safe retry, not a new refund. After 3 failures, the team gets alerted.
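The retry policy can be expressed as a pure decision function that the background job consults after each failed attempt. This is a sketch under assumptions: the 1-second base delay and the names `planAfterFailure` and `RetryPlan` are mine, since the spec fixes only the cap of 3 failures and the reuse of the idempotency key.

```typescript
type RetryPlan =
  | { action: "retry"; delayMs: number; idempotencyKey: string }
  | { action: "fail_and_alert" };

const MAX_FAILURES = 3; // from the spec: after 3 failures, status moves to "failed"
const BASE_DELAY_MS = 1000; // assumption: the spec does not state a base delay

// Decide what happens after the Nth failed gateway attempt (1-based count).
function planAfterFailure(failureCount: number, idempotencyKey: string): RetryPlan {
  if (failureCount >= MAX_FAILURES) {
    return { action: "fail_and_alert" };
  }
  // Exponential backoff: 1st failure waits 1s, 2nd failure waits 2s.
  const delayMs = BASE_DELAY_MS * Math.pow(2, failureCount - 1);
  // The key passed in is returned unchanged so every retry reuses it.
  return { action: "retry", delayMs, idempotencyKey };
}
```

Keeping the decision separate from the scheduling makes the policy easy to test without timers; production systems often add random jitter to the delay, which this sketch omits.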
Race condition? SELECT FOR UPDATE serializes all refund operations for a given order. The second request waits for the first to complete, then sees the updated balance.
Side by side
| Dimension | Vibe Coding | Spec Coding |
|---|---|---|
| Time to first working endpoint | 10 minutes | 3 hours (incl. spec) |
| Time to production-ready | 2+ weeks | 4 hours |
| Bugs found in production | 4 critical | 0 |
| Customer-facing incidents | 2 (double refund, over-refund) | 0 |
| Code architecture | Patchwork of reactive fixes | Coherent, matches spec |
| AI output quality | Happy path only | Covers all specified edge cases |
| Onboarding a new developer | Read code + Slack threads + incident reports | Read the spec |
| Confidence in shipping | Low — "what else will break?" | High — acceptance criteria verified |
The vibe coding path was "faster" for exactly 10 minutes. After that, it was slower in every dimension that matters.
The lesson isn't "don't use AI"
Both paths used the same AI. The difference was the input, not the tool. Vibe coding gives the AI freedom to make decisions you haven't made yet. Spec coding makes those decisions explicit first, then hands the AI a constrained problem with a clear definition of correct.
AI coding tools are force multipliers. But a force multiplier applied to an unclear direction multiplies the confusion. Applied to a clear spec, it multiplies the precision.
Vibe coding has its place — prototyping, exploration, throwaway scripts, hackathons. These are contexts where edge cases don't matter because the code won't see production traffic. The moment you're building something that handles real money, real users, or real data, you need the spec.
The 90 minutes I spent writing the refund spec saved two weeks of incident response. That's not a productivity trick. That's a fundamentally different way of working.
If you're ready to make the switch, the 30-day adoption plan is a good starting point. And if you want to see how specs improve AI-generated code specifically, the AI prompts guide goes deeper on prompt structure.
What I would actually ship first
For the refund feature, I would not start by building every refund branch. I would ship the smallest behavior that proves the contract: one full refund path, idempotency, and the support-visible state for pending provider confirmation.
First shippable slice:
- Full refund only
- Idempotency key required
- Pending provider confirmation is visible to support
- Duplicate click returns original refund_id
- Rollback disables refund creation but keeps status lookup

Deferred:
- Partial refunds
- Multi-currency adjustments
- Bulk refund actions
- Automated refund reason classification
That slice is smaller than the product dream but bigger than a UI demo. It proves the risky decisions before adding breadth.
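The rollback line in that slice is worth pinning down in code, because "disable the feature" is easy to implement as "disable everything". A minimal sketch, assuming an in-memory flag map and refund store (both hypothetical stand-ins for a flag service and the database); the flag name comes from the spec's rollback plan.

```typescript
type RefundRecord = { refundId: string; state: string };

const REFUND_FLAG = "refund_processing_v2"; // flag name from the spec
const flags: Record<string, boolean> = { [REFUND_FLAG]: true };
const refundStore: Record<string, RefundRecord> = {};

// Creation is gated by the flag.
function createRefund(refundId: string): RefundRecord {
  if (!flags[REFUND_FLAG]) {
    throw new Error("refund creation is disabled by rollback");
  }
  const record: RefundRecord = { refundId, state: "pending" };
  refundStore[refundId] = record;
  return record;
}

// Status lookup deliberately ignores the flag, so support can still see
// refunds that were created before the rollback.
function getRefundStatus(refundId: string): string | undefined {
  const record: RefundRecord | undefined = refundStore[refundId];
  return record === undefined ? undefined : record.state;
}

createRefund("refund-1");
flags[REFUND_FLAG] = false; // rollback
const statusAfterRollback = getRefundStatus("refund-1"); // still "pending"
```

After the flag flips, new calls to `createRefund` throw while the existing refund remains visible, which is the behavior the slice promises to support agents.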
Starter Review Block to Copy
Use this as the smallest practical artifact when a team is trying spec-first on a real change. It is deliberately short so it can live inside a ticket or PR.
Spec-first starter block: Same Feature, Two Approaches: Vibe Coding vs Spec Coding

Decision to make:
- Same order refund feature built two ways: vibe coding ships fast but drowns in edge-case bugs,…

Owner check:
- Product owner:
- Engineering owner:
- QA or operations reviewer:

Scope boundary:
- In scope:
- Out of scope:
- Assumption that still needs approval:

Acceptance evidence:
- Test or fixture:
- Log, metric, or screenshot:
- Manual review step:

Scope boundary: the reviewer must be able to reject unclear goals, missing non-goals, and criteria with no evidence.

Reviewer prompt:
- What would still be ambiguous to someone who missed the planning meeting?
- What evidence would make this safe enough to ship?
Editorial Review Note
Reviewed Apr 28, 2026. This update added a reusable artifact, checked the article against the related topic hub, and tightened the next-step links so the page works as a practical reference rather than a standalone essay.
Editorial note
This article covers vibe coding vs spec coding for software delivery teams. The refund scenario is an illustrative engineering example, not financial or legal advice.
- Author details: Daniel Marsh