Weak prompt
Use AI to fix the profile settings bug. It should save display name changes correctly. Clean up anything related while you are there.
This case shows how to review an AI-generated pull request that solves the visible request but quietly changes more than the team approved.
Feature: Save profile display name
Owner: Account Settings
Status: Ready for review

Goal:
- Persist display_name from account settings.
- Show a success state after save.
- Preserve existing email, avatar, and billing settings behavior.

Allowed paths:
- apps/web/routes/settings/profile.tsx
- packages/account/profile-service.ts
- packages/account/profile-service.test.ts

Non-goals:
- No billing settings changes.
- No auth/session changes.
- No settings layout redesign.
- No dependency upgrades.
- [ ] Reproduce display_name save failure.
- [ ] Patch profile update service only.
- [ ] Add regression test for display_name persistence.
- [ ] Keep email, avatar, billing, and auth flows unchanged.
- [ ] Add PR diff map before requesting review.
- Given a signed-in user
  When they update display name
  Then the name persists after reload.
- Given the same request
  When the PR is reviewed
  Then no files outside allowed paths are changed.
Automated:
- profile display_name regression test
- settings form submit test

Manual:
- before/after screenshot
- changed-file list mapped to tasks
- reviewer confirms no scope expansion
You may only change the allowed paths. If you find a related issue in billing, auth, layout, dependencies, or email settings, write a follow-up note instead of changing code. Every changed line must map to one listed task.
| Review question | Pass signal | Reject signal |
|---|---|---|
| Did the diff stay inside allowed paths? | Every changed file appears in the packet. | The PR touches billing, auth, layout, dependencies, or unrelated tests. |
| Does each file map to a task? | The PR description links every file to a task and criterion. | Changes are described as cleanup, refactor, or improvement without a task. |
| Did the assistant invent policy? | Product behavior matches the spec exactly. | The assistant changes validation rules, session behavior, or saved fields. |
| Is there evidence before merge? | Regression test, screenshot, and diff map are attached. | The PR asks reviewers to trust a manual demo or broad summary. |
The most dangerous phrase in the original prompt is "clean up anything related." It sounds helpful, but it gives the assistant permission to reinterpret the ticket. In practice, that can produce a diff that changes settings layout, billing forms, validation policy, dependency versions, and tests that no reviewer expected to inspect.
The spec packet moves that decision back to the team. The allowed path list is not decorative; it is a review contract. If the assistant needs to touch a file outside the packet, the correct response is a note explaining why the scope is insufficient, not a bigger PR.
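Because the allowed path list is a contract, it can be checked mechanically before anyone reads the code. The following is a minimal sketch, assuming the changed-file list is already available as an array of repository-relative paths. `findOutOfScopeFiles` is a hypothetical helper, not part of any existing tool, and the billing path in the usage lines is invented for illustration.

```typescript
// Allowed paths copied verbatim from the spec packet.
const ALLOWED_PATHS: ReadonlySet<string> = new Set([
  "apps/web/routes/settings/profile.tsx",
  "packages/account/profile-service.ts",
  "packages/account/profile-service.test.ts",
]);

// Hypothetical helper: returns every changed file that is not in the packet.
// Exact-match comparison is deliberate; matching on a directory prefix
// would silently widen the approved scope.
function findOutOfScopeFiles(
  changedFiles: readonly string[],
  allowed: ReadonlySet<string> = ALLOWED_PATHS,
): string[] {
  return changedFiles.filter((file) => !allowed.has(file));
}

// Usage: the second path is an invented example of scope expansion.
const outOfScope = findOutOfScopeFiles([
  "packages/account/profile-service.ts",
  "packages/billing/billing-form.tsx",
]);
console.log(outOfScope);
```

A non-empty result is a reason to stop and renegotiate the packet, not a reason to start reviewing the extra files.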
Review the file list before reading code style. A clean diff outside the allowed scope is still a failed spec.
Each file should map to a task. If a file only maps to "cleanup", remove it or create a separate spec.
The test should prove the original bug is fixed, while the screenshot and diff map prove the PR did not broaden behavior.
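To make the first acceptance criterion concrete, here is a rough sketch of what the regression check might look like. Everything in it is an assumption for illustration: the `ProfileRecord` shape, the in-memory store, and the `updateDisplayName` function are stand-ins, since the real service in `packages/account/profile-service.ts` is not shown in this case.

```typescript
// Hypothetical profile shape; the real one lives in the profile service.
interface ProfileRecord {
  displayName: string;
  email: string;
  avatarUrl: string;
}

// In-memory stand-in for the persistence layer (an assumption of this sketch).
class InMemoryProfileStore {
  private readonly rows = new Map<string, ProfileRecord>();

  load(userId: string): ProfileRecord {
    const row = this.rows.get(userId);
    if (!row) throw new Error(`unknown user: ${userId}`);
    return { ...row }; // copy, so callers cannot mutate stored state
  }

  save(userId: string, profile: ProfileRecord): void {
    this.rows.set(userId, { ...profile });
  }
}

type UpdateDisplayName = (
  store: InMemoryProfileStore,
  userId: string,
  displayName: string,
) => void;

// Hypothetical fixed implementation: changes display_name and nothing else.
const updateDisplayName: UpdateDisplayName = (store, userId, displayName) => {
  store.save(userId, { ...store.load(userId), displayName });
};

// Runs the acceptance criterion against an implementation.
// Returns a list of failures; an empty list means the criterion holds.
function checkDisplayNamePersistence(update: UpdateDisplayName): string[] {
  const store = new InMemoryProfileStore();
  const before: ProfileRecord = {
    displayName: "Old Name",
    email: "user@example.com",
    avatarUrl: "https://example.com/avatar.png",
  };
  store.save("user-1", before);

  update(store, "user-1", "New Name");
  const reloaded = store.load("user-1"); // a fresh read stands in for a page reload

  const failures: string[] = [];
  if (reloaded.displayName !== "New Name") {
    failures.push("display_name did not persist after reload");
  }
  if (reloaded.email !== before.email) failures.push("email changed");
  if (reloaded.avatarUrl !== before.avatarUrl) failures.push("avatar changed");
  return failures;
}
```

The same check covers both halves of the packet: the first assertion targets the bug, and the remaining two guard the "preserve existing behavior" goal.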
Use this pattern whenever an AI assistant is asked to fix a narrow bug inside a mature product area: settings, billing, permissions, onboarding, notifications, admin tools, or reporting. The exact files will differ, but the review move is the same: define allowed paths, write non-goals that block attractive side quests, and require a changed-file map before review.
Do not make the allowed path list so narrow that implementation becomes impossible. The purpose is not to trap the assistant. The purpose is to force scope negotiation into the spec instead of letting it happen silently inside the diff. If the model discovers that the bug really belongs in a shared service, update the packet and ask for review before accepting broader changes.
This also makes PR review less personal. Instead of saying "I do not like this broad refactor", the reviewer can point to the packet: the file is outside scope, the behavior is a non-goal, or the evidence does not prove the listed criterion.
The easiest way to make this stick is to add three fields to the pull request template. First, require a link to the spec packet. Second, require a table that maps every changed file to a task. Third, require the author to list any suggested follow-up work that was intentionally not included in the diff. Those fields turn AI review from a style discussion into a contract check.
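One way to treat those three fields as a contract check is to validate them before review starts. The sketch below assumes the template has already been parsed into a plain object. The field names, the `checkPrTemplate` function, and the spec path and layout file in the usage lines are all hypothetical.

```typescript
// Hypothetical shape for the three template fields.
interface FileMapEntry {
  file: string;
  task: string; // should match a task listed in the spec packet
}

interface PrTemplateFields {
  specPacketLink: string;
  fileMap: FileMapEntry[];
  followUps: string[]; // may be empty, but the author must state it explicitly
}

// Returns human-readable problems; an empty array means the check passes.
function checkPrTemplate(
  fields: PrTemplateFields,
  changedFiles: readonly string[],
  packetTasks: readonly string[],
): string[] {
  const problems: string[] = [];
  if (fields.specPacketLink.trim() === "") {
    problems.push("missing link to spec packet");
  }

  const taskByFile = new Map<string, string>();
  for (const entry of fields.fileMap) taskByFile.set(entry.file, entry.task);

  for (const file of changedFiles) {
    const task = taskByFile.get(file);
    if (task === undefined) {
      problems.push(`${file}: not mapped to any task`);
    } else if (packetTasks.indexOf(task) === -1) {
      problems.push(`${file}: "${task}" is not a task in the packet`);
    }
  }
  return problems;
}

// Usage: the layout file is mapped to "cleanup", which is not a packet task.
const problems = checkPrTemplate(
  {
    specPacketLink: "docs/specs/profile-display-name.md", // hypothetical path
    fileMap: [
      { file: "packages/account/profile-service.ts", task: "Patch profile update service only." },
      { file: "apps/web/routes/settings/layout.tsx", task: "cleanup" },
    ],
    followUps: [],
  },
  ["packages/account/profile-service.ts", "apps/web/routes/settings/layout.tsx"],
  ["Patch profile update service only.", "Add regression test for display_name persistence."],
);
console.log(problems);
```

The check only confirms that the map is complete and points at real tasks. Whether the change actually serves the task is still the reviewer's call.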
This is especially useful when the assistant finds adjacent problems. Adjacent problems are real, but they should not be smuggled into a bug fix. The PR can capture them as follow-up notes with owners and risk level, then the team can decide whether each one deserves its own spec.
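A follow-up note needs very little structure to be useful. As a sketch, assuming three risk levels and a single owner per note (both are assumptions, not something the team's process prescribes), the notes could be rendered into the PR description like this. `FollowUpNote` and `renderFollowUps` are hypothetical names.

```typescript
type RiskLevel = "low" | "medium" | "high";

// Hypothetical record for a discovery that was left out of the diff on purpose.
interface FollowUpNote {
  summary: string;
  owner: string;
  risk: RiskLevel;
}

// Renders notes as a Markdown list for the PR description, highest risk first.
function renderFollowUps(notes: readonly FollowUpNote[]): string {
  if (notes.length === 0) return "No follow-up work identified.";
  const rank: Record<RiskLevel, number> = { high: 0, medium: 1, low: 2 };
  return notes
    .slice() // copy, so the caller's array is not reordered
    .sort((a, b) => rank[a.risk] - rank[b.risk])
    .map((n) => `- [${n.risk}] ${n.summary} (owner: ${n.owner})`)
    .join("\n");
}
```

Sorting by risk puts the items most likely to deserve their own spec at the top of the list the team triages.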
The reviewer starts from the approved behavior and boundaries, not from the assistant's summary.
Every changed file has a reason tied to a task, criterion, or evidence item.
Useful discoveries are preserved without expanding the current pull request.
Refactors are not free when the reviewer expected a bug fix. Move cleanup into its own spec.
Package changes expand test surface and rollback risk. Require a separate upgrade packet.
Validation rules are product behavior. They need acceptance criteria, not a casual implementation note.
Start from a spec packet, then make the PR prove every changed file belongs to the packet.
This case focuses on AI-assisted implementation review: the spec boundary is the main protection against unapproved scope expansion.