Use a fixed shape
Actor, state, trigger, expected behavior, and evidence keep criteria from turning into wishful prose.
A focused path for writing acceptance criteria that engineers can implement, QA can test, and reviewers can use as a release gate.
A good criterion covers the unhappy path: validation failure, duplicate action, timeout, permission denial, or partial success.
AI-generated or human-written changes should map each criterion to tests, logs, screenshots, or manual verification.
| Stage | Example | Review value |
|---|---|---|
| Too vague | User can upload CSV successfully. | No file size, invalid rows, duplicate handling, or completion signal |
| Testable | Given a 10 MB CSV with 3 invalid rows, when upload completes, then valid rows import and invalid rows appear in a downloadable error report. | QA can build a fixture and expected result |
| Release-ready | Add processing time, audit log, retry behavior, and rollback signal. | Review can block release when evidence is missing |
# Acceptance criteria block
- Given [actor/state/precondition]
  When [trigger/action]
  Then [observable result]
  And [side effect or audit proof]

Negative path:
- Given [invalid input or denied permission]
  When [same trigger]
  Then [clear failure behavior]
  And [no unsafe side effect]
Use this hub when a ticket has clear intent but weak proof. “The upload works” or “the user can save settings” may be enough for a conversation, but it is not enough for implementation or QA. Acceptance criteria translate intent into the specific state, trigger, result, and evidence that define done.
The easiest way to improve a weak criterion is to add the missing slot. If the actor is missing, name the role. If the initial state is missing, add the record, permission, or dependency state. If the trigger is vague, name the exact action or event. If the result is subjective, replace it with a visible UI state, API response, database state, log entry, or metric.
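The "add the missing slot" step can be roughed out in code. The sketch below is only an illustration: the `Criterion` class, its field names, and `missing_slots` are hypothetical names chosen to mirror the five slots named on this page, not part of any real tool.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class Criterion:
    """One acceptance criterion with one field per slot (hypothetical shape)."""

    actor: str = ""
    state: str = ""
    trigger: str = ""
    expected: str = ""
    evidence: str = ""


def missing_slots(criterion: Criterion) -> list[str]:
    """Return the names of slots that are empty or whitespace-only."""
    return [
        f.name for f in fields(criterion)
        if not getattr(criterion, f.name).strip()
    ]


# The "too vague" example only fills the expected-result slot.
vague = Criterion(expected="User can upload CSV successfully.")
print(missing_slots(vague))  # ['actor', 'state', 'trigger', 'evidence']
```

A check like this only shows that a slot has some text in it. Whether the text is concrete enough to build a fixture still needs a human reviewer.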
Good criteria also protect negative space. They say what must not happen: no duplicate event, no second refund, no hidden permission elevation, no partial import without an error report, no silent retry after a permanent failure. These “no unsafe side effect” clauses often catch more bugs than the happy path.
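A "no second refund" clause can be written as a test that repeats the trigger and then asserts on the side effects as well as the return value. The sketch below assumes a made-up in-memory `RefundService`; the class, its method, and the status strings are illustrative and do not describe any real payment API.

```python
from __future__ import annotations


class RefundService:
    """Hypothetical in-memory refund service, used only for illustration."""

    def __init__(self) -> None:
        self.refunds: dict[str, int] = {}  # order_id -> refunded cents
        self.events: list[tuple[str, str]] = []  # (event name, order_id)

    def refund(self, order_id: str, amount_cents: int) -> str:
        if order_id in self.refunds:
            # Negative space: a repeated request must not move money again
            # and must not emit a second event.
            return "already_refunded"
        self.refunds[order_id] = amount_cents
        self.events.append(("refund_issued", order_id))
        return "refunded"


def test_duplicate_refund_has_no_second_side_effect() -> None:
    service = RefundService()
    assert service.refund("order-1", 500) == "refunded"
    assert service.refund("order-1", 500) == "already_refunded"
    # Assert on state and events, not only on the returned status.
    assert service.refunds == {"order-1": 500}
    assert service.events == [("refund_issued", "order-1")]


test_duplicate_refund_has_no_second_side_effect()
```

The last two assertions are the "no unsafe side effect" part. A test that checked only the return value would still pass if the duplicate request quietly issued a second refund.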
| Check | Pass condition |
|---|---|
| Actor | The user, client, job, or system role is named. |
| Precondition | The starting state is concrete enough to build a fixture. |
| Trigger | The action or event is specific and repeatable. |
| Expected result | The outcome can be observed without interpretation. |
| Negative path | Invalid input, denied permission, or retry behavior is covered. |
The handoff artifact should map each criterion to at least one evidence source: a test, fixture, log, screenshot, or manual check. PR review goes faster when reviewers can point to the exact proof for each behavior instead of searching the diff for it.
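One way to keep that mapping honest is to store it as data and flag any criterion without recognized evidence. This is a minimal sketch under stated assumptions: the criterion IDs, the evidence references, and the `unproven` helper are all hypothetical, and the set of allowed kinds simply repeats the evidence sources listed above.

```python
from __future__ import annotations

# Evidence kinds taken from the list above; adjust to the team's own vocabulary.
ALLOWED_KINDS = {"test", "fixture", "log", "screenshot", "manual check"}


def unproven(
    criteria: list[str], evidence: dict[str, tuple[str, str]]
) -> list[str]:
    """Return criterion IDs with no evidence entry or an unrecognized kind."""
    return [
        cid for cid in criteria
        if cid not in evidence or evidence[cid][0] not in ALLOWED_KINDS
    ]


# Hypothetical criterion IDs and evidence references.
criteria = ["AC-1", "AC-2", "AC-3"]
evidence = {
    "AC-1": ("test", "tests/test_parser.py::test_clean_csv"),
    "AC-2": ("verbal", "demoed in standup"),
}
print(unproven(criteria, evidence))  # ['AC-2', 'AC-3']
```

Here `AC-2` is flagged because "verbal" is not an accepted kind, and `AC-3` because nothing is attached at all.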
A vague upload ticket often says “users can import contacts.” A useful criteria pass turns that into three checks: a clean CSV imports within the target time, a mixed-validity CSV imports valid rows while returning a downloadable error report, and a duplicate upload does not create duplicate contacts. Those three checks force decisions about file size, duplicate key, partial success, and user feedback.
The reviewer should ask whether QA can build the fixture without interviewing the author. If not, the criterion is still too vague. The PR should show where each criterion is proven: unit test for parser behavior, integration test for duplicate handling, and a screenshot or log line for the error report path.
CSV import criteria become useful when they separate clean rows, invalid rows, duplicate rows, and user feedback, because each case then gets its own fixture, expected behavior, and evidence.
| Fixture | Expected behavior | Evidence |
|---|---|---|
| 10 valid rows | All rows import and summary shows 10 created | Integration test plus import summary screenshot |
| 7 valid, 3 invalid rows | Valid rows import; invalid rows appear in downloadable error report | Fixture file and generated error CSV |
| Duplicate email with different casing | No duplicate contact; existing row is referenced | Database assertion for normalized key |
| Worker timeout | Import status stays processing and retry is visible | Log query for retry job and UI state capture |
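The first three fixtures in the table can be exercised against a small importer. The sketch below is an assumption-heavy illustration rather than a real import pipeline: `import_contacts`, `error_report`, the `name,email` column layout, and the "contains `@`" validity rule are all invented for the example. The worker-timeout row is left out because it depends on a job runner.

```python
from __future__ import annotations

import csv
import io


def import_contacts(csv_text: str, existing: dict[str, str]) -> dict:
    """Import 'name,email' rows into `existing`, keyed by lowercased email."""
    created, duplicates, errors = 0, 0, []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        email = (row.get("email") or "").strip()
        key = email.lower()  # normalized key, so casing cannot create duplicates
        if "@" not in email:  # deliberately simplistic validity rule
            errors.append({"line": line_no, "email": email, "reason": "invalid email"})
        elif key in existing:
            duplicates += 1
        else:
            existing[key] = (row.get("name") or "").strip()
            created += 1
    return {"created": created, "duplicates": duplicates, "errors": errors}


def error_report(errors: list[dict]) -> str:
    """Render rejected rows as CSV text for a downloadable report."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["line", "email", "reason"])
    writer.writeheader()
    writer.writerows(errors)
    return out.getvalue()


# Fixture: 7 valid rows followed by 3 invalid rows.
rows = [f"User{i},user{i}@example.com" for i in range(7)]
rows += ["Bad1,not-an-email", "Bad2,", "Bad3,missing-at.example.com"]
mixed = "name,email\n" + "\n".join(rows) + "\n"

contacts: dict[str, str] = {}
result = import_contacts(mixed, contacts)
assert result["created"] == 7
assert [e["line"] for e in result["errors"]] == [9, 10, 11]

# Fixture: duplicate email with different casing.
again = import_contacts("name,email\nUser Zero,USER0@Example.com\n", contacts)
assert again == {"created": 0, "duplicates": 1, "errors": []}
assert len(contacts) == 7
```

Writing the fixtures down forces the decisions the prose mentions: which column is the duplicate key, how it is normalized, and what an invalid row looks like in the error report.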
Before treating this hub as complete for a real team, run a short audit. First, confirm the reader can leave the page with one artifact copied into their workflow. Second, confirm every linked article answers a different question instead of repeating the same definition. Third, confirm the page names a failure mode that would matter in production, not only during planning.
The most useful hubs behave like workbenches. A reader should be able to open the page, choose the relevant path, copy the block, and know what evidence to attach before review. If a hub only explains the topic, it becomes another article. If it helps the reader decide what to do next, it becomes a resource.
A large example set across authentication, ecommerce, APIs, data workflows, and notifications.
Turn vague requirements into criteria QA can test and reviewers can use as a release gate.
Use prompt structure to generate better acceptance criteria without letting hallucinated scenarios reach QA.
Review AI-generated pull requests against acceptance criteria, test evidence, and scope boundaries.
Convert duplicate, timeout, permission, and partial-success cases into concrete test fixtures.
Use Spec Skills to draft criteria, seed failure modes, and preserve human review at the right point.
Not every ticket needs this structure. Use it when behavior has clear states and triggers. For static content or copy changes, a short checklist may be enough.
Write enough criteria to cover the main path, at least one failure path, and the release evidence reviewers need. Count by risk, not by word count.
A testable criterion names the actor, trigger, observable result, boundary condition, and evidence needed to confirm the behavior.
Product should own intent, while engineering and QA tighten the wording until each criterion can be implemented, tested, and reviewed without guesswork.
Each important criterion should map to evidence: automated tests, manual checks, screenshots, logs, metrics, or review notes attached before release.
When you need to use this today, open a template first, then come back to this hub for review and evidence checks.