Mobile API Spec Considerations for Offline Mode

Spec Coding Editorial Team · Spec-first engineering notes

Most mobile APIs I review were designed for a phone with five bars of LTE next to the engineer's desk. That is not the device we ship to. I spec offline-capable APIs assuming the client is in a basement for nine hours — and whatever the user taps during that window has to survive, sync, and sometimes get politely rejected once the radio returns.

Published on 2026-03-01 · 6 min read · Author: Spec Coding Editorial Team · Review policy: Editorial Policy

Review Note

Reviewed May 3, 2026. This article is maintained as a focused companion to the API Contracts Hub. The current version adds offline test evidence, sync telemetry, and rejection rules for mobile teams.

Local-First Is a Spec Decision, Not an Implementation Detail

I force the spec to answer one question before anyone opens an endpoint file: is the phone the source of truth between sync windows, or is the server? For field-service checklists, notes apps, inspection tools, and delivery scanners, the phone wins. Every mutation writes to local storage first, returns success to the UI immediately, and gets picked up by a background sync worker. The server is a merge partner, not a gatekeeper.

Once that is written down, a lot of dumb arguments disappear. No one proposes a spinner on the save button. No one suggests disabling features when the radio is off. The API stops being a RESTful wish list and starts being a contract for reconciling two divergent stores.

The Outbox Pattern on the Client Belongs in the Spec

My default is a client-side outbox table with four columns: client-generated ID (ULID), mutation type, payload JSON, and attempt count. The spec has to call this out because three things depend on it: idempotency, retry policy, and tombstones for deletes.
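A minimal sketch of that four-column outbox, using SQLite from Python only to keep the example short; a real client would use the platform's own storage layer. The table name, column names, and helper functions are assumptions for illustration, and the ULID generator is a bare-bones version that does not guarantee ordering for two IDs minted in the same millisecond.

```python
import json
import os
import sqlite3
import time

# Crockford base32 alphabet used by ULIDs (no I, L, O, U).
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def new_ulid(now_ms=None):
    """Return a 26-char ULID: 48-bit millisecond timestamp, then 80 random bits."""
    ts = int(time.time() * 1000) if now_ms is None else now_ms
    value = (ts << 80) | int.from_bytes(os.urandom(10), "big")
    chars = []
    for _ in range(26):
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


def open_outbox(path=":memory:"):
    """Create the outbox table if needed and return the connection."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS outbox (
               id            TEXT PRIMARY KEY,           -- client-generated ULID
               mutation_type TEXT NOT NULL,              -- e.g. 'note.create'
               payload       TEXT NOT NULL,              -- JSON document
               attempt_count INTEGER NOT NULL DEFAULT 0
           )"""
    )
    return db


def enqueue(db, mutation_type, payload, now_ms=None):
    """Write the mutation locally first; the sync worker picks it up later."""
    ulid = new_ulid(now_ms)
    db.execute(
        "INSERT INTO outbox (id, mutation_type, payload) VALUES (?, ?, ?)",
        (ulid, mutation_type, json.dumps(payload)),
    )
    db.commit()
    return ulid


def pending(db):
    """Rows in ULID order, which is creation order at millisecond resolution."""
    return db.execute(
        "SELECT id, mutation_type, payload, attempt_count FROM outbox ORDER BY id"
    ).fetchall()
```

Because the ULID doubles as the primary key, the same value can be sent as the idempotency key without a second lookup.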

Every mutation ships its ULID as an Idempotency-Key header. The server persists that key for at least 30 days. If the client retries — radio died mid-request, OS killed the background task, user force-quit — the server returns the original result. I specify 30 days because I once watched a field-service app replay a two-week-old mutation after the tech finally drove back into cell range.
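The replay behavior can be sketched on the server side roughly as follows. This is an in-memory illustration under stated assumptions: the class name, the `execute` method, and the injectable clock are hypothetical, and a production server would need durable storage and an atomic check-and-insert.

```python
import time


class IdempotencyStore:
    """Remembers the result of each mutation, keyed by its Idempotency-Key."""

    RETENTION_SECONDS = 30 * 24 * 3600  # the 30-day floor from the spec

    def __init__(self, clock=time.time):
        self._clock = clock
        self._results = {}  # key -> (stored_at, result)

    def execute(self, key, handler):
        """Run handler once per key; retries within retention get the original result."""
        now = self._clock()
        hit = self._results.get(key)
        if hit is not None and now - hit[0] < self.RETENTION_SECONDS:
            return hit[1]
        result = handler()
        self._results[key] = (now, result)
        return result
```

A retry that arrives two weeks later still lands inside the retention window, so the handler does not run a second time.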

Deletes need tombstones. If the client deletes a row locally, it inserts a tombstone with the same ID and a deleted_at timestamp. The sync worker ships it, the server accepts it, and the client only purges after the server acknowledges. Skip this step and deletes silently resurrect on the next delta sync.
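One way to picture the tombstone lifecycle on the client is the sketch below. The `LocalStore` class and its method names are hypothetical, and it assumes a pending local delete should mask the server's copy of the row until the server acknowledges the delete.

```python
import time


class LocalStore:
    """Client-side rows plus tombstones for deletes the server has not acknowledged."""

    def __init__(self):
        self.rows = {}        # id -> row dict
        self.tombstones = {}  # id -> deleted_at timestamp

    def delete(self, row_id, now=None):
        """Remove the row locally and leave a tombstone for the sync worker to ship."""
        self.rows.pop(row_id, None)
        self.tombstones[row_id] = time.time() if now is None else now

    def apply_delta(self, server_rows):
        """Merge a delta sync, skipping rows that have a pending tombstone."""
        for row in server_rows:
            if row["id"] in self.tombstones:
                continue  # without this check the deleted row would resurrect
            self.rows[row["id"]] = row

    def ack_delete(self, row_id):
        """Purge the tombstone only after the server has acknowledged the delete."""
        self.tombstones.pop(row_id, None)
```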

Delta Sync With Cursors, Not Full Refresh

For pulling changes from the server, I push back hard on "just refetch the list." A field tech with 400 work orders cannot afford a full payload every time they open the app on cell data. The spec defines a cursor — an opaque server-issued token — and a GET /sync?cursor=X endpoint that returns everything changed since that cursor plus a new cursor.

The cursor is opaque on purpose. Clients must not parse it. I have changed the encoding three times on one project without a client update because the contract said "opaque string, echo it back unchanged." If the server returns 410 Gone, the client performs a full refresh and stores the new cursor.
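The client side of that contract fits in a few lines. In this sketch, `CursorExpired` stands in for an HTTP 410 response, and `fetch_delta` and `fetch_full` are hypothetical callables wrapping the network layer; the shape of `state` is also an assumption.

```python
class CursorExpired(Exception):
    """Stands in for an HTTP 410 Gone response from GET /sync."""


def sync_once(state, fetch_delta, fetch_full):
    """Pull changes since the stored cursor; fall back to a full refresh on 410.

    The cursor is never parsed, only stored and echoed back unchanged.
    The outbox in state is deliberately left untouched.
    """
    try:
        changes, new_cursor = fetch_delta(state["cursor"])
        for row in changes:
            if row.get("deleted_at"):
                state["rows"].pop(row["id"], None)  # server-side delete
            else:
                state["rows"][row["id"]] = row
    except CursorExpired:
        rows, new_cursor = fetch_full()
        state["rows"] = {row["id"]: row for row in rows}
    state["cursor"] = new_cursor
    return state
```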

What Happens When the Server Rejects a Queued Mutation

This is the section most specs skip, and it is the one that matters. The client queued a mutation offline. Sync runs. The server says no. I force the spec to distinguish three rejection categories, each with different client behavior:

Last-writer-wins is lazy and it is fine for 80 percent of fields. The other 20 percent deserve their own rule, written down per field.
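A per-field rule table could look something like the sketch below. The field names and the `FIELD_RULES` mapping are made up for illustration, and the sketch assumes the queued client mutation counts as the later write because it reaches the server last.

```python
# Hypothetical rule table: any field not listed falls back to last-writer-wins.
FIELD_RULES = {"status": "escalate"}


def resolve(local, server, rules=FIELD_RULES):
    """Merge a queued local mutation into the server's current state.

    Returns the merged record and the list of fields that need a human decision.
    """
    merged = dict(server)
    escalated = []
    for field, local_value in local.items():
        if server.get(field) == local_value:
            continue  # no conflict on this field
        if rules.get(field, "lww") == "escalate":
            escalated.append(field)  # keep the server value, surface a resolution UI
        else:
            merged[field] = local_value  # last-writer-wins
    return merged, escalated
```

Writing the table into the spec means the 20 percent is a reviewed list of field names instead of a branch buried in merge code.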

Never Trust Client Timestamps for Ordering

A tech's phone might be set to last Tuesday. I have seen it. The spec forbids using client-supplied timestamps as the primary ordering key. Clients send client_generated_at for display, but ordering uses server receive time or a hybrid logical clock the server assigns on ingest. If two mutations collide and both claim to be newer, the server picks — by a named rule in the spec, not whoever wrote the merge function.
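As a rough illustration of server-assigned ordering, the sketch below stamps each mutation on ingest with a (milliseconds, counter) pair that never moves backwards, even if the server's own wall clock does. It is a simplified, single-node take on the hybrid-logical-clock idea; `IngestClock`, `ingest`, and the `server_order` field are assumed names, not part of any spec.

```python
class IngestClock:
    """Issues strictly increasing (wall_ms, counter) stamps on the server."""

    def __init__(self, wall_clock_ms):
        self._wall = wall_clock_ms  # callable returning current time in ms
        self._last_ms = 0
        self._counter = 0

    def stamp(self):
        now = self._wall()
        if now > self._last_ms:
            self._last_ms, self._counter = now, 0
        else:
            self._counter += 1  # same millisecond, or the wall clock went backwards
        return (self._last_ms, self._counter)


def ingest(clock, log, mutation):
    """Record a mutation with a server-assigned ordering key.

    client_generated_at is stored as sent, for display only; it never
    influences server_order.
    """
    entry = dict(mutation, server_order=clock.stamp())
    log.append(entry)
    return entry
```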

Large Attachments Get Their Own Protocol

Photos from a field inspection are 4 MB each and the tech took 18 of them in a parking garage. If attachments ride in the main mutation payload, nothing ever syncs. The spec separates them: mutations reference attachments by client ID, attachments upload via chunked resumable PUT to a pre-signed URL, and the sync worker only marks the mutation ready once every referenced attachment upload has returned 200. Chunks are 1 MB with a Content-Range header, a resume request returns the next expected byte, and uploads run through OS-level background APIs (URLSession on iOS, WorkManager on Android). No attachment uploads on cellular unless the user flags the record as urgent.
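The chunk arithmetic is easy to get off by one, so here is a small sketch of it. The function names are hypothetical, the chunk size is taken as 1 MiB, and the header value follows the standard `bytes start-end/total` form with an inclusive end offset.

```python
CHUNK_SIZE = 1024 * 1024  # 1 MiB per chunk, per the spec


def content_range(offset, total, chunk_size=CHUNK_SIZE):
    """Content-Range value for the chunk starting at offset (end is inclusive)."""
    end = min(offset + chunk_size, total) - 1
    return f"bytes {offset}-{end}/{total}"


def remaining_ranges(next_expected_byte, total, chunk_size=CHUNK_SIZE):
    """Header values still to send after the server reports where to resume."""
    return [
        content_range(offset, total, chunk_size)
        for offset in range(next_expected_byte, total, chunk_size)
    ]


def mutation_ready(attachment_ids, uploaded_ids):
    """A mutation may sync only once every attachment it references is uploaded."""
    return set(attachment_ids) <= set(uploaded_ids)
```

For a 2,500,000-byte photo whose upload died after two chunks, `remaining_ranges(2097152, 2500000)` yields the single range `bytes 2097152-2499999/2500000`.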

Schema Evolution for Clients That Will Never Update

A non-trivial fraction of field devices run a build from 14 months ago. The spec pins two rules: the server must accept any documented historical request shape for at least 12 months, and unknown fields in server responses must be preserved by the client on round-trip. That second rule lets us add fields without breaking old clients that will later sync edits back. Deprecations ship with a Sunset header and a minimum-version gate; below the gate, sync returns 426 Upgrade Required.
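Both rules are small enough to sketch. The helper names and the tuple-based build numbers below are assumptions for illustration; the point is that an old client edits a copy of the whole server row, never a record rebuilt from only the fields it knows about.

```python
def edit_title(server_row, new_title):
    """Apply an edit on an old client without dropping fields it does not know.

    Copying the whole row keeps unknown fields intact for the round trip.
    """
    updated = dict(server_row)
    updated["title"] = new_title
    return updated


def gate_status(client_build, minimum_build):
    """HTTP status for a sync request: 426 below the minimum-version gate, else 200.

    Builds are compared as tuples such as (2, 4, 0).
    """
    return 426 if client_build < minimum_build else 200
```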

Acceptance Criteria

- Given a client is offline and the user creates three notes
  When the device regains connectivity on a metered network
  Then the outbox drains in ULID order with exponential backoff and each mutation carries its idempotency key

- Given a queued mutation is rejected with HTTP 409 and a server-state payload
  When the conflict touches a field marked "escalate" in the spec
  Then the client surfaces a resolution UI and does not auto-merge

- Given the server rotates its cursor encoding
  When a client presents an old cursor and receives 410 Gone
  Then the client performs a full refresh, stores the new cursor, and does not drop local unsynced mutations
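The first criterion mentions exponential backoff without pinning numbers, so the sketch below picks some purely as placeholders: a 2-second base, a 15-minute cap, and optional jitter. The function name and all three parameters are assumptions; the input is the outbox row's attempt count.

```python
import random


def backoff_delay(attempt_count, base=2.0, cap=900.0, rng=None):
    """Seconds to wait before the next retry of an outbox row.

    The ceiling doubles with each failed attempt up to the cap. Passing an
    rng (for example random.Random()) spreads retries uniformly below the
    ceiling so a fleet of devices does not reconnect in lockstep.
    """
    ceiling = min(cap, base * (2 ** attempt_count))
    return ceiling if rng is None else rng.uniform(0, ceiling)
```

Whatever values the spec settles on, writing them down lets QA assert on retry timing instead of guessing at it.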

Observability the Server Side Actually Needs

Offline-capable apps fail silently. Without the right signals, the first sign of trouble is a support ticket a week later. I require three metrics in the spec, reported by the client on every sync:

When conflict rate on one endpoint jumps from 0.2 percent to 4 percent overnight, that is a schema bug or a merge-rule bug, and I want to see it the day it ships — not the week the complaints arrive.

The Offline Test Matrix

The spec is not ready until QA can run a small matrix on a real device. Simulators help, but the bugs that matter often involve the OS killing a background task or the user moving between weak networks.

Offline test matrix

Device state:
- app foreground, radio disabled
- app backgrounded, radio returns after 30 minutes
- app killed by OS with pending outbox rows

Mutation state:
- create, update, delete
- duplicate retry with same idempotency key
- stale update after server-side edit

Expected evidence:
- outbox count returns to zero
- rejected mutation shows human-readable reason
- server stores one result per idempotency key
- sync metric reports app version, OS version, oldest pending age

This matrix catches a different class of defect than endpoint unit tests. It proves the contract works on the device the customer actually uses, not just in the HTTP handler.

Keywords: offline-first mobile API · outbox pattern · idempotency key · delta sync cursor · conflict resolution · tombstone deletes · schema evolution

Editorial Note