Engineering

Building offline-first software for 200 workers in rural Kenya

Connectivity at our coffee estate dies for hours at a time. Here are the offline-first patterns that let supervisors check workers in, log harvests, and run payroll without a working internet connection.

July 2026 · 11 min read

Offline-first is not a feature you bolt onto a CRUD app the week before launch. It is an architectural commitment that touches every layer of the stack: local storage schema, conflict resolution policy, sync protocol, battery budget, and even how you handle timestamps. Most teams underestimate the scope by a factor of three. We learned this firsthand building MkulimaOS for Mulinga Farm, a large estate in Trans-Nzoia County with over 200 field workers. Here is what the architecture actually looks like, what broke, and what we would do differently.

Why this matters at Mulinga

Trans-Nzoia sits in Kenya's Rift Valley, about 340 kilometers north-west of Nairobi. The estate is productive agricultural land, and that is the point: field work starts at 6:00 AM, before most people in Nairobi have opened a browser. The nearest cell tower drops connectivity for two to six hours daily, and the timing is not predictable. A supervisor checking workers in at dawn cannot pause and wait for the network.

If the software requires connectivity to do anything, paper wins. And paper does work for check-in. The problem is that paper does not feed into payroll, does not produce attendance reports for the estate owner, and does not surface which workers have not shown up for three shifts in a row. So the constraint is simple: the software must work fully offline, and sync must be invisible to the user.

Architectural choices we made

On the mobile side, we chose React Native with Expo and WatermelonDB for local storage. WatermelonDB is purpose-built for React Native offline-first apps: it stores data in SQLite, exposes a reactive query layer, and is fast enough to handle a supervisor loading a list of 200 workers without a loading spinner.

On the backend, we run a NestJS API with an idempotent sync endpoint that accepts batched mutations. The mobile client sends an array of operations; the server processes them in order, deduplicates by mutation ID, and returns the canonical state for any records that changed on the server side since the last sync.

Our conflict resolution policy is last-write-wins for simple scalar fields (name, status, attendance flag) and append-only for transactions (payroll records, deduction entries). We do not use optimistic UI tricks where the UI pretends a write succeeded and then corrects silently. The local WatermelonDB store is the source of truth until a sync cycle confirms the server accepted the mutations. Users see their own data immediately; they see other users' changes after the next sync.
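The two policies can be sketched in a few lines. This is an illustrative sketch, not the MkulimaOS code: the VersionedRecord and LedgerEntry shapes, the function names, and the choice of a single updatedAt number as the ordering key are all assumptions made for the example.

```typescript
// Hypothetical shapes, for illustration only.
type ScalarFields = Record<string, string | number | boolean | null>;

interface VersionedRecord {
  fields: ScalarFields;
  updatedAt: number; // ordering timestamp in Unix ms (which clock supplies it is left open here)
}

interface LedgerEntry {
  id: string; // generated on the device, so retries carry the same id
  amount: number;
}

// Last-write-wins: an incoming write overwrites the fields it touches only if it is newer.
function applyLastWriteWins(
  stored: VersionedRecord,
  incoming: { fields: ScalarFields; updatedAt: number },
): VersionedRecord {
  if (incoming.updatedAt <= stored.updatedAt) {
    return stored; // stale write loses, stored state is kept
  }
  return {
    fields: { ...stored.fields, ...incoming.fields },
    updatedAt: incoming.updatedAt,
  };
}

// Append-only: existing entries are never edited or removed; a repeated id is ignored.
function appendEntry(ledger: readonly LedgerEntry[], entry: LedgerEntry): LedgerEntry[] {
  if (ledger.some((existing) => existing.id === entry.id)) {
    return [...ledger];
  }
  return [...ledger, entry];
}
```

The split matters because a lost scalar update is cheap to redo, while a lost or overwritten payroll entry is not.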

The sync protocol we built

Mobile clients generate UUIDs locally for all new records. There is no round-trip to the server to get an ID before a supervisor can check in a worker. Each mutation is wrapped in an envelope that looks like this:

interface MutationEnvelope {
  client_mutation_id: string;  // UUIDv4, generated on device
  entity_type: string;         // e.g. "attendance_record"
  entity_id: string;           // UUIDv4, generated on device
  operation: "create" | "update" | "delete";
  payload: Record<string, unknown>;
  client_timestamp: number;    // Unix ms — used for ordering, not truth
  device_id: string;           // identifies the phone
}

The server dedupes on client_mutation_id. If the same mutation arrives twice (because the sync response was lost and the client retried), the second application is a no-op. This makes the sync endpoint safe to call as many times as needed without corrupting data.
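A minimal sketch of that deduplication follows. It assumes an in-memory Set and Map standing in for the server's database tables; the SyncProcessor class and its applyBatch method are hypothetical names, not the actual NestJS service.

```typescript
interface MutationEnvelope {
  client_mutation_id: string;
  entity_type: string;
  entity_id: string;
  operation: "create" | "update" | "delete";
  payload: Record<string, unknown>;
  client_timestamp: number;
  device_id: string;
}

// In-memory stand-in for the server (assumption: a real server would persist both structures).
class SyncProcessor {
  private readonly seenMutationIds = new Set<string>();
  readonly records = new Map<string, Record<string, unknown>>();

  // Applies a batch in order and returns the ids of mutations that were actually applied.
  applyBatch(batch: readonly MutationEnvelope[]): string[] {
    const applied: string[] = [];
    for (const mutation of batch) {
      if (this.seenMutationIds.has(mutation.client_mutation_id)) {
        continue; // already processed: a retried mutation is a no-op
      }
      this.seenMutationIds.add(mutation.client_mutation_id);
      if (mutation.operation === "delete") {
        this.records.delete(mutation.entity_id);
      } else {
        const current = this.records.get(mutation.entity_id) ?? {};
        this.records.set(mutation.entity_id, { ...current, ...mutation.payload });
      }
      applied.push(mutation.client_mutation_id);
    }
    return applied;
  }
}
```

Sending the same batch twice leaves the records map unchanged after the first call, which is the property that makes client retries safe.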

For concurrent edit detection, we use a lightweight version-timestamp approach (loosely inspired by vector clocks, though not a true vector clock): each record carries a server_updated_at timestamp set by the server on its last write. When two clients update the same record, the server compares the incoming client_timestamp values against the stored server_updated_at to decide which write wins. In our initial design the later client timestamp won for most fields; unreliable device clocks later pushed us to order by server receipt time instead, as covered in the edge cases below. We looked at true CRDTs, but the problem domain did not require them: two supervisors rarely edit the same worker record simultaneously.

Sync runs opportunistically. Whenever the app detects a network connection (via the React Native NetInfo API), it queues a sync cycle. Mutations are batched in groups of up to 100 to keep individual HTTP requests under 50KB.
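The batching step is simple enough to sketch. The helper name chunk and the constant MAX_BATCH_SIZE are illustrative; the sketch covers only the grouping and does not measure request size in bytes.

```typescript
const MAX_BATCH_SIZE = 100; // from the text: up to 100 mutations per request

// Splits a queue into consecutive batches, preserving the original order.
function chunk<T>(items: readonly T[], size: number = MAX_BATCH_SIZE): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}
```

Order is preserved within and across batches, which matters because the server processes mutations in the order it receives them.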

Battery and storage constraints

Field phones are budget Android handsets running on a single daily charge. Background processing and GPS polling are commonly among the biggest battery drains in a mobile app, so we addressed both.

GPS polling for location-stamped check-ins runs at one point per 30 seconds, not one per second. The coordinates are deduped before storage: if the delta from the last recorded point is under 10 meters, we skip the write. This cut GPS-related battery drain by roughly 60% compared to our first implementation.
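The 10-meter filter can be sketched with a haversine distance check. The GeoPoint shape and the function names are assumptions for illustration, and the spherical Earth radius is an approximation that is more than accurate enough at this scale.

```typescript
interface GeoPoint {
  latitude: number;  // degrees
  longitude: number; // degrees
}

const EARTH_RADIUS_M = 6371000; // mean Earth radius, spherical approximation
const MIN_DELTA_M = 10;         // from the text: skip writes under 10 meters

// Great-circle distance between two points using the haversine formula.
function distanceMeters(a: GeoPoint, b: GeoPoint): number {
  const toRad = (deg: number): number => (deg * Math.PI) / 180;
  const dLat = toRad(b.latitude - a.latitude);
  const dLon = toRad(b.longitude - a.longitude);
  const h =
    Math.sin(dLat / 2) * Math.sin(dLat / 2) +
    Math.cos(toRad(a.latitude)) * Math.cos(toRad(b.latitude)) *
      Math.sin(dLon / 2) * Math.sin(dLon / 2);
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(h));
}

// Returns true when the new fix is worth storing: either it is the first one,
// or it has moved at least minDelta meters from the last stored point.
function shouldRecordPoint(
  last: GeoPoint | null,
  next: GeoPoint,
  minDelta: number = MIN_DELTA_M,
): boolean {
  return last === null || distanceMeters(last, next) >= minDelta;
}
```

Comparing against the last stored point, rather than the last sampled one, keeps slow drift from being discarded forever.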

Image attachments (supervisor incident photos, farm condition records) are compressed on-device to a maximum of 200KB before they enter the sync queue. The original image is discarded after a successful upload. We use the Expo ImageManipulator API for this; its manipulation calls are asynchronous, so the work does not block the JS thread, and it is available in Expo without writing a custom native module.

Local DB retention policy: any record that has been synced to the server and is older than 30 days is purged from the device. This keeps the SQLite file under 50MB on phones with limited storage, while ensuring that in-flight and recent data is always available offline.
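As a rough sketch, the purge rule reduces to a filter over local records. The LocalRecord shape is hypothetical, and measuring age from createdAt (rather than from the sync time) is an assumption; the text does not say which the app uses.

```typescript
interface LocalRecord {
  id: string;
  createdAt: number;       // Unix ms
  syncedAt: number | null; // null while the record has not been confirmed by the server
}

const RETENTION_MS = 30 * 24 * 60 * 60 * 1000; // 30 days

// Selects records that are safe to delete: confirmed by the server AND older than the window.
// Unsynced records are never selected, no matter how old they are.
function selectPurgeable(records: readonly LocalRecord[], now: number): LocalRecord[] {
  return records.filter(
    (record) => record.syncedAt !== null && now - record.createdAt > RETENTION_MS,
  );
}
```

Requiring both conditions is the safety property: an old record that never synced stays on the device until it does.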

Edge cases that bit us

Three failure modes that we did not anticipate fully in the initial design:

  • Phone reset. If a supervisor's phone is factory-reset, all local mutations that have not synced yet are gone. We solved this with a weekly automatic backup: the app exports its unsynced mutation queue to a JSON file in local storage, which can be restored if the user reinstalls and authenticates on the same device ID.
  • Duplicate check-ins. Two supervisors can check in the same worker on different phones during a network outage. When both records sync, the server sees two attendance entries for the same worker on the same shift. We handle this with a composite unique key on (worker_id, shift_date, shift_type) and a reconciliation queue that flags duplicates for a manager to review rather than silently dropping one.
  • Wrong device clocks. Budget Android phones sometimes have system clocks that are days off, especially after a battery drain and restart. We do not trust client_timestamp for absolute ordering. The server assigns a server_received_at timestamp on every mutation, and that is what we use for ordering and conflict resolution. Client timestamps are kept for auditing only.
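The duplicate check-in case above can be sketched as a grouping step on the composite key. This is an in-memory illustration with hypothetical names (AttendanceEntry, findDuplicateCheckIns); on a real server the unique key would be enforced by the database, and the groups returned here correspond to what would land in the reconciliation queue.

```typescript
interface AttendanceEntry {
  entity_id: string;
  worker_id: string;
  shift_date: string; // e.g. "2026-07-01"
  shift_type: string; // e.g. "morning"
  device_id: string;
}

// Groups entries by (worker_id, shift_date, shift_type) and returns only the groups
// with more than one entry, so a manager can review them instead of one being dropped.
function findDuplicateCheckIns(entries: readonly AttendanceEntry[]): AttendanceEntry[][] {
  const groups = new Map<string, AttendanceEntry[]>();
  for (const entry of entries) {
    // JSON-encoding the tuple avoids key collisions from naive string concatenation.
    const key = JSON.stringify([entry.worker_id, entry.shift_date, entry.shift_type]);
    const group = groups.get(key);
    if (group) {
      group.push(entry);
    } else {
      groups.set(key, [entry]);
    }
  }
  return Array.from(groups.values()).filter((group) => group.length > 1);
}
```

Returning whole groups rather than picking a winner mirrors the policy in the text: duplicates are flagged for review, never silently discarded.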

What we would do differently

If we were starting Mulinga's offline layer today, three things would change.

For richer collaboration (multiple supervisors editing the same records in real time), we would evaluate a proper CRDT library like Yjs or Automerge rather than rolling a custom timestamp-comparison layer. CRDT libraries provide well-defined merge semantics, so concurrent edits converge without hand-written conflict resolution code.

For teams that do not want to own a sync server, a managed layer like ElectricSQL or PowerSync would cut two to four weeks of backend work. We built our own because we needed tight control over the mutation envelope and over the integration with the Safaricom Daraja API for M-Pesa payroll disbursements, but most teams do not need that level of control.

For read-heavy field apps (viewing records, not creating them), a service-worker PWA with background sync may be cheaper to build and maintain than a full React Native app with an embedded SQLite database. The trade-off is that PWAs are harder to distribute in low-connectivity environments where the Play Store is the only reliable distribution channel.

We describe the broader services framework, including offline-first architecture engagements, on our services page. If you are building software for distributed field teams and want to talk through what this architecture would look like for your context, get in touch at spideylabs.tech/contact.
