How I Built an Offline-First App for Forest Rangers in Nepal (And What I Learned)

Most of the mobile projects I take on have something in common: the user has internet. They might be on slow 4G, they might drop signal occasionally, but connectivity is the default assumption. This project had the opposite assumption. The client was a conservation organisation working in Nepal's remote forest zones — areas where mobile coverage simply does not exist. Their rangers spend multiple days in the field collecting GPS coordinates, tree health observations, and photographic evidence. They were doing all of it on paper.

The brief sounded straightforward: build a mobile app that works completely offline and synchronises everything when the ranger returns to connectivity. No data loss. No duplicates. No confusing error states the ranger has to diagnose in the middle of a forest.

Easy to describe. Genuinely difficult to build correctly. This post covers the full technical architecture, the mistakes I made, and the specific lessons that apply to any offline-first mobile application — not just conservation tools. If you are building a Flutter app with local SQLite storage, a Django REST backend, or a custom sync engine for unreliable network environments, most of what follows will be directly applicable.

The Stack

Before getting into architecture, here is what I chose and why:

Flutter (Android only) — iOS was not in scope. The rangers were all on Android devices, which simplified device testing considerably.
Drift (formerly Moor) for local SQLite on device — more on this choice below.
Django REST Framework on the backend — the client's existing infrastructure was Python, so this was a natural fit.
PostgreSQL as the server-side database.
A custom sync engine for conflict resolution — no third-party sync library fully covered the requirements.

Why Drift Over Other SQLite Options

The Flutter local-storage ecosystem gives you several reasonable options: Hive, Isar, sqflite, and Drift. Each has a different philosophy. Hive and Isar are fast NoSQL stores (key-value and object-oriented respectively) rather than SQLite wrappers, and are excellent for simpler data shapes. sqflite gives you raw SQL access to SQLite, but queries are plain strings with no compile-time checking. Drift sits at the intersection of structured SQL and type safety.

For this project, Drift won on four specific requirements:

Raw SQL when needed — complex sync queries benefit from expressing joins and conditional updates in real SQL, not a limited query builder API.
Type-safe query generation — query mistakes surface at compile time, not at runtime when a ranger is offline and I cannot push a hotfix.
Built-in migration support — rangers cannot be expected to reinstall the app when I update the schema. Drift's migration API handles this cleanly without wiping local data.
Reactive streams — the UI subscribes to a Drift stream and updates automatically when sync writes new data, with no manual state management wiring required.

Here is the core observations table definition. The status field became the central state machine for the entire sync engine:

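import 'package:drift/drift.dart';

class Observations extends Table {
  // Local auto-increment primary key.
  IntColumn get id => integer().autoIncrement()();
  // Generated on the device at creation time; the idempotency key during sync.
  TextColumn get localUuid => text()();
  // Sync state: 'pending', 'syncing', 'synced', or 'failed'.
  TextColumn get status => text().withDefault(const Constant('pending'))();
  RealColumn get latitude => real()();
  RealColumn get longitude => real()();
  TextColumn get notes => text().nullable()();
  // Path to the photo on the device; null when no photo was taken.
  TextColumn get photoPath => text().nullable()();
  DateTimeColumn get recordedAt => dateTime()();
  // Null until the server has confirmed the record.
  DateTimeColumn get syncedAt => dateTime().nullable()();
}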

The status field cycled through four states: pending (created locally, not yet attempted), syncing (upload in progress), synced (confirmed on server), and failed (error stored alongside the record so the ranger can see what went wrong). This state machine is what makes an offline-first app feel trustworthy rather than mysterious.
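The four states and the moves between them can be sketched as a small lookup table. This is an illustrative sketch in Python, not the app's Dart code. The names SyncStatus, ALLOWED and can_transition are hypothetical, and the transition set is an assumption inferred from the description above (for example, that a failed record can be queued again).

```python
from enum import Enum


class SyncStatus(str, Enum):
    PENDING = "pending"
    SYNCING = "syncing"
    SYNCED = "synced"
    FAILED = "failed"


# Assumed transition table: each status maps to the statuses it may move to.
ALLOWED = {
    SyncStatus.PENDING: {SyncStatus.SYNCING},
    # An upload in flight can succeed, fail, or be reset after a crash.
    SyncStatus.SYNCING: {SyncStatus.SYNCED, SyncStatus.FAILED, SyncStatus.PENDING},
    # Assumption: a failed record can be queued again for another attempt.
    SyncStatus.FAILED: {SyncStatus.PENDING},
    # Synced is treated as terminal in this sketch.
    SyncStatus.SYNCED: set(),
}


def can_transition(current: SyncStatus, target: SyncStatus) -> bool:
    """Return True if moving from `current` to `target` is allowed."""
    return target in ALLOWED[current]
```

Rejecting any transition that is not in the table makes it harder for a bug in the sync code to move a record straight from pending to synced without an upload ever happening.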

The Sync Engine: Three Phases, Not One

The naive approach to offline sync — "upload everything pending when internet is detected" — fails in production in several predictable ways. A partial upload followed by an app crash leaves records in an ambiguous state. A photo upload that succeeds while the associated record insert fails creates an orphaned file on S3. A duplicate submission after a network timeout creates duplicate records in the database.

I structured the sync into three sequential phases, each with its own success/failure path:

Phase 1 — Photo Uploads

Photos upload first to S3 via a signed URL obtained from the Django backend. Once the upload is confirmed, the local record is updated with the remote S3 URL. If a photo upload fails, that observation stays in pending and is retried on the next sync cycle. The record never moves to the batch sync phase until its photo is safely stored.
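A rough sketch of this gating rule, written in Python for illustration rather than in the app's Dart code. The upload callable, the dictionary keys (photo_path, remote_url) and the function name are all hypothetical. The only behaviour taken from the text is that a failed upload keeps the observation out of the batch phase.

```python
from typing import Callable, Dict, List, Optional

Observation = Dict[str, Optional[str]]


def upload_photos(
    observations: List[Observation],
    upload: Callable[[str], str],
) -> List[Observation]:
    """Upload each photo and return the observations ready for record sync.

    `upload` is a hypothetical callable that takes a local file path,
    sends the file to storage, and returns the remote URL. In this sketch
    it is expected to raise OSError when the upload fails.
    """
    ready = []
    for obs in observations:
        path = obs.get("photo_path")
        if path is None or obs.get("remote_url"):
            # No photo, or the photo was already uploaded on an earlier cycle.
            ready.append(obs)
            continue
        try:
            obs["remote_url"] = upload(path)
        except OSError:
            # Upload failed: leave this observation out so it stays pending
            # and is retried on the next sync cycle.
            continue
        ready.append(obs)
    return ready
```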

Phase 2 — Record Batch Sync

Records sync in batches of 20, each carrying their localUuid. The server responds with a mapping of localUuid → serverUuid for each successfully inserted record. Batching limits memory pressure and gives the sync meaningful checkpoints — if the network drops mid-batch, only that batch is retried.
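The batching and the UUID mapping can be sketched in a few lines. This is a Python illustration, not the app's Dart code. The helper names are hypothetical, and the response shape (a list of objects with localUuid and serverUuid keys) is assumed.

```python
from typing import Dict, Iterator, List


def batches(records: List[dict], size: int = 20) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def uuid_mapping(succeeded: List[dict]) -> Dict[str, str]:
    """Build a localUuid -> serverUuid lookup from the server's response."""
    return {
        item["localUuid"]: item["serverUuid"]
        for item in succeeded
        if "serverUuid" in item
    }
```

Because each batch is sent and confirmed on its own, a dropped connection only affects the slice that was in flight at the time.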

Phase 3 — Confirmation

On server confirmation, records are marked synced with a timestamp. Failed records are marked failed with the server's error message stored locally. Rangers see a clear visual indicator — green check for synced, amber warning for failed — and can tap a failed record to see the exact error.

Future<void> syncPendingObservations() async {
  final pending = await _db.getPendingObservations();
  if (pending.isEmpty) return;

  // Phase 1: upload photos independently
  for (final obs in pending.where((o) => o.photoPath != null)) {
    await _uploadPhotoIfNeeded(obs);
  }

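  // Phase 2: batch record sync. Each call sends at most 20 records;
  // any remaining pending records are picked up by later calls.
  final batch = pending.take(20).toList();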
  await _db.markAsSyncing(batch.map((o) => o.localUuid).toList());

  try {
    final result = await _api.syncObservations(batch);
    await _db.confirmSync(result.succeeded, result.failed);
  } catch (e) {
    await _db.markSyncFailed(
      batch.map((o) => o.localUuid).toList(),
      e.toString(),
    );
  }
}

Idempotency on the Django Backend

Network timeouts in Nepal's mountain regions are not edge cases — they are the expected operating condition. A ranger submits a batch, the response is lost in transit, and the device retries. Without idempotency, that retry creates duplicate observations in the database. With 10 rangers each collecting 30-50 observations a day, duplicates compound quickly and corrupt the conservation data that the whole project exists to protect.

The solution is straightforward: every observation carries a localUuid generated on the device at creation time. The Django view checks for this UUID before inserting — if it already exists, it returns success without creating a duplicate. The client treats this response identically to a fresh insert.

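from rest_framework.response import Response
from rest_framework.views import APIView

# Import paths below are assumed; adjust them to the app's module layout.
from .models import Observation
from .serializers import ObservationSerializer


class ObservationBatchView(APIView):
    def post(self, request):
        succeeded, failed = [], []

        for item in request.data.get('observations', []):
            local_uuid = item.get('localUuid')

            # Idempotency check: if the record already exists, report success
            # in the same shape as a fresh insert, including serverUuid.
            existing = Observation.objects.filter(local_uuid=local_uuid).first()
            if existing is not None:
                succeeded.append({
                    'localUuid': local_uuid,
                    'serverUuid': str(existing.id)
                })
                continue

            serializer = ObservationSerializer(data=item)
            if serializer.is_valid():
                serializer.save(uploaded_by=request.user)
                succeeded.append({
                    'localUuid': local_uuid,
                    'serverUuid': str(serializer.instance.id)
                })
            else:
                failed.append({
                    'localUuid': local_uuid,
                    'errors': serializer.errors
                })

        return Response({'succeeded': succeeded, 'failed': failed})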

This pattern — UUID-keyed idempotency at the batch endpoint — is something I now use on every project that involves any kind of offline or unreliable-network data submission. It is one of those things that takes twenty minutes to implement and saves hours of production debugging.
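One caveat with any check-then-insert approach is that two identical requests arriving at nearly the same moment can both pass the existence check. A unique constraint on the UUID column closes that gap at the database level. The sketch below illustrates the idea with Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table layout and the open_db and insert_observation helpers are hypothetical, not the project's schema.

```python
import sqlite3
import uuid


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a UNIQUE idempotency key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observation ("
        " id INTEGER PRIMARY KEY,"
        " local_uuid TEXT NOT NULL UNIQUE,"
        " notes TEXT)"
    )
    return conn


def insert_observation(conn: sqlite3.Connection, local_uuid: str, notes: str) -> int:
    """Insert the observation once and return its server-side id.

    A retry with the same local_uuid hits the UNIQUE constraint, inserts
    nothing, and returns the id of the row stored the first time.
    """
    conn.execute(
        "INSERT OR IGNORE INTO observation (local_uuid, notes) VALUES (?, ?)",
        (local_uuid, notes),
    )
    row = conn.execute(
        "SELECT id FROM observation WHERE local_uuid = ?", (local_uuid,)
    ).fetchone()
    return row[0]


conn = open_db()
key = str(uuid.uuid4())  # generated once on the device, reused on every retry
first_id = insert_observation(conn, key, "healthy canopy")
retry_id = insert_observation(conn, key, "healthy canopy")  # simulated retry
```

INSERT OR IGNORE is SQLite syntax; in PostgreSQL the comparable statement is INSERT ... ON CONFLICT DO NOTHING. In Django, a unique=True field combined with get_or_create is one way to get similar behaviour.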

What I Got Wrong

I want to be specific here, because vague "lessons learned" sections are not useful. These are the three concrete mistakes I made, what they caused in practice, and how I fixed each one.

Mistake 1: Photo Uploads Blocked Record Sync

My first version uploaded all photos sequentially before syncing any records. A ranger with 40 photos and a 3G connection was waiting 15-20 minutes before a single record reached the server. If the app was backgrounded during that wait, the upload was interrupted and the whole process restarted from scratch.

The fix was architectural: photo upload and record sync became independent pipelines. Records sync with a photoStatus: 'pending' flag if the photo has not yet uploaded. The backend accepts records with pending photos and updates the photo URL when the upload completes in a subsequent sync. Data reaches the server quickly; photos follow when bandwidth allows.
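A minimal sketch of what such a record payload might look like, in Python for illustration. Only the photoStatus: 'pending' value comes from the description above. The other status values ('none', 'uploaded'), the photoUrl key and the build_record_payload helper are assumptions.

```python
def build_record_payload(observation: dict) -> dict:
    """Build the request body for one observation without waiting on its photo."""
    photo_path = observation.get("photoPath")
    remote_url = observation.get("photoUrl")

    if photo_path is None:
        photo_status = "none"      # assumed value: no photo attached
    elif remote_url is None:
        photo_status = "pending"   # photo exists locally, not yet uploaded
    else:
        photo_status = "uploaded"  # assumed value: photo already stored

    return {
        "localUuid": observation["localUuid"],
        "latitude": observation["latitude"],
        "longitude": observation["longitude"],
        "notes": observation.get("notes"),
        "photoStatus": photo_status,
        "photoUrl": remote_url,
    }
```

The record can be sent as soon as it exists. A later sync only needs to send the localUuid and the photo URL once the upload has finished.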

Mistake 2: Conflict Resolution Was Too Simple

"Server wins" is the easiest conflict resolution strategy to implement, and it worked fine until a ranger edited their notes after returning to connectivity but before the first sync had completed. The local edit uploaded and the server accepted it, but a second sync then overwrote those notes with the older server copy of the record, which had been inserted by a different device session. The ranger's edit was silently lost.

I added a modifiedAt timestamp to every record and moved to a "last write wins" strategy: if the device's modifiedAt is newer than the server's, the device version takes precedence. For most single-user field tools, this is the right trade-off. In a multi-user environment with shared records, you would need a more sophisticated approach — operational transforms or user-visible merge prompts — but that was out of scope here.
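The comparison itself is small. Here is a sketch in Python, assuming both copies carry modifiedAt as a timezone-aware datetime. The function name and the tie-breaking rule (the server copy is kept on equal timestamps) are choices made for this example. The result is only as reliable as the device clock, so a device with a badly wrong clock can still win or lose incorrectly.

```python
from datetime import datetime, timezone


def resolve_last_write_wins(device: dict, server: dict) -> dict:
    """Return whichever version carries the newer modifiedAt timestamp.

    On a tie the server copy is kept (an arbitrary choice for this sketch).
    """
    if device["modifiedAt"] > server["modifiedAt"]:
        return device
    return server


# Example: the ranger edited the notes an hour after the server copy was written.
server_copy = {
    "notes": "original",
    "modifiedAt": datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc),
}
device_copy = {
    "notes": "edited in the field",
    "modifiedAt": datetime(2024, 1, 1, 11, 0, tzinfo=timezone.utc),
}
winner = resolve_last_write_wins(device_copy, server_copy)
```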

Mistake 3: I Underestimated Network Failure Testing

The hardest bugs in offline-first apps only appear under specific failure conditions: connection killed mid-upload, phone rebooted during sync, app force-closed while a batch is in syncing state, full local database with an unresponsive server. Standard unit tests do not catch these. Emulator-based tests rarely simulate them reliably.

I now run a 12-item manual checklist on a physical device before delivering any offline-capable feature. It includes scenarios like killing the connection mid-upload, rebooting during sync, and running with a full local database against a throttled server. Without this checklist, I would have shipped two of the bugs above to production.

One specific scenario worth flagging: records stuck in syncing state after an app crash. If the app restarts and finds records in syncing, it does not know whether the server received them or not. My solution: on startup, any record that has been in syncing state for more than 5 minutes is reset to pending. The idempotency check on the server handles the case where the server had already received it.
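A sketch of that startup recovery step, using Python's built-in sqlite3 module for illustration rather than Drift. The observations table layout and the syncing_started_at column (a Unix timestamp recorded when a record enters syncing) are assumptions for this example; the schema shown earlier has no such column.

```python
import sqlite3

STALE_AFTER_SECONDS = 5 * 60  # the 5-minute threshold described above


def reset_stale_syncing(conn: sqlite3.Connection, now: float) -> int:
    """Move records stuck in 'syncing' back to 'pending'.

    `now` is the current Unix time in seconds. Returns the number of
    records that were reset.
    """
    cutoff = now - STALE_AFTER_SECONDS
    cursor = conn.execute(
        "UPDATE observations"
        " SET status = 'pending', syncing_started_at = NULL"
        " WHERE status = 'syncing' AND syncing_started_at < ?",
        (cutoff,),
    )
    conn.commit()
    return cursor.rowcount
```

Resetting is safe only because the server endpoint is idempotent: a record that did reach the server before the crash is simply acknowledged again on the next attempt.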

Battery Life and GPS: The Field Reality

Rangers carry the device for 8-10 hour field days. Battery is a hard constraint. GPS is the feature most likely to drain it.

I used two GPS modes depending on context. For background route tracking, where the app passively logs position every few minutes, I used LocationAccuracy.low, which asks the platform for coarse, low-power positioning (typically cell tower and Wi-Fi signals where they are available) instead of keeping the GPS chip fully active. Accuracy drops to roughly 100-300 metres, which is acceptable for a route overview. Battery impact is minimal.

For recording a new observation — where the ranger is standing still and needs accurate coordinates — the app switches to LocationAccuracy.high for the duration of the recording, then drops back to low. This approach kept the app usable for a full field day on mid-range Android hardware without requiring a battery pack.

One lesson learned the hard way: always show the ranger the current accuracy in metres before they submit an observation. Two rangers submitted observations with 800m accuracy because they were under dense canopy and the GPS had not locked. A simple accuracy indicator with a "waiting for GPS lock" state prevented this in subsequent builds.
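The gating logic behind such an indicator can be very small. A sketch in Python for illustration: the 25-metre threshold, the state names and both function names are assumptions, not values from the project.

```python
from typing import Optional

MAX_ACCURACY_METRES = 25.0  # assumed threshold; tune per project


def gps_state(accuracy_m: Optional[float]) -> str:
    """Map the reported horizontal accuracy to a UI state.

    Larger accuracy values mean a less precise fix. None means the
    device has not reported a position yet.
    """
    if accuracy_m is None or accuracy_m > MAX_ACCURACY_METRES:
        return "waiting_for_gps_lock"
    return "ready"


def can_submit(accuracy_m: Optional[float]) -> bool:
    """Allow submission only once the fix is precise enough."""
    return gps_state(accuracy_m) == "ready"
```

With a rule like this, an observation with an 800-metre accuracy reading would be held in the waiting state instead of being submitted.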

Architecture Summary: What Makes Offline-First Hard

Looking back at the project, the technical complexity was not in any individual component. Drift is well-documented. Django REST Framework is mature. S3 uploads are routine. The difficulty was in the interaction between all of these under adversarial network conditions — and in making the system feel transparent to a non-technical user who cannot tell you "the sync is in a bad state."

The principles I would carry into any future offline-first app:

Principle	Why It Matters
UUID on device at creation	Enables idempotency — safe to retry any operation any number of times
Explicit status state machine	Every record knows its own sync state; the UI can surface this clearly
Decouple binary uploads from record sync	Photos blocking records is a UX disaster on slow connections
Crash recovery on startup	Records stuck in `syncing` must be reset to `pending`
modifiedAt timestamp on every record	Enables "last write wins" conflict resolution without complex merge logic
Physical device testing under bad conditions	Emulators do not simulate mid-upload network kills or reboot-during-sync accurately

Where This Architecture Applies Beyond Forest Apps

Offline-first is not a niche requirement. It is relevant in a wider range of applications than most developers initially realise:

Field inspection apps — construction site surveys, agricultural monitoring, infrastructure audits. These workers spend hours in areas with unreliable signal.
Healthcare in low-connectivity regions — community health workers in rural Nepal, India, or Africa recording patient data that cannot wait for connectivity.
Logistics and delivery — drivers in basements, warehouses, and rural routes need package scan data to survive a connectivity drop.
Event check-in and ticketing — stadiums and venues have notoriously unreliable mobile coverage when 40,000 people are in the same area.
Any app used on public transport: tunnel-heavy metro systems in cities like London and Tokyo create regular connectivity gaps that a well-built app should handle gracefully.

The core pattern is always the same: local-first data creation, explicit sync state, idempotent server endpoints, and transparent feedback to the user about what is pending. The details differ by domain; the architecture stays remarkably consistent.

Final Thoughts

This was one of the most technically rewarding projects I have taken on as a freelance developer. Not because it was the most complex codebase — it was not. But because the stakes were real in an unusual way. If the sync failed silently, conservation data that could not be re-collected was lost. If the app drained a battery in four hours, a ranger was navigating forest terrain without a working device.

Building for constrained, high-stakes environments forces a kind of rigour that comfortable, always-connected apps do not. The sync architecture I ended up with (three-phase, idempotent, state-machine-driven) is now my default starting point for any project that touches unreliable connectivity. I have reused significant portions of it in two subsequent projects.

Offline-first is not about handling the absence of internet. It is about designing an application that treats the network as an optional enhancement rather than a required dependency. When you internalise that inversion, the architecture becomes considerably cleaner.

If you are building a Flutter app that needs to work without reliable connectivity — whether that is in a Nepali forest, a hospital in a rural district, or a delivery vehicle in an underground car park — feel free to reach out. I have built this architecture before, know where the edge cases hide, and can help you avoid the mistakes I made the first time.