How do you build software for unreliable power and internet?

Assume the connection will drop. Let core work happen offline and sync later, make every action safe to interrupt and retry, and degrade in tiers instead of collapsing.

What does offline-first mean?

The software holds data locally and keeps working without a connection, syncing to the server quietly whenever one is available — so the network is a convenience, not a precondition.

Does designing for poor connectivity help elsewhere?

Yes. Offline-first, interruption-safe, frugal-data design is simply robust engineering — software that survives a Dar es Salaam Tuesday survives almost anywhere.

Building software for unreliable power and patchy data

Most software in the world is built on a quiet assumption: the power stays on and the connection is always there. The frameworks assume it, the tutorials assume it, the architecture diagrams assume it. For a developer in San Francisco, it is close enough to true to ignore. For a developer in Dar es Salaam, it is a lie that will quietly break everything you ship.

Here, power flickers and connections drop — not as rare disasters but as ordinary Tuesday conditions. Software that only works when everything is up is software that fails precisely when your user most needs it: mid-transaction, mid-sale, mid-shift. This essay is about the engineering principles for building software that survives those conditions gracefully. They are not workarounds for a deficient environment. They are simply good engineering, and they happen to be mandatory here.

Image: High-voltage power lines running from a power station. Caption: Do not design against the power going out. Assume it will, and keep serving anyway. Credit: Robosk / Wikimedia Commons (CC BY-SA 4.0)

The core principle: assume the failure, design for it

The fundamental shift is to stop treating disconnection as an error and start treating it as a normal state your software passes through. In a place with reliable infrastructure, you can get away with "if the network fails, show an error and stop." Here, that produces software that is broken most of the time it matters.

The mindset instead is: the connection will drop during this operation. What should happen? If the answer is "the user loses their work" or "the sale cannot be completed," you have not finished designing. Resilience is not a feature you add at the end. It is the assumption you start from.

Principle one: work offline, sync later

The most important capability is that core work continues when the connection is gone, and quietly catches up when it returns. A shopkeeper recording a sale, a field officer registering a farmer, a teacher marking attendance — none of these should stop because a tower is congested.

In practice this means the software holds its data locally first, lets the user keep working against that local copy, and synchronises to the server in the background whenever a connection is available. The user should barely notice the network state at all. When the connection is there, sync happens silently. When it is not, work continues and waits its turn. The network becomes a background convenience, not a precondition for doing anything.

This is more work to build than a simple always-online app. It is also the difference between software people can actually rely on here and software they abandon the first day it fails them during a sale.
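One way to picture "local first, sync later" is an outbox: every record is written to a local database before any network call, and a background step uploads whatever is still pending. The sketch below is a minimal illustration, not a finished sync engine. The `Outbox` class, its table layout, and the `send` callback are hypothetical names chosen for the example, and SQLite stands in for whatever local store your platform offers.

```python
import json
import sqlite3
import uuid


class Outbox:
    """Local-first store: every record is saved on the device before any network call."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            "id TEXT PRIMARY KEY, payload TEXT NOT NULL, "
            "synced INTEGER NOT NULL DEFAULT 0)"
        )
        self.db.commit()

    def record(self, payload):
        """Save locally and return at once; this works with no connection."""
        record_id = str(uuid.uuid4())
        with self.db:  # commits on success, rolls back on error
            self.db.execute(
                "INSERT INTO outbox (id, payload) VALUES (?, ?)",
                (record_id, json.dumps(payload)),
            )
        return record_id

    def pending(self):
        """Return unsynced records in the order they were recorded."""
        rows = self.db.execute(
            "SELECT id, payload FROM outbox WHERE synced = 0 ORDER BY rowid"
        ).fetchall()
        return [(record_id, json.loads(raw)) for record_id, raw in rows]

    def sync(self, send):
        """Upload pending records in order; stop quietly if the network fails."""
        sent = 0
        for record_id, payload in self.pending():
            try:
                send(record_id, payload)  # hypothetical uploader supplied by the caller
            except OSError:
                break  # offline: leave the rest queued for the next attempt
            with self.db:
                self.db.execute(
                    "UPDATE outbox SET synced = 1 WHERE id = ?", (record_id,)
                )
            sent += 1
        return sent
```

The user-facing code only ever calls `record`, and `sync` can run on a timer or whenever the device reports a connection. If power fails between a successful upload and the local update, the same record will be sent again, so the server should treat the record id as a deduplication key.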

Principle two: every action must survive an interruption

Power and data can vanish in the middle of anything — a payment, a form submission, a save. The question for every important action is: if this is interrupted halfway, what state are we left in?

The unacceptable answer is "we do not know" — money taken but order not recorded, form half-saved, data corrupted. Good design makes actions safe to interrupt and safe to retry. If a payment confirmation is cut off, the system can check afterwards whether it actually went through, rather than guessing. If a save is interrupted, the data is either fully saved or not at all, never half. The user can repeat an action without fear of doing it twice, because the software recognises the repeat.
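A common way to let the software "recognise the repeat" is an idempotency key: the client generates one unique key per intended action and sends the same key on every retry. The sketch below assumes that approach. `PaymentServer`, `charge`, `status`, and `pay_with_retry` are hypothetical names, and the server is an in-memory stand-in rather than a real payment API.

```python
import uuid


class PaymentServer:
    """Hypothetical in-memory server that recognises a repeated request by its key."""

    def __init__(self):
        self.results = {}  # idempotency key -> stored result
        self.charged = 0   # total amount actually charged

    def charge(self, key, amount):
        if key in self.results:
            # Repeat of an earlier request: return the old result, charge nothing.
            return self.results[key]
        self.charged += amount
        result = {"status": "paid", "amount": amount}
        self.results[key] = result
        return result

    def status(self, key):
        """Lets a client ask afterwards whether an interrupted payment went through."""
        return self.results.get(key, {"status": "unknown"})


def pay_with_retry(server, amount, key, attempts=3, drop_first_reply=False):
    """Retry with the same key, so a lost confirmation never causes a double charge."""
    for attempt in range(attempts):
        result = server.charge(key, amount)
        if drop_first_reply and attempt == 0:
            continue  # simulate the confirmation being cut off in transit
        return result
    raise RuntimeError("payment could not be confirmed")


# Usage: the first reply is lost, the retry reuses the key, the customer pays once.
server = PaymentServer()
key = str(uuid.uuid4())
receipt = pay_with_retry(server, 5000, key, drop_first_reply=True)
```

The key must be created once per intended action and stored with it, not regenerated on each retry. A fresh key on every attempt would make each retry look like a new payment.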

This discipline — making operations interruption-safe and repeat-safe — is the single most valuable habit for unreliable conditions, and it is exactly the discipline that makes software robust everywhere. The flaky connection just forces you to do what you should have done anyway.
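The "fully saved or not at all" guarantee for an interrupted save is often achieved by writing to a temporary file and then renaming it over the real one. The sketch below assumes the data lives in a local JSON file. `atomic_save` is a hypothetical helper name, and the atomicity of the rename relies on both paths being on the same filesystem.

```python
import json
import os
import tempfile


def atomic_save(path, data):
    """Write to a temporary file, then swap it into place in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the swap
        # Readers see either the old file or the new one, never a half-written mix.
        os.replace(tmp_path, path)
    except BaseException:
        # The write failed part-way: discard the temp file, keep the old data.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If power is lost before the rename, the previous file is still intact and at worst a stray `.tmp` file is left behind to clean up on the next start.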

Principle three: degrade in tiers, do not collapse

Software should not be binary — fully working or fully broken. It should fall back through tiers as conditions worsen. Full connection: everything works. Weak connection: the essentials work, the heavy extras wait. No connection: the core local work continues and queues to sync.

A point-of-sale system is the clear example. Online, it does everything — sales, live inventory, reports, receipts by SMS. Offline, it must still do the one thing that cannot wait: record the sale, take the payment, let the customer leave with their goods. The reports and the live sync catch up later. A POS that simply stops when the connection drops is not a smaller version of itself; it is useless exactly when a customer is standing at the counter.

Designing these tiers deliberately — deciding in advance what must survive to the lowest tier and what may gracefully disappear — is the work. The alternative, an all-or-nothing system, is the one that fails at the worst moment.
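Deciding the tiers in advance can be as simple as a table that names the lowest tier at which each feature is still offered. The sketch below uses the point-of-sale example. The tier names, the assignment of features to tiers, and the 1500 ms threshold are all illustrative assumptions, not recommendations.

```python
from enum import IntEnum


class Tier(IntEnum):
    OFFLINE = 0
    WEAK = 1
    FULL = 2


# Hypothetical feature table: the lowest tier at which each feature is offered.
MIN_TIER = {
    "record_sale": Tier.OFFLINE,
    "take_payment": Tier.OFFLINE,
    "live_inventory": Tier.WEAK,
    "sms_receipt": Tier.WEAK,
    "reports": Tier.FULL,
}


def current_tier(connected, round_trip_ms=None, weak_above_ms=1500):
    """Classify the connection; the default threshold is an illustrative guess."""
    if not connected:
        return Tier.OFFLINE
    if round_trip_ms is None or round_trip_ms > weak_above_ms:
        return Tier.WEAK  # connected, but slow or not yet measured
    return Tier.FULL


def available_features(tier):
    """Features that should stay enabled at the given tier."""
    return sorted(name for name, needed in MIN_TIER.items() if tier >= needed)
```

Keeping the table explicit forces the design conversation the tiers require: anything marked `Tier.OFFLINE` has to work against local data alone.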

Why this makes better software everywhere

Here is what I tell developers who think of these as African constraints to be apologised for: they are the constraints that produce genuinely robust software, and the rest of the world is slowly, expensively rediscovering them.

Offline-first design, interruption-safe operations, frugal data use, graceful degradation: these are now celebrated best practices in the wider industry, sold as sophistication. We did not adopt them to be sophisticated. We adopted them because the environment refused to let us ship anything fragile. The unreliable connection is a harsh but honest teacher: it fails your weak assumptions immediately, in front of a real user, instead of letting them hide until the product reaches scale.

Software built to survive a Dar es Salaam Tuesday is software that will survive almost anything a gentler environment throws at it. The constraint is not a disadvantage to engineer around. It is a discipline that, taken seriously, makes you build the resilient thing you should have built regardless.

The honest summary

Do not build for the network you wish you had. Build for the one your users actually have — intermittent, congested, and expensive. Assume the connection will drop and design for that state as normal. Let core work happen offline and sync later. Make every important action safe to interrupt and safe to retry. Degrade in tiers instead of collapsing. And treat data as the costly, metered resource it is for your users.

Do this and you will ship software that works on a real Tuesday in Dar es Salaam — which, not by coincidence, is also the most robust software you could build anywhere.
