Reliability · 7 min read

Failover that actually works: designing for sub-second switchover

Published 2026-05-20

Failover is a promise — keep it

Most outages aren't total blackouts. They're brownouts: a link that degrades, drops packets, then limps back. The question isn't whether you have a backup link — it's how fast and how cleanly you move traffic to it before users notice.

Failover vs link aggregation

Failover moves traffic to a standby link when the primary fails. Link aggregation (bonding) uses multiple links at once, spreading sessions for both resilience and extra throughput. The best designs combine both: bond your healthy links and fail away from the sick ones automatically.

  • Continuously probe each WAN for loss, latency and jitter — not just "is it up?".
  • Pin latency-sensitive apps (VoIP, video, payments, SCADA) to the cleanest path.
  • Let bulk traffic ride cheaper or busier links.
  • Switch in under a second so stateful sessions survive.
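As a rough illustration of the first three points, here is a minimal Python sketch of probe-driven path selection. Everything in it is an assumption made for the example: the LinkStats fields, the thresholds, the score weights and the cost field are hypothetical, and none of it describes how any particular product scores links.

```python
from dataclasses import dataclass


@dataclass
class LinkStats:
    name: str
    loss_pct: float    # packet loss over the probe window, in percent
    latency_ms: float  # mean round-trip time
    jitter_ms: float   # mean variation in round-trip time
    cost: int          # hypothetical relative cost; lower is cheaper


# Illustrative thresholds (assumptions, not vendor defaults).
MAX_LOSS_PCT = 2.0
MAX_LATENCY_MS = 150.0
MAX_JITTER_MS = 30.0


def is_healthy(link: LinkStats) -> bool:
    """A link is healthy only if loss, latency and jitter are all in bounds."""
    return (
        link.loss_pct <= MAX_LOSS_PCT
        and link.latency_ms <= MAX_LATENCY_MS
        and link.jitter_ms <= MAX_JITTER_MS
    )


def quality_score(link: LinkStats) -> float:
    """Lower is better. The weights are arbitrary illustrative choices."""
    return link.loss_pct * 50 + link.latency_ms + link.jitter_ms * 2


def pick_link(links: list[LinkStats], latency_sensitive: bool) -> LinkStats:
    healthy = [link for link in links if is_healthy(link)]
    if not healthy:
        # Every link is degraded: fall back to the least-bad one.
        return min(links, key=quality_score)
    if latency_sensitive:
        # VoIP, video, payments, SCADA: take the cleanest path.
        return min(healthy, key=quality_score)
    # Bulk traffic: take the cheapest healthy link, cleanest as tie-breaker.
    return min(healthy, key=lambda link: (link.cost, quality_score(link)))


links = [
    LinkStats("fiber", loss_pct=0.1, latency_ms=12, jitter_ms=1.5, cost=3),
    LinkStats("lte", loss_pct=0.5, latency_ms=45, jitter_ms=8, cost=2),
    LinkStats("dsl", loss_pct=6.0, latency_ms=80, jitter_ms=40, cost=1),  # brownout
]

print(pick_link(links, latency_sensitive=True).name)   # fiber
print(pick_link(links, latency_sensitive=False).name)  # lte
```

The DSL link is still "up", but its loss and jitter take it out of the candidate set. That is the brownout case a plain up/down check would miss.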

Why sub-second matters

A VoIP call tolerates a brief reroute; it does not tolerate a ten-second stall. A card terminal that times out becomes a lost sale. A control loop that misses telemetry can trip a safety system. Sub-second switchover is the difference between a blip and an incident.
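To see what a sub-second budget implies for probing, a back-of-the-envelope model helps. The sketch below assumes a simple detector that declares a link down after a fixed number of consecutive missed probes; the function name, its parameters and the example numbers are all hypothetical and chosen only to show the arithmetic.

```python
def worst_case_switchover_ms(
    probe_interval_ms: int,
    misses_to_fail: int,
    probe_timeout_ms: int,
    reroute_ms: int,
) -> int:
    """Worst-case time from link failure to traffic moving, in a simple model.

    Worst case: the link fails just after a successful probe, so the first
    failing probe is sent a full interval later. The last of the required
    misses is sent misses_to_fail intervals after the failure and is only
    declared lost once its timeout expires. Rerouting is added on top.
    """
    detection_ms = probe_interval_ms * misses_to_fail + probe_timeout_ms
    return detection_ms + reroute_ms


# Probing every 100 ms, 3 misses to fail, 200 ms timeout, 50 ms to reroute.
print(worst_case_switchover_ms(100, 3, 200, 50))    # 550

# Probing once a second with a 1 s timeout cannot meet a sub-second target.
print(worst_case_switchover_ms(1000, 3, 1000, 50))  # 4050
```

Under these assumptions, the probe interval and the miss threshold dominate the budget: a sub-second target forces probes several times a second, not once every few seconds.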

Design for the brownout, not just the blackout.

How CELLWAN does it

CELLWAN bonds up to five WAN links per CPE, probes each in real time, and reroutes application traffic in under a second — all visible and tunable from the cloud console, with event logs and alerts when a link misbehaves.
