Growth status: Growing | Updated: Jan 27, 2026 | 2 min read

Thinking in Failure Modes

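System design is often taught as an exercise in scale, but in practice it’s an exercise in failure. Systems fail in partial, messy ways: timeouts, retries, degraded dependencies, and inconsistent state. Designing with this in mind changes which questions come first: not only how much load a component can take, but what happens when one of its dependencies is slow, wrong, or gone.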

A well-designed system assumes that components will fail independently. Networks partition. Services restart. Dependencies become slow instead of completely unavailable. Systems that only handle binary success/failure don’t survive long in production.
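To make the "slow instead of unavailable" case concrete, here is a minimal sketch that classifies a dependency call into three outcomes rather than two. The names `Outcome`, `call_with_deadline`, and `slow_dependency` are hypothetical and exist only for this illustration; a real service would likely rely on its HTTP or RPC client's own timeout support.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    FAILED = "failed"
    TIMED_OUT = "timed_out"  # slow, which is not the same as down


# Shared worker pool used to run dependency calls off the caller's thread.
_pool = ThreadPoolExecutor(max_workers=4)


def call_with_deadline(fn, timeout_s):
    """Run fn() and return (Outcome, value_or_error)."""
    future = _pool.submit(fn)
    try:
        return Outcome.OK, future.result(timeout=timeout_s)
    except FutureTimeout:
        # The call may still be running in the background; we only stop
        # waiting for it. The caller decides whether to retry or degrade.
        return Outcome.TIMED_OUT, None
    except Exception as exc:
        return Outcome.FAILED, exc


def slow_dependency():
    """Hypothetical stand-in for a dependency that answers too late."""
    time.sleep(0.5)
    return "late"


status, _ = call_with_deadline(slow_dependency, timeout_s=0.05)
print(status)
```

Keeping `TIMED_OUT` separate from `FAILED` lets the caller respond differently to each, for example by serving a cached value when a dependency is slow and surfacing an error only when it has actually failed.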

One of the most important design tools is backpressure. Without it, failures cascade. Queues grow unbounded, threads exhaust, and small issues become outages. Backpressure is how systems say “slow down” before things break.
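One simple form of backpressure is a bounded queue that refuses new work when it is full instead of growing without limit. The sketch below assumes a single process and uses Python's standard `queue` module; `BoundedInbox` and its methods are hypothetical names chosen for this example.

```python
import queue


class BoundedInbox:
    """A work queue that says "slow down" by rejecting items when full."""

    def __init__(self, capacity):
        if capacity <= 0:
            # queue.Queue treats maxsize <= 0 as unbounded, which would
            # defeat the purpose of this class.
            raise ValueError("capacity must be positive")
        self._q = queue.Queue(maxsize=capacity)
        self.rejected = 0

    def offer(self, item):
        """Try to enqueue without blocking; return False if the queue is full."""
        try:
            self._q.put_nowait(item)
            return True
        except queue.Full:
            self.rejected += 1
            return False

    def take(self):
        """Remove and return the oldest item; raises queue.Empty if none."""
        return self._q.get_nowait()


inbox = BoundedInbox(capacity=2)
print(inbox.offer("a"), inbox.offer("b"), inbox.offer("c"))  # True True False
```

The rejected producer now has a signal it can act on: retry later, drop the request, or pass the "slow down" further upstream. The `rejected` counter is also a natural metric to watch.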

Another key principle is graceful degradation. Not every feature is equally important. Systems should be able to shed non-essential functionality under load while preserving core behavior. This requires prioritization at design time, not during incidents.
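Deciding priorities at design time can be as small as a table that maps each feature class to the load level at which it is switched off. The following is a sketch under assumed values: the `Priority` levels, the thresholds, and the idea of a single `load` number between 0.0 and 1.0 are all illustrative, not recommendations.

```python
from enum import IntEnum


class Priority(IntEnum):
    CRITICAL = 0   # core behavior, never shed
    NORMAL = 1
    OPTIONAL = 2   # first to go under load


# Hypothetical thresholds: shed a class once load reaches this value.
# CRITICAL has no entry, so it is always served.
SHED_THRESHOLDS = {
    Priority.OPTIONAL: 0.70,
    Priority.NORMAL: 0.90,
}


def should_serve(priority, load):
    """Return True if a request of this priority should be served.

    `load` is an assumed utilization figure in the range 0.0 to 1.0.
    """
    threshold = SHED_THRESHOLDS.get(priority)
    return threshold is None or load < threshold


for load in (0.5, 0.75, 0.95):
    served = [p.name for p in Priority if should_serve(p, load)]
    print(load, served)
```

Because the table is written down ahead of time, the decision about what to drop during an incident has already been made and reviewed, rather than improvised under pressure.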

State management is where system design becomes subtle. Distributed systems trade simplicity for scalability. Keeping state consistent across boundaries is hard, and pretending otherwise leads to fragile designs. Sometimes the right answer is accepting eventual consistency and designing UX and workflows around it.

Observability is not an afterthought in good system design; it is foundational. If you can’t see what the system is doing, you can’t reason about its behavior. Logs, metrics, and traces should be designed alongside the architecture, not bolted on after the first incident.
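As a small illustration of building visibility in from the start, the sketch below wraps an operation so that every call produces a structured log line and increments an outcome counter. It uses only the standard library; the `observed` helper and the in-process `metrics` counter are hypothetical stand-ins for whatever logging and metrics clients a real system uses.

```python
import json
import logging
import time
from collections import Counter
from contextlib import contextmanager

logger = logging.getLogger("app")
metrics = Counter()  # in-process stand-in for a real metrics client


@contextmanager
def observed(operation, **fields):
    """Record duration and outcome for the wrapped block.

    Extra keyword fields are added to the log line and must be
    JSON-serializable.
    """
    start = time.monotonic()
    outcome = "ok"
    try:
        yield
    except Exception:
        outcome = "error"
        raise  # observing a failure must not hide it
    finally:
        elapsed_ms = (time.monotonic() - start) * 1000
        metrics[f"{operation}.{outcome}"] += 1
        logger.info(json.dumps({
            "op": operation,
            "outcome": outcome,
            "elapsed_ms": round(elapsed_ms, 2),
            **fields,
        }))


with observed("lookup", user_id=7):
    pass  # the real work would go here

print(metrics["lookup.ok"])  # 1
```

Counting errors and successes under the same operation name makes it straightforward to compute an error rate later, and the timing field gives early warning when a dependency is becoming slow rather than failing outright.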

In the end, system design is about humility. No system is perfect, and no design survives first contact with reality unchanged. The goal is not to eliminate failure, but to contain it, understand it, and recover from it calmly.
