fadaly.net/work/chaosscore
SRE & INCIDENT
BLAST.
14 chaos experiments scored across system tiers.
4 surfaced critical dependencies the runbook didn't name.
2 caused a full customer-facing outage during business hours.
You don't pick the day your dependencies fail. Unless you do.
CX-009 · DynamoDB partition failure
PROD OUTAGE
Ran during business hours. Cart service down 47 min. No fallback.
Add read-replica failover; schedule chaos runs for off-hours.
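One way to enforce the off-hours rule is a guard the experiment runner checks before touching prod. A minimal sketch, not the tooling used here: the Mon-Fri 09:00-17:00 window and the names `is_business_hours` and `may_run_experiment` are assumptions for illustration.

```python
from datetime import datetime, time

# Assumed business window: Mon-Fri, 09:00-17:00 local time (hypothetical values).
BUSINESS_START = time(9, 0)
BUSINESS_END = time(17, 0)


def is_business_hours(now: datetime) -> bool:
    """Return True if `now` is a weekday inside the assumed business window."""
    is_weekday = now.weekday() < 5  # Monday=0 ... Friday=4
    return is_weekday and BUSINESS_START <= now.time() < BUSINESS_END


def may_run_experiment(now: datetime, targets_prod: bool) -> bool:
    """Allow non-prod runs any time; allow prod runs only outside business hours."""
    return not (targets_prod and is_business_hours(now))
```

Call `may_run_experiment(datetime.now(), targets_prod=True)` at the top of a run and abort on `False`. The guard only covers scheduling; it does nothing about the missing fallback.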