In this episode I'm joined by a group of seasoned incident response professionals to discuss a simulated incident drill conducted on the Uptime Labs platform. The conversation centers on the Multi-party Dilemma: the challenge of coordinating incident response across teams or organizations with different missions, contexts, or incentives.
Eric Dobbs, our incident analyst, joins to break down the drill and offer insight into the incident dynamics, team interactions, and what incident analysis looks like when it's done well. Participants Alex Elman and Sarah Butt, who served as deputy and lead incident commanders respectively during the drill, recount their roles and experiences, highlighting realistic stress responses, decision-making, and both the failures and successes in coordination. Hamed Silatani, CEO of Uptime Labs, gives context on the behind-the-scenes work he and his team do as the other "characters" driving the narrative of the drill.
The episode showcases the value of structured incident analysis and the benefits of using drills to expose hidden assumptions and improve resilience in complex systems.
A few key highlights include:
References/Resources