Description

We explore the evolution of automation as we continue studying Google's Site Reliability Engineering, while Michael, ah, forget it, Joe almost said it correctly, and Allen fell for it.

The full show notes for this episode are available at https://www.codingblocks.net/episode187.

News

Survey Says

What's your favorite Tom Cruise movie?

Take the survey at: https://www.codingblocks.net/episode187.

Automation

Why Do We Automate Things?

The famous "SRE Book" from Google

If we are engineering processes and solutions that are not automatable, we continue having to staff humans to maintain the system. If we have to staff humans to do the work, we are feeding the machines with the blood, sweat, and tears of human beings. Think The Matrix with less special effects and more pissed off System Administrators.

Joseph Bironas

The Value of SRE at Google

Google's Use Cases for Automation

The Use Cases for Automation

A Hierarchy of Automation Classes

Maturity Model

When your levels of abstraction get very sophisticated, you can lose the ability to work effectively at a lower level. It's kind of like trying to make your own toaster today (Gizmodo).
  1. No automation: the database is failed over to a new location manually.
  2. Externally maintained system-specific automation: an SRE keeps a couple of commands in their notes and runs them by hand.
  3. Externally maintained generic automation: an SRE adds a script to a shared playbook.
  4. Internally maintained system-specific automation: the database ships with its own failover script.
  5. System doesn't need automation: the database notices the problem and fails over automatically.
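
To make the difference between the lower and upper levels concrete, here is a minimal Python sketch. It is not from the book or any real database: `Replica`, `Database`, `external_failover`, and `self_heal` are hypothetical names for a toy cluster. The external function stands in for a script an SRE runs by hand (levels 2 and 3), while `self_heal` stands in for a system that notices the failure itself (level 5).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Replica:
    # Hypothetical replica record: a location name and a health flag.
    name: str
    healthy: bool = True


@dataclass
class Database:
    # Toy cluster used only for illustration.
    primary: Replica
    standbys: List[Replica] = field(default_factory=list)

    def promote(self, standby: Replica) -> None:
        # Swap roles: the standby becomes primary, the old primary becomes a standby.
        self.standbys.remove(standby)
        self.standbys.append(self.primary)
        self.primary = standby

    def self_heal(self) -> bool:
        # Level 5: the system checks its own primary and fails over without a human.
        if self.primary.healthy:
            return False
        for standby in self.standbys:
            if standby.healthy:
                self.promote(standby)
                return True
        raise RuntimeError("no healthy standby available")


def external_failover(db: Database, target_name: str) -> None:
    # Levels 2-3: logic kept outside the database that a person has to remember to run.
    for standby in db.standbys:
        if standby.name == target_name:
            db.promote(standby)
            return
    raise ValueError(f"unknown standby: {target_name}")
```

The failover logic is the same in both cases. What changes between levels is where it lives and who has to trigger it, which is also why fully hands-off systems can leave people out of practice when a manual failover is finally needed.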

Can you automate so much that developers are unable to manually support systems when the (very rare) need arises?

Resources we Like

Tip of the Week