Description

We dive back into Designing Data-Intensive Applications to learn more about replication while Michael thinks cluster is a three-syllable word, Allen doesn't understand how we roll, and Joe isn't even paying attention.

For those that like to read these show notes via their podcast player, we like to include a handy link to get to the full version of these notes so that you can participate in the conversation at https://www.codingblocks.net/episode160.

Sponsors

Survey Says

How important is it to learn advanced programming techniques?

News

"The major difference between a thing that might go wrong and a thing that cannot possibly go wrong is that when a thing that cannot possibly go wrong goes wrong it usually turns out to be impossible to get at or repair."

Douglas Adams

Book: Designing Data-Intensive Applications

In this episode, we are discussing Data Replication, from chapter 5 of "Designing Data-Intensive Applications".

Replication in Distributed Systems

Synchronous vs Asynchronous Writes

Steps for Adding New Followers

  1. Take a consistent snapshot of the leader at some point in time (most databases can do this without locking the whole database)
  2. Copy the snapshot to the new follower
  3. The follower connects to the leader and requests all changes made since the snapshot was taken (this means the snapshot has to be tied to an exact position in the leader's replication log)
  4. When the follower is fully caught up, the process is complete
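
As a rough illustration of the four steps above, here is a toy, in-memory sketch. The `Leader` and `Follower` classes and their methods are made up for this example and do not correspond to any real database's API; the "log position" here is just an index into a Python list.

```python
from dataclasses import dataclass, field


@dataclass
class Leader:
    """Hypothetical leader: a dict of data plus an ordered log of writes."""
    data: dict = field(default_factory=dict)
    log: list = field(default_factory=list)  # ordered (key, value) writes

    def write(self, key, value):
        self.log.append((key, value))
        self.data[key] = value

    def snapshot(self):
        # Step 1: a consistent copy of the data, tagged with the log
        # position it reflects.
        return dict(self.data), len(self.log)

    def changes_since(self, position):
        # Everything written after the given log position.
        return self.log[position:]


@dataclass
class Follower:
    """Hypothetical follower that bootstraps from a snapshot."""
    data: dict = field(default_factory=dict)
    position: int = 0

    def load_snapshot(self, snapshot_data, position):
        # Step 2: copy the snapshot onto the new follower.
        self.data = dict(snapshot_data)
        self.position = position

    def catch_up(self, leader):
        # Step 3: request and apply every change since the snapshot.
        for key, value in leader.changes_since(self.position):
            self.data[key] = value
            self.position += 1
        # Step 4: caught up once we have applied the whole log.
        return self.position == len(leader.log)


leader = Leader()
leader.write("a", 1)
leader.write("b", 2)
snap, pos = leader.snapshot()
leader.write("a", 3)  # a write that lands after the snapshot was taken

follower = Follower()
follower.load_snapshot(snap, pos)
caught_up = follower.catch_up(leader)
```

The point of returning the log position with the snapshot is that the follower knows exactly where to resume, so the write that arrived after the snapshot is not lost.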

Handling Outages

Rough Steps for Failover

  1. Determining that the leader has failed (trickier than it sounds! How can a replica know if the leader is down, or if it's a network partition?)
  2. Choosing a new leader (election algorithms determine the best candidate, which is tricky with multiple nodes and is sometimes handed off to a separate system like Apache ZooKeeper)
  3. Reconfiguring: clients need to be updated (you'll sometimes see things like "bootstrap" services or ZooKeeper that are responsible for pointing to the "real" leader…think about what this means for client libraries…fire and forget? try/catch?)
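
To make the three failover steps a bit more concrete, here is a deliberately simplified sketch. Everything in it is an assumption for illustration: the `Node` class, the 30-second timeout, the rule of promoting the follower with the highest log position, and the `routing` dict standing in for whatever service tells clients where the leader is. Real systems rely on consensus protocols, and a timeout alone cannot distinguish a dead leader from a network partition.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical replica as seen by a monitor."""
    name: str
    last_heartbeat: float  # time (in seconds) the last heartbeat was seen
    log_position: int      # how far this node's replication log has gotten


def leader_seems_down(leader, now, timeout=30.0):
    # Step 1: assume failure if no heartbeat arrived within the timeout.
    # This is only a guess: a slow network looks exactly the same.
    return now - leader.last_heartbeat > timeout


def choose_new_leader(followers):
    # Step 2: promote the follower with the most up-to-date log
    # (a stand-in for a real election algorithm).
    return max(followers, key=lambda node: node.log_position)


def fail_over(leader, followers, now, timeout=30.0):
    if not leader_seems_down(leader, now, timeout):
        return leader
    return choose_new_leader(followers)


old_leader = Node("node-a", last_heartbeat=0.0, log_position=120)
followers = [
    Node("node-b", last_heartbeat=95.0, log_position=118),
    Node("node-c", last_heartbeat=97.0, log_position=120),
]

# Step 3: clients look up the current leader here instead of hard-coding it.
routing = {"leader": old_leader.name}
new_leader = fail_over(old_leader, followers, now=100.0)
routing["leader"] = new_leader.name
```

Note that `node-b` is behind by two log entries in this example; had it been promoted instead, any writes it never received would be lost, which is one reason failover is hard.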

Failover is Hard!

Implementation of Replication Logs

Statement-Based Replication

Write Ahead Log Shipping

Row Based Log Replication

Trigger-Based Replication

Resources We Like

Tip of the Week