Description

We wrap up the discussion on partitioning from our collective favorite book, Designing Data-Intensive Applications, while Allen is properly substituted, Michael can't stop thinking about Kafka, and Joe doesn't live in the real sunshine state.

The full show notes for this episode are available at https://www.codingblocks.net/episode172.

Sponsors

Survey Says

How many different data storage technologies do you use for your day job?

News

Last Episode …

Designing Data-Intensive Applications: Best book evar!

In our previous episode, we talked about data partitioning, which refers to how you can split up data sets. That's great when you have data that's too big to fit on a single machine, or when you have special performance requirements. We talked about two different partitioning strategies: key ranges, which work best with homogeneous, well-balanced keys, and hashing, which provides a much more even distribution and helps avoid hot-spotting.
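To make the two strategies concrete, here is a minimal Python sketch. It is not taken from the book or from any particular database: the boundary values, the partition count, and the function names are all made up for illustration, and MD5 is used only as a convenient stable hash.

```python
import bisect
import hashlib

# Hypothetical split points: keys below "h" go to partition 0,
# keys from "h" up to (not including) "q" go to 1, the rest go to 2.
RANGE_BOUNDARIES = ["h", "q"]
NUM_PARTITIONS = 3


def range_partition(key: str) -> int:
    """Key-range partitioning: neighboring keys land in the same partition."""
    return bisect.bisect_right(RANGE_BOUNDARIES, key)


def hash_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Hash partitioning: a stable hash scatters keys across partitions."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


if __name__ == "__main__":
    # Date-like keys all sort below "h", so range partitioning sends
    # every one of them to the same partition (a hot spot).
    keys = ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"]
    for key in keys:
        print(key, range_partition(key), hash_partition(key))
```

Range partitioning keeps adjacent keys together, which makes range scans cheap but lets skewed keys pile up on one partition. Hashing gives up that ordering in exchange for spreading the load.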

This episode we're continuing the discussion, talking about secondary indexes, rebalancing, and routing.

Partitioning, Part Deux

Partitioning and Secondary Indexes

Document Based Partitioning

Term Based Partitioning

Rebalancing Partitions

Partitions > Nodes

Other methods of partitioning

Automated vs Manual Rebalancing

Request Routing

Parallel Query Execution

Resources We Like

Tip of the Week

Powerlevel10k Configuration Wizard. Check out Powerlevel10k.