Description

We start our deep dive into Joe's favorite new book, Designing Data-Intensive Applications, as Joe can't be stopped while running downhill, Michael might have a new spin on #fartgate, and Allen doesn't quite have a dozen tips this episode.

If you're reading this via your podcast player, you can always go to https://www.codingblocks.net/episode120 to read these show notes on a larger screen and participate in the conversation.

Sponsors

Survey Says

What is the single most important piece of your battlestation?

News

Designing Data-Intensive Applications

About this book

What is a data-intensive application per the book?

Any application whose primary challenge is:

That's in contrast to applications that are compute intensive.

Buzzwords that seem to be synonymous with data-intensive

This book is …

This book is NOT a tutorial on how to build data-intensive applications with a particular toolset, and it is NOT pure theory either.

What the book IS:

Why read this book?

The goal is that by going through this, you will be able to understand what's available and why you would use various methods, algorithms, and technologies.

While this book is geared towards software engineers, architects, and their managers, it will especially appeal to those who:

"[B]uilding for scale that you don't need is wasted effort and may lock you into an inflexible design."

Martin Kleppmann

Most of the book covers what is known as "big data", but the author doesn't like that term, and for good reason: "big data" is too vague. What is big data to one person is small data to someone else.

Instead, the book uses more precise language, such as single-node vs. distributed systems.

The book is also heavily biased towards FOSS (Free and Open Source Software) because with open source it's possible to dig into the code and see what's actually going on.

Are we living in the golden age of data?

Reliability

While reading this book, think about the systems that you use: How do they rate in terms of reliability, scalability, and maintainability?

What does it mean for your application to be reliable?

So in short – the application works correctly even when things go wrong.

The things that can go wrong are called "faults".

Faults are NOT the same as failures: a fault is one component of the system deviating from its spec, while a failure means the system as a whole stops providing its service and is unavailable.
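To make the fault vs. failure distinction a bit more concrete, here is a minimal Python sketch. Everything in it is made up for illustration (`ComponentFault`, `flaky_lookup`, and `fault_tolerant_lookup` are hypothetical names, not anything from the book), and a plain retry loop is only one of many ways to tolerate a fault.

```python
class ComponentFault(Exception):
    """Hypothetical error: one component deviated from its spec."""


def flaky_lookup(attempt: int) -> str:
    # Hypothetical component that misbehaves on its first two calls.
    if attempt < 2:
        raise ComponentFault(f"lookup misbehaved on attempt {attempt}")
    return "value"


def fault_tolerant_lookup(max_attempts: int = 3) -> str:
    """Retry so a fault in one component does not become a failure."""
    faults = []
    for attempt in range(max_attempts):
        try:
            return flaky_lookup(attempt)
        except ComponentFault as fault:
            # A fault happened, but the caller never has to see it.
            faults.append(fault)
    # Only when every attempt faults does the service as a whole fail.
    raise RuntimeError(f"service failure after {len(faults)} faults")


result = fault_tolerant_lookup()
```

With the default three attempts, the two faults are absorbed and the caller still gets an answer. With `max_attempts=2`, the same faults add up to a failure.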

Hardware Faults

Typically, hardware faults are handled by adding redundancy:

As time has marched on, single-machine resiliency has been deprioritized in favor of elasticity, i.e. the ability to add or remove machines as needed. As a result, systems are now being built to tolerate the loss of entire machines.
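As a rough illustration of tolerating machine loss, the sketch below tries each machine in a pool until one answers. The `Machine` class, the `MachineDown` error, and the node names are all hypothetical stand-ins. Real systems lean on load balancers, health checks, and replication rather than a simple loop like this.

```python
from typing import Sequence


class MachineDown(Exception):
    """Hypothetical error for a machine that cannot be reached."""


class Machine:
    def __init__(self, name: str, alive: bool = True) -> None:
        self.name = name
        self.alive = alive

    def handle(self, request: str) -> str:
        if not self.alive:
            raise MachineDown(self.name)
        return f"{self.name} handled {request}"


def handle_with_failover(machines: Sequence[Machine], request: str) -> str:
    """Try each machine in order and skip any that are down."""
    for machine in machines:
        try:
            return machine.handle(request)
        except MachineDown:
            continue  # losing one machine is tolerated
    # Only losing every machine turns into a failure for the caller.
    raise RuntimeError("no machines available")


pool = [Machine("node-a", alive=False), Machine("node-b"), Machine("node-c")]
response = handle_with_failover(pool, "GET /health")
```

Here `node-a` is down, so the request quietly falls through to `node-b`. The caller only sees an error if the whole pool is gone.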

Software Errors

Software errors are usually triggered by some unusual event that nobody planned for, and they can be more difficult to track down than hardware errors. Examples include:

Human Errors

Humans can be the least reliable part of any system. So, how can we make systems reliable in spite of our best efforts to crash them?

How important is reliability?

Resources We Like

Tip of the Week