## Description

Deep learning and neural networks: how to stack our logistic regression units into a multi-layer perceptron.

## Resources
- Overview:
** Deep Learning Simplified (https://www.youtube.com/watch?v=b99UVkWzYTQ) `video:easy` quick series to get the lay of the land.
- Quickstart:
** TensorFlow Tutorials (https://www.tensorflow.org/get_started/get_started) `tutorial:medium`
- Deep-dive code (pick one):
** Fast.ai (http://course.fast.ai/) `course:medium` practical DL for coders
** Hands-On Machine Learning with Scikit-Learn and TensorFlow (http://amzn.to/2tVdIXN) `book:medium`
- Deep-dive theory:
** Deep Learning Book (http://amzn.to/2tXgCiT) (Free HTML version (http://www.deeplearningbook.org/)) `book:hard` comprehensive DL bible; highly mathematical

## Episode
- Value
** Represents brain? Magic black-box
** Feature learning (layer removed from programmer)
** Subsumes AI
- Stacked shallow learning
** Logistic regression = Lego brick; neural network = castle
- Deep Learning => ANNs => MLPs (& RNNs, CNNs, DQNs, etc.)
** MLP: Perceptron vs LogReg / sigmoid activation
- Architecture
** (Feed forward) Input => Hidden Layers => Hypothesis fn (sketched in code below)
** "Feed forward" vs recursive (RNNs, later)
** (Loss function) Cross entropy
** (Learn) Backpropagation
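
A minimal NumPy sketch of the feed-forward pass above, assuming one hidden layer of sigmoid (logistic) units and a binary cross-entropy loss; the layer sizes, data, and seed are made-up illustration values, not anything from the episode:

```python
# A single feed-forward pass: input -> hidden layer -> hypothesis fn,
# scored with binary cross-entropy. All sizes and data are arbitrary.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = rng.normal(size=(4, 3))         # 4 examples, 3 input features
y = np.array([0.0, 1.0, 1.0, 0.0])  # binary labels

W1, b1 = rng.normal(size=(3, 5)), np.zeros(5)  # input -> 5 hidden units
W2, b2 = rng.normal(size=(5, 1)), np.zeros(1)  # hidden -> output

hidden = sigmoid(X @ W1 + b1)              # each hidden unit = a logistic unit
y_hat = sigmoid(hidden @ W2 + b2).ravel()  # hypothesis fn h(x)

# Binary cross-entropy loss
loss = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
print(loss)  # cross-entropy of the untrained network
```
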
- Price ~ smoking + obesity + age^2
** 1-layer MLP (toy sketch below)
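
A hedged toy version of that example using scikit-learn's `MLPRegressor` (a real API, but every coefficient and noise level here is invented): the net is given raw age with no hand-engineered age^2 column, and a single hidden layer can pick up the nonlinearity on its own:

```python
# Made-up data for price ~ smoking + obesity + age^2 (price in $1000s).
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 2000
smoking = rng.integers(0, 2, n).astype(float)  # 0/1 smoker flag
obesity = rng.integers(0, 2, n).astype(float)  # 0/1 obesity flag
age = rng.uniform(18, 80, n)

# Invented ground truth with a quadratic age term.
price = 5.0 * smoking + 4.0 * obesity + 0.005 * age**2 + rng.normal(0, 0.5, n)

X = np.column_stack([smoking, obesity, age])   # raw age, no age^2 feature
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(8,), solver="lbfgs",
                 max_iter=5000, random_state=0),
)
model.fit(X, price)
print(model.score(X, price))  # R^2; near 1 if the net learned the age^2 effect
```
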
- Face? ~ pixels
** Extra layer = hierarchical breakdown
** Inputs => Employees => Supervisors => Boss
- Backprop / Gradient descent
** Optimizers: Adagrad, Adam, ... vs vanilla gradient descent (update step sketched below)
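
A sketch of backpropagation written out by hand for the same tiny one-hidden-layer network, with the plain gradient-descent update; optimizers like Adam or Adagrad compute these same gradients but replace the `W -= lr * grad` step with adaptive per-parameter step sizes. Sizes, data, and learning rate are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = rng.normal(size=(4, 3))                # 4 examples, 3 features
y = np.array([[0.0], [1.0], [1.0], [0.0]])
W1, b1 = rng.normal(size=(3, 5)), np.zeros(5)
W2, b2 = rng.normal(size=(5, 1)), np.zeros(1)
lr = 0.5                                   # vanilla gradient-descent step size

for step in range(1000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    y_hat = sigmoid(h @ W2 + b2)

    # Backward pass. With a sigmoid output and cross-entropy loss, the
    # error signal at the output simplifies to (y_hat - y).
    d_out = (y_hat - y) / len(X)
    dW2, db2 = h.T @ d_out, d_out.sum(axis=0)
    d_hid = (d_out @ W2.T) * h * (1 - h)   # chain rule through sigmoid
    dW1, db1 = X.T @ d_hid, d_hid.sum(axis=0)

    # Plain gradient descent; an optimizer would modify only these lines.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

loss = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
print(loss)  # should be small after training on this tiny toy set
```
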
- Silver bullet, but don't abuse it
** linear problems (e.g., the housing market)
** features that don't combine/interact
** expensive: on simple problems it's like hiring a whole company when the boss, h(x), does all the work
- Brain comparison (dendrites, axons); early pioneers were neuroscientists / cognitive scientists
- Different types
** vs brain
** RNNs
** CNNs
- Activation fns
** Activation units / neurons (hidden layer)
** ReLU, tanh, sigmoid (defined below)
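
The three activations as plain NumPy, to make the shapes concrete (ReLU is unbounded above and sparse, tanh is zero-centered, sigmoid is the logistic-regression squash):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)        # zero for negatives, identity for positives

def tanh(z):
    return np.tanh(z)                # zero-centered, squashes to (-1, 1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))  # squashes to (0, 1)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(z), tanh(z), sigmoid(z), sep="\n")
```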