We're going to use the policy gradient technique from reinforcement learning to beat the game of Pong. We'll use OpenAI's Universe as the environment for our agent, and I'll go over how to set it up as well as the math behind the policy gradient (PG) method in detail.
Microphone popping issues end at 11:15. That cannot happen again. Udacity is aware of this and will be more prepared next time.
Code for this video:
https://github.com/llSourcell/Policy_Gradients_to_beat_Pong
Join us in the Wizards Slack channel:
http://wizards.herokuapp.com/
More learning resources:
http://www.scholarpedia.org/article/Policy_gradient_methods
http://proceedings.mlr.press/v32/silver14.pdf
http://karpathy.github.io/2016/05/31/rl/
http://home.deib.polimi.it/restelli/MyWebSite/pdf/rl7.pdf
http://www0.cs.ucl.ac.uk/staff/D.Silver/web/Teaching_files/pg.pdf
https://github.com/dennybritz/reinforcement-learning/tree/master/PolicyGradient
Please Subscribe! And like. And comment. That's what keeps me going.
And please support me on Patreon:
https://www.patreon.com/user?u=3191693
Follow me:
Twitter: https://twitter.com/sirajraval
Facebook: https://www.facebook.com/sirajology
Instagram: https://www.instagram.com/sirajraval/