You have heard about the amazing results achieved by DeepMind with AlphaGo Zero and by OpenAI in Dota 2! Don't you want to know how they work? This is the right opportunity for you and me to finally learn Deep RL and use it on exciting new projects.
The ultimate aim is to use these general-purpose technologies and apply them to all sorts of important real world problems. - Demis Hassabis
This repository aims to guide you through the Deep Reinforcement Learning algorithms, from the most basic ones to the highly advanced AlphaGo Zero. You will find the main topics organized by week, along with the resources suggested for learning them. Also, every week I will provide practical examples implemented in Python to help you better digest the theory. You are highly encouraged to modify and play with them!
This is my first project of this kind, so please, if you have any ideas, suggestions or improvements, contact me at [email protected].
- Basic level of Python and PyTorch
- Machine Learning
- Basic knowledge of Deep Learning (MLP, CNN and RNN)
- Q-learning
- DQN
- A2C
- ES
- AlphaGo Zero
- An introduction to Reinforcement Learning by Arxiv Insights
- Introduction and course overview - CS294 by Levine
- Deep Reinforcement Learning: Pong from Pixels by Karpathy
Those who cannot remember the past are condemned to repeat it - George Santayana
This week, we will learn about the basic building blocks of reinforcement learning, starting from the definition of the problem and going all the way through the estimation and optimization of the functions used to express the quality of a policy or state.
- Markov Decision Process - RL by David Silver: formalizing the RL problem using an MDP
  - Markov Processes
  - Markov Decision Processes
- Planning by Dynamic Programming - RL by David Silver: how to solve a known MDP
  - Policy iteration
  - Value iteration
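To make value iteration concrete, here is a minimal sketch on a toy problem. The four-state corridor, its reward of 1 for reaching the last state, and the names `step`, `value_iteration` and `greedy_policy` are all assumptions made up for this illustration; they do not come from the lecture.

```python
GAMMA = 0.9
N_STATES = 4          # hypothetical corridor: states 0..3, state 3 is terminal
ACTIONS = (-1, +1)    # move left or move right


def step(state, action):
    """Known, deterministic model of the toy MDP: returns (next_state, reward)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def action_values(values, state):
    """One-step lookahead: r + gamma * V(s') for every action."""
    return [r + GAMMA * values[s2] for s2, r in (step(state, a) for a in ACTIONS)]


def value_iteration(theta=1e-8):
    """Apply the Bellman optimality backup until the largest change is below theta."""
    values = [0.0] * N_STATES          # the terminal state keeps value 0
    while True:
        delta = 0.0
        for s in range(N_STATES - 1):  # sweep over non-terminal states
            best = max(action_values(values, s))
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < theta:
            return values


def greedy_policy(values):
    """Pick, in every non-terminal state, the action with the best lookahead."""
    policy = []
    for s in range(N_STATES - 1):
        q = action_values(values, s)
        policy.append(ACTIONS[q.index(max(q))])
    return policy
```

Calling `greedy_policy(value_iteration())` should return "move right" in every state, since the only reward sits at the right end of the corridor.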
- Model-Free Prediction - RL by David Silver: estimate the value function of an unknown MDP
  - Monte Carlo Learning
  - Temporal Difference Learning
  - TD(λ)
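As a rough illustration of temporal difference learning, the sketch below runs TD(0), the simplest member of the TD(λ) family, on the five-state random walk used as an example in Sutton & Barto. The function name `td0_random_walk` and the hyperparameters are assumptions chosen for this example, not values taken from the lecture.

```python
import random

N_STATES = 5  # non-terminal states 0..4, with a terminal state beyond each end
# Known true values for this walk, useful for checking the estimate.
TRUE_VALUES = [(i + 1) / 6 for i in range(N_STATES)]


def td0_random_walk(episodes=5000, alpha=0.02, gamma=1.0, seed=0):
    """Estimate state values under a uniformly random policy with TD(0)."""
    rng = random.Random(seed)
    values = [0.5] * N_STATES
    for _ in range(episodes):
        state = N_STATES // 2                     # every episode starts in the middle
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:                    # fell off the left end: reward 0
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= N_STATES:          # fell off the right end: reward 1
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False
            # TD(0) update: move V(s) toward the bootstrapped target r + gamma * V(s').
            values[state] += alpha * (reward + gamma * next_value - values[state])
            if done:
                break
            state = next_state
    return values
```

The agent never sees the transition model here: it only uses sampled transitions, which is what "model-free" means in this context.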
- Model-Free Control - RL by David Silver: optimise the value function of an unknown MDP
  - ε-greedy policy iteration
  - GLIE Monte Carlo Search
  - SARSA
  - Importance Sampling
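Two of the ideas above fit in a few lines each. The sketch below shows ε-greedy action selection and a single SARSA update on a tabular Q-function stored as a list of lists. The function names and default hyperparameters are assumptions made for this illustration.

```python
import random


def epsilon_greedy(q_row, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties at random so that no action is systematically favoured.
    return rng.choice([a for a, q in enumerate(q_row) if q == best])


def sarsa_update(q, s, a, r, s2, a2, done, alpha=0.1, gamma=0.99):
    """On-policy TD control: bootstrap from the action a2 actually taken in s2."""
    target = r if done else r + gamma * q[s2][a2]
    q[s][a] += alpha * (target - q[s][a])
```

In a training loop you would call `epsilon_greedy` to pick `a2` in the next state before calling `sarsa_update`, and then reuse `a2` as the action for the following step.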
Q-learning applied to FrozenLake. As an exercise, you can solve the game using SARSA or implement Q-learning by yourself. In the former case, only a few changes are needed.
- Read chapters 3, 4, 5, 6 and 7 of Reinforcement Learning: An Introduction - Sutton, Barto
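If you want a starting point for the exercise, here is a minimal tabular Q-learning sketch. It is not the code from this repository. To stay dependency-free it uses a hand-written, non-slippery 4x4 grid modelled on the classic FrozenLake map instead of the Gym environment, and the helper names and hyperparameters are assumptions.

```python
import random

# S = start, F = frozen, H = hole, G = goal (layout of the classic 4x4 map).
GRID = "SFFF" "FHFH" "FFFH" "HFFG"
SIZE = 4
MOVES = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # left, down, right, up as (d_row, d_col)


def step(state, action):
    """Deterministic stand-in for the environment: (next_state, reward, done)."""
    row, col = divmod(state, SIZE)
    d_row, d_col = MOVES[action]
    row = min(max(row + d_row, 0), SIZE - 1)
    col = min(max(col + d_col, 0), SIZE - 1)
    next_state = row * SIZE + col
    tile = GRID[next_state]
    return next_state, float(tile == "G"), tile in "HG"


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy selection with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def q_learning(episodes=3000, alpha=0.5, gamma=0.95, epsilon=0.3, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * len(MOVES) for _ in range(SIZE * SIZE)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            action = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Q-learning bootstraps from the best action in next_state (off-policy).
            # SARSA would use the action actually selected in next_state instead.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_rollout(q, max_steps=SIZE * SIZE):
    """Follow the greedy policy from the start and return the visited states."""
    state, path = 0, [0]
    for _ in range(max_steps):
        action = q[state].index(max(q[state]))
        state, _reward, done = step(state, action)
        path.append(state)
        if done:
            break
    return path
```

Running `greedy_rollout(q_learning())` should give a path that ends on the goal tile (state 15). Turning this into SARSA only requires changing how `target` is computed, which is why the exercise needs so few changes.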
📺 Deep Reinforcement Learning - UC Berkeley class by Levine; check their site here.
📺 Reinforcement Learning course - by David Silver, DeepMind. Great introductory lectures by Silver, a lead researcher on AlphaGo. They follow the book Reinforcement Learning by Sutton & Barto.
📚 Reinforcement Learning: An Introduction - by Sutton & Barto. The "Bible" of reinforcement learning. Here you can find the PDF draft of the second edition.
📚 Awesome Reinforcement Learning. A curated list of resources dedicated to reinforcement learning.