Day 1: Overview, prerequisite probability concepts
Day 2: Epsilon-greedy
Day 3: Optimistic initial values, Upper Confidence Bound (UCB1)
Day 4: Bayes Bandits / Thompson Sampling theory, Beta Distribution
Day 5: Thompson Sampling with Gaussian Reward theory
keejin / reinforcement-learning
Repository for my personal reinforcement journey