A reinforcement learning model that solves a form of the multi-armed bandit problem. Given a set of available ads, it finds the best-performing one in as few rounds as possible, so as to maximize the total number of clicks.
rbnp98 / best-ad-selection-upper-confidence-bound
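As the repository name suggests, the ad is chosen with an Upper Confidence Bound (UCB) strategy. The sketch below is a minimal illustration of that idea, not the repository's actual code: it assumes the standard UCB1 bound (average reward plus sqrt(2 ln t / n)), and the function name `select_ad_ucb`, the simulated click probabilities, and the seed are all hypothetical. In each round it shows the ad with the highest upper bound, observes a simulated click (1) or no click (0), and updates that ad's counts.

```python
import math
import random


def select_ad_ucb(click_probs, n_rounds, seed=0):
    """Run UCB1 over simulated ads; return (times shown, clicks) per ad.

    click_probs are hypothetical true click rates, used only to simulate
    users. A real dataset of logged clicks would replace the simulation.
    """
    rng = random.Random(seed)
    n_ads = len(click_probs)
    shows = [0] * n_ads   # how many times each ad was displayed
    clicks = [0] * n_ads  # how many clicks each ad collected

    for t in range(1, n_rounds + 1):
        best_ad, best_bound = 0, -1.0
        for ad in range(n_ads):
            if shows[ad] == 0:
                # An ad that was never shown gets an infinite bound,
                # so every ad is tried at least once.
                bound = float("inf")
            else:
                mean = clicks[ad] / shows[ad]
                # Assumed UCB1 exploration term: sqrt(2 ln t / n)
                bound = mean + math.sqrt(2 * math.log(t) / shows[ad])
            if bound > best_bound:
                best_ad, best_bound = ad, bound

        # Simulate the user's response to the selected ad.
        reward = 1 if rng.random() < click_probs[best_ad] else 0
        shows[best_ad] += 1
        clicks[best_ad] += reward

    return shows, clicks


if __name__ == "__main__":
    shows, clicks = select_ad_ucb([0.05, 0.12, 0.30, 0.08], n_rounds=5000)
    print("times shown per ad:", shows)
    print("total clicks:", sum(clicks))
```

The exploration term shrinks as an ad is shown more often, so ads with low observed click rates are displayed less and less, while the selection concentrates on the ad that appears to perform best.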