Topic: on-policy

Something interesting about on-policy

👇 Here are 15 public repositories matching this topic...

amirhosein-mesbah / reinforcement_learning

This repository contains implementations of a wide variety of Reinforcement Learning projects spanning Bandit Algorithms, MDPs, Distributed RL, and Deep RL. They include university projects and projects undertaken out of personal interest in Reinforcement Learning.

User: amirhosein-mesbah

bandit-algorithms deep-reinforcement-learning deeprl distributed-reinforcement-learning gym mdp multi-agent-reinforcement-learning network-routing off-policy on-policy q-learning reinforcement-learning stablebaselines3

by571 / pytorch-vmpo

PyTorch implementation of V-MPO

User: by571

on-policy reinforcement-learning vmpo pytorch-implementation v-mpo

fardinabbasi / tabulated_rl

Interactive Learning [ECE 641] - Fall 2023 - University of Tehran - Prof. Nili

User: fardinabbasi

grid-world markov-decision-processes mdp off-policy on-policy q-learning sarsa tree-backup value-iteration

kristogj / on-policy-mcts

Monte Carlo Tree Search for training a shared actor-critic network on the game Hex 🏋️

User: kristogj

reinforcement-learning mcts hex on-policy pytorch

mabirck / cs294-deeprl

My coursework for the CS294 Deep Reinforcement Learning course, taught by Sergey Levine at UC Berkeley.

User: mabirck

reinforcement-learning cs294 deep-learning neural-networks reinforcement policy-gradient on-policy off-policy deep-reinforcement-learning deep-neural-networks

marcometer / episodic-transformer-memory-ppo

Clean baseline implementation of PPO using an episodic Transformer-XL memory

User: marcometer

actor-critic deep-reinforcement-learning episodic-memory gated-transformer-xl gtrxl memory-gym on-policy policy-gradient pomdp ppo proximal-policy-optimization pytorch transformer transformer-xl trxl

marcometer / recurrent-ppo-truncated-bptt

Baseline implementation of recurrent PPO using truncated BPTT

User: marcometer

actor-critic bptt deep-learning deep-reinforcement-learning gru lstm on-policy policy-gradient pomdp ppo proximal-policy-optimization pytorch recurrence recurrent recurrent-neural-networks truncated

narjesno / reinforcement-learning

This repository contains all of the Reinforcement Learning projects I've worked on. The projects are part of a graduate course at the University of Tehran.

User: narjesno

dynamic-programming off-policy on-policy model-free-rl model-based-rl monte-carlo sarsa n-step-bootstrapping n-step-expected-sarsa n-step-tree-backup