Naive implementations of the CSE525 homework assignments
See Environment.ipynb for the environment setup.
- Ubuntu 18.04
- Python 3.6.9
- torch 1.5.0
- numpy 1.18.3
- Baseline: Vanilla online Q-learning (without target network and replay buffer)
- Fitted Q Iteration
- DQN
- DQN-like version of "SARSA"
- InvertedPendulumMuJoCoEnv-v0
- HalfCheetahMuJoCoEnv-v0
- Breakout-v4-ram
- REINFORCE
- Advantage Actor Critic (A2C)
- InvertedPendulumMuJoCoEnv-v0
- HalfCheetahMuJoCoEnv-v0
- Offline dynamics model learning
- Online dynamics model learning
- Breakout-v4
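
The Q-learning-style entries and the SARSA-like variant listed above differ mainly in the bootstrap target. The following is a minimal sketch of that difference in plain Python, not the code from this repository: the function names, the `gamma` default, and the list-based Q-values are assumptions made for illustration.

```python
# Illustrative sketch only: names and defaults are hypothetical,
# not taken from the homework code.

def q_learning_target(reward, next_q_values, done, gamma=0.99):
    """Off-policy target: bootstrap from the greedy next action."""
    if done:
        # No bootstrapping past a terminal state.
        return reward
    return reward + gamma * max(next_q_values)


def sarsa_target(reward, next_q_values, next_action, done, gamma=0.99):
    """On-policy target: bootstrap from the next action actually taken."""
    if done:
        return reward
    return reward + gamma * next_q_values[next_action]


if __name__ == "__main__":
    next_q = [0.5, 2.0]  # Q(s', a) for two actions
    print(q_learning_target(1.0, next_q, done=False, gamma=0.5))
    print(sarsa_target(1.0, next_q, next_action=0, done=False, gamma=0.5))
```

In a DQN-style setup, `next_q_values` would typically come from a target network and the transitions from a replay buffer. The vanilla online baseline uses neither, so it would compute the same target from the current network on the most recent transition.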