This is an implementation of Learning with AMIGo: Adversarially Motivated Intrinsic Goals.
The method described in the AMIGo paper listed below is implemented in monobeast/minigrid/monobeast_amigo.py of this repository. Please consult that file for details of the teacher and student policies, the losses used to train them, and other aspects of training.
The student policy is created in class MinigridNet. The teacher policy is created in class Generator. The training loop is defined in train() and is divided into act(), which collects the batches generated by the actors, and learn(), which updates the learner based on vtrace. Training is based on the TorchBeast implementation of IMPALA (Monobeast version).
If you have any questions or feel the code needs further clarification in the form of comments, please do not hesitate to raise an issue.
If you use AMIGo in your research and find it helpful, or if you compare against our results, please consider citing the following paper:
@article{campero2020learning,
title={Learning with AMIGo: Adversarially Motivated Intrinsic Goals},
author={Campero, Andres and Raileanu, Roberta and K{\"u}ttler, Heinrich and Tenenbaum, Joshua B and Rockt{\"a}schel, Tim and Grefenstette, Edward},
journal={arXiv preprint arXiv:2006.12122},
year={2020}
}
# create a new conda environment
conda create -n amigo python=3.7
conda activate amigo
# install dependencies
git clone git@github.com:facebookresearch/adversarially-motivated-intrinsic-goals.git
cd adversarially-motivated-intrinsic-goals
pip install -r requirements.txt
# Run AMIGo on MiniGrid Environment
OMP_NUM_THREADS=1 python -m monobeast.minigrid.monobeast_amigo --env MiniGrid-KeyCorridorS5R3-v0 \
--num_actors 40 --modify --generator_batch_size 150 --generator_entropy_cost .05 \
--generator_threshold -.5 --total_frames 600000000 \
--generator_reward_negative -.3 --disable_checkpoint \
--savedir ./experimentMinigrid
Please be sure to use --total_frames as in the paper:
6e8 for KeyCorridorS4R3-v0, KeyCorridorS5R3-v0, ObstructedMaze-2Dlhb-v0, ObstructedMaze-1Q-v0
3e7 for KeyCorridorS3R3-v0 and ObstructedMaze-1Dl-v0
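The frame budgets above can be kept in one place with a small lookup. The following is a minimal sketch, not part of the repository: the names TOTAL_FRAMES and amigo_command are hypothetical, the MiniGrid- prefix on the environment IDs is assumed from the example command, and the remaining flags are copied from that command rather than tuned per environment.

```python
import shlex

# Frame budgets as listed above. The "MiniGrid-" prefix is an assumption
# taken from the example command (--env MiniGrid-KeyCorridorS5R3-v0).
TOTAL_FRAMES = {
    "MiniGrid-KeyCorridorS4R3-v0": 600_000_000,
    "MiniGrid-KeyCorridorS5R3-v0": 600_000_000,
    "MiniGrid-ObstructedMaze-2Dlhb-v0": 600_000_000,
    "MiniGrid-ObstructedMaze-1Q-v0": 600_000_000,
    "MiniGrid-KeyCorridorS3R3-v0": 30_000_000,
    "MiniGrid-ObstructedMaze-1Dl-v0": 30_000_000,
}


def amigo_command(env, savedir="./experimentMinigrid"):
    """Build the argument list for one AMIGo run (hypothetical helper)."""
    frames = TOTAL_FRAMES[env]  # raises KeyError for environments not listed
    return [
        "python", "-m", "monobeast.minigrid.monobeast_amigo",
        "--env", env,
        "--num_actors", "40", "--modify",
        "--generator_batch_size", "150",
        "--generator_entropy_cost", ".05",
        "--generator_threshold", "-.5",
        "--total_frames", str(frames),
        "--generator_reward_negative", "-.3",
        "--disable_checkpoint",
        "--savedir", savedir,
    ]


if __name__ == "__main__":
    args = amigo_command("MiniGrid-KeyCorridorS3R3-v0")
    # Print a shell-ready line; OMP_NUM_THREADS=1 mirrors the example command.
    print("OMP_NUM_THREADS=1 " + " ".join(shlex.quote(a) for a in args))
```

The list form can also be passed to subprocess.run directly, in which case OMP_NUM_THREADS would be set through the env argument instead of a shell prefix.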
We used an open-source implementation of the exploration baselines (i.e. RIDE, RND, ICM, and Count). That code should be cloned into a separate local repository and run in a separate environment.
# create a new conda environment
conda create -n ride python=3.7
conda activate ride
# install dependencies
git clone git@github.com:facebookresearch/impact-driven-exploration.git
cd impact-driven-exploration
pip install -r requirements.txt
To reproduce the baseline results in the paper, run:
OMP_NUM_THREADS=1 python main.py --env MiniGrid-ObstructedMaze-1Q-v0 \
--intrinsic_reward_coef 0.01 --entropy_cost 0.0001
with the corresponding best values of --intrinsic_reward_coef and --entropy_cost reported in the paper for each model.
Set --model to ride, rnd, curiosity, or count for RIDE, RND, ICM, or Count, respectively.
Set --use_fullobs_policy to use a full view of the environment as input to the policy network.
Set --use_fullobs_intrinsic to use full views of the environment to compute the intrinsic reward.
The default uses a partial view of the environment for both the policy and the intrinsic reward.
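The flag combinations described above can be assembled programmatically. This is a minimal sketch under stated assumptions: MODEL_FLAG and baseline_command are hypothetical names, the coefficient defaults are simply the values from the example command, and the best per-model values still have to be taken from the paper.

```python
# Value expected by --model for each baseline, as described above.
MODEL_FLAG = {"RIDE": "ride", "RND": "rnd", "ICM": "curiosity", "Count": "count"}


def baseline_command(baseline, env,
                     intrinsic_reward_coef=0.01, entropy_cost=0.0001,
                     fullobs_policy=False, fullobs_intrinsic=False):
    """Build the argument list for one baseline run (hypothetical helper)."""
    cmd = [
        "python", "main.py",
        "--model", MODEL_FLAG[baseline],  # KeyError for unknown baselines
        "--env", env,
        "--intrinsic_reward_coef", str(intrinsic_reward_coef),
        "--entropy_cost", str(entropy_cost),
    ]
    # Both switches are off by default, which corresponds to the partial view.
    if fullobs_policy:
        cmd.append("--use_fullobs_policy")
    if fullobs_intrinsic:
        cmd.append("--use_fullobs_intrinsic")
    return cmd


if __name__ == "__main__":
    # ICM with a full view for the policy and a partial view for the reward.
    print(" ".join(baseline_command("ICM", "MiniGrid-ObstructedMaze-1Q-v0",
                                    fullobs_policy=True)))
```

Keeping the name-to-flag mapping in a dictionary makes the one non-obvious case explicit: ICM is selected with curiosity, not icm.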
The code in this repository is released under the Creative Commons Attribution-NonCommercial 4.0 International License (CC-BY-NC 4.0).