openai / atari-reset


Code for the blog post "Learning Montezuma’s Revenge from a Single Demonstration"

Home Page: https://blog.openai.com/learning-montezumas-revenge-from-a-single-demonstration/

License: MIT License

Languages: Python 100.00%

Topics: paper

atari-reset's People

Contributors

cberner, christopherhesse


atari-reset's Issues

The MPI_Comm_test_inter() function was called before MPI_INIT was invoked.

I am running the code from https://github.com/uber-research/atari-reset, but since issues are not enabled there, I am writing here instead.

When I try to run your robustification code with the default parameters from https://github.com/uber-research/go-explore, I get the following error:

*** The MPI_Comm_test_inter() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[hampusa:2940] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!

I am trying to run it on a single machine, and I have tried setting --nenvs=1 to see if that would help, but it makes no difference.
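
This error usually means something called into an MPI communicator (here MPI_Comm_test_inter) before MPI was initialized, e.g. a Horovod or baselines MPI helper running ahead of hvd.init(). Below is a minimal sketch of the initialization order I would expect; the exact entry point and imports in the go-explore robustification code are assumptions on my part:

# Sketch only: initialize MPI/Horovod before anything that touches an MPI communicator.
# hvd.init() initializes MPI if it has not been initialized yet; importing mpi4py.MPI
# has the same effect, since mpi4py calls MPI_Init on import by default.
import horovod.tensorflow as hvd

hvd.init()  # must run before any MPI-dependent call
print("Horovod rank %d of %d" % (hvd.rank(), hvd.size()))

# Only after this point should MPI-dependent helpers (e.g. the baselines MPI
# utilities) be imported or called, and the script should be launched under
# mpirun even on a single machine.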

Stack overflow.

Fatal Python error: Cannot recover from stack overflow.
Current thread 0x00007fd2edffb700 (most recent call first):
File "/home/xwq/anaconda3/envs/goexp/lib/python3.7/site-packages/gym/core.py", line 238 in getattr
File "/home/xwq/anaconda3/envs/goexp/lib/python3.7/site-packages/gym/core.py", line 238 in getattr
File "/home/xwq/anaconda3/envs/goexp/lib/python3.7/site-packages/gym/core.py", line 238 in getattr
...

This happens when execution reaches: env = SubprocVecEnv([make_env(i + nenvs * hvd.rank()) for i in range(nenvs)])
I am running on a single machine.
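
For context, gym's Wrapper.__getattr__ (core.py line 238) forwards unknown attribute lookups to self.env. If self.env has not been assigned yet, e.g. while the wrapper chain is being rebuilt inside a SubprocVecEnv worker process, looking up self.env itself falls back into __getattr__ and recurses until Python aborts with the stack overflow above. A minimal sketch of that failure pattern, with illustrative names rather than the actual gym source:

# Illustrative reproduction of the recursion, not the real gym code.
class BrokenWrapper:
    def __init__(self, env=None):
        if env is not None:
            self.env = env  # if this assignment never happens ...

    def __getattr__(self, name):
        # ... then evaluating self.env lands back in __getattr__("env"),
        # which evaluates self.env again, and so on until the interpreter
        # reports "Cannot recover from stack overflow."
        return getattr(self.env, name)

w = BrokenWrapper()  # env was never set
# w.step(0)          # uncommenting this recurses without bound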

Could you provide a model trained for Pitfall?

Hello, due to the large amount of GPU resources needed to reproduce the experiment, it is difficult for us to verify the performance reported in the paper and to do related research building on your idea. Could you provide a trained Pitfall model with a positive score so that we can validate the results? I would greatly appreciate it!

--game=Pong argument crashes

When I run
python3 train_atari.py --game=Pong

It crashes with the following traceback:

Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/fanbingbing/ML/atari-reset/atari_reset/wrappers.py", line 469, in worker
    env = env_fn_wrapper.x()
  File "train_atari.py", line 34, in env_fn
    env = ReplayResetEnv(env, demo_file_name='demos/'+game_name+'.demo', seed=rank, workers_per_sp=workers_per_sp)
  File "/home/fanbingbing/ML/atari-reset/atari_reset/wrappers.py", line 107, in __init__
    assert len(rewards) == len(self.actions)
AssertionError

When I run
python3 train_atari.py --game=MontezumaRevenge

it works fine

Info:
Ubuntu 18.04
Python 3.6.6
synced to latest atari-reset 0c1b112
synced to latest baselines 28aca63
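
The failing assertion compares the number of rewards to the number of actions loaded from demos/Pong.demo, so a quick sanity check is to inspect that file directly. A sketch, assuming the .demo file is a pickled dict with 'actions' and 'rewards' entries (inferred from the assertion, not confirmed against the actual demo format):

# Sketch: inspect demos/Pong.demo; the dict keys below are an assumption.
import pickle

with open('demos/Pong.demo', 'rb') as f:
    dat = pickle.load(f)

print('keys:', sorted(dat.keys()))
print('len(actions) =', len(dat['actions']))
print('len(rewards) =', len(dat['rewards']))

# The AssertionError above suggests these two lengths differ for the Pong demo,
# while they match for the MontezumaRevenge demo.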

Reproducing the results

Hi,

Congratulations on your results! I have a couple of questions about your code:

  1. How much time did the training on 128 GPUs take?
  2. Is there any chance of retraining your code on 1-4 GPUs? I'm also doing research on hard Atari games and I'm planning to build an 8-GPU system, since I'm bothered by the vast amount of time these experiments take.

Thank you very much.
