ru-automated-reasoning-group / pi-prl

ICLR'22 Programmatic Reinforcement Learning

License: MIT License

Python 100.00%
differentiable-programming interpretable-machine-learning program-synthesis reinforcement-learning

pi-prl's People

Contributors

roadsong

Forkers

roadsong skywuuuu

pi-prl's Issues

Problems when running the program

Hello! I'm trying to run the example in the README by running "python3 pi_PRL.py". However, I get an error like this:

not enough values to unpack (expected 5, got 4)
Timeout Error raised... Trying again
Traceback (most recent call last):
  File "pi_PRL.py", line 664, in <module>
    plot_keys=['stoc_pol_mean', 'running_score'])
  File "/home/wyx/Documents/Analogy/code/pi-PRL/mjrl/utils/train_agent.py", line 153, in train_agent_flip
    stats = agent.train_step(**args)
  File "/home/wyx/Documents/Analogy/code/pi-PRL/mjrl/algos/batch_reinforce.py", line 84, in train_step
    paths = trajectory_sampler.sample_paths(**input_dict)
  File "/home/wyx/Documents/Analogy/code/pi-PRL/mjrl/samplers/core.py", line 144, in sample_paths
    for result in results:
TypeError: 'NoneType' object is not iterable

Could you help me with that?

I also noticed that if I run the code directly, I get an error like this:

'numpy.random._generator.Generator' object has no attribute 'randn'

I tried to fix it by replacing every "self.np_random" with "np.random". Maybe this caused the first error. Could you share the NumPy version you used for the project?
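
For context on the second error, here is a minimal sketch of the NumPy difference involved. It assumes "self.np_random" is a numpy.random.Generator, as the error message indicates. Generator has no randn method, while the legacy RandomState does; both provide standard_normal. The helper name sample_normal is hypothetical and is not part of pi-PRL or mjrl.

```python
import numpy as np

# Newer API: numpy.random.Generator has no `randn` method.
rng = np.random.default_rng(0)
has_randn = hasattr(rng, "randn")  # False on a Generator

# Legacy API: RandomState still provides `randn`.
legacy = np.random.RandomState(0)
noise_old = legacy.randn(2, 3)


def sample_normal(np_random, *shape):
    """Draw standard-normal samples (hypothetical helper).

    `standard_normal` exists on both Generator and RandomState,
    so the caller's seeded generator is kept instead of switching
    to the global `np.random` state.
    """
    return np_random.standard_normal(size=shape)


noise_new = sample_normal(rng, 2, 3)
noise_legacy = sample_normal(legacy, 2, 3)
```

Calling standard_normal on the existing generator keeps the environment's own seeding, which a blanket swap to the global "np.random" would not. Whether that swap is related to the first error is not something this sketch can confirm.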

Question about PID controllers on LunarLander

Hello,

I have a question about the LunarLander-v2 results in the appendix of the paper. Specifically, what criteria were used to select the observations for the PID controllers? Were all observations used, or was there a specific method for selecting them in the LunarLander task?
