Comments (4)
It is set by default. You can see the config in this line of configs.yaml: worker_rews: {extr: 1.0, expl: 0.0, goal: 1.0}
from director.
I’m not sure I understood correctly. The default config file has worker_rews: {extr: 0.0, expl: 0.0, goal: 1.0}. Besides, as I understand it, changing this line would give the worker an additional critic, but I still don’t see how the task reward would be provided to the worker. Also, if it works better for most tasks that don’t have sparse reward profiles, why not use this config in general? Does it lead to worse performance in envs like Ant Maze?
Thank you so much!
Oh, my bad. I think I changed it in my local repo. If you change extr to a non-zero value, then I think the worker takes the extrinsic reward from the world model? It's in hierarchy.py. Also, I guess the default is set to 0 by design. Section 2.4 of the paper says: "We make this design choice to demonstrate that the interplay between the manager and the worker is successful across many environments, although we also include ...". If the worker takes the task reward, it can be thought of as cheating, because ideally the worker policy should be rewarded based only on whether it reached the goal.
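To make the effect of worker_rews concrete, here is a minimal sketch of how per-source weights could mix the extrinsic, exploration, and goal rewards. This is not the code in hierarchy.py: the helpers active_sources and combine_rewards and the reward values are hypothetical, and the repo may instead train a separate critic per source and weight their returns. The only thing the sketch assumes from the thread is the shape of the worker_rews dictionary.

```python
import numpy as np

def active_sources(weights):
    """Names of reward sources with a non-zero weight (hypothetical helper)."""
    return [name for name, w in weights.items() if w != 0.0]

def combine_rewards(rewards, weights):
    """Weighted sum of per-source reward sequences (illustrative only)."""
    first = next(iter(rewards.values()))
    total = np.zeros_like(first, dtype=np.float64)
    for name in active_sources(weights):
        total += weights[name] * np.asarray(rewards[name], dtype=np.float64)
    return total

# Default from configs.yaml as quoted in this thread: goal reward only.
default_rews = {"extr": 0.0, "expl": 0.0, "goal": 1.0}
# Locally modified variant: the worker also receives the task reward.
modified_rews = {"extr": 1.0, "expl": 0.0, "goal": 1.0}

# Made-up rewards for three imagined steps.
rewards = {
    "extr": np.array([0.0, 0.0, 1.0]),
    "expl": np.array([0.2, 0.1, 0.0]),
    "goal": np.array([0.5, 0.8, 0.9]),
}

default_total = combine_rewards(rewards, default_rews)
modified_total = combine_rewards(rewards, modified_rews)
print(active_sources(default_rews), default_total)
print(active_sources(modified_rews), modified_total)
```

With the default weights only the goal source is active, so the worker's signal never sees the task reward; setting extr to a non-zero value adds it on top of the goal reward.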
Thank you so much for your answer! Could you expand a bit on why it would be like cheating? Why can't we give both the manager and the worker access to the extrinsic rewards? Conceptually, the manager is supposed to have a higher-level perspective and should therefore have access to the extrinsic rewards, but I can't see why the worker shouldn't as well. Could you point me to a paper where this is discussed, or expand on it? Thank you so much!