Hi Danijar, Reading the appendix of Director I couldn't understand w

How to reproduce fig. A.1 (director, open issue)

danijar commented on June 10, 2024

How to reproduce fig. A.1

from director.

Comments (4)

jdubkim commented on June 10, 2024

It is set by default. You can see the config in this line of configs.yaml: worker_rews: {extr: 1.0, expl: 0.0, goal: 1.0}

Cmeo97 commented on June 10, 2024

I’m not sure I got it correctly. The default config file has worker_rews: {extr: 0.0, expl: 0.0, goal: 1.0}. Besides, I think what this line would change is that the worker gets an additional critic, but I still don’t see how the task reward would be provided to the worker. Also, if it works better for most tasks that don’t have sparse reward profiles, why not use this config in general? Does it lead to worse performance on envs like Ant Maze?

Thank you so much!

jdubkim commented on June 10, 2024

Oh, my bad. I think I changed it in my local repo. If you change extr to a non-zero value, then I think the worker takes the extrinsic reward from the world model? It's in hierarchy.py. Also, I guess the default is set to 0 by design. Section 2.4 of the paper says "We make this design choice to demonstrate that the interplay between the manager and the worker is successful across many environments, although we also include ...". If the worker takes the task reward, it can be thought of as cheating, because ideally the worker policy should be rewarded based on whether it reached the goal or not.
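To make the effect of the worker_rews scales concrete, here is a minimal sketch of how such a config could weight reward components. It is an illustration under assumptions, not the code in hierarchy.py: the function name combine_worker_rewards and the example reward values are hypothetical, and the actual implementation may instead train one critic per reward key and combine their returns rather than summing raw rewards.

```python
from typing import Dict


def combine_worker_rewards(rewards: Dict[str, float], scales: Dict[str, float]) -> float:
    """Weighted sum of reward components (hypothetical helper, not from the repo).

    A component whose scale is 0.0 contributes nothing, so with the default
    scales the worker only sees the goal-reaching reward.
    """
    return sum(scales.get(name, 0.0) * value for name, value in rewards.items())


# Scales as quoted in the thread: the default, and the variant with extr enabled.
default_scales = {"extr": 0.0, "expl": 0.0, "goal": 1.0}
with_task_reward = {"extr": 1.0, "expl": 0.0, "goal": 1.0}

# Made-up per-step rewards, purely for illustration.
step_rewards = {"extr": 2.0, "expl": 0.5, "goal": 0.25}

print(combine_worker_rewards(step_rewards, default_scales))    # 0.25
print(combine_worker_rewards(step_rewards, with_task_reward))  # 2.25
```

With extr set to 0.0 the task reward drops out of the worker's objective entirely, which is the design choice quoted from section 2.4 of the paper.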

Cmeo97 commented on June 10, 2024

Thank you so much for your answer! Could you expand a bit on why it would be like cheating? Why can't we give both the manager and the worker access to the extrinsic rewards? Conceptually speaking, although the manager is supposed to have a higher-level perspective, and therefore should be able to access the extrinsic rewards, I can't see why the worker shouldn't as well. Could you point me to a paper where this is mentioned, or expand on it? Thank you so much!
