
feudal_networks's People

Contributors

davidhershey, wulfebw

feudal_networks's Issues

How can I run this project?

I have checked the code for a long time, but I still can't run it successfully. train.py seems to have some problems.

Does code achieve benchmarks

Does this code achieve the benchmarks given in the paper? I modified it to work on my system, but it doesn't converge even after running for a few days.

more arguments ??

test_init (tests.test_policies.test_lstm_policy.TestLSTMPolicy) ... ERROR

======================================================================
ERROR: test_intrinsic_reward_and_gsum_calculation (tests.test_policies.test_feudal_batch_processor.TestFeudalBatchProcessor)

Traceback (most recent call last):
  File "/Users/user/Desktop/feudal_networks/tests/test_policies/test_feudal_batch_processor.py", line 154, in test_intrinsic_reward_and_gsum_calculation
    b = Batch(obs, a, returns, terminal, s, g, features)
TypeError: __new__() takes exactly 7 arguments (8 given)

Not sure why there's one more argument than it can take.
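This is the standard arity mismatch between a namedtuple's field list and its constructor call. A minimal sketch reproducing it (the field names here are hypothetical, not necessarily the repo's actual Batch definition):

```python
from collections import namedtuple

# Hypothetical 6-field Batch. namedtuple's __new__ takes cls plus one
# argument per field, so a 6-field tuple "takes exactly 7 arguments",
# and passing 7 values is reported as "8 given".
Batch = namedtuple("Batch", ["obs", "a", "returns", "terminal", "s", "g"])

obs, a, returns, terminal, s, g, features = range(7)
try:
    b = Batch(obs, a, returns, terminal, s, g, features)  # 7 values, 6 fields
except TypeError as e:
    print(type(e).__name__, e)
```

The fix is to make the namedtuple's field list match the call site, i.e. either add the extra field (here, something like `features`) to the definition or drop the extra positional value.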

Could you explain how to run it?

When I run it by typing python train.py, I get this:

Executing the following commands:
mkdir -p /tmp/pong
echo /usr/bin/python train.py > /tmp/pong/cmd.sh
kill $( lsof -i:12345 -t ) > /dev/null 2>&1
kill $( lsof -i:12222-12223 -t ) > /dev/null 2>&1
tmux kill-session -t a3c
tmux new-session -s a3c -n ps -d bash
tmux new-window -t a3c -n w-0 bash
tmux new-window -t a3c -n tb bash
tmux new-window -t a3c -n htop bash
sleep 1
tmux send-keys -t a3c:ps 'CUDA_VISIBLE_DEVICES= /usr/bin/python worker.py --log-dir /tmp/pong --env-id PongDeterministic-v4 --num-workers 1 --job-name ps' Enter
tmux send-keys -t a3c:w-0 'CUDA_VISIBLE_DEVICES= /usr/bin/python worker.py --log-dir /tmp/pong --env-id PongDeterministic-v4 --num-workers 1 --job-name worker --task 0 --remotes 1 --policy lstm' Enter
tmux send-keys -t a3c:tb 'tensorboard --logdir /tmp/pong --port 12345' Enter
tmux send-keys -t a3c:htop htop Enter

Use tmux attach -t a3c to watch process output
Use tmux kill-session -t a3c to kill the job
Point your browser to http://localhost:12345 to see Tensorboard

I don't know how tmux works, but there is no error sign.
What did I do wrong?

Transition Policy Gradients

From the paper:
...is in fact the proper form for the transition policy gradient arrived at in eqn. 10.

manager_loss = -tf.reduce_sum((self.r - cutoff_vf_manager) * dcos) (from the code)
Why not implement eqn. 10 directly?
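For context, the loss quoted above corresponds to the paper's advantage-weighted cosine objective: the manager is updated with the cosine similarity between the realized state change s_{t+c} - s_t and the goal g_t, weighted by the manager's advantage. A minimal numpy sketch of that objective (assumed shapes and names, not the repo's actual tensors):

```python
import numpy as np

def manager_loss(s_t, s_tc, g_t, returns, values):
    """Advantage-weighted cosine objective, as in the quoted code.

    s_t, s_tc, g_t: (batch, d) latent states at t, t+c, and goals.
    returns, values: (batch,) manager returns R_t and value estimates V_t.
    """
    delta = s_tc - s_t  # realized state transition over horizon c
    # Cosine similarity d_cos(s_{t+c} - s_t, g_t); epsilon avoids div-by-zero.
    dcos = np.sum(delta * g_t, axis=1) / (
        np.linalg.norm(delta, axis=1) * np.linalg.norm(g_t, axis=1) + 1e-8)
    advantage = returns - values  # (R_t - V_t), cf. (self.r - cutoff_vf_manager)
    return -np.sum(advantage * dcos)  # negated: we minimize the loss
```

Whether this sum-of-cosines form is exactly the transition policy gradient of eqn. 10, or only the approximation the paper derives from it, is precisely the question being asked here.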

trouble in "--policy feudal"

Hi, I would like to use your project, but I ran into trouble with "--policy feudal". I can run python train.py directly and it works normally with the default "--policy lstm", but when I add the flag, as in python train.py --policy feudal, I get the following output:

[2018-04-19 22:01:28,989] Events directory: /tmp/pong/train_0
[2018-04-19 22:01:29,342] Starting session. If this hangs, we're mostly likely waiting to connect to the parameter server. One common cause is that the parameter server DNS name isn't resolving yet, or is misspecified.
2018-04-19 22:01:29.431565: I tensorflow/core/distributed_runtime/master_session.cc:998] Start master session 0f5becf7698cbfb7 with config: intra_op_parallelism_threads: 1 device_filters: "/job:ps" device_filters: "/job:worker/task:0/cpu:0" inter_op_parallelism_threads: 2
Traceback (most recent call last):
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1327, in _do_call
    return fn(*args)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1306, in _run_fn
    status, run_metadata)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/contextlib.py", line 88, in __exit__
    next(self.gen)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.NotFoundError: Key global/FeUdal/worker/rnn/basic_lstm_cell/bias/Adam_1 not found in checkpoint
  [[Node: save/RestoreV2_55 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:ps/replica:0/task:0/cpu:0"](_recv_save/Const_0_S1, save/RestoreV2_55/tensor_names, save/RestoreV2_55/shape_and_slices)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "worker.py", line 174, in <module>
    tf.app.run()
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "worker.py", line 166, in main
    run(args, server)
  File "worker.py", line 94, in run
    with sv.managed_session(server.target, config=config) as sess, sess.as_default():
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/training/supervisor.py", line 964, in managed_session
    self.stop(close_summary_writer=close_summary_writer)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/training/supervisor.py", line 792, in stop
    stop_grace_period_secs=self._stop_grace_secs)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/training/coordinator.py", line 389, in join
    six.reraise(*self._exc_info_to_raise)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/six.py", line 686, in reraise
    raise value
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/training/supervisor.py", line 953, in managed_session
    start_standard_services=start_standard_services)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/training/supervisor.py", line 708, in prepare_or_wait_for_session
    init_feed_dict=self._init_feed_dict, init_fn=self._init_fn)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/training/session_manager.py", line 273, in prepare_session
    config=config)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/training/session_manager.py", line 205, in _restore_checkpoint
    saver.restore(sess, ckpt.model_checkpoint_path)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1560, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    options, run_metadata)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key global/FeUdal/worker/rnn/basic_lstm_cell/bias/Adam_1 not found in checkpoint
  [[Node: save/RestoreV2_55 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:ps/replica:0/task:0/cpu:0"](_recv_save/Const_0_S1, save/RestoreV2_55/tensor_names, save/RestoreV2_55/shape_and_slices)]]

Caused by op 'save/RestoreV2_55', defined at:
  File "worker.py", line 174, in <module>
    tf.app.run()
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "worker.py", line 166, in main
    run(args, server)
  File "worker.py", line 50, in run
    saver = FastSaver(variables_to_save)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1140, in __init__
    self.build()
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1172, in build
    filename=self._filename)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 688, in build
    restore_sequentially, reshape)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 407, in _AddRestoreOps
    tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 247, in restore_op
    [spec.tensor.dtype])[0])
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 663, in restore_v2
    dtypes=dtypes, name=name)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

NotFoundError (see above for traceback): Key global/FeUdal/worker/rnn/basic_lstm_cell/bias/Adam_1 not found in checkpoint
  [[Node: save/RestoreV2_55 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:ps/replica:0/task:0/cpu:0"](_recv_save/Const_0_S1, save/RestoreV2_55/tensor_names, save/RestoreV2_55/shape_and_slices)]]

ERROR:tensorflow:==================================
Object was never used (type <class 'tensorflow.python.framework.ops.Tensor'>):
<tf.Tensor 'report_uninitialized_variables/boolean_mask/Gather:0' shape=(?,) dtype=string>
If you want to mark it as used call its "mark_used()" method.
It was originally created here:
['File "worker.py", line 174, in <module>\n tf.app.run()', 'File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run\n _sys.exit(main(_sys.argv[:1] + flags_passthrough))', 'File "worker.py", line 166, in main\n run(args, server)', 'File "worker.py", line 77, in run\n ready_op=tf.report_uninitialized_variables(variables_to_save),', 'File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/util/tf_should_use.py", line 175, in wrapped\n return _add_should_use_warning(fn(*args, **kwargs))', 'File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/util/tf_should_use.py", line 144, in _add_should_use_warning\n wrapped = TFShouldUseWarningWrapper(x)', 'File "/home/xuntian2/anaconda2/envs/fedal_tf16/lib/python3.6/site-packages/tensorflow/python/util/tf_should_use.py", line 101, in __init__\n stack = [s.strip() for s in traceback.format_stack()]']
==================================

Could you please tell me what the problem is? Thanks a lot.
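One hedged reading of the log, not a confirmed fix: the Supervisor appears to be restoring an existing checkpoint from the log directory, and a checkpoint written by the earlier lstm run would not contain the global/FeUdal/... variables the saver is asking for. Assuming the default log dir shown above (/tmp/pong), clearing it before retraining with the feudal policy would force a fresh initialization:

```shell
# Assumption: train.py defaults to --log-dir /tmp/pong, as in the output above.
# A stale checkpoint there from an lstm run lacks the FeUdal variables,
# which would produce exactly this "Key ... not found in checkpoint" error.
LOG_DIR=/tmp/pong
rm -rf "$LOG_DIR"   # remove old checkpoints and event files before retraining
```

Alternatively, pointing --log-dir at a fresh directory per policy avoids the clash without deleting anything.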

Feudal policy on PongDeterministic-v4

I have been trying to get the 'feudal' policy to work on the 'PongDeterministic-v4' environment, but I have had no luck. The 'lstm' policy seems to work for me, but if I change it to 'feudal' the episode rewards do not increase even after 8 hours of training with 1 worker; they stay stuck at -20, on both the 'master' branch and the 'dilated_fix' branch.

I saw the other issues mentioning that it doesn't achieve the benchmarks from the paper, but is it at least supposed to work on Pong, or am I doing something wrong?

Shouldn't manager_vf be function of x_t?

Right after eq. (7) in the paper, the authors define V_t as a function of x_t. However, in the code (feudal_policy.py -> _build_manager()) it is a function of g_hat:
self.manager_vf = self._build_value(g_hat)
Shouldn't it be a function of x_t?
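The two choices can be made concrete with a small sketch. Everything below is hypothetical (fixed linear maps standing in for the learned _build_value head and goal projection; the repo's actual shapes may differ) — it only illustrates that the value head sees a different input in each case:

```python
import numpy as np

rng = np.random.default_rng(0)

def build_value(inputs, w_v):
    # Stand-in for _build_value(): a value head applied to its input.
    return inputs @ w_v

z = rng.standard_normal((4, 8))    # percept z computed from x_t
w_goal = rng.standard_normal((8, 8))
g_hat = z @ w_goal                 # manager's (pre-normalized) goal output

w_v = rng.standard_normal((8, 1))
vf_from_ghat = build_value(g_hat, w_v)  # what the code does: V from g_hat
vf_from_x = build_value(z, w_v)         # what eq. (7) suggests: V from x_t's embedding
```

Since g_hat is itself a function of the state, both versions are ultimately functions of x_t; the question is whether routing V_t through the goal output (rather than the percept directly) matters for learning.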
