anita-hu / tf2-rl

Reinforcement learning algorithms implemented for Tensorflow 2.0+ [DQN, DDPG, AE-DDPG, SAC, PPO, Primal-Dual DDPG]

License: MIT License

Python 100.00%

reinforcement-learning tensorflow2 openai-gym ddpg dqn sac ppo ae-ddpg tensorboard

tf2-rl's Issues

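PPO problem

Hello, when I use your PPO model to train on an environment with discrete outputs, everything works. But when I switch to an environment with continuous outputs, I get the following error: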

Traceback (most recent call last):
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf28/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3398, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 1, in <cell line: 1>
runfile('/Users/pc/Documents/myproject/TF2-RL-master/PPO/TF2_PPO.py', wdir='/Users/pc/Documents/myproject/TF2-RL-master/PPO')
File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydev_bundle/pydev_umd.py", line 198, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/Users/pc/Documents/myproject/TF2-RL-master/PPO/TF2_PPO.py", line 282, in
ppo.train(max_epochs=3000, save_freq=50)
File "/Users/pc/Documents/myproject/TF2-RL-master/PPO/TF2_PPO.py", line 214, in train
self.learn(*sampled_data)
File "/Users/pc/Documents/myproject/TF2-RL-master/PPO/TF2_PPO.py", line 168, in learn
self.model_optimizer.apply_gradients(zip(grad, train_variables))
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf28/lib/python3.10/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 672, in apply_gradients
return self._distributed_apply(strategy, grads_and_vars, name,
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf28/lib/python3.10/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 719, in _distributed_apply
update_op = distribution.extended.update(
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf28/lib/python3.10/site-packages/tensorflow/python/distribute/distribute_lib.py", line 2630, in update
return self._update(var, fn, args, kwargs, group)
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf28/lib/python3.10/site-packages/tensorflow/python/distribute/distribute_lib.py", line 3703, in _update
return self._update_non_slot(var, fn, (var,) + tuple(args), kwargs, group)
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf28/lib/python3.10/site-packages/tensorflow/python/distribute/distribute_lib.py", line 3709, in _update_non_slot
result = fn(*args, **kwargs)
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf28/lib/python3.10/site-packages/tensorflow/python/autograph/impl/api.py", line 595, in wrapper
return func(*args, **kwargs)
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf28/lib/python3.10/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 702, in apply_grad_to_update_var
update_op = self._resource_apply_dense(grad, var, **apply_kwargs)
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf28/lib/python3.10/site-packages/tensorflow/python/keras/optimizer_v2/adam.py", line 173, in _resource_apply_dense
return gen_training_ops.ResourceApplyAdam(
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf28/lib/python3.10/site-packages/tensorflow/python/util/tf_export.py", line 400, in wrapper
return f(**kwargs)
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf28/lib/python3.10/site-packages/tensorflow/python/ops/gen_training_ops.py", line 1427, in resource_apply_adam
_ops.raise_from_not_ok_status(e, name)
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf28/lib/python3.10/site-packages/tensorflow/python/framework/ops.py", line 7164, in raise_from_not_ok_status
raise core._status_to_exception(e) from None # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.NotFoundError: No registered 'ResourceApplyAdam' OpKernel for 'GPU' devices compatible with node {{node ResourceApplyAdam}}
(OpKernel was found, but attributes didn't match) Requested Attributes: T=DT_DOUBLE, use_locking=true, use_nesterov=false
. Registered: device='XLA_CPU_JIT'; T in [DT_FLOAT, DT_DOUBLE, DT_COMPLEX64, DT_BFLOAT16, DT_COMPLEX128, DT_HALF]
device='GPU'; T in [DT_FLOAT]
device='CPU'; T in [DT_HALF]
device='CPU'; T in [DT_BFLOAT16]
device='CPU'; T in [DT_FLOAT]
device='CPU'; T in [DT_DOUBLE]
device='CPU'; T in [DT_COMPLEX64]
device='CPU'; T in [DT_COMPLEX128]
[Op:ResourceApplyAdam]

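Tracing back through the error, it looks like an op cannot be placed on the GPU when the gradients are applied. My working environment is a Mac M1. If this is a device problem, why does a discrete environment (for example, MountainCar) work without any issue? Is the problem in the tfd.Normal function?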
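
A note on the message above: it requests T=DT_DOUBLE, while the GPU kernel for ResourceApplyAdam is registered only for DT_FLOAT. That suggests at least one variable being optimized is float64. The sketch below is one way to locate such variables. The helper name find_non_float32 and the FakeVar stand-ins are hypothetical and are used only so the example runs without TensorFlow; the code in TF2_PPO.py is not shown in this issue, so this is a diagnostic aid under those assumptions, not a confirmed fix.

```python
from collections import namedtuple


def find_non_float32(variables):
    """Return (name, dtype name) for each variable that is not float32."""
    offenders = []
    for var in variables:
        # tf.DType and numpy.dtype both expose .name; fall back to str() otherwise.
        dtype_name = getattr(var.dtype, "name", str(var.dtype))
        if dtype_name != "float32":
            offenders.append((var.name, dtype_name))
    return offenders


# Stand-in objects so the sketch runs without TensorFlow (hypothetical names).
FakeVar = namedtuple("FakeVar", ["name", "dtype"])
demo_vars = [FakeVar("dense/kernel:0", "float32"), FakeVar("log_std:0", "float64")]

for name, dtype_name in find_non_float32(demo_vars):
    print(f"{name} has dtype {dtype_name}")

# With a real model (assumption: a Keras model object called `model`):
#   find_non_float32(model.trainable_variables)
```

If a float64 variable turns up, the usual remedies are to build the model in float32 or to cast NumPy inputs to float32 before they reach the network. Which one applies here depends on code that is not shown in this issue.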

ValueError: Expected scalar shape, saw shape: (1,).

First of all, thank you, Anita Hu, for providing the code. I am running the AE-DDPG code with a gym environment that has a continuous action space, and I get the following error. Kindly help me solve it.

self.action_space=spaces.Box(low=-1,high=1,shape=(1,),dtype=np.float32)

File "TF2_AE_DDPG.py", line 212, in async_collection
tf.summary.scalar('Stats/action', action, step=total_steps)
File "C:\New folder\envs\test\lib\site-packages\tensorboard\plugins\scalar\summary_v2.py", line 61, in scalar
tf.debugging.assert_scalar(data)
File "C:\New folder\envs\test\lib\site-packages\tensorflow_core\python\ops\check_ops.py", line 2068, in assert_scalar_v2
assert_scalar(tensor=tensor, message=message, name=name)
File "C:\New folder\envs\test\lib\site-packages\tensorflow_core\python\ops\check_ops.py", line 2098, in assert_scalar
% (message or '', shape,))
ValueError: Expected scalar shape, saw shape: (1,).
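
The traceback shows tf.summary.scalar rejecting a value of shape (1,), which matches the Box action space declared with shape=(1,): the scalar summary expects a rank-0 value. One possible workaround, sketched below, is to split the action into plain floats before logging. The helper name scalar_entries and the tag suffix scheme are hypothetical, and the sketch only assumes the action is a number, list, tuple, or 1-D NumPy array.

```python
def scalar_entries(tag, value):
    """Split a scalar or 1-D action into (tag, float) pairs for scalar logging."""
    try:
        items = list(value)  # lists, tuples, and 1-D NumPy arrays are iterable
    except TypeError:
        return [(tag, float(value))]  # already a scalar
    if len(items) == 1:
        return [(tag, float(items[0]))]  # shape (1,): keep the original tag
    # Several action dimensions: one tag per dimension.
    return [(f"{tag}_{i}", float(v)) for i, v in enumerate(items)]


print(scalar_entries("Stats/action", [0.5]))

# In the training loop (assumption: `action` and `total_steps` as in the traceback):
#   for tag, val in scalar_entries('Stats/action', action):
#       tf.summary.scalar(tag, val, step=total_steps)
```

For a one-dimensional action, logging action[0] directly would have the same effect; the helper only matters if the action space later grows to several dimensions.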

Help regarding pushing code on GPU

Hi Anita Hu,
I am Jewaliddinn Shaik, a Ph.D. student at NIT AP, India. I need to force the AE_DDPG code to run in GPU mode. Kindly suggest which lines of the code need to be modified. Thank you in advance; I look forward to your response.
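
For context, TensorFlow 2 normally places ops on a GPU by default when one is visible, so the first thing to check is whether a GPU is detected at all. The sketch below shows explicit placement as a fallback. The helper names pick_device and run_on_best_device are hypothetical, TensorFlow is imported lazily so the device-string logic can be tried without it, and nothing here is taken from TF2_AE_DDPG.py.

```python
def pick_device(gpu_count):
    """Return a TensorFlow device string: first GPU if any, else CPU."""
    return "/GPU:0" if gpu_count > 0 else "/CPU:0"


def run_on_best_device(fn):
    """Call fn() inside a tf.device scope for the chosen device."""
    import tensorflow as tf  # lazy import; requires TensorFlow to be installed

    gpus = tf.config.list_physical_devices("GPU")
    with tf.device(pick_device(len(gpus))):
        return fn()


print(pick_device(0))  # no GPU visible, falls back to CPU
print(pick_device(1))  # one GPU visible
```

If tf.config.list_physical_devices("GPU") returns an empty list, the issue is likely the installation (driver or GPU build of TensorFlow), not the training script.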

Could you please add a license?

I'd like to use your code. Could you please add a license to your code? Thanks a ton!
