
keras-rl2's Introduction



Deep Reinforcement Learning for Tensorflow 2 Keras

NOTE: Requires tensorflow==2.1.0

What is it?

keras-rl2 implements some state-of-the-art deep reinforcement learning algorithms in Python and seamlessly integrates with the deep learning library Keras.

Furthermore, keras-rl2 works with OpenAI Gym out of the box. This means that evaluating and playing around with different algorithms is easy.

Of course, you can extend keras-rl2 according to your own needs. You can use built-in Keras callbacks and metrics or define your own. What's more, you can implement your own environments and even algorithms by extending a few simple abstract classes. Documentation is available online.
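As a rough illustration of that extension point, here is a minimal sketch of a custom environment. It assumes the base class is rl.core.Env with Gym-style reset/step methods (the same base class used by the reproduction script in the issues further down). CountdownEnv and its reward rule are invented for illustration, and the fallback to object only exists so the sketch runs without keras-rl2 installed.

```python
try:
    from rl.core import Env  # keras-rl2's abstract environment base class
except ImportError:
    Env = object  # fallback so the sketch runs without keras-rl2 installed


class CountdownEnv(Env):
    """Hypothetical toy environment: an episode lasts `length` steps and
    choosing action 1 earns a reward of 1.0 on each step."""

    def __init__(self, length=5):
        self.length = length
        self.remaining = length

    def reset(self):
        # Start a new episode and return the initial observation.
        self.remaining = self.length
        return [float(self.remaining)]

    def step(self, action):
        # Apply one action and return (observation, reward, done, info).
        self.remaining -= 1
        reward = 1.0 if action == 1 else 0.0
        done = self.remaining <= 0
        return [float(self.remaining)], reward, done, {}

    def render(self, mode="human", close=False):
        print(f"remaining steps: {self.remaining}")

    def close(self):
        pass


# Drive one episode by hand, always choosing action 1.
env = CountdownEnv(length=3)
observation = env.reset()
total_reward, done = 0.0, False
while not done:
    observation, reward, done, info = env.step(1)
    total_reward += reward
print(total_reward)
```

An agent would normally drive the loop through its fit method; the manual loop here only shows the contract between reset and step.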

What is included?

As of today, the following algorithms have been implemented:

  • Deep Q Learning (DQN) [1], [2]
  • Double DQN [3]
  • Deep Deterministic Policy Gradient (DDPG) [4]
  • Continuous DQN (CDQN or NAF) [6]
  • Cross-Entropy Method (CEM) [7], [8]
  • Dueling network DQN (Dueling DQN) [9]
  • Deep SARSA [10]
  • Asynchronous Advantage Actor-Critic (A3C) [5] (planned, not yet implemented)
  • Proximal Policy Optimization Algorithms (PPO) [11] (planned, not yet implemented)

You can find more information on each agent in the documentation.

Installation

  • Install Keras-RL2 from PyPI (recommended):
pip install keras-rl2
  • Install from GitHub source:
git clone https://github.com/wau/keras-rl2.git
cd keras-rl2
python setup.py install

Examples

If you want to run the examples, you'll also have to install:

  • gym by OpenAI
  • h5py: pip install h5py

For the Atari example you will also need:

  • Pillow: pip install Pillow
  • gym[atari]: the Atari module for gym. Use pip install gym[atari]

Once you have installed everything, you can try out a simple example:

python examples/dqn_cartpole.py

This is a very simple example and it should converge relatively quickly, so it's a great way to get started! It also visualizes the game during training, so you can watch it learn. How cool is that?

If you have questions or problems, please file an issue or, even better, fix the problem yourself and submit a pull request!

References

  1. Playing Atari with Deep Reinforcement Learning, Mnih et al., 2013
  2. Human-level control through deep reinforcement learning, Mnih et al., 2015
  3. Deep Reinforcement Learning with Double Q-learning, van Hasselt et al., 2015
  4. Continuous control with deep reinforcement learning, Lillicrap et al., 2015
  5. Asynchronous Methods for Deep Reinforcement Learning, Mnih et al., 2016
  6. Continuous Deep Q-Learning with Model-based Acceleration, Gu et al., 2016
  7. Learning Tetris Using the Noisy Cross-Entropy Method, Szita et al., 2006
  8. Deep Reinforcement Learning (MLSS lecture notes), Schulman, 2016
  9. Dueling Network Architectures for Deep Reinforcement Learning, Wang et al., 2016
  10. Reinforcement learning: An introduction, Sutton and Barto, 2011
  11. Proximal Policy Optimization Algorithms, Schulman et al., 2017

keras-rl2's People

Contributors

atra94, dongyeongkim, edurenye, inarikami


keras-rl2's Issues

Problem logging with tensorboard

I am trying to log the training of my agent over time.

I am used to using TensorBoard; however, when I try to create a callback, I get an error when running fit.

from keras.callbacks import TensorBoard
tb =  TensorBoard(log_dir='./keras-rl')
dqn.fit(env, nb_steps=1200000, visualize=False, verbose=1, callbacks=[tb])


AttributeError: 'TensorBoard' object has no attribute '_should_trace'

And when I try and use WandbLogger as suggested:

from keras.callbacks import WandbLogger

ImportError: cannot import name 'WandbLogger' from 'keras.callbacks' 

If there is a solution to this I would be thankful, or another way of doing live monitoring would be great too!
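One possible way to get live monitoring without the Keras TensorBoard callback is a small custom callback that appends per-episode results to a CSV file. This is a sketch under assumptions: it relies on keras-rl2 exposing rl.callbacks.Callback with an on_episode_end(episode, logs) hook whose logs dict carries episode_reward and nb_steps. CsvRewardLogger is a hypothetical name, and the fallback to object is only there so the snippet runs without keras-rl2 installed. As for WandbLogger, it is probably meant to be imported from rl.callbacks rather than keras.callbacks, though that is also an assumption.

```python
import csv

try:
    from rl.callbacks import Callback  # keras-rl2's callback base class
except ImportError:
    Callback = object  # fallback so the sketch runs without keras-rl2 installed


class CsvRewardLogger(Callback):
    """Hypothetical callback: append one CSV row per finished episode."""

    def __init__(self, path):
        super().__init__()
        self.path = path
        # Create the file and write the header once.
        with open(self.path, "w", newline="") as f:
            csv.writer(f).writerow(["episode", "nb_steps", "episode_reward"])

    def on_episode_end(self, episode, logs=None):
        logs = logs or {}
        # Append so the file can be tailed or plotted while training runs.
        with open(self.path, "a", newline="") as f:
            csv.writer(f).writerow(
                [episode, logs.get("nb_steps"), logs.get("episode_reward")]
            )
```

It would be passed the same way as the callback above, for example callbacks=[CsvRewardLogger('rewards.csv')] in the dqn.fit call.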

Does anyone know why the DQN agent adds a dimension to the array?

So this is my setting:

keras==2.3.1
keras-rl2
tensorflow-gpu==2.0.0-beta1
numpy==1.16.4

And when I fit the DQN agent, my model goes completely haywire:

#Build the NN
model = Sequential()
model.add(Convolution2D(32, (8, 8), strides=(4, 4),input_shape=input_shape,activation='relu'))
model.add(Convolution2D(64, (4, 4), strides=(2, 2), activation='relu'))
model.add(Convolution2D(64, (2, 2), strides=(1, 1), activation='relu'))
model.add(Flatten())
model.add(Dense(512,activation='relu'))
model.add(Dense(nb_actions, activation='softmax'))
print(model.summary())

policy = BoltzmannGumbelQPolicy()
memory= SequentialMemory(limit=50000, window_length = 1)
dqn = DQNAgent(model = model, nb_actions = nb_actions , memory=memory, train_interval=1,
               nb_steps_warmup=5000, target_model_update=10000, policy=policy)

dqn.compile(Adam(lr=1e-4), metrics=['mae'])

This is the error :

ValueError: Error when checking input: expected conv2d_input to have 4 dimensions, but got array with shape (1, 1, 240, 256, 3) 

But it starts here:

/usr/local/lib/python3.6/dist-packages/rl/core.py in fit(self, env, nb_steps, action_repetition, callbacks, verbose, visualize, nb_max_start_steps, start_step_policy, log_interval, nb_max_episode_steps)

line 169 of core.py

I'm quite new to this library, so maybe I'm missing something.
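The shape in the error message is consistent with the agent stacking a window axis (from the memory's window_length) and a batch axis in front of each observation. The following sketch only does the shape arithmetic. Both helper names are hypothetical, and the rule itself is my reading of the error above, not a quote from the library.

```python
def agent_input_shape(observation_shape, window_length, batch_size=1):
    """Shape of the array handed to the model: batch axis, window axis,
    then the raw observation shape."""
    return (batch_size, window_length) + tuple(observation_shape)


def model_input_shape(observation_shape, window_length):
    """input_shape the first layer would need (Keras omits the batch axis)."""
    return (window_length,) + tuple(observation_shape)


print(agent_input_shape((240, 256, 3), window_length=1))
print(model_input_shape((240, 256, 3), window_length=1))
```

Under that reading, a Conv2D first layer declared with input_shape=(240, 256, 3) receives one axis too many. Two common ways around it are to declare the input as (1, 240, 256, 3) and reshape inside the model, or to feed grayscale frames so that the window axis can act as the channel axis.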

DDPG ValueError: name for name_scope must be a string.

While running either DDPG agent, I encounter a ValueError raised from TensorFlow 2's ops.py. The problem appears to occur whenever the AdditionalUpdatesOptimizer class is initialized.

Error:

ValueError: name for name_scope must be a string.

The class in keras-rl2's rl/util.py that appears to cause the error:

class AdditionalUpdatesOptimizer(optimizers.Optimizer):
    def __init__(self, optimizer, additional_updates):
        super().__init__(optimizer)
        self.optimizer = optimizer
        self.additional_updates = additional_updates

    def get_updates(self, params, loss):
        updates = self.optimizer.get_updates(params=params, loss=loss)
        updates += self.additional_updates
        self.updates = updates
        return self.updates

    def get_config(self):
        return self.optimizer.get_config()

Traceback list:

  File "/Users/taylormcnally/.vscode/extensions/ms-python.python-2019.5.18875/pythonFiles/ptvsd_launcher.py", line 43, in <module>
    main(ptvsdArgs)
  File "/Users/taylormcnally/.vscode/extensions/ms-python.python-2019.5.18875/pythonFiles/lib/python/ptvsd/__main__.py", line 434, in main
    run()
  File "/Users/taylormcnally/.vscode/extensions/ms-python.python-2019.5.18875/pythonFiles/lib/python/ptvsd/__main__.py", line 312, in run_file
    runpy.run_path(target, run_name='__main__')
  File "/anaconda3/envs/tf2/lib/python3.6/runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/anaconda3/envs/tf2/lib/python3.6/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/anaconda3/envs/tf2/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/taylormcnally/Documents/GitHub/keras-rl2/examples/ddpg_pendulum.py", line 58, in <module>
    agent.compile(Adam(lr=.001, clipnorm=1.), metrics=['mae'])
  File "/Users/taylormcnally/Documents/GitHub/keras-rl2/rl/agents/ddpg.py", line 122, in compile
    critic_optimizer = AdditionalUpdatesOptimizer(critic_optimizer, critic_updates)
  File "/Users/taylormcnally/Documents/GitHub/keras-rl2/rl/util.py", line 84, in __init__
    super().__init__(optimizer)
  File "/anaconda3/envs/tf2/lib/python3.6/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 263, in __init__
    with backend.name_scope(self._name) as name_scope:
  File "/anaconda3/envs/tf2/lib/python3.6/site-packages/tensorflow/python/keras/backend.py", line 739, in name_scope
    return ops.name_scope_v2(name)
  File "/anaconda3/envs/tf2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 6248, in __init__
    raise ValueError("name for name_scope must be a string.")
ValueError: name for name_scope must be a string.
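Reading the traceback, the wrapper passes the wrapped optimizer object as the first positional argument of the base class constructor, which the TF2 base class then uses as its name. The sketch below reproduces that failure mode with a toy stand-in base class, so no TensorFlow is needed. ToyOptimizerBase, BrokenWrapper and FixedWrapper are hypothetical names, and forwarding optimizer._name is one possible fix, not necessarily the one adopted upstream.

```python
class ToyOptimizerBase:
    """Stand-in for the TF2 optimizer base class: its first argument is a name."""

    def __init__(self, name):
        if not isinstance(name, str):
            raise ValueError("name for name_scope must be a string.")
        self._name = name


class BrokenWrapper(ToyOptimizerBase):
    def __init__(self, optimizer, additional_updates):
        # Mirrors the quoted code: an optimizer object lands in the name slot.
        super().__init__(optimizer)
        self.optimizer = optimizer
        self.additional_updates = additional_updates


class FixedWrapper(ToyOptimizerBase):
    def __init__(self, optimizer, additional_updates):
        # Forward the wrapped optimizer's name (a string) instead.
        super().__init__(optimizer._name)
        self.optimizer = optimizer
        self.additional_updates = additional_updates


inner = ToyOptimizerBase("Adam")
try:
    BrokenWrapper(inner, [])
    broken_error = None
except ValueError as err:
    broken_error = str(err)
print("broken:", broken_error)

wrapped = FixedWrapper(inner, [])
print("fixed:", wrapped._name)
```

Whether this change alone is enough for DDPG to train under TensorFlow 2 is not something this sketch can confirm.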

ValueError: Variable Tensor("Mean_1:0", shape=(), dtype=float32) has `None` for gradient while training DDPG

Whenever I try to run the ddpg_pendulum example (or any other DDPG example), I always get the error

ValueError: Variable Tensor("Mean_1:0", shape=(), dtype=float32) has None for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

Each time the training completes an interval, this problem occurs. Is there any way to get over it?

My Tensorflow and Keras versions are 2.1.0 and 2.3.1 respectively.

please add PPO, A3C...

Guys, Keras-rl is the best reinforcement learning library.
It is easy to handle despite the complexity of RL algorithms.
Keras-rl is far better than Stable Baselines.
Please add PPO, A3C and others, as DQN is the most basic RL algorithm.

Thx!

Agent fails with sequential data

The Agent implementation fails for data of indeterminate length, such as temporal data. An environment that outputs data of the shape (None, data_dim) fails for an accompanying model with a fitting LSTM as first layer.

It appears that either the Agent or the standard Processor adds another dimension to the observation, causing a shape mismatch between the Environment output and the Model input, raising a ValueError. The shape received from an Environment that outputs (None, 10) is:

ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [1, 1, None, 10]

The first "1" refers to the batch dimension and is to be expected. As an immediate workaround, one can add a squeeze layer to the model, something along the lines of Input>Squeeze>LSTM>Output.

  • Check that you are up-to-date with the master branch of Keras-RL. You can update with:
    pip install git+git://github.com/wau/keras-rl2.git --upgrade --no-deps

  • Check that you are up-to-date with the master branch of Keras. You can update with:
    pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps

  • Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short). If you report an error, please include the error message and the backtrace.

Example Code:

import rl.memory
import rl.agents
import rl.core
import tensorflow as tf
import numpy as np

BATCH_SIZE = 1
DATA_DIM = 10

class Environment(rl.core.Env):
	def __init__(self, data_dim = 10, game_length = 50):
		self.reward_counter = 0
		self.data_dim = data_dim
		self.game_length = game_length
		self.reward = 0.1
		self.observation = [[0] * self.data_dim]
		self.observation[0][0] = 1
		self.done = False

	def step(self, action):
		action_number = np.argmax(action)
		reward = 0.0  # default reward when the branch below is not taken
		if not self.reward_counter + action_number % self.data_dim or np.random.rand() < 0.05:
			self.reward *= 1.1
			self.observation.append([0]*self.data_dim)
			self.observation[-1][self.reward_counter%self.data_dim] = 1
			self.reward_counter += 1
			reward = self.reward
		# Always return the full observation history, even if nothing was appended.
		observation = np.array(self.observation)
		if len(self.observation) > self.game_length and np.random.rand() < 0.05:
			self.done = True
		info = {}
		return observation, reward, self.done, info

	def reset(self):
		self.done = False
		self.reward_counter = 0
		self.reward = 0.1
		self.observation = [[0] * self.data_dim]
		self.observation[0][0] = 1
		observation = self.observation
		observation = np.array(observation)
		return observation

	def close(self):
		pass  # nothing to release; the class defines no __del__ to call

if __name__ == '__main__':
	lstm_input = tf.keras.Input(batch_shape = (BATCH_SIZE, 1, None, DATA_DIM))
	# lstm_input = tf.keras.backend.squeeze(lstm_input, 1) # uncomment squeeze layer to fix model.
	x = tf.keras.layers.LSTM(20)(lstm_input)
	x = tf.keras.layers.Dense(10, activation='softmax')(x) # output size doesn't actually matter here
	model = tf.keras.Model(inputs = [lstm_input], outputs = [x])

	memory = rl.memory.SequentialMemory(50000, window_length=BATCH_SIZE)
	processor = rl.core.Processor()

	agent = rl.agents.DQNAgent(model, memory=memory, processor=processor, nb_actions=10, batch_size=BATCH_SIZE)
	agent.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.01))

	env = Environment(data_dim=DATA_DIM)

	agent.fit(env, nb_steps=int(5e5), log_interval=1000)

AttributeError: Tensor.op is meaningless when eager execution is enabled

I am running the example scripts after installing precisely as documented above.

python examples/dqn_cartpole.py

Produces the error and trace:

Training for 50000 steps ...
2020-04-06 16:34:29.589490: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1483] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set.  If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU.  To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
/opt/venv/keras-rl2/lib/python3.6/site-packages/rl/memory.py:40: UserWarning: Not enough entries to sample without replacement. Consider increasing your warm-up phase to avoid oversampling!
  warnings.warn('Not enough entries to sample without replacement. Consider increasing your warm-up phase to avoid oversampling!')
Traceback (most recent call last):
  File "examples/dqn_cartpole.py", line 46, in <module>
    dqn.fit(env, nb_steps=50000, visualize=False, verbose=2)
  File "/opt/venv/keras-rl2/lib/python3.6/site-packages/rl/core.py", line 194, in fit
    metrics = self.backward(reward, terminal=done)
  File "/opt/venv/keras-rl2/lib/python3.6/site-packages/rl/agents/dqn.py", line 322, in backward
    metrics = self.trainable_model.train_on_batch(ins + [targets, masks], [dummy_targets, targets])
  File "/opt/venv/keras-rl2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 917, in train_on_batch
    self._make_train_function()
  File "/opt/venv/keras-rl2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1986, in _make_train_function
    **self._function_kwargs)
  File "/opt/venv/keras-rl2/lib/python3.6/site-packages/tensorflow/python/keras/backend.py", line 3544, in function
    return EagerExecutionFunction(inputs, outputs, updates=updates, name=name)
  File "/opt/venv/keras-rl2/lib/python3.6/site-packages/tensorflow/python/keras/backend.py", line 3438, in __init__
    add_sources=True, handle_captures=True, base_graph=source_graph)
  File "/opt/venv/keras-rl2/lib/python3.6/site-packages/tensorflow/python/eager/lift_to_graph.py", line 325, in lift_to_graph
    add_sources=add_sources))
  File "/opt/venv/keras-rl2/lib/python3.6/site-packages/tensorflow/python/eager/lift_to_graph.py", line 114, in _map_subgraph
    ops_to_visit = [_as_operation(init_tensor)]
  File "/opt/venv/keras-rl2/lib/python3.6/site-packages/tensorflow/python/eager/lift_to_graph.py", line 37, in _as_operation
    return op_or_tensor.op
  File "/opt/venv/keras-rl2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 987, in op
    "Tensor.op is meaningless when eager execution is enabled.")
AttributeError: Tensor.op is meaningless when eager execution is enabled.

(I'll note that editing the example to increase the warmup resolves the warning message, but I still see this same AttributeError message. I get the same error and trace when attempting the other examples as well.)
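A workaround frequently reported for this error is to switch TensorFlow 2 back to graph mode with tf.compat.v1.disable_eager_execution() before any model is built. Treat the following as a sketch, not a confirmed fix for this report. The README note near the top also pins tensorflow==2.1.0, so a version mismatch is worth ruling out first. The try/except guard and the use_graph_mode helper name are my own additions.

```python
try:
    import tensorflow as tf
except ImportError:  # lets the snippet run on machines without TensorFlow
    tf = None


def use_graph_mode():
    """Disable eager execution if TensorFlow is available.

    Call this at the very top of the script, before any Keras model or
    tensor is created. Returns True if the switch was applied and False
    if TensorFlow is not installed.
    """
    if tf is None:
        return False
    tf.compat.v1.disable_eager_execution()
    return True


applied = use_graph_mode()
print("graph mode enabled:", applied)
```

In examples/dqn_cartpole.py this would go above the lines that build the Sequential model.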

p.s. thanks for providing and maintaining this excellent resource!

Preprocessing Data from Observations

Hello,

is there a way to implement a custom preprocessing / featurizing routine into the training process?
Is such a feature already available?

I am currently making use of a featurizer to preprocess the observations from the environment.
As I haven't found a way to implement it into the agent, I had to define this preprocessor as a part of the environment.
Unfortunately, the preprocessor transforms the low-dimensional environment state into a high-dimensional feature vector,
which is then appended to the memory buffer.
Consequently, the training uses a huge amount of RAM, although it should be possible to perform the preprocessing just in time, directly after low-dimensional observations have been loaded from the memory.

Thank you.
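One possible approach, sketched under assumptions: keras-rl's rl.core.Processor has a process_state_batch(batch) hook that the agent applies to state batches after they are read from memory, so the memory can keep the low-dimensional states and the featurizer can run just in time. FeaturizingProcessor and polynomial_features are hypothetical names, the squared-feature expansion is only a placeholder for a real featurizer, and the fallback to object is there so the snippet runs without keras-rl2.

```python
try:
    from rl.core import Processor  # keras-rl2's processor base class
except ImportError:
    Processor = object  # fallback so the sketch runs without keras-rl2 installed


def polynomial_features(state):
    """Placeholder featurizer: append the square of every component."""
    return list(state) + [x * x for x in state]


class FeaturizingProcessor(Processor):
    def process_state_batch(self, batch):
        # batch is indexed as [sample][window step][state component];
        # only the innermost state vectors are expanded.
        return [
            [polynomial_features(state) for state in window]
            for window in batch
        ]


processor = FeaturizingProcessor()
raw_batch = [[[1.0, 2.0]], [[3.0, 4.0]]]  # batch of 2, window_length 1, 2-dim states
features = processor.process_state_batch(raw_batch)
print(features)
```

In real use the batch arrives as a NumPy array, so the result would be converted back with np.asarray, and the model's input shape would have to match the feature dimension rather than the raw state dimension.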

AttributeError: 'Sequential' object has no attribute 'uses_learning_phase'

I tried running one of the DDPG examples: 'python ddpg_pendulum.py'
After 81 iterations the program stopped with an error:
AttributeError: 'Sequential' object has no attribute 'uses_learning_phase'

'Sequential' is a Keras object that indeed does not have this attribute in a few versions I checked.

Can you check this out and comment on this problem?

Thank you.

Package versions have conflicting dependencies

I am getting the following error trying to install keras-rl2 on my M1 Macbook. The tensorflow version I have is 2.4.0rc0.

ERROR: Cannot install keras-rl2==1.0.0, keras-rl2==1.0.1, keras-rl2==1.0.2, keras-rl2==1.0.3 and keras-rl2==1.0.4 because these package versions have conflicting dependencies.

What should I do to solve this problem?

AttributeError when running dqn_cartpole.py

I ran the suggested pip commands to update keras-rl2 and keras. No errors with the pip commands.

I am using Python 3.7.5

Any help on how to fix would be appreciated. When I run dqn_cartpole.py, I am getting the following error:

AttributeError: Tensor.op is meaningless when eager execution is enabled.

Full traceback:

Traceback (most recent call last):
  File "dqn_cartpole.py", line 46, in <module>
    dqn.fit(env, nb_steps=50000, visualize=True, verbose=2)
  File "/home/valmiki/miniconda3/envs/gym_keras_rl_n2/lib/python3.7/site-packages/rl/core.py", line 194, in fit
    metrics = self.backward(reward, terminal=done)
  File "/home/valmiki/miniconda3/envs/gym_keras_rl_n2/lib/python3.7/site-packages/rl/agents/dqn.py", line 324, in backward
    metrics = self.trainable_model.train_on_batch(ins + [targets, masks], [dummy_targets, targets])
  File "/home/valmiki/miniconda3/envs/gym_keras_rl_n2/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 917, in train_on_batch
    self._make_train_function()
  File "/home/valmiki/miniconda3/envs/gym_keras_rl_n2/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1986, in _make_train_function
    **self._function_kwargs)
  File "/home/valmiki/miniconda3/envs/gym_keras_rl_n2/lib/python3.7/site-packages/tensorflow/python/keras/backend.py", line 3544, in function
    return EagerExecutionFunction(inputs, outputs, updates=updates, name=name)
  File "/home/valmiki/miniconda3/envs/gym_keras_rl_n2/lib/python3.7/site-packages/tensorflow/python/keras/backend.py", line 3438, in __init__
    add_sources=True, handle_captures=True, base_graph=source_graph)
  File "/home/valmiki/miniconda3/envs/gym_keras_rl_n2/lib/python3.7/site-packages/tensorflow/python/eager/lift_to_graph.py", line 325, in lift_to_graph
    add_sources=add_sources))
  File "/home/valmiki/miniconda3/envs/gym_keras_rl_n2/lib/python3.7/site-packages/tensorflow/python/eager/lift_to_graph.py", line 114, in _map_subgraph
    ops_to_visit = [_as_operation(init_tensor)]
  File "/home/valmiki/miniconda3/envs/gym_keras_rl_n2/lib/python3.7/site-packages/tensorflow/python/eager/lift_to_graph.py", line 37, in _as_operation
    return op_or_tensor.op
  File "/home/valmiki/miniconda3/envs/gym_keras_rl_n2/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 987, in op
    "Tensor.op is meaningless when eager execution is enabled.")
AttributeError: Tensor.op is meaningless when eager execution is enabled.

"IndexError: list index out of range" in examples/dqn_cartpole.py

I ran the program "examples/dqn_cartpole.py", but it failed with this error:

IndexError: list index out of range

According to the message, it happens on line 46:

dqn.fit(env, nb_steps=50000, visualize=True, verbose=2)

I'm using TensorFlow 2.3.0. Is that the cause of the error?
By the way, my Keras-RL2 version is 1.0.4.

TypeError: len is not well defined for symbolic Tensors. (activation_4/Identity:0) Please call `x.shape` rather than `len(x)` for shape information.

Running the latest Keras-RL and nightly TensorFlow 2.0 (tf-nightly-2.0-preview) on Python 3.6 (Anaconda), I get an error when trying to run:

https://github.com/wau/keras-rl2/blob/master/examples/dqn_atari.py

The output is:

2019-08-08 17:55:57.294275: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-08-08 17:55:57.312932: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fbb2e93a0a0 executing computations on platform Host. Devices:
2019-08-08 17:55:57.312954: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Host, Default Version
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
permute (Permute)            (None, 84, 84, 4)         0         
_________________________________________________________________
conv2d (Conv2D)              (None, 20, 20, 32)        8224      
_________________________________________________________________
activation (Activation)      (None, 20, 20, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 9, 9, 64)          32832     
_________________________________________________________________
activation_1 (Activation)    (None, 9, 9, 64)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 7, 7, 64)          36928     
_________________________________________________________________
activation_2 (Activation)    (None, 7, 7, 64)          0         
_________________________________________________________________
flatten (Flatten)            (None, 3136)              0         
_________________________________________________________________
dense (Dense)                (None, 512)               1606144   
_________________________________________________________________
activation_3 (Activation)    (None, 512)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 4)                 2052      
_________________________________________________________________
activation_4 (Activation)    (None, 4)                 0         
=================================================================
Total params: 1,686,180
Trainable params: 1,686,180
Non-trainable params: 0
_________________________________________________________________
None
Traceback (most recent call last):
  File "atari2.py", line 96, in <module>
    train_interval=4, delta_clip=1.)
  File "/Users/jheaton/.local/lib/python3.6/site-packages/rl/agents/dqn.py", line 107, in __init__
    if hasattr(model.output, '__len__') and len(model.output) > 1:
  File "/Users/jheaton/miniconda3/envs/tensorflow2/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 738, in __len__
    "shape information.".format(self.name))
TypeError: len is not well defined for symbolic Tensors. (activation_4/Identity:0) Please call `x.shape` rather than `len(x)` for shape information.
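The failing check calls len() on model.output, which for a single-output model is a symbolic tensor that defines __len__ only to raise. One possible guard is to test for a list or tuple first. The sketch below uses a stand-in class instead of a real tensor so it runs without TensorFlow. SymbolicTensorStub and has_multiple_outputs are hypothetical names, and this is not necessarily how the library resolved the problem.

```python
class SymbolicTensorStub:
    """Stand-in for a TF2 symbolic tensor, which refuses len()."""

    def __len__(self):
        raise TypeError("len is not well defined for symbolic Tensors.")


def has_multiple_outputs(output):
    # Only a list or tuple of tensors counts as multiple outputs;
    # len() is never called on a single tensor.
    return isinstance(output, (list, tuple)) and len(output) > 1


single = SymbolicTensorStub()
print(hasattr(single, "__len__"))       # the original hasattr test passes...
print(has_multiple_outputs(single))     # ...but this guard never calls len() on it
print(has_multiple_outputs([single, single]))
```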

DDPGAgent is incompatible with MultiInputProcessor for HandReach-v0 env

DDPGAgent fails to train on the critic model while using a MultiInputProcessor within its backward method, specifically at lines 260-263:

                if len(self.critic.inputs) >= 3:
                    state1_batch_with_action = state1_batch[:]
                else:
                    state1_batch_with_action = [state1_batch]
                state1_batch_with_action.insert(self.critic_action_input_idx, target_actions)

This throws the error TypeError: unhashable type: 'slice' since state1_batch is a dictionary with three keys, as returned from the processor. It seems that this chunk of code automatically assumes that state1_batch will be a list instead of a dictionary. The same can be said a few lines down with state0_batch. I would love to be able to fix this myself, but am unsure why there was a hardcoded 3 in the logic or why the length of the inputs would make a difference. I'd love to understand if someone is willing to explain.

Here is the script: hand_reach.py
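On the hardcoded 3: my reading of the quoted snippet is that a critic with a single observation input has two inputs in total (action plus observation), so three or more inputs means several observation inputs, in which case the state batch is expected to already be a list that can be copied with [:]. A dictionary breaks that assumption. The helper below is a hypothetical sketch of normalising either form into a fresh list before the action batch is inserted. to_input_list and the key order are assumptions, not library code.

```python
def to_input_list(state_batch, input_names=None):
    """Return a new list of per-input batches from a dict, list, tuple,
    or a single batch."""
    if isinstance(state_batch, dict):
        # Without an explicit order, fall back to sorted keys so the
        # result is at least deterministic.
        names = input_names if input_names is not None else sorted(state_batch)
        return [state_batch[name] for name in names]
    if isinstance(state_batch, (list, tuple)):
        return list(state_batch)  # shallow copy, like state1_batch[:]
    return [state_batch]


# Example with three observation inputs, as in a goal-based environment.
batch = {
    "observation": [[0.1]],
    "achieved_goal": [[0.2]],
    "desired_goal": [[0.3]],
}
order = ["observation", "achieved_goal", "desired_goal"]
inputs = to_input_list(batch, input_names=order)
inputs.insert(1, [[0.9]])  # action batch at a hypothetical action input index of 1
print(inputs)
```

The order passed in would have to match the order of the critic's observation inputs, which this sketch cannot check.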

Pip install fails because of tf-version (pypi lists only version 1.0.2)

> sudo pip install keras-rl2 --no-cache-dir
Collecting keras-rl2
  Downloading keras-rl2-1.0.3.tar.gz (40 kB)
     |████████████████████████████████| 40 kB 308 kB/s 
ERROR: Could not find a version that satisfies the requirement tensorflow==2.0.0-beta1 (from keras-rl2) (from versions: none)
ERROR: No matching distribution found for tensorflow==2.0.0-beta1 (from keras-rl2)

The library seems to expect an exact tensorflow version

pip install git+git://github.com/wau/keras-rl2.git --upgrade --no-deps

works.

I think pip tries to install 1.0.2
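When a version pin like this is in play, it helps to check which versions actually ended up installed, both before and after an install with --no-deps. The following is a small sketch using only the standard library (importlib.metadata, available from Python 3.8). installed_versions is a hypothetical helper name.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(["tensorflow", "keras-rl2", "keras"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Pasting this output into an issue makes it clear whether pip kept the TensorFlow version you expected.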

ERROR: keras-rl2 1.0.3 has requirement tensorflow==2.0.0-beta1, but you'll have tensorflow 2.1.0 which is incompatible.

Hello,

The stated requirement for keras-rl2 is tensorflow==2.1.0.
When I installed keras-rl2 with pip, it removed my tensorflow 2.1.0 and installed the 2.0.0-beta version.

My code then did not run, so I tried to install tensorflow 2.1.0 as required, but got this error:
"ERROR: keras-rl2 1.0.3 has requirement tensorflow==2.0.0-beta1, but you'll have tensorflow 2.1.0 which is incompatible.
Installing collected packages: tensorflow
Attempting uninstall: tensorflow
Found existing installation: tensorflow 2.0.0b1
Uninstalling tensorflow-2.0.0b1:
Successfully uninstalled tensorflow-2.0.0b1
Successfully installed tensorflow-2.1.0"

  • [ X] Check that you are up-to-date with the master branch of Keras-RL. You can update with:
    pip install git+git://github.com/wau/keras-rl2.git --upgrade --no-deps

  • [ X] Check that you are up-to-date with the master branch of Keras. You can update with:
    pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps

Reference to model is changed when training starts

Hello. First of all, I am using version 1.0.3, but the issue also holds for the latest (1.0.4) version. My problem is that the reference to my model gets changed. I need to use the reference to the model I created throughout training, but as soon as training starts, it suddenly becomes a reference to an Agent, in my case a DQNAgent.

It would be great not to have any side effects when starting the training, since the model variable could still be used after having started the training, as was my case. I have been searching for solutions and I found out that I was not the only one having problems with this: there are some issues on the original repository which indirectly address this as well. As a result, this library (as well as its predecessor) ends up not being fully compatible with some of the original Keras stuff, such as the TensorBoard callback: see keras-rl/keras-rl#255.
