aws / sagemaker-rl-container

A set of dockerfiles that provide Reinforcement Learning solutions for use in SageMaker.

License: Apache License 2.0

HCL 4.18% C 0.29% Shell 2.29% Python 86.95% Dockerfile 2.17% Jupyter Notebook 4.12%

sagemaker-rl-container's Issues

Add stable baselines 3 containers for sagemaker

Add containers for https://github.com/DLR-RM/stable-baselines3 and https://github.com/DLR-RM/rl-baselines3-zoo
to SageMaker, so that RL models can be trained with Stable Baselines 3.

Generate release?

Is it possible to publish a release on GitHub for the current version, as set in setup.py? Thanks!

Missing sagemaker-rl-vw-container:adf image

I was trying to follow the bandits_movielens_testbed walkthrough because I want to build a Vowpal Wabbit ADF model. It seems that the sagemaker-rl-vw-container:adf image has been removed, and there's no documentation on how to use it.

Using redis on system memory instead of GPU memory.

Hi.

Good day.

Is it possible to not run Redis on the GPU? At the moment I am getting the following error when using it to train DeepRacer:

subscribe scheduled to be closed ASAP for overcoming of output buffer limits

It seems as though the process wants to allocate several gigabytes of memory for Redis, but the GPU has only about 7 GB, whereas the system has free memory available.

I'm just not sure how to get it to use the CPU. I tried creating an image myself and making the following change to start.sh:

CUDA_VISIBLE_DEVICES=-1 redis-server --bind 0.0.0.0 &

But when I run the image, it doesn't use the GPU at all.

Any ideas on how to have Redis use the system's memory and not the GPU memory? Thanks.

Regards.
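
One possible reading of the log line above, offered as an assumption rather than a confirmed diagnosis for this container: Redis keeps its data in host RAM, and "scheduled to be closed ASAP for overcoming of output buffer limits" is the message Redis logs when a client exceeds the client-output-buffer-limit setting. The sketch below builds a redis-server command line that relaxes that limit for pub/sub clients. The helper names redis_server_command and start_redis are hypothetical, and the 0 0 0 values (which disable the limit) are an illustrative choice, not something taken from the issue.

```python
import subprocess


def redis_server_command(bind="0.0.0.0", hard_limit="0", soft_limit="0", soft_seconds="0"):
    """Build a redis-server argument list with a relaxed pub/sub output buffer limit.

    Assumption: the limit for the "pubsub" client class is the one being hit.
    Redis treats a value of 0 as "no limit" for each of the three fields.
    """
    return [
        "redis-server",
        "--bind", bind,
        "--client-output-buffer-limit", "pubsub", hard_limit, soft_limit, soft_seconds,
    ]


def start_redis(command):
    """Start Redis in the background, similar to the trailing '&' in start.sh."""
    return subprocess.Popen(command)


if __name__ == "__main__":
    # Print the command instead of launching it, so the sketch runs without Redis installed.
    print(" ".join(redis_server_command()))
```

If this reading is right, the equivalent line in start.sh would pass the same flags to redis-server directly. Removing the limit lets slow subscribers consume more host memory, so a finite value may be a safer choice.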

Error while launching a training process with Ray/RLLib 0.8.2/Gym

Container: sagemaker-rl-ray-container:ray-0.8.2-tf-*-py36

A bug was introduced in this container by an update to the latest version of pyglet. The update breaks the API contract and causes errors when visualization is enabled (see the stack trace below).

Solution: downgrade pyglet to version 1.3.2 (pyglet==1.3.2).

Could you change that in the Dockerfile and also update the built images available in SageMaker, please?

Source: tensorflow/agents#163

ray.exceptions.RayTaskError(AttributeError): ray::RolloutWorker.sample() (pid=119, ip=10.2.216.148)
  File "python/ray/_raylet.pyx", line 452, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 430, in ray._raylet.execute_task.function_executor
  File "/usr/local/lib/python3.6/dist-packages/ray/rllib/evaluation/rollout_worker.py", line 488, in sample
    batches = [self.input_reader.next()]
  File "/usr/local/lib/python3.6/dist-packages/ray/rllib/evaluation/sampler.py", line 52, in next
    batches = [self.get_data()]
  File "/usr/local/lib/python3.6/dist-packages/ray/rllib/evaluation/sampler.py", line 95, in get_data
    item = next(self.rollout_provider)
  File "/usr/local/lib/python3.6/dist-packages/ray/rllib/evaluation/sampler.py", line 301, in _env_runner
    base_env.poll()
  File "/usr/local/lib/python3.6/dist-packages/ray/rllib/env/base_env.py", line 308, in poll
    self.new_obs = self.vector_env.vector_reset()
  File "/usr/local/lib/python3.6/dist-packages/ray/rllib/env/vector_env.py", line 96, in vector_reset
    return [e.reset() for e in self.envs]
  File "/usr/local/lib/python3.6/dist-packages/ray/rllib/env/vector_env.py", line 96, in <listcomp>
    return [e.reset() for e in self.envs]
  File "/usr/local/lib/python3.6/dist-packages/gym/wrappers/monitor.py", line 39, in reset
    self._after_reset(observation)
  File "/usr/local/lib/python3.6/dist-packages/gym/wrappers/monitor.py", line 188, in _after_reset
    self.reset_video_recorder()
  File "/usr/local/lib/python3.6/dist-packages/gym/wrappers/monitor.py", line 209, in reset_video_recorder
    self.video_recorder.capture_frame()
  File "/usr/local/lib/python3.6/dist-packages/gym/wrappers/monitoring/video_recorder.py", line 101, in capture_frame
    frame = self.env.render(mode=render_mode)
  File "/usr/local/lib/python3.6/dist-packages/gym/core.py", line 249, in render
    return self.env.render(mode, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/gym/envs/classic_control/continuous_mountain_car.py", line 143, in render
    return self.viewer.render(return_rgb_array = mode=='rgb_array')
  File "/usr/local/lib/python3.6/dist-packages/gym/envs/classic_control/rendering.py", line 105, in render
    arr = np.frombuffer(image_data.data, dtype=np.uint8)
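
As a rough way to confirm whether a given environment is affected by the pyglet issue described above, the following sketch compares the installed pyglet version against the 1.3.2 pin proposed in the issue. The helper names (parse_version, needs_downgrade, installed_pyglet_version) are hypothetical, and the parser assumes purely numeric dotted versions such as 1.3.2; pre-release suffixes are not handled.

```python
PINNED_PYGLET = "1.3.2"  # version proposed in the issue text


def parse_version(version):
    """Turn a dotted numeric version such as '1.3.2' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def needs_downgrade(installed, pinned=PINNED_PYGLET):
    """Return True when the installed version is newer than the pinned one."""
    return parse_version(installed) > parse_version(pinned)


def installed_pyglet_version():
    """Return the installed pyglet version string, or None if pyglet is absent."""
    try:
        import pyglet
    except ImportError:
        return None
    return pyglet.version


if __name__ == "__main__":
    found = installed_pyglet_version()
    if found is None:
        print("pyglet is not installed")
    elif needs_downgrade(found):
        print(f"pyglet {found} is newer than the pin; install pyglet=={PINNED_PYGLET}")
    else:
        print(f"pyglet {found} is at or below the pin")
```

Comparing tuples of integers rather than raw strings keeps a version like 1.10.0 ordered after 1.3.2, which a plain string comparison would get wrong.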
