tianheyu927 / mopo
Code for MOPO: Model-based Offline Policy Optimization
License: MIT License
When adding an FC layer, the BNN class adds a copy of the layer:
https://github.com/tianheyu927/mopo/blob/master/mopo/models/bnn.py#L141
Further, the copy is created from the layer's repr:
https://github.com/tianheyu927/mopo/blob/master/mopo/models/fc.py#L132
https://github.com/tianheyu927/mopo/blob/master/mopo/models/fc.py#L51
However, the sn attribute is not included in the repr, so the copy falls back to the default value sn=False.
Even if I add sn to the repr, TensorFlow raises an error because of the duplicate name 'u'.
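The failure mode described above can be reproduced in isolation. The sketch below uses a hypothetical `Layer` class (not the repository's FC class) and assumes, as the linked lines suggest, that copies are made by evaluating the repr string. Any constructor argument left out of `__repr__` then silently reverts to its default in the copy.

```python
class Layer:
    """Hypothetical stand-in for a layer class; not the repository's FC."""

    def __init__(self, output_dim, activation=None, sn=False):
        self.output_dim = output_dim
        self.activation = activation
        self.sn = sn

    def __repr__(self):
        # `sn` is deliberately left out, mirroring the reported omission.
        return "Layer(output_dim={!r}, activation={!r})".format(
            self.output_dim, self.activation)

    def copy(self):
        # Rebuild the object from its own repr string.
        return eval(repr(self))


original = Layer(200, activation="swish", sn=True)
clone = original.copy()
print(original.sn, clone.sn)  # True False
```

Including every constructor argument in `__repr__` avoids this particular loss, although it would not by itself address the duplicate-name error mentioned above.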
It looks to me like the mopo.env module is indeed missing.
Traceback:
(mopo) capybara@blg4101:~/mopo$ mopo run_local examples.development --config=examples.config.d4rl.halfcheetah_mixed --gpus=1 --trial-gpus=1
WARNING: Logging before flag parsing goes to stderr.
W0909 03:02:30.004770 47483036384896 lazy_loader.py:50]
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
* https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
* https://github.com/tensorflow/addons
* https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.
running build_ext
building 'mujoco_py.cymj' extension
gcc -pthread -B /home/capybara/anaconda3/envs/mopo/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/mujoco_py -I/home/capybara/.mujoco/mujoco200/include -I/home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/numpy/core/include -I/home/capybara/anaconda3/envs/mopo/include/python3.6m -c /home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/mujoco_py/cymj.c -o /home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/mujoco_py/generated/_pyxbld_2.0.2.10_36_linuxcpuextensionbuilder/temp.linux-x86_64-3.6/home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/mujoco_py/cymj.o -fopenmp -w
gcc -pthread -B /home/capybara/anaconda3/envs/mopo/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/mujoco_py -I/home/capybara/.mujoco/mujoco200/include -I/home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/numpy/core/include -I/home/capybara/anaconda3/envs/mopo/include/python3.6m -c /home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/mujoco_py/gl/osmesashim.c -o /home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/mujoco_py/generated/_pyxbld_2.0.2.10_36_linuxcpuextensionbuilder/temp.linux-x86_64-3.6/home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/mujoco_py/gl/osmesashim.o -fopenmp -w
creating /home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/mujoco_py/generated/_pyxbld_2.0.2.10_36_linuxcpuextensionbuilder/lib.linux-x86_64-3.6
creating /home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/mujoco_py/generated/_pyxbld_2.0.2.10_36_linuxcpuextensionbuilder/lib.linux-x86_64-3.6/mujoco_py
gcc -pthread -shared -B /home/capybara/anaconda3/envs/mopo/compiler_compat -L/home/capybara/anaconda3/envs/mopo/lib -Wl,-rpath=/home/capybara/anaconda3/envs/mopo/lib -Wl,--no-as-needed -Wl,--sysroot=/ /home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/mujoco_py/generated/_pyxbld_2.0.2.10_36_linuxcpuextensionbuilder/temp.linux-x86_64-3.6/home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/mujoco_py/cymj.o /home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/mujoco_py/generated/_pyxbld_2.0.2.10_36_linuxcpuextensionbuilder/temp.linux-x86_64-3.6/home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/mujoco_py/gl/osmesashim.o -L/home/capybara/.mujoco/mujoco200/bin -Wl,-R/home/capybara/.mujoco/mujoco200/bin -lmujoco200 -lglewosmesa -lOSMesa -lGL -o /home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/mujoco_py/generated/_pyxbld_2.0.2.10_36_linuxcpuextensionbuilder/lib.linux-x86_64-3.6/mujoco_py/cymj.cpython-36m-x86_64-linux-gnu.so -fopenmp
Warning: Flow failed to import. Set the environment variable D4RL_SUPPRESS_IMPORT_ERROR=1 to suppress this message.
No module named 'flow'
Warning: CARLA failed to import. Set the environment variable D4RL_SUPPRESS_IMPORT_ERROR=1 to suppress this message.
No module named 'carla'
Traceback (most recent call last):
File "/home/capybara/anaconda3/envs/mopo/bin/mopo", line 33, in <module>
sys.exit(load_entry_point('mopo', 'console_scripts', 'mopo')())
File "/home/capybara/mopo/softlearning/scripts/console_scripts.py", line 202, in main
return cli()
File "/home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/capybara/anaconda3/envs/mopo/lib/python3.6/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/home/capybara/mopo/softlearning/scripts/console_scripts.py", line 71, in run_example_local_cmd
return run_example_local(example_module_name, example_argv)
File "/home/capybara/mopo/examples/instrument.py", line 209, in run_example_local
example_args = example_module.get_parser().parse_args(example_argv)
File "/home/capybara/mopo/examples/development/__init__.py", line 37, in get_parser
from examples.utils import get_parser
File "/home/capybara/mopo/examples/utils.py", line 9, in <module>
import softlearning.environments.utils as env_utils
File "/home/capybara/mopo/softlearning/environments/utils.py", line 6, in <module>
import mopo.env as env_overwrite
ModuleNotFoundError: No module named 'mopo.env'
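One way to narrow this down is to ask the import system whether the submodule is visible at all, and where the parent package is being loaded from. The helper below is a generic diagnostic sketch using only the standard library. `describe_module` is a name introduced here, and the loop at the end simply reuses the module names from the traceback.

```python
import importlib.util


def describe_module(name):
    """Return the file location of `name`, or None if it cannot be found."""
    try:
        # For dotted names, find_spec imports the parent package first.
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. `mopo`) is itself not importable.
        return None
    if spec is None:
        return None
    # Namespace packages may have no origin; report their search paths instead.
    return spec.origin or str(list(spec.submodule_search_locations or []))


for name in ("mopo", "mopo.env"):
    print(name, "->", describe_module(name))
```

If `mopo` resolves to a path but `mopo.env` prints `None`, the directory is absent from the checkout that Python is actually importing.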
Hi, thanks so much for your nice work and code. I have a question about the results in Table 3, specifically the MOPO-no_pen and MBPO results. Could you clarify the difference between these two experiments? I noticed that page 19 of the paper says: "For simplicity, we use MBPO, which essentially MOPO without reward penalty, for this ablation study". If so, what causes the performance gap in Table 3?
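For context on what "reward penalty" refers to: the MOPO paper trains the policy on model rewards reduced by an uncertainty term, r_tilde(s, a) = r_hat(s, a) - lambda * u(s, a), where the practical algorithm takes u to be the largest norm of the predicted standard deviation across the ensemble. The sketch below illustrates that formula in plain Python. The function name and the numbers are made up for illustration and are not taken from the repository.

```python
import math


def penalized_reward(reward, ensemble_stds, penalty_coeff):
    """Subtract an uncertainty penalty from a model-predicted reward.

    ensemble_stds holds one predicted standard-deviation vector per
    ensemble member for the same (state, action) pair.
    """
    # Uncertainty estimate: the largest L2 norm among the members' std vectors.
    uncertainty = max(
        math.sqrt(sum(s * s for s in stds)) for stds in ensemble_stds
    )
    return reward - penalty_coeff * uncertainty


stds = [[0.3, 0.4], [0.6, 0.8]]  # illustrative values only
print(penalized_reward(2.0, stds, penalty_coeff=1.0))
print(penalized_reward(2.0, stds, penalty_coeff=0.0))
```

With `penalty_coeff=0.0` the function returns the unpenalized model reward, which is the setting the quoted sentence describes.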
Do you have a code example of how to predict with and evaluate MOPO after fitting the model?
Hi, I really appreciate your open-source code. My question is how the performance numbers in the paper are reported.
For example, in Table 1, do you use the maximum evaluation return during the learning process or the last evaluation return? The return of the policy has large variance across iterations.
Thanks,
Yue
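To make the distinction concrete, the two reporting conventions can give quite different numbers on the same run. The returns below are invented for illustration and are not taken from the paper or the repository.

```python
from statistics import mean

# Hypothetical evaluation returns logged at successive evaluation points.
eval_returns = [1200.0, 3400.0, 5100.0, 2900.0, 4300.0]

max_return = max(eval_returns)        # best checkpoint over training
last_return = eval_returns[-1]        # final checkpoint only
last3_mean = mean(eval_returns[-3:])  # average of the last few evaluations

print(max_return, last_return, last3_mean)
```

Taking the maximum uses the evaluation returns themselves to pick the checkpoint, so it tends to overstate performance when the variance across iterations is large.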
viskit's required packages were updated in July 2021, so when installing viskit, make sure to use the package versions it supported before that update:
vitchyr/viskit@5411264
Hello!
Do you have any plan to support tensorflow-gpu 2?
Because of my Python and CUDA versions, I have to use mopo with tensorflow-gpu 2.7, so I made the usual modifications to port it from 1.14.0 to 2.7. Before I added a tape, it complained that a tape is required:
raise ValueError("`tape` is required when a `Tensor` loss is passed. "
ValueError: `tape` is required when a `Tensor` loss is passed. Received: loss=Tensor("BNN_1/add_9:0", shape=(), dtype=float32), tape=None.
But once I added var_list=self.optvars, tape=tf.GradientTape() in bnn.py, it raised the error below:
File "/code/com//mopo/mopo/models/bnn.py", line 256, in finalize
self.train_op = self.optimizer.minimize(train_loss, var_list=self.optvars, tape=tf.GradientTape())
File "/root/miniconda3/envs/mopo/lib/python3.9/site-packages/keras/optimizer_v2/optimizer_v2.py", line 532, in minimize
return self.apply_gradients(grads_and_vars, name=name)
File "/root/miniconda3/envs/mopo/lib/python3.9/site-packages/keras/optimizer_v2/optimizer_v2.py", line 633, in apply_gradients
grads_and_vars = optimizer_utils.filter_empty_gradients(grads_and_vars)
File "/root/miniconda3/envs/mopo/lib/python3.9/site-packages/keras/optimizer_v2/utils.py", line 73, in filter_empty_gradients
raise ValueError(f"No gradients provided for any variable: {variable}. "
ValueError: No gradients provided for any variable: (['BNN/Layer0_mean/FC_weights:0', 'BNN/Layer0_mean/FC_biases:0', 'BNN/Layer1_mean/FC_weights:0', 'BNN/Layer1_mean/FC_biases:0', 'BNN/Layer2_mean/FC_weights:0', 'BNN/Layer2_mean/FC_biases:0', 'BNN/Layer3_mean/FC_weights:0', 'BNN/Layer3_mean/FC_biases:0', 'BNN/Layer4_mean/FC_weights:0', 'BNN/Layer4_mean/FC_biases:0', 'BNN/Layer0_var/FC_weights:0', 'BNN/Layer0_var/FC_biases:0', 'BNN/max_log_var:0', 'BNN/min_log_var:0'],). Provided `grads_and_vars` is ((None, <tf.Variable 'BNN/Layer0_mean/FC_weights:0' shape=(7, 14, 200) dtype=float32>), (None, <tf.Variable 'BNN/Layer0_mean/FC_biases:0' shape=(7, 1, 200) dtype=float32>), (None, <tf.Variable 'BNN/Layer1_mean/FC_weights:0' shape=(7, 200, 200) dtype=float32>), (None, <tf.Variable 'BNN/Layer1_mean/FC_biases:0' shape=(7, 1, 200) dtype=float32>), (None, <tf.Variable 'BNN/Layer2_mean/FC_weights:0' shape=(7, 200, 200) dtype=float32>), (None, <tf.Variable 'BNN/Layer2_mean/FC_biases:0' shape=(7, 1, 200) dtype=float32>), (None, <tf.Variable 'BNN/Layer3_mean/FC_weights:0' shape=(7, 200, 200) dtype=float32>), (None, <tf.Variable 'BNN/Layer3_mean/FC_biases:0' shape=(7, 1, 200) dtype=float32>), (None, <tf.Variable 'BNN/Layer4_mean/FC_weights:0' shape=(7, 200, 267) dtype=float32>), (None, <tf.Variable 'BNN/Layer4_mean/FC_biases:0' shape=(7, 1, 267) dtype=float32>), (None, <tf.Variable 'BNN/Layer0_var/FC_weights:0' shape=(7, 200, 267) dtype=float32>), (None, <tf.Variable 'BNN/Layer0_var/FC_biases:0' shape=(7, 1, 267) dtype=float32>), (None, <tf.Variable 'BNN/max_log_var:0' shape=(1, 267) dtype=float32>), (None, <tf.Variable 'BNN/min_log_var:0' shape=(1, 267) dtype=float32>)).
Do you have any idea how to solve this problem? And do you have any plan to support tensorflow-gpu 2?
Thank you very much!
Hi, first of all, thanks for sharing the code of your paper.
I encountered many problems during installation, since viskit has dependencies that are incompatible with the ones specified in your library (e.g., the NumPy version). I solved this by removing some of the version specifications from the installers of your library and of viskit.
For this reason, I list at the end of this message all the libraries, with their versions, in the mopo conda environment.
When I launch mopo run_local examples.development --config=examples.config.d4rl.halfcheetah_mixed --gpus=1 --trial-gpus=1 from the main folder, I get the following error stack:
Traceback (most recent call last):
File "/home/samuele/miniconda3/envs/mopo/bin/mopo", line 33, in <module>
sys.exit(load_entry_point('mopo', 'console_scripts', 'mopo')())
File "/home/samuele/miniconda3/envs/mopo/bin/mopo", line 25, in importlib_load_entry_point
return next(matches).load()
File "/home/samuele/miniconda3/envs/mopo/lib/python3.6/site-packages/importlib_metadata/__init__.py", line 100, in load
module = import_module(match.group('module'))
File "/home/samuele/miniconda3/envs/mopo/lib/python3.6/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 994, in _gcd_import
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 678, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/samuele/Projects/mopo/softlearning/scripts/console_scripts.py", line 26, in <module>
from examples.instrument import (
ModuleNotFoundError: No module named 'examples.instrument'
I launched the command from different folders, e.g., the main folder, the mopo folder, etc.
I always get the same error.
Details of my environment:
aiohttp-cors 0.7.0 pypi_0 pypi
aioredis 1.3.1 pypi_0 pypi
asn1crypto 1.4.0 py_0
astor 0.8.1 pypi_0 pypi
astunparse 1.6.3 pypi_0 pypi
async-timeout 3.0.1 pypi_0 pypi
atari-py 0.2.6 pypi_0 pypi
atomicwrites 1.4.0 pypi_0 pypi
attrs 20.3.0 pypi_0 pypi
blessings 1.7 pypi_0 pypi
boto3 1.17.14 pypi_0 pypi
botocore 1.20.14 pypi_0 pypi
brotlipy 0.7.0 py36h27cfd23_1003
ca-certificates 2021.1.19 h06a4308_0
cachetools 4.2.1 pypi_0 pypi
certifi 2020.12.5 py36h06a4308_0
cffi 1.14.5 pypi_0 pypi
chardet 3.0.4 pypi_0 pypi
click 7.1.2 pypi_0 pypi
cloudpickle 1.6.0 pypi_0 pypi
colorama 0.4.4 pypi_0 pypi
colorful 0.5.4 pypi_0 pypi
conda 4.9.2 py36h06a4308_0
conda-package-handling 1.7.2 py36h03888b9_0
contextvars 2.4 pypi_0 pypi
cryptography 3.4.6 pypi_0 pypi
cython 0.29.22 pypi_0 pypi
d4rl 1.1 pypi_0 pypi
dask 2021.2.0 pypi_0 pypi
dataclasses 0.8 pypi_0 pypi
decorator 4.4.2 pypi_0 pypi
deepdiff 5.2.3 pypi_0 pypi
dm-control 0.0.355168290 pypi_0 pypi
dm-env 1.4 pypi_0 pypi
dm-tree 0.1.5 pypi_0 pypi
dotmap 1.3.23 pypi_0 pypi
filelock 3.0.12 pypi_0 pypi
flask 1.0.2 pypi_0 pypi
flatbuffers 1.12 pypi_0 pypi
funcsigs 1.0.2 pypi_0 pypi
gast 0.3.3 pypi_0 pypi
gitdb 4.0.5 pypi_0 pypi
gitdb2 4.0.2 pypi_0 pypi
gitpython 3.1.13 pypi_0 pypi
glfw 2.0.0 pypi_0 pypi
google-api-core 1.26.0 pypi_0 pypi
google-api-python-client 1.12.8 pypi_0 pypi
google-auth 1.27.0 pypi_0 pypi
google-auth-httplib2 0.0.4 pypi_0 pypi
google-auth-oauthlib 0.4.2 pypi_0 pypi
google-pasta 0.2.0 pypi_0 pypi
googleapis-common-protos 1.52.0 pypi_0 pypi
gpflow 2.1.4 pypi_0 pypi
gpustat 0.6.0 pypi_0 pypi
grpcio 1.32.0 pypi_0 pypi
gtimer 1.0.0b5 pypi_0 pypi
gym 0.18.0 pypi_0 pypi
h5py 2.10.0 pypi_0 pypi
hiredis 1.1.0 pypi_0 pypi
httplib2 0.19.0 pypi_0 pypi
idna 2.10 pyhd3eb1b0_0
idna-ssl 1.1.0 pypi_0 pypi
immutables 0.15 pypi_0 pypi
importlib-metadata 3.6.0 pypi_0 pypi
iniconfig 1.1.1 pypi_0 pypi
itsdangerous 1.1.0 pypi_0 pypi
jmespath 0.10.0 pypi_0 pypi
joblib 1.0.1 pypi_0 pypi
jsonschema 3.2.0 pypi_0 pypi
keras-applications 1.0.8 pypi_0 pypi
keras-preprocessing 1.1.2 pypi_0 pypi
kiwisolver 1.3.1 pypi_0 pypi
labmaze 1.0.3 pypi_0 pypi
libedit 3.1.20191231 h14c3975_1
libffi 3.2.1 hf484d3e_1007
libgcc-ng 9.1.0 hdf63c60_0
libstdcxx-ng 9.1.0 hdf63c60_0
lockfile 0.12.2 pypi_0 pypi
lxml 4.6.2 pypi_0 pypi
lz4 3.1.3 pypi_0 pypi
markdown 3.3.3 pypi_0 pypi
matplotlib 2.0.2 pypi_0 pypi
mjrl 1.0.0 pypi_0 pypi
mopo 0.0.1 dev_0 <develop>
more-itertools 8.7.0 pypi_0 pypi
msgpack 1.0.2 pypi_0 pypi
multidict 5.1.0 pypi_0 pypi
multipledispatch 0.6.0 pypi_0 pypi
multiworld 0.0.0 pypi_0 pypi
ncurses 6.2 he6710b0_1
networkx 2.5 pypi_0 pypi
numpy 1.16.0 pypi_0 pypi
nvidia-ml-py3 7.352.0 pypi_0 pypi
oauthlib 3.1.0 pypi_0 pypi
opencensus 0.7.12 pypi_0 pypi
opencensus-context 0.1.2 pypi_0 pypi
opencv-python-headless 4.3.0.36 pypi_0 pypi
openssl 1.0.2u h7b6447c_0
opt-einsum 3.3.0 pypi_0 pypi
ordered-set 4.0.2 pypi_0 pypi
packaging 20.9 pypi_0 pypi
pandas 1.1.5 pypi_0 pypi
patchelf 0.9 he6710b0_3
pip 21.0.1 py36h06a4308_0
plotly 4.0.0 pypi_0 pypi
pluggy 0.13.1 pypi_0 pypi
prometheus-client 0.9.0 pypi_0 pypi
protobuf 3.15.2 pypi_0 pypi
psutil 5.8.0 pypi_0 pypi
py 1.10.0 pypi_0 pypi
py-spy 0.3.4 pypi_0 pypi
pyasn1 0.4.8 pypi_0 pypi
pyasn1-modules 0.2.8 pypi_0 pypi
pybullet 3.0.8 pypi_0 pypi
pycosat 0.6.3 py36h27cfd23_0
pycparser 2.20 py_2
pygame 2.0.1 pypi_0 pypi
pyglet 1.5.0 pypi_0 pypi
pyopengl 3.1.5 pypi_0 pypi
pyopenssl 20.0.1 pypi_0 pypi
pyrsistent 0.17.3 pypi_0 pypi
pysocks 1.7.1 py36h06a4308_0
pytest 6.2.2 pypi_0 pypi
python 3.6.5 hc3d631a_2
pytz 2021.1 pypi_0 pypi
pywavelets 1.1.1 pypi_0 pypi
pyyaml 5.4.1 pypi_0 pypi
ray 1.2.0 pypi_0 pypi
readline 7.0 h7b6447c_5
redis 3.5.3 pypi_0 pypi
requests 2.25.1 pyhd3eb1b0_0
requests-oauthlib 1.3.0 pypi_0 pypi
retrying 1.3.3 pypi_0 pypi
rsa 4.7.2 pypi_0 pypi
ruamel_yaml 0.15.87 py36h7b6447c_1
s3transfer 0.3.4 pypi_0 pypi
scikit-image 0.17.2 pypi_0 pypi
scikit-learn 0.24.1 pypi_0 pypi
scipy 1.5.4 pypi_0 pypi
serializable 0.1.0 pypi_0 pypi
setproctitle 1.2.2 pypi_0 pypi
setuptools 52.0.0 py36h06a4308_0
six 1.15.0 py36h06a4308_0
smmap 3.0.5 pypi_0 pypi
smmap2 3.0.1 pypi_0 pypi
sqlite 3.33.0 h62c20be_0
tabulate 0.8.9 pypi_0 pypi
tensorboard 2.4.1 pypi_0 pypi
tensorboard-plugin-wit 1.8.0 pypi_0 pypi
tensorboardx 2.1 pypi_0 pypi
tensorflow 2.4.1 pypi_0 pypi
tensorflow-estimator 2.4.0 pypi_0 pypi
tensorflow-gpu 2.4.1 pypi_0 pypi
tensorflow-probability 0.12.1 pypi_0 pypi
termcolor 1.1.0 pypi_0 pypi
threadpoolctl 2.1.0 pypi_0 pypi
tifffile 2020.9.3 pypi_0 pypi
tk 8.6.10 hbc83047_0
toml 0.10.2 pypi_0 pypi
toolz 0.11.1 pypi_0 pypi
tqdm 4.57.0 pypi_0 pypi
typing-extensions 3.7.4.3 pypi_0 pypi
uritemplate 3.0.1 pypi_0 pypi
urllib3 1.26.3 pyhd3eb1b0_0
viskit 0.1 dev_0 <develop>
werkzeug 1.0.1 pypi_0 pypi
wheel 0.36.2 pyhd3eb1b0_0
wrapt 1.12.1 pypi_0 pypi
xz 5.2.5 h7b6447c_0
yaml 0.2.5 h7b6447c_0
yarl 1.6.3 pypi_0 pypi
zipp 3.4.0 pypi_0 pypi
zlib 1.2.11 h7b6447c_3
Thanks in advance!
Best regards,
Samuele.
This line of the installation instructions didn't work for me:
pip install -e viskit
Does this just mean that you should clone viskit somewhere else and install it with pip install -e?
e.g.
cd ~
git clone git@github.com:vitchyr/viskit.git
cd viskit
pip install -e .
I recently found a MOPO implementation in PyTorch (https://github.com/junming-yang/mopo-pytorch). I cannot find the difference between MOPO and SAC. Is the only difference that the data are sampled from the rollout replay buffer generated by the learned dynamics?
(base) hzx@hzx-System-Product-Name:~/mopo$ conda env create -f environment/gpu-env.yml
Collecting package metadata (repodata.json): done
Solving environment: done
==> WARNING: A newer version of conda exists. <==
current version: 22.9.0
latest version: 23.10.0
Please update conda by running
$ conda update -n base -c defaults conda
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Installing pip dependencies: | Ran pip subprocess with arguments:
['/home/hzx/anaconda3/envs/mopo/bin/python', '-m', 'pip', 'install', '-U', '-r', '/home/hzx/mopo/environment/condaenv.dv0e6q7u.requirements.txt']
Pip subprocess output:
Collecting git+https://github.com/vitchyr/multiworld.git@d76b3dae2e8cbca02924f93d6cc0239c552f6408 (from -r /home/hzx/mopo/environment/requirements.txt (line 53))
Cloning https://github.com/vitchyr/multiworld.git (to revision d76b3dae2e8cbca02924f93d6cc0239c552f6408) to /tmp/pip-req-build-l357d2x_
Pip subprocess error:
Running command git clone -q https://github.com/vitchyr/multiworld.git /tmp/pip-req-build-l357d2x_
error: RPC failed; curl 92 HTTP/2 stream 0 was not closed cleanly: CANCEL (err 8)
fatal: the remote end hung up unexpectedly
fatal: early EOF
fatal: index-pack failed
WARNING: Discarding git+https://github.com/vitchyr/multiworld.git@d76b3dae2e8cbca02924f93d6cc0239c552f6408. Command errored out with exit status 128: git clone -q https://github.com/vitchyr/multiworld.git /tmp/pip-req-build-l357d2x_ Check the logs for full command output.
ERROR: Command errored out with exit status 128: git clone -q https://github.com/vitchyr/multiworld.git /tmp/pip-req-build-l357d2x_ Check the logs for full command output.
failed
CondaEnvException: Pip failed
Hi, thanks for sharing your great work. However, I am confused about the rollout generation process.
As I see in the code, the agent has access to a pre-defined terminal function that cuts off unrealistic states. Does this assumption generally hold across the broad range of offline RL settings? To my understanding, in the offline setting the agent should only have access to a fixed dataset and nothing else. It feels a little like cheating to me, especially since the paper argues that one of the differences between MOPO and MOReL is that MOPO's soft penalty, rather than a hard termination, allows the agent to take risky actions.
Besides, if MOPO really needs the terminal function, why not learn one with a neural network? I have already seen many model-based works on Atari games that use a learned terminal function.
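For readers unfamiliar with the mechanism under discussion, here is a minimal sketch of how a hand-written termination function can truncate model rollouts. Everything in it is hypothetical: the toy dynamics, the convention that the first observation entry is a height, and the 0.7 threshold are stand-ins, not the repository's implementation.

```python
import random


def termination_fn(obs, act, next_obs):
    """Hand-coded rule: the episode ends once the 'height' drops too low."""
    height = next_obs[0]  # assumed convention: index 0 is the body height
    return height < 0.7   # assumed threshold, for illustration only


def toy_model(obs, act, rng):
    """Stand-in for a learned dynamics model: a noisy additive step."""
    next_obs = [x + a + rng.gauss(0.0, 0.05) for x, a in zip(obs, act)]
    reward = 1.0
    return next_obs, reward


def rollout(start_obs, policy, horizon, rng):
    """Roll the model forward, stopping early when termination_fn fires."""
    obs, transitions = start_obs, []
    for _ in range(horizon):
        act = policy(obs)
        next_obs, reward = toy_model(obs, act, rng)
        done = termination_fn(obs, act, next_obs)
        transitions.append((obs, act, reward, next_obs, done))
        if done:
            break
        obs = next_obs
    return transitions


rng = random.Random(0)
falling_policy = lambda obs: [-0.2 for _ in obs]  # pushes the height down
steps = rollout([1.25, 0.0], falling_policy, horizon=10, rng=rng)
print(len(steps), steps[-1][-1])
```

The learned alternative raised in the question would swap `termination_fn` for a classifier fit on the terminal flags in the offline dataset, leaving the rollout loop unchanged.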