Comments (17)
Thanks a lot - I'm seeing the same flavour of results as you in that colab.
Interestingly though I do not see this error in our runs inside Google.
My suspicion is that something is going wrong with the versioning/export... but at the moment I don't understand what that is...
We will try and get to the bottom of this ASAP - thank you for raising!
from bsuite.
Hi bycn!
That's disturbing... although I can confirm that our internal tests that run daily do not have any issues.
I'll have a look into this and also try to think about how we can narrow the gap between the Google-internal and open-source versions.
from bsuite.
My guess is that there are some versioning/installation issues happening with the TF install-in-colab.
Did you try the Jax agent?
This may also have the same issue, but it might help narrow this down
from bsuite.
When I run the install in a fresh colab I get an error:
ERROR: tensorflow-probability 0.8.0 has requirement cloudpickle==1.1.1, but you'll have cloudpickle 1.3.0 which is incompatible.
Do you see a similar error?
Can you please set up a colab that I can run from start-finish to reproduce your issue?
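One way to surface a pin mismatch like the one in the pip error above is to compare exact `==` pins against the installed versions directly. This is only a minimal sketch: `find_pin_conflicts` is a hypothetical helper (not part of pip or bsuite), it handles exact pins only, and the version numbers are copied from the error message.

```python
from typing import Dict, List, Tuple


def find_pin_conflicts(
    installed: Dict[str, str],
    pins: Dict[str, str],
) -> List[Tuple[str, str, str]]:
    """Return (package, required, installed) for each exact pin that is not met.

    Hypothetical helper: handles only exact '==' pins, not version ranges.
    Packages that are pinned but not installed are skipped.
    """
    conflicts = []
    for package, required in sorted(pins.items()):
        found = installed.get(package)
        if found is not None and found != required:
            conflicts.append((package, required, found))
    return conflicts


# Versions taken from the pip error message above.
installed = {"cloudpickle": "1.3.0", "tensorflow-probability": "0.8.0"}
pins_from_tfp = {"cloudpickle": "1.1.1"}

for package, required, found in find_pin_conflicts(installed, pins_from_tfp):
    print(f"{package}: pinned to {required}, but {found} is installed")
```

In a live environment, running `pip check` after the install should report the same kind of incompatibility without any custom code.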
from bsuite.
Yes, I saw a similar error and looked into it. It seems to affect only tfp, which is used only in the actor-critic agent. The colab just installs bsuite[baselines], runs the above code, and uses the GPU runtime; that's all. Do you see similar results?
i.e.:
!pip install bsuite[baselines]
<code above>
from bsuite.
I can confirm that I've run this inside Google, using the most up to date versioning.
When I do that I learn size=30 in about 800 bad episodes every time.
deep_sea/0 typically takes < 100 episodes.
Can you link to a colab that loads/examines the results in exactly the same way?
from bsuite.
https://colab.research.google.com/drive/1XjdHeLmkiYW2b8-ybQfF-V9P9KbE74c7?usp=sharing
from bsuite.
Thanks for looking into it. Any ideas so far?
from bsuite.
I can't work this out at the moment... did you try the JAX implementation?
Is that one working for you?
(Both of them are working completely fine inside Google)
from bsuite.
I haven't. I would actually like to use TF to compare agents in my research, so it would be great to figure out the problem there. Does the internal version use exactly the same code (implementation and default agent hyperparameters)?
from bsuite.
In particular, it would be great if you could confirm the frequency of gradient updates. The current default has an sgd_period of 1, which appears to update the agent with a minibatch after every step, whereas the original paper says to do this after every episode.
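To make the difference concrete, here is a rough sketch of how many gradient updates each schedule produces. It assumes the agent runs one minibatch update whenever the global step count is a multiple of `sgd_period`; `count_updates` and the episode lengths are made up for illustration and are not bsuite code.

```python
from typing import Sequence


def count_updates(episode_lengths: Sequence[int], sgd_period: int) -> int:
    """Count updates when one update runs every `sgd_period` environment steps."""
    total_steps = 0
    updates = 0
    for length in episode_lengths:
        for _ in range(length):
            total_steps += 1
            if total_steps % sgd_period == 0:
                updates += 1
    return updates


# Illustrative only: 100 episodes of 30 steps each.
episode_lengths = [30] * 100

per_step = count_updates(episode_lengths, sgd_period=1)
per_episode = count_updates(episode_lengths, sgd_period=30)
print(per_step, per_episode)
```

With fixed-length episodes, setting `sgd_period` to the episode length gives one update per episode; with variable-length episodes the two schedules would no longer line up.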
from bsuite.
Yes, those differences in SGD period are there, but I don't think they would make a difference as big as this.
It seems like something is wrong with the TF agents... and I'm not sure what.
The solution to this may involve us deleting/deprecating the TF agents and supporting only Jax.
Nothing has changed in our actual code, but potentially something has changed in one of the dependencies.
from bsuite.
Bryan - have you tried running this outside of colab?
My current suspicion is something is going wrong specifically in colab dependency installation.
from bsuite.
In fact, I've started a new virtual environment and tried to follow the instructions:
So, it does look like a versioning issue that is getting silenced somehow in the colab
from bsuite.
Yep, so:
1) I was able to fix those installation errors you see (and I also got) outside of Colab by switching to Python 3.6.12.
2) I tried Jax in Colab and it works consistently, so it is a TF agent issue.
I haven't actually run the successful installation yet, but I'm guessing it'll work fine. As for Colab, I'm not really sure, but the silent error is quite concerning 😄
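When comparing a working and a failing environment like this, it can help to print the interpreter and package versions from each side. A small sketch: `environment_snapshot` is a hypothetical helper, the package list is only a guess at what is relevant here, and `importlib.metadata` needs Python 3.8 or newer (so it would not run as-is on 3.6).

```python
import sys
from importlib import metadata
from typing import Dict, Iterable, Optional


def environment_snapshot(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    snapshot: Dict[str, Optional[str]] = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            snapshot[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            snapshot[name] = None
    return snapshot


# Assumed list of packages worth comparing; adjust as needed.
for name, version in environment_snapshot(
    ["bsuite", "tensorflow", "tensorflow-probability", "cloudpickle"]
).items():
    print(f"{name}: {version}")
```

Pasting the output from Colab and from the local virtual environment side by side makes version drift easier to spot.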
from bsuite.
OK I think we have now fixed this with an updated set of versioning...
Can you check if it's working for you now?
from bsuite.
In fact, I've run the colab you linked to and everything is fine now!
Seems like the issue was something to do with TensorFlow Probability versioning and a poor installation in colab.
from bsuite.