Comments (3)
I used the example with the default toy model. Here is my procedure:
- Set EPOCHS to 10, train, and dump the Q_network to ./nnet so that a new Q_network can resume from it.
- Set EPOCHS to 10 and train a second model, using setNetwork to resume from the dumped Q_network.
- Set EPOCHS to 20 and train a third Q_network for comparison.
Epochs 1-10 in the second log differ from epochs 11-20 in the third log.
Specifically, the most important difference is the V value for each action.
Moreover, if we train long enough, we find that the learning rate, discount factor, and epsilon are also not transferred from the dumped Q_network.
setNetwork / dumpNetwork only deal with the layer parameters of the Q_network. When we resume from the dumped Q_network, the training results are not identical to those of the original, uninterrupted training process.
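To illustrate why the logs can diverge, here is a minimal sketch of an exploration schedule that restarts on resume. The geometric decay, the function name `epsilon_schedule`, and all numeric values are assumptions made for illustration; they are not taken from DeeR's implementation.

```python
def epsilon_schedule(start, decay, floor, n_epochs):
    """Return the epsilon used in each epoch of a geometric decay schedule.

    Hypothetical schedule for illustration only, not DeeR's controller.
    """
    eps = start
    values = []
    for _ in range(n_epochs):
        values.append(eps)
        eps = max(floor, eps * decay)
    return values


# One uninterrupted 20-epoch run.
continuous = epsilon_schedule(start=1.0, decay=0.9, floor=0.1, n_epochs=20)

# A resumed 10-epoch run: the weights are reloaded, but epsilon is not,
# so the schedule starts again from its initial value.
resumed = epsilon_schedule(start=1.0, decay=0.9, floor=0.1, n_epochs=10)

# Epochs 11-20 of the continuous run explore less than epochs 1-10 of the
# resumed run, so the two logs are not expected to match.
print("continuous, epochs 11-13:", continuous[10:13])
print("resumed, epochs 1-3:     ", resumed[0:3])
```

Under these assumptions the resumed run behaves like a fresh run as far as exploration is concerned, even though it starts from trained weights.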
from deer.
Hi,
In fact, it is intended to work that way (I may add some documentation to make this clear). When you use the functions getAllParams/setAllParams, the params are those of the neural network, not the hyperparameters used for training the Q-network. The goal is that you can afterwards start off with a NN that is already trained. If you wish to continue the training, you can define any hyperparameters you like along with it, but it doesn't always make sense to use the exact same hyperparameters you used when you dumped the NN.
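If you do want a resumed run to reuse the earlier hyperparameters, one option is to save them next to the network parameters yourself. The sketch below uses a hypothetical `ToyQNetwork` class as a stand-in; only the method names getAllParams/setAllParams come from this thread, while the attribute names, the helpers `save_checkpoint`/`load_checkpoint`, and the file layout are assumptions and not DeeR's API.

```python
import os
import pickle
import tempfile


class ToyQNetwork:
    """Hypothetical stand-in for a Q-network; not DeeR's actual class."""

    def __init__(self, learning_rate, discount, epsilon):
        self.weights = [[0.0, 0.0], [0.0, 0.0]]
        self.learning_rate = learning_rate
        self.discount = discount
        self.epsilon = epsilon

    def getAllParams(self):
        # Only the layer parameters, as described in the reply above.
        return [row[:] for row in self.weights]

    def setAllParams(self, params):
        self.weights = [row[:] for row in params]


def save_checkpoint(path, net):
    """Store network parameters and training hyperparameters together."""
    state = {
        "params": net.getAllParams(),
        "hyper": {
            "learning_rate": net.learning_rate,
            "discount": net.discount,
            "epsilon": net.epsilon,
        },
    }
    with open(path, "wb") as f:
        pickle.dump(state, f)


def load_checkpoint(path, net, restore_hyper=True):
    """Restore parameters; hyperparameters are restored only on request."""
    with open(path, "rb") as f:
        state = pickle.load(f)
    net.setAllParams(state["params"])
    if restore_hyper:
        for name, value in state["hyper"].items():
            setattr(net, name, value)


# Usage: train, checkpoint, then resume in a freshly configured network.
trained = ToyQNetwork(learning_rate=0.005, discount=0.9, epsilon=0.2)
trained.weights = [[0.1, -0.3], [0.7, 0.2]]

path = os.path.join(tempfile.mkdtemp(), "checkpoint.pkl")
save_checkpoint(path, trained)

fresh = ToyQNetwork(learning_rate=0.1, discount=0.5, epsilon=1.0)
load_checkpoint(path, fresh, restore_hyper=True)
print(fresh.weights, fresh.learning_rate, fresh.discount, fresh.epsilon)
```

Passing `restore_hyper=False` keeps the behaviour described above: the trained weights are loaded and the new run uses whatever hyperparameters you configured for it.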
Thanks a lot! I understand now; that is indeed more reasonable.