Comments (6)
Actually, I'm fixing some other stuff right now, so I'll add that portion in. You'll find it on branch mspryn_bugfixes shortly.
from autonomousdrivingcookbook.
I checked the code; those are intentional.
When training an RL model, you need to balance exploration (trying new strategies) and exploitation (incrementally improving the best known strategy). We do this in our code with a strategy called linear epsilon annealing. With this method, we start by making decisions completely at random, because the untrained model's predictions are meaningless. Even with transfer learning, the dense layers are initialized at random. Over time, as the model gets better at predicting the Q values, we decrease the percentage of time that we explore (i.e. take random actions) and increase the percentage of time that we exploit (i.e. follow the model's predictions). If you look around line 348 in distributed_agent.py, you can see where we decrease epsilon after each successful iteration. During training, however, we never want to stop exploring entirely, as this can cause the model to get stuck in a local minimum. As a consequence, even a perfect model will crash during training, because we are still making some random decisions. The final value of epsilon is set by the parameter min_epsilon, which defaults to 0.1, meaning 10% random actions.
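To make the schedule described above concrete, here is a minimal sketch of linear epsilon annealing combined with epsilon-greedy action selection. It is not the code from distributed_agent.py: the function names `anneal_epsilon` and `choose_action` and the step size of 0.003 are assumptions made up for illustration. Only min_epsilon (default 0.1) and the parameter name per_iter_epsilon_reduction come from this thread.

```python
import random


def anneal_epsilon(epsilon, per_iter_epsilon_reduction, min_epsilon):
    """Lower epsilon by a fixed step, never dropping below min_epsilon."""
    return max(min_epsilon, epsilon - per_iter_epsilon_reduction)


def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest predicted Q value.
    return max(range(len(q_values)), key=lambda i: q_values[i])


epsilon = 1.0                       # start fully random
min_epsilon = 0.1                   # default mentioned in this thread
per_iter_epsilon_reduction = 0.003  # hypothetical value, for illustration only

for _ in range(1000):
    # In a real agent, a training iteration would run here first.
    epsilon = anneal_epsilon(epsilon, per_iter_epsilon_reduction, min_epsilon)
```

Because epsilon is clamped at min_epsilon rather than at zero, `choose_action` keeps returning an occasional random action even at the end of the loop, which is why a well-trained model can still crash during training.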
A few more comments / questions:
- RL models are prone to overfitting like any model. You need to be really careful if you resume training of an already trained model. I'm not surprised that running hours of training on the already trained model leads to garbage.
- Note that always_random is only set to True when filling the replay memory for the first time. After that, it's false, which means linear epsilon annealing happens (check around line 160).
- We don't want to overwhelm the training machine with data, so we always stop and perform a training iteration after 30 seconds. While that happens, AirSim keeps applying the last control signal that was sent, meaning that the car will most likely crash.
- You can try playing around with the min_epsilon and per_iter_epsilon_reduction parameters to modify the training schedule. The former will control the minimum percentage of time that we explore, and the latter will control how quickly we move from a full-explore to mostly-exploit strategy. The parameters provided in the notebooks have been shown to work well for most cases, but there may be a better combination that will allow for faster training.
- How are you determining "convergence"?
- Are you running the final models using the RunModel.ipynb?
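When experimenting with min_epsilon and per_iter_epsilon_reduction as suggested in the list above, it can help to estimate how many iterations the schedule takes to reach its floor. The helper below is a rough sketch under the assumption that epsilon drops by a constant per_iter_epsilon_reduction each iteration; the function name `iterations_to_floor` and the starting value of 1.0 are hypothetical and not part of the cookbook code.

```python
import math


def iterations_to_floor(initial_epsilon, min_epsilon, per_iter_epsilon_reduction):
    """Estimate how many linear reduction steps it takes to reach min_epsilon."""
    if per_iter_epsilon_reduction <= 0:
        raise ValueError("per_iter_epsilon_reduction must be positive")
    # Distance epsilon still has to travel; zero if already at or below the floor.
    span = max(0.0, initial_epsilon - min_epsilon)
    # Round before taking the ceiling so float noise does not add a spurious step.
    return math.ceil(round(span / per_iter_epsilon_reduction, 9))


# Hypothetical settings: compare a slow and a fast schedule.
slow = iterations_to_floor(1.0, 0.5, 0.125)
fast = iterations_to_floor(1.0, 0.5, 0.25)
print(f"slow schedule: {slow} iterations, fast schedule: {fast} iterations")
```

A larger per_iter_epsilon_reduction shortens the mostly-random phase, while a larger min_epsilon shortens the schedule by raising the floor it has to reach.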
from autonomousdrivingcookbook.
Thanks!
Yes, I'm checking using RunModel. I have no good definition of convergence; I simply run RunModel 5 times, and if all of the runs crash within 5 seconds, I assume the model is bad. This happens even if I do as suggested and use "pretrained_model_weights.h5" to initialize the transfer learning.
from autonomousdrivingcookbook.
I think I found another related bug:
at distributed_agent.py line 40:
self.__train_conv_layers = bool(parameters['train_conv_layers'])
should be:
self.__train_conv_layers = bool(int(parameters['train_conv_layers']))
Otherwise it's always True (any non-empty string is truthy), which might explain why my transfer learning didn't work.
from autonomousdrivingcookbook.
I see. That value will be the string "True" or "False", so the fix should be:
self.__train_conv_layers = (parameters['train_conv_layers'].lower().strip() == 'true')
Can you submit a PR with that?
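For anyone hitting the same problem, the short sketch below shows why the original line always evaluates to True and why going through int() does not help when the value is the text "True" or "False". The helper name `parse_bool` is made up for this example; the comparison inside it mirrors the fix quoted above.

```python
def parse_bool(value):
    """Interpret the text 'True'/'False' (any case, padded or not) as a boolean."""
    return str(value).lower().strip() == 'true'


# bool() on any non-empty string is True, including the string 'False'.
assert bool('False') is True

# int() cannot parse the words 'True'/'False', so bool(int(...)) raises instead.
try:
    bool(int('False'))
    int_route_works = True
except ValueError:
    int_route_works = False
assert int_route_works is False

# Comparing the normalized text gives the intended result.
assert parse_bool('False') is False
assert parse_bool(' True ') is True
```

The bool(int(...)) variant would only work if the parameter were passed as "0" or "1", which is not the case here.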
from autonomousdrivingcookbook.
The PR has been merged, so I'll close this issue.
from autonomousdrivingcookbook.