Comments (5)
Whoa, this is weird. I'm sure I ran the model several times on both 1080 and Titan X GPUs without getting NaN.
The problem might not be in the data; otherwise people training the steering model would have complained as well.
May I ask what your GPU and TF version are?
from research.
By any chance, do you have a multi-GPU setup and are asking TF to use only one GPU?
Also, are you able to continue training from the checkpoint? If you try to continue training, does it crash at the same point again? I remember getting random crashes due to TF rounding problems, but I could continue training from the checkpoint.
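To tell a crash at a fixed point from a random one, it can help to stop as soon as the loss becomes non-finite and note the step. The following is a minimal sketch in plain Python, not code from this repository; `check_loss`, `run_steps`, and `NaNLossError` are hypothetical names, and the list of losses stands in for values a real training loop would produce.

```python
import math


class NaNLossError(RuntimeError):
    """Raised when the training loss stops being a finite number."""


def check_loss(loss, step):
    """Return the loss unchanged, or raise if it is NaN or infinite."""
    if math.isnan(loss) or math.isinf(loss):
        raise NaNLossError("non-finite loss %r at step %d" % (loss, step))
    return loss


def run_steps(losses):
    """Walk over per-step losses and return the index of the last good step.

    `losses` is a stand-in (assumption) for the values a real loop computes.
    Returns -1 if no step completed with a finite loss.
    """
    last_good = -1
    for step, loss in enumerate(losses):
        try:
            check_loss(loss, step)
        except NaNLossError:
            break  # stop here; training can be resumed from the last checkpoint
        last_good = step
    return last_good
```

If the reported step differs between runs that start from the same checkpoint, that would point toward a numerical issue rather than a specific bad sample in the data.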
from research.
Graphics card: GTX 1060
TF: tensorflow (0.10.0rc0)
Cuda compilation tools, release 7.5, V7.5.17
cuDNN version 4
I only have one GPU, and I am using it for training.
I am not sure how to continue training from a checkpoint. I wasn't aware TF automatically creates checkpoints. I have simply been restarting the server and running the training again from scratch every time I get this error. (By the way, it seems to be almost finished now at epoch 195, so fingers crossed.) I just don't think it's safe to leave a bug like this (if it exists) lying around, since it could waste days of training.
For more info, I trained this on an Nvidia Tesla K20, and although it was slower than my 1060, it worked the first time without any errors. Again, I'm kind of worried that this might be a randomly occurring error, which can make it hard to hunt down.
from research.
TensorFlow does not do that automatically, but our code does. Add the flag --loadweights to continue from a checkpoint:
https://github.com/commaai/research/blob/master/train_generative_model.py#L137
Yeah, I guess it's some rounding error in TF that is beyond my reach for now... But let me know if the checkpoint thing works for you.
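For readers unfamiliar with the idea, resuming from a checkpoint can be sketched roughly as below. This is a toy, framework-free illustration and not how train_generative_model.py or --loadweights is implemented; the JSON file layout and the `save_checkpoint`, `load_checkpoint`, and `train` helpers are all assumptions made for the example, and the "weights" are a single number.

```python
import json
import os
import tempfile


def save_checkpoint(path, epoch, weights):
    """Write epoch and weights to a temp file, then swap it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"epoch": epoch, "weights": weights}, f)
    os.replace(tmp, path)  # atomic rename, so a crash cannot leave a half-written file


def load_checkpoint(path):
    """Return (next_epoch, weights), or (0, None) when no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0, None
    with open(path) as f:
        state = json.load(f)
    return state["epoch"] + 1, state["weights"]


def train(path, total_epochs, stop_after=None):
    """Toy loop: the 'weights' value grows by 1.0 per completed epoch."""
    start, weights = load_checkpoint(path)
    weights = 0.0 if weights is None else weights
    for epoch in range(start, total_epochs):
        if stop_after is not None and epoch > stop_after:
            return weights  # simulate a crash after epoch `stop_after`
        weights += 1.0
        save_checkpoint(path, epoch, weights)
    return weights


if __name__ == "__main__":
    ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.json")
    train(ckpt, 5, stop_after=2)   # first run "crashes" after epoch 2
    print(train(ckpt, 5))          # second run resumes at epoch 3
```

The point of the sketch is only that a restart picks up at the epoch after the last saved one instead of starting from scratch, which limits how much training a random crash can waste.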
from research.
How do you train the train_generative_model.py autoencoder successfully? I ran into some difficulty. Do I have to change something in the code? Thanks.
from research.