The purpose of this project is to build a classifier for traffic signs. The model is trained and validated on the German Traffic Sign Dataset, which contains 39,920 32x32 color images split among a training, validation, and test set. The classes are not evenly distributed, as shown in the histogram below.
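The class distribution can be computed directly from the label array. A minimal sketch, assuming labels are loaded into a `y_train` array (the tiny array here is a stand-in for the real labels; 43 is the number of classes in the German dataset):

```python
import numpy as np

# Stand-in for the y_train labels loaded from the dataset pickle.
y_train = np.array([0, 0, 1, 2, 2, 2])

# Count examples per class; a bar plot of these counts produces the
# histogram referenced above.
counts = np.bincount(y_train, minlength=43)
print(counts[:3])  # [2 1 3]
```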
Three types of data augmentation were performed:

- Rotation, to adjust for signs seen at an angle. A further augmentation step not performed here would be to warp the images.
- Histogram equalization, to increase contrast.
- Contrast-limited adaptive histogram equalization (CLAHE), to increase contrast with a larger kernel size.
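The equalization step can be illustrated without OpenCV. The notebook presumably uses `cv2.equalizeHist` and `cv2.createCLAHE`; this NumPy sketch shows only the plain histogram-equalization idea: remap intensities through the normalized cumulative histogram so they span the full 0-255 range.

```python
import numpy as np

def equalize_hist(img):
    """Plain histogram equalization on an 8-bit grayscale image (NumPy sketch)."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_masked = np.ma.masked_equal(cdf, 0)  # ignore empty bins
    cdf_scaled = (cdf_masked - cdf_masked.min()) * 255 / (cdf_masked.max() - cdf_masked.min())
    lut = np.ma.filled(cdf_scaled, 0).astype(np.uint8)
    return lut[img]  # apply the lookup table

# A low-contrast image squeezed into [100, 120] stretches to [0, 255].
img = np.tile(np.arange(100, 121, dtype=np.uint8), (21, 1))
out = equalize_hist(img)
print(out.min(), out.max())  # 0 255
```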
Images were converted to grayscale and normalized:

- Performed on the training, validation, and test data.
- Grayscale reduces the number of parameters. Sermanet & LeCun (2011) found that color did not improve accuracy much, and prior runs of this model with color agreed.
- Data is normalized by subtracting the mean image and scaling, following common best practice, in order to keep the features in a consistent range. This reduces the likelihood of gradients getting out of control through vanishing gradients or saturating neurons in the network.
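The preprocessing above can be sketched as follows. The exact steps in the notebook may differ; this version assumes standard luminance weights for the grayscale conversion and scales by the standard deviation after subtracting the training-set mean image:

```python
import numpy as np

def preprocess(images, mean_image=None):
    """Convert an RGB batch to grayscale, subtract the mean image, and scale."""
    gray = images @ np.array([0.299, 0.587, 0.114])  # (N, 32, 32)
    gray = gray[..., np.newaxis]                     # keep a channel axis
    if mean_image is None:
        mean_image = gray.mean(axis=0)               # computed on training data only
    normalized = (gray - mean_image) / (gray.std() + 1e-8)
    return normalized, mean_image

X_train = np.random.rand(10, 32, 32, 3)
X_norm, mean_img = preprocess(X_train)
print(X_norm.shape)  # (10, 32, 32, 1)
```

The same `mean_img` would then be reused when preprocessing the validation and test sets, so that all splits see the same transformation.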
The model is based on VGG. It achieves an accuracy of 95% on the validation set (model 6).
Additional models were trained with fewer epochs to test ensemble learning. One model did not reach 93% validation accuracy even after 500 epochs.
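One common way to combine the separately trained networks is to average their softmax outputs and take the argmax. A minimal sketch (the probabilities below are illustrative, not from the actual models):

```python
import numpy as np

# Softmax outputs from three hypothetical models on one image,
# shape (num_models, num_images, num_classes).
probs_per_model = np.array([
    [[0.6, 0.3, 0.1]],   # model 1 votes for class 0
    [[0.2, 0.7, 0.1]],   # model 2 votes for class 1
    [[0.5, 0.4, 0.1]],   # model 3 votes for class 0
])

avg_probs = probs_per_model.mean(axis=0)   # (num_images, num_classes)
prediction = avg_probs.argmax(axis=1)
print(prediction)  # [1]
```

Note that averaging probabilities can overturn a majority vote: two of the three models prefer class 0, but model 2's confidence tips the average toward class 1.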
- Learning rate set to 1e-4. With Adam, the effective step size adapts over time based on moving averages of the gradients (momentum).
- The Adam optimizer adjusts the learning rate using the Adam algorithm. Adam is derived from Adagrad, an adaptive learning-rate method whose per-parameter learning rate decreases monotonically over time. RMSProp improved on this by using a moving average of squared gradients, reducing Adagrad's aggressiveness. Adam improved on RMSProp by adding momentum. TensorFlow's default momentum and decay-rate parameters are used.
- A batch size of 64 is a consequence of using a large VGG-like architecture on an older GPU. The earlier LeNet-like architecture used a batch size of 512.
- A dropout parameter of 0.5 is used as the default because it works well in practice; Srivastava et al. (2014) suggest this is because it maximizes the regularization effect of the dropout layer.
- Based on VGGNet
- The number of epochs was originally set to 1000, a large number by which the loss appeared to have plateaued. When training multiple networks, 500 epochs were used instead, because training took a long time and the validation-accuracy curves from the prior run had plateaued around that point.
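The optimizer behavior described above can be made concrete with one Adam update step in NumPy, using the learning rate from this project and TensorFlow's default decay rates (β₁ = 0.9, β₂ = 0.999). This is a sketch of the algorithm, not the notebook's training code:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update. m is the momentum term (moving average of gradients);
    v is the moving average of squared gradients that adapts the per-parameter
    step size, the RMSProp idea Adam builds on."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)   # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

theta, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
theta, m, v = adam_step(theta, np.array([0.5]), m, v, t=1)
print(theta)  # the first step moves by ~lr regardless of gradient scale
```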
Traffic_Sign_Classifier.ipynb contains the entirety of the project. This includes data exploration, augmentation, and preprocessing as well as model training, testing, and explanations.
All images are placed in the examples folder. The CSV files contain the validation accuracy for each model.