
Vehicle Detection Project

Our implementation of Project 5 of Term 1 of the Self-Driving Car Nanodegree.

Code Organization


The code is organized into the following files:

  • setup.py Downloads the vehicle/non-vehicle zip data, extracts features, and stores them in a single file.
  • features.py A set of utility functions for computing HOG, color, and spatial features.
  • classifier.py Imports the features created by setup.py and trains a linear SVM classifier. The classifier is also stored in a pickle file.
  • tracker.py The implementation of our main pipeline.

How to Run


  1. Download the data and create the feature set:
     python setup.py
  2. Train the classifier:
     python classifier.py
  3. Run the tracker:
     python tracker.py video/project_video.mp4

Pipeline


Our pipeline works by running a vehicle classifier on each frame of the video and detecting vehicles using a sliding window algorithm. In this section, we describe the implementation of our pipeline in a little more detail. The video result of this tracking, along with some discussion of our approach, is presented towards the end of this writeup.

1. Feature Extraction

The first step was to train a classifier that can detect vehicles in a video. We trained a linear SVM classifier on the training data provided with the project.

The training data consists of 64 by 64 images of vehicles and non-vehicles. An example is shown below.

[Image: example 64x64 vehicle and non-vehicle training samples]

The feature matrix consists of HOG (Histogram of Oriented Gradients), color histogram, and spatial features. For the HOG features, we used all three channels of the YCrCb color space; the color space and number of channels were chosen through experimentation with a training/validation set. The HOG parameters were orientations=9, pixels_per_cell=(8, 8), and cells_per_block=(2, 2). The color histogram was computed with bins=32. Similarly, the raw pixels, downsampled to (32, 32), were used as the spatial feature.

We experimented with using a single color space and with/without the histogram and spatial features. The accuracy on the validation set was much higher when all of these features were used together than with any single feature.
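As an illustration, here is a minimal sketch of such a combined feature vector using skimage and OpenCV. The function name single_img_features is hypothetical; the project's actual implementation is referenced below.

    import cv2
    import numpy as np
    from skimage.feature import hog

    def single_img_features(rgb):
        # rgb: a 64x64x3 training image
        img = cv2.cvtColor(rgb, cv2.COLOR_RGB2YCrCb)
        # HOG over all three YCrCb channels
        hog_feats = np.hstack([hog(img[:, :, ch], orientations=9,
                                   pixels_per_cell=(8, 8),
                                   cells_per_block=(2, 2),
                                   feature_vector=True)
                               for ch in range(3)])
        # 32-bin color histogram per channel
        hist_feats = np.hstack([np.histogram(img[:, :, ch], bins=32,
                                             range=(0, 256))[0]
                                for ch in range(3)])
        # Raw pixels downsampled to 32x32 as the spatial feature
        spatial_feats = cv2.resize(img, (32, 32)).ravel()
        return np.concatenate([hog_feats, hist_feats, spatial_feats])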

The code for the HOG / color features is in features.py:extract_features():line#98. Below is an example image with HOG features computed across all color channels.

[Image: HOG features computed across all three YCrCb channels]

2. Classifier Training

The code for the SVM trainer is in classifier.py:train():line#9. This file loads a pickle file containing the feature matrix, which is generated by setup.py:preprocess().

The data was split into training/validation/test sets. To avoid bias due to the time-series nature of the GTI dataset, we used the first ~70% of the images (in the order in which they are stored on disk) for training, the next 15% for validation, and the remaining 15% as the test set. The KITTI dataset was first shuffled and then split 70/15/15% into training/validation/test sets. We ensured that the output classes are roughly balanced between all these datasets. The code for creating these splits is in setup.py:preprocess().
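A minimal sketch of this split strategy follows; the directory layout under data/vehicles/ is an assumption, and the real code lives in setup.py:preprocess().

    import glob
    import random

    def split_ordered(paths, train=0.70, val=0.15):
        # Sequential 70/15/15 split that preserves the given ordering.
        n = len(paths)
        i, j = int(n * train), int(n * (train + val))
        return paths[:i], paths[i:j], paths[j:]

    # Hypothetical paths -- adjust to wherever setup.py unpacks the data.
    gti = sorted(glob.glob('data/vehicles/GTI*/*.png'))       # time-ordered
    kitti = glob.glob('data/vehicles/KITTI_extracted/*.png')

    gti_train, gti_val, gti_test = split_ordered(gti)         # keep disk order
    random.Random(0).shuffle(kitti)                           # KITTI can be shuffled
    kitti_train, kitti_val, kitti_test = split_ordered(kitti)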

A linear SVM was trained on the training data. We did basic parameter tuning by manually changing the value of the C parameter and testing accuracy on the validation set. A value of C=1.0 was chosen for the final classifier.
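A minimal sketch of this training step; the pickle file name features.p and its keys are assumptions, and classifier.py holds the real code.

    import pickle
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import LinearSVC

    with open('features.p', 'rb') as f:       # assumed output of setup.py
        data = pickle.load(f)

    scaler = StandardScaler().fit(data['X_train'])   # per-column normalization
    clf = LinearSVC(C=1.0)                           # C picked on the validation set
    clf.fit(scaler.transform(data['X_train']), data['y_train'])
    print('validation accuracy:',
          clf.score(scaler.transform(data['X_val']), data['y_val']))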

3. Search and Detection

We used a sliding window approach to detect vehicles in each video frame. The HOG features are computed once for the whole image, and sub-sampling is then used to extract the features for each overlapping window position; an overlap of 75% was used. Since vehicles can appear at different scales in the video, we run the window search at several image scales. Additionally, we only search the region of the image below the horizon (y ≈ 400). The resulting detection windows are then combined using a heatmap and further processed to eliminate duplicates and false positives.
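A condensed sketch of the HOG sub-sampling search, simplified to HOG-only features for brevity (the real pipeline also appends histogram and spatial features, so the classifier here would need to be trained accordingly). The function name search_windows is hypothetical; clf and scaler are the classifier and scaler from the training step.

    import cv2
    import numpy as np
    from skimage.feature import hog

    def search_windows(img, scale, clf, scaler, y_top=400):
        # Crop to the road region and rescale the image so a fixed 64x64
        # window effectively searches at a different scale.
        region = img[y_top:, :, :]
        if scale != 1.0:
            region = cv2.resize(region, (int(region.shape[1] / scale),
                                         int(region.shape[0] / scale)))
        ycrcb = cv2.cvtColor(region, cv2.COLOR_RGB2YCrCb)
        # HOG once for the whole region, kept as a block grid.
        hogs = [hog(ycrcb[:, :, ch], orientations=9, pixels_per_cell=(8, 8),
                    cells_per_block=(2, 2), feature_vector=False)
                for ch in range(3)]
        n_blocks_x = ycrcb.shape[1] // 8 - 1
        n_blocks_y = ycrcb.shape[0] // 8 - 1
        window_blocks, step = 64 // 8 - 1, 2  # 2-cell step = 75% overlap of 64px
        hits = []
        for by in range(0, n_blocks_y - window_blocks + 1, step):
            for bx in range(0, n_blocks_x - window_blocks + 1, step):
                feats = np.hstack([h[by:by + window_blocks,
                                     bx:bx + window_blocks].ravel()
                                   for h in hogs])
                if clf.predict(scaler.transform(feats.reshape(1, -1)))[0] == 1:
                    x, y = bx * 8, by * 8  # top-left corner in region pixels
                    hits.append(((int(x * scale), int(y * scale) + y_top),
                                 (int((x + 64) * scale),
                                  int((y + 64) * scale) + y_top)))
        return hits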

As the classifier can produce false positives (detecting a vehicle where there is none), we handle this with a heatmap. The heatmap keeps track of the regions of the video frame where vehicles were detected, by incrementing a count for every pixel inside each detected region. The idea is that a region containing a vehicle will receive multiple hits, unlike an area with a false positive. By thresholding the heatmap, we can then discard the false positives. Additionally, we compute the connected components within the heatmap using scipy.ndimage.measurements.label() to remove duplicate detections.
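A minimal sketch of the heatmap step; hot_windows and the threshold value of 2 stand in for the classifier's positive windows and the tuned threshold.

    import numpy as np
    from scipy.ndimage import label  # same routine as scipy.ndimage.measurements.label()

    def add_heat(heatmap, windows):
        # Increment every pixel inside each positive detection window.
        for (x1, y1), (x2, y2) in windows:
            heatmap[y1:y2, x1:x2] += 1
        return heatmap

    def apply_threshold(heatmap, threshold):
        # Zero out pixels with too few hits -- likely false positives.
        heatmap[heatmap <= threshold] = 0
        return heatmap

    heatmap = np.zeros((720, 1280), dtype=np.float32)
    hot_windows = [((800, 400), (928, 528)), ((816, 408), (944, 536))]  # example hits
    heatmap = add_heat(heatmap, hot_windows)
    heatmap = apply_threshold(heatmap, threshold=2)  # assumed threshold value
    label_map, n_blobs = label(heatmap)              # one label per connected blob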

The plot below shows the result of our tracking on the test images. Each column shows an example image at a different stage of the pipeline: computing the HOG features, applying the heatmap to filter out false positives, and using scipy.ndimage.measurements.label() to compute the bounding boxes of the connected blobs. The code for the sliding window search is in tracker.py:find_cars:line#45.

To draw a bounding box, we compute the center of each connected blob returned by the label() method above and then derive the width and height of the box from the size of the blob. We tried different methods for drawing bounding boxes, including a fixed-size window around the center and scaling the window based on the distance of the detection from the horizon; however, we found the center + blob-size approach to be slightly more stable. Additional smoothing of the bounding boxes could be achieved by averaging the detection windows over frames (not implemented).
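One plausible reading of this center + blob-size computation, sketched below with a hypothetical helper (the project's actual drawing code is in tracker.py):

    import numpy as np

    def blob_boxes(label_map, n_blobs):
        # For each connected blob: center of mass, plus half the blob
        # extent on each side, gives the bounding box.
        boxes = []
        for blob_id in range(1, n_blobs + 1):
            ys, xs = np.nonzero(label_map == blob_id)
            cx, cy = int(xs.mean()), int(ys.mean())
            half_w = (xs.max() - xs.min()) // 2
            half_h = (ys.max() - ys.min()) // 2
            boxes.append(((cx - half_w, cy - half_h),
                          (cx + half_w, cy + half_h)))
        return boxes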

[Image: pipeline stages on the test images, one column per stage]

Video Processing


The video tracking is implemented by the Tracker class in the tracker.py file. Its find_cars:line#45 method runs the sliding window algorithm, computes the HOG features, and runs the classifier, producing all of the detection windows.

To speed up classification and reduce the overall runtime of our detection and tracking algorithm, we run the detection algorithm on every 5th frame. After each detection pass, we discount (reduce) the heatmap counts that are greater than the threshold by threshold-1. The idea is that during the next detection pass, a region must again collect at least a given number of positive detections to be considered a match. Given more time, we could have tried a few other discounting schemes to see how they affect the results of our pipeline. The code for adjusting the heatmap on every 5th frame is at tracker.py:line#180.
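A sketch of this discounting step, with hypothetical names (the actual code is at tracker.py:line#180):

    import numpy as np

    def discount_heatmap(heatmap, threshold):
        # Pull confirmed regions back down by threshold-1, so they must
        # collect fresh positive detections on the next pass to stay hot.
        hot = heatmap > threshold
        heatmap[hot] -= (threshold - 1)
        return heatmap

    # Inside the per-frame loop (frame_idx is the running frame counter):
    if frame_idx % 5 == 0:
        heatmap = discount_heatmap(heatmap, threshold=2)  # assumed threshold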

Here's a link to our video result.

Below is the tracking pipeline applied to six frames of our video. The heatmap gets hot again after the 5th frame, as the tracker updates its detections every 5th frame.

[Image: tracking pipeline applied to six consecutive video frames]

Discussion


We presented a vehicle detection and tracking pipeline using a combination of HOG, color, and spatial features along with a sliding window search that uses a linear SVM classifier to detect vehicles. The pipeline works decently and has very few false positives; these could be reduced further by experimenting with the heatmap threshold or using a better classifier.

Our pipeline does struggle when cars are not yet fully visible or are behind one another. This could be addressed by tracking and predicting the position of each car in the next frame.

There is a lot of additional room for improvement in our current approach. To extend this project further, we would be interested in applying a more robust classifier such as YOLO (You Only Look Once), a deep-learning-based object detection network. A better classifier could reduce the number of false and duplicate detections and provide a more robust bounding box around each object. It might also be possible to use a pixel-level classifier instead of a window-based one, which would give much tighter bounds on the presence of a vehicle. Additionally, hard negative mining (re-training the classifier on its false-positive examples) to eliminate spurious matches also seems worth pursuing.
