Detecting Lane Lines and Vehicles

Udacity - Self-Driving Car NanoDegree

This project satisfies the requirements for both the Advanced Lane Finding project and the Vehicle Detection project for Udacity's Self-Driving Car Engineer nanodegree. Primary goals include detecting the lane lines, determining the curvature of the lane as well as the car's position within the lane, and detecting other vehicles.

I chose to use convolutional neural networks to detect lane lines and cars, rather than the gradient and SVM-based approaches recommended for these projects. I annotated training images with the correct answers by adding extra layers indicating which parts of the picture were part of lane lines or cars, then trained convolutional neural networks to produce such image masks for other images from the video. The process of curating training data and training the networks is discussed further below.

See the presentation slides from a talk at the Ft. Lauderdale Machine Learning Meetup.

Note: Find the latest version of this project on GitHub.

The Project

The goals / steps of this project are the following:

  • Compute the camera calibration matrix and distortion coefficients given a set of chessboard images.
  • Apply a distortion correction to raw images.
  • Create thresholded binary images representing pixels of interest: lane markings and cars.
  • Apply a perspective transform to rectify binary image ("birds-eye view").
  • Detect lane pixels and fit to find the lane boundary.
  • Determine the curvature of the lane and vehicle position with respect to center.
  • Warp the detected lane boundaries back onto the original image.
  • Output visual display of the lane boundaries and numerical estimation of lane curvature and vehicle position.
  • Detect vehicle pixels and place bounding boxes around each detected vehicle.

Calibrating the Camera

The first step was to calibrate the camera, correcting for lens distortion. I used pictures of a chessboard taken from various angles, with the assumption that the chessboard consisted of perfectly aligned grid squares, to characterize the distortion. I then applied the resulting calibration to the calibration images themselves to confirm that all visible distortion had been removed. See the calibrate_chessboard function in main.py.

[Figure: original chessboard image vs. undistorted chessboard image]
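The full implementation is in the calibrate_chessboard function in main.py; the minimal sketch below shows the standard OpenCV approach, with the chessboard pattern size and image directory as assumptions.

import glob
import cv2
import numpy as np

def calibrate_chessboard_sketch(pattern_size=(9, 6)):
    # Object points: the (x, y, 0) grid coordinates of the chessboard
    # corners, assumed to lie on a perfectly flat, aligned grid.
    objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)

    obj_points, img_points = [], []
    image_shape = None
    for path in glob.glob('camera_cal/*.jpg'):  # hypothetical directory
        gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
        found, corners = cv2.findChessboardCorners(gray, pattern_size)
        if found:
            obj_points.append(objp)
            img_points.append(corners)
            image_shape = gray.shape[::-1]

    # Solve for the camera matrix and distortion coefficients.
    _, mtx, dist, _, _ = cv2.calibrateCamera(obj_points, img_points,
                                             image_shape, None, None)
    return mtx, dist

Any image can then be corrected with cv2.undistort(img, mtx, dist, None, mtx).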

I applied the same calibration to undistort images from the dashboard camera. The effect is subtle - note the difference in shape around the left and right edges of the car hood in the images below.

[Figure: original dashboard image vs. undistorted dashboard image]

Training Data

The original images used for training are in the test_images directory. These include 8 images provided as examples by Udacity and 6 images extracted from the project video.

I copied each of these images to the training directory, for annotation. I converted the images to Pixen format, added layers to represent lane markings and cars, and created image masks in those layers to indicate the locations of lane markings and cars. I saved each layer separately with filenames ending in "x", "lanes", and "cars" so they could easily be imported into Python for training convolutional neural networks.

[Figure: original image, cars layer, and lanes layer]
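A minimal sketch of how such layer files might be loaded for training; the exact filename pattern and the use of OpenCV are assumptions, not taken from main.py.

import cv2
import numpy as np

def load_training_example(prefix):
    # Assumed layout: <prefix>-x.png is the input photo, while
    # <prefix>-lanes.png and <prefix>-cars.png are single-channel masks
    # where any nonzero pixel marks a lane line or car.
    x = cv2.cvtColor(cv2.imread(prefix + '-x.png'), cv2.COLOR_BGR2RGB)
    lanes = cv2.imread(prefix + '-lanes.png', cv2.IMREAD_GRAYSCALE) > 0
    cars = cv2.imread(prefix + '-cars.png', cv2.IMREAD_GRAYSCALE) > 0
    return x, lanes.astype(np.float32), cars.astype(np.float32)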

Note: Annotation accuracy is important. With only 14 images, it's okay to annotate slowly and focus on accuracy. Zoom in to paint individual pixels around the edge of a lane line, ensuring pixel-perfect accuracy at the edges. Then use the fill tool to finish the center of the lane line quickly.

Pre-Processing

All images are cropped to rectangular regions of interest (mostly just cutting out the sky) as well as scaled down by a factor of two both vertically and horizontally. Both cropping and scaling are primarily intended to save memory during training.

All images are Gaussian-blurred and their pixel values scaled to the range -0.5 to 0.5, both of which are intended to improve convergence during convolutional neural network training.
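A sketch of this pre-processing, assuming OpenCV; the blur kernel size is an assumption.

import cv2
import numpy as np

def preprocess(img, crop_min_y, crop_max_y, crop_min_x, crop_max_x):
    # Crop to the rectangular region of interest (mostly removing the sky).
    roi = img[crop_min_y:crop_max_y, crop_min_x:crop_max_x]
    # Downscale by a factor of two in both dimensions to save memory.
    small = cv2.resize(roi, (roi.shape[1] // 2, roi.shape[0] // 2))
    # Blur, then scale pixel values from [0, 255] to [-0.5, 0.5].
    blurred = cv2.GaussianBlur(small, (3, 3), 0)
    return blurred.astype(np.float32) / 255.0 - 0.5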

Convolutional Neural Networks

The lane markings and cars are identified by separate (but architecturally identical) convolutional neural networks. All layers use SAME border mode (with no flattening) so that the network's output (after thresholding) is an image of the same dimensions as the input. The lane model produces an image mask indicating which pixels are part of lane markings. The car model produces an image mask indicating which pixels are part of cars.

The neural network architecture consists of 7 convolutional layers. The input has 3 channels for R, G, and B. Hidden convolutions have depths of 20, 30, 30, 30, 20, and 10. The output layer has a depth of only 1 to produce a single-channel image mask. Dropouts of 50% are applied after each hidden layer to prevent over-fitting.

Lane lines and cars are both under-represented classes compared to the background, so I used a custom loss function called weighted_binary_crossentropy to increase the weight of minority classes by a factor of 50.
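A minimal Keras sketch of the architecture described above. The kernel sizes, activations, and optimizer are assumptions not stated in this document; the loss follows the factor-of-50 weighting described for weighted_binary_crossentropy.

from keras.models import Sequential
from keras.layers import Conv2D, Dropout
from keras import backend as K

def weighted_binary_crossentropy(y_true, y_pred):
    # Binary crossentropy with the rare positive class (lane/car pixels)
    # weighted 50x relative to the background.
    y_pred = K.clip(y_pred, K.epsilon(), 1.0 - K.epsilon())
    loss = -(50.0 * y_true * K.log(y_pred)
             + (1.0 - y_true) * K.log(1.0 - y_pred))
    return K.mean(loss)

def build_segmentation_model(input_shape=(None, None, 3)):
    # Seven convolutional layers, all with SAME padding and no flattening,
    # so the output mask has the same spatial dimensions as the input.
    model = Sequential()
    model.add(Conv2D(20, (3, 3), padding='same', activation='relu',
                     input_shape=input_shape))
    model.add(Dropout(0.5))
    for depth in [30, 30, 30, 20, 10]:
        model.add(Conv2D(depth, (3, 3), padding='same', activation='relu'))
        model.add(Dropout(0.5))
    # Depth-1 output: a single-channel probability mask per pixel.
    model.add(Conv2D(1, (3, 3), padding='same', activation='sigmoid'))
    model.compile(optimizer='adam', loss=weighted_binary_crossentropy)
    return model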

You can see in the image below that the cars network can identify cars on the horizon, as well as cars that are barely visible over the barrier on the left. The pink overlay is the thresholded output from the lane-markings network; the cyan overlay is the thresholded output from the cars network. (There is also one false positive: part of the fence on the right.)

[Figure: original image and the same image annotated by the conv nets]

Fitting and Characterizing Lane Lines

After identifying lane-marking pixels, I needed to transform those marking positions into a bird's-eye view for further analysis. In the following image, which has straight lane lines, I identified four fixed points on the lane lines: two near the car and two near the horizon. These points form a trapezoid in the camera image but a rectangle when seen from above. I used proportions from the Udacity project description, a 3.7-meter lane width and a 30-meter visible distance, to define a transformation into a bird's-eye view. Relevant functions include perspective_matrices and perspective_transform.

[Figure: annotated image]
[Figure: bird's-eye view, detected markings, and parabolic fit]
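A sketch of this transform using standard OpenCV calls. The pixel coordinates below are hypothetical placeholders, not the values used in perspective_matrices.

import cv2
import numpy as np

# Hypothetical trapezoid traced along straight lane lines (two points near
# the hood, two near the horizon) and the rectangle it maps to from above.
src = np.float32([[585, 460], [695, 460], [1127, 720], [203, 720]])
dst = np.float32([[320, 0], [960, 0], [960, 720], [320, 720]])

M = cv2.getPerspectiveTransform(src, dst)
M_inv = cv2.getPerspectiveTransform(dst, src)  # for warping results back

def birds_eye(img):
    # dsize is (width, height); the output matches the input dimensions.
    return cv2.warpPerspective(img, M, (img.shape[1], img.shape[0]))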

With the identified lane markings transformed into an overhead perspective, I could fit a parabola to each lane line, then calculate the curvature of the lane and the car's position within it. Relevant functions include find_lane_lines, draw_lane_lines, radius_of_lane_lines, and offset_from_lane_center.

[Figure: annotated image]
[Figure: bird's-eye view, detected markings, and parabolic fit]
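A sketch of the fitting and measurement math, not the repository's implementation. The pixel-to-meter conversions assume the 3.7-meter lane width spans about 640 pixels and the 30-meter visible distance spans a 720-pixel warped image; the exact pixel spans are assumptions.

import numpy as np

YM_PER_PIX = 30.0 / 720   # meters per pixel, vertical (assumed span)
XM_PER_PIX = 3.7 / 640    # meters per pixel, horizontal (assumed span)

def radius_of_curvature(xs, ys, y_eval):
    # Fit x = A*y^2 + B*y + C in meter units, then evaluate the standard
    # curvature radius R = (1 + (2*A*y + B)^2)^(3/2) / |2*A| at y = y_eval.
    A, B, _ = np.polyfit(ys * YM_PER_PIX, xs * XM_PER_PIX, 2)
    y = y_eval * YM_PER_PIX
    return (1 + (2 * A * y + B) ** 2) ** 1.5 / abs(2 * A)

def offset_from_center(left_x, right_x, image_width):
    # Signed distance (m) from lane center, assuming the camera is
    # mounted at the horizontal center of the car.
    lane_center = (left_x + right_x) / 2.0
    return (image_width / 2.0 - lane_center) * XM_PER_PIX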

Annotating Video

When annotating a video, rather than an image, there is an opportunity to take advantage of information from previous frames. I chose to stabilize the lane fitting by blending the identified lane markings from a random sample of 10 out of the previous 30 frames, fitting lane lines based on pixels that were identified as lane pixels in at least 3 of the 10 frames. See the video_processor class for details.
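The actual logic lives in the video_processor class; this standalone sketch illustrates the sampling-and-voting scheme, assuming each frame's markings arrive as a binary numpy mask.

import random
from collections import deque
import numpy as np

class MarkingStabilizer:
    """Sample 10 of the last 30 marking masks and keep only pixels
    marked in at least 3 of the sampled frames."""

    def __init__(self, history=30, sample_size=10, min_votes=3):
        self.masks = deque(maxlen=history)
        self.sample_size = sample_size
        self.min_votes = min_votes

    def stabilize(self, mask):
        self.masks.append(mask)
        # Early in the video there may be fewer frames than sample_size.
        k = min(self.sample_size, len(self.masks))
        sampled = random.sample(list(self.masks), k)
        votes = np.sum(sampled, axis=0)
        return (votes >= min(self.min_votes, k)).astype(np.uint8)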

Project Video
[Video: project video with lane and vehicle annotations]

Discussion

Both lane detection and vehicle detection depend on neural network models trained on just 14 example images. These models are unlikely to work on roads with different lane appearances, or even on different car models. This could be addressed by collecting images on which the models perform poorly, labeling them, and adding them to the training set.

The search algorithm for lane lines in this project assumes that the car is fairly close to the center of the lanes. That search algorithm would need to be modified to find lane lines in arbitrary positions with respect to the car.

The perspective transformation assumes that the road is flat, so it would not be usable on hills.

Installation

  1. Clone the repository:
git clone https://github.com/ericlavigne/CarND-Detect-Lane-Lines-And-Vehicles
  2. Set up a virtualenv:
cd CarND-Detect-Lane-Lines-And-Vehicles
virtualenv -p python3 env
source env/bin/activate
pip install -r requirements-cpu.txt (or requirements-gpu.txt if CUDA is available)
deactivate

Running the project

cd CarND-Detect-Lane-Lines-And-Vehicles
source env/bin/activate
python main.py
deactivate

Installing a new library

cd CarND-Detect-Lane-Lines-And-Vehicles
source env/bin/activate
pip install <library-name>
pip freeze > requirements-cpu.txt (or requirements-gpu.txt if CUDA is available)
deactivate


Issues

An error

Traceback (most recent call last):
  File "main.py", line 529, in <module>
    main()
  File "main.py", line 522, in main
    'output_images/birds_eye_lines')
  File "main.py", line 54, in transform_image_files
    dst_img = transformation(img)
  File "main.py", line 380, in convert_lane_heatmap_to_lane_lines_image
    lines = fit_parabolas_to_lane_centroids(centroids)
  File "main.py", line 332, in fit_parabolas_to_lane_centroids
    min_y = np.amin(y_vals)
  File "/home/syp/CarND-Detect-Lane-Lines-And-Vehicles/env/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 2352, in amin
    out=out, **kwargs)
  File "/home/syp/CarND-Detect-Lane-Lines-And-Vehicles/env/lib/python3.6/site-packages/numpy/core/_methods.py", line 29, in _amin
    return umr_minimum(a, axis, None, out, keepdims)
ValueError: zero-size array to reduction operation minimum which has no identity

I replaced the images in test_images with my own image (size 1280×720), but this error occurs.

Running ERROR with my own video

Hello, I am using the program to process my own videos, which are already resized to 1280×720, and the error below appears:

  File "C:\Users\10401\AppData\Local\conda\conda\envs\TensorFlow\lib\site-packages\numpy\core\_methods.py", line 29, in _amin
    return umr_minimum(a, axis, None, out, keepdims)
ValueError: zero-size array to reduction operation minimum which has no identity

I am not familiar with Python. Why does this error appear, and how can I solve the problem?

images to Pixen format and added layers to represent lane markings and cars

Regarding the passage quoted below: I was trying to create annotations like this for my own data, but I couldn't find a way to mark the lanes and cars as precisely as you did. Would you please share some tips?

"I copied each of these images to the training directory, for annotation. I converted the images to Pixen format, added layers to represent lane markings and cars, and created image masks in those layers to indicate the locations of lane markings and cars. I saved each layer separately with filenames ending in "x", "lanes", and "cars" so they could easily be imported into Python for training convolutional neural networks."

Where Can I Set the Batch Size?

Hi,

Thank you for sharing this. How can I set the batch size? Training runs out of memory when more images are used.

Error in main.py

File "main.py", line 532, in
main()
File "main.py", line 526, in main
transform_image_files(lambda img: video_processor(lane_model=lane_model,car_model=car_model,calibration=calibration).process_image(img),
File "main.py", line 57, in transform_image_files
dst_img = transformation(img)
File "main.py", line 526, in
transform_image_files(lambda img: video_processor(lane_model=lane_model,car_model=car_model,calibration=calibration).process_image(img),
File "main.py", line 451, in process_image
markings = image_to_prediction(undistorted, self.lane_model, lane_settings)
File "main.py", line 234, in image_to_prediction
result = uncrop_scale(result,opt)
File "main.py", line 140, in uncrop_scale
img = uncrop(img,opt)
File "main.py", line 115, in uncrop
frame[opt['crop_min_y']:opt['crop_max_y'], opt['crop_min_x']:opt['crop_max_x'], 0:3] = img
ValueError: could not broadcast input array from shape (2,2,3) into shape (246,880,3)

Can anyone help with this issue? @ericlavigne @Brok-Bucholtz @ryan-keenan

ERROR for python main.py

Hello,

In the step of running the project, I get the following error.
File "main.py", line 407, in annotate_original_image
if lane_markings_img != None:
The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
