Semantic-Segmentation

Project Overview

In this project, we label the pixels of a road in images using a Fully Convolutional Network (FCN) implemented in Python 3.5. We experimented with the number of epochs, batch size, learning rate, and dropout during training, and settled on the following hyperparameters because the model performed well on the test dataset without apparent overfitting or underfitting.

  • epochs = 20
  • learning rate = 0.0001
  • batch size = 1 (one 160x576 pixel RGB image)
  • dropout = 0.2
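
As a minimal sketch, the settings above could be collected in one configuration object. The class and field names here are illustrative and do not come from the project code; the `(height, width)` ordering of the image shape is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters from the list above; names are hypothetical."""
    epochs: int = 20
    learning_rate: float = 1e-4
    batch_size: int = 1                # one RGB image per training step
    image_shape: tuple = (160, 576)    # assumed to be (height, width) in pixels
    dropout: float = 0.2               # the text does not say whether this is a
                                       # drop rate or a keep probability


config = TrainingConfig()
print(config)
```

Freezing the dataclass keeps the values from being changed by accident halfway through a training run.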

The following animation shows a sample of the final results of this Fully Convolutional Network for semantic segmentation:

[animation: anim]

Deep Neural Network Architecture

This network uses layers 3, 4, and 7 of VGG, together with 1x1 convolutional layers, skip connections, and upsampling. The following pictures briefly illustrate these three techniques (pictures provided by Udacity).

1x1 Convolution ------> Upsampling ------> Skip Connections
[images: 1x1conv, upsample, skip]
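
The three building blocks can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the project's TensorFlow code: the function names, the toy channel counts, and the two-class output are made up for the example, and nearest-neighbour repetition stands in for whatever learned upsampling layer the network actually uses.

```python
import numpy as np


def conv1x1(x, weights):
    """1x1 convolution: the same linear map over channels at every pixel.

    x has shape (height, width, in_channels);
    weights has shape (in_channels, out_channels).
    """
    return x @ weights


def upsample2x(x):
    """Nearest-neighbour 2x upsampling (stand-in for a learned layer)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)


def skip_connection(upsampled, encoder_features):
    """Element-wise sum of decoder output and an earlier encoder feature map."""
    return upsampled + encoder_features


rng = np.random.default_rng(0)
num_classes = 2                           # assumed: road / not road
deep = rng.normal(size=(5, 18, 8))        # coarse feature map, toy channel count
shallow = rng.normal(size=(10, 36, 4))    # finer feature map from an earlier layer

# Project both feature maps to per-class scores, then fuse them.
deep_scores = conv1x1(deep, rng.normal(size=(8, num_classes)))
shallow_scores = conv1x1(shallow, rng.normal(size=(4, num_classes)))
fused = skip_connection(upsample2x(deep_scores), shallow_scores)
print(fused.shape)  # (10, 36, 2)
```

The 1x1 convolutions bring both feature maps to the same channel count, which is what makes the element-wise sum in the skip connection possible.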

FCN (Fully Convolutional Network)

An FCN is comprised of two main parts: an encoder and a decoder. The encoder extracts features from the image, and the decoder upscales the encoder's output back to the original image size. The construction of the FCN is outlined below (pictures provided by Udacity).

|--------------------Vgg Model------------------|---------1x1 Conv-------|---------------Upsampling-----------------------|

[image: fcn]
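
To make the encoder and decoder stages concrete, the sketch below traces only the spatial size of the feature maps. It assumes the usual VGG16 layout, in which layers 3, 4 and 7 sit at 1/8, 1/16 and 1/32 of the input resolution, and an FCN-8s-style decoder (two 2x upsampling steps with skip connections, then one 8x step). Neither detail is stated in the text above, and the function name is hypothetical.

```python
def fcn8_shapes(height, width):
    """Trace (height, width) through an FCN-8s-style encoder and decoder."""
    if height % 32 or width % 32:
        raise ValueError("this sketch assumes both sides are divisible by 32")

    # Encoder: VGG feature maps at decreasing resolution (assumed strides).
    layer3 = (height // 8, width // 8)
    layer4 = (height // 16, width // 16)
    layer7 = (height // 32, width // 32)

    # Decoder: upsample, then add the encoder map of matching size.
    up_to_layer4 = (layer7[0] * 2, layer7[1] * 2)              # + layer 4 skip
    up_to_layer3 = (up_to_layer4[0] * 2, up_to_layer4[1] * 2)  # + layer 3 skip
    output = (up_to_layer3[0] * 8, up_to_layer3[1] * 8)        # back to input size

    return {
        "layer3": layer3,
        "layer4": layer4,
        "layer7": layer7,
        "up_to_layer4": up_to_layer4,
        "up_to_layer3": up_to_layer3,
        "output": output,
    }


for name, shape in fcn8_shapes(160, 576).items():
    print(name, shape)
```

For the 160x576 training images, the coarsest map under these assumptions is 5x18 and the decoder output returns to 160x576.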

Reflection

The difference between a fully connected layer and a fully convolutional network is:

  • In a fully connected layer, each neuron is connected to every neuron in the previous layer, and each connection has its own weight. A fully connected layer can therefore only handle input of a fixed size, because the number of parameters needed to connect input and output is fixed.

  • A fully convolutional network has two components: an encoder followed by a decoder. The decoder upsamples the encoder's output back to the original image size.

  • The main advantage of a fully convolutional network over fully connected layers is that it preserves spatial information throughout the network and can therefore work on images of any size.
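
The fixed-size limitation can be seen directly in a small NumPy sketch. The layer sizes and function names are made up for illustration: the dense layer's weight matrix is tied to one input size, while the 1x1 convolution's weights depend only on the channel counts.

```python
import numpy as np

rng = np.random.default_rng(0)
in_channels, out_channels = 3, 2

# Fully connected layer: one weight per (input value, output unit) pair,
# so the matrix only fits a 4x4x3 image (48 values).
dense_weights = rng.normal(size=(4 * 4 * in_channels, out_channels))


def dense(image):
    """Flatten the image and apply the fixed-size weight matrix."""
    return image.reshape(-1) @ dense_weights


# 1x1 convolution: the weights depend only on the channel counts.
conv_weights = rng.normal(size=(in_channels, out_channels))


def conv1x1(image):
    """Apply the same channel-wise linear map at every pixel."""
    return image @ conv_weights


small = rng.normal(size=(4, 4, in_channels))
large = rng.normal(size=(6, 10, in_channels))

print(dense(small).shape)     # (2,)         spatial layout is gone
print(conv1x1(small).shape)   # (4, 4, 2)    spatial layout is kept
print(conv1x1(large).shape)   # (6, 10, 2)   works on a different size

try:
    dense(large)
except ValueError as error:
    print("dense layer rejects the larger image:", type(error).__name__)
```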

Setup

Frameworks and Packages

Make sure you have the following installed:

Dataset

Download the KITTI Road dataset from here. Extract the dataset into the data folder. This creates the folder data_road, which contains all the training and test images.
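
A quick check that the dataset was extracted to the expected place might look like the sketch below. The helper name is hypothetical, and only the `data` and `data_road` folder names come from the instructions above; nothing is assumed about the layout inside `data_road`.

```python
from pathlib import Path


def find_data_road(data_dir="data"):
    """Return the extracted data_road folder, or raise if it is missing."""
    data_road = Path(data_dir) / "data_road"
    if not data_road.is_dir():
        raise FileNotFoundError(
            f"{data_road} not found; extract the dataset into {data_dir}/ first"
        )
    return data_road


try:
    print("dataset found at", find_data_road())
except FileNotFoundError as error:
    print(error)
```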
