
Semantic Segmentation

Using a UNet Model and a Jupyter Notebook

Example input image (1e6f48393e17_03) and its corresponding mask (1e6f48393e17_03_mask).

Installation

  1. Create a conda environment:
conda create --name env-name gitpython
  2. Clone the GitHub repository:
from git import Repo
Repo.clone_from("https://github.com/ihamdi/Semantic-Segmentation.git","/your/directory/")

       or download and extract a copy of the files.

  3. Install PyTorch according to your machine. For example:
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
  4. Install dependencies from the requirements.txt file:
pip install -r requirements.txt
  5. Download the data:

       Run python scripts/download_data.py to download the data using the Kaggle API and extract it automatically (a rough sketch of this step is shown below). If you haven't used the Kaggle API before, please take a look at the instructions at the bottom of this page on how to get your API key.

       Otherwise, download the files from the official Carvana Image Masking Challenge page and extract "train_hq.zip" into the imgs folder and "train_masks.zip" into the masks folder in the data directory.
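
       For reference, here is a minimal sketch of what a download script like scripts/download_data.py might do using the official kaggle Python package; the exact script in this repository may differ, and the target folders follow the data/imgs and data/masks layout described in the next section.

from pathlib import Path
from zipfile import ZipFile

from kaggle.api.kaggle_api_extended import KaggleApi

COMPETITION = "carvana-image-masking-challenge"
DATA_DIR = Path("data")

# Authenticate with the kaggle.json token (see "Getting a Key for Kaggle's API" below).
api = KaggleApi()
api.authenticate()

# Download the high-quality images and the masks, then extract them
# into data/imgs and data/masks respectively.
for archive, target in [("train_hq.zip", DATA_DIR / "imgs"), ("train_masks.zip", DATA_DIR / "masks")]:
    api.competition_download_file(COMPETITION, archive, path=str(DATA_DIR))
    target.mkdir(parents=True, exist_ok=True)
    with ZipFile(DATA_DIR / archive) as zf:
        zf.extractall(target)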

Folder Structure

  1. data directory contains imgs and masks folders.

       i. imgs subfolder is where the images are expected to be.

       ii. masks subfolder is where the masks are expected to be.

  2. scripts directory contains download_data.py, used to download the dataset directly from Kaggle.
  3. unet directory contains the UNet model.
  4. utils directory contains the data-loading and dice-score files.
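
Based on the description above, the project is expected to look roughly like this (only the files mentioned in this README are shown):

data/
    imgs/             <- training images
    masks/            <- training masks
scripts/
    download_data.py
unet/                 <- UNet model
utils/                <- data-loading and dice-score files
requirements.txt
train.py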

Dataset

Data is obtained from Kaggle's Carvana Image Masking Challenge competition. Image and mask archives are provided in both normal and high quality. This code uses train_hq.zip along with train_masks.zip.

There are 318 cars in the train_hq.zip archive. Each car has exactly 16 images, each taken from a different angle. In addition, each car has a unique id, and its images are named id_01.jpg, id_02.jpg ... id_16.jpg.
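
As a small illustration of this naming convention, the images can be grouped by car id with a few lines of Python (a standalone sketch, not part of the repository):

from collections import defaultdict
from pathlib import Path

# Group training images by car id, assuming names like <id>_01.jpg ... <id>_16.jpg.
images_by_car = defaultdict(list)
for img in sorted(Path("data/imgs").glob("*.jpg")):
    car_id, _angle = img.stem.rsplit("_", 1)
    images_by_car[car_id].append(img)

print(f"{len(images_by_car)} cars, {sum(len(v) for v in images_by_car.values())} images")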

How to use

Run the following command

python train.py

The program by default trains for 5 epochs with a batch size of 1, a learning rate of 0.00001, 0 workers, a scale of 0.5, mixed precision enabled, and 10% of the dataset held out for validation. You can pass the following arguments to change these defaults:

  1. Epochs: --epochs
  2. Batch Size: --batch-size
  3. Learning Rate: --learning-rate
  4. Subset Size: --sample-size
  5. Number of Workers: --num-workers
  6. Image Scale: --scale
  7. Percentage used as Validation: --validation
  8. Mixed Precision: --amp

For example:

python train.py --sample-size 500 --num-workers 15 --amp
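
The flags above suggest an argument parser along the lines of the following sketch; the names and defaults are taken from the list above, and the actual definitions in train.py may differ slightly:

import argparse

# Rough reconstruction of the command-line interface described above.
parser = argparse.ArgumentParser(description="Train UNet on the Carvana dataset")
parser.add_argument("--epochs", type=int, default=5)
parser.add_argument("--batch-size", type=int, default=1)
parser.add_argument("--learning-rate", type=float, default=1e-5)
parser.add_argument("--sample-size", type=int, default=None, help="use only a subset of the dataset")
parser.add_argument("--num-workers", type=int, default=0)
parser.add_argument("--scale", type=float, default=0.5)
parser.add_argument("--validation", type=float, default=10.0, help="percent of the data used for validation")
parser.add_argument("--amp", action="store_true", help="use mixed precision")
args = parser.parse_args()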

Results

The Dice score is printed at the end of every validation round. In addition, the program uses Weights & Biases to log the training loss and accuracy as well as the Dice score, which makes it easy to visualize results and check the status of runs without being at the training machine.
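
For reference, the Dice score for a pair of binary masks is commonly computed as below; this is a generic sketch rather than the exact implementation in the utils directory:

import torch

def dice_coefficient(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # pred and target are binary masks of the same shape (values in {0, 1}).
    pred = pred.float().flatten()
    target = target.float().flatten()
    intersection = (pred * target).sum()
    return (2 * intersection + eps) / (pred.sum() + target.sum() + eps)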

Changes made to Original Code

  1. Fixed data download problem from Kaggle. The code no longer gives an "unauthorized" error.
  2. Introduced sample_size to enable reduction of dataset if needed (mainly used for testing).
  3. Added num_workers as a variable so it doesn't need to be changed manually inside train.py.
  4. Added sample_size and num_workers to logging.
  5. Added sample_size and num_workers to arguments so they can be set easily when calling python train.py.
  6. Changed the optimizer from RMSprop to Adam for better results (a minimal sketch is shown after this list).
  7. Fixed the training progress bar. The original code showed a fixed number of total iterations even if the batch size was changed, and the bar only updated every 2 iterations, which made it choppy.
  8. Fixed the validation loop so that it now runs at the end of each epoch.
  9. Removed commented-out lines and unused import statements.
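
A minimal sketch of the optimizer change mentioned in item 6, assuming the default learning rate of 0.00001 listed above (the placeholder module stands in for the actual UNet model; the real call in train.py may include additional arguments):

import torch
from torch import nn

model = nn.Conv2d(3, 1, kernel_size=3, padding=1)  # placeholder for the UNet model

# Before: optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-5)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)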

Background:

This project was created to learn about semantic segmentation and UNet; therefore, only the training data is used.


Contact:

For any questions or feedback, please feel free to post comments or contact me at [email protected]


References:

Pytorch-UNet was used as the base for this code.

U-Net: Convolutional Networks for Biomedical Image Segmentation, by Olaf Ronneberger, Philipp Fischer, and Thomas Brox.


Getting a Key for Kaggle's API

To use the Kaggle API, go to the Account section of your Kaggle profile, click "Create New API Token", and save the downloaded kaggle.json file to ~/.kaggle/.
