Coder Social home page Coder Social logo

ml-lab / pytorch-unet Goto Github PK

View Code? Open in Web Editor NEW

This project forked from milesial/pytorch-unet

0.0 3.0 0.0 47.29 MB

Pytorch implementation of the U-Net for image semantic segmentation, with dense CRF post-processing

License: GNU General Public License v3.0

Python 100.00%

pytorch-unet's Introduction

Pytorch-UNet

input and output for a random image in the test dataset

Customized implementation of the U-Net in Pytorch for Kaggle's Carvana Image Masking Challenge from a high definition image. This was used with only one output class but it can be scaled easily.

This model was trained from scratch with 5000 images (no data augmentation) and scored a dice coefficient of 0.988423 (511 out of 735) on over 100k test images. This score is not quite good but could be improved with more training, data augmentation, fine tuning, playing with CRF post-processing, and applying more weights on the edges of the masks.

The model used for the last submission is stored in the MODEL.pth file, if you wish to play with it. The data is available on the Kaggle website.

Usage

Prediction

You can easily test the output masks on your images via the CLI.

To see all options: python predict.py -h

To predict a single image and save it:

python predict.py -i image.jpg -o output.jpg

To predict a multiple images and show them without saving them:

python predict.py -i image1.jpg image2.jpg --viz --no-save

You can use the cpu-only version with --cpu.

You can specify which model file to use with --model MODEL.pth.

Training

python train.py -h should get you started. A proper CLI is yet to be added.

Warning

In order to process the image, it is split into two squares (a left on and a right one), and each square is passed into the net. The two square masks are then merged again to produce the final image. As a consequence, the height of the image must be strictly superior than half the width. Make sure the width is even too.

Dependencies

This package depends on pydensecrf, available via pip install.

Notes on memory

The model has be trained from scratch on a GTX970M 3GB. Predicting images of 1918*1280 takes 1.5GB of memory. Training takes approximately 3GB, so if you are a few MB shy of memory, consider turning off all graphical displays. This assumes you use bilinear up-sampling, and not transposed convolution in the model.

pytorch-unet's People

Contributors

changjo avatar milesial avatar rht avatar ziyuanzhangtony avatar

Watchers

 avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.