aurotripathy / ssd-spacenet

Detect buildings in the Spacenet satellite imagery dataset using Single Shot MultiBox Detector (SSD)

License: Other

CMake 2.40% Makefile 0.61% Shell 0.44% C++ 78.78% Cuda 5.90% MATLAB 0.78% M 0.01% Python 9.34% Protocol Buffer 1.75%

ssd-spacenet's Introduction

Building Detection in the Spacenet Satellite Imagery Dataset using Single Shot MultiBox Detector (SSD)

This project applies the Caffe-based Single-Shot Detector (SSD) algorithm to the Spacenet dataset. Early results (demonstrating feasibility) are shown below.

Results

Spacenet data-preparation details are here.

Steps to train and test are here

As of October 1, 2016, I've laid the groundwork: annotating the data, creating the lmdb, partitioning the data into test/train sets, and creating the training and detection scripts. Improvements will build on this foundation. At the moment, the training regime uses the same hyperparameters as the original SSD implementation uses for the Pascal VOC dataset. No additional data-augmentation techniques have been applied yet; that is the next refinement under consideration.
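The test/train partitioning step mentioned above can be sketched as follows. This is only an illustration, not the project's actual script: the function name `split_train_test`, the 20% test fraction, the seed, and the image IDs are all hypothetical.

```python
import random

def split_train_test(image_ids, test_fraction=0.2, seed=0):
    """Deterministically shuffle image IDs and split them into (train, test)."""
    ids = sorted(image_ids)            # sort so the result does not depend on input order
    random.Random(seed).shuffle(ids)   # seeded shuffle keeps the split reproducible
    n_test = int(round(len(ids) * test_fraction))
    return ids[n_test:], ids[:n_test]

# Hypothetical image IDs, for illustration only.
train_ids, test_ids = split_train_test(["img_%03d" % i for i in range(10)])
```

Fixing the seed means the same images land in the test set on every run, which keeps later accuracy comparisons meaningful.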

You can visualize the SSD training network here using Netscope.

Below is the documentation of the original SSD project.

SSD: Single Shot MultiBox Detector

By Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg.

Introduction

SSD is a unified framework for object detection with a single network. You can use the code to train and evaluate a network for object detection tasks. For more details, please refer to our arXiv paper.

SSD Framework

System                      VOC2007 test mAP   FPS (Titan X)   Number of Boxes
Faster R-CNN (VGG16)        73.2               7               300
Faster R-CNN (ZF)           62.1               17              300
YOLO                        63.4               45              98
Fast YOLO                   52.7               155             98
SSD300 (VGG16)              72.1               58              7308
SSD300 (VGG16, cuDNN v5)    72.1               72              7308
SSD500 (VGG16)              75.1               23              20097
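The "Number of Boxes" column for SSD can be reproduced with simple arithmetic: each prediction layer contributes (feature map side)^2 x (default boxes per location). The sketch below is an illustration only. The feature-map sizes and boxes-per-location values are assumptions not stated in this README; they were chosen because they sum to the totals in the table.

```python
def count_default_boxes(layers):
    """Sum side * side * boxes_per_location over all prediction layers."""
    return sum(side * side * boxes for side, boxes in layers)

# Assumed (feature map side, default boxes per location) pairs; not taken from this README.
SSD300_LAYERS = [(38, 3), (19, 6), (10, 6), (5, 6), (3, 6), (1, 6)]
SSD500_LAYERS = [(63, 3), (32, 6), (16, 6), (8, 6), (4, 6), (2, 6), (1, 6)]

ssd300_boxes = count_default_boxes(SSD300_LAYERS)
ssd500_boxes = count_default_boxes(SSD500_LAYERS)
```

Under these assumptions the largest feature map dominates the total, which is why SSD500 evaluates far more boxes than SSD300.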

Citing SSD

Please cite SSD in your publications if it helps your research:

@article{liu15ssd,
  Title = {{SSD}: Single Shot MultiBox Detector},
  Author = {Liu, Wei and Anguelov, Dragomir and Erhan, Dumitru and Szegedy, Christian and Reed, Scott and Fu, Cheng-Yang and Berg, Alexander C.},
  Journal = {arXiv preprint arXiv:1512.02325},
  Year = {2015}
}

Contents

  1. Installation
  2. Preparation
  3. Train/Eval
  4. Models

Installation

  1. Get the code. We will call the directory that you cloned Caffe into $CAFFE_ROOT.
git clone https://github.com/weiliu89/caffe.git
cd caffe
git checkout ssd
  2. Build the code. Please follow the Caffe instructions to install all necessary packages and build it.
# Modify Makefile.config according to your Caffe installation.
cp Makefile.config.example Makefile.config
make -j8
# Make sure to add $CAFFE_ROOT/python to your PYTHONPATH.
make py
make test -j8
make runtest -j8
# If you have multiple GPUs installed in your machine, make runtest might fail. If so, try the following:
export CUDA_VISIBLE_DEVICES=0; make runtest -j8
# If you see the error "Check failed: error == cudaSuccess (10 vs. 0)  invalid device ordinal",
# first make sure the specified GPUs exist, or try the following if you have multiple GPUs:
unset CUDA_VISIBLE_DEVICES
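As an alternative to exporting PYTHONPATH in the shell, a script can add $CAFFE_ROOT/python to its own import path before importing caffe. The following is a minimal sketch; the helper name `add_caffe_to_path` is hypothetical, and it assumes the CAFFE_ROOT environment variable is set (falling back to the current directory otherwise).

```python
import os
import sys

def add_caffe_to_path(caffe_root):
    """Put <caffe_root>/python at the front of sys.path and return that directory."""
    python_dir = os.path.join(os.path.abspath(os.path.expanduser(caffe_root)), "python")
    if python_dir not in sys.path:
        sys.path.insert(0, python_dir)
    return python_dir

# Assumes CAFFE_ROOT is exported; "." is only a fallback for illustration.
caffe_python_dir = add_caffe_to_path(os.environ.get("CAFFE_ROOT", "."))
# After this, `import caffe` should work, provided `make py` completed successfully.
```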

Preparation

  1. Download the fully convolutional reduced (atrous) VGGNet. By default, we assume the model is stored in $CAFFE_ROOT/models/VGGNet/.

  2. Download the VOC2007 and VOC2012 datasets. By default, we assume the data is stored in $HOME/data/.

# Download the data.
cd $HOME/data
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
# Extract the data.
tar -xvf VOCtrainval_11-May-2012.tar
tar -xvf VOCtrainval_06-Nov-2007.tar
tar -xvf VOCtest_06-Nov-2007.tar
  3. Create the LMDB file.
cd $CAFFE_ROOT
# Create the trainval.txt, test.txt, and test_name_size.txt in data/VOC0712/
./data/VOC0712/create_list.sh
# You can modify the parameters in create_data.sh if needed.
# It will create lmdb files for trainval and test with encoded original image:
#   - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_trainval_lmdb
#   - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_test_lmdb
# and make soft links at examples/VOC0712/
./data/VOC0712/create_data.sh
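Before training, it can help to confirm that the download, extraction, and LMDB steps above left the expected directories in place. The sketch below is an illustration under assumptions: the helper name `missing_voc_paths` is hypothetical, the LMDB paths come from the comments above, and the VOCdevkit/VOC2007 and VOCdevkit/VOC2012 directories are assumed to be where the tar archives extract.

```python
import os

def missing_voc_paths(data_root):
    """Return the expected dataset directories that do not exist under data_root."""
    expected = [
        os.path.join("VOCdevkit", "VOC2007"),   # assumed extraction layout
        os.path.join("VOCdevkit", "VOC2012"),   # assumed extraction layout
        os.path.join("VOCdevkit", "VOC0712", "lmdb", "VOC0712_trainval_lmdb"),
        os.path.join("VOCdevkit", "VOC0712", "lmdb", "VOC0712_test_lmdb"),
    ]
    return [p for p in expected if not os.path.isdir(os.path.join(data_root, p))]

# Default data location from the steps above: $HOME/data
missing = missing_voc_paths(os.path.join(os.path.expanduser("~"), "data"))
for path in missing:
    print("missing: " + path)
```

An empty result suggests the preparation steps completed; it does not verify the contents of the LMDB files.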

Train/Eval

  1. Train your model and evaluate the model on the fly.
# It will create model definition files and save snapshot models in:
#   - $CAFFE_ROOT/models/VGGNet/VOC0712/SSD_300x300/
# and job file, log file, and the python script in:
#   - $CAFFE_ROOT/jobs/VGGNet/VOC0712/SSD_300x300/
# and save temporary evaluation results in:
#   - $HOME/data/VOCdevkit/results/VOC2007/SSD_300x300/
# It should reach 72.* mAP at 60k iterations.
python examples/ssd/ssd_pascal.py

If you don't have time to train your model, you can download a pre-trained model here.

  2. Evaluate the most recent snapshot.
# If you would like to test a model you trained, you can do:
python examples/ssd/score_ssd_pascal.py
  3. Test your model using a webcam. Note: press esc to stop.
# If you would like to attach a webcam to a model you trained, you can do:
python examples/ssd/ssd_pascal_webcam.py

Here is a demo video of running an SSD500 model trained on the MSCOCO dataset.

  4. Check out examples/ssd_detect.ipynb or examples/ssd/ssd_detect.cpp to see how to detect objects using an SSD model.

  5. To train on other datasets, please refer to data/OTHERDATASET for more details. MSCOCO and ILSVRC2016 are currently supported.
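To see which snapshot counts as "the most recent" in a snapshot directory such as $CAFFE_ROOT/models/VGGNet/VOC0712/SSD_300x300/, one option is to pick the file with the highest iteration number. This is a sketch, not a description of what score_ssd_pascal.py does internally: the helper name `latest_snapshot` is hypothetical, and it assumes snapshots follow Caffe's usual `<prefix>_iter_<N>.caffemodel` naming.

```python
import os
import re

# Assumed Caffe snapshot naming: <prefix>_iter_<N>.caffemodel
SNAPSHOT_RE = re.compile(r"_iter_(\d+)\.caffemodel$")

def latest_snapshot(snapshot_dir):
    """Return (iteration, path) for the highest-iteration .caffemodel, or None if there is none."""
    best = None
    for name in os.listdir(snapshot_dir):
        match = SNAPSHOT_RE.search(name)
        if match:
            iteration = int(match.group(1))
            if best is None or iteration > best[0]:
                best = (iteration, os.path.join(snapshot_dir, name))
    return best
```

Comparing the parsed iteration as an integer avoids the pitfall of sorting file names as strings, where `iter_9000` would sort after `iter_60000`.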

Models

  1. Models trained on VOC0712: SSD300, SSD500

  2. Models trained on MSCOCO trainval35k: SSD300, SSD500

  3. Models trained on ILSVRC2015 trainval1: SSD300, SSD500 (46.4 mAP on val2)

ssd-spacenet's Issues

Making standalone application of .CaffeModels

Hello,

How can I use the final deployed .caffemodel files (together with deploy.prototxt) in standalone classification/detection/segmentation projects?

Caffe mostly provides Python scripts, but they can only be used on a system with Caffe installed and compiled and with Python available (mainly Linux). That does not work well for someone who wants to build commercial apps. What should I do?

Thanks.

Models

Hello,
I can't download your models: Models trained on VOC0712: SSD300, SSD500
Models trained on MSCOCO trainval35k: SSD300, SSD500
Models trained on ILSVRC2015 trainval1: SSD300, SSD500 (46.4 mAP on val2)

Could you give me a link where I can get your models? Thank you!

preparing data and training on multi-channel images

Hello,

I have trained SSD on a customized version of the VOC dataset, following the data-preparation steps in SSD's readme. I would now like to use that model to fine-tune on multi-channel images (RGB+D) as input. Where can I edit the channel dimensions, given that the ssd_pascal.py script loads the VGGNet?

Also, what should I do differently to prepare the dataset correctly?

Regards,
Ankit
