
Deep Residual Networks

By Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun.

Microsoft Research Asia (MSRA).

Table of Contents

  1. Introduction
  2. Citation
  3. Disclaimer and known issues
  4. Models
  5. Results
  6. Third-party re-implementations

Introduction

This repository contains the original models (ResNet-50, ResNet-101, and ResNet-152) described in the paper "Deep Residual Learning for Image Recognition" (http://arxiv.org/abs/1512.03385). These models were used in the ILSVRC (http://image-net.org/challenges/LSVRC/2015/) and COCO 2015 competitions, where they won 1st place in ImageNet classification, ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.

Note

  1. Re-implementations with training code and models from Facebook AI Research (FAIR): blog, code
  2. Code for our improved 1K-layer ResNets, which achieve 4.62% test error on CIFAR-10 and are described in our new arXiv paper: https://github.com/KaimingHe/resnet-1k-layers

Citation

If you use these models in your research, please cite:

@article{He2015,
	author = {Kaiming He and Xiangyu Zhang and Shaoqing Ren and Jian Sun},
	title = {Deep Residual Learning for Image Recognition},
	journal = {arXiv preprint arXiv:1512.03385},
	year = {2015}
}

Disclaimer and known issues

  1. These models are converted from our own implementation to a recent version of Caffe (2016/2/3, b590f1d). The numerical results reported in the tables below were obtained with this code.
  2. These models are intended for testing or fine-tuning.
  3. These models were not trained using this version of Caffe.
  4. If you want to train these models using this version of Caffe without modifications, please note that:
    • GPU memory might be insufficient for extremely deep models.
    • Changing the mini-batch size is likely to affect accuracy (we used a mini-batch of 256 images on 8 GPUs, i.e., 32 images per GPU).
    • The data augmentation implementation may differ from ours (see our paper for the data augmentation we used).
    • We randomly shuffle data at the beginning of every epoch.
    • There might be some other untested issues.
  5. In our BN layers, the provided mean and variance are computed as a strict average (not a moving average) over a sufficiently large training batch after the training procedure. The numerical results are very stable (the variation of validation error is < 0.1%); using a moving average may lead to different results. A sketch of this recomputation follows this list.
  6. In the BN paper, the BN layer learns gamma/beta. To implement BN in this version of Caffe, we use its provided "batch_norm_layer" (which does not learn gamma/beta) followed by "scale_layer" (which learns gamma/beta).
  7. We use Caffe's implementation of SGD with momentum: v := momentum*v + lr*g. If you want to port these models to other libraries (e.g., Torch, CNTK), please pay careful attention to their possibly different implementation of SGD with momentum, v := momentum*v + (1-momentum)*lr*g, which changes the effective learning rate (see the sketch after this list).
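
As a sketch of point 5 (an illustration only, not the code used to produce these models; the function name and inputs are hypothetical), recomputing the BN statistics as a strict average in NumPy could look like:

import numpy as np

def recompute_bn_stats(activation_batches):
    # activation_batches: list of (N, C) arrays holding the inputs to one BN
    # layer, collected from a sufficiently large portion of the training set
    # after training has finished.
    x = np.concatenate(activation_batches, axis=0)
    # Strict average over all collected samples, as opposed to the moving
    # average accumulated during training.
    return x.mean(axis=0), x.var(axis=0)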
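
And as a sketch of point 7 (function names are illustrative; this is neither this repository's code nor Caffe's), the two momentum conventions differ only in how the gradient term is scaled, which changes the effective learning rate:

def caffe_style_step(v, g, lr, momentum=0.9):
    # Caffe convention: v := momentum * v + lr * g; the parameter update is
    # then w := w - v.
    return momentum * v + lr * g

def dampened_style_step(v, g, lr, momentum=0.9):
    # Convention used by some other libraries:
    # v := momentum * v + (1 - momentum) * lr * g.
    # The extra (1 - momentum) factor shrinks the effective learning rate, so a
    # model ported from Caffe may need lr rescaled by roughly 1 / (1 - momentum).
    return momentum * v + (1 - momentum) * lr * g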

Models

  1. Visualizations of network structures (tools from ethereon):

  2. Model files:

Results

  1. Curves on ImageNet (solid lines: 1-crop val error; dashed lines: training error): Training curves

  2. 1-crop validation error on ImageNet (center 224x224 crop from a resized image with shorter side = 256); a preprocessing sketch follows this list:

    model        top-1   top-5
    VGG-16       28.5%   9.9%
    ResNet-50    24.7%   7.8%
    ResNet-101   23.6%   7.1%
    ResNet-152   23.0%   6.7%
  3. 10-crop validation error on ImageNet (averaging the softmax scores of ten 224x224 crops from a resized image with shorter side = 256), the same as in the paper:

    model        top-1   top-5
    ResNet-50    22.9%   6.7%
    ResNet-101   21.8%   6.1%
    ResNet-152   21.4%   5.7%
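
A minimal sketch of the 1-crop preprocessing described in item 2 above (resize so the shorter side is 256, then take the center 224x224 crop). It uses Pillow for illustration and is an assumption about the pipeline, not this repository's evaluation code:

from PIL import Image

def one_crop(path, short_side=256, crop=224):
    img = Image.open(path).convert('RGB')
    w, h = img.size
    # Resize so that the shorter side equals `short_side`, keeping the aspect ratio.
    if w < h:
        img = img.resize((short_side, round(h * short_side / w)), Image.BILINEAR)
    else:
        img = img.resize((round(w * short_side / h), short_side), Image.BILINEAR)
    # Take the center `crop` x `crop` region.
    w, h = img.size
    left, top = (w - crop) // 2, (h - crop) // 2
    return img.crop((left, top, left + crop, top + crop))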

Third-party re-implementations

Deep residual networks are very easy to implement and train. We also recommend the following third-party re-implementations and extensions:

  1. By Facebook AI Research (FAIR), with training code in Torch and pre-trained ResNet-18/34/50/101 models for ImageNet: blog, code
  2. Torch, CIFAR-10, with ResNet-20 to ResNet-110, training code, and curves: code
  3. Lasagne, CIFAR-10, with ResNet-32 and ResNet-56 and training code: code
  4. Neon, CIFAR-10, with pre-trained ResNet-32 to ResNet-110 models, training code, and curves: code
  5. Torch, MNIST, 100 layers: blog, code
  6. A winning entry in Kaggle's right whale recognition challenge: blog, code
  7. Neon, Place2 (mini), 40 layers: blog, code
  8. MatConvNet, CIFAR-10, with ResNet-20 to ResNet-110, training code, and curves: code
  9. TensorFlow, CIFAR-10, with ResNet-32/110/182, training code, and curves: code
  10. MatConvNet, reproducing CIFAR-10 and ImageNet experiments (supporting official MatConvNet), training code and curves: blog, code
  11. Keras, ResNet-50: code

Converters:

  1. MatConvNet: url
  2. TensorFlow: url
