
A TensorFlow re-implementation of batch renormalization, first introduced by Sergey Ioffe.

License: MIT License


Batch Renormalization

A TensorFlow implementation of batch renormalization, first introduced by Sergey Ioffe.

Paper: Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models, Sergey Ioffe https://arxiv.org/abs/1702.03275

GitHub repository: https://github.com/eigenfoo/batch-renorm
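
For orientation, the training-time transform described in the paper can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in this repository: the function name `batch_renorm_train`, its arguments, and the default values of `r_max` and `d_max` are assumptions chosen for the example, and the learned scale and shift as well as the moving-average updates are left out.

```python
import numpy as np

def batch_renorm_train(x, moving_mean, moving_std, r_max=3.0, d_max=5.0, eps=1e-5):
    """Training-time batch renorm for x of shape (batch, features).

    Hypothetical helper for illustration; defaults are assumptions.
    """
    batch_mean = x.mean(axis=0)
    batch_std = np.sqrt(x.var(axis=0) + eps)

    # Correction terms. In a real framework these are treated as constants,
    # i.e. no gradient is propagated through r and d.
    r = np.clip(batch_std / moving_std, 1.0 / r_max, r_max)
    d = np.clip((batch_mean - moving_mean) / moving_std, -d_max, d_max)

    # Normalize with the batch statistics, then correct towards the
    # moving statistics.
    return (x - batch_mean) / batch_std * r + d
```

With `r_max=1.0` and `d_max=0.0` the correction vanishes and the function reduces to ordinary batch normalization. When neither clip is active, the output equals `(x - moving_mean) / moving_std`, so training and inference see the same normalization.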

The goal of this project is to reproduce the following figure from the paper:

Below is our reproduction:

Description

There were a few things that we did differently from the paper:

  • We used the CIFAR-100 dataset, instead of the ImageNet dataset.
  • We used a plain convolutional network, instead of the Inception-v3 architecture.
  • We used the Adam optimizer, instead of the RMSProp optimizer.
  • We split minibatches into 800 microbatches of 2 examples each, instead of 400 microbatches of 4 examples each. Note that each minibatch still consists of 1600 examples.
  • We trained for a mere 8k training updates, instead of 160k training updates.
  • We ran the training 5 separate times, and averaged the learning curves from all runs. This was not explicitly stated in the paper.
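
To make the microbatch setup concrete, here is a small NumPy sketch of splitting one minibatch into microbatches and computing normalization statistics per microbatch. The helper name `microbatch_stats` and the feature width of 8 are hypothetical, and the sketch assumes that each microbatch is normalized independently of the others; it is not taken from this repository's code.

```python
import numpy as np

def microbatch_stats(x, microbatch_size):
    """Return per-microbatch mean and variance for x of shape (batch, features)."""
    batch, features = x.shape
    if batch % microbatch_size != 0:
        raise ValueError("batch size must be divisible by microbatch_size")
    # Group consecutive examples: (num_microbatches, microbatch_size, features).
    groups = x.reshape(batch // microbatch_size, microbatch_size, features)
    return groups.mean(axis=1), groups.var(axis=1)

# One minibatch of 1600 examples, split into microbatches of 2 examples each.
x = np.random.default_rng(0).normal(size=(1600, 8))
mean, var = microbatch_stats(x, microbatch_size=2)
print(mean.shape)  # (800, 8)
```

Calling the same helper with `microbatch_size=4` yields 400 sets of statistics, which corresponds to the split used in the paper. With only 2 examples per microbatch the statistics are noisier still, which is the regime where batch norm and batch renorm are expected to differ most.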

The reproduced results do not exactly mirror the paper's. For instance, the learning curves for batch norm and batch renorm do not converge to the same value, and the batch norm curve even appears to curve downward towards the end of training.

We suspect that these discrepancies are due to two factors:

  1. Not training for long enough (8k training updates is far fewer than the paper's 160k), and
  2. Using a different architecture and dataset from the paper. While the qualitative behavior should still be the same, some hyperparameters may be poorly chosen for this setup.
