Equi-normalization of Neural Networks

ENorm is a fast, iterative method for minimizing the L2 norm of the weights of a given neural network that provably converges to a unique solution. Interleaving ENorm with SGD during training improves test accuracy.

This repository contains the implementation of ENorm as detailed in our paper Equi-normalization of Neural Networks (ICLR 2019). The library is easy to use and requires adding only two lines of code to your usual training loop.

Matrices $W_k$ and $W_{k+1}$ are updated by multiplying the columns of the first matrix with rescaling coefficients. The rows of the second matrix are inversely rescaled to ensure that the product of the two matrices is unchanged. The rescaling coefficients are strictly positive to ensure functional equivalence when the matrices are interleaved with ReLUs. This rescaling is applied iteratively to each pair of adjacent matrices.
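
As an illustration, here is a minimal PyTorch sketch of one such rescaling step between two adjacent fully connected layers, assuming the convention $y = \mathrm{ReLU}(x W_1) W_2$ and a uniform penalty (c = 1); the closed-form coefficient balances the column and row norms of the pair. This is illustrative only, not the library's implementation.

import torch

# illustrative sketch (not the library code): rescale one pair of adjacent
# fully connected weight matrices without changing the network function
torch.manual_seed(0)
W1 = torch.randn(4, 5)   # columns of W1 feed the 5 hidden units
W2 = torch.randn(5, 3)   # rows of W2 read the 5 hidden units
x = torch.randn(1, 4)

# d_i = sqrt(||row_i(W2)|| / ||col_i(W1)||) balances the two norms per hidden unit
d = (W2.norm(dim=1) / W1.norm(dim=0)).sqrt()
W1_new = W1 * d                # rescale the columns of the first matrix
W2_new = W2 / d.unsqueeze(1)   # inversely rescale the rows of the second matrix

# the function is unchanged because the coefficients are strictly positive
out_before = torch.relu(x @ W1) @ W2
out_after = torch.relu(x @ W1_new) @ W2_new
assert torch.allclose(out_before, out_after)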

Dependencies

ENorm requires Python 3.6 or newer. To run the code, you must have the packages listed in requirements.txt installed. They can be installed with:

pip install -r requirements.txt

How to use ENorm

The training procedure consists of performing one ENorm cycle (iterating ENorm once over the entire network) after each SGD step, as detailed below.

import torch

from enorm import ENorm


# define the model, criterion and optimizer
model = ...
criterion = ...
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)

# instantiate ENorm (here with asymmetric scaling coefficient c=1)
enorm = ENorm(model.named_parameters(), optimizer, c=1)

# training loop
for input, target in train_loader:
    # forward pass
    output = model(input)
    loss = criterion(output, target)

    # backward pass
    optimizer.zero_grad()
    loss.backward()

    # SGD and ENorm steps
    optimizer.step()
    enorm.step()

Some remarks on using ENorm (for details, see our paper):

  • ENorm is compatible with feedforward fully connected and/or convolutional architectures with ReLU and pooling layers.
  • The asymmetric scaling coefficient c penalizes the layers exponentially according to their depth. Usually, values of c equal to or slightly above 1 give the best results.
  • Currently, only the SGD optimizer is supported because of the momentum buffer update: when momentum is used, ENorm performs a jump in parameter space, so the momentum buffer must be updated accordingly.
  • Optionally, one can perform more ENorm cycles per step or apply ENorm only every k SGD iterations (k > 1); a sketch of the latter follows this list. In our experience, performing one ENorm cycle after each SGD iteration generally works best.
  • In practice, we have found the training to be more stable when not balancing the biases.
  • When applying ENorm to a network with BatchNorm layers, we simply ignore the BatchNorm weights and perform the ENorm cycle on the network as usual.
  • See the documentation in the enorm.py file to adapt ENorm to your favourite network architecture.
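
A minimal sketch of the variant mentioned above, where ENorm is applied only every k SGD iterations (k = 3 is a hypothetical choice; the rest of the loop is identical to the example above):

# apply ENorm only every k SGD iterations instead of after every step
k = 3
for it, (input, target) in enumerate(train_loader):
    output = model(input)
    loss = criterion(output, target)

    optimizer.zero_grad()
    loss.backward()

    optimizer.step()
    if it % k == 0:
        enorm.step()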

Results

You can reproduce the results of our paper by running the following commands:

# fully connected network on CIFAR10 with 15 intermediate layers
python main.py --dataset cifar10 --model-type linear --n-layers 15 --enorm 0 --epochs 60 --lr 0.1 --weight-decay 1e-3 --momentum 0 --n-iter 5
python main.py --dataset cifar10 --model-type linear --n-layers 15 --enorm 1.2 --epochs 60 --lr 0.1 --weight-decay 1e-3 --momentum 0 --n-iter 5

# fully convolutional network on CIFAR10
python main.py --dataset cifar10 --model-type conv --enorm 0 --epochs 128 --lr 0.05 --weight-decay 1e-3 --momentum 0.9 --n-iter 5
python main.py --dataset cifar10 --model-type conv --enorm 1.1 --epochs 128 --lr 0.05 --weight-decay 1e-3 --momentum 0.9 --n-iter 5

License

ENorm is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license, as found in the LICENSE file.

Bibliography

Please consider citing [1] if you found the resources in this repository useful.

[1] P. Stock, B. Graham, R. Gribonval and H. Jégou. Equi-normalization of Neural Networks. ICLR 2019.

@inproceedings{stock2018enorm,
  title = {Equi-normalization of Neural Networks},
  author = {Stock, Pierre and Graham, Benjamin and Gribonval, Rémi and Jégou, Hervé},
  booktitle = {International Conference on Learning Representations (ICLR)},
  year = {2019}
}

enorm's Issues

Problem with installation: ModuleNotFoundError

I'm installing this from conda:

git clone https://github.com/facebookresearch/enorm.git
cd enorm
python3 setup.py install
cd ..
rm -rf enorm # to make sure it loads from the site-packages
python3 -c 'from enorm import ENorm'
# ModuleNotFoundError: No module named 'enorm'

What succeeds is:

git clone https://github.com/facebookresearch/enorm.git
cd enorm
python3 setup.py install
cd ..
# no removing, loading from the source tree
python3 -c 'from enorm.enorm.enorm.enorm import ENorm' # a bit too many enorms?

setup.py install output:

running install
running bdist_egg
running egg_info
creating enorm.egg-info
writing enorm.egg-info/PKG-INFO
writing dependency_links to enorm.egg-info/dependency_links.txt
writing top-level names to enorm.egg-info/top_level.txt
writing manifest file 'enorm.egg-info/SOURCES.txt'
package init file 'enorm/__init__.py' not found (or not a regular file)
reading manifest file 'enorm.egg-info/SOURCES.txt'
writing manifest file 'enorm.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
creating build
creating build/lib
creating build/lib/enorm
copying enorm/main.py -> build/lib/enorm
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/enorm
copying build/lib/enorm/main.py -> build/bdist.linux-x86_64/egg/enorm
byte-compiling build/bdist.linux-x86_64/egg/enorm/main.py to main.cpython-37.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying enorm.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying enorm.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying enorm.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying enorm.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
zip_safe flag not set; analyzing archive contents...
creating dist
creating 'dist/enorm-1.0-py3.7.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing enorm-1.0-py3.7.egg
Removing /home/user/env/lib/python3.7/site-packages/enorm-1.0-py3.7.egg
Copying enorm-1.0-py3.7.egg to /home/user/env/lib/python3.7/site-packages
enorm 1.0 is already the active version in easy-install.pth

Installed /home/user/env/lib/python3.7/site-packages/enorm-1.0-py3.7.egg
Processing dependencies for enorm==1.0
Finished processing dependencies for enorm==1.0
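
The "package init file 'enorm/__init__.py' not found" line in the log above suggests the packaging metadata does not match the intended import path. As an illustration only (an assumed flat layout, not the repository's actual setup.py), a configuration like the following would make "from enorm import ENorm" work after installation:

from setuptools import setup, find_packages

# hypothetical sketch: assumes a top-level enorm/ package whose __init__.py
# re-exports ENorm (e.g. from .enorm import ENorm)
setup(
    name="enorm",
    version="1.0",
    packages=find_packages(),
)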

Doesn't work with torch >= 1.0 and 1D data.

File "train.py", line 479, in train_batch
enorm.step()
File "/Projects/myproj/.venv/lib/python3.6/site-packages/enorm/enorm/enorm.py", line 67, in step
self._step_conv()
File "/Projects/myproj/.venv/lib/python3.6/site-packages/enorm/enorm/enorm.py", line 85, in _step_conv
right_w = self._get_weight(1, 'r')
File "/Projects/myproj/.venv/lib/python3.6/site-packages/enorm/enorm/enorm.py", line 60, in _get_weight
param = param.permute(1, 2, 3, 0).contiguous().view(param.size(1), -1)
RuntimeError: number of dims don't match in permute
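
For context, the error can be reproduced in isolation: the permutation in the conv path assumes a 4D Conv2d weight (out, in, kh, kw), so a weight with a different number of dimensions, such as a 3D Conv1d weight for 1D data, fails. A minimal, illustrative reproduction (not the library code):

import torch

# a Conv1d weight is 3D (out_channels, in_channels, kernel_size), while the
# permutation below expects a 4D Conv2d weight
param = torch.nn.Conv1d(8, 16, kernel_size=3).weight   # shape (16, 8, 3)
try:
    param.permute(1, 2, 3, 0)
except RuntimeError as err:
    print(err)   # e.g. "number of dims don't match in permute"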
