
📈 BregmanLearning

Implementation of the inverse scale space training algorithms for sparse neural networks, proposed in A Bregman Learning Framework for Sparse Neural Networks [1]. Feel free to use it and please refer to our paper when doing so.

```bibtex
@misc{bungert2021bregman,
      title={A Bregman Learning Framework for Sparse Neural Networks},
      author={Leon Bungert and Tim Roith and Daniel Tenbrinck and Martin Burger},
      year={2021},
      eprint={2105.04319},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```

💡 Method Description

Our Bregman learning framework aims at training sparse neural networks in an inverse scale space manner, starting with very few parameters and gradually adding only relevant parameters during training. We train a neural network parametrized by weights $\theta$ using the simple baseline algorithm

$$v_{k+1} = v_k - \tau \hat{\nabla} \mathcal{L}(\theta_k), \qquad \theta_{k+1} = \operatorname{prox}_{\delta J}(\delta v_{k+1}),$$

where

  • $\mathcal{L}$ denotes a loss function with stochastic gradient $\hat{\nabla} \mathcal{L}$,
  • $J$ is a sparsity-enforcing functional, e.g., the $\ell_1$-norm,
  • $\operatorname{prox}_{\delta J}$ is the proximal operator of $\delta J$.

Our algorithm is based on linearized Bregman iterations [2] and is a simple extension of stochastic gradient descent, which is recovered by choosing $J = 0$. We also provide accelerations of our baseline algorithm using momentum and Adam [3].

The variable $v$ is a subgradient of $\theta$ with respect to the elastic net functional

$$J_\delta(\theta) = J(\theta) + \frac{1}{2\delta} \lVert \theta \rVert^2$$

and stores the information about which parameters are non-zero.
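
To make the update concrete, here is a minimal NumPy sketch of the baseline iteration for $J(\theta) = \lambda \lVert \theta \rVert_1$, whose proximal operator is the soft-thresholding (shrinkage) function. The function names and the toy least-squares problem are illustrative assumptions, not the repository's optimizer API.

```python
# Minimal sketch of the baseline linearized Bregman iteration for
# J(theta) = lam * ||theta||_1, whose prox is soft-thresholding.
# Illustrative only -- not the repository's actual optimizer API.
import numpy as np

def soft_threshold(z, thresh):
    # Proximal operator of thresh * ||.||_1 (soft-shrinkage).
    return np.sign(z) * np.maximum(np.abs(z) - thresh, 0.0)

def linbreg_step(theta, v, grad, tau=0.1, delta=1.0, lam=0.05):
    # Gradient step on the subgradient variable v, then recover theta.
    v = v - tau * grad
    theta = soft_threshold(delta * v, delta * lam)  # prox_{delta J}(delta v)
    return theta, v

# Toy least-squares problem with a sparse ground truth.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
theta_true = np.zeros(20)
theta_true[:3] = 1.0
b = A @ theta_true

theta, v = np.zeros(20), np.zeros(20)
for _ in range(500):
    grad = A.T @ (A @ theta - b) / len(b)  # full gradient; stochastic in practice
    theta, v = linbreg_step(theta, v, grad)

print("non-zero entries:", np.count_nonzero(theta))
```

Starting from $\theta = 0$, a parameter only becomes active once $|\delta v|$ exceeds the threshold $\delta \lambda$, which is exactly the inverse scale space behaviour described above.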

🎲 Initialization

We use a sparse initialization strategy, setting each parameter to a non-zero value only with a small probability. The variance of the non-zero parameters is chosen to avoid vanishing or exploding gradients, generalizing Kaiming-He or Xavier initialization.
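
As a rough illustration, such an initialization can be realized with a Bernoulli mask on top of a Gaussian draw. The $1/p$ variance inflation below is an assumption made for this sketch (so that the layer's total variance matches the dense Kaiming case); see [1] for the exact scaling used in the paper.

```python
# Sketch of a sparse initialization: keep each weight with probability p and
# inflate the variance of the survivors by 1/p so that the layer's overall
# variance matches dense Kaiming initialization (2 / fan_in). The exact
# scaling is an assumption for illustration; see [1] for the one actually used.
import torch

def sparse_kaiming_(weight, p=0.05):
    fan_in = weight.shape[1]
    sigma = (2.0 / (p * fan_in)) ** 0.5
    mask = torch.bernoulli(torch.full_like(weight, p))  # Bernoulli(p) mask
    with torch.no_grad():
        weight.copy_(mask * sigma * torch.randn_like(weight))
    return weight

layer = torch.nn.Linear(784, 200)
sparse_kaiming_(layer.weight, p=0.05)
density = layer.weight.count_nonzero().item() / layer.weight.numel()
print(f"fraction of non-zero weights: {density:.3f}")
```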

🔬 Experiments

The different experiments can be executed as Jupyter notebooks in the notebooks folder.

Classification

Multi-Layer Perceptron

In this experiment we consider the MNIST classification task using a simple multi-layer perceptron. We compare the LinBreg optimizer to standard SGD and proximal descent. The respective notebook can be found at MLP-Classification.

Convolutions and Group Sparsity

In this experiment we consider the Fashion-MNIST classification task using a simple convolutional net. The experiment can be executed as a notebook via the file ConvNet-Classification.

ResNet

In this experiment we consider the CIFAR10 classification task using a ResNet. The experiment can be executed as a notebook via the file ResNet-Classification.

NAS

This experiment implements the neural architecture search as proposed in [4].

The corresponding notebooks are DenseNet and Skip-Encoder.

โ˜๏ธ Miscellaneous

The notebooks will throw errors if the datasets cannot be found. You can change the default configuration 'download':False to 'download':True in order to automatically download the necessary dataset and store it in the appropriate folder.

If you want to run the code on your CPU, replace 'use_cuda':True, 'num_workers':4 by 'use_cuda':False, 'num_workers':0 in the configuration of the notebook.
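
For reference, a notebook configuration with the CPU-friendly settings described above would contain entries like the following (the keys are the ones quoted above):

```python
# CPU-friendly notebook configuration with automatic dataset download.
conf = {
    'download': True,   # fetch the dataset if it is not found locally
    'use_cuda': False,  # run on the CPU instead of a GPU
    'num_workers': 0,   # no extra data-loading worker processes
}
```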

📝 References

[1] Leon Bungert, Tim Roith, Daniel Tenbrinck, Martin Burger. "A Bregman Learning Framework for Sparse Neural Networks." arXiv preprint arXiv:2105.04319 (2021). https://arxiv.org/abs/2105.04319

[2] Wotao Yin, Stanley Osher, Donald Goldfarb, Jerome Darbon. "Bregman Iterative Algorithms for $\ell_1$-Minimization with Applications to Compressed Sensing." SIAM Journal on Imaging Sciences 1.1 (2008): 143-168.

[3] Diederik Kingma, Jimmy Lei Ba. "Adam: A Method for Stochastic Optimization." arXiv preprint arXiv:1412.6980 (2014). https://arxiv.org/abs/1412.6980

[4] Leon Bungert, Tim Roith, Daniel Tenbrinck, Martin Burger. "Neural Architecture Search via Bregman Iterations." arXiv preprint arXiv:2106.02479 (2021). https://arxiv.org/abs/2106.02479

