
group-sparsity-sbp's Introduction

Structured Bayesian Pruning via Log-Normal Multiplicative Noise

This repo contains the code for our NIPS 2017 paper, Structured Bayesian Pruning via Log-Normal Multiplicative Noise (poster). In the paper, we propose a new Bayesian model that takes into account the computational structure of neural networks and provides structured sparsity, e.g. it removes neurons and/or convolutional channels in CNNs. To do this, we inject noise into the neurons' outputs while keeping the weights unregularized.
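For readers skimming the repo, the following is only an illustrative sketch of that idea, not the repo's TensorFlow implementation: neuron outputs are multiplied by noise drawn from a log-normal distribution, and in the actual model the noise parameters are learned so that neurons whose noise concentrates near zero can be pruned. Here `mu` and `sigma` are fixed placeholders.

```python
# Illustrative sketch only (NumPy): multiplicative log-normal noise on
# neuron outputs. In the paper's model the noise parameters are learned and
# a truncated posterior is used; this just shows the multiplicative-noise idea.
import numpy as np

def multiplicative_lognormal_noise(activations, mu, sigma, rng):
    """activations: (batch, num_neurons); mu, sigma: per-neuron, shape (num_neurons,)."""
    noise = np.exp(rng.normal(loc=mu, scale=sigma, size=activations.shape))
    return activations * noise

rng = np.random.default_rng(0)
x = np.ones((4, 3))  # toy batch: 4 examples, 3 neurons
y = multiplicative_lognormal_noise(x, mu=np.zeros(3), sigma=0.1 * np.ones(3), rng=rng)
print(y.shape)  # (4, 3)
```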

Spotlight video

Launch experiments

Example for launching the LeNet5 experiment.

python ./scripts/lenet5-sbp.py

Example for launching the VGG-like experiment. To obtain a sparse VGG-like architecture we start from a pretrained network, so you can either use your own weights or train the network from scratch using the following command.

python ./scripts/vgglike.py --num_gpus <num GPUs>

Don't forget to adjust the batch size so that you get the same total number of iterations. For instance, for one GPU we use batch_size=100; for two GPUs we use batch_size=50.
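A small illustrative helper (not part of the repo's scripts) that captures this rule of thumb: keep the total effective batch size at 100 so the number of iterations stays the same for any GPU count.

```python
# Illustrative only: reproduces the rule above (batch_size=100 for 1 GPU,
# 50 for 2 GPUs). The repo's scripts take --num_gpus directly.
def per_gpu_batch_size(num_gpus, total_batch_size=100):
    assert total_batch_size % num_gpus == 0, "batch size must divide evenly"
    return total_batch_size // num_gpus

print(per_gpu_batch_size(1))  # 100
print(per_gpu_batch_size(2))  # 50
```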

Finally, use the following command to launch the SBP model for the VGG-like architecture.

python ./scripts/vgglike-sbp.py --num_gpus <num GPUs> --checkpoint <path to pretrained checkpoint>

MNIST Experiments

Results for LeNet architectures on MNIST

| Network  | Method       | Error (%) | Neurons per Layer    | CPU    | GPU    | FLOPs   |
|----------|--------------|-----------|----------------------|--------|--------|---------|
| LeNet-fc | Original     | 1.54      | 784 - 500 - 300 - 10 | 1.00 × | 1.00 × | 1.00 ×  |
|          | SparseVD     | 1.57      | 537 - 217 - 130 - 10 | 1.19 × | 1.03 × | 3.73 ×  |
|          | SSL          | 1.49      | 434 - 174 - 78 - 10  | 2.21 × | 1.04 × | 6.06 ×  |
|          | StructuredBP | 1.55      | 245 - 160 - 55 - 10  | 2.33 × | 1.08 × | 11.23 × |
| LeNet5   | Original     | 0.80      | 20 - 50 - 800 - 500  | 1.00 × | 1.00 × | 1.00 ×  |
|          | SparseVD     | 0.75      | 17 - 32 - 329 - 75   | 1.48 × | 1.41 × | 2.19 ×  |
|          | SSL          | 1.00      | 3 - 12 - 800 - 500   | 5.17 × | 1.80 × | 3.90 ×  |
|          | StructuredBP | 0.86      | 3 - 18 - 284 - 283   | 5.41 × | 1.91 × | 10.49 × |

CIFAR-10 Experiments

Results for the VGG-like architecture on the CIFAR-10 dataset. Here the speed-up is reported for the CPU; more detailed results are provided in the paper.

Citation

If you found this code useful, please cite our paper:

@incollection{
  neklyudov2018structured,
  title = {Structured Bayesian Pruning via Log-Normal Multiplicative Noise},
  author = {Neklyudov, Kirill and Molchanov, Dmitry and Ashukha, Arsenii and Vetrov, Dmitry P},
  booktitle = {Advances in Neural Information Processing Systems 30},
  editor = {I. Guyon and U. V. Luxburg and S. Bengio and H. Wallach and R. Fergus and S. Vishwanathan and R. Garnett},
  pages = {6778--6787},
  year = {2017},
  publisher = {Curran Associates, Inc.},
  url = {http://papers.nips.cc/paper/7254-structured-bayesian-pruning-via-log-normal-multiplicative-noise.pdf}
}

group-sparsity-sbp's People

Contributors

necludov, senya-ashukha


group-sparsity-sbp's Issues

the final loss

In the paper, the final loss function is presented in equation (12): the expected log-likelihood estimated through SGVB plus the KL divergence.

It seems that the SBP layer only takes the KL divergence into account; why don't we need to deal with the expected log-likelihood term?

Is the log-likelihood included in our objective function?
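For context, a generic SGVB-style objective of the kind equation (12) refers to has the following form; the notation here is illustrative and not copied from the paper.

```latex
% Generic stochastic-variational objective: an expected log-likelihood
% (data term), typically estimated with the reparameterization trick,
% minus a KL regularizer on the approximate posterior over the noise.
\mathcal{L}(\phi)
  = \mathbb{E}_{q_\phi(\theta)}\bigl[\log p(\mathcal{D}\mid\theta)\bigr]
  - \mathrm{KL}\bigl(q_\phi(\theta)\,\|\,p(\theta)\bigr)
```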

Elaboration on the pretrained model used

The paper mentions that for VGG-like training, a pretrained model was used. Could a link be provided for the checkpoint file of the pretrained model so the vgglike-sbp.py experiment can be replicated independently?

different erfcx approximation error on pytorch

Hi, I am trying to re-implement your paper in PyTorch. I changed your erfcx function to work with PyTorch tensors. After comparing with the values of special.erfcx(x), the average absolute error is approximately 2.11e-08 and the average relative error is approximately 3.91e-08, both of which are much larger than for your erfcx approximation. Could this be a problem?

Thanks,
Shangqian
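For anyone reproducing this kind of comparison, here is a hedged sketch. It assumes a recent PyTorch that ships torch.special.erfcx as the candidate implementation (substitute your own approximation there) and uses scipy.special.erfcx as the reference, reporting mean absolute and relative errors as in the issue.

```python
# Compare a candidate erfcx implementation against scipy.special.erfcx.
import numpy as np
import torch
from scipy import special

x = np.linspace(-5.0, 5.0, 10001)
reference = special.erfcx(x)
candidate = torch.special.erfcx(torch.from_numpy(x)).numpy()  # or your own approximation

abs_err = np.abs(candidate - reference)
rel_err = abs_err / np.abs(reference)  # erfcx(x) > 0, so no division by zero
print("mean absolute error:", abs_err.mean())
print("mean relative error:", rel_err.mean())
```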
