
Rethinking the Value of Network Pruning

This repository contains the code and trained ImageNet models for reproducing the results in the following paper:

Rethinking the Value of Network Pruning. [arXiv] [OpenReview]

Zhuang Liu*, Mingjie Sun*, Tinghui Zhou, Gao Huang, Trevor Darrell (* equal contribution).

ICLR 2019. The paper also received the Best Paper Award at the NIPS 2018 Workshop on Compact Deep Neural Networks.

The implementations of several pruning methods contained in this repo can also be readily reused for other research purposes.

Paper Summary

Fig 1: A typical three-stage network pruning pipeline.

Our paper shows that for structured pruning, training the pruned model from scratch can almost always achieve a comparable or higher level of accuracy than the model obtained from the typical "training, pruning and fine-tuning" procedure (Fig. 1). We conclude that, for these pruning methods:

  1. Training a large, over-parameterized model is often not necessary to obtain an efficient final model.
  2. Learned “important” weights of the large model are typically not useful for the small pruned model.
  3. The pruned architecture itself, rather than a set of inherited “important” weights, is more crucial to the efficiency of the final model. This suggests that, in some cases, pruning can be useful as an architecture search paradigm.

Our results suggest the need for more careful baseline evaluations in future research on structured pruning methods.

Fig 2: Difference between predefined and automatically discovered target architectures in channel pruning. The pruning ratio x is user-specified, while a, b, c, d are determined by the pruning algorithm. Unstructured sparse pruning can also be viewed as automatic. Our finding has different implications for predefined and automatic methods: for a predefined method, it is possible to skip the traditional "training, pruning and fine-tuning" pipeline and directly train the pruned model; for automatic methods, pruning can be seen as a form of architecture learning.
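
To make the predefined case concrete, here is a minimal sketch of L1-norm based channel pruning for a single convolution, written in modern PyTorch rather than this repo's actual (PyTorch 0.3.1) code; the function name and the uniform ratio x applied to every layer are illustrative assumptions:

```python
import torch
import torch.nn as nn

def l1_prune_conv(conv: nn.Conv2d, x: float) -> nn.Conv2d:
    """Keep the (1 - x) fraction of filters with the largest L1 norms.
    Applying the same user-specified ratio x to every layer makes the
    target architecture fully predefined."""
    n_keep = max(1, int(conv.out_channels * (1 - x)))
    # L1 norm of each filter; weight shape is (out_ch, in_ch, kH, kW)
    norms = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    keep = torch.argsort(norms, descending=True)[:n_keep]
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(conv.weight[keep])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep])
    return pruned
```

In a real network, the following layer's input channels (and any BatchNorm parameters) must be sliced to match. Our results suggest that, for such predefined architectures, one can skip copying the weights altogether and simply train the pruned model from scratch.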


We also compare with the "Lottery Ticket Hypothesis" (Frankle & Carbin 2019), and find that with the optimal learning rate, the "winning ticket" initialization used in Frankle & Carbin (2019) brings no improvement over random initialization. For more details, please refer to our paper.
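
As a rough sketch of that comparison (illustrative helpers in modern PyTorch, not the actual code in cifar/lottery-ticket): the trained weights are pruned by magnitude, and the surviving weights are then reset either to their original initialization (the "winning ticket") or to a fresh random one before retraining:

```python
import torch
import torch.nn as nn

def magnitude_mask(param: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Binary mask that keeps the largest-magnitude (1 - sparsity)
    fraction of entries in a trained weight tensor."""
    n_prune = int(param.numel() * sparsity)
    if n_prune == 0:
        return torch.ones_like(param)
    threshold = param.detach().abs().flatten().kthvalue(n_prune).values
    return (param.detach().abs() > threshold).float()

def reset_and_mask(model: nn.Module, masks: dict, state: dict) -> None:
    """Load `state` (the saved original init for the winning ticket, or
    a freshly initialized model's state dict for the random baseline),
    then zero out the pruned weights."""
    model.load_state_dict(state)
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in masks:
                p.mul_(masks[name])
```

One would save the initial state dict before training, build the masks from the trained weights, and retrain after resetting; our finding is that the two choices of `state` perform on par once the learning rate is tuned.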

Implementation

We evaluated the following seven pruning methods.

  1. L1-norm based channel pruning
  2. ThiNet
  3. Regression based feature reconstruction
  4. Network Slimming
  5. Sparse Structure Selection
  6. Soft filter pruning
  7. Unstructured weight-level pruning

The first six are structured methods, while the last one is unstructured (or sparse). For CIFAR, our code is based on pytorch-classification and network-slimming. For ImageNet, we use the official PyTorch ImageNet training code. The instructions and models are in each subfolder, and one automatic method (Network Slimming) is sketched below.
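
As an example of the automatic case, here is a minimal sketch of the channel selection step in Network Slimming, again in modern PyTorch with illustrative names rather than the subfolder's actual code:

```python
import torch
import torch.nn as nn

def slimming_masks(model: nn.Module, prune_ratio: float) -> list:
    """Rank every channel by the magnitude of its BatchNorm scale
    factor (gamma) across the whole network, and mark the globally
    smallest `prune_ratio` fraction for removal; each layer's
    surviving width is thus decided by the algorithm, not the user."""
    gammas = torch.cat([m.weight.detach().abs().flatten()
                        for m in model.modules()
                        if isinstance(m, nn.BatchNorm2d)])
    k = max(1, int(gammas.numel() * prune_ratio))
    threshold = gammas.kthvalue(k).values
    return [m.weight.detach().abs() > threshold
            for m in model.modules()
            if isinstance(m, nn.BatchNorm2d)]
```

During training, Network Slimming adds an L1 penalty on these gamma factors to push unimportant channels toward zero before this selection step.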

For experiments on the Lottery Ticket Hypothesis, please refer to the folder cifar/lottery-ticket.

Our experiment environment is Python 3.6 and PyTorch 0.3.1.

Contact

Feel free to discuss papers/code with us through issues/emails!

sunmj15 at gmail.com
liuzhuangthu at gmail.com

Citation

If you use our code in your research, please cite:

@inproceedings{liu2018rethinking,
  title={Rethinking the Value of Network Pruning},
  author={Liu, Zhuang and Sun, Mingjie and Zhou, Tinghui and Huang, Gao and Darrell, Trevor},
  booktitle={ICLR},
  year={2019}
}

