
Filter Pruning with Attention-Preserving Self-Distillation

Overview

[Figures: NGGM (Filter Pruning) and HAP (Self-Distillation) overview diagrams]
  • Benchmarks 3 state-of-the-art filter pruning methods in PyTorch:

    Paper (Filter Pruning)   Name
    PFEC (ICLR'17)           Pruning Filters for Efficient ConvNets
    SFP (IJCAI'18)           Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks
    FPGM (CVPR'19)           Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration
  • Benchmarks 3 state-of-the-art knowledge distillation methods in PyTorch:

    Paper (Distillation)     Name
    AT (ICLR'17)             Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer
    SP (ICCV'19)             Similarity-Preserving Knowledge Distillation
    AFD (AAAI'21)            Show, Attend and Distill: Knowledge Distillation via Attention-based Feature Matching
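
For reference, AT-style methods compare activation-based attention maps between teacher and student feature maps. The following is a minimal sketch of that attention map and matching loss, following the AT (ICLR'17) formulation; the function names are illustrative and are not taken from this repository.

```python
import torch
import torch.nn.functional as F

def attention_map(feat: torch.Tensor, p: int = 2) -> torch.Tensor:
    """Activation-based spatial attention map (AT, ICLR'17): sum of the p-th
    power of absolute activations over channels, flattened and L2-normalized."""
    # feat: (N, C, H, W) -> (N, H*W)
    am = feat.abs().pow(p).sum(dim=1).flatten(1)
    return F.normalize(am, dim=1)

def at_loss(feat_s: torch.Tensor, feat_t: torch.Tensor) -> torch.Tensor:
    """Mean squared distance between student and teacher attention maps."""
    return (attention_map(feat_s) - attention_map(feat_t)).pow(2).mean()
```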

Requirements

  • Python (>3.6)
  • PyTorch (>1.7.1)
  • torchvision
  • numpy
  • tensorboardX
  • scikit-learn
  • tqdm

Running

Vanilla ResNet Training

  • Running commands are in scripts/run_vanilla.sh. An example of training ResNet-56 on CIFAR-10:
    python3 initial_train.py --model resnet56 --dataset cifar10 --lr 0.01 --schedule 1 60 120 160 --lr-drops 10 0.2 0.2 0.2 --batch-size 128 --seed 8152
    where the flags are explained as follows:
    • --schedule: specify the epochs at which to scale the learning rate.
    • --lr-drops: specify the factor by which the learning rate is multiplied at each epoch listed in --schedule (see the scheduler sketch after this list).
    • Note: --schedule and --lr-drops must have the same length.
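
As a rough illustration of how the --schedule / --lr-drops pair is interpreted (the actual implementation lives in initial_train.py and may differ), the example above warms the learning rate up from 0.01 to 0.1 at epoch 1 and then decays it by 0.2 at epochs 60, 120, and 160:

```python
import torch

# Illustrative only: how a --schedule / --lr-drops pair can drive the optimizer.
schedule = [1, 60, 120, 160]     # epochs at which to scale the learning rate
lr_drops = [10, 0.2, 0.2, 0.2]   # multiplicative factor applied at each epoch above

model = torch.nn.Linear(10, 10)  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

drop_at = dict(zip(schedule, lr_drops))

for epoch in range(200):
    if epoch in drop_at:
        for group in optimizer.param_groups:
            group["lr"] *= drop_at[epoch]   # e.g. 0.01 -> 0.1 at epoch 1 (warm-up)
    # ... run one training epoch here ...
```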

Pruned ResNet Training

  • Running commands are in scripts/run_pruning.sh. An example of running ResNet-56 on CIFAR-10 with the distillation method AT (ICLR'17):
    python3 pruning.py --t-model resnet56 --s-copy-t --dataset cifar10 --prune-rates 0.6 --prune-mode filter-r --t-path saves/1625594011/model_best.pt --distill at --betas 1000 --log-name PRUNED-CIFAR10.txt --seed 8152
    where the flags are explained as:
    • --t-model: specify the model used by the teacher.
    • --s-copy-t: copy the parameters of the pre-trained teacher model as the student's initial parameters. Note: it cannot be used when the teacher and student have different architectures.
    • --prune-rates: specify the proportion of filters to be retained in each convolutional layer (default: 1.0, i.e., no filters are pruned).
    • --prune-mode: specify what pruning method to use, including:
      • filter-r: prune randomly.
      • filter-a: prune by the L1-norm of the filter, i.e. PFEC (ICLR'17); see the pruning sketch after this list.
      • filter-gm: prune by the geometric median, i.e. FPGM (CVPR'19).
      • filter-nggm: prune by our method.
    • --t-path: pre-trained model corresponding to --t-model.
    • --distill: specify what distillation method to use, including:
      • at: AT (ICLR'17).
      • sp: SP (ICCV'19).
      • afd: AFD (AAAI'21).
      • hap: our method.
      • Note: by default, we add KD (NIPS'14) to all the baselines.
    • --log-name: specify the name of the log file. By default, the log file will be saved at ./saves directory.
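
For intuition, the sketch below shows PFEC-style L1-norm filter ranking driven by a keep rate such as --prune-rates 0.6. The function and variable names are illustrative and do not come from pruning.py.

```python
import torch

def l1_filter_mask(conv_weight: torch.Tensor, keep_rate: float) -> torch.Tensor:
    """Rank the filters of a conv layer by L1 norm (PFEC-style) and keep the
    top `keep_rate` fraction. Returns a boolean mask over output channels."""
    # conv_weight: (out_channels, in_channels, kH, kW)
    l1 = conv_weight.abs().sum(dim=(1, 2, 3))           # one score per filter
    n_keep = max(1, int(round(keep_rate * l1.numel())))
    mask = torch.zeros(l1.numel(), dtype=torch.bool)
    mask[l1.topk(n_keep).indices] = True
    return mask

# Example: keep 60% of the filters (--prune-rates 0.6) of a 16-filter layer.
w = torch.randn(16, 3, 3, 3)
mask = l1_filter_mask(w, keep_rate=0.6)
pruned_w = w[mask]
```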

Quantized ResNet Training + Huffman Coding

  • Running commands are in scripts/run_quantization_encode.sh.
  • Note: we verify that the model's accuracy before Huffman encoding and after decoding is identical, as a correctness check of our implementation (see the sketch below).
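
The correctness check amounts to a lossless round trip: Huffman decoding must reproduce exactly the quantized weight indices that were encoded, so accuracy cannot change. A minimal, self-contained sketch of such a round trip (not the repository's implementation) is:

```python
import heapq, itertools
from collections import Counter

def huffman_code(symbols):
    """Build a Huffman code (symbol -> bitstring) from an iterable of symbols."""
    freq = Counter(symbols)
    if len(freq) == 1:                       # degenerate case: a single symbol
        return {next(iter(freq)): "0"}
    tiebreak = itertools.count()             # avoids comparing dicts in the heap
    heap = [(f, next(tiebreak), {s: ""}) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

def encode(symbols, code):
    return "".join(code[s] for s in symbols)

def decode(bits, code):
    inverse = {b: s for s, b in code.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return out

# Round-trip check: decoded indices must equal the original quantized indices.
quantized_indices = [3, 1, 3, 3, 0, 2, 1, 3, 3, 2]   # hypothetical cluster indices
code = huffman_code(quantized_indices)
assert decode(encode(quantized_indices, code), code) == quantized_indices
```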

Benchmark Results on CIFAR-100

[Figure: benchmark results table on CIFAR-100]

  • Note: the values in the Acc. after pruning (%) column are the means over four runs with different, fixed seeds.

