
mmkd's Introduction

MMKD

This repo covers the implementation of the following ICME 2023 paper: Adaptive Multi-Teacher Knowledge Distillation with Meta-Learning

Installation

This repo was tested with Python 3.6, PyTorch 1.8.1, and CUDA 11.1.

Running

Before distilling the student, be sure to set the teacher model directory in setting.py.

nohup python train_meta.py --model_s vgg8 --teacher_num 3 --distill inter --ensemble_method META --nesterov -r 1 -a 1 -b 100 --hard_buffer --convs --trial 0 --gpu_id 0 &

where the flags are explained as:

  • --distill: specify the distillation method
  • --model_s: specify the student model; see 'models/__init__.py' for the available model types
  • -r: the weight of the cross-entropy loss between logit and ground truth, default: 1
  • -a: the weight of the KD loss, default: 1
  • -b: the weight of other distillation losses, default: 0
  • --teacher_num: specify the ensemble size (number of teacher models)
  • --ensemble_method: specify the ensemble method (META in the command above)
  • --hard_buffer: whether a hard buffer is required
  • --convs: the feature alignment method. If this flag is not set, a plain 1x1 convolution is used for alignment
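
As a rough illustration of the flags above, the sketch below parses them with argparse and combines the three loss weights into one objective. It is not taken from train_meta.py: `build_parser` and `total_loss` are hypothetical names, the argument types are assumed, only the documented defaults for -r, -a and -b are filled in, and the weighted sum is an assumption about how the weights are applied.

```python
import argparse
import shlex


def build_parser():
    """Hypothetical parser covering only the flags documented above."""
    parser = argparse.ArgumentParser(description="Sketch of the MMKD flags")
    parser.add_argument("--model_s", type=str)          # student model name
    parser.add_argument("--teacher_num", type=int)      # number of teachers (type assumed)
    parser.add_argument("--distill", type=str)          # distillation method
    parser.add_argument("--ensemble_method", type=str)  # e.g. META
    parser.add_argument("-r", type=float, default=1.0)  # weight of the cross-entropy loss
    parser.add_argument("-a", type=float, default=1.0)  # weight of the KD loss
    parser.add_argument("-b", type=float, default=0.0)  # weight of other distillation losses
    parser.add_argument("--hard_buffer", action="store_true")
    parser.add_argument("--convs", action="store_true")
    return parser


def total_loss(loss_ce, loss_kd, loss_other, r=1.0, a=1.0, b=0.0):
    """Assumed combination: a plain weighted sum of the three loss terms."""
    return r * loss_ce + a * loss_kd + b * loss_other


if __name__ == "__main__":
    # A subset of the command shown above (flags not listed in the table are omitted).
    cmd = ("--model_s vgg8 --teacher_num 3 --distill inter "
           "--ensemble_method META -r 1 -a 1 -b 100 --hard_buffer --convs")
    args = build_parser().parse_args(shlex.split(cmd))
    # Placeholder loss values, chosen only to show the effect of the weights.
    print(total_loss(0.5, 0.25, 0.125, args.r, args.a, args.b))
```

With -b 100, as in the command above, the third term dominates unless the raw distillation loss is much smaller than the other two, which is one plausible reason for choosing such a large weight.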

Citation

If you find this repository useful, please consider citing the following paper: Adaptive Multi-Teacher Knowledge Distillation with Meta-Learning (ICME 2023).


Acknowledgement

The implementations of the compared methods are mainly based on the author-provided code and the open-source benchmarks https://github.com/HobbitLong/RepDistiller and https://github.com/alinlab/L2T-ww.

mmkd's People

Contributors

rorozhl


mmkd's Issues

About Paper

Congratulations on the acceptance of MMKD! When will the paper be released? Thanks!

Best
lujun

about error

File "D:\Multi-Teacher Knowledge Distillation\MMKD-main\helper\meta_optimizer.py", line 90, in meta_backward
a_new = (a[0].mul(1-lr*wd).add_(wd, a[1]).add_(p.grad.data),
AttributeError: 'NoneType' object has no attribute 'data'

How to solve it?

About pretrain model

Hello, I didn't find any instructions about the pre-trained models in the README. Where should I download them, or which code should I use to train them? Thanks!
