zejiangh / filter-gap

The official PyTorch implementation of CHEX: CHannel EXploration for CNN Model Compression (CVPR 2022). The paper is available at https://openaccess.thecvf.com/content/CVPR2022/papers/Hou_CHEX_CHannel_EXploration_for_CNN_Model_Compression_CVPR_2022_paper.pdf

License: Other

Dockerfile 0.11% Python 9.80% Shell 0.30% Cuda 1.41% C++ 0.38% Jupyter Notebook 87.92% C 0.08%

filter-gap's Issues

Training Logs

Hi! Thanks for the great work. I am trying to reproduce the ImageNet results but have run into some trouble. Could you provide the training logs (e.g., TensorFlow logs) for the 77.4 result for debugging purposes? Thanks a lot!

Pruning some layers completely as result of layer importance

Did you ever come across a situation in which the layer ratio for a certain layer reaches 0 (or 1, I am not sure which way around it is), so that the mask sets all channels to 0 and essentially renders the layer useless? How do you mitigate this?

Note that I reimplemented the paper in TensorFlow, so my code is slightly different.
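
One common way to guard against the collapse described above is to put a floor on the number of channels each layer keeps. The sketch below is only an illustration of that idea, not the CHEX repository's logic: `clamp_keep_counts`, `topk_mask`, and `min_keep` are hypothetical names, and the ratio is assumed to be a keep ratio (0 means prune everything).

```python
def clamp_keep_counts(channels, keep_ratios, min_keep=1):
    """Return how many channels to keep per layer, never fewer than min_keep.

    channels    -- number of channels in each layer
    keep_ratios -- assumed fraction of channels to KEEP in each layer
    min_keep    -- hypothetical floor that stops a layer from vanishing
    """
    counts = []
    for c, r in zip(channels, keep_ratios):
        r = min(max(r, 0.0), 1.0)           # guard against ratios outside [0, 1]
        k = min(int(round(c * r)), c)       # raw keep count from the ratio
        k = max(k, min(min_keep, c))        # apply the floor, capped at layer width
        counts.append(k)
    return counts


def topk_mask(scores, k):
    """Binary mask that keeps the k highest-scoring channels."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(order[:k])
    return [1 if i in keep else 0 for i in range(len(scores))]
```

With `min_keep=8`, a layer whose ratio has dropped to 0 still retains 8 channels, so its mask is never all zeros.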

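A question about datasets for image classification

Hello, and thank you very much for open-sourcing the code! If I want to train on CIFAR-10/100, do I need to change any training parameters other than the dataset path?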

CUDA error: device-side assert triggered

../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:276: operator(): block: [0,0,0], thread: [0,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:276: operator(): block: [0,0,0], thread: [1,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:276: operator(): block: [0,0,0], thread: [2,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
idx_dim < index_size && "index out of bounds" failed.
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:276: operator(): block: [0,0,0], thread: [122,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:276: operator(): block: [0,0,0], thread: [123,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:276: operator(): block: [0,0,0], thread: [124,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:276: operator(): block: [0,0,0], thread: [125,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:276: operator(): block: [0,0,0], thread: [126,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:276: operator(): block: [0,0,0], thread: [127,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
terminate called after throwing an instance of 'c10::CUDAError'
what(): CUDA error: device-side assert triggered
Exception raised from record at ../aten/src/ATen/cuda/CUDAEvent.h:119 (most recent call first):

Are there any suggestions?
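
The assertion text in the log says that an index passed to a gather or scatter operation fell outside the valid range. A frequent cause of this in classification training is a label outside `[0, num_classes)`, for example after switching datasets without resizing the classifier; whether that is the cause here is an assumption, since the issue does not include the training command. A minimal host-side check, using the hypothetical helper `find_out_of_range`, might look like this:

```python
def find_out_of_range(indices, size):
    """Return (position, value) pairs for indices outside [0, size).

    indices -- iterable of integer indices (e.g. class labels from one batch)
    size    -- extent of the indexed dimension (e.g. number of output classes)
    """
    return [(pos, idx) for pos, idx in enumerate(indices) if not 0 <= idx < size]


# Example: labels checked against a 10-class head.
bad = find_out_of_range([0, 9, 10, -1], 10)
print(bad)  # [(2, 10), (3, -1)]
```

Running the job with the environment variable `CUDA_LAUNCH_BLOCKING=1` makes kernel launches synchronous, which usually makes the Python stack trace point at the failing operation.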

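Structured network pruning?

Hello. According to the paper, CHEX should be a structured pruning method, so why do the provided checkpoints take up the same amount of storage at different compression rates? For example, resnet50_1g/2g/3g are all 195 MB.

Also, the README gives the resnet50_3g weights for resnet50_2g. Could you please update it? Thanks!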
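
One way to investigate the identical file sizes is to check whether pruned channels are stored as all-zero filters inside dense tensors, which would leave the size on disk unchanged. Whether the CHEX checkpoints are stored this way is not stated in the issue, so this is only a diagnostic sketch. It uses nested lists in place of real weight tensors, and both helper names are hypothetical.

```python
def zero_filter_indices(filters):
    """Indices of filters whose weights are all zero.

    filters -- list of filters, each a flat list of weights
    """
    return [i for i, f in enumerate(filters) if all(w == 0 for w in f)]


def dense_and_compact_param_counts(filters):
    """Parameter count as stored (dense) and after dropping all-zero filters."""
    zero = set(zero_filter_indices(filters))
    dense = sum(len(f) for f in filters)
    compact = sum(len(f) for i, f in enumerate(filters) if i not in zero)
    return dense, compact


# Toy layer with four filters, two of them fully zeroed.
layer = [[0, 0, 0], [1, 2, 0], [0, 0, 0], [3, 0, 4]]
print(zero_filter_indices(layer))             # [0, 2]
print(dense_and_compact_param_counts(layer))  # (12, 6)
```

If the dense and compact counts differ for a real checkpoint, the stored file holds masked weights rather than a physically slimmed network.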

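In the CLS task, are the initial BN-layer parameters kept in prev_model throughout training during the pruning stage? Is this a code logic issue or intentional design?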

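At the end of the IS_update_channel_mask function in prune_utils.py, all BN-layer parameters of the current model are assigned the parameters from the previous pruning step, and prev_model is then set to a deep copy of the current model after that assignment. As a result, every updated prev_model keeps holding the original BN-layer weights.

Should the assignment be applied only to the BN-layer parameters of the channels regrown by MRU?

Is there a problem with the code logic that updates the BN-layer parameters in IS_update_channel_mask?

Please explain.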
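
The behaviour reported above can be reproduced with a toy model in which the BN parameters are a plain dictionary of per-channel values. This is a simplified illustration of the logic as the issue describes it, not code from prune_utils.py; the function names, the fixed regrown channel, and the `+= 0.5` stand-in for training are all assumptions.

```python
import copy


def restore_all_then_snapshot(current_bn, prev_bn):
    """Reported behaviour: overwrite every BN value, then deep-copy."""
    for ch in current_bn:
        current_bn[ch] = prev_bn[ch]
    return copy.deepcopy(current_bn)


def restore_regrown_then_snapshot(current_bn, prev_bn, regrown):
    """Alternative raised in the issue: overwrite only regrown channels."""
    for ch in regrown:
        current_bn[ch] = prev_bn[ch]
    return copy.deepcopy(current_bn)


def simulate(update, rounds=3):
    """Alternate toy 'training' with the given update; return (initial, prev)."""
    initial = {0: 1.0, 1: 1.0, 2: 1.0}
    current = copy.deepcopy(initial)
    prev = copy.deepcopy(initial)
    for _ in range(rounds):
        for ch in current:          # stand-in for training between pruning steps
            current[ch] += 0.5
        prev = update(current, prev)
    return initial, prev


initial, prev_all = simulate(restore_all_then_snapshot)
_, prev_regrown = simulate(
    lambda cur, prev: restore_regrown_then_snapshot(cur, prev, regrown=[2])
)
print(prev_all == initial)  # True
print(prev_regrown)         # {0: 2.5, 1: 2.5, 2: 1.0}
```

In the first variant the snapshot never moves away from the initial values, which matches the concern in the issue; in the second, only the regrown channel is reset while the other channels keep their trained values.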

SSD detection

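Thank you very much for your excellent work and for open-sourcing it!
I am a little confused: the README says that SSD detection is trained for 650 epochs, which differs from the usual configuration (120 epochs).
Meanwhile, the supplementary material mentions training for 240k iterations, about 129 epochs (240k / (118287 / 64) ≈ 129).
Which configuration was actually used?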
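
The epoch estimate in this question can be reproduced with a short calculation. The figures (240k iterations, 118287 training images, batch size 64) are taken from the expression in the issue; the function name is illustrative.

```python
def iterations_to_epochs(iterations, dataset_size, batch_size):
    """Convert a number of training iterations into (fractional) epochs."""
    iterations_per_epoch = dataset_size / batch_size
    return iterations / iterations_per_epoch


epochs = iterations_to_epochs(240_000, 118_287, 64)
print(round(epochs, 2))  # 129.85
```

The result is roughly 130 epochs (129 when truncated), which is far from the 650 epochs listed in the README.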

Evaluation pretrained SSD model

Hi, I'm trying to reproduce the paper but am stuck at the SSD model's evaluation step.
I may be missing something, but the SSD instructions in this repo look partially wrong to me.
First, an SSD300 is initialized as a full model:

ssd300 = SSD300(backbone=ResNet(args.backbone, args.backbone_path))

Then the checkpoint is loaded:
load_checkpoint(ssd300.module if args.distributed else ssd300, args.checkpoint)

Then this model is evaluated:
acc = evaluate(ssd300, val_dataloader, cocoGt, encoder, inv_map, args)

The model has not been pruned or modified, so its parameter count and FLOPs stay identical to those of the original model, yet it is claimed to reduce FLOPs by 50%.
Is this wrong?
Many thanks for the explanation!
