
ecanet's Introduction

ECA-Net: Efficient Channel Attention

ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks

This is an implementation of ECA-Net (CVPR 2020, paper), created by Banggu Wu.

Poster

Introduction

Recently, channel attention mechanism has demonstrated to offer great potential in improving the performance of deep convolutional neural networks (CNNs). However, most existing methods dedicate to developing more sophisticated attention modules for achieving better performance, which inevitably increase model complexity. To overcome the paradox of performance and complexity trade-off, this paper proposes an Efficient Channel Attention (ECA) module, which only involves a handful of parameters while bringing clear performance gain. By dissecting the channel attention module in SENet, we empirically show avoiding dimensionality reduction is important for learning channel attention, and appropriate cross-channel interaction can preserve performance while significantly decreasing model complexity. Therefore, we propose a local cross-channel interaction strategy without dimensionality reduction, which can be efficiently implemented via 1D convolution. Furthermore, we develop a method to adaptively select kernel size of 1D convolution, determining coverage of local cross-channel interaction. The proposed ECA module is efficient yet effective, e.g., the parameters and computations of our modules against backbone of ResNet50 are 80 vs. 24.37M and 4.7e-4 GFLOPs vs. 3.86 GFLOPs, respectively, and the performance boost is more than 2% in terms of Top-1 accuracy. We extensively evaluate our ECA module on image classification, object detection and instance segmentation with backbones of ResNets and MobileNetV2. The experimental results show our module is more efficient while performing favorably against its counterparts.

Citation

@InProceedings{wang2020eca,
   title={ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks},
   author={Qilong Wang and Banggu Wu and Pengfei Zhu and Peihua Li and Wangmeng Zuo and Qinghua Hu},
   booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
   year={2020}
 }

Changelog

2020/02/26 Upload ECA-Resnet34 model.

2020/03/05 Upload RetinaNet-ecanet50 and RetinaNet-ecanet101 model.

2020/03/24 Update the Introduction and Citation.

2020/03/30 Upload ECA-Resnet18 model.

2020/05/06 Update the poster.

ECA module

ECA_module

Comparison of (a) SE block and (b) our efficient channel attention (ECA) module. Given the aggregated feature using global average pooling (GAP), SE block computes weights using two FC layers. Differently, ECA generates channel weights by performing a fast 1D convolution of size k, where k is adaptively determined via a function of channel dimension C.
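The ECA branch described above can be sketched in PyTorch as follows. This is a minimal illustration and not necessarily the exact code in models/eca_module.py; the class name `ECALayer` is hypothetical, and `k_size` is assumed to be odd so that the padding keeps the channel length unchanged.

```python
import torch
import torch.nn as nn


class ECALayer(nn.Module):
    """Sketch of an ECA block: GAP -> 1D conv over channels -> sigmoid gate."""

    def __init__(self, k_size=3):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        # One shared k-tap filter slides along the channel axis (k weights in total).
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size,
                              padding=(k_size - 1) // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        b, c, _, _ = x.shape
        y = self.avg_pool(x).view(b, 1, c)  # channel descriptor as a length-C sequence
        y = self.sigmoid(self.conv(y))      # per-channel weights in (0, 1)
        return x * y.view(b, c, 1, 1)       # rescale each input channel
```

Because the 1D filter is shared across all channels, the layer holds only `k_size` learnable weights regardless of the channel dimension C.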

Installation

Requirements

  • Python 3.5+
  • PyTorch 1.0+
  • thop

Our environments

  • OS: Ubuntu 16.04
  • CUDA: 9.0/10.0
  • Toolkit: PyTorch 1.0/1.1
  • GPU: RTX 2080 Ti/TITAN Xp

Start Up

Train with ResNet

You can run main.py to train or evaluate a model as follows:

CUDA_VISIBLE_DEVICES={device_ids} python main.py -a {model_name} --ksize {eca_kernel_size} {path to your dataset}

For example:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py -a eca_resnet50 --ksize 3557 ./datasets/ILSVRC2012/images
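The value `3557` appears to encode one kernel size per ResNet stage, which would match the checkpoint name eca_resnet50_k3557 and the [3, 5, 5, 7] setting discussed in the issues. A minimal sketch of that reading, assuming single-digit kernel sizes; `parse_ksize` is a hypothetical helper, and how main.py actually parses the flag is not shown here:

```python
def parse_ksize(ksize):
    """Split a digit string such as "3557" into per-stage kernel sizes.

    Assumption: every kernel size is a single digit, one per stage.
    """
    return [int(digit) for digit in str(ksize)]


print(parse_ksize("3557"))  # [3, 5, 5, 7]
```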

Train with MobileNet_v2

The procedure is the same as for ResNet above; just replace main.py with light_main.py.

Compute the parameters and FLOPs

If you have installed thop, you can run paras_flops.py to compute the parameters and FLOPs of our models. The usage is as follows:

python paras_flops.py -a {model_name}
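As a rough cross-check that does not need thop, the extra parameters contributed by the ECA layers alone can be counted by hand. The sketch below assumes one ECA layer per residual block, each holding a bias-free 1D convolution with k weights, the standard ResNet-50 block counts [3, 4, 6, 3], and the per-stage kernel sizes [3, 5, 5, 7] implied by the eca_resnet50_k3557 checkpoint name; `eca_param_count` is a hypothetical helper.

```python
def eca_param_count(blocks_per_stage, k_sizes):
    """Total ECA weights: each block adds one bias-free 1D conv with k weights."""
    return sum(n_blocks * k for n_blocks, k in zip(blocks_per_stage, k_sizes))


# ResNet-50 has [3, 4, 6, 3] bottleneck blocks in its four stages.
print(eca_param_count([3, 4, 6, 3], [3, 5, 5, 7]))  # 80
```

Under these assumptions the result agrees with the figure of 80 parameters quoted in the Introduction.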

Experiments

ImageNet

| Model | Param. | FLOPs | Top-1 (%) | Top-5 (%) | BaiduDrive (models) | Extract code | GoogleDrive |
|---|---|---|---|---|---|---|---|
| ECA-Net18 | 11.15M | 1.70G | 70.92 | 89.93 | eca_resnet18_k3577 | utsy | eca_resnet18_k3577 |
| ECA-Net34 | 20.79M | 3.43G | 74.21 | 91.83 | eca_resnet34_k3357 | o4dh | eca_resnet34_k3357 |
| ECA-Net50 | 24.37M | 3.86G | 77.42 | 93.62 | eca_resnet50_k3557 | no6u | eca_resnet50_k3557 |
| ECA-Net101 | 42.49M | 7.35G | 78.65 | 94.34 | eca_resnet101_k3357 | iov1 | eca_resnet101_k3357 |
| ECA-Net152 | 57.41M | 10.83G | 78.92 | 94.55 | eca_resnet152_k3357 | xaft | eca_resnet152_k3357 |
| ECA-MobileNet_v2 | 3.34M | 319.9M | 72.56 | 90.81 | eca_mobilenetv2_k13 | atpt | eca_mobilenetv2_k13 |

COCO 2017

Detection with Faster R-CNN and Mask R-CNN

| Model | Param. | FLOPs | AP | AP_50 | AP_75 | Pre-trained models | Extract code | GoogleDrive |
|---|---|---|---|---|---|---|---|---|
| Faster_R-CNN_ecanet50 | 41.53M | 207.18G | 38.0 | 60.6 | 40.9 | faster_rcnn_ecanet50_k5_bs8_lr0.01 | divf | faster_rcnn_ecanet50_k5_bs8_lr0.01 |
| Faster_R-CNN_ecanet101 | 60.52M | 283.32G | 40.3 | 62.9 | 44.0 | faster_rcnn_ecanet101_3357_bs8_lr0.01 | d3kd | faster_rcnn_ecanet101_3357_bs8_lr0.01 |
| Mask_R-CNN_ecanet50 | 44.18M | 275.69G | 39.0 | 61.3 | 42.1 | mask_rcnn_ecanet50_k3377_bs8_lr0.01 | xe19 | mask_rcnn_ecanet50_k3377_bs8_lr0.01 |
| Mask_R-CNN_ecanet101 | 63.17M | 351.83G | 41.3 | 63.1 | 44.8 | mask_rcnn_ecanet101_k3357_bs8_lr0.01 | y5e9 | mask_rcnn_ecanet101_k3357_bs8_lr0.01 |
| RetinaNet_ecanet50 | 37.74M | 239.43G | 37.3 | 57.7 | 39.6 | RetinaNet_ecanet50_k3377_bs8_lr0.01 | my44 | RetinaNet_ecanet50_k3377_bs8_lr0.01 |
| RetinaNet_ecanet101 | 56.74M | 315.57G | 39.1 | 59.9 | 41.8 | RetinaNet_ecanet101_k3357_bs8_lr0.01 | 2eu5 | RetinaNet_ecanet101_k3357_bs8_lr0.01 |

Instance segmentation with Mask R-CNN

| Model | Param. | FLOPs | AP | AP_50 | AP_75 | Pre-trained models | Extract code | GoogleDrive |
|---|---|---|---|---|---|---|---|---|
| Mask_R-CNN_ecanet50 | 44.18M | 275.69G | 35.6 | 58.1 | 37.7 | mask_rcnn_ecanet50_k3377_bs8_lr0.01 | xe19 | mask_rcnn_ecanet50_k3377_bs8_lr0.01 |
| Mask_R-CNN_ecanet101 | 63.17M | 351.83G | 37.4 | 59.9 | 39.8 | mask_rcnn_ecanet101_k3357_bs8_lr0.01 | y5e9 | mask_rcnn_ecanet101_k3357_bs8_lr0.01 |
| RetinaNet_ecanet50 | 37.74M | 239.43G | 35.6 | 58.1 | 37.7 | RetinaNet_ecanet50_k3377_bs8_lr0.01 | my44 | RetinaNet_ecanet50_k3377_bs8_lr0.01 |
| RetinaNet_ecanet101 | 56.74M | 315.57G | 37.4 | 59.9 | 39.8 | RetinaNet_ecanet101_k3357_bs8_lr0.01 | 2eu5 | RetinaNet_ecanet101_k3357_bs8_lr0.01 |

Contact Information

If you have any suggestions or questions, you can leave a message here or contact us directly: [email protected] . Thanks for your attention!

ecanet's People

Contributors

bangguwu, developer0hye

ecanet's Issues

Why do you use only kernel_size=3?

An adaptive kernel_size is proposed with Eq. (9) in your paper.

But in your source code, kernel_size is not calculated by Eq. (9); it is always set to 3.

Is there any specific reason to use only 3?

Channel and spatial attention mechanism

Very good idea! In your conclusion you mention further investigating the incorporation of ECA with a spatial attention module. I think the spatial attention mechanism is formed by the sliding of the convolution kernel. Do you have any thoughts on the spatial attention mechanism now?
Thanks!

Adaptive Selection

Hello, thank you for the code! I wanted to ask if the adaptive selection of the k size was implemented in this repository?

Unable to load_state_dict from eca_resnet50_k3557.pth.tar

Hi,

I tried to load_state_dict from the eca_resnet50_k3557.pth.tar file after renaming the keys, but I get a RuntimeError: Error(s) in loading state_dict for ResNet:
size mismatch for layer2.0.eca.conv.weight: copying a param with shape torch.Size([1, 1, 5]) from checkpoint, the shape in current model is torch.Size([1, 1, 3]).
size mismatch for layer2.1.eca.conv.weight: copying a param with shape torch.Size([1, 1, 5]) from checkpoint, the shape in current model is torch.Size([1, 1, 3]).
The same kind of mismatch occurs from layer2 through layer4.

Could you please let me know how to solve this?

Thanks!

About the pre-trained ECA-Net152 model

When I load the pre-trained model with the default parameters, an error occurs:
size mismatch for layer3.0.eca.conv.weight: copying a param with shape torch.Size([1, 1, 5]) from checkpoint, the shape in current model is torch.Size([1, 1, 3]).
It can be solved by changing the following code at line 200:
k_size=[3, 3, 3, 3] to k_size=[3, 3, 5, 7]
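Rather than guessing k_size, the per-stage kernel sizes can be read from the checkpoint itself, since the last dimension of each eca.conv.weight is the kernel size. A minimal sketch, assuming keys of the form `layerN.M.eca.conv.weight` as in the error message above; `infer_eca_kernel_sizes` is a hypothetical helper that takes a mapping from key to shape tuple:

```python
import re


def infer_eca_kernel_sizes(shapes):
    """Return one ECA kernel size per stage from a {key: shape} mapping.

    Assumption: ECA weights are stored under "...layerN.M.eca.conv.weight"
    with shape (1, 1, k), and all blocks in a stage share the same k.
    """
    per_stage = {}
    for key, shape in shapes.items():
        match = re.search(r"layer(\d+)\.\d+\.eca\.conv\.weight$", key)
        if match:
            per_stage[int(match.group(1))] = shape[-1]
    return [per_stage[stage] for stage in sorted(per_stage)]
```

With a loaded PyTorch state dict, the mapping can be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`, and the returned list passed as k_size when constructing the model.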

Fix suggestion EcaNet-18

Hello, great work. I tried many of your eca-resnet models and all work great, except eca-resnet18. There is a slight mistake in the line of code below: the kernel sizes are hard-coded, and it should be k_size=k_size.

model = ResNet(ECABasicBlock, [2, 2, 2, 2], num_classes=num_classes, k_size=[3, 3, 3, 3])

ECA-NS implementation

Thanks for your great work.

I want to apply ECA-NS. I found a previous issue, and your answer there was:
Use nn.functional.unfold() and conv1d with inchannel=C, outchannel=C and group=C.
I don't understand how to use it. Is there any code for it?
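One possible reading of ECA-NS is that every channel gets its own k weights over its k neighbouring channels, instead of one filter shared by all channels. The sketch below takes a different route from the quoted suggestion: it uses `Tensor.unfold` to gather each channel's neighbourhood and a per-channel weight matrix, rather than `nn.functional.unfold()` with a grouped conv1d. The class name `ECANoShare`, the zero padding at the channel boundaries, and the weight initialisation are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ECANoShare(nn.Module):
    """Hypothetical ECA-NS sketch: one k-tap filter per channel (k * C weights)."""

    def __init__(self, channels, k_size=3):
        super().__init__()
        self.k_size = k_size  # assumed odd
        self.weight = nn.Parameter(torch.randn(channels, k_size) * 0.1)

    def forward(self, x):
        b, c, _, _ = x.shape
        y = x.mean(dim=(2, 3))                 # global average pooling -> [b, c]
        pad = (self.k_size - 1) // 2
        y = F.pad(y, (pad, pad))               # zero-pad the channel axis -> [b, c + 2*pad]
        windows = y.unfold(1, self.k_size, 1)  # neighbourhood of each channel -> [b, c, k]
        attn = torch.sigmoid((windows * self.weight).sum(dim=-1))  # [b, c]
        return x * attn.view(b, c, 1, 1)
```

Compared with the shared-filter ECA layer, this variant stores `channels * k_size` weights, which is the cost of dropping parameter sharing.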

Version in keras

Hi, I saw this project and I want to try it, but my project is all in Keras.
(I have never used PyTorch.)
Has anyone translated the code into Keras?
Is there a way to make this code compatible with Keras?

Thanks.

The accuracy is not correct

In Table 3 of the paper, the Top-1 accuracy of GSoP-Net1 is 77.68, while the one in their paper is 77.98. Which one is correct?

Also, the Top-1 accuracies of ResNet and SENet in AANet are much higher than the ones in your paper (75.2 vs 76.4 for ResNet, 76.71 vs 77.5 for SENet). Could you please tell me why?

Comparison with RCAN?

Dear author:

This is great research in the field of channel attention. However, I have a small question: did you compare your network with RCAN (Image Super-Resolution Using Very Deep Residual Channel Attention Networks)?

It seems that the channel attention module proposed in RCAN is very similar to your architecture: they apply 1x1 convolutions on all channels, first downsampling and then upsampling, instead of using k neighbors. In terms of the number of parameters, yours should indeed be smaller because of the parameter-sharing strategy. But I still wonder whether you compared both performance and parameter size between your module and RCAN.

I would appreciate it if you could reply. Thanks.

Unused `channel` param

I found it interesting that in your ECA implementation there is an unused parameter channel. What is it for?

class eca_layer(nn.Module):
    def __init__(self, channel, k_size=3):
        super(eca_layer, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size, padding=(k_size - 1) // 2, bias=False) 
        self.sigmoid = nn.Sigmoid()

A question on adaptive selection of kernel size

Thank you for your work on ECANet. I have a question. Although the parameter k doesn't need to be tuned manually, the parameters gamma and b in Eq. (9) still need to be selected properly. So I'm wondering whether we still have to tune these parameters carefully to obtain good results, or whether there are any tricks for selecting them. Looking forward to your guidance. Thank you!

Need to change state_dict while using pretrained model? performance not going up

@BangguWu I am very interested in your work, which is impressive. I tried to reimplement it in detectron2 (Faster R-CNN), but the performance doesn't really go up. I notice that you evaluated the object detection task using the mmdetection framework. Could you please tell me whether, apart from adding the ECA layer into ResNet, there is anything else I should modify?

One more question: when training on the COCO dataset, did you use the pretrained (ECA+ResNet) backbone or just the pretrained (ResNet) backbone? Is there a large difference between them on the COCO detection task? Thanks

Conflicting numbers for ECANet-50?

Hello

The Top-1 and Top-5 accuracies for ECA-Net 50 (ResNet 50 + ECA) reported in the paper and the poster differ from those reported in the readme of this repository. For instance, the paper states that Top-1 is 77.48 while the repository states it's 77.42. Could you please clarify which are the correct numbers? Thanks!

Any plans to provide ECA-resnet.py on mmdetection?

Hi,
I tried your code on ImageNet and got competitive performance. But when I added your eca-net code to resnet.py of mmdetection, I couldn't get the same results as the paper with RetinaNet. Your work is insightful; do you have any plans to provide code for mmdetection? Thanks!

Applying ECA on 3D inputs?

Hi, I was wondering if this could be applied to models dealing with 3D inputs. I'm not sure why the original code squeezes the width dimension out. For a 3D input with X, Y and Z, which dimensions should be squeezed out? Should it be both X and Z? Would the code below be correct?

def forward(self, x):
    # x: input features with shape [b, c, z, h, w]
    b, c, z, h, w = x.size()

    # feature descriptor on the global spatial information
    # (assumes self.avg_pool is nn.AdaptiveAvgPool3d(1), giving [b, c, 1, 1, 1])
    y = self.avg_pool(x)

    # 1D convolution across channels: [b, c, 1, 1, 1] -> [b, 1, c] -> conv -> [b, c, 1, 1, 1]
    y = self.conv(y.squeeze(-1).squeeze(-2).transpose(-1, -2)).transpose(-1, -2).unsqueeze(-2).unsqueeze(-1)

    # per-channel attention weights in (0, 1)
    y = self.sigmoid(y)

    return x * y.expand_as(x)
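For volumetric inputs, one way to avoid the chain of squeeze and unsqueeze calls is to pool over all three spatial dimensions and reshape explicitly. The sketch below is an untested-in-practice adaptation rather than code from this repository; the class name `ECALayer3D` is hypothetical, and it assumes inputs of shape [b, c, z, h, w] and an odd `k_size`.

```python
import torch
import torch.nn as nn


class ECALayer3D(nn.Module):
    """Sketch of ECA for 5D inputs: pool over (z, h, w), then 1D conv over channels."""

    def __init__(self, k_size=3):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool3d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size,
                              padding=(k_size - 1) // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        b, c = x.shape[:2]
        y = self.avg_pool(x).view(b, 1, c)    # [b, c, 1, 1, 1] -> [b, 1, c]
        y = self.sigmoid(self.conv(y))        # attention over the channel axis
        return x * y.view(b, c, 1, 1, 1)      # broadcast over z, h and w
```

In this reading all spatial dimensions are pooled away, so the attention still acts on channels only, exactly as in the 2D case.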

About kernel size

Hello, would you mind telling me how to define the kernel size for different channel dimensions? Should I implement Eq. (9) myself to define it? Thank you very much.

Implementation of ECA module

Why is the implementation of ECA module in your project different from the implementation in your paper? Specifically, the computation of the kernel size.

MMDetection issues

Upon trying to use the Mask R-CNN ECANet-50 weights with MMDetection for simple inference, we (@iyaja) faced several issues which are listed below:

  1. Kernel Size discrepancy in the backbone ECANet-50 which we solved by having k_size to be 3 for the first two blocks and 7 for the last two in the backbone.
  2. For the mask head, the ECANet uses 81 classes instead of 80 which is used as default in MMDetection configs where the +1 class is accounted for the background class.
  3. For the Regression head, because of the 81 classes the input layer in the Linear layer now changed to 324 instead of 320 as in the default MMDetection config.
  4. Because MMDetection was updated with the bbox_head and mask_head keys being shifted to roi_head super keys, there is a key mismatch between the state_dict of the weights provided in this repository and the config in MMDetection.

Even after fixing the above issues, we were not able to obtain valid results, example:
download
Expected result:
coco_test_12510

We would like the authors to share which commit hash of MMDetection they used to obtain the results. Additionally, it would be very beneficial if the authors could add a dedicated folder providing the code to run both training and inference for all the object detection and segmentation models presented in the paper.

Unable to open eca_resnet50_k3557.pth.tar

Hi,

I tried to open the eca_resnet50_k3557.pth.tar file on Linux, but I am unable to open it; the file seems to be corrupted.
Could you please check the file?

Thanks

What is the structure of SE-Var2 and SE-Var3

Hi, thank you for your novel and interesting work.

Page 3 to 4 of the paper
"""
Additionally, SE-Var3 employing one single FC layer performs better than two FC layers with dimensionality reduction in SE block.
"""
I think SE-Var2 has one FC layer and SE-Var3 should have two FC layers (according to the Param. column of Table 1). Is there a mistake in this sentence, or is it my misunderstanding? What is the structure of SE-Var2 and SE-Var3?

About parameters and FLOPs?

Why are the parameters and FLOPs of the released code different from those reported in your paper? (ResNet-50 parameters: 25.56M vs 24.37M; FLOPs: 4.12G vs 3.86G)

Kernel Size Discrepancy

As described in this issue along with many others, the codebase uses a fixed kernel size of 3 for each layer in the ResNet-50 architecture, as shown in this code file. However, calculating with the adaptive kernel size formula gives kernel sizes of [3,5,5,5].

  • Although the kernel size formula says the values should be [3,5,5,5] and the codebase has the values set to [3,3,3,3], the authors used [3,5,5,7] in their ECANet-50, which differs from both the fixed values in the codebase and the values obtained from the formula.
  • Additionally, the results for Mask R-CNN, which uses the pretrained ResNet-50 backbone, use kernel sizes [3,3,7,7], which differ from the [3,5,5,7] originally used to train ECA-ResNet-50. Why is that?

I would encourage the authors to clarify the reason for using kernel sizes different from those obtained from the kernel size formula presented in the paper, which, as stated in the paper, is used for all experiments.
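The [3,5,5,5] figure can be reproduced with a short calculation. The sketch below assumes the adaptive rule has the form k = |log2(C)/gamma + b/gamma| rounded to an odd number, with gamma = 2 and b = 1, and that "odd" is obtained by truncating and then bumping even values up by one; both the constants and the rounding convention are assumptions here, as are the per-stage channel counts 64, 128, 256 and 512. The function name `adaptive_kernel_size` is hypothetical.

```python
import math


def adaptive_kernel_size(channels, gamma=2, b=1):
    """Kernel size from the channel dimension, forced to be odd.

    Assumed rounding: truncate (log2(C) + b) / gamma, then add 1 if even.
    """
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


print([adaptive_kernel_size(c) for c in (64, 128, 256, 512)])  # [3, 5, 5, 5]
```

A different rounding convention or a different choice of channel counts per stage (for example the expanded bottleneck widths) would give different values, which may be part of the discrepancy raised above.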

Adaptive Kernel Size not implemented

It seems that the implementation under models/ differs from the paper's adaptive setting for k_size, given in Fig 3 of the paper.

Why do the resnet, mobilenet modules not use adaptive kernel sizes?

Training with custom datasets and labels configuration

Hello,

I would like to know about the dataset folder configuration to perform my training on detecting objects.

Something like this:

  • custom_dataset
    --- images
    ------- img1.jpg
    ------- img2.jpg
    ....
    --- labels
    ------- img1.txt
    ------- img2.txt
    ....

And about label configuration, for instance:
img1.txt
class,x,y,w,h

Cannot reproduce the results of the paper

First, many thanks to the authors for releasing the code. I downloaded the code and the pre-trained model and used main.py to evaluate on ILSVRC2012, but the results are far worse than those shown in the paper. Here are some details:
Screenshot from 2019-11-05 20-58-35

Downloading Pretrained Model

Hi, I was trying to download the resnet101 pretrained model, but have been unable to do so from the Baidu link. Is there any other way I could get the pretrained model?

VGG-16 Alexnet

Thank you for the nice work,

Can you share how to add eca-layer on Alexnet and VGG-16?

Pre-trained models

Hello Folks,

I am facing an issue while trying to use the pre-trained models. Here are the steps that I have followed.

  1. I downloaded the "eca_resnet50_k3557.pth.tar" from google drive.
  2. Constructed the model using models/eca_resnet.py and models/eca_module.py

Now the pre-trained model from step 1 has "module.conv1.." as its state keys while the model from step 2 has "conv1.weight.", which are key mismatches. I also found that the pre-trained model from step 1 has a key "arch = 'se_resnet50'".

Question 1. Is this pre-trained model not of ECA Net but of SE Net?
Question 2. How to solve the key mismatch issues?

Could you please let me know how to solve this?
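A "module." prefix on every key is what PyTorch's nn.DataParallel wrapper adds when a wrapped model is saved, so one common workaround is to strip that prefix before loading. A minimal sketch; `strip_module_prefix` is a hypothetical helper, not part of this repository:

```python
def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading DataParallel prefix removed."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }
```

Assuming the checkpoint stores its weights under a "state_dict" entry (an assumption about the file layout), usage would look like `model.load_state_dict(strip_module_prefix(checkpoint["state_dict"]))`. The model must also be built with the kernel sizes the checkpoint was trained with, otherwise the size mismatch errors reported in other issues will still occur.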
