FcaNet: Frequency Channel Attention Networks

PyTorch implementation of the paper "FcaNet: Frequency Channel Attention Networks".

Simplest usage

Models pretrained on ImageNet can be loaded directly through torch.hub, without any configuration or installation:

import torch

# Pick one of the following, depending on the backbone depth you need.
model = torch.hub.load('cfzd/FcaNet', 'fca34', pretrained=True)
model = torch.hub.load('cfzd/FcaNet', 'fca50', pretrained=True)
model = torch.hub.load('cfzd/FcaNet', 'fca101', pretrained=True)
model = torch.hub.load('cfzd/FcaNet', 'fca152', pretrained=True)

Install

Please see INSTALL.md

Models

Classification models on ImageNet

The models were trained in FP16 but are provided in FP32. Because of this conversion, the evaluation results differ slightly from the reported results (by at most -0.06%/+0.05%).

Model	Reported Top-1 Acc (%)	Evaluated Top-1 Acc (%)	Link
FcaNet34	75.07	75.02	GoogleDrive/BaiduDrive(code:m7v8)
FcaNet50	78.52	78.57	GoogleDrive/BaiduDrive(code:mgkk)
FcaNet101	79.64	79.63	GoogleDrive/BaiduDrive(code:8t0j)
FcaNet152	80.08	80.02	GoogleDrive/BaiduDrive(code:5yeq)
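As a quick sanity check of the stated bound, the gap between the two accuracy columns can be recomputed from the table. This is only a small arithmetic sketch: the numbers are copied from the table above, and the variable names are illustrative.

```python
# (reported, evaluated) accuracy in percent, copied from the table above.
results = {
    "FcaNet34": (75.07, 75.02),
    "FcaNet50": (78.52, 78.57),
    "FcaNet101": (79.64, 79.63),
    "FcaNet152": (80.08, 80.02),
}

# Evaluated minus reported, rounded to the table's two-decimal precision.
deltas = {
    name: round(evaluated - reported, 2)
    for name, (reported, evaluated) in results.items()
}

largest_drop = min(deltas.values())
largest_gain = max(deltas.values())

for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
print(f"range: {largest_drop:+.2f} / {largest_gain:+.2f}")
```

The largest drop is -0.06 (FcaNet152) and the largest gain is +0.05 (FcaNet50), which agrees with the bound quoted above.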

Detection and instance segmentation models on COCO

Model	Backbone	AP	AP50	AP75	Link
Faster RCNN	FcaNet50	39.0	61.1	42.3	GoogleDrive/BaiduDrive(code:q15c)
Faster RCNN	FcaNet101	41.2	63.3	44.6	GoogleDrive/BaiduDrive(code:pgnx)
Mask RCNN	Fca50 det	40.3	62.0	44.1	GoogleDrive/BaiduDrive(code:d9rn)
Mask RCNN	Fca50 seg	36.2	58.6	38.1	GoogleDrive/BaiduDrive(code:d9rn)

Training

Please see launch_training_classification.sh and launch_training_detection.sh for training on ImageNet and COCO, respectively.

Testing

Please see launch_eval_classification.sh and launch_eval_detection.sh for testing on ImageNet and COCO, respectively.

FAQ

Since the paper was uploaded to arXiv, many academic peers have asked us: the proposed DCT basis can be viewed as a simple tensor, so why not learn that tensor directly? Why use DCT instead of a learnable tensor? Couldn't a learnable tensor be better than DCT?

Our concrete answer is: the proposed fixed DCT tensor performs better than a learned one, counter-intuitive as that may be.

Method	ImageNet Top-1 Acc	Link
Learnable tensor, random initialization	77.914	GoogleDrive/BaiduDrive(code:p2hl)
Learnable tensor, DCT initialization	78.352	GoogleDrive/BaiduDrive(code:txje)
Fixed tensor, random initialization	77.742	GoogleDrive/BaiduDrive(code:g5t9)
Fixed tensor, DCT initialization (Ours)	78.574	GoogleDrive/BaiduDrive(code:mgkk)

To verify these results, select the corresponding type of tensor in L73-L83 of model/layer.py, uncomment it, and train the whole network.
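To make "the DCT basis can be viewed as a simple tensor" concrete, here is a minimal pure-Python sketch that builds one 2D DCT basis as a plain grid of numbers. It assumes the standard orthonormal DCT-II definition; the function names and the 7x7 size are hypothetical choices for illustration, and this is not the code from model/layer.py.

```python
import math


def dct_basis_1d(freq, length):
    """1D DCT-II basis vector for one frequency index (orthonormal scaling)."""
    scale = math.sqrt(1.0 / length) if freq == 0 else math.sqrt(2.0 / length)
    return [
        scale * math.cos(math.pi * freq * (pos + 0.5) / length)
        for pos in range(length)
    ]


def dct_basis_2d(u, v, height, width):
    """2D basis as the outer product of two 1D bases: a height x width grid."""
    col = dct_basis_1d(u, height)
    row = dct_basis_1d(v, width)
    return [[c * r for r in row] for c in col]


def inner(a, b):
    """Element-wise inner product of two equally sized grids."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical 7x7 spatial size, chosen only for illustration.
basis_00 = dct_basis_2d(0, 0, 7, 7)  # lowest frequency: a constant grid
basis_01 = dct_basis_2d(0, 1, 7, 7)  # one horizontal frequency step

print(inner(basis_00, basis_00))  # close to 1: unit norm
print(inner(basis_00, basis_01))  # close to 0: distinct bases are orthogonal
```

The "fixed tensor, DCT initialization" row of the table corresponds to keeping such a grid constant during training, while the "learnable" rows let the optimizer update it. In a PyTorch module, a fixed tensor is typically stored with `register_buffer` and a learnable one as `torch.nn.Parameter`; which of these appears at the referenced lines of model/layer.py is not shown here.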

TODO

Object detection models
Instance segmentation models
Fix the incorrect results of detection models
Make switching between configs easier
