Official implementation of SPANet in ICCV2023

Home Page: https://doranlyong.github.io/projects/spanet/



SPANet Official (ongoing)

💬 This repo is the official implementation of:

  • ICCV2023: SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation

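🤖 It currently includes code and models for the following tasks:

  • ImageNet-1K image classification
  • COCO object detection and instance segmentation
  • ADE20K semantic segmentation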

📖 Introduction

SPANet is a backbone network designed to balance the high- and low-frequency components of its features for optimal feature representations.
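To make "high- and low-frequency components" concrete, here is a minimal NumPy sketch that splits a 2D array into a low-frequency part and a high-frequency residual using an ideal radial mask in the Fourier domain. This is only an illustration of the terminology, not SPANet's token mixer; the function name `split_frequencies` and the `cutoff` parameter are assumptions made for this example.

```python
import numpy as np

def split_frequencies(image, cutoff=0.25):
    """Split a 2D array into low- and high-frequency parts.

    `cutoff` (hypothetical parameter) is a radius in cycles per sample;
    frequencies at or below it are treated as "low".
    """
    image = np.asarray(image, dtype=np.float64)
    h, w = image.shape
    # Frequency coordinates in the unshifted FFT layout.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    low_mask = np.sqrt(fy ** 2 + fx ** 2) <= cutoff
    # Keep only the low frequencies, then transform back to the spatial domain.
    low = np.fft.ifft2(np.fft.fft2(image) * low_mask).real
    # The high-frequency part is whatever the low-pass result leaves out.
    high = image - low
    return low, high

# A constant image is purely low-frequency; fine stripes are purely high-frequency.
flat_low, flat_high = split_frequencies(np.ones((8, 8)))
stripes = np.tile([1.0, -1.0], (8, 4))
stripe_low, stripe_high = split_frequencies(stripes)
```

By construction the two parts always sum back to the input, so the split only redistributes the signal between the two bands.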

Main results on ImageNet-1K

Please see image_classification for more details.

Model      Pretrain     Resolution  Top-1 (%)  #Param.  FLOPs
SPANet-S   ImageNet-1K  224x224     83.1       28.7M    4.6G
SPANet-M   ImageNet-1K  224x224     83.5       41.8M    6.8G
SPANet-MX  ImageNet-1K  224x224     83.8       54.9M    9.0G
SPANet-B   ImageNet-1K  224x224     84.0       75.9M    12.0G
SPANet-BX  ImageNet-1K  224x224     84.4       99.8M    15.8G

Main results on COCO object detection and instance segmentation

Please see object_detection for more details.

RetinaNet 1x

Backbone  Lr Schd  box mAP  #params
SPANet-S  1x       43.3     38M
SPANet-M  1x       44.0     51M

Mask R-CNN 1x

Backbone  Lr Schd  box mAP  mask mAP  #params
SPANet-S  1x       44.7     40.6      48M
SPANet-M  1x       45.2     41.0      61M

Main results on ADE20K semantic segmentation

Please see semantic_segmentation for more details.

Semantic FPN

Backbone  Lr Schd  mIoU  #params  FLOPs
SPANet-S  80K      45.4  32M      46G
SPANet-M  80K      46.2  45M      57G

โญ Cite SPANet

If you find this repository useful, please give it a star and cite it with the following BibTeX entry.

@inproceedings{yun2023spanet,
  title={SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation},
  author={Yun, Guhnoo and Yoo, Juhan and Kim, Kijung and Lee, Jeongho and Kim, Dong Hwan},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={6113--6124},
  year={2023}
}

License

This project is released under the MIT license. Please see the LICENSE file for more information.

spanet-official's Issues

Question about Figure 1 in the paper

Hello Authors, your work is very constructive in the field.
However, when I try to follow your work, i.e., visualise the spectrum of different modules, I have difficulty finding a clean and contrasting result as shown in your Figure 1. This may have something to do with the samples used and code replication. Please provide better samples or visualisations, thank you very much!

Code for Figure 1

Hello, I would like to know how to visualize the Fourier spectrum maps (Figure 1 in the main paper). Could you please provide some toy code?
I am looking forward to your reply. Thank you!
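As a starting point for the question above, here is a minimal sketch of one common way to compute a Fourier spectrum map from a feature map: take the 2D FFT per channel, shift the zero frequency to the center, and average the log-scaled amplitude over channels. It is not the authors' script for Figure 1; the function name `log_amplitude_spectrum`, the `(C, H, W)` input layout, and the random input standing in for a real feature map are all assumptions.

```python
import numpy as np

def log_amplitude_spectrum(feature_map, eps=1e-8):
    """Channel-averaged, centered log-amplitude spectrum of a (C, H, W) array."""
    fmap = np.asarray(feature_map, dtype=np.float64)
    # 2D FFT over the spatial axes of every channel.
    spectrum = np.fft.fft2(fmap, axes=(-2, -1))
    # Move the zero-frequency (DC) term to the center of each map.
    spectrum = np.fft.fftshift(spectrum, axes=(-2, -1))
    # Average the amplitude over channels, then compress the range with a log.
    amplitude = np.abs(spectrum).mean(axis=0)
    return np.log(amplitude + eps)

# Random values stand in for a feature map taken from a real model.
rng = np.random.default_rng(0)
features = rng.standard_normal((16, 32, 32))
spec = log_amplitude_spectrum(features)
print(spec.shape)
```

The resulting 2D array can be displayed as an image, for example with matplotlib's `imshow`. How clean the map looks depends on the input samples and on which layer's features are used.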
