awesome-transformer-for-vision's Introduction

Awesome Transformer for Vision Resources List

A curated list of papers and resources on Transformer-based research, mainly for vision and graphics tasks.

Contents

Papers

Original Paper

Attention Is All You Need. Ashish Vaswani*, Noam Shazeer*, Niki Parmar*, Jakob Uszkoreit*, Llion Jones*, Aidan N. Gomez*, Łukasz Kaiser*, Illia Polosukhin*. NIPS 2017.

2D Vision Tasks

Detection

Toward Transformer-Based Object Detection. Josh Beal*, Eric Kim*, Eric Tzeng, Dong Huk Park, Andrew Zhai, Dmitry Kislyuk. Arxiv 2020.

Rethinking Transformer-based Set Prediction for Object Detection. Zhiqing Sun*, Shengcao Cao*, Yiming Yang, Kris Kitani. Arxiv 2020.

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers. Zhigang Dai, Bolun Cai, Yugeng Lin, Junying Chen. Arxiv 2020.

Segmentation

Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers. Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, Yabiao Wang, Yanwei Fu, Jianfeng Feng, Tao Xiang, Philip H.S. Torr, Li Zhang. Arxiv 2020.

End-to-End Video Instance Segmentation with Transformers. Yuqing Wang, Zhaoliang Xu, Xinlong Wang, Chunhua Shen, Baoshan Cheng, Hao Shen, Huaxia Xia. Arxiv 2020.

Tracking

TransTrack: Multiple-Object Tracking with Transformer. Peize Sun, Yi Jiang, Rufeng Zhang, Enze Xie, Jinkun Cao, Xinting Hu, Tao Kong, Zehuan Yuan, Changhu Wang, Ping Luo. Arxiv 2020.

Image Synthesis

Taming Transformers for High-Resolution Image Synthesis. Patrick Esser*, Robin Rombach*, Björn Ommer. Arxiv 2020.

Action Understanding

Video Action Transformer Network. Rohit Girdhar, Joao Carreira, Carl Doersch, Andrew Zisserman. CVPR 2019.

3D Vision Tasks

Point Cloud Processing

PCT: Point Cloud Transformer. Meng-Hao Guo, Jun-Xiong Cai, Zheng-Ning Liu, Tai-Jiang Mu, Ralph R. Martin, Shi-Min Hu. Arxiv 2020.

Point Transformer. Hengshuang Zhao, Li Jiang, Jiaya Jia, Philip Torr, Vladlen Koltun. Arxiv 2020.

Motion Modeling

Learning to Generate Diverse Dance Motions with Transformer. Jiaman Li, Yihang Yin, Hang Chu, Yi Zhou, Tingwu Wang, Sanja Fidler, Hao Li. Arxiv 2020.

A Spatio-temporal Transformer for 3D Human Motion Prediction. Emre Aksan*, Peng Cao*, Manuel Kaufmann, Otmar Hilliges. Arxiv 2020.

Human Body Modeling

End-to-End Human Pose and Mesh Reconstruction with Transformers. Kevin Lin, Lijuan Wang, Zicheng Liu. Arxiv 2020.

Others

Music Modeling

Music Transformer: Generating Music with Long-Term Structure. Cheng-Zhi Anna Huang, Ashish Vaswani, Jakob Uszkoreit, Noam Shazeer, Ian Simon, Curtis Hawthorne, Andrew M. Dai, Matthew D. Hoffman, Monica Dinculescu, Douglas Eck. Arxiv 2018.

Contributing

Please see CONTRIBUTING for details.

awesome-transformer-for-vision's People

Contributors

lijiaman
