This project forked from mcg-nju/videomae-action-detection

[NeurIPS 2022 Spotlight] VideoMAE for Action Detection

License: Other

Shell 1.75% C++ 1.29% Python 89.29% C 0.53% Cuda 7.14%

VideoMAE for Action Detection (NeurIPS 2022 Spotlight) [Arxiv]

VideoMAE Framework

License: CC BY-NC 4.0

VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Zhan Tong, Yibing Song, Jue Wang, Limin Wang
Nanjing University, Tencent AI Lab

This repo contains the code and scripts needed to reproduce the action detection results of VideoMAE. The pre-training code is available in the original repo.

📰 News

[2023.1.16] Code and pre-trained models are available now!

🚀 Main Results

✨ AVA 2.2

Method   | Extra Data   | Extra Label | Backbone | Frames x Sample Rate | mAP
VideoMAE | Kinetics-400 | ✗           | ViT-S    | 16x4                 | 22.5
VideoMAE | Kinetics-400 | ✓           | ViT-S    | 16x4                 | 28.4
VideoMAE | Kinetics-400 | ✗           | ViT-B    | 16x4                 | 26.7
VideoMAE | Kinetics-400 | ✓           | ViT-B    | 16x4                 | 31.8
VideoMAE | Kinetics-400 | ✗           | ViT-L    | 16x4                 | 34.3
VideoMAE | Kinetics-400 | ✓           | ViT-L    | 16x4                 | 37.0
VideoMAE | Kinetics-400 | ✗           | ViT-H    | 16x4                 | 36.5
VideoMAE | Kinetics-400 | ✓           | ViT-H    | 16x4                 | 39.5
VideoMAE | Kinetics-700 | ✗           | ViT-L    | 16x4                 | 36.1
VideoMAE | Kinetics-700 | ✓           | ViT-L    | 16x4                 | 39.3
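
For readers unfamiliar with the "16x4" notation in the table, it denotes a clip of 16 frames taken with a temporal stride of 4, so each clip spans roughly 64 raw frames. The snippet below is only a minimal sketch of that idea, assuming the clip is centered on a keyframe and that out-of-range indices are clamped to the video bounds. The function name `sample_frame_indices` and its parameters are hypothetical and are not taken from this repo, whose actual data loader may sample differently.

```python
def sample_frame_indices(center, num_frames=16, sample_rate=4, total_frames=None):
    """Return `num_frames` indices spaced `sample_rate` apart, roughly centered on `center`.

    Hypothetical helper for illustration only; not this repo's loader.
    """
    span = num_frames * sample_rate          # raw frames covered by one clip
    start = center - span // 2               # leftmost sampled index
    indices = [start + i * sample_rate for i in range(num_frames)]
    if total_frames is not None:
        # Clamp to valid frame positions (assumed boundary policy).
        indices = [min(max(i, 0), total_frames - 1) for i in indices]
    return indices


# A 16x4 clip around frame 100 of a sufficiently long video.
clip = sample_frame_indices(center=100)
print(len(clip), clip[0], clip[-1])
```

Clamping repeats the first or last frame near the video boundaries; other padding policies (looping, zero frames) are equally plausible and would change only the final step.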

🔨 Installation

Please follow the instructions in INSTALL.md.

➡️ Data Preparation

Please follow the instructions in DATASET.md for data preparation.

⤴️ Fine-tuning with pre-trained models

Fine-tuning instructions are in FINETUNE.md.

πŸ“Model Zoo

We provide pre-trained and fine-tuned models in MODEL_ZOO.md.

☎️ Contact

Zhan Tong: [email protected]

πŸ‘ Acknowledgements

Thanks to Lei Chen for support. This project is built upon MAE-pytorch, BEiT and AlphAction. Thanks to the contributors of these great codebases.

🔒 License

The majority of this project is released under the CC-BY-NC 4.0 license as found in the LICENSE file. Portions of the project are available under separate license terms: pytorch-image-models is licensed under the Apache 2.0 license, and BEiT is licensed under the MIT license.

✏️ Citation

If you find this project helpful, please consider leaving a star ⭐️ and citing our paper:

@inproceedings{tong2022videomae,
  title={Video{MAE}: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training},
  author={Zhan Tong and Yibing Song and Jue Wang and Limin Wang},
  booktitle={Advances in Neural Information Processing Systems},
  year={2022}
}

@article{videomae,
  title={VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training},
  author={Tong, Zhan and Song, Yibing and Wang, Jue and Wang, Limin},
  journal={arXiv preprint arXiv:2203.12602},
  year={2022}
}
