SVFormer: Semi-supervised Video Transformer for Action Recognition

This is the official implementation of the paper SVFormer.

@inproceedings{svformer,
  title={SVFormer: Semi-supervised Video Transformer for Action Recognition},
  author={Xing, Zhen and Dai, Qi and Hu, Han and Chen, Jingjing and Wu, Zuxuan and Jiang, Yu-Gang},
  booktitle={CVPR},
  year={2023}
}

Installation

We tested the released code with the following conda environment:

conda create -n svformer python=3.7
conda activate svformer
bash env.sh

Data Preparation

The --train_list_path and --val_list_path command line arguments should each point to a data list file in the following format:

<path_1> <label_1>
<path_2> <label_2>
...
<path_n> <label_n>

where <path_i> points to a video file and <label_i> is an integer between 0 and num_classes - 1. --num_classes must also be specified on the command line.

Additionally, <path_i> may be a relative path when --data_root is specified; in that case the actual path is resolved relative to the directory passed as --data_root.
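
The list format above can be checked before launching a run. The following is a minimal sketch, not the repository's own loader: the function name read_data_list is hypothetical, and it assumes that the label is the last whitespace-separated token on each line and that blank lines can be skipped.

```python
import os


def read_data_list(list_path, num_classes, data_root=None):
    """Parse a '<path> <label>' list file into (path, label) tuples.

    Hypothetical helper for illustration; not part of the SVFormer code.
    """
    samples = []
    with open(list_path, "r", encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # assumption: blank lines are ignored
            # Split on the last whitespace run so paths with spaces survive.
            path, label_str = line.rsplit(maxsplit=1)
            label = int(label_str)
            if not 0 <= label < num_classes:
                raise ValueError(
                    f"{list_path}:{line_no}: label {label} is outside "
                    f"[0, {num_classes - 1}]"
                )
            if data_root is not None:
                # os.path.join leaves an absolute <path> unchanged.
                path = os.path.join(data_root, path)
            samples.append((path, label))
    return samples
```

Calling read_data_list("train.txt", num_classes=51, data_root="/datasets/hmdb51") would return the resolved paths with their integer labels, or raise ValueError on the first out-of-range label. The file name and directory here are placeholders.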

We provide an example in list_hmdb_40.
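
For a custom dataset, a list file in the same format can be generated with a short script. The sketch below is an illustration under stated assumptions, not the procedure used to create list_hmdb_40: it assumes one sub-directory per class under the data root, assigns labels by sorted class name, and filters on a guessed set of video extensions. The function name build_data_list is hypothetical, and the sketch does not produce the labeled/unlabeled splits used in the semi-supervised settings.

```python
import os

# Assumption: which file extensions count as videos.
VIDEO_EXTENSIONS = (".avi", ".mp4", ".mkv", ".webm")


def build_data_list(data_root, list_path):
    """Write '<relative_path> <label>' lines for a class-per-folder dataset.

    Hypothetical helper for illustration; returns the class names in
    label order (index 0 is label 0, and so on).
    """
    class_names = sorted(
        d for d in os.listdir(data_root)
        if os.path.isdir(os.path.join(data_root, d))
    )
    lines = []
    for label, class_name in enumerate(class_names):
        class_dir = os.path.join(data_root, class_name)
        for file_name in sorted(os.listdir(class_dir)):
            if file_name.lower().endswith(VIDEO_EXTENSIONS):
                # Paths are relative to data_root, so pass --data_root at run time.
                rel_path = os.path.join(class_name, file_name)
                lines.append(f"{rel_path} {label}")
    with open(list_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
    return class_names
```

Because the written paths are relative, the same directory should be passed as --data_root, and the length of the returned class list is the value to pass as --num_classes.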

Training script for SVFormer-B in the Kinetics-400 1% setting

bash train.sh

Main Results in the Paper

This is the original implementation, released for open-source use. We are still re-running some models; their scripts and checkpoints will be released later. The following tables report the accuracies from the original paper.

Backbone     UCF101-1%   UCF101-10%   Kinetics-400-1%   Kinetics-400-10%
SVFormer-S   31.4        79.1         32.6              61.6
SVFormer-B   46.3        86.7         49.1              69.4

Backbone     HMDB51-40%   HMDB51-50%   HMDB51-60%
SVFormer-S   56.2         58.2         59.7
SVFormer-B   61.6         64.4         68.2

Acknowledgements

Our code is modified from TimeSformer. Thanks for their awesome work!
