Evolving Space-Time Neural Architectures for Videos

This repository (forked from google_research) contains the code and pretrained models for EvaNet:

"Evolving Space-Time Neural Architectures for Videos"
AJ Piergiovanni, Anelia Angelova, Alexander Toshev, and Michael S. Ryoo
ICCV 2019

arXiv.

    @inproceedings{evanet,
      title={Evolving Space-Time Neural Architectures for Videos},
      booktitle={International Conference on Computer Vision (ICCV)},
      author={AJ Piergiovanni and Anelia Angelova and Alexander Toshev and Michael S. Ryoo},
      year={2019}
    }

This code supports inference with an ensemble of models pretrained on Kinetics-400. An example video is included in the data directory; it comes from HMDB [1] and shows a cricket activity. Running the full evaluation on the portion of the Kinetics-400 validation set that was still available in November 2018 (roughly 19,200 videos) gives 77.2% accuracy.
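
As a rough illustration of what ensemble inference involves, the sketch below averages per-model class probabilities and takes the top-1 class for each video. It is a minimal NumPy sketch, not the repository's code: the fusion rule (averaging softmax outputs), the function names, and the random toy logits are all assumptions made for illustration.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - np.max(logits, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def ensemble_predict(per_model_logits):
    """Average class probabilities over models (assumed fusion rule).

    per_model_logits: list of arrays, each shaped [num_videos, num_classes].
    Returns (top-1 class index per video, averaged probabilities).
    """
    probs = np.mean([softmax(l) for l in per_model_logits], axis=0)
    return probs.argmax(axis=-1), probs


def top1_accuracy(predictions, labels):
    """Fraction of videos whose predicted class matches the label."""
    return float(np.mean(np.asarray(predictions) == np.asarray(labels)))


# Toy stand-in for four models (e.g. two RGB, two flow) on 3 videos, 400 classes.
rng = np.random.default_rng(0)
toy_logits = [rng.normal(size=(3, 400)) for _ in range(4)]
preds, probs = ensemble_predict(toy_logits)
print(preds.shape, probs.shape)
```

Averaging probabilities rather than raw logits keeps each model's contribution on the same scale; whether the released ensemble fuses outputs this way should be checked against evanet/run_evanet.py.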

iTGM Layer

The iTGM layer, a 3D version of the TGM layer from our ICML 2019 paper "Temporal Gaussian Mixture Layer for Videos", is in the tgm_layer.py file. This layer inflates a 2D spatial kernel along the time axis using a mixture of 1D (temporal) Gaussians. Because the temporal extent is described by a few Gaussian parameters rather than a full set of per-frame weights, spatio-temporal filters can be modeled with significantly fewer parameters.
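
The inflation idea can be sketched in a few lines of NumPy. This is a simplified illustration, not the code in tgm_layer.py: the parameterization (tanh-bounded centers, exponentiated widths, softmax mixing weights per output channel) and all names and shapes here are assumptions chosen for the example.

```python
import numpy as np


def temporal_gaussians(centers, widths, length):
    """Return an [M, L] array; row m is a Gaussian over L time steps that sums to 1."""
    t = np.arange(length, dtype=np.float64)               # [L]
    mu = (length - 1) * (np.tanh(centers) + 1.0) / 2.0    # centers kept in [0, L-1]
    sigma = np.exp(widths)                                # widths kept positive
    g = np.exp(-0.5 * ((t[None, :] - mu[:, None]) / sigma[:, None]) ** 2)
    return g / g.sum(axis=1, keepdims=True)


def inflate_kernel(spatial_kernel, centers, widths, mix_logits, length):
    """Inflate a 2D kernel [kh, kw, cin, cout] into a 3D kernel [L, kh, kw, cin, cout].

    mix_logits has shape [cout, M]: one set of mixing weights per output channel.
    """
    gaussians = temporal_gaussians(centers, widths, length)        # [M, L]
    w = np.exp(mix_logits - mix_logits.max(axis=1, keepdims=True))
    w = w / w.sum(axis=1, keepdims=True)                           # softmax over M
    temporal = w @ gaussians                                       # [cout, L]
    # Scale the 2D kernel by each output channel's temporal profile.
    return np.einsum("ol,hwio->lhwio", temporal, spatial_kernel)


rng = np.random.default_rng(0)
kh, kw, cin, cout, M, L = 3, 3, 8, 16, 2, 5
spatial = rng.normal(size=(kh, kw, cin, cout))
kernel3d = inflate_kernel(
    spatial,
    centers=rng.normal(size=M),
    widths=rng.normal(size=M),
    mix_logits=rng.normal(size=(cout, M)),
    length=L,
)

# Learnable parameters: 2D kernel + 2 per Gaussian + mixing weights, vs. a full 3D kernel.
itgm_params = spatial.size + 2 * M + cout * M
full3d_params = L * spatial.size
print(kernel3d.shape, itgm_params, full3d_params)
```

In this sketch the temporal length L does not change the parameter count at all, which is the source of the savings over a full 3D kernel.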

Installation and Running

To install requirements:

pip install -r evanet/requirements.txt

Then download the model weights and place them in data/checkpoints.

To evaluate the pretrained EvaNet ensemble on a sample video:

python -m evanet.run_evanet --checkpoints=rgb1.ckpt,rgb2.ckpt,flow1.ckpt,flow2.ckpt

Results

Kinetics-400

These results are on the videos that were available in November 2018, about 10% fewer than in the original dataset.

| Method | Accuracy (%) |
| --- | --- |
| I3D | 72.6 |
| (2+1)D I3D | 74.3 |
| iTGM I3D | 74.4 |
| ResNet-50 (2+1)D | 72.1 |
| ResNet-101 (2+1)D | 72.8 |
| 3D Ensemble | 74.6 |
| iTGM-Ensemble | 74.7 |
| Diverse Ensemble (3D, (2+1)D, iTGM) | 75.3 |
| Two-stream I3D | 72.6 |
| Two-stream S3D-G | 76.2 |
| ResNet-50 + Non-local | 73.5 |
| Arch. Ensemble (I3D, ResNet-50, ResNet-101) | 75.4 |
| Top 1 (Individual, ours) | 76.4 |
| Top 2 (Individual, ours) | 75.5 |
| Top 3 (Individual, ours) | 75.7 |

HMDB (3 splits)

| Method | Accuracy (%) |
| --- | --- |
| Two-stream | 59.4 |
| Two-stream+IDT | 69.2 |
| R(2+1)D | 78.7 |
| Two-stream I3D | 80.9 |
| PoTion | 80.9 |
| Discrim. Pooling | 81.3 |
| DSP | 81.5 |
| Top model (Individual, ours) | 81.3 |
| 3D-Ensemble | 79.9 |
| iTGM-Ensemble | 80.1 |
| EvaNet (Ensemble, ours) | 82.3 |

Charades

| Method | mAP (%) |
| --- | --- |
| Two-Stream + LSTM | 17.8 |
| Async-TF | 22.4 |
| TRN | 25.2 |
| Non-local NN | 37.5 |
| 3D-Ensemble (baseline) | 35.2 |
| iTGM-Ensemble (baseline) | 35.7 |
| Top 1 (individual, ours) | 37.3 |
| Top 2 (individual, ours) | 36.8 |
| Top 3 (individual, ours) | 36.6 |
| EvaNet (ensemble, ours) | 38.1 |
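
Charades is a multi-label dataset, which is why its results are reported as mean average precision (mAP) rather than top-1 accuracy. The sketch below shows one common way to compute mAP from per-class scores. It is only an illustration: the function names and toy data are hypothetical, and the official Charades evaluation script may differ in details such as tie handling or how classes without positives are treated.

```python
import numpy as np


def average_precision(scores, labels):
    """AP for one class: mean of precision@k taken at the rank of each positive."""
    order = np.argsort(-scores, kind="stable")   # highest score first
    hits = labels[order].astype(bool)
    if not hits.any():
        return float("nan")                      # class has no positives
    cum_hits = np.cumsum(hits)
    precision_at_k = cum_hits / np.arange(1, len(hits) + 1)
    return float(precision_at_k[hits].mean())


def mean_average_precision(scores, labels):
    """Mean of per-class AP, skipping classes with no positive examples.

    scores, labels: arrays shaped [num_videos, num_classes]; labels are 0/1.
    """
    aps = [average_precision(scores[:, c], labels[:, c])
           for c in range(scores.shape[1])]
    return float(np.nanmean(aps))


# Toy data: 3 videos, 2 classes.
toy_scores = np.array([[0.9, 0.1], [0.8, 0.7], [0.2, 0.6]])
toy_labels = np.array([[1, 0], [0, 1], [1, 1]])
toy_map = mean_average_precision(toy_scores, toy_labels)
print(round(100 * toy_map, 1))
```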

Moments-in-Time

| Method | Accuracy (%) |
| --- | --- |
| I3D | 29.5 |
| ResNet-50 | 30.5 |
| ResNet-50 + NL | 30.7 |
| Arch. Ensemble (I3D, ResNet-50, ResNet-101) | 30.9 |
| Top 1 (Individual, ours) | 30.5 |
| EvaNet (Ensemble, ours) | 31.8 |

References:

[1] H. Kuehne, H. Jhuang, E. Garrote, T. Poggio, and T. Serre. HMDB: A Large Video Database for Human Motion Recognition. ICCV, 2011.
