This project forked from gcheron/p-cnn


P-CNN: Pose-based CNN Features for Action Recognition

License: MIT License


Information

This package contains a MATLAB implementation of the Pose-based CNN (P-CNN) algorithm described in [1]. It includes the pre-trained VGG-f CNN appearance model [2], a MATLAB version of the flow model of [3], and the optical flow implementation of [4]. The CNN implementation uses the MatConvNet library [5]. The project webpage is http://www.di.ens.fr/willow/research/p-cnn/ .

#### To run this package:

  • Prepare/download the CNN models and data examples by running the init.sh file from the P-CNN folder.
  • This package compiles MatConvNet [5] in "CPU mode". Enabling GPU support makes computation much faster; to help with this, we provide the my_build.m file in the matconvnet-1.0-beta11 folder, which you can modify.
  • You may want to recompile the Brox et al. optical flow implementation [4] (sources available for download).
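
As a rough illustration of the GPU step above, the sketch below rebuilds MatConvNet with GPU support from the MATLAB prompt using MatConvNet's standard `vl_compilenn` build function. It is not the content of my_build.m, which is not reproduced here. It assumes that the current folder is the P-CNN folder, that matconvnet-1.0-beta11 sits directly inside it, and that a CUDA toolkit compatible with this MatConvNet version is installed.

```matlab
% Sketch only: rebuild MatConvNet with GPU support.
% Assumption: the current folder is the P-CNN folder and contains
% the matconvnet-1.0-beta11 folder.
matconvnetRoot = fullfile(pwd, 'matconvnet-1.0-beta11');
addpath(fullfile(matconvnetRoot, 'matlab'));   % makes vl_compilenn visible

vl_compilenn('enableGpu', true);               % compile the GPU MEX files
vl_setupnn;                                    % add MatConvNet to the path
```

If the build cannot find your CUDA installation or compiler, my_build.m is the place the package intends you to adapt.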

#### demo.m

An example of P-CNN computation is given in this package. It computes P-CNN features for a few videos of the JHMDB dataset [6] (for 2 different splits) using pose ground-truth annotations. The reproduce_ICCV15_results command reproduces the P-CNN results reported in [1]. Because we wanted to provide full MATLAB code, we converted all the code to MATLAB, which gives a slightly different result (-0.9% accuracy) from the published version due to the switch of CNN package and retraining.

The provided algorithm takes as input the frames of a video and their corresponding pose joints (from ground truth annotation or from your favorite pose detector). There is a demo.m file in the package that you should be able to run.

#### Datasets

Two datasets have been used in our ICCV'15 paper:

  • JHMDB [6]: as explained above, the demo.m file shows how to use P-CNN with this dataset. The dataset and the ground-truth joint positions are available for download.
  • MPII Cooking Activities [7]: You can download the dataset and the estimated joint positions we computed for our experiments. Note that for MPII Cooking Activities we do not use the same parameters as for JHMDB (e.g., there is no full-body part), so you have to modify the following parameters in the demo.m file:
param.lhandposition=11;
param.rhandposition=6;
param.upbodypositions=1:13;
param.lside = 120 ;

and in compute_pcnn_features.m:

param.partids = [1 2 3 4] ; % don't use full body part
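
To make the role of these parameters more concrete, here is a minimal sketch of cropping a square patch around one joint. It is an illustration only and not the package's own code. It assumes that `param.lside` is the side length in pixels of a part patch, that `param.lhandposition` is the index of the left-hand joint, and that joints are stored as a 2 x N array of [x; y] pixel coordinates. The function name `crop_part_patch` and the zero-padding at frame borders are choices made here for simplicity.

```matlab
function patch = crop_part_patch(frame, joints, jointIdx, lside)
% CROP_PART_PATCH  Illustrative helper; hypothetical, not part of P-CNN.
%   frame    : H x W x C image array
%   joints   : 2 x N array of [x; y] pixel coordinates (assumed layout)
%   jointIdx : column of the joint to centre on, e.g. param.lhandposition
%   lside    : side length of the square patch, e.g. param.lside

[h, w, c] = size(frame);
center = round(joints(:, jointIdx));   % [x; y] of the chosen joint
half = floor(lside / 2);

% Patch extent in image coordinates (may extend past the frame borders).
xs = (center(1) - half) : (center(1) - half + lside - 1);
ys = (center(2) - half) : (center(2) - half + lside - 1);

% Keep only the coordinates that fall inside the frame.
validX = xs >= 1 & xs <= w;
validY = ys >= 1 & ys <= h;

% Zero-pad whatever lies outside the frame (an assumption of this sketch).
patch = zeros(lside, lside, c, class(frame));
patch(validY, validX, :) = frame(ys(validY), xs(validX), :);
end
```

With the MPII Cooking Activities values above, a call would look like `patch = crop_part_patch(frame, joints, 11, 120);`.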

#### Cite

If you use this package, please cite:

@inproceedings{cheronICCV15,
TITLE = {{P-CNN: Pose-based CNN Features for Action Recognition}},
AUTHOR = {Ch{\'e}ron, Guilhem and Laptev, Ivan and Schmid, Cordelia},
BOOKTITLE = {ICCV},
YEAR = {2015},
}

#### References

[1] G. Chéron, I. Laptev, C. Schmid. P-CNN: Pose-based CNN Features for Action Recognition. ICCV 2015.

[2] K. Chatfield, K. Simonyan, A. Vedaldi, and A. Zisserman. Return of the devil in the details: Delving deep into convolutional nets. BMVC 2014.

[3] G. Gkioxari and J. Malik. Finding action tubes. CVPR 2015.

[4] T. Brox, A. Bruhn, N. Papenberg, and J. Weickert. High accuracy optical flow estimation based on a theory for warping. ECCV 2004.

[5] A. Vedaldi and K. Lenc. MatConvNet - Convolutional Neural Networks for MATLAB.

[6] H. Jhuang, J. Gall, S. Zuffi, C. Schmid, and M. J. Black. Towards understanding action recognition. ICCV 2013.

[7] M. Rohrbach, S. Amin, M. Andriluka and B. Schiele. A Database for Fine Grained Activity Detection of Cooking Activities. CVPR 2012.

#### Acknowledgements

We gratefully thank the authors of the previous code releases and video benchmark for making them publicly available.
