pytorch-i3d-feature-extraction's Introduction

I3D Feature Extraction

Usage

  • Format the videos to 25 FPS.
  • Convert the videos into frame images and optical flows.
  • python3 extract_features.py ...
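The first step above could be scripted as in the sketch below. It assumes ffmpeg is installed and on the PATH. The helper name build_fps_command is hypothetical and not part of this repository. The sketch only builds the command line; frame and optical-flow extraction are not covered.

```python
import shlex


def build_fps_command(src, dst, fps=25):
    """Return an ffmpeg argument list that re-encodes `src` at a fixed frame rate.

    Hypothetical helper for illustration; the repository does not ship it.
    """
    # The `fps` video filter resamples the stream to a constant frame rate.
    return ["ffmpeg", "-i", str(src), "-filter:v", f"fps={fps}", str(dst)]


if __name__ == "__main__":
    cmd = build_fps_command("video1.mp4", "video1_25fps.mp4")
    # Print a shell-safe rendering; pass `cmd` to subprocess.run(cmd, check=True) to execute it.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Returning a list rather than a single string lets the command be passed to subprocess.run without shell quoting issues.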

Parameters

--mode:              rgb or flow
--load_model:        path of the I3D model
--input_dir:         folder of converted videos
--output_dir:        folder of extracted features
--batch_size:        batch size for snippets
--sample_mode:       oversample, center_crop or resize
--frequency:         number of frames between adjacent snippets
--usezip/no-usezip:  whether the frame images are zipped
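The flags above could be declared with argparse roughly as follows. This is an illustrative sketch, not the parser used by extract_features.py. The default values and the choice of which flags are required are assumptions made here.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (defaults are assumed, not from the repo)."""
    parser = argparse.ArgumentParser(description="I3D feature extraction (illustrative)")
    parser.add_argument("--mode", choices=["rgb", "flow"], default="rgb")
    parser.add_argument("--load_model", required=True, help="path of the I3D model")
    parser.add_argument("--input_dir", required=True, help="folder of converted videos")
    parser.add_argument("--output_dir", required=True, help="folder of extracted features")
    parser.add_argument("--batch_size", type=int, default=8, help="batch size for snippets")
    parser.add_argument("--sample_mode", choices=["oversample", "center_crop", "resize"],
                        default="center_crop")
    parser.add_argument("--frequency", type=int, default=16,
                        help="number of frames between adjacent snippets")
    # Paired on/off switches that write to the same destination.
    parser.add_argument("--usezip", dest="usezip", action="store_true")
    parser.add_argument("--no-usezip", dest="usezip", action="store_false")
    parser.set_defaults(usezip=True)
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Using choices for --mode and --sample_mode makes argparse reject values outside the documented options.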

Important: Use PyTorch 0.3

Input Folder Structure

InputFolder
├── video1
│   ├── flow_x.zip
│   ├── flow_y.zip
│   └── img.zip
└── video2
    ├── flow_x.zip
    ├── flow_y.zip
    └── img.zip

Frame images and flows may also be supplied unzipped (see --usezip/no-usezip).
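Before running extraction, the zipped layout shown above can be checked with a short script. The sketch below assumes that rgb mode needs img.zip and flow mode needs flow_x.zip and flow_y.zip; this is inferred from the file names, not confirmed by the repository. The function names are hypothetical, and the unzipped layout is not handled.

```python
from pathlib import Path

# Assumed mapping from --mode to the archives each video folder must contain.
REQUIRED_ARCHIVES = {
    "rgb": ("img.zip",),
    "flow": ("flow_x.zip", "flow_y.zip"),
}


def missing_archives(video_dir, mode):
    """Return the archive names expected for `mode` that are absent from `video_dir`."""
    video_dir = Path(video_dir)
    return [name for name in REQUIRED_ARCHIVES[mode] if not (video_dir / name).is_file()]


def check_input_dir(input_dir, mode):
    """Map each incomplete video folder name to its list of missing archives."""
    problems = {}
    for video_dir in sorted(Path(input_dir).iterdir()):
        if video_dir.is_dir():
            missing = missing_archives(video_dir, mode)
            if missing:
                problems[video_dir.name] = missing
    return problems
```

An empty dictionary from check_input_dir means every video folder contains the archives assumed for the chosen mode.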

I3D models trained on Kinetics (Old Readme)

Overview

This repository contains trained models reported in the paper "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset" by Joao Carreira and Andrew Zisserman.

This code is based on DeepMind's Kinetics-I3D and includes PyTorch versions of their models.

Note

This code was written for PyTorch 0.3. Version 0.4 and newer may cause issues.

Fine-tuning and Feature Extraction

We provide code to extract I3D features and to fine-tune I3D on Charades. Our models fine-tuned on Charades are available in the models directory, alongside DeepMind's trained models. The DeepMind pre-trained models (flow_imagenet.pt and rgb_imagenet.pt) were converted to PyTorch and give identical results. These models were pretrained on ImageNet and Kinetics (see Kinetics-I3D for details).

Fine-tuning I3D

train_i3d.py contains the code to fine-tune I3D, based on the details in the paper and on information obtained from the authors. Specifically, this version follows the settings used to fine-tune on the Charades dataset in the authors' implementation that won the Charades 2017 challenge. Our fine-tuned RGB and Flow I3D models are available in the models directory (rgb_charades.pt and flow_charades.pt).

This relies on having the optical flow and RGB frames extracted and saved as images on disk. charades_dataset.py contains our code to load video segments for training.

Feature Extraction

extract_features.py contains the code to load a pre-trained I3D model, extract features, and save them as NumPy arrays. The charades_dataset_full.py script loads an entire video to extract per-segment features.
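To picture how per-segment extraction relates to the --frequency parameter, the sketch below lists the first frame index of each snippet. It assumes fixed-length snippets of chunk_size frames and drops a trailing snippet that would run past the end of the video. The value 16 and the function name snippet_starts are assumptions for illustration, not taken from extract_features.py.

```python
def snippet_starts(num_frames, frequency, chunk_size=16):
    """Return the start frame index of every full snippet.

    `frequency` is the stride in frames between adjacent snippets;
    `chunk_size` (assumed value) is the number of frames per snippet.
    """
    if frequency <= 0 or chunk_size <= 0:
        raise ValueError("frequency and chunk_size must be positive")
    if num_frames < chunk_size:
        return []  # video too short for even one full snippet
    return list(range(0, num_frames - chunk_size + 1, frequency))


if __name__ == "__main__":
    # A stride smaller than chunk_size yields overlapping snippets.
    print(snippet_starts(num_frames=40, frequency=8))
```

Under these assumptions a smaller frequency produces more, overlapping snippets and therefore more feature vectors per video.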

Contributors

finspire13, piergiaj, emptybird
