
Official repo for FEAR: Fast, Efficient, Accurate and Robust Visual Tracker (ECCV 2022)

License: MIT License

computer-vision deep-learning eccv eccv2022 pytorch pytorch-lightning visual-tracking


FEAR: Fast, Efficient, Accurate and Robust Visual Tracker

Paper Conference

FEAR architecture

This is an official repository for the paper

FEAR: Fast, Efficient, Accurate and Robust Visual Tracker
Vasyl Borsuk, Roman Vei, Orest Kupyn, Tetiana Martyniuk, Igor Krashenyi, Jiří Matas
ECCV 2022

Environment setup

The training code is tested on Linux systems; mobile benchmarking is tested on macOS systems.

conda create -n py37fear python=3.7
conda activate py37fear
pip install -r requirements.txt

Note: you might need to remove the xtcocotools requirement when setting up the environment on a macOS system for model evaluation.

FEAR Benchmark

We provide the FEAR evaluation protocol implementation in the evaluate/MeasurePerformance directory. We also provide a FEAR-XS model checkpoint without the Dynamic Template Update module; the complete version of the model will be added soon. To evaluate the model on an iOS device, do the following steps on a macOS machine:

  1. Open the evaluate/MeasurePerformance project in Xcode, either by double-clicking the evaluate/MeasurePerformance/MeasurePerformance.xcodeproj file or via the Open option in Xcode.
  2. Connect an iOS device to your computer and build and run the project on it.
  3. Select one of the benchmark options by tapping the corresponding button on the device:
    • Benchmark FPS: runs a simple model benchmark that warms up the model for 20 iterations and measures the average FPS across 100 model calls. The result is printed to the Xcode console.
    • Benchmark Online: runs the FEAR online benchmark described in the paper.
    • Benchmark Offline: runs the FEAR offline benchmark.
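The actual FPS benchmark runs in Swift on-device, but the measurement logic above (20 warm-up iterations, average over 100 timed calls) can be sketched in Python for illustration; the lambda below is a stand-in workload, not a real model forward pass:

```python
import time

def benchmark_fps(model_call, warmup_iters=20, bench_iters=100):
    # Warm up so one-time startup costs don't skew the measurement.
    for _ in range(warmup_iters):
        model_call()
    # Time bench_iters calls and report the average frames per second.
    start = time.perf_counter()
    for _ in range(bench_iters):
        model_call()
    elapsed = time.perf_counter() - start
    return bench_iters / elapsed

# Stand-in workload instead of a real model call.
fps = benchmark_fps(lambda: sum(i * i for i in range(1000)))
print(f"average: {fps:.1f} FPS")
```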

Do the following steps on a macOS device to convert the model to Core ML:

  1. Convert the model trained in PyTorch to Core ML with the following command from the project root directory. It produces a file with the model in Core ML format (Model.mlmodel) and a model with FP16 weight quantization (Model_quantized.mlmodel).
PYTHONPATH=. python evaluate/coreml_convert.py
  2. Move the converted model into the iOS project with the following command: cp Model_quantized.mlmodel evaluate/MeasurePerformance/MeasurePerformance/models/Model_quantized.mlmodel

Count FLOPS and parameters

PYTHONPATH=. python evaluate/macs_params.py

Demo inference with Python

PYTHONPATH=. python demo_video.py --initial_bbox=[163,53,45,174] \
--video_path=assets/test.mp4 \
--output_path=outputs/test.mp4
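The --initial_bbox argument appears to take the box as [x, y, w, h]. Assuming that order, the string can be parsed and converted to corner coordinates as sketched below; parse_bbox and xywh_to_xyxy are illustrative helpers, not functions from this repo:

```python
def parse_bbox(arg):
    """Parse a CLI bbox string such as "[163,53,45,174]" into ints."""
    return [int(v) for v in arg.strip("[]").split(",")]

def xywh_to_xyxy(bbox):
    """Convert [x, y, w, h] to corner format (x1, y1, x2, y2)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)

bbox = parse_bbox("[163,53,45,174]")
print(xywh_to_xyxy(bbox))  # (163, 53, 208, 227)
```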

Demo app for iOS

FEARDemo.mp4
  1. Open evaluate/FEARDemo project in Xcode.
  2. Connect an iOS device to your computer and build the project. Make sure developer mode is enabled on your iOS device and that your Apple developer certificate is trusted. You will also need to select a development team under the Signing & Capabilities pane of the project editor (navigation described here).

Note: the demo app does not include the tracker's bounding-box smoothing post-processing steps, so its output differs slightly from the Python version.
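The repo doesn't spell out the smoothing scheme here; purely as an illustration, an exponential moving average over box coordinates is one common form such post-processing can take (this is an assumed example, not the exact FEAR smoother):

```python
def smooth_boxes(boxes, alpha=0.5):
    """Exponential moving average over (x, y, w, h) boxes.

    alpha weights the new observation. Illustrative smoother only; the
    FEAR Python tracker's exact post-processing may differ.
    """
    smoothed = [boxes[0]]
    for box in boxes[1:]:
        prev = smoothed[-1]
        smoothed.append(tuple(alpha * b + (1 - alpha) * p
                              for b, p in zip(box, prev)))
    return smoothed

raw = [(0, 0, 10, 10), (10, 10, 10, 10), (20, 20, 10, 10)]
print(smooth_boxes(raw))
```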

Training

Data preparation

There are two dataset configurations. Download all datasets listed in the configuration file you will train with and put them into the directory specified by the visual_object_tracking_datasets configuration field; you can change this value to your local dataset path.

  1. Quick train on GOT-10k dataset
    Config file: model_training/config/dataset/got10k_train.yaml
  2. Full train on LaSOT, COCO2017, YouTube-BoundingBoxes, GOT-10k and ILSVRC
    Config file: model_training/config/dataset/full_train.yaml

You should create a CSV annotation file for each training dataset. We don't provide CSV annotations since some datasets have license restrictions. The annotation file for each dataset should have the following columns:

  • sequence_id: str - unique identifier of the video file
  • track_id: str - unique identifier of the scene inside the video file
  • frame_index: int - index of the frame inside the video
  • img_path: str - location of the frame image relative to the root folder with all datasets
  • bbox: Tuple[int, int, int, int] - bounding box of the object in x, y, w, h format
  • frame_shape: Tuple[int, int] - width and height of the image
  • dataset: str - label identifying the dataset (example: got10k)
  • presence: int - whether the object is present in the frame (example: 0/1)
  • near_corner: int - whether the bounding box touches the image border (example: 0/1)
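Putting the columns together, one annotation row could be written with the standard csv module; all values below are hypothetical placeholders, not real dataset paths or annotations:

```python
import csv
import io

FIELDS = ["sequence_id", "track_id", "frame_index", "img_path",
          "bbox", "frame_shape", "dataset", "presence", "near_corner"]

# Hypothetical example row; real annotations come from the datasets.
row = {
    "sequence_id": "video_0001",
    "track_id": "0",
    "frame_index": 0,
    "img_path": "got10k/train/video_0001/00000001.jpg",
    "bbox": (347, 443, 429, 272),   # x, y, w, h
    "frame_shape": (1920, 1080),    # width, height
    "dataset": "got10k",
    "presence": 1,
    "near_corner": 0,
}

buf = io.StringIO()  # stands in for a real annotation file on disk
writer = csv.DictWriter(buf, fieldnames=FIELDS)
writer.writeheader()
writer.writerow(row)
print(buf.getvalue())
```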

Run training

The current training code supports model training without the Dynamic Template Update module; it will be added soon. You can launch training with the default configuration with the following command from the project root directory:

PYTHONPATH=. python model_training/train.py backend=2gpu
# or the following for full train
PYTHONPATH=. python model_training/train.py dataset=full_train backend=2gpu

Citation

If you use the FEAR tracker benchmark, demo, or training code (implicitly or explicitly) in your research projects, please cite the following paper:

@article{fear_tracker,
  title={FEAR: Fast, Efficient, Accurate and Robust Visual Tracker},
  author={Borsuk, Vasyl and Vei, Roman and Kupyn, Orest and Martyniuk, Tetiana and Krashenyi, Igor and Matas, Ji{\v{r}}i},
  journal={arXiv preprint arXiv:2112.07957},
  year={2021}
}
