This project forked from byminji/slttrack

Official Implementation of Towards Sequence-Level Training for Visual Tracking (ECCV 2022)

License: GNU General Public License v3.0

SLTtrack

Official implementation of the ECCV 2022 paper Towards Sequence-Level Training for Visual Tracking
Minji Kim*, Seungkwan Lee*, Jungseul Ok, Bohyung Han, Minsu Cho (* denotes equal contribution)

[Paper] [Models] [Raw Results]

[Figure: SLT framework]

Introduction

❗ Problem: training-testing inconsistency in recent trackers

[Figure: pitfall of frame-level training for visual tracking, panels (a) to (c)]
Training a tracker to better localize the target in each individual frame, as in (a), does not necessarily improve actual tracking over the whole sequence, as in (b). Because of this, the validation loss and the validation performance often disagree during training, as shown in (c).

✨ Solution: Sequence-Level Training (SLT)

[Figure: SLT highlight]
Based on a reinforcement learning framework, SLT trains a model by actually tracking on a video and directly optimizing a tracking performance metric. Our sequence-level design of data sampling, learning objective, and data augmentation improves generalization performance for visual tracking.
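To make the idea concrete, here is a minimal sketch of a sequence-level, REINFORCE-style objective. It is not the repository's implementation: it assumes the reward is the mean IoU over an episode and that a second rollout (for example, a greedy one) serves as the baseline. All function names (`iou`, `sequence_reward`, `sequence_level_loss`) and the toy boxes are hypothetical.

```python
def iou(a, b):
    """IoU of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def sequence_reward(pred_boxes, gt_boxes):
    """Assumed sequence-level reward: mean IoU over all frames of one episode."""
    return sum(iou(p, g) for p, g in zip(pred_boxes, gt_boxes)) / len(gt_boxes)


def sequence_level_loss(log_probs, sampled_reward, baseline_reward):
    """REINFORCE-style surrogate loss for one tracking episode.

    log_probs: log-probabilities of the actions sampled at each frame.
    The advantage is the reward of the sampled episode minus the reward
    of a baseline episode; minimizing the loss raises the probability of
    episodes that beat the baseline.
    """
    advantage = sampled_reward - baseline_reward
    return -advantage * sum(log_probs)


# Toy episode: the target drifts right by 2 px per frame.
gt = [(10, 10, 20, 20), (12, 10, 20, 20), (14, 10, 20, 20)]
sampled = [(10, 10, 20, 20), (12, 10, 20, 20), (14, 10, 20, 20)]  # follows the target
baseline = [(10, 10, 20, 20), (10, 10, 20, 20), (10, 10, 20, 20)]  # stays put
log_probs = [-0.5, -0.7, -0.6]  # made-up values for illustration

loss = sequence_level_loss(
    log_probs, sequence_reward(sampled, gt), sequence_reward(baseline, gt)
)
print(f"sequence-level loss: {loss:.3f}")
```

The point of the sketch is that the loss depends on how the whole episode scores, not on a per-frame localization error, so an early mistake that derails later frames is penalized through the episode reward.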

✨ Result: improvements on four baselines without modifying model architectures

| Tracker (Base → Ours) | LaSOT (AUC) | TrackingNet (AUC) | GOT-10K (AO) |
|---|---|---|---|
| SiamRPN++ → SLT-SiamRPN++ | 51.0 → 58.4 (+7.4) | 68.2 → 75.8 (+7.6) | 49.5 → 62.1 (+12.6) |
| SiamAttn → SLT-SiamAttn | 54.8 → 57.4 (+2.6) | 74.3 → 76.9 (+2.6) | 53.4 → 62.5 (+9.1) |
| TrDiMP → SLT-TrDiMP | 63.3 → 64.4 (+1.1) | 78.1 → 78.1 (+0.0) | 67.1 → 67.5 (+0.4) |
| TransT → SLT-TransT | 64.2 → 66.8 (+2.6) | 81.1 → 82.8 (+1.7) | 66.2 → 67.5 (+1.3) |
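For readers unfamiliar with the metrics in the table, the sketch below shows how success AUC and average overlap (AO) are commonly computed from per-frame IoU values. It is an illustration only: the official LaSOT, TrackingNet, and GOT-10K toolkits differ in details such as per-sequence averaging, and the function names and the 21-threshold grid are assumptions, not taken from this repository.

```python
def average_overlap(ious):
    """AO: mean IoU between predicted and ground-truth boxes over all frames."""
    return sum(ious) / len(ious)


def success_auc(ious, num_thresholds=21):
    """Area under the success plot.

    For each overlap threshold in [0, 1], the success rate is the fraction
    of frames whose IoU exceeds the threshold. The AUC is approximated by
    the mean success rate over evenly spaced thresholds.
    """
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    rates = [sum(1 for v in ious if v > t) / len(ious) for t in thresholds]
    return sum(rates) / len(rates)


# Made-up per-frame IoUs for a four-frame sequence (one frame lost the target).
ious = [0.9, 0.8, 0.0, 0.6]
print(f"AO:  {average_overlap(ious):.3f}")
print(f"AUC: {success_auc(ious):.3f}")
```

Both metrics are computed on the tracker's actual output over a sequence, which is why SLT can use a quantity of this kind as a training signal.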

Getting Started

Installation

We tested the code in the following environment; other versions may also work.

  • CUDA 11.3
  • Python 3.9
  • PyTorch 1.10.1
  • Torchvision 0.11.2

# Create and activate a conda environment
conda create -y --name slt python=3.9
conda activate slt

# Install PyTorch
conda install pytorch==1.10.1 torchvision==0.11.2 cudatoolkit=11.3 -c pytorch -c conda-forge

# Install requirements
pip install -r requirements.txt
sudo apt-get install libturbojpeg

# Build pycocotools
cd ${SLTtrack_ROOT}/pysot_toolkit
python setup.py build_ext --inplace

# Build library for deformable convolution/pooling
cd ${SLTtrack_ROOT}/pysot_toolkit/pysot/models/head/dcn
python setup.py build_ext --inplace
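
After installation, a quick check like the one below can confirm that the environment matches the tested versions. This script is not part of the repository; it is a small sketch that assumes the packages are registered under the distribution names `torch` and `torchvision`, and `check_versions` is a hypothetical helper.

```python
from importlib import metadata

# Versions listed as tested in this README.
EXPECTED = {"torch": "1.10.1", "torchvision": "0.11.2"}


def check_versions(expected):
    """Return a {package: status} report comparing installed and expected versions."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        # Drop local build tags such as "+cu113" before comparing.
        if found.split("+")[0] == wanted:
            report[name] = "ok"
        else:
            report[name] = f"mismatch ({found})"
    return report


if __name__ == "__main__":
    for name, status in check_versions(EXPECTED).items():
        print(f"{name}: {status}")
```

A mismatch is not necessarily a problem, since other versions may also work, but it is a useful first thing to check if a build step fails.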

Training & Testing

  • SLT-TransT and SLT-TrDiMP are built on the PyTracking library.
    Please refer to tutorial_pytracking.md for details.

  • SLT-SiamRPN++ and SLT-SiamAttn are built on the PySOT library.
    Please refer to tutorial_pysot.md for details.

Models and Raw Results

Models and raw tracking results are available at [Models] and [Raw Results].

Citation

If you find SLT useful in your research, please consider citing our paper:

@inproceedings{SLTtrack,
  title={Towards Sequence-Level Training for Visual Tracking},
  author={Kim, Minji and Lee, Seungkwan and Ok, Jungseul and Han, Bohyung and Cho, Minsu},
  booktitle={ECCV},
  year={2022}
}

Acknowledgments

SLTtrack is built on the PyTracking and PySOT libraries and also borrows from TransT, TrDiMP, and SiamAttn. We thank the authors for providing these frameworks and toolkits.

Contact

Minji Kim: [email protected]
Seungkwan Lee: [email protected]
