zhang-tao-whu / dvis

DVIS: Decoupled Video Instance Segmentation Framework

License: MIT License

dvis's Introduction

Tao Zhang, XingYe Tian, Yu Wu, ShunPing Ji, Xuebo Wang, Yuan Zhang, Pengfei Wan

News

  • DVIS-DAQ achieves 57.1 AP on the OVIS dataset and sets a new state of the art on YTVIS19/21 and VIPSeg. The code will be released in the DVIS-DAQ repository. For details, see the paper, "DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries", and the project page.
  • DVIS++, the improved version of DVIS, is now available; refer to DVIS++ for more information. DVIS++ achieves 41.2 AP on OVIS, 56.7 AP on YTVIS19, 52.0 AP on YTVIS21, 48.6 mIoU on VSPW, and 44.2 VPQ on VIPSeg. Additionally, OV-DVIS++ supports open-vocabulary universal video segmentation.
  • DVIS achieved 1st place in the VPS Track of the PVUW challenge at CVPR 2023. (2023.5.25)
  • DVIS has been accepted by ICCV 2023. (2023.7.15)
  • DVIS achieved 1st place in the VIS Track of the 5th LSVOS challenge at ICCV 2023. (2023.8.15)

Features

  • DVIS is a universal video segmentation framework that supports video instance segmentation (VIS), video panoptic segmentation (VPS), and video semantic segmentation (VSS).
  • DVIS can run in both online and offline modes.
  • DVIS achieved SOTA performance on YTVIS, OVIS, VIPSeg and VSPW datasets.
  • DVIS can complete training and inference on GPUs with only 11 GB of memory.

Installation

See Installation Instructions.

Getting Started

See Preparing Datasets for DVIS.

See Getting Started with DVIS.

Model Zoo

Trained models are available for download in the DVIS Model Zoo.

Citing DVIS

@article{DVIS,
  title={DVIS: Decoupled Video Instance Segmentation Framework},
  author={Zhang, Tao and Tian, Xingye and Wu, Yu and Ji, Shunping and Wang, Xuebo and Zhang, Yuan and Wan, Pengfei},
  journal={arXiv preprint arXiv:2306.03413},
  year={2023}
}

@article{zhang2023vis1st,
  title={1st Place Solution for the 5th LSVOS Challenge: Video Instance Segmentation},
  author={Zhang, Tao and Tian, Xingye and Zhou, Yikang and Wu, Yu and Ji, Shunping and Yan, Cilin and Wang, Xuebo and Tao, Xin and Zhang, Yuan and Wan, Pengfei},
  journal={arXiv preprint arXiv:2308.14392},
  year={2023}
}

@article{zhang2023vps1st,
  title={1st Place Solution for PVUW Challenge 2023: Video Panoptic Segmentation},
  author={Zhang, Tao and Tian, Xingye and Wei, Haoran and Wu, Yu and Ji, Shunping and Wang, Xuebo and Zhang, Yuan and Wan, Pengfei},
  journal={arXiv preprint arXiv:2306.04091},
  year={2023}
}

Acknowledgement

This repo is largely based on Mask2Former, MinVIS, and VITA. Thanks to their authors for the excellent work.
