UODDM

This repository accompanies our paper Unified Object Detector for Different Modalities based on Vision Transformers. Our unified model can process RGB images, pseudo images converted from point clouds, or an inter-modality mixture of RGB images and pseudo images converted from point clouds.
A comparison with other systems can be seen here. This repo contains the code and configuration files needed to reproduce the object detection results of [simCrossTrans]. It is modified from Swin Transformer for object detection. Original Readme

MODELs

Pretrained on the COCO dataset

These are the base models used for the UODDM work. You can download them from the official Swin Transformer repo, or download a backup from the Google Drive links provided here:

| Finetune dataset | model  | checkpoint   |
|------------------|--------|--------------|
| COCO             | Swin-T | swin-t-model |
| COCO             | Swin-S | swin-s-model |

UODDM fine-tuned on the SUN RGBD dataset

The UODDM work fine-tunes the above models on the SUN RGBD dataset. It provides two models, which differ in their input modalities:

  • INPUT A: RGB.
  • INPUT B: RGB, DHS, and RGB and DHS mixed in a chessboard pattern.
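
To make the chessboard mixture concrete, here is a minimal sketch of how an RGB image and a DHS pseudo image of the same size could be interleaved patch by patch. It is an illustration only, not the code used in this repo: the function name `chessboard_mix`, the `patch_size` parameter, and the choice of which modality fills the top-left cell are all assumptions. It uses plain nested lists to stay dependency-free; a real pipeline would more likely use array operations.

```python
def chessboard_mix(rgb, dhs, patch_size=1, rgb_first=True):
    """Mix two equally sized images in a chessboard pattern.

    `rgb` and `dhs` are 2D grids (lists of rows) of pixels. Cells of the
    chessboard are `patch_size` x `patch_size` pixels (hypothetical parameter).
    """
    if patch_size < 1:
        raise ValueError("patch_size must be at least 1")
    height, width = len(rgb), len(rgb[0])
    if len(dhs) != height or any(len(row) != width for row in dhs):
        raise ValueError("rgb and dhs must have the same height and width")

    mixed = []
    for y in range(height):
        row = []
        for x in range(width):
            # A cell is "even" when its row and column cell indices sum to an even number.
            even_cell = ((y // patch_size) + (x // patch_size)) % 2 == 0
            take_rgb = even_cell == rgb_first
            row.append(rgb[y][x] if take_rgb else dhs[y][x])
        mixed.append(row)
    return mixed


if __name__ == "__main__":
    # Toy 4x4 "images" where each pixel is just a label.
    rgb = [["R"] * 4 for _ in range(4)]
    dhs = [["D"] * 4 for _ in range(4)]
    for line in chessboard_mix(rgb, dhs, patch_size=2):
        print(" ".join(line))
```

With `patch_size=2` on the toy input, the top-left and bottom-right 2x2 cells come from the RGB image and the other two come from the DHS image.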

A model that takes only DHS as input is also available; it can be found in the simCrossTrans work. The performance below is mAP50 on SUNRGBD10, which covers 10 common categories. For details, please check the UODDM paper.

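| Finetune dataset | input   | model  | checkpoint                   | config file | performance on RGB validation | performance on DHS validation | performance when both RGB and DHS are available | log                                 |
|------------------|---------|--------|------------------------------|-------------|-------------------------------|-------------------------------|-------------------------------------------------|-------------------------------------|
| SUN RGBD         | INPUT A | swin-t | basedRGB                     | cfg         | 53.9                          | N/A                           | N/A                                             | log                                 |
| SUN RGBD         | INPUT B | swin-t | basedRGBandDHSandRGBDHSmixed | cfg         | 54.2                          | 55.8                          | 58.1                                            | test_on_RGB, test on RGB DHS mixed  |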
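
For readers unfamiliar with the metric: mAP50 is mean average precision where a predicted box counts as a match only if its intersection over union (IoU) with a ground-truth box is at least 0.5. The sketch below shows just that matching criterion, not the full mAP computation and not the evaluation code used in this repo. The function names and the `(x1, y1, x2, y2)` box format are assumptions made for the example.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_at_iou50(pred_box, gt_box, threshold=0.5):
    """True when the prediction overlaps the ground truth enough to count at IoU 0.5."""
    return box_iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    gt = (0, 0, 4, 4)
    print(matches_at_iou50((0, 0, 4, 3), gt))  # IoU 0.75
    print(matches_at_iou50((2, 0, 6, 4), gt))  # IoU 1/3
```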

Usage

Finetune based on COCO for SUNRGBD

The training and test code for the SUN RGBD dataset can be found in the sunrgbd folder. To train on the SUN RGBD dataset starting from a model pretrained on COCO, run the following:

cd sunrgbd
./shell_script/uoddm/train_swin_transform.sh

You need to download a model pretrained on the COCO dataset; the models are listed in the MODELs section above. To train on RGB images, use:

# train RGB with pretrained weights from coco for 100 epochs

About SUN RGBD categories

The SUN RGBD dataset also has 80 categories, to align with the COCO dataset. The SUN RGBD classes directly overwrite the COCO dataset's classes; see this line. If you want to directly use the pretrained model from the SUN RGBD dataset, you need to use the following customized mmdetection (it updates the category names to SUN RGBD and adds some inference code).

https://github.com/liketheflower/mmdetection_beta

Inference

Run the following shell script:

./sunrgbd/shell_script/uoddm/inference/inference.sh

Installation

Please refer to get_started.md for installation.

uoddm's People

Contributors

liketheflower, zhujunli1993
