This project forked from parkjaewoo0611/dprost

[ECCV2022] DProST: Dynamic Projective Spatial Transformer Network for 6D Pose Estimation

License: MIT License

Shell 3.06% Python 96.94%

DProST: Dynamic Projective Spatial Transformer Network

PyTorch implementation of "DProST: Dynamic Projective Spatial Transformer Network for 6D Pose Estimation" [ECCV2022].
DProST is a 6D object pose estimation method based on grid matching in object space.

[ArXiv]

Overview of DProST

Predicting an object's 6D pose from a single RGB image is a fundamental computer vision task. Pose estimation methods generally use the distance between transformed object vertices as the objective function. However, those methods do not consider projective geometry in camera space, which degrades performance. In this regard, we propose a new pose estimation system based on a projective grid instead of object vertices. Our pose estimation method, the dynamic projective spatial transformer network (DProST), localizes the region-of-interest grid on the rays in camera space and transforms the grid to object space using the estimated pose. The transformed grid is used both as a sampling grid and as a new criterion for the estimated pose. Additionally, because DProST does not require object vertices, our method can be used in a mesh-less setting by replacing the mesh with a reconstructed feature. Experimental results show that mesh-less DProST outperforms the state-of-the-art mesh-based methods on the LINEMOD and LINEMOD-OCCLUSION datasets, and shows competitive performance on the YCBV dataset with mesh data.
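
To make the grid idea concrete, here is a minimal sketch. It is not the authors' implementation. It back-projects pixels of a 2D region of interest through a pinhole camera, samples points along the resulting rays, and moves them into object space with the inverse of a pose. The function names, the grid resolution, the depth range, and the pose convention (X_cam = R X_obj + t) are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def linspace(start: float, stop: float, n: int) -> List[float]:
    """Return n evenly spaced values from start to stop (inclusive)."""
    if n == 1:
        return [start]
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]


def backproject(u: float, v: float, fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Point at unit depth on the camera ray through pixel (u, v)."""
    return ((u - cx) / fx, (v - cy) / fy, 1.0)


def camera_grid(box: Sequence[float], intrinsics: Sequence[float],
                z_near: float, z_far: float, n_xy: int = 4, n_z: int = 4) -> List[Vec3]:
    """Sample a frustum-shaped grid: pixels inside `box`, each pushed along its ray."""
    x0, y0, x1, y1 = box
    fx, fy, cx, cy = intrinsics
    points: List[Vec3] = []
    for z in linspace(z_near, z_far, n_z):        # depth slices
        for v in linspace(y0, y1, n_xy):          # image rows
            for u in linspace(x0, x1, n_xy):      # image columns
                dx, dy, dz = backproject(u, v, fx, fy, cx, cy)
                points.append((dx * z, dy * z, dz * z))
    return points


def to_object_space(points: Sequence[Vec3], R: Sequence[Sequence[float]],
                    t: Sequence[float]) -> List[Vec3]:
    """Apply the inverse pose X_obj = R^T (X_cam - t), assuming X_cam = R X_obj + t."""
    out: List[Vec3] = []
    for p in points:
        d = [p[i] - t[i] for i in range(3)]
        # Row j of R^T is column j of R.
        out.append(tuple(sum(R[i][j] * d[i] for i in range(3)) for j in range(3)))
    return out
```

Because the grid is built on camera rays, its cross-section widens with depth, which is the projective effect that a vertex-distance objective ignores.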

Installation

  • Clone this repository:
git clone https://github.com/parkjaewoo0611/DProST.git
cd DProST
  • DProST is based on PyTorch 1.7.0, cudatoolkit 11.0, and PyTorch3D 0.5.0.
  • Check that your installed CUDA version matches the CUDA version of your PyTorch build.
  • Create the Anaconda virtual environment, install the packages, and clone the official bop_toolkit with the following command:
source source_install.sh
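
One way to do the version check after installation is a short Python script. This is a sketch and not part of the repository. It relies only on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`; the helper name `describe_torch_install` is hypothetical.

```python
def describe_torch_install() -> dict:
    """Collect the version facts to compare against the requirements listed above."""
    info = {"torch": None, "torch_cuda": None, "cuda_available": False, "pytorch3d": None}
    try:
        import torch
    except ImportError:
        return info  # PyTorch is not installed in this environment
    info["torch"] = torch.__version__
    info["torch_cuda"] = torch.version.cuda  # None for CPU-only builds
    info["cuda_available"] = bool(torch.cuda.is_available())
    try:
        import pytorch3d
        info["pytorch3d"] = getattr(pytorch3d, "__version__", None)
    except ImportError:
        pass  # PyTorch3D is missing or failed to import
    return info


if __name__ == "__main__":
    for key, value in describe_torch_install().items():
        print(f"{key}: {value}")
```

If `torch_cuda` differs from the cudatoolkit version in the environment, or `cuda_available` is false on a GPU machine, the install likely needs to be redone before training.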

Dataset

  • Set dataset_root in source_download.sh to your own data folder.
  • Download the data and create a symbolic link to your dataset folder by running:
./source_download.sh
  • Preprocessed data parameters (bboxes, K, indexes, ...) are stored in the pickle files.
  • The dataset structure should look like this:
Dataset
├──LINEMOD
    ├──dataset_info.md
    ├──train.pickle
    ├──test.pickle
    ├──train_pbr.pickle
    ├──train_syn.pickle
    ├──index
        ├──ape_test.txt
        ...
    ├──models
        ├──models_info.json
        ├──obj_000001.ply
        ...
    ├──pbr
        ├──000000
        ...
    ├──test
        ├──000001
        ...
    ├──test_bboxes
        ├──bbox_faster_all.json
        ├──bbox_yolov3_all.json
    ├──train
        ├──000001
        ...
    ├──syn (optional)
        ├──ape
        ...
    ├──backgrounds (optional)
        ├──2007_000027.jpg
        ...
├──OCCLUSION
├──YCBV
  • The structure of the OCCLUSION and YCBV datasets should be the same as that of LINEMOD.
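
The layout above can be checked before training with a few lines of Python. This is a sketch: the entry list is copied from the tree shown above, leaving out the two folders marked optional, and the helper name `missing_entries` is hypothetical.

```python
from pathlib import Path
from typing import List

# Entries copied from the tree above; "syn" and "backgrounds" are optional and skipped.
EXPECTED = [
    "dataset_info.md",
    "train.pickle",
    "test.pickle",
    "train_pbr.pickle",
    "train_syn.pickle",
    "index",
    "models/models_info.json",
    "pbr",
    "test",
    "test_bboxes",
    "train",
]


def missing_entries(dataset_dir: str) -> List[str]:
    """Return the expected files or folders that are absent under dataset_dir."""
    root = Path(dataset_dir)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries("Dataset/LINEMOD")
    if missing:
        print("Missing entries:", ", ".join(missing))
    else:
        print("Dataset/LINEMOD matches the expected layout.")
```

`Path.exists()` follows symbolic links, so the check also works when `Dataset` is a link to another data folder.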

Train

  • Check that the code and model are installed properly by running the following toy example:
python train.py --gpu_id 0 --data_dir Dataset/LINEMOD --is_toy true
  • For faster training, we added DDP code for multi-GPU training, and GPU scheduling to manage many experiments easily.
  • Train DProST with the options you need.
Examples:
# LINEMOD ape object on gpu 0
python train.py --gpu_id 0 --data_dir Dataset/LINEMOD --use_mesh false --obj_list 1 --mode train_pbr 

# LINEMOD each object on gpu 0,1,2 (with gpu_scheduling)
simple_gpu_scheduler --gpus 0,1,2 < gpu_commands.txt

# OCCLUSION all objects on gpu 0,1,2 (with DDP)
python train.py --gpu_id 0,1,2 --data_dir Dataset/OCCLUSION --use_mesh false --obj_list 1 5 6 8 9 10 11 12 --mode train_pbr

# train YCBV 002_master_chef_can object with mesh on gpu 0 
python train.py --gpu_id 0 --data_dir Dataset/YCBV --use_mesh true --obj_list 1 --mode train_pbr --epochs 300 --save_period 10 --early_stop 100 --lr_step_size 200 --valid_metrics ADD_S_AUC
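
A command file for the GPU scheduler can also be generated rather than written by hand. The sketch below is not the repository's own gpu_commands.txt, which remains the reference. It only reuses flags that appear in the examples above. The object IDs in the demo are placeholders, and `--gpu_id` is left out because how train.py selects a GPU under the scheduler is not stated here; pass it through `extra_args` if it is needed.

```python
import shlex
from typing import List, Optional, Sequence


def train_command(obj_id: int, data_dir: str = "Dataset/LINEMOD",
                  mode: str = "train_pbr", use_mesh: bool = False,
                  extra_args: Optional[Sequence[str]] = None) -> str:
    """Build one train.py command line for a single object."""
    argv = [
        "python", "train.py",
        "--data_dir", data_dir,
        "--use_mesh", "true" if use_mesh else "false",
        "--obj_list", str(obj_id),
        "--mode", mode,
    ]
    argv.extend(extra_args or [])
    return shlex.join(argv)  # quotes arguments only where the shell needs it


def command_lines(obj_ids: Sequence[int]) -> List[str]:
    """One command per object, in the one-command-per-line form the scheduler reads."""
    return [train_command(obj_id) for obj_id in obj_ids]


if __name__ == "__main__":
    # Placeholder object IDs; replace them with the objects you want to train.
    for line in command_lines([1, 2, 4]):
        print(line)
```

Redirecting the printed lines into a file gives input that can be piped to `simple_gpu_scheduler` in the same way as in the example above.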

Test

Pre-trained Model

  • Download pre-trained models here
  • Evaluate and visualize the model with the test code.
Example:
# evaluate & visualize result
python test.py -r pretrained/LMO_all/all/model/model_best.pth -p pretrained/LMO_all/all/result -v true --gpu_scheduler false --data_dir Dataset/OCCLUSION
-r: Path to the pretrained model and its config.json.  
-p: Path where the qualitative visualization results are saved.  
-v: Boolean option that enables visualization.  
--gpu_scheduler: Set to false to turn off the GPU scheduler during testing.  
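
The boolean options above are passed as the strings `true` and `false`. The sketch below shows one way such options can be parsed with `argparse`. It is an illustrative stand-in and not the actual test.py: the `dest` names and the default values are assumptions.

```python
import argparse


def str2bool(text: str) -> bool:
    """Convert 'true'/'false' style strings into booleans for argparse."""
    value = text.strip().lower()
    if value in {"true", "1", "yes"}:
        return True
    if value in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {text!r}")


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the options listed above (dest names and defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Illustrative stand-in for the test options")
    parser.add_argument("-r", dest="resume", required=True,
                        help="path to the pretrained model (config.json next to it)")
    parser.add_argument("-p", dest="result_path", default=None,
                        help="where qualitative results are saved")
    parser.add_argument("-v", dest="visualize", type=str2bool, default=False,
                        help="enable visualization")
    parser.add_argument("--gpu_scheduler", type=str2bool, default=True,
                        help="set to false to turn off the GPU scheduler")
    parser.add_argument("--data_dir", default=None, help="dataset folder")
    return parser


if __name__ == "__main__":
    demo = build_parser().parse_args(
        ["-r", "model_best.pth", "-v", "true", "--gpu_scheduler", "false"]
    )
    print(demo)
```

A plain `type=bool` would not work for this style of flag, because `bool("false")` is `True` in Python; an explicit converter avoids that pitfall.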

Qualitative Results

  • Qualitative results on OCCLUSION dataset (blue contours for prediction, green contours for ground-truth)

  • Qualitative results on YCBV dataset (blue contours for prediction, green contours for ground-truth)

TODO

  • Upload the dataset download link
  • Upload the pretrained model

Acknowledgements

Our project is based on the following projects. We thank the authors for sharing their great code and datasets.

Citation

@inproceedings{park2022dprost,
  title={DProST: Dynamic Projective Spatial Transformer Network for 6D Pose Estimation},
  author={Park, Jaewoo and Cho, Nam Ik},
  booktitle={ECCV},
  year={2022}
}
