bruinxiong / gen6d

This project forked from liuyuan-pal/gen6d

[ECCV2022] Gen6D: Generalizable Model-Free 6-DoF Object Pose Estimation from RGB Images

License: GNU General Public License v3.0

Python 100.00%

Gen6D

Gen6D estimates 6-DoF poses of unseen objects from RGB images.

Todo List

  • Pretrained models and evaluation code.
  • Pose estimation on custom objects.
  • Training code.

Installation

Required packages are listed in requirements.txt. To install PyTorch with CUDA support, please refer to the PyTorch documentation.

Download

  1. Download the pretrained models, the GenMOP dataset and the processed LINEMOD dataset from here.
  2. Organize the files as follows:
Gen6D
|-- data
    |-- model
        |-- detector_pretrain
            |-- model_best.pth
        |-- selector_pretrain
            |-- model_best.pth
        |-- refiner_pretrain
            |-- model_best.pth
    |-- GenMOP
        |-- chair 
            ...
    |-- LINEMOD
        |-- cat 
            ...
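
Before running the evaluation, it can help to confirm that the downloaded files ended up where the layout above expects them. The following is a minimal sketch, not part of the repository: the helper name `missing_paths` and the list `EXPECTED` are hypothetical, and the paths are copied from the tree above.

```python
from pathlib import Path

# Paths copied from the layout above. This helper is hypothetical and is
# not shipped with Gen6D.
EXPECTED = [
    "data/model/detector_pretrain/model_best.pth",
    "data/model/selector_pretrain/model_best.pth",
    "data/model/refiner_pretrain/model_best.pth",
    "data/GenMOP",
    "data/LINEMOD",
]


def missing_paths(root="."):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).exists()]


if __name__ == "__main__":
    # Run from the Gen6D project root.
    missing = missing_paths(".")
    if missing:
        print("Missing:")
        for p in missing:
            print("  " + p)
    else:
        print("All expected files and folders are present.")
```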

Evaluation

# Evaluate on the object TFormer from the GenMOP dataset
python eval.py --cfg configs/gen6d_pretrain.yaml --object_name genmop/tformer

# Evaluate on the object cat from the LINEMOD dataset
python eval.py --cfg configs/gen6d_pretrain.yaml --object_name linemod/cat

The ADD-0.1d and Prj-5 metrics will be printed on the screen.
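
For readers unfamiliar with these metrics, the sketch below follows their conventional definitions: a pose counts as correct under ADD-0.1d when the mean 3D distance between object points transformed by the ground-truth and the predicted pose is below 10% of the object diameter, and under Prj-5 when the mean 2D reprojection error is below 5 pixels. It assumes a pinhole camera and object-to-camera poses (x_cam = R x + t). All function names are illustrative, and the repository's own evaluation code may differ in details such as the handling of symmetric objects.

```python
import numpy as np


def transform(pts, R, t):
    """Apply an object-to-camera pose to points of shape (N, 3)."""
    return pts @ R.T + t


def project(pts, K, R, t):
    """Project object points to pixel coordinates with intrinsics K."""
    uvw = transform(pts, R, t) @ K.T
    return uvw[:, :2] / uvw[:, 2:3]


def add_error(pts, R_gt, t_gt, R_pr, t_pr):
    """Mean 3D distance between points under the two poses."""
    diff = transform(pts, R_gt, t_gt) - transform(pts, R_pr, t_pr)
    return float(np.linalg.norm(diff, axis=1).mean())


def proj_error(pts, K, R_gt, t_gt, R_pr, t_pr):
    """Mean 2D distance (in pixels) between the two projections."""
    diff = project(pts, K, R_gt, t_gt) - project(pts, K, R_pr, t_pr)
    return float(np.linalg.norm(diff, axis=1).mean())


def is_add_correct(pts, R_gt, t_gt, R_pr, t_pr, diameter, ratio=0.1):
    """ADD-0.1d criterion: error below `ratio` times the object diameter."""
    return add_error(pts, R_gt, t_gt, R_pr, t_pr) < ratio * diameter


def is_prj_correct(pts, K, R_gt, t_gt, R_pr, t_pr, pixels=5.0):
    """Prj-5 criterion: reprojection error below `pixels`."""
    return proj_error(pts, K, R_gt, t_gt, R_pr, t_pr) < pixels
```

The reported number for each metric is then the fraction of test images for which the corresponding criterion holds.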

Qualitative results

3D bounding boxes of estimated poses will be saved in data/vis_final/gen6d_pretrain/genmop/tformer. Ground-truth is drawn in green while prediction is drawn in blue.
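
To illustrate how such a visualization can be produced, the sketch below computes the eight corners of an object's 3D bounding box and projects them into the image with a given pose. It is an assumption-laden example rather than the repository's drawing code: it uses an axis-aligned box in object coordinates, a pinhole camera with intrinsics `K`, and an object-to-camera pose (x_cam = R x + t). The names `bbox_corners`, `project_corners` and `EDGES` are hypothetical.

```python
import numpy as np


def bbox_corners(pts):
    """Eight corners of the axis-aligned box around points of shape (N, 3).

    Corner i uses the max value on an axis when the matching bit of i is
    set (bit 2 -> x, bit 1 -> y, bit 0 -> z) and the min value otherwise.
    """
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    return np.array([
        [x, y, z]
        for x in (lo[0], hi[0])
        for y in (lo[1], hi[1])
        for z in (lo[2], hi[2])
    ])


# Two corners share an edge when their indices differ in exactly one bit.
EDGES = [(i, i ^ b) for i in range(8) for b in (1, 2, 4) if i < (i ^ b)]


def project_corners(corners, K, R, t):
    """Project box corners to pixel coordinates, shape (8, 2)."""
    uvw = (corners @ R.T + t) @ K.T
    return uvw[:, :2] / uvw[:, 2:3]
```

Drawing a line between the projected corners of every pair in `EDGES`, once with the ground-truth pose and once with the predicted pose, gives the two boxes described above.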

Intermediate results of detection, viewpoint selection and pose refinement will be saved in data/vis_inter/gen6d_pretrain/genmop/tformer.

This image shows detection results.

This image shows viewpoint selection results. The first row shows the input image to the selector. The second row shows the input image rotated by the estimated in-plane rotation (left column) or the ground-truth in-plane rotation (right column). The subsequent 5 rows show the 5 predicted (left) or ground-truth (right) reference images with the nearest viewpoints to the input image.

This image shows the pose refinement process. The red bbox represents the input pose, the green one represents the ground-truth and the blue one represents the output pose for the current refinement step.

Pose estimation on custom objects

Please refer to custom_object.md

Training

  1. Download the processed CO3D data (co3d.tar.gz), the Google Scanned Objects data (google_scanned_objects.tar.gz) and the ShapeNet renderings (shapenet.tar.gz) from here.
  2. Download the COCO 2017 training set.
  3. Organize the files as follows:
Gen6D
|-- data
    |-- GenMOP
        |-- chair 
            ...
    |-- LINEMOD
        |-- cat 
            ...
    |-- shapenet
        |-- shapenet_cache
        |-- shapenet_render
        |-- shapenet_render_v1.pkl
    |-- co3d_256_512
        |-- apple
            ...
    |-- google_scanned_objects
        |-- 06K3jXvzqIM
            ...
    |-- coco
        |-- train2017
  4. Train the detector
python train_model.py --cfg configs/detector/detector_train.yaml
  5. Train the selector
python train_model.py --cfg configs/selector/selector_train.yaml
  6. Prepare the validation data for training the refiner
python prepare.py --action gen_val_set \
                  --estimator_cfg configs/gen6d_train.yaml \
                  --que_database linemod/cat \
                  --que_split linemod_val \
                  --ref_database linemod/cat \
                  --ref_split linemod_val

python prepare.py --action gen_val_set \
                  --estimator_cfg configs/gen6d_train.yaml \
                  --que_database genmop/tformer-test \
                  --que_split all \
                  --ref_database genmop/tformer-ref \
                  --ref_split all 

These commands generate information in data/val, which is used to produce validation data for the refiner.

  7. Train the refiner

python train_model.py --cfg configs/refiner/refiner_train.yaml
  8. Evaluate all components together.
# Evaluate on the object TFormer from the GenMOP dataset
python eval.py --cfg configs/gen6d_train.yaml --object_name genmop/tformer

# Evaluate on the object cat from the LINEMOD dataset
python eval.py --cfg configs/gen6d_train.yaml --object_name linemod/cat
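
When several objects need to be evaluated, the commands above can be generated in a loop. The following is a small sketch under the assumption that it is run from the project root. It uses only the `--cfg` and `--object_name` flags shown above. The helper names `eval_command` and `run_all` and the `OBJECTS` list are hypothetical.

```python
import subprocess
import sys

# Object names taken from the commands above; extend as needed.
OBJECTS = ["genmop/tformer", "linemod/cat"]


def eval_command(cfg, object_name):
    """Build the eval.py command line for one object."""
    return [sys.executable, "eval.py", "--cfg", cfg, "--object_name", object_name]


def run_all(cfg="configs/gen6d_train.yaml", dry_run=True):
    """Print every command, and execute it unless `dry_run` is set."""
    for name in OBJECTS:
        cmd = eval_command(cfg, name)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the loop at the first failing evaluation.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` executes the evaluations one after another.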

Acknowledgements

This repository uses code or datasets from the following repositories. We thank all the authors for sharing their code and datasets.

We provide a list of recent papers on generalizable 6-DoF object pose estimators at https://github.com/liuyuan-pal/Awsome-generalizable-6D-object-pose.

Citation

@inproceedings{liu2022gen6d,
  title={Gen6D: Generalizable Model-Free 6-DoF Object Pose Estimation from RGB Images},
  author={Liu, Yuan and Wen, Yilin and Peng, Sida and Lin, Cheng and Long, Xiaoxiao and Komura, Taku and Wang, Wenping},
  booktitle={ECCV},
  year={2022}
}
