Coder Social home page Coder Social logo

alanxuefei / morefusion Goto Github PK

View Code? Open in Web Editor NEW

This project forked from wkentaro/morefusion

0.0 0.0 0.0 54.74 MB

MoreFusion: Multi-object Reasoning for 6D Pose Estimation from Volumetric Fusion

Home Page: https://morefusion.wkentaro.com

License: Other

Shell 2.18% Makefile 0.14% Python 85.17% CMake 0.80% C++ 10.72% Cuda 0.53% C 0.47%

morefusion's Introduction

MoreFusion

Multi-object Reasoning for 6D Pose Estimation from Volumetric Fusion

Kentaro Wada, Edgar Sucar, Stephen James, Daniel Lenton, Andrew J. Davison
Dyson Robotics Laboratory , Imperial College London
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020

MoreFusion is an object-level reconstruction system that builds a map with known-shaped objects, exploiting volumetric reconstruction of detected objects in a real-time, incremental scene reconstruction senario. The key components are:

  • Occupancy-based volumetric reconstruction of detected objects for model alignment in the later stage;
  • Volumetric pose prediction that exploits volumetric reconstruction and CNN feature extraction from the image observation;
  • Joint pose refinement of objects based on geometric consistency among objects and impenetrable space.

Installation

There're several options for installation:

NOTE: We have developed this project on Ubuntu 16.04 (and ROS Kinetic, CUDA 10.1), so several code changes may be needed to adapt to other OS (and ROS, CUDA versions).

Python project only

make install

ROS project for camera demonstration

mkdir -p ~/ros_morefusion/src
cd ~/ros_morefusion/src

git clone https://github.com/wkentaro/morefusion.git
cd morefusion
make install

cd ~/ros_morefusion
ln -s src/ros/*.sh .

./rosdep_install.sh
./catkin_build.robot_agent.sh

source .autoenv.zsh

ROS project for robotic demonstration

  • robot-agent: A computer for visual processing.
  • robot-node: A computer with real-time OS for Panda robot.

@robot-agent

Same as above instruction: ROS project for camera demonstration.

@robot-node

mkdir -p ~/ros_morefusion/src
cd ~/ros_morefusion/src

git clone https://github.com/wkentaro/morefusion.git

cd ~/ros_morefusion
ln -s src/ros/*.sh .

./catkin_build.robot_node.sh
source devel/setup.bash

rosrun morefusion_panda create_udev_rules.sh

Usage

Training & Inference

Pre-trained models are provided in the demos as following, so this process is optional to run the demos.

Instance Segmentation

cd examples/ycb_video/instance_segm
./download_dataset.py
mpirun -n 4 python train_multi.py  # 4-gpu training
./image_demo.py --model logs/XXX/XXX.npz

6D pose prediction

# baseline model (point-cloud-based)
cd examples/ycb_video/singleview_pcd
./download_dataset.py
./train.py --gpu 0 --centerize-pcd --pretrained-resnet18  # 1-gpu
mpirun -n 4 ./train.py --multi-node --centerize-pcd --pretrained-resnet18  # 4-gpu

# volumetric prediction model (3D-CNN-based)
cd examples/ycb_video/singleview_3d
./download_dataset.py
./train.py --gpu 0 --pretrained-resnet18 --with-occupancy  # 1-gpu
mpirun -n 4 ./train.py --multi-node --pretrained-resnet18 --with-occupancy  # 4-gpu
mpirun -n 4 ./train.py --multi-node --pretrained-resnet18  # w/o occupancy

# inference
./download_pretrained_model.py  # for downloading pretrained model
./demo.py logs/XXX/XXX.npz
./evaluate.py logs/XXX

Joint pose refinement

cd examples/ycb_video/pose_refinement
./check_icp_vs_icc.py  # press [s] to start

Camera demonstration

Static Scene

# using orb-slam2 for camera tracking
roslaunch morefusion_panda_ycb_video rs_rgbd.launch
roslaunch morefusion_panda_ycb_video rviz_static.desk.launch
roslaunch morefusion_panda_ycb_video setup_static.desk.launch

Figure 1. Static Scene Reconstruction with the Human Hand-mounted Camera.
# using robotic kinematics for camera tracking
roslaunch morefusion_panda_ycb_video rs_rgbd.robot.launch
roslaunch morefusion_panda_ycb_video rviz_static.robot.launch
roslaunch morefusion_panda_ycb_video setup_static.robot.launch

Figure 2. Static Scene Reconstruction with the Robotic Hand-mounted Camera.

Dynamic Scene

roslaunch morefusion_panda_ycb_video rs_rgbd.launch
roslaunch morefusion_panda_ycb_video rviz_dynamic.desk.launch
roslaunch morefusion_panda_ycb_video setup_dynamic.desk.launch

roslaunch morefusion_panda_ycb_video rs_rgbd.robot.launch
roslaunch morefusion_panda_ycb_video rviz_dynamic.robot.launch
roslaunch morefusion_panda_ycb_video setup_dynamic.robot.launch

Figure 3. Dynamic Scene Reconstruction with the Human Hand-mounted Camera.

Robotic Demonstration

Robotic Pick-and-Place

robot-agent $ sudo ntpdate 0.uk.pool.ntp.org  # for time synchronization
robot-node  $ sudo ntpdate 0.uk.pool.ntp.org  # for time synchronization

robot-node  $ roscore

robot-agent $ roslaunch morefusion_panda panda.launch

robot-node  $ roslaunch morefusion_panda_ycb_video rs_rgbd.robot.launch
robot-node  $ roslaunch morefusion_panda_ycb_video rviz_static.launch
robot-node  $ roslaunch morefusion_panda_ycb_video setup_static.robot.launch TARGET:=2
robot-node  $ rosrun morefusion_panda_ycb_video robot_demo_node.py
>>> ri.run()

Figure 4. Targetted Object Pick-and-Place. (a) Scanning the Scene; (b) Removing Distractor Objects; (c) Picking Target Object.

Citation

If you find MoreFusion useful, please consider citing the paper as:

@inproceedings{Wada:etal:CVPR2020,
  title={{MoreFusion}: Multi-object Reasoning for {6D} Pose Estimation from Volumetric Fusion},
  author={Kentaro Wada and Edgar Sucar and Stephen James and Daniel Lenton and Andrew J. Davison},
  booktitle={Proceedings of the {IEEE} Conference on Computer Vision and Pattern Recognition ({CVPR})},
  year={2020},
}

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.