
trungmaster5 / 3d-vision-and-touch


This project forked from facebookresearch/3d-vision-and-touch


When told to understand the shape of a new object, the most instinctual approach is to pick it up and inspect it with your hand and eyes in tandem. Here, touch provides high-fidelity localized information while vision provides complementary global context. However, in 3D shape reconstruction, the complementary fusion of visual and haptic modalities remains largely unexplored. In this paper, we study this problem and present an effective chart-based approach to fusing vision and touch, which leverages advances in graph convolutional networks. To do so, we introduce a dataset of simulated touch and vision signals from the interaction between a robotic hand and a large array of 3D objects. Our results show that (1) leveraging both vision and touch signals consistently improves over single-modality baselines, especially when the object is occluded by the hand touching it; (2) our approach outperforms alternative modality fusion methods and strongly benefits from the proposed chart-based structure; (3) reconstruction quality improves with the number of grasps provided; and (4) the touch information not only enhances the reconstruction at the touch site but also extrapolates to its local neighborhood.


Image-to-Set Prediction

Companion code for E.J. Smith, et al.: 3D Shape Reconstruction from Vision and Touch.

This repository contains a code base and dataset for learning to fuse vision and touch signals from the grasp interaction of a simulated robotic hand and a 3D object for 3D shape reconstruction. The code comes with predefined train/valid/test splits over the released dataset, pretrained models, and training and evaluation scripts. Due to licensing issues, this code base uses a subset of the ABC Dataset (released under the MIT License) instead of the dataset listed in the paper. We apologize for the discrepancy; however, no data could have been released otherwise. We have provided updated reconstruction accuracies for the new dataset below.

If you find this code useful in your research, please consider citing with the following BibTeX entry:

@misc{VisionTouch,
Author = {Edward J. Smith and Roberto Calandra and Adriana Romero and Georgia Gkioxari and David Meger and Jitendra Malik and Michal Drozdzal},
Title = {3D Shape Reconstruction from Vision and Touch},
Year = {2020},
journal = {arXiv:1911.05063},
}

Installation

This code uses Python 3.6.9, PyTorch 1.4.0, and CUDA 10.1.

  • Install PyTorch:
$ pip install torch==1.4.0
  • Install the dependencies:
$ pip install -r requirements.txt
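
Before training, it can help to confirm that the environment matches the versions listed above. The following is a minimal sketch, not part of the repository: the function name check_environment and the report keys are hypothetical, and it only assumes that torch exposes torch.__version__, torch.version.cuda, and torch.cuda.is_available().

```python
import sys

# Versions stated in this README; adjust if your setup differs.
EXPECTED_PYTHON = (3, 6)
EXPECTED_TORCH = "1.4.0"
EXPECTED_CUDA = "10.1"


def check_environment():
    """Return a dict describing how the current setup compares to the README."""
    report = {"python_ok": sys.version_info[:2] == EXPECTED_PYTHON}
    try:
        import torch
    except ImportError:
        # PyTorch is missing, so the remaining checks cannot be performed.
        report["torch_installed"] = False
        return report
    report["torch_installed"] = True
    report["torch_ok"] = str(torch.__version__).startswith(EXPECTED_TORCH)
    # torch.version.cuda is None for CPU-only builds.
    report["cuda_ok"] = torch.version.cuda == EXPECTED_CUDA
    report["gpu_visible"] = torch.cuda.is_available()
    return report


for key, value in check_environment().items():
    print(key, value)
```

A False entry does not necessarily mean the code will fail; it only flags a difference from the versions the README names.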

Dataset

To download the dataset, call the following. Keep in mind that it will take some time to download and unpack:

$ bash download_data.sh

The dataset is released under the MIT License.

Training

Touch Chart Prediction

To train a model to predict touch charts, i.e., the local geometry at each touch site, first move into the touch chart directory:

$ cd touch_charts

To begin training call:

$ python recon.py --exp_type <exp_type> --exp_id <exp_id> 

where <exp_type> and <exp_id> are the experiment type and ID you wish to specify. A number of other arguments change the default training parameters; call the script with --help to view them.

Checkpoints will be saved under a directory "experiments/checkpoint/<exp_type>/<exp_id>/", specified by --exp_type and --exp_id.

To check training progress with Tensorboard:

$ tensorboard --logdir=experiments/tensorboard/<exp_type>/  --port=6006
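
The directory layout described above can be sketched as a small helper. This is an illustration only: the function experiment_paths and its root argument are hypothetical, and the layout is taken from the paths quoted in this README rather than from the training script itself.

```python
from pathlib import PurePosixPath


def experiment_paths(exp_type, exp_id, root="experiments"):
    """Build the checkpoint and Tensorboard locations for one experiment."""
    base = PurePosixPath(root)
    return {
        # Checkpoints are stored per experiment type and ID.
        "checkpoint": base / "checkpoint" / exp_type / exp_id,
        # The --logdir shown above points at the experiment type level.
        "tensorboard": base / "tensorboard" / exp_type,
    }


paths = experiment_paths("my_type", "my_id")
print(paths["checkpoint"])   # experiments/checkpoint/my_type/my_id
print(paths["tensorboard"])  # experiments/tensorboard/my_type
```

Because the Tensorboard directory is keyed by experiment type, runs sharing an <exp_type> would be expected to appear together in the same Tensorboard session.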

The training above only predicts a point cloud for each haptic signal. To optimize a mesh sheet to match this predicted point cloud and produce a predicted touch chart at every touch site, call the following:

$ mv data/sheets data/pretrained_sheets
$ python produce_sheets.py --save_directory experiments/checkpoint/<exp_type>/<exp_id>/encoder_touch

where <exp_type> and <exp_id> are the same settings used during training. The first command moves aside the premade sheets, which were produced using the pretrained model. If you would like to use the premade sheets, simply skip this step. By default, the script uses the provided pretrained model to perform this optimization. Regardless of the model used, this will take some time to complete. If you would like to use Slurm to produce these sheets, the submit.py file can be called instead.

Global Prediction

To train a model to deform vision charts around touch charts and produce a full surface prediction, first move into the vision chart directory:

$ cd vision_charts

To begin training call:

$ python recon.py --exp_type <exp_type> --exp_id <exp_id> 

where <exp_type> and <exp_id> are the experiment type and ID you wish to specify. A number of other arguments change the default training parameters; call the script with --help to view them.

Checkpoints will be saved under a directory "experiments/checkpoint/<exp_type>/<exp_id>/", specified by --exp_type and --exp_id.

To check training progress with Tensorboard:

$ tensorboard --logdir=experiments/tensorboard/<exp_type>/  --port=6006

The same level of hyperparameter search used in the paper can be reproduced using Slurm and the submit.py file located in the same folder.

Evaluation

Touch Chart Prediction

To evaluate the touch chart prediction, run the following from the touch chart directory:

$ python recon.py --eval --exp_type <exp_type> --exp_id <exp_id> 

where <exp_type> and <exp_id> are the experiment type and id specified during training.

Global Prediction

To evaluate the global prediction, run the following from the vision chart directory:

$ python recon.py --eval --exp_type <exp_type> --exp_id <exp_id> 

where <exp_type> and <exp_id> are the experiment type and id specified during training.

Pretrained Models

To download the pretrained models, call the following:

$ bash prepare_models.sh

To produce touch charts using the pretrained model call:

$ cd touch_charts
$ python produce_sheets.py 

As this is a time-intensive procedure, the submit.py file can be called instead if you would like to use Slurm to produce these sheets. Premade sheets are also provided in the dataset.

To test reconstruction with the pretrained models using different input modalities, call:

$ cd vision_charts
$ python recon.py --pretrained <model> --eval

where <model> is one of ['empty', 'touch', 'touch_unoccluded', 'touch_occluded', 'unoccluded', 'occluded'].
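
To evaluate every pretrained model in one go, the command above can be generated for each model name. This is a minimal sketch under stated assumptions: build_eval_commands and run_all are hypothetical helpers, not part of the repository, and they use only the --pretrained and --eval flags shown above. run_all is defined but not called here, since it needs the directory that contains recon.py.

```python
import subprocess

# Model names as listed in this README for the --pretrained flag.
PRETRAINED_MODELS = [
    "empty", "touch", "touch_unoccluded",
    "touch_occluded", "unoccluded", "occluded",
]


def build_eval_commands(models=PRETRAINED_MODELS):
    """Return one argument list per pretrained model."""
    return [["python", "recon.py", "--pretrained", name, "--eval"]
            for name in models]


def run_all(directory):
    """Run each evaluation in turn from the directory holding recon.py."""
    for command in build_eval_commands():
        # check=True stops at the first evaluation that exits with an error.
        subprocess.run(command, cwd=directory, check=True)


# Print the commands without executing them.
for command in build_eval_commands():
    print(" ".join(command))
```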

The following table highlights the reconstruction accuracies of these models on the test set:

Input                 Chamfer Distance
No Input              26.888
Touch                 6.926
Occluded              2.936
Unoccluded            2.844
Touch + Occluded      2.406
Touch + Unoccluded    2.468
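
For readers unfamiliar with the metric, the following sketch shows one common definition of the Chamfer distance between two point sets: the sum of the two mean squared nearest-neighbor distances. It is an assumption that this matches the variant reported above; the repository's exact formulation, point sampling, and scaling may differ, and the function name chamfer_distance is hypothetical.

```python
def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two lists of 3D points."""
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")

    def squared_distance(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

    def one_sided(source, target):
        # Mean squared distance from each source point to its nearest target.
        total = sum(min(squared_distance(p, q) for q in target) for p in source)
        return total / len(source)

    return one_sided(points_a, points_b) + one_sided(points_b, points_a)


predicted = [(0.0, 0.0, 0.0)]
ground_truth = [(1.0, 0.0, 0.0)]
print(chamfer_distance(predicted, ground_truth))  # 2.0
```

Lower values indicate that the predicted surface lies closer to the ground truth. This brute-force version compares every pair of points, so it is only practical for small point sets.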

License

This codebase and dataset are released under the MIT license; see LICENSE for details.

