
visgel's Introduction

Connecting Touch and Vision via Cross-Modal Prediction

Yunzhu Li, Jun-Yan Zhu, Russ Tedrake, Antonio Torralba

CVPR 2019 [website] [paper]

Installation

This code base is tested with Ubuntu 16.04 LTS, Python 3.7, and PyTorch 0.3.1. Other versions might work but are not guaranteed.

Install PyTorch 0.3.1 using Anaconda

conda install pytorch=0.3.1 cuda90 -c pytorch

Install OpenCV and imageio-ffmpeg

conda install -c conda-forge opencv
conda install -c conda-forge imageio-ffmpeg
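Before running the demo, it can help to confirm that the installed packages are visible to the active environment. The following is a minimal sketch and not part of the repository. It assumes the usual import names (`torch` for PyTorch, `cv2` for OpenCV, `imageio_ffmpeg` for imageio-ffmpeg) and only checks whether each module can be located, without importing it.

```python
import importlib.util

# Import names are assumptions: PyTorch -> torch, OpenCV -> cv2,
# imageio-ffmpeg -> imageio_ffmpeg.
REQUIRED_MODULES = ("torch", "cv2", "imageio_ffmpeg")


def find_missing(modules=REQUIRED_MODULES):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

An empty result only means the modules can be found. It does not confirm that the installed versions match the ones listed above.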

Demo

Download pretrained checkpoints (~500 MB) and example data (~570 MB)

bash scripts/download_ckps.sh
bash scripts/download_demo.sh

Evaluate the pretrained checkpoints on the example data

bash scripts/demo_vision2touch.sh
bash scripts/demo_touch2vision.sh

The results are stored in dump_vision2touch/demo or dump_touch2vision/demo.
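To get a quick overview of what the demo scripts wrote, the output directories can be summarized by file type. This is a hedged sketch: the directory names come from the text above, but the kinds of files the scripts produce are not documented here, so the helper (a hypothetical name, `count_by_extension`) simply counts whatever it finds.

```python
from collections import Counter
from pathlib import Path

# Output directories named in the text above.
RESULT_DIRS = ("dump_vision2touch/demo", "dump_touch2vision/demo")


def count_by_extension(directory):
    """Count files under `directory` (recursively) by file extension."""
    root = Path(directory)
    if not root.is_dir():
        # A missing directory yields an empty count rather than an error.
        return Counter()
    return Counter(
        p.suffix.lower() or "<none>" for p in root.rglob("*") if p.is_file()
    )


if __name__ == "__main__":
    for d in RESULT_DIRS:
        print(d, dict(count_by_extension(d)))
```

An empty count for a directory suggests that the corresponding demo script has not been run yet, or that it wrote its results elsewhere.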

The following examples show the ground truth in a green box and our model's predictions in red.

Vision to Touch

  

Touch to Vision

  

Training

Download the data lists (~430 MB), the data for objects considered seen (328 GB), and the data for objects considered unseen (83.2 GB).

bash scripts/download_data_lst.sh
bash scripts/download_data_seen.sh
bash scripts/download_data_unseen.sh
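Because the full dataset is large, it may be worth checking free disk space before starting the downloads. The sketch below is not part of the repository. It uses the sizes stated above, treats 1 GB as 10^9 bytes, and assumes the data is downloaded into the current directory. Whether the files are compressed archives that need additional room for extraction is not stated, so treat the total as a lower bound.

```python
import shutil

# Sizes in GB as stated above: data lists, seen objects, unseen objects.
DOWNLOAD_SIZES_GB = {"data lists": 0.43, "seen": 328.0, "unseen": 83.2}


def required_gb(sizes=DOWNLOAD_SIZES_GB):
    """Total download size in GB."""
    return sum(sizes.values())


def has_enough_space(path=".", sizes=DOWNLOAD_SIZES_GB):
    """True if the filesystem holding `path` has room for all downloads."""
    free_gb = shutil.disk_usage(path).free / 1e9  # assumes 1 GB = 1e9 bytes
    return free_gb >= required_gb(sizes)


if __name__ == "__main__":
    print(f"Need about {required_gb():.1f} GB")
    print("Enough space:", has_enough_space("."))
```

To skip the unseen-object data, pass a reduced dictionary, for example `has_enough_space(".", {"data lists": 0.43, "seen": 328.0})`.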

Train the vision2touch or touch2vision model using the corresponding script.

bash scripts/train_vision2touch.sh
bash scripts/train_touch2vision.sh

Evaluation

Make sure the data lists, data_seen and data_unseen are in place. Run the following scripts to evaluate the trained model.

bash scripts/eval_vision2touch.sh
bash scripts/eval_touch2vision.sh

For Vision to Touch, the deformation error for seen objects is 0.6439 and the error for unseen objects is 0.7573.

Citation

If you find this codebase useful in your research, please consider citing:

@inproceedings{li2019connecting,
    Title={Connecting Touch and Vision via Cross-Modal Prediction},
    Author={Li, Yunzhu and Zhu, Jun-Yan and Tedrake, Russ and Torralba, Antonio},
    Booktitle = {CVPR},
    Year = {2019}
}
