bruinxiong / affordance_diffusion

This project forked from nvlabs/affordance_diffusion

Codes for "Affordance Diffusion: Synthesizing Hand-Object Interactions"

Home Page: https://github.com/NVlabs/affordance_diffusion/blob/master

affordance_diffusion's Introduction

Affordance Diffusion: Synthesizing Hand-Object Interactions

Yufei Ye, Xueting Li, Abhinav Gupta, Shalini De Mello, Stan Birchfield, Jiaming Song, Shubham Tulsiani, Sifei Liu

in CVPR2023

TL;DR: Given a single RGB image of an object, hallucinate plausible ways a human could interact with it.

[Project Page] [Video] [Arxiv] [Data Generation]

Installation

See install.md

Inference

HOI synthesis

python inference.py data.data_dir='docs/demo/*.*g' test_num=3

The inference script first synthesizes $test_num HOI images in a batch and then extracts the 3D hand pose.

[Figure: input, synthesized HOI images, and extracted 3D hand pose]
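The `data.data_dir` value in the command above is a wildcard pattern rather than a single path. As a minimal sketch, assuming the script expands it with shell-style globbing (this README does not say how), the snippet below shows which names `*.*g` would accept. The file names are hypothetical and only illustrate the matching rule.

```python
from fnmatch import fnmatchcase

# The pattern passed as data.data_dir in the demo command.
PATTERN = "docs/demo/*.*g"

# Hypothetical file names, used only to illustrate the matching rule.
candidates = [
    "docs/demo/mug.png",
    "docs/demo/kettle.jpg",
    "docs/demo/bottle.jpeg",
    "docs/demo/scan.gif",
    "docs/demo/notes.txt",
    "docs/demo/run.log",
]

# Keep every name whose extension ends in "g".
matched = [path for path in candidates if fnmatchcase(path, PATTERN)]

for path in matched:
    print(path)
```

The pattern covers `.png`, `.jpg`, and `.jpeg` in one go, but it also accepts any other extension ending in `g` (such as `.log`), so it is safest to keep only images in the input folder.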

Interpolation

The script takes the layout parameters of the $index-th example predicted by inference.py and smoothly interpolates the HOI synthesis toward the horizontally flipped parameters. To run the demo:

python -m scripts.interpolate dir=docs/demo_inter

This should give results similar to:

[Figure: input, interpolated layouts, and output]

Additional parameters:

```
python -m scripts.interpolate dir=\${output}/release/layout/cascade index=0000_00_s0
```
  • interpolation.len: length of an interpolation sequence
  • interpolation.num: number of interpolation sequences
  • interpolation.test_name: subfolder to save the output
  • interpolation.orient: whether to horizontally flip approaching direction
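To make the idea of interpolating toward a flipped layout concrete, here is a minimal sketch. It is not the code in scripts/interpolate. It assumes a hypothetical layout vector `(x, y, size, dx, dy)` in normalized image coordinates and plain linear interpolation; the repository's actual parameterization and interpolation scheme may differ. The `length` and `flip_orientation` arguments are illustrative stand-ins for `interpolation.len` and `interpolation.orient`.

```python
from typing import List, Sequence


def flip_horizontally(layout: Sequence[float], flip_orientation: bool = True) -> List[float]:
    """Mirror a hypothetical layout (x, y, size, dx, dy) about the vertical axis.

    Coordinates are assumed to be normalized to [0, 1]. The x position is
    always mirrored; the x component of the approaching direction is
    negated only when flip_orientation is True.
    """
    x, y, size, dx, dy = layout
    return [1.0 - x, y, size, -dx if flip_orientation else dx, dy]


def interpolate_layouts(start: Sequence[float], end: Sequence[float], length: int) -> List[List[float]]:
    """Return `length` layouts evenly spaced from `start` to `end`, inclusive."""
    if length < 2:
        raise ValueError("length must be at least 2")
    steps = []
    for i in range(length):
        t = i / (length - 1)  # runs from 0.0 to 1.0
        steps.append([(1 - t) * s + t * e for s, e in zip(start, end)])
    return steps


# Hypothetical starting layout: hand on the left, approaching from the left.
layout = [0.25, 0.5, 0.4, 1.0, 0.0]
sequence = interpolate_layouts(layout, flip_horizontally(layout), length=5)

for step in sequence:
    print([round(v, 3) for v in step])
```

Note that linearly blending a direction vector with its mirror image passes through a zero-length direction at the midpoint; interpolating an angle instead would avoid that, at the cost of a slightly longer sketch.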

Heatmap Guidance

The following command runs guided generation with the keypoints in docs/demo_kpts:

python inference.py mode=hijack data.data_dir='docs/demo_kpts/*.png' test_name=hijack

This should give results similar to:

[Figure: input 1, output 1, input 2, output 2]
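This README does not describe how keypoints are turned into a heatmap. One common construction, shown here purely as an assumption and not as the repository's implementation, places an isotropic Gaussian at each keypoint and takes the per-pixel maximum. The function name, the `sigma` value, and the example keypoints are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def keypoints_to_heatmap(
    keypoints: Sequence[Tuple[float, float]],
    height: int,
    width: int,
    sigma: float = 2.0,
) -> List[List[float]]:
    """Render (x, y) pixel keypoints as a heatmap with values in (0, 1].

    Each keypoint contributes a Gaussian bump with peak value 1.0;
    overlapping bumps are merged with a per-pixel maximum.
    """
    heatmap = [[0.0] * width for _ in range(height)]
    for kx, ky in keypoints:
        for row in range(height):
            for col in range(width):
                dist_sq = (col - kx) ** 2 + (row - ky) ** 2
                value = math.exp(-dist_sq / (2.0 * sigma ** 2))
                heatmap[row][col] = max(heatmap[row][col], value)
    return heatmap


# Two hypothetical keypoints on an 8x8 grid.
heatmap = keypoints_to_heatmap([(2, 3), (6, 1)], height=8, width=8)

for row in heatmap:
    print(" ".join(f"{v:.2f}" for v in row))
```

A larger `sigma` spreads each bump over more pixels, which in a guidance setting would correspond to a looser constraint on where the hand should appear.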

Training

Data Preprocessing

We provide the script to generate the HO3Pair dataset. Please see preprocess/.

Train your own models

  • LayoutNet
python -m models.base -m --config-name=train \
  expname=reproduce/\${model.module} \
  model=layout
  • ContentNet-GLIDE
python -m models.base -m --config-name=train \
  expname=reproduce/\${model.module} \
  model=content_glide
  • ContentNet-LDM: First download the off-the-shelf pretrained model from here and place it at ${environment.pretrain}/stable/inpaint.ckpt, the path specified by resume_ckpt in configs/model/content_ldm.yaml.
python -m models.base -m  --config-name=train \
  expname=reproduce/\${model.module} \
  model=content_ldm 
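The three training commands differ only in the `model=` override, so they can be generated from one template. The sketch below builds them as argument lists; the helper name `training_command` is hypothetical. When the list is passed directly to a process launcher, no shell is involved, so `${model.module}` can be written without the backslash that the shell commands above use to stop the shell from expanding it.

```python
import shlex
from typing import List

# Model config names taken from the commands above.
MODELS = ["layout", "content_glide", "content_ldm"]


def training_command(model: str) -> List[str]:
    """Build the argument list for one training run (hypothetical helper)."""
    return [
        "python", "-m", "models.base", "-m",
        "--config-name=train",
        # Left unexpanded on purpose; the config system resolves it later.
        "expname=reproduce/${model.module}",
        f"model={model}",
    ]


commands = [training_command(model) for model in MODELS]

# shlex.join adds the quoting needed to paste each command into a shell.
for argv in commands:
    print(shlex.join(argv))
```

Each list could be launched with `subprocess.run(argv, check=True)` from the repository root. For `content_ldm`, the pretrained checkpoint described above needs to be in place first.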

Split and test images

Per-category HOI4D instance splits (not used in the paper) and test images from HOI4D and EPIC-KITCHENS (VISOR) can be downloaded here.

License

This project is licensed under CC-BY-NC-SA-4.0. Redistribution and use should follow this license.

Acknowledgement

Affordance Diffusion leverages many amazing open-source projects shared by the research community.

Citation

If you find this work helpful, please consider citing:

@inproceedings{ye2023affordance,
  title={Affordance Diffusion: Synthesizing Hand-Object Interactions},
  author={Yufei Ye and Xueting Li and Abhinav Gupta and Shalini De Mello and Stan Birchfield and Jiaming Song and Shubham Tulsiani and Sifei Liu},
  booktitle={CVPR},
  year={2023},
}
