
[cvpr 20] Demo, training and evaluation code for joint hand-object pose estimation in sparsely annotated videos

Home Page: https://hassony2.github.io/handobjectconsist.html

License: MIT License

cvpr2020 sparse-supervision photometric differentiable-rendering hands pose-estimation video 3d-reconstruction

handobjectconsist's Introduction

Leveraging Photometric Consistency over Time for Sparsely Supervised Hand-Object Reconstruction

Yana Hasson, Bugra Tekin, Federica Bogo, Ivan Laptev, Marc Pollefeys, and Cordelia Schmid

Table of Contents

Setup

Download and install code

  • Retrieve the code
git clone https://github.com/hassony2/handobjectconsist
cd handobjectconsist
  • Create and activate the virtual environment with python dependencies
conda env create --file=environment.yml
conda activate handobject_env
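
Optionally, sanity-check that the environment sees PyTorch and the GPU (this assumes PyTorch is installed by environment.yml, which the training scripts require):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"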

Download the MANO model files

  • Go to MANO website

  • Create an account by clicking Sign Up and provide your information

  • Download Models and Code (the downloaded file should have the format mano_v*_*.zip). Note that all code and data from this download falls under the MANO license.

  • Unzip it and copy the content of the models folder into the assets/mano folder

  • Your structure should look like this:

handobjectconsist/
  assets/
    mano/
      MANO_LEFT.pkl
      MANO_RIGHT.pkl
      fhb_skel_centeridx9.pkl
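
A quick way to verify the layout (a minimal sketch mirroring the tree above):

import os

# Files expected under assets/mano (see the tree above).
required = [
    "assets/mano/MANO_LEFT.pkl",
    "assets/mano/MANO_RIGHT.pkl",
    "assets/mano/fhb_skel_centeridx9.pkl",
]
missing = [path for path in required if not os.path.exists(path)]
print("All MANO assets found." if not missing else "Missing: {}".format(missing))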

Download datasets

First-Person Hand Action Benchmark (FPHAB)

  • Download the First-Person Hand Action Benchmark dataset following the official instructions to the data/fhbhands folder
  • Unzip the Object_models

unzip data/fhbhands/Object_models.zip -d data/fhbhands

  • Unzip MANO fits

tar -xvf assets/fhbhands_fits.tgz -C assets/

  • Download and unzip the pre-trained models

wget https://github.com/hassony2/handobjectconsist/releases/download/v0.2/releasemodels.zip

unzip releasemodels.zip

  • Optionally, resize the images (speeds up training!)

    • python reduce_fphab.py
  • Your structure should look like this:

data/
  fhbhands/
    Video_files/
    Video_files_480/  # Optional, created by reduce_fphab.py script
    Subjects_info/
    Object_models/
    Hand_pose_annotation_v1/
    Object_6D_pose_annotation_v1_1/
assets/
  fhbhands_fits/
releasemodels/
  fphab/
     ...

HO3D

CVPR 2020

Note that all results in our paper are reported on a subset of the current dataset, which was published as an early release; additionally, we used synthetic data which is not released. The results are therefore not directly comparable with the final published results, which are reported on the v2 version of the dataset.

Codalab challenge pre-trained model

After submission, I retrained a baseline model on the current version of the dataset (the official release of HO3D, which I refer to as HO3D-v2). You can get the model from releasemodels.zip (see below).

Evaluate the pre-trained model:

  • Download pre-trained models

  • Extract the pre-trained models: unzip releasemodels.zip

  • Run the evaluation code and generate the codalab submission file

python evalho3dv2.py --resume releasemodels/ho3dv2/realonly/checkpoint_200.pth --val_split test --json_folder jsonres/res

This will create a file pred.zip ready for upload to the codalab challenge.

Training model on HO3D-v2

  • Download the HO3D-v2 dataset.

  • Launch training using python trainmeshreg.py, providing all the arguments listed in releasemodels/ho3dv2/realonly/opt.txt (see the example command below)
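
For reference, one invocation that has been used to train on HO3D-v2 (taken from the "Training on HO3D" issue further down this page; check the arguments against opt.txt before relying on them):

python trainmeshreg.py --freeze_batchnorm --workers 8 --block_rot --train_datasets ho3dv2 --version 1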

Demo

Run the demo on the FPHAB dataset.

python visualize.py

This script loads three models and visualizes their predictions on samples from the test split of FPHAB:

  • a model trained on the full FPHAB dataset
  • a model trained with only a fraction (<1%) of the full ground truth annotations, fine-tuned with photometric consistency
  • a control model trained with the same fraction of the full ground truth annotations, fine-tuned without photometric consistency

It produces images such as the following:

[demo output image]

Training

Run the training code

Baseline model for joint hand-object pose estimation

Train the baseline model on the entire FPHAB dataset (100% of the data supervised with 3D annotations):

python trainmeshreg.py --freeze_batchnorm --workers 8 --block_rot

Train in sparsely annotated setting

  • Step 1: Train the baseline model on a fraction of the FPHAB dataset (here 0.625%)
python trainmeshreg.py --freeze_batchnorm --workers 8 --fraction 0.00625 --eval_freq 50
  • Step 2: Resume training, adding photometric supervision

Step 1 will have produced a trained model, saved in a subdirectory of checkpoints/fhbhands_train_mini1/{date_you_launched_training}/.

Step 2 will resume training from this model and further train with the additional photometric consistency loss on the frames for which the ground truth annotations are not used (a generic sketch of such a loss is given after this section).

python trainmeshwarp.py --freeze_batchnorm --consist_gt_refs --workers 8 --fraction 0.00625 --resume checkpoints/path/to/saved/checkpoint.pth

  • Optional: For a fair comparison (same number of training epochs), training can also be resumed without photometric consistency; this shows that the improvement does not come simply from longer training.

python trainmeshwarp.py --freeze_batchnorm --consist_gt_refs --workers 8 --fraction 0.00625 --resume checkpoints/path/to/saved/checkpoint.pth --lambda_data 1 --lambda_consist 0
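
For intuition, here is a minimal sketch of a photometric consistency term of this general kind: a reference frame is warped towards the current frame and compared pixel-wise on trusted regions. It is a generic illustration built on torch.nn.functional.grid_sample, not the repository's implementation (the paper derives the dense warp from the predicted hand and object meshes through differentiable rendering); the flow and mask inputs below are assumptions of this sketch.

import torch
import torch.nn.functional as F

def photometric_consistency_loss(ref_img, tgt_img, flow, valid_mask):
    """L1 photometric error between tgt_img and ref_img warped into the target view.

    ref_img, tgt_img: (B, 3, H, W) images in [0, 1]
    flow:             (B, 2, H, W) per-pixel displacements from target to reference
    valid_mask:       (B, 1, H, W), 1 where the warp is trusted (e.g. a rendered silhouette)
    """
    b, _, h, w = tgt_img.shape
    device = tgt_img.device
    # Base pixel grid of the target image.
    ys = torch.arange(h, device=device, dtype=torch.float32).view(1, h, 1).expand(1, h, w)
    xs = torch.arange(w, device=device, dtype=torch.float32).view(1, 1, w).expand(1, h, w)
    coords_x = xs + flow[:, 0]
    coords_y = ys + flow[:, 1]
    # Normalize sampling coordinates to [-1, 1], as required by grid_sample.
    grid = torch.stack((2 * coords_x / (w - 1) - 1, 2 * coords_y / (h - 1) - 1), dim=-1)
    warped = F.grid_sample(ref_img, grid, align_corners=True)
    # For simplicity, average over all pixels; a real implementation would normalize by the mask area.
    return (valid_mask * (warped - tgt_img).abs()).mean()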

Citation

If you find this code useful for your research, consider citing our paper:

@INPROCEEDINGS{hasson20_handobjectconsist,
	       title     = {Leveraging Photometric Consistency over Time for Sparsely Supervised Hand-Object Reconstruction},
	       author    = {Hasson, Yana and Tekin, Bugra and Bogo, Federica and Laptev, Ivan and Pollefeys, Marc and Schmid, Cordelia},
	       booktitle = {CVPR},
	       year      = {2020}
}

To fix

Thanks to Samira Kaviani for spotting that in Table 2 the splits are different, because I previously filtered out frames in which hands are further than 10 cm away from the object! I will rerun the results beginning of September and update them here.

Acknowledgements

Code

For this project, we relied on research code from:

Advice and discussion

I would like to especially thank Shreyas Hampali for advice on the HO-3D dataset and Guillermo Garcia-Hernando for advice on the FPHAB dataset.

I would also like to thank Mihai Dusmanu, Yann Labbé and Thomas Eboli for helpful discussions and proofreading!

handobjectconsist's People

Contributors

hassony2


handobjectconsist's Issues

Why does the cam_extr look like this?

Hi, hassony.

Thanks for this work.
I started reading the code recently, and I am now confused about the get_hand_verts3d function in ho3dv2.py.
In that function, you use cam_extr, which is defined as np.array([[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]).
I do not understand the reason for this.
I hope to do some small work in this field; any suggestions are welcome.
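
For what it is worth, this 4x4 matrix keeps x and negates the y and z axes of camera-space points, which is the usual flip between two common camera conventions (e.g. OpenGL-style vs. OpenCV-style axes); whether that is the exact reason it is used in ho3dv2.py is an assumption. A small illustration:

import numpy as np

# The extrinsic from ho3dv2.py: identity rotation with the y and z axes negated.
cam_extr = np.array([[1, 0, 0, 0],
                     [0, -1, 0, 0],
                     [0, 0, -1, 0],
                     [0, 0, 0, 1]], dtype=np.float32)

# Applied to a homogeneous 3D point, it keeps x and flips the sign of y and z.
point = np.array([0.1, 0.2, 0.5, 1.0], dtype=np.float32)
print(cam_extr @ point)  # [ 0.1 -0.2 -0.5  1. ]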

About self.reorder_idxs

Hi, thanks for the great work!

In ho3dv2.py (line 54) you use self.reorder_idxs. I have checked that the 3D joint locations obtained by passing the parameters to the MANO layer (your manopth) differ from the ground-truth 3D joint locations, but after applying self.reorder_idxs to change the order they are nearly the same.
So do you change the order through self.reorder_idxs to adapt to the MANO layer? In the official HO3D code, this reordering is only used for simpler visualization.
After training on the HO3D subset, do you use an inverse of reorder_idxs to recover the original joint order used by the evaluation dataset? Is it necessary? How do you achieve it?

Hope to get some suggestions from you.
Merci beaucoup!
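
In case it is useful, undoing an index-based reordering like this amounts to inverting a permutation, which np.argsort gives directly. A toy sketch (the real reorder_idxs covers the 21 hand joints):

import numpy as np

reorder_idxs = np.array([2, 0, 3, 1])                        # toy stand-in for self.reorder_idxs
joints = np.arange(4 * 3, dtype=np.float32).reshape(4, 3)    # (n_joints, 3) array

reordered = joints[reorder_idxs]          # apply the reordering
inverse_idxs = np.argsort(reorder_idxs)   # inverse permutation
recovered = reordered[inverse_idxs]       # back to the original joint order

assert np.allclose(recovered, joints)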

same issue as #11

On running python visualize.py I get the following error (I have made sure I followed all instructions for installing the FPHAB dataset correctly):

Traceback (most recent call last):
  File "visualize.py", line 214, in <module>
    main(args)
  File "visualize.py", line 127, in main
    crops = [vizdemo.get_crop(render_res) for render_res in render_ress]
  File "visualize.py", line 127, in <listcomp>
    crops = [vizdemo.get_crop(render_res) for render_res in render_ress]
  File "/advaya/handobjectconsist/meshreg/visualize/vizdemo.py", line 32, in get_crop
    x_min = xs.min()
  File "/h/advaya/.conda/envs/handobject_env/lib/python3.7/site-packages/numpy/core/_methods.py", line 43, in _amin
    return umr_minimum(a, axis, None, out, keepdims, initial, where)
ValueError: zero-size array to reduction operation minimum which has no identity

data_split_action_recognition.txt

I was following the README to run visualize.py and kept running into the following error:

Traceback (most recent call last):
  File "visualize.py", line 214, in <module>
    main(args)
  File "visualize.py", line 51, in main
    sample_nb=None,
  File "/scratch/ssd002/home/advaya/handobjectconsist/meshreg/netscripts/get_dataset.py", line 51, in get_dataset
    split=split, use_cache=use_cache, mini_factor=mini_factor, fraction=fraction, mode=mode
  File "/scratch/ssd002/home/advaya/handobjectconsist/meshreg/datasets/fhbhands.py", line 126, in __init__
    self.load_dataset()
  File "/scratch/ssd002/home/advaya/handobjectconsist/meshreg/datasets/fhbhands.py", line 174, in load_dataset
    with open(self.info_split, "r") as annot_f:
FileNotFoundError: [Errno 2] No such file or directory: 'data/fhbhands/data_split_action_recognition.txt'

How do I obtain/create this file?

Problem with neural_renderer

Hi there,

Thanks for sharing your work.

I got this problem while creating the virtual environment:

[screenshot of the environment creation error]

My system: Ubuntu 20.04, Cuda 10.1.

Any suggestions would be greatly appreciated!

Memory used by Warping

Hi @hassony2,
Thanks for sharing the code.
I have tried the second stage of training (meshwarp); after some epochs (about 50 in my case), the occupied memory grows to around 128 GB. I have no idea why this happens.

About Table 1 results in your paper

Hello.
I have two questions.

  1. For the errors reported in Table 1 of your paper, are these errors computed after Procrustes alignment or not?

  2. Does the model checkpoint in releasemodels/fphab/hands_and_objects/checkpoint_200.pth correspond to the model of Table 1?

Thanks in advance!

Live demo with RealSense D435 camera

Hi, thanks for sharing this great project!

I'm wondering whether it's possible to run a live demo using the D435i camera, or whether this library works only with specific datasets?

Thanks!

Which subset exactly were you using for evaluation on HO3D?

Hi, thanks for the great work!

For HO3D you mentioned that
"
HO3D
Optional: Download the HO3D-v2 dataset. Note that all results in our paper are reported on a subset of the current dataset which was published as an early release. The results are therefore not directly comparable with the final published results which are reported on the v2 version of the dataset.
"

Do you remember exactly which subset you were using? I'm working on a project that may need to follow the same protocol for a fair comparison later. HO3D currently runs an online competition and the evaluation-set ground truths are not available to individuals. If I understand correctly, you were not using the same evaluation set at that time?

Training details about HO3D

Hi, Hasson,

Thanks for your great work!

Could you be kind enough to share the training args used to train meshreg on the HO3D dataset, as well as the pre-trained model (for HO3D)?

Thanks 👍

Processing a single image with the model

Hello everyone,

I want to write an application that processes each frame of a video with the pre-trained model of handobjectconsist (producing the MANO mesh and object pose for each frame).
I saw that the code in the visualize.py file demonstrates the model, but it receives a dataset data structure obtained via "dataset, input_res = get_dataset.get_dataset(...)"; however, I don't want to process a whole data structure, just a single frame.
I would like to know the easiest way to process a single frame when each RGB frame is given as a single OpenCV Mat.
I tried it like this (the getMat() function receives the frame from the video stream as an OpenCV Mat):

resume = "releasemodels/fphab/hands_and_objects/checkpoint_200.pth"
opts = reloadmodel.load_opts(resume)
self.model, epoch = reloadmodel.reload_model(resume, opts)
freeze.freeze_batchnorm_stats(self.model)
self.model.cuda()
self.model.eval()

mat = getMat()
dataset = torch.utils.data.Dataset (np.array (mat))
loader = torch.utils.data.DataLoader(dataset,batch_size=1)
_, results, _ = self.model (loader)

However, I get the error:
Failed to call callback: 'DataLoader' object is not subscriptable
Traceback (most recent call last):

Does anybody have an idea how to fix this?
Thanks in advance,
Patrick
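
A note on the snippet above: a DataLoader cannot be passed directly to the model, and torch.utils.data.Dataset is an abstract class that cannot be built from a bare array. The sketch below only shows the generic PyTorch pattern for turning a single OpenCV frame into a batched tensor; the actual model in this repository expects a sample dictionary assembled by its dataset classes, so the resolution and normalization here are assumptions, not the repository's preprocessing.

import cv2
import numpy as np
import torch

def frame_to_batch(mat, input_res=(256, 256)):
    """Turn a single OpenCV BGR frame into a (1, 3, H, W) float tensor.

    The resolution and normalization below are placeholders; the real model
    expects inputs preprocessed exactly like the repository's dataset classes.
    """
    img = cv2.cvtColor(mat, cv2.COLOR_BGR2RGB)
    img = cv2.resize(img, input_res)
    img = img.astype(np.float32) / 255.0
    return torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0)  # (1, 3, H, W)

# Usage sketch (getMat() is the hypothetical function from the question):
# batch = frame_to_batch(getMat()).cuda()
# with torch.no_grad():
#     output = model(batch)  # the real model likely expects a dict of inputs, not a bare tensor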

visualize.py zero-size array

On running python visualize.py I get the following error (I have made sure I followed all instructions for installing the FPHAB dataset correctly):

Traceback (most recent call last):
  File "visualize.py", line 214, in <module>
    main(args)
  File "visualize.py", line 127, in main
    crops = [vizdemo.get_crop(render_res) for render_res in render_ress]
  File "visualize.py", line 127, in <listcomp>
    crops = [vizdemo.get_crop(render_res) for render_res in render_ress]
  File "/advaya/handobjectconsist/meshreg/visualize/vizdemo.py", line 32, in get_crop
    x_min = xs.min()
  File "/h/advaya/.conda/envs/handobject_env/lib/python3.7/site-packages/numpy/core/_methods.py", line 43, in _amin
    return umr_minimum(a, axis, None, out, keepdims, initial, where)
ValueError: zero-size array to reduction operation minimum which has no identity

Are there full MANO fitted parameters for the other frames of "fhbhands"?

Hi, Hasson,

Thank you for the work.
I noticed that the MANO fits in "fhbhands_fits" do not cover the full dataset.
I saw that your previous paper "Learning joint reconstruction of hands and manipulated objects" fitted the MANO model on the "First-Person Hand Action Benchmark with RGB-D Videos and 3D Hand Pose Annotations" dataset.
I am wondering whether MANO fitted parameters exist for that whole dataset.
If so, could you share a URL?
Thank you.

Figure 8 in your paper

Hi, I see you have resized all FPHAB images to [480x270] in this code.

In Figure 8, are the reported pixel errors measured at the [480x270] resolution, or at the full resolution? I'm working on a research project and want to compare with this method fairly.

Training on HO3D

Hi Yana,

Thanks for sharing your code on GitHub. It is very cool! I am trying to reproduce your HO3D results using your code, but encountered an error and I am not sure if I missed anything.

The command for launching the training procedure:

python trainmeshreg.py --freeze_batchnorm --workers 8 --block_rot --train_datasets ho3dv2 --version 1

The error message:

Traceback (most recent call last):
  File "trainmeshreg.py", line 361, in <module>
    main(args)
  File "trainmeshreg.py", line 79, in main
    sample_nb=None,
  File "/home/elytra/nether/projects/handobjectconsist/meshreg/netscripts/get_dataset.py", line 40, in get_dataset
    full_sequences=False,
  File "/home/elytra/nether/projects/handobjectconsist/meshreg/datasets/ho3dv2.py", line 259, in __init__
    self.obj_meshes = ho3dfullutils.load_objects(os.path.join(self.root, "modelsprocess"))
  File "/home/elytra/nether/projects/handobjectconsist/meshreg/datasets/ho3dfullutils.py", line 8, in load_objects
    object_names = [obj_name for obj_name in os.listdir(obj_root) if ".tgz" not in obj_name]
FileNotFoundError: [Errno 2] No such file or directory: 'data/ho3dv2/modelsprocess'

I am not sure what modelsprocess is, but according to the object file names it should contain YCB object files, because the expected file is called textured_simple_2000.obj while the YCB objects are called textured_simple.obj.

Questions:

  • How do you go from textured_simple.obj to textured_simple_2000.obj? It is my first mesh-based project. Could you kindly provide some instructions to generate the modelsprocess folder? (my email is [email protected])

Thank you,
Alex
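
In case it helps, the _2000 suffix suggests the meshes were simplified to roughly 2000 faces. One plausible way to produce such files is quadric decimation, sketched below with Open3D; whether this matches the preprocessing actually used to build modelsprocess is an assumption, and the paths are hypothetical (texture handling may also need extra care).

import open3d as o3d

# Hypothetical paths; adjust to where the YCB models live on disk.
src = "data/ho3dv2/models/003_cracker_box/textured_simple.obj"
dst = "data/ho3dv2/modelsprocess/003_cracker_box/textured_simple_2000.obj"

mesh = o3d.io.read_triangle_mesh(src)
simplified = mesh.simplify_quadric_decimation(target_number_of_triangles=2000)
o3d.io.write_triangle_mesh(dst, simplified)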

FileNotFoundError: [Errno 2] No such file or directory: '/home/chen/datasets/HO3D_v2/modelsprocess'

Hi, thanks for the update on HO3D-v2. I have two questions:

  1. When I run python evalho3dv2.py --resume releasemodels/ho3dv2/realonly/checkpoint_200.pth --val_split test --json_folder jsonres/res, I get the error below. It seems the object models are needed, so could you share them?

Traceback (most recent call last):
  File "evalho3dv2.py", line 156, in <module>
    main(args)
  File "evalho3dv2.py", line 56, in main
    has_dist2strong=True,
  File "/home/chen/PycharmProjects/handobjectconsist/meshreg/netscripts/get_dataset.py", line 41, in get_dataset
    full_sequences=False,
  File "/home/chen/PycharmProjects/handobjectconsist/meshreg/datasets/ho3dv2.py", line 260, in __init__
    self.obj_meshes = ho3dfullutils.load_objects(os.path.join(self.root, "modelsprocess"))
  File "/home/chen/PycharmProjects/handobjectconsist/meshreg/datasets/ho3dfullutils.py", line 8, in load_objects
    object_names = [obj_name for obj_name in os.listdir(obj_root) if ".tgz" not in obj_name]
FileNotFoundError: [Errno 2] No such file or directory: '/home/chen/datasets/HO3D_v2/modelsprocess'

  2. I am confused about how you could submit results to the Codalab competition, since the method described in your CVPR 2020 paper cannot handle the unseen objects in the official evaluation set. I also remember you once uploaded a better result to the Codalab competition using the method from your CVPR 2019 paper. Could you spare some time to explain this?

Merci beaucoup!
