ronlek / fastv2c-handnet

Repository for the implementation of "FastV2C-HandNet: Fast Voxel to Coordinate Hand Pose Estimation with 3D Convolutional Neural Networks"

Home Page: https://arxiv.org/abs/1907.06327

Python 78.40% M 21.60%
hand-pose-estimation fastv2c-handnet deep-learning computer-vision 3d-pose-estimation 3d-hand-pose depth-images 3d-convolutional-network

fastv2c-handnet's Introduction

FastV2C-HandNet : Fast Voxel to Coordinate Hand Pose Estimation with 3D Convolutional Neural Networks

Introduction

This is the project repository for the paper, FastV2C-HandNet : Fast Voxel to Coordinate Hand Pose Estimation with 3D Convolutional Neural Networks (Springer).

Please refer to our paper for details.

If you find our work useful in your research or publication, please cite our work:

[1] Rohan Lekhwani, Bhupendra Singh. "FastV2C-HandNet : Fast Voxel to Coordinate Hand Pose Estimation with 3D Convolutional Neural Networks"[Springer]

Lekhwani, Rohan, and Bhupendra Singh. 
"FastV2C-HandNet: Fast Voxel to Coordinate Hand Pose Estimation with 3D Convolutional Neural Networks." 
International Conference on Innovative Computing and Communications. 
Springer, Singapore, 2019.

In this repository, we provide

  • Our model architecture description (FastV2C-HandNet)
  • Comparison with the previous state-of-the-art methods
  • Training code
  • Dataset we used (MSRA)
  • Trained models and estimated results

Model Architecture

FastV2C-HandNet

Comparison with the previous state-of-the-art methods

Paper_result_hand_table

Paper_result_v2v-posenet_table

About our code

Dependencies

The code was tested on Ubuntu 18.04 and Windows 10 with an Nvidia P100 GPU (16 GB VRAM).

Code

Clone this repository into any place you want. You may follow the example below.

makeReposit=/the/directory/as/you/wish
mkdir -p "$makeReposit" && cd "$makeReposit"
git clone https://github.com/RonLek/FastV2C-HandNet.git

  • The src folder contains Python scripts for the data loader, trainer, tester, and other utilities.
  • The data folder should contain an 'MSRA' folder with the binary image files.

To train our model, please run the following command in the src directory:

python train.py

Dataset

We trained and tested our model on the MSRA Hand Pose Dataset.

Results

Here we provide the precomputed centers, estimated 3D coordinates, and pre-trained models for the MSRA dataset. You can download the precomputed centers and 3D hand pose results here, and the pre-trained models here.

The precomputed centers are obtained by training the hand center estimation network from DeepPrior++. Each line holds the 3D world coordinates of the hand center for one frame. If the depth map does not exist or does not contain a hand, that frame is considered invalid. All test images are considered valid.
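The per-line layout described above can be read with a few lines of Python. This is a minimal sketch, not the repository's loader: the function name `parse_centers` is hypothetical, and it assumes that any line which does not parse as exactly three numbers marks an invalid frame (the README does not state how invalid frames are written).

```python
from typing import List, Optional, Tuple

# A center is (x, y, z) in world coordinates, or None for an invalid frame.
Center = Optional[Tuple[float, float, float]]


def parse_centers(lines: List[str]) -> List[Center]:
    """Parse one hand center per line; return None for frames marked invalid."""
    centers: List[Center] = []
    for line in lines:
        parts = line.split()
        try:
            # Raises ValueError for non-numeric tokens or a wrong token count.
            x, y, z = (float(p) for p in parts)
        except ValueError:
            # Assumption: anything that is not three numbers is an invalid frame.
            centers.append(None)
            continue
        centers.append((x, y, z))
    return centers


# Hypothetical sample lines, for illustration only.
sample = ["10.5 -20.0 350.0", "invalid invalid invalid", "0.0 1.0 400.0"]
centers = parse_centers(sample)
valid = [c for c in centers if c is not None]
```

Keeping the `None` entries in place preserves the frame index, so a center can still be matched to its depth map by position.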

We used awesome-hand-pose-estimation to evaluate the accuracy of the FastV2C-HandNet on the MSRA dataset.

Qualitative results are shown below.

result_1 result_2


fastv2c-handnet's Issues

wrong number of predictions for msra?

I am using the test.py script to generate the predictions so I can later evaluate the model using the https://github.com/xinghaochen/awesome-hand-pose-estimation framework.

For MSRA, the generated txt file contains 8496 predictions, while https://github.com/xinghaochen/awesome-hand-pose-estimation expects 76375.

As a result, I get an error when trying to generate the scores:

ValueError: operands could not be broadcast together with shapes (76375,21,3) (8496,21,3)

Shouldn't the test.py script generate a bigger list of predictions for this dataset? Or am I doing something wrong?
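The broadcast error above comes from comparing arrays with different frame counts. A minimal sketch of a guard that reports the mismatch before any arithmetic is attempted is shown below. It uses plain Python lists rather than the evaluation framework's own code, and the function name `mean_joint_error` is hypothetical.

```python
import math
from typing import List, Sequence

Joint = Sequence[float]   # (x, y, z)
Frame = Sequence[Joint]   # 21 joints per frame for MSRA, per the shapes in the error


def mean_joint_error(pred: List[Frame], gt: List[Frame]) -> float:
    """Mean Euclidean distance per joint, after checking that shapes agree."""
    if len(pred) != len(gt):
        raise ValueError(
            f"frame count mismatch: {len(pred)} predictions "
            f"vs {len(gt)} ground-truth frames"
        )
    total, count = 0.0, 0
    for p_frame, g_frame in zip(pred, gt):
        if len(p_frame) != len(g_frame):
            raise ValueError("joint count mismatch within a frame")
        for p, g in zip(p_frame, g_frame):
            total += math.sqrt(sum((a - b) ** 2 for a, b in zip(p, g)))
            count += 1
    return total / count if count else 0.0


# Tiny illustrative input: one frame, two joints, errors of 3 and 4.
pred = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
gt = [[(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]]
error = mean_joint_error(pred, gt)
```

With 8496 predictions against 76375 ground-truth frames, this check fails with an explicit count message instead of a broadcasting error.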

how to speed up training?

Environment: Ubuntu, RTX 2080 Ti (11 GB)

When I train on the MSRA dataset, training is very slow and the batch size can only be set to 2. How can I increase the batch size? Why can a card with 11 GB of memory only fit a batch size of 2? How can I improve the training speed? How many epochs are needed? Thank you for your response.

Evaluation Script

Hi,

Thank you for open-sourcing this project.

How can I evaluate the model with awesome-hand-pose-estimation? It seems that test.py generates only 2124 samples, while awesome-hand-pose-estimation requires the whole MSRA dataset.

Is it possible to run the evaluation on the whole dataset, or individually?

Thanks

Error while running train.py

Hello!
I was trying to reproduce the experiments with your solution. When I run train.py, I get this error:

Traceback (most recent call last):
  File "train.py", line 145, in <module>
    net = model_inst(input_channels = 1, output_channels = keypoints_num) 
  File "/home/oriuser/mounted/src/mymodel.py", line 176, in model_inst
    x = Dense(44, activation = 'relu')(x) #Changed
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 591, in __call__
    self._maybe_build(inputs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 1881, in _maybe_build
    self.build(input_shapes)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/core.py", line 1005, in build
    raise ValueError('The last dimension of the inputs to `Dense` '
ValueError: The last dimension of the inputs to `Dense` should be defined. Found `None`.

I tried to fix that by commenting out this line in mymodel.py:

x = Reshape((output_channels, -1))(x)

After that, I got a different error:

Traceback (most recent call last):
  File "train.py", line 177, in <module>
    history = net.fit_generator(train_set, steps_per_epoch = steps_per_epoch_train, epochs = epochs_num, verbose = 1, callbacks = [cp_callback], validation_data = val_set, validation_steps = steps_per_epoch_val, workers = 0, use_multiprocessing = False, shuffle = True, initial_epoch = 0) #Changed
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 1433, in fit_generator
    steps_name='steps_per_epoch')
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_generator.py", line 264, in model_iteration
    batch_outs = batch_function(*batch_data)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 1153, in train_on_batch
    extract_tensors_from_dataset=True)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 2692, in _standardize_user_data
    y, self._feed_loss_fns, feed_output_shapes)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_utils.py", line 549, in check_loss_and_target_compatibility
    ' while using as loss `' + loss_name + '`. '
ValueError: A target array with shape (4, 21, 3) was passed for an output of shape (None, 21, 22, 22, 3) while using as loss `mean_squared_error`. This loss expects targets to have the same shape as the output.

Could you please specify the TensorFlow and Keras versions you used? (I suspect this is the cause: I am using a different version of TensorFlow.)

If not, do you have any suggestions for fixing these errors?

Why are the functions pixel2world and world2pixel implemented differently for different datasets?

In the V2V-PoseNet GitHub repository (https://github.com/mks0601/V2V-PoseNet_RELEASE), I found that the functions pixel2world and world2pixel are implemented differently for different datasets.
In dataset MSRA: world2pixel(x,y,z) local pixelY = imgHeight/2 - fy * torch.cdiv(y,z)
In dataset ICVL: world2pixel(x,y,z) local pixelY = imgHeight/2 + fy * torch.cdiv(y, z)
Could you provide your ICVL-related code? Is the formula correct? Thank you!
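The two quoted formulas differ only in the sign of the y term, which is what one would expect if the datasets use opposite directions for the world y axis. A minimal Python sketch of a projection pair with that sign as a parameter is shown below. It is not code from either repository: the `y_sign` parameter is hypothetical, the x formula is an assumption (the issue quotes only pixelY), and the camera values are placeholders for illustration.

```python
def world2pixel(x, y, z, fx, fy, img_width, img_height, y_sign):
    """Project a world point to pixel coordinates (pinhole model).

    y_sign = -1 reproduces the MSRA formula quoted above, +1 the ICVL one.
    """
    px = img_width / 2 + fx * x / z          # assumed; not quoted in the issue
    py = img_height / 2 + y_sign * fy * y / z
    return px, py, z


def pixel2world(px, py, z, fx, fy, img_width, img_height, y_sign):
    """Inverse of world2pixel for the same y_sign (valid because y_sign is +1 or -1)."""
    x = (px - img_width / 2) * z / fx
    y = y_sign * (py - img_height / 2) * z / fy
    return x, y, z


# Placeholder camera parameters, for illustration only.
cam = dict(fx=240.0, fy=240.0, img_width=320, img_height=240)

point = (30.0, -45.0, 400.0)
msra_px = world2pixel(*point, y_sign=-1, **cam)
icvl_px = world2pixel(*point, y_sign=+1, **cam)
back = pixel2world(*msra_px, y_sign=-1, **cam)
```

For the same world point, the two conventions give pixel rows mirrored about the image center, and each convention round-trips as long as the same sign is used in both directions.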
