ronlek / fastv2c-handnet

Repository for the implementation of "FastV2C-HandNet: Fast Voxel to Coordinate Hand Pose Estimation with 3D Convolutional Neural Networks"

Home Page: https://arxiv.org/abs/1907.06327

Python 78.40% M 21.60%
hand-pose-estimation fastv2c-handnet deep-learning computer-vision 3d-pose-estimation 3d-hand-pose depth-images 3d-convolutional-network

fastv2c-handnet's Issues

How to speed up training?

Environment: Ubuntu, RTX 2080 Ti (11 GB)

When I train on the MSRA dataset, training is very slow, and the batch size can only be set to 2. How can I increase the batch size? Why can a GPU with 11 GB of memory only fit a batch size of 2? How can I improve the training speed? How many epochs are needed for training? Thank you for your response.

Error while running train.py

Hello!
I was trying to reproduce the experiments with your solution. When I run train.py, I get this error:

Traceback (most recent call last):
  File "train.py", line 145, in <module>
    net = model_inst(input_channels = 1, output_channels = keypoints_num) 
  File "/home/oriuser/mounted/src/mymodel.py", line 176, in model_inst
    x = Dense(44, activation = 'relu')(x) #Changed
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 591, in __call__
    self._maybe_build(inputs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 1881, in _maybe_build
    self.build(input_shapes)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/core.py", line 1005, in build
    raise ValueError('The last dimension of the inputs to `Dense` '
ValueError: The last dimension of the inputs to `Dense` should be defined. Found `None`.

I tried to fix that by commenting out this line in mymodel.py:

x = Reshape((output_channels, -1))(x)

and got a different error:

Traceback (most recent call last):
  File "train.py", line 177, in <module>
    history = net.fit_generator(train_set, steps_per_epoch = steps_per_epoch_train, epochs = epochs_num, verbose = 1, callbacks = [cp_callback], validation_data = val_set, validation_steps = steps_per_epoch_val, workers = 0, use_multiprocessing = False, shuffle = True, initial_epoch = 0) #Changed
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 1433, in fit_generator
    steps_name='steps_per_epoch')
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_generator.py", line 264, in model_iteration
    batch_outs = batch_function(*batch_data)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 1153, in train_on_batch
    extract_tensors_from_dataset=True)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 2692, in _standardize_user_data
    y, self._feed_loss_fns, feed_output_shapes)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_utils.py", line 549, in check_loss_and_target_compatibility
    ' while using as loss `' + loss_name + '`. '
ValueError: A target array with shape (4, 21, 3) was passed for an output of shape (None, 21, 22, 22, 3) while using as loss `mean_squared_error`. This loss expects targets to have the same shape as the output.

Could you please specify the TensorFlow and Keras versions you used? (I suspect this is the cause: I am using a different version of TensorFlow.)

If not, do you have any suggestions for fixing these errors?
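
The two tracebacks hint at what the Reshape line is for. Keras `Dense` acts on the last axis only, so the first error means that axis had no static size after the reshape, and the second shows `Dense` being applied to a 5D tensor once the reshape is removed, which gives (None, 21, 22, 22, 3) instead of (None, 21, 3). One thing that may be worth trying is passing an explicit size to `Reshape` instead of -1. The sketch below only computes that size. The input shape (None, 21, 22, 22, 22) is an assumption inferred from the second traceback, and `explicit_reshape_target` is a hypothetical helper, not part of the repository.

```python
def explicit_reshape_target(input_shape, output_channels):
    """Return a Reshape target (output_channels, features) that contains no -1.

    input_shape includes the batch axis first, e.g. (None, 21, 22, 22, 22).
    """
    per_sample = input_shape[1:]  # drop the batch axis, which may be None
    if any(dim is None for dim in per_sample):
        raise ValueError("non-batch dimensions must be known: %r" % (per_sample,))
    total = 1
    for dim in per_sample:
        total *= dim
    if total % output_channels != 0:
        raise ValueError("%d elements do not split into %d channels"
                         % (total, output_channels))
    return (output_channels, total // output_channels)


# ASSUMPTION: the tensor entering Reshape is (batch, 21, 22, 22, 22).
target = explicit_reshape_target((None, 21, 22, 22, 22), output_channels=21)
print(target)
```

With that result, the line in mymodel.py could be written as `x = Reshape(target)(x)`, so the following `Dense(44, ...)` sees a defined last dimension. Whether this removes the error on a given TensorFlow version is untested here.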

Evaluation Script

Hi,

Thank you for open-sourcing this project.

How can I evaluate the model with awesome-hand-pose-estimation? It seems that test.py generates only 2124 samples, while awesome-hand-pose-estimation requires predictions for the whole MSRA dataset.

Is it possible to enable evaluation on the whole dataset, or on individual parts of it?

Thanks

Why are the functions pixel2world and world2pixel implemented differently for different datasets?

In the V2V-PoseNet GitHub repository (https://github.com/mks0601/V2V-PoseNet_RELEASE), I found that the functions pixel2world and world2pixel are implemented differently for different datasets.
For the MSRA dataset: world2pixel(x,y,z) local pixelY = imgHeight/2 - fy * torch.cdiv(y,z)
For the ICVL dataset: world2pixel(x,y,z) local pixelY = imgHeight/2 + fy * torch.cdiv(y, z)
So can you provide your ICVL-related code? Is this formula correct? Thank you!
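
The two formulas quoted above differ only in the sign of the y term, which would be consistent with the two datasets storing world coordinates with opposite y-axis directions. That reading is an assumption, not something confirmed in this thread. A minimal Python sketch of a pinhole projection with the sign as a parameter follows. The image size and focal lengths are placeholder values for illustration, and `y_sign` is a hypothetical parameter name.

```python
def world2pixel(x, y, z, y_sign, img_width=320, img_height=240,
                fx=241.42, fy=241.42):
    """Project a world point to pixel coordinates.

    y_sign = -1 reproduces the MSRA formula quoted above,
    y_sign = +1 reproduces the ICVL formula quoted above.
    The default image size and focal lengths are placeholders.
    """
    u = img_width / 2 + fx * x / z
    v = img_height / 2 + y_sign * fy * y / z
    return u, v, z


def pixel2world(u, v, z, y_sign, img_width=320, img_height=240,
                fx=241.42, fy=241.42):
    """Invert world2pixel; the same y_sign must be used in both directions."""
    x = (u - img_width / 2) * z / fx
    y = y_sign * (v - img_height / 2) * z / fy
    return x, y, z


# Round trip with the MSRA-style sign: the original point is recovered.
u, v, d = world2pixel(10.0, 20.0, 400.0, y_sign=-1)
print(pixel2world(u, v, d, y_sign=-1))
```

Whichever sign a dataset needs, the round trip only holds when pixel2world and world2pixel use the same one, so mixing the MSRA and ICVL variants would flip the hand vertically.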

Wrong number of predictions for MSRA?

I am using the test.py script to generate the predictions so I can later evaluate the model using the https://github.com/xinghaochen/awesome-hand-pose-estimation framework.

For MSRA, the total number of predictions in the generated txt file is 8496, while the number of predictions expected by https://github.com/xinghaochen/awesome-hand-pose-estimation is 76375.

Therefore I am getting an error while trying to generate the scores:

ValueError: operands could not be broadcast together with shapes (76375,21,3) (8496,21,3)

Shouldn't the test.py script generate a bigger list of predictions for this dataset? Or am I doing something wrong?
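
One possible explanation, offered as an assumption rather than a confirmed answer: MSRA is commonly evaluated with leave-one-subject-out cross-validation over nine subjects, and 8496 is close to one ninth of 76375, so the generated file may cover a single test subject. If that is the case, the per-subject prediction files would need to be concatenated before scoring. The sketch below merges such files and checks the row count up front, so a mismatch is reported before the broadcast error. The file names and the helper `merge_prediction_files` are hypothetical.

```python
import os
import tempfile


def merge_prediction_files(paths, out_path, expected_rows=None):
    """Concatenate prediction text files in order and return the row count.

    Blank lines are skipped. If expected_rows is given and the merged
    count differs, a ValueError is raised.
    """
    total = 0
    with open(out_path, "w") as out:
        for path in paths:
            with open(path) as src:
                for line in src:
                    if line.strip():
                        out.write(line.rstrip("\n") + "\n")
                        total += 1
    if expected_rows is not None and total != expected_rows:
        raise ValueError("merged %d rows, expected %d" % (total, expected_rows))
    return total


# Tiny demonstration with two made-up per-subject files.
with tempfile.TemporaryDirectory() as tmp:
    parts = []
    for name, rows in (("subject_0.txt", 2), ("subject_1.txt", 3)):
        path = os.path.join(tmp, name)
        with open(path, "w") as f:
            f.write("0.0 0.0 0.0\n" * rows)
        parts.append(path)
    merged = os.path.join(tmp, "merged.txt")
    print(merge_prediction_files(parts, merged, expected_rows=5))
```

For the real evaluation, `expected_rows` would be 76375, the count that awesome-hand-pose-estimation reports for MSRA, and the files would have to be merged in the subject order that framework expects.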
