
Comments (18)

PINTO0309 commented on July 21, 2024

I have updated all committed models again. As you can see in the picture above, I'm sure it will be detected correctly this time.
Try downloading the model again.

from pinto_model_zoo.

geaxgx commented on July 21, 2024

Thank you for your quick action @PINTO0309! I admire your work; I am totally unable to do what you are doing!
The result is clearly better, though not yet perfect:
[image: new_model_2]
Note also that the outputs for score and handedness (0.999 and 0.953) seem to be reasonable values (that wasn't the case before).
I have to check the mediapipe github to see whether the problem is on my side, and also to learn how to interpret the handedness value.

Yes, you can share the results, of course.

geaxgx commented on July 21, 2024

Instead of digging through the mediapipe github to check whether the interpretation of the model outputs changed between the old and new versions of the model, I found it easier to adapt my script to run directly with the tflite models (instead of the openvino models).
So on one side, I use the latest tflite model from Mediapipe (this file: https://github.com/google/mediapipe/blob/master/mediapipe/models/hand_landmark.tflite). The result is correct:
[image: mediapipe_tflite]

On the other side, I use the model hand_landmark_new_256x256_float32.tflite downloaded with https://github.com/PINTO0309/PINTO_model_zoo/blob/master/033_Hand_Detection_and_Tracking/01_float32/download_new.sh. I get a result very similar to what I got with the Openvino model:
[image: pinto_tflite]

I guess some information is lost/corrupted during the conversion. What do you think @PINTO0309 ?

About the handedness output: according to the mediapipe github, the value given by the model is the probability that the hand is a left hand. But according to my tests, it seems to be the exact opposite (the probability that it is a right hand) :-) I am not the first one to notice this inversion: https://dev.classmethod.jp/articles/mediapipe-extract-handedness/
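The inversion above can be captured in a small hypothetical helper (the function name, the `inverted` flag, and the 0.5 threshold are my assumptions for illustration, not part of MediaPipe's API):

```python
# Hypothetical helper for the handedness output discussed above.
# MediaPipe documents the value as P(left hand), but in practice it
# appears to behave as P(right hand), hence the `inverted` default.
def hand_label(handedness_score: float, inverted: bool = True) -> str:
    """Map the model's sigmoid handedness output to a label."""
    p_right = handedness_score if inverted else 1.0 - handedness_score
    return "right" if p_right >= 0.5 else "left"
```

With the 0.953 value reported earlier, this would label the hand as right.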

PINTO0309 commented on July 21, 2024

Thank you for providing the information. It looks like there is still a problem with how the model is built. I am converting the original 16-bit float quantized model straight through, but some intermediate processing may be required. Since the test code is the same yet the results are quite different, it is right to suspect the model.

I will consider the cause and countermeasures for a while.
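For reference, the "straight" float16 conversion described above corresponds to TensorFlow Lite's post-training float16 quantization. A minimal generic sketch, assuming a Keras source model (this is not the actual conversion pipeline used for these models, which starts from a .tflite file):

```python
import tensorflow as tf

def to_float16_tflite(keras_model) -> bytes:
    """Post-training float16 quantization with the TFLite converter."""
    converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.target_spec.supported_types = [tf.float16]
    return converter.convert()  # serialized .tflite flatbuffer
```

Weights are stored as float16 and dequantized to float32 at load time unless the delegate supports float16 natively.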

geaxgx commented on July 21, 2024

Converting the new model seems challenging!
The question is: is it worth putting much effort into that task?

Well, I don't know; it is up to you. The older model you converted is already very good and satisfies my needs.
But from the few tests I have done, the new model seems more accurate. For instance, in the picture below (the left image uses your older tflite model, the right image uses the latest official mediapipe tflite model), we can see that the finger tips are more accurate:
[image: cmp]
About handedness, I can imagine some applications where it could be useful, but I have also noticed it is not 100% accurate.

geaxgx commented on July 21, 2024

Fantastic job @PINTO0309! Haha, you couldn't wait for the weekend, could you?
Thank you very much!

PINTO0309 commented on July 21, 2024

My model was converted from a .tflite model that I got from the MediaPipe repository 4 months ago. It looks like the shape of the output is different from the latest model. Once I've finished implementing the Python sample program for BlazePose Full Body, I'll try out the conversion on the latest model. Please be patient.

Model from 4 months ago (no quantization)

https://github.com/google/mediapipe/blob/c27a7c1e1064e66427ad0656f776155ab456f90b/mediapipe/models/hand_landmark_3d.tflite
[image: Screenshot 2020-09-17 00:11:51]

Latest model (Float16 quantized)

https://github.com/google/mediapipe/blob/master/mediapipe/models/hand_landmark.tflite
[image: Screenshot 2020-09-17 00:12:24]

geaxgx commented on July 21, 2024

Thank you for your fast reply! Sure, I will wait.
By the way, I am also interested in your work on Blazepose, in order to use it on the OAK-D ;-)

PINTO0309 commented on July 21, 2024

I'm being monitored by the developers of OAK-D, so maybe it will be released soon. (Just a guess.)

geaxgx commented on July 21, 2024

It is the OAK-D developers team who gave me your github link! I think they are already using your work for pose estimation (007_mobilenetv2-poseestimation). But Blazepose seems more accurate, which is why I will probably try to use it! Thx!

PINTO0309 commented on July 21, 2024

I have added the latest model that can distinguish between right and left hands.
https://github.com/PINTO0309/PINTO_model_zoo/tree/master/033_Hand_Detection_and_Tracking

geaxgx commented on July 21, 2024

Thank you for your effort @PINTO0309 !
I have downloaded your latest model (the openvino version) and adapted my script a bit to take into account that the network now has 3 outputs:
Output blob: StatefulPartitionedCall/functional_1/tf_op_layer_Sigmoid/Sigmoid - shape: [1, 1, 1, 1]
Output blob: StatefulPartitionedCall/functional_1/tf_op_layer_Sigmoid_1/Sigmoid_1 - shape: [1, 1, 1, 1]
Output blob: StatefulPartitionedCall/functional_1/tf_op_layer_ld_21_3d/ld_21_3d - shape: [1, 63]
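For illustration, the three output blobs listed above could be unpacked as follows. This is a sketch under assumptions of mine: landmark coordinates are taken to be in pixels of the 256x256 input crop, and the two sigmoids are kept as anonymous scalars, since which one is the score and which is the handedness is exactly the open question here.

```python
import numpy as np

def unpack_outputs(sigmoid_0, sigmoid_1, ld_21_3d, input_size=256):
    """Squeeze the two (1,1,1,1) sigmoid blobs to scalars and reshape
    the (1, 63) landmark blob into 21 (x, y, z) points, normalizing the
    assumed pixel coordinates to [0, 1] by the crop size."""
    score_a = float(np.squeeze(sigmoid_0))
    score_b = float(np.squeeze(sigmoid_1))
    landmarks = np.asarray(ld_21_3d).reshape(21, 3) / input_size
    return score_a, score_b, landmarks
```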
I don't know yet which output is the score and which is the handedness, but at least it is clear that the 3rd output corresponds to the landmarks.
I suppose the processing to display the landmarks is the same for the previous and the new model (am I wrong?).
Unfortunately the landmarks don't seem correct:
[image: new_model]

This is what I get with the older model:
[image: old_model]

Do you have an idea on what could be wrong ?

PINTO0309 commented on July 21, 2024

I'll double-check that the model was constructed correctly. Please wait.

PINTO0309 commented on July 21, 2024

@geaxgx
I'm sorry. I found a total of 4 mistakes. I'm going to re-convert and commit immediately after this.

  1. a useless activation function attached to a Conv2D layer
  2. wrongly set weights

PINTO0309 commented on July 21, 2024

@geaxgx
Errors have been corrected and the model has been re-registered. The download URL has not changed; please download it again and retry. If you'll allow me, I'd be very happy to share the results with you.

PINTO0309 commented on July 21, 2024

What worries me a little is whether there is a difference between the results of my float16-quantized model and the official float16-quantized model. I'm not home yet, so I haven't been able to confirm it in detail, but it's very strange, because the weights of the first layer appear to be converted correctly.

https://github.com/PINTO0309/PINTO_model_zoo/tree/master/033_Hand_Detection_and_Tracking/04_float16_quantization

Official:
[image: Screenshot_20200923-211742]

Mine:
[image: Screenshot_20200923-211820]
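One way to pin down a discrepancy like this is to diff the corresponding weight tensors of the two .tflite files directly. The comparison itself is a pure-numpy helper; fetching the tensors would use `tf.lite.Interpreter.get_tensor_details()` and `get_tensor()` (a sketch, with placeholder paths and index):

```python
import numpy as np

def max_abs_diff(w_official, w_mine) -> float:
    """Largest elementwise difference between two weight tensors."""
    a = np.asarray(w_official, dtype=np.float32)
    b = np.asarray(w_mine, dtype=np.float32)
    assert a.shape == b.shape, "tensors don't line up"
    return float(np.max(np.abs(a - b)))

# Usage sketch (paths and index are placeholders):
# it = tf.lite.Interpreter(model_path="hand_landmark.tflite")
# it.allocate_tensors()
# details = it.get_tensor_details()      # locate the first Conv2D weights
# w = it.get_tensor(details[i]["index"])
```

A difference near zero on the first layer but large somewhere deeper would localize which op the conversion corrupted.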

PINTO0309 commented on July 21, 2024

I built the MediaPipe desktop sample app, imported my own float16 model, and ran it. It still looks like there is an error in my model structure. I expected to get exactly the same result, but it didn't work out that way. It is difficult to find enough time to work on the model during the week, so I will check the structure of the model again this weekend.
[image: Screenshot 2020-09-23 22:40:58]

PINTO0309 avatar PINTO0309 commented on July 21, 2024

I have identified the problem areas. I'll run the conversion again and recommit.
[image: Screenshot 2020-09-23 23:21:34]
