mreliptik / handpose

A python program to detect and classify hand pose using deep learning techniques

License: MIT License

Python 88.60% Shell 0.51% Starlark 10.88%
python opencv ssd neural-network handpose pose-estimation cnn keras tensorflow

handpose's Introduction

HandPose

[✔ WORKING]
(See the TODO list below for future improvements)

A program to recognize hand pose from an RGB camera.

testing different poses

Getting Started

These instructions will help you set up the project and understand how the software works. You'll see the file structure and what each file does.

Requirements

See the requirements.txt file or simply run:

pip install -r requirements.txt

File structure

.
├── cnn                          Contains the cnn architecture and the models.
│   └── models                   Trained models.
├── Examples
├── hand_inference_graph
├── model-checkpoint
├── Poses                        The poses dataset. Each pose will have its folder.
│   ├── Fist
│   ├── Four
│   ├── Garbage
│   ├── Palm
│   ├── Rock
│   └── Startrek
├── protos
├── Results
└── utils

Running the hand pose recognition

To run the multithreaded hand pose recognition, simply run:

python HandPose.py

Downloading the dataset

The mediafire link is here: http://www.mediafire.com/file/wt7dc5e9jgnym04/Poses.tar.gz/file. Download it, extract the Poses folder, and place it at the root of the HandPose folder.

OR, on Linux, just run:

./download_dataset.sh

This will download the archive, extract the files and remove the archive afterwards.

Adding a new pose

To add a new pose, launch the AddPose.py script doing:

python AddPose.py

You will then be prompted to make a choice. Type '1' and 'enter'. Now you can enter the name of your pose and validate with 'enter':

Do you want to :
    1 - Add new pose
    2 - Add examples to existing pose
    3 - Add garbage examples
1

Enter a name for the pose you want to add :
Example

You'll now be prompted to record the pose you want to add.
             Please place your hand beforehand facing the
             camera, and press any key when ready.
             When finished press 'q'.

Place your hand facing the camera, doing the pose you want to save, and press enter when ready. You'll see the camera feed. Move your hand slowly across the frame, closer to and further from the camera, and rotate your hand a bit. Do every movement slowly to avoid motion blur (ghosting).
You can record for as long as you want, but remember that camera_fps x seconds_of_recording images will be generated (e.g. 30 fps for 10 seconds yields 300 images).

See an example below:

recording startrek pose

Then head to the new pose folder located in Poses/name_of_your_pose/name_of_your_pose_1 and manually delete the images that don't show your hand pose well.

You can optionally bulk rename them once you've finished cleaning, but note that it's not required.

Once that is done you want to normalize those newly created images. Launch normalize.py with:

python normalize.py

This script will go through the Poses folder and make sure every image is the right size. It will skip those that are already 28x28.
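
For reference, the resize step could look roughly like this (a minimal sketch assuming OpenCV is used for 28x28 grayscale images; the actual normalize.py may differ in details):

import os
import cv2

POSES_DIR = 'Poses'
for root, _, files in os.walk(POSES_DIR):
    for name in files:
        path = os.path.join(root, name)
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        if img is None or img.shape == (28, 28):
            continue  # skip unreadable files and images already 28x28
        cv2.imwrite(path, cv2.resize(img, (28, 28)))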

You then have to retrain the network. For that, open 'cnn/cnn.py' and edit the hyperparameters and the model file name if needed. The saved model will be placed in 'cnn/models/'.

You don't have to specify the number of classes; it will be inferred from the number of directories under 'Poses/'.
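
As an illustration, inferring the class count from the dataset layout can be as simple as the following sketch (not necessarily the exact code in cnn.py):

import os

num_classes = len([d for d in os.listdir('Poses')
                   if os.path.isdir(os.path.join('Poses', d))])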

Launch the training with:

python cnn/cnn.py

Adding garbage examples

Garbage examples are examples where you face the camera without doing any particular hand pose. You want to show your hands and move them around, but avoid doing any of your poses. The goal is for the SSD to detect some hands and also some false positives. This generates images that don't belong to any pose: they are garbage. We do this because we don't want the CNN to misclassify every time a hand is seen.

Launch the AddPose.py script doing:

python AddPose.py

You will then be prompted to make a choice. Type '3' and 'enter'.

Do you want to :
    1 - Add new pose
    2 - Add examples to existing pose
    3 - Add garbage examples
3
You'll now be prompted to record the pose you want to add.
             Please place your hand beforehand facing the
             camera, and press any key when ready.
             When finished press 'q'.

Same thing as before, press 'enter' to start the recording and stop with 'q'. Then normalize and relaunch training.

Architecture

Pipeline

The pipeline of this project consists of four steps (a code sketch follows the list):

project's pipeline

  • A frame is grabbed from the camera by a dedicated thread, converted to RGB (from BGR) and put into the input queue.
  • A worker grabs the frame from the queue and passes it through the SSD. This gives us a bounding box of where the hand(s) is and the corresponding cropped frame.
  • The cropped frame of the hand is then passed to the CNN, which gives us a class vector of values between 0 and 1. These values correspond to the probability of the frame belonging to each class. The worker has finished its job and puts the frame with the bounding box drawn on top, the cropped frame and the class scores into three different queues.
  • The main thread, responsible for showing the results, grabs this information from the queues and displays it in three windows.
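
A rough sketch of this worker pattern is shown below. The helper names detect_hand, classify_pose and draw_box are illustrative placeholders, not the actual functions in HandPose.py, and the queue types and sizes are assumptions:

from queue import Queue
from threading import Thread
import cv2

input_q = Queue(maxsize=5)
frame_q, cropped_q, classes_q = Queue(), Queue(), Queue()

def capture(cap):
    # Dedicated capture thread: grab BGR frames, convert to RGB, enqueue.
    while True:
        ok, frame = cap.read()
        if ok:
            input_q.put(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

def worker():
    # Each worker pulls a frame, runs the SSD then the CNN, and pushes
    # the annotated frame, the hand crop and the class scores.
    while True:
        frame = input_q.get()
        box, cropped = detect_hand(frame)   # SSD detector (placeholder)
        scores = classify_pose(cropped)     # CNN classifier (placeholder)
        frame_q.put(draw_box(frame, box))   # drawing helper (placeholder)
        cropped_q.put(cropped)
        classes_q.put(scores)

cap = cv2.VideoCapture(0)
Thread(target=capture, args=(cap,), daemon=True).start()
for _ in range(4):  # e.g. 4 workers
    Thread(target=worker, daemon=True).start()

while True:  # main thread: display the results pulled from the queues
    cv2.imshow('detection', cv2.cvtColor(frame_q.get(), cv2.COLOR_RGB2BGR))
    cv2.imshow('cropped hand', cropped_q.get())
    print(classes_q.get())
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break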

CNN architecture

cnn architecture
Input image 28x28x1 (grayscale). Two convolutional layers with ReLU activation and kernel size 3, followed by a 2x2 max pooling. Finally, a 128-unit dense layer followed by a softmax layer gives the 6-class prediction.
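
In Keras terms, the described network corresponds roughly to the sketch below. The layer order matches the model.summary() quoted in the issues further down; the dropout rates, optimizer and loss are illustrative assumptions, and the real hyperparameters live in cnn/cnn.py:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

model = Sequential([
    Conv2D(32, kernel_size=3, activation='relu', input_shape=(28, 28, 1)),
    Conv2D(64, kernel_size=3, activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.25),                     # assumed rate
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),                      # assumed rate
    Dense(6, activation='softmax'),    # one output per pose class
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])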

SSD architecture

ssd architecture Note: This figure shows the original SSD architecture, which uses VGG16 as a feature extractor. The one used in this project uses MobileNet instead.

For more information on the SSD, head to the references

Performance

With 4 workers, I achieved 25 fps on an Intel i5-8300H running at 4 GHz.

Troubleshoot

Tensorflow 2

If using Tensorflow 2, replace import tensorflow as tf with import tensorflow.compat.v1 as tf and add tf.disable_v2_behavior() at the beginning of the script.
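
In other words, the top of each affected script becomes:

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()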

Adding a pose: video codec

When trying to add a pose with AddPose.py, if the video is not being written, try changing the codec from XVID to MJPG in a .mp4 container.

Replace:

fourcc = cv2.VideoWriter_fourcc(*'XVID')

With

fourcc = cv2.VideoWriter_fourcc('M','J','P','G') 

with '.mp4' as the extension.
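
Put together, the writer setup might look like this (fps, width and height are illustrative variable names, not necessarily those used in AddPose.py):

fourcc = cv2.VideoWriter_fourcc('M', 'J', 'P', 'G')
out = cv2.VideoWriter('pose_recording.mp4', fourcc, fps, (width, height))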

TODO

  • ⌛ Improve hand detection of the SSD
  • Add instructions for garbage pose
  • Update file structure
  • Generate requirements.txt
  • Clean imports
  • Add explanations on the pipeline
  • Remove garbage as a choice when adding more examples to an existing pose
  • Add SSD architecture
  • Add NN architecture
  • Understand why multithreading doesn't work on Linux
  • See if Keras is the right version (Windows and Linux)
  • Fix multi-threaded detection
  • Add more examples to each gesture
  • Add interface to live see inference from network
  • Test model
  • Tweak training/structure of CNN

Author

Want to support me? Buy me a coffee!

Buy Me a Coffee at ko-fi.com

References

License

This project is licensed under the MIT License - see the LICENSE file for details

handpose's People

Contributors

imgbotapp, mreliptik


handpose's Issues

Sign Language Dataset

Have you considered adding and testing a Sign Language Dataset?
PS: Fantastic work!

video saving issue and citation

Hello,
Thank you for this great work! I'm looking to use this to add features to a classifier I'm developing for ASL phonology -- if we end up using this repository, how would you prefer to be cited in a paper?

Also, I'm running into a problem when retraining. When I use AddPose.py, after I record a video it says "processed 1 frame(s)", and when I look in the Poses folder the .avi video is not readable and no images have been extracted from it. This causes errors when I try to run cnn.py. Any idea what could cause this? The AddPose script itself doesn't raise any errors.

(I am using a slightly newer version of tensorflow, 1.13 instead of 1.12, but didn't think this should be a problem.) I'm running a new MacBook Pro.

python3 AddPose.py
/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.6 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.7
  return f(*args, **kwds)
Do you want to : 
 1 - Add new pose                             
 2 - Add examples to existing pose                             
 3 - Add garbage examples
1
Enter a name for the pose you want to add :
7
You'll now be prompted to record the pose you want to add. 
                 Please place your hand beforehand facing the camera, and press any key when ready. 
                 When finished press 'q'.

2019-05-20 09:52:32.689 Python[18449:46975865] ApplePersistenceIgnoreState: Existing state will not be touched. New state will be written to (null)
>> loading frozen model..
> ====== loading HAND frozen graph into memory
2019-05-20 09:52:50.986901: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
>  ====== Hand Inference graph loaded.
>> model loaded!
   Processed 1 frames!

EDIT:
I solved my issue with the video saving. The problem was that the default width and height on my machine were different from the hardcoded values. I fixed it by adding these lines before the "out" assignment in AddPose:

width = 0
height = 0
if cap.isOpened():
    width = int(cap.get(3))   # CAP_PROP_FRAME_WIDTH
    height = int(cap.get(4))  # CAP_PROP_FRAME_HEIGHT

CNN architecture invalid

As @vincelui21 pointed out to me by email, the default padding is 'valid' (no padding), so the output size decreases to 26x26 rather than staying at 28x28. This propagates to the other conv layers as well.
See the model.summary() output below.

>>> model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 26, 26, 32)        320
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 24, 24, 64)        18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 12, 12, 64)        0
_________________________________________________________________
dropout_1 (Dropout)          (None, 12, 12, 64)        0
_________________________________________________________________
flatten_1 (Flatten)          (None, 9216)              0
_________________________________________________________________
dense_1 (Dense)              (None, 128)               1179776
_________________________________________________________________
dropout_2 (Dropout)          (None, 128)               0
_________________________________________________________________
dense_2 (Dense)              (None, 6)                 774
=================================================================
Total params: 1,199,366
Trainable params: 1,199,366
Non-trainable params: 0
_________________________________________________________________

CNN architecture image should be fixed.

[ WARN:0] videoio(MSMF): can't grab frame.

When I try to run the program according to the given instructions, the following error is thrown.

Using TensorFlow backend.
[ WARN:0] videoio(MSMF): can't grab frame. Error: -2147483638
320.0 180.0
{'im_width': 320.0, 'im_height': 180.0, 'score_thresh': 0.18, 'num_hands_detect': 1}
Namespace(display=1, fps=1, height=200, num_hands=1, num_workers=4, queue_size=5, video_source=0, width=300)
Four Garbage Rock Startrek Fist Palm

Please, can I know the solution to this problem?
Thanks in advance.

No error no output

Hi, when I run the HandPose.py script I get the output below, but nothing is shown; the display window is frozen.

...

====== loading HAND frozen graph into memory

loading frozen model for worker
====== loading HAND frozen graph into memory
loading frozen model for worker
====== loading HAND frozen graph into memory
2019-03-25 08:48:20.582748: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-03-25 08:48:20.584614: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-03-25 08:48:20.589856: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-03-25 08:48:20.757656: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-03-25 08:48:20.758410: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 1060 with Max-Q Design major: 6 minor: 1 memoryClockRate(GHz): 1.48
pciBusID: 0000:01:00.0
totalMemory: 5.94GiB freeMemory: 5.31GiB
2019-03-25 08:48:20.758427: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1060 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-03-25 08:48:20.769285: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-03-25 08:48:20.769631: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 1060 with Max-Q Design major: 6 minor: 1 memoryClockRate(GHz): 1.48
...

images with 2 hands

I want to use this on images which sometimes have two hands, which may or may not have separate shapes.
What would I need to edit to make that work?

Unable to run code

The model loads normally and the camera opens, but the displayed result is just a gray frame with no pattern.
(two screenshots attached)

Poses Folder

Great repo! One question though. Everything is working well, but I want to add an image and retrain the model. However, I see that the Poses folder is not in the repository. How can I retrain the model without the data? Is there a place to get the Poses data folder? Thank you!

why not GPU?

amazing work~

I supposed it could do inference faster if I used tensorflow-gpu, but it failed.
The error shows:

(handpose) D:\1_Dev\HandPose>python HandPose.py
Traceback (most recent call last):
File "C:\Users\kinga\Anaconda3\envs\handpose\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 58, in
from tensorflow.python.pywrap_tensorflow_internal import *
File "C:\Users\kinga\Anaconda3\envs\handpose\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in
_pywrap_tensorflow_internal = swig_import_helper()
File "C:\Users\kinga\Anaconda3\envs\handpose\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "C:\Users\kinga\Anaconda3\envs\handpose\lib\imp.py", line 243, in load_module
return load_dynamic(name, filename, file)
File "C:\Users\kinga\Anaconda3\envs\handpose\lib\imp.py", line 343, in load_dynamic
return _load(spec)
ImportError: DLL load failed: The specified module could not be found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "HandPose.py", line 1, in
from utils import detector_utils as detector_utils
File "D:\1_Dev\HandPose\utils\detector_utils.py", line 4, in
import tensorflow as tf
File "C:\Users\kinga\Anaconda3\envs\handpose\lib\site-packages\tensorflow_init_.py", line 24, in
from tensorflow.python import pywrap_tensorflow # pylint: disable=unused-import
File "C:\Users\kinga\Anaconda3\envs\handpose\lib\site-packages\tensorflow\python_init_.py", line 49, in
from tensorflow.python import pywrap_tensorflow
File "C:\Users\kinga\Anaconda3\envs\handpose\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 74, in
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "C:\Users\kinga\Anaconda3\envs\handpose\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 58, in
from tensorflow.python.pywrap_tensorflow_internal import *
File "C:\Users\kinga\Anaconda3\envs\handpose\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in
_pywrap_tensorflow_internal = swig_import_helper()
File "C:\Users\kinga\Anaconda3\envs\handpose\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "C:\Users\kinga\Anaconda3\envs\handpose\lib\imp.py", line 243, in load_module
return load_dynamic(name, filename, file)
File "C:\Users\kinga\Anaconda3\envs\handpose\lib\imp.py", line 343, in load_dynamic
return _load(spec)
ImportError: DLL load failed: The specified module could not be found.

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/errors

for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.

Tensorflow 2.1

Hey, thanks for sharing this project. Any advice on getting this up and running with Tensorflow 2.1? Switching the import prompt across the various files still results in a number of errors.

about testing fps

Thanks for your great work!
When running your code, I encountered a problem:

My OS is Win10 with an Intel Core i5-6500 CPU @ 3.2 GHz and 8 GB of RAM, and I'm using an Intel RealSense camera to stream color images. Is this fps reasonable? How should I boost the speed?

thank you!
