Sign language recognition with RNN and Mediapipe

with multi hand tracking

Sign language gesture recognition using a recurrent neural network (RNN) with MediaPipe hand tracking.

This project is for academic purposes. Thanks to Google's MediaPipe team :)

Data Preprocessing with Mediapipe (Desktop on CPU)

Create training data on a desktop from input videos using Multi Hand Tracking. The hand landmark features extracted from each frame can then be used to train an RNN for gesture recognition.

CUSTOMIZE:

  • Use video input instead of a webcam on desktop, so the model can be trained on video data
  • Extract the hand landmarks for every frame of a one-word video and write them into a single txt file
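
As a rough illustration of the per-frame layout described above, the sketch below reads the text of one landmark file into a list of frames. The file layout (one frame per line, whitespace-separated floats) and the name `parse_landmark_text` are assumptions made for illustration; the format actually written by the modified MediaPipe files may differ.

```python
from typing import List


def parse_landmark_text(text: str) -> List[List[float]]:
    """Parse landmark text into a list of frames (ASSUMED layout).

    Assumption: each non-empty line is one frame holding
    whitespace-separated landmark coordinates.
    """
    frames = []
    for line in text.splitlines():
        values = line.split()
        if not values:
            continue  # skip blank lines
        frames.append([float(v) for v in values])
    return frames


# Hypothetical two-frame sample with four values per frame.
sample = "0.1 0.2 0.3 0.4\n0.5 0.6 0.7 0.8\n"
frames = parse_landmark_text(sample)
print(len(frames), len(frames[0]))
```

Each inner list is one time step, which is the shape an RNN consumes: a sequence of fixed-size feature vectors.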

1. Set up Hand Tracking framework

  • Install MediaPipe
  git clone https://github.com/google/mediapipe.git

See the MediaPipe installation documentation for the remaining setup steps.

  • Replace the end_loop_calculator.h file
  cd ~/mediapipe/mediapipe/calculators/core
  rm end_loop_calculator.h

Then copy our new end_loop_calculator.h file from the modified_mediapipe folder into this directory.

  • Replace the demo_run_graph_main.cc file
  cd ~/mediapipe/mediapipe/examples/desktop
  rm demo_run_graph_main.cc

Then copy our new demo_run_graph_main.cc file from the modified_mediapipe folder into this directory.

  • Replace the landmarks_to_render_data_calculator.cc file
  cd ~/mediapipe/mediapipe/calculators/util
  rm landmarks_to_render_data_calculator.cc

Then copy our new landmarks_to_render_data_calculator.cc file from the modified_mediapipe folder into this directory.
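
The three replacements above could also be scripted. The following is a minimal sketch, not part of the project: the function name `install_modified_files` is hypothetical, and it assumes the three modified files sit directly inside the modified_mediapipe folder and that the MediaPipe checkout already contains the target directories.

```python
import shutil
from pathlib import Path

# Target sub-directory (relative to the MediaPipe checkout) for each
# modified file, taken from the steps above.
REPLACEMENTS = {
    "end_loop_calculator.h": "mediapipe/calculators/core",
    "demo_run_graph_main.cc": "mediapipe/examples/desktop",
    "landmarks_to_render_data_calculator.cc": "mediapipe/calculators/util",
}


def install_modified_files(modified_dir, mediapipe_root):
    """Overwrite the three stock files with the versions in modified_dir."""
    modified_dir = Path(modified_dir)
    mediapipe_root = Path(mediapipe_root)
    installed = []
    for name, subdir in REPLACEMENTS.items():
        source = modified_dir / name
        target = mediapipe_root / subdir / name
        if not source.is_file():
            raise FileNotFoundError(f"missing modified file: {source}")
        shutil.copyfile(source, target)  # replaces the existing file in place
        installed.append(target)
    return installed
```

A call such as `install_modified_files("modified_mediapipe", Path.home() / "mediapipe")` would then do the same work as the manual `rm` and copy steps.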

2. Create your own training data

Put the training videos for each sign language word in their own folder. Copy the build.py file into your mediapipe directory.

  • Usage

To create the mp4 and txt files with MediaPipe automatically, run

  python build.py --input_data_path=[INPUT_PATH] --output_data_path=[OUTPUT_PATH]

inside the mediapipe directory.

IMPORTANT: Name each folder carefully, because the folder name is used as the label for its video data. (DO NOT use spaces or '_' in the folder name, e.g. Apple_pie (X))

For example:

input_video
├── Apple
│   ├── IMG_2733.MOV
│   ├── IMG_2734.MOV
│   ├── IMG_2735.MOV
│   └── IMG_2736.MOV
└── Happy
    ├── IMG_2472.MOV
    ├── IMG_2473.MOV
    ├── IMG_2474.MOV
    └── IMG_2475.MOV
    ...
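
Because the folder names become the labels, it can help to check them before running the build. The sketch below is illustrative only: `collect_labels` is a hypothetical helper, and it assumes the naming rule (no spaces, no '_') applies to the label folders rather than to the video files inside them.

```python
from pathlib import Path


def collect_labels(input_dir):
    """Return the sorted label names, one per sub-folder of input_dir."""
    labels = sorted(p.name for p in Path(input_dir).iterdir() if p.is_dir())
    for label in labels:
        # Assumption: the naming rule applies to the label folder names.
        if " " in label or "_" in label:
            raise ValueError(f"invalid label folder name: {label!r}")
    return labels
```

For the tree above, `collect_labels("input_video")` would return the folder names in sorted order, and a folder such as `Apple_pie` would raise a `ValueError`.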

The output path should initially be an empty directory. When the build is complete, the mp4 and txt folders are created under that path.

Created folder example:

output_data
├── Absolute
│   └── Apple
│       ├── IMG_2733.txt
│       ├── IMG_2734.txt
│       ├── IMG_2735.txt
│       └── IMG_2736.txt
│       ...
├── Relative
│   └── Apple
│       ├── IMG_2733.txt
│       ├── IMG_2734.txt
│       ├── IMG_2735.txt
│       └── IMG_2736.txt
│       ...
└── _Apple
     ├── IMG_2733.mp4 
     ├── IMG_2734.mp4
     ├── IMG_2735.mp4
     └── IMG_2736.mp4
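
Given the layout above, the txt files can be paired with their labels by walking one of the feature folders. This is a sketch under the assumption that every label has its own sub-folder of txt files, as in the example tree; `index_samples` is a hypothetical name and is not part of the project's scripts.

```python
from pathlib import Path


def index_samples(feature_dir):
    """Return (txt_path, label) pairs for feature_dir/<label>/*.txt."""
    samples = []
    for label_dir in sorted(Path(feature_dir).iterdir()):
        if not label_dir.is_dir():
            continue  # ignore stray files next to the label folders
        for txt_path in sorted(label_dir.glob("*.txt")):
            samples.append((txt_path, label_dir.name))
    return samples
```

Calling `index_samples("output_data/Relative")` would list every txt file under the Relative folder together with the name of the folder that contains it.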

Your contribution is welcome here!

3. Train RNN model

  • Train
  python train.py --input_train_path=[INPUT_TRAIN_PATH] 

INPUT_TRAIN_PATH is the path to the output folder from the previous step (either the Relative or the Absolute folder). The model is saved as 'model.h5' in the current directory.
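
Videos differ in length, while batched RNN training usually expects sequences of equal length and numeric labels. The sketch below shows one common way to prepare such data; it is not taken from train.py, and `pad_sequence`, `one_hot`, `max_frames`, and `feature_size` are hypothetical names chosen for illustration.

```python
def pad_sequence(frames, max_frames, feature_size):
    """Truncate or zero-pad a list of frames to exactly max_frames rows."""
    clipped = [list(f) for f in frames[:max_frames]]
    padding = [[0.0] * feature_size for _ in range(max_frames - len(clipped))]
    return clipped + padding


def one_hot(label, labels):
    """Encode label as a one-hot vector over the ordered list labels."""
    return [1.0 if name == label else 0.0 for name in labels]
```

Zero-padding at the end keeps the original frame order intact; whether padded frames should be masked during training depends on the model and is left open here.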

Watch this video for the overall workflow (more details).

Future work

  • Import the model into Xcode

Contributors

kjonguk, rabbit64
