asl-transcription's Introduction

ASL Transcription in Augmented Reality

Demo Link

Youtube Link

Lens Link

💡Inspiration

Inspiration came from one of my online friends, who is deaf. He likes to create content on TikTok and YouTube but dislikes having to write captions to convey his thoughts to his audience. I created this ASL transcriber to help him create content by expressing his thoughts in American Sign Language. The lens automatically converts the signed letters into text, letting him create content without writing captions for every single action.

Apart from transcribing ASL in real time, this lens could also be used as a learning tool for ASL. Real-time transcription lets the user practice speed and accuracy without the need for an interpreter.

💻What it does

• Uses machine learning to classify hand poses as American Sign Language letters in real time.

• Displays the text in augmented reality.

• High model accuracy (94% validation accuracy on the dataset) allows for more precise transcriptions.

• Tapping the screen toggles live predictions on and off.

Predicting the N letter:

Prediction

🛠️How we built it

• Lens Studio

• JavaScript

• Created a model that incorporates transfer learning and image augmentation.

• TensorFlow Hub for the MobileNetV2 architecture: a MobileNetV2-based model created by Google and pretrained on ImageNet-1k

• American Sign Language Dataset created by David Lee on Roboflow: Source

• TensorFlow and the Keras API to fine-tune the model: Link to Model used in this lens

• The model had a final validation accuracy of 94% on the dataset:

Model Results

🛑Challenges we ran into

• Originally, the built-in MobileNet and EfficientNet models had problems importing into Lens Studio. I spent over a week creating a model from scratch before finding a model on TensorFlow Hub that imported successfully.

• Lens Studio's API and template documentation was a bit confusing and took a while to fully understand.

• Lens Studio would often crash during preview, requiring a force quit to restart the program.

• The original dataset turned out not to be official American Sign Language, hence its high validation accuracy but low real-world accuracy. After switching to David Lee's dataset, real-world accuracy became much higher.

• David Lee's dataset was very small, requiring heavy image augmentation to train a properly fitted model. Even then, the model could not recognize some poses without slight shifts in hand posture.

Sample of Dataset

• Original Image:

Original

• Augmented Images:

Augmented
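As a rough illustration of the kind of augmentation described above, here is a minimal sketch in plain Python. It is not the pipeline used for this lens: the transforms chosen (small random shifts and brightness changes), the function names, and the parameter ranges are all assumptions made for the example, and images are represented as nested lists of grayscale values in [0.0, 1.0].

```python
import random


def shift(image, dx, dy, fill=0.0):
    """Translate the image by (dx, dy) pixels, padding exposed areas with `fill`."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = image[src_y][src_x]
    return shifted


def adjust_brightness(image, factor):
    """Scale every pixel by `factor`, clamping the result to [0.0, 1.0]."""
    return [[min(1.0, max(0.0, p * factor)) for p in row] for row in image]


def augment(image, copies, rng, max_shift=1, brightness_range=(0.7, 1.3)):
    """Return `copies` randomly shifted and brightened variants of `image`.

    `max_shift` and `brightness_range` are hypothetical defaults.
    """
    variants = []
    for _ in range(copies):
        dx = rng.randint(-max_shift, max_shift)
        dy = rng.randint(-max_shift, max_shift)
        factor = rng.uniform(*brightness_range)
        variants.append(adjust_brightness(shift(image, dx, dy), factor))
    return variants
```

Calling `augment(image, 10, random.Random(0))` would turn one labeled image into ten variants that keep the same label, which is how a small dataset can be stretched during training.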

✅Accomplishments that we're proud of

• Successfully training and implementing a Machine Learning Model in an application

• Using heavy image augmentation to expand the limited dataset.

• Deploying a model for the first time in a brand-new environment and editor.

• Using hand tracking, which gives the model a more precise input and lets the lens deactivate the model when no hand is on screen, preventing erroneous predictions.
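The gating logic described above (tap to toggle, and no prediction when no hand is visible) can be sketched as follows. The actual lens is scripted in JavaScript inside Lens Studio; this Python version only illustrates the control flow, and the `Transcriber` class, its method names, and the 26-score input format are hypothetical.

```python
from dataclasses import dataclass

LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


@dataclass
class Transcriber:
    """Holds the tap-toggle state and maps per-frame class scores to a letter."""

    enabled: bool = True

    def on_tap(self):
        """Flip live predictions on or off, as a screen tap would."""
        self.enabled = not self.enabled

    def on_frame(self, hand_visible, scores):
        """Return the predicted letter for this frame, or None if skipped.

        `scores` is assumed to hold one model score per letter, A to Z.
        """
        if not self.enabled or not hand_visible:
            # Skip the model entirely so an empty frame never yields a letter.
            return None
        best = max(range(len(scores)), key=scores.__getitem__)
        return LETTERS[best]
```

Checking `hand_visible` before reading the scores means the model output is ignored whenever the hand tracker reports nothing, rather than being filtered afterwards.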

📖What we learned

• Various forms of Image Augmentation

• More ways of using the Keras and TensorFlow APIs, such as saving a model as .onnx and TFLite files.

• How to use Lens Studio

• JavaScript scripting

• ASL

⚠️ Known problems

• Because the model was trained on a small dataset, some hand poses are not classified correctly. A larger dataset would be needed to correct this issue.

• Some letters look very similar, and the model struggles to tell them apart. Examples include (A/S/E) and (M/N/V).

• Because J and Z require movement, the model is not very accurate at classifying those letters.

🛣️ Future Plans

• Once a larger dataset becomes available, retrain the model for more accurate real-world performance.

• Convert Python Word Ninja to JavaScript in order to probabilistically split concatenated words. However, this is infeasible at this time because letters may be predicted incorrectly.

Word Ninja Usage:

Split
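To make the idea of probabilistic word splitting concrete, here is a minimal sketch of the general technique: a dynamic-programming search for the cheapest segmentation, with word costs derived from frequency rank. It is not Word Ninja's code. The tiny vocabulary, the cost formula's constants, and the penalty for unknown characters are assumptions chosen for the example.

```python
import math

# Hypothetical vocabulary, ordered from most to least frequent.
VOCAB = ["the", "a", "to", "i", "is", "you", "love", "hello", "world", "sign", "language"]

# Zipf-style cost: the more frequent a word (lower rank), the cheaper it is.
WORD_COST = {
    word: math.log((rank + 1) * math.log(len(VOCAB)))
    for rank, word in enumerate(VOCAB)
}
MAX_WORD_LEN = max(len(word) for word in VOCAB)
UNKNOWN_CHAR_COST = 20.0  # assumed penalty for a character that fits no word


def split_words(text):
    """Split concatenated lowercase text into the cheapest sequence of words."""
    text = text.lower()
    # best[i] = (total cost, start index of last piece) for text[:i]
    best = [(0.0, 0)]
    for end in range(1, len(text) + 1):
        candidates = []
        for start in range(max(0, end - MAX_WORD_LEN), end):
            piece = text[start:end]
            if piece in WORD_COST:
                cost = WORD_COST[piece]
            elif len(piece) == 1:
                cost = UNKNOWN_CHAR_COST
            else:
                continue
            candidates.append((best[start][0] + cost, start))
        best.append(min(candidates))

    # Walk backwards through the recorded start indices to recover the pieces.
    words = []
    end = len(text)
    while end > 0:
        start = best[end][1]
        words.append(text[start:end])
        end = start
    return words[::-1]
```

With this vocabulary, `split_words("helloworld")` gives `["hello", "world"]`, while a single mispredicted letter, as in `split_words("hellpworld")`, breaks the first word into single characters. That fragility is the reason given above for postponing this feature.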
