
iamjanvijay / rnnt_decoder_cuda


An efficient implementation of RNN-T Prefix Beam Search in C++/CUDA.

License: MIT License

C++ 41.23% Makefile 2.48% Cuda 42.57% Python 13.72%
cuda rnnt beam-search prefix-search transducer speech-recognition speech-to-text handwriting-recognition

rnnt_decoder_cuda's Introduction

RNN-Transducer Prefix Beam Search

This repository provides an optimised implementation of prefix beam search for models trained with the RNN-Transducer loss function (as described in the paper "Sequence Transduction with Recurrent Neural Networks"). The implementation takes ~100 milliseconds to decode a speech segment of ~5 seconds with a beam size of 10, which is adequate for production-level error rates.
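The timing quoted above can be expressed as a real-time factor (decoding time divided by audio duration), a common way to compare decoder speeds. The following is a minimal sketch of that arithmetic only; the function name `real_time_factor` is hypothetical, and the inputs are the approximate figures from the paragraph above, not new measurements.

```python
# Approximate figures quoted in the README (both are rough values).
decode_time_s = 0.100   # ~100 milliseconds of decoding time
audio_length_s = 5.0    # ~5 seconds of speech


def real_time_factor(decode_s: float, audio_s: float) -> float:
    """Return decoding time divided by audio duration (lower is faster)."""
    if audio_s <= 0:
        raise ValueError("audio duration must be positive")
    return decode_s / audio_s


# Hypothetical helper applied to the README's approximate numbers.
rtf = real_time_factor(decode_time_s, audio_length_s)
print(f"real-time factor ~ {rtf:.2f}")
```

With these numbers the ratio comes out at roughly 0.02, that is, decoding takes about 2% of the audio duration at a beam size of 10.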

Sample Run

To run a sample prefix beam search on your machine, execute the following commands:

  1. Clone this repository.
git clone https://github.com/iamjanvijay/rnnt_decoder_cuda.git;
  2. Clean the output folder.
rm rnnt_decoder_cuda/data/outputs/*;
  3. Build the decoder.
cd rnnt_decoder_cuda/decoder;
make clean;
make;
  4. Run the decoder. The decoded beams are saved to the data/outputs folder.
CUDA_VISIBLE_DEVICES=0 ./decoder ../data/inputs/metadata.txt 0 9 10 5001;
The general form of the command is:
CUDA_VISIBLE_DEVICES=$GPU_ID$ ./decoder ../data/inputs/metadata.txt $index_of_first_file_to_read_from_metadata$ $index_of_last_file_to_read_from_metadata$ $beam_size$ $vocabulary_size_excluding_blank$;
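The positional arguments of the decoder command can be easy to mix up, so here is a minimal sketch of a wrapper that assembles them in the order shown above. The helper `build_decoder_command` and its parameter names are hypothetical (they are not part of this repository); only the argument order and the `CUDA_VISIBLE_DEVICES` variable come from the commands above.

```python
import os
from typing import Dict, List, Tuple


def build_decoder_command(
    metadata_path: str,
    first_index: int,
    last_index: int,
    beam_size: int,
    vocab_size: int,
    gpu_id: int = 0,
) -> Tuple[List[str], Dict[str, str]]:
    """Hypothetical helper: return (argv, env) for one ./decoder run.

    Argument order follows the README command:
    metadata file, first file index, last file index,
    beam size, vocabulary size excluding blank.
    """
    if beam_size < 1 or vocab_size < 1:
        raise ValueError("beam_size and vocab_size must be positive")
    argv = [
        "./decoder",
        metadata_path,
        str(first_index),
        str(last_index),
        str(beam_size),
        str(vocab_size),
    ]
    # Copy the current environment and select the GPU to use.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    return argv, env


# Reproduces the sample invocation from the steps above.
argv, env = build_decoder_command("../data/inputs/metadata.txt", 0, 9, 10, 5001)
print(" ".join(argv))
```

After building the decoder, the result could be passed to `subprocess.run(argv, env=env, cwd="rnnt_decoder_cuda/decoder", check=True)`, assuming the working directory layout from the steps above.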

Contributing

Contributions are welcome and greatly appreciated.

rnnt_decoder_cuda's People

Contributors

iamjanvijay


rnnt_decoder_cuda's Issues

why not use jit?

Hi, quick question,

I noticed that you saved the PyTorch weights for the joint network and the prediction network, and then reloaded them manually in C++.

Why did you choose this approach instead of JIT-compiling the models and loading them with TorchScript? Is your way faster or something?

Thanks,
Wancong
