CTC Beam Search Decoder

CTC decoder | C++ implementation | Python implementation

Recent Update

2020-05-22: Both input layouts, [batch_size x timesteps x num_classes] and [timesteps x batch_size x num_classes], are now supported.
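
The two layouts differ only in the order of the first two axes, so one can be converted to the other with a transpose. The README does not say how the decoder distinguishes the layouts, so the following is just a minimal sketch of an explicit conversion with NumPy; the array contents and the variable names are placeholders chosen for illustration.

```python
import numpy as np

batch_size, timesteps, num_classes = 2, 5, 6

# Placeholder data in batch-major layout: [batch_size x timesteps x num_classes]
batch_major = np.arange(batch_size * timesteps * num_classes,
                        dtype=np.float32).reshape(batch_size, timesteps, num_classes)

# Swap the first two axes to get time-major: [timesteps x batch_size x num_classes]
time_major = np.transpose(batch_major, (1, 0, 2))

# Element [b, t, c] of the batch-major array is element [t, b, c] of the time-major one
same_element = batch_major[1, 3, 4] == time_major[3, 1, 4]
```

Applying the same transpose again restores the original layout.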

Python

The following code skeleton shows how to call the decoder from Python. More details can be found in the python_demo directory.

import numpy as np
from ctc_beam_seach_decoder_pywraper import ctc_beam_search_decoder

# This example uses the time-major layout: timesteps x batch_size x num_classes
inputs = np.array([
                    [[0.6, 0.0, 0.0, 0.4, 0.0, 0.0 ]],
                    [[0.0, 0.5, 0.0, 0.5, 0.0, 0.0 ]],
                    [[0.0, 0.4, 0.0, 0.6, 0.0, 0.0 ]],
                    [[0.0, 0.4, 0.0, 0.1, 0.0, 0.5 ]],
                    [[0.0, 0.5, 0.0, 0.5, 0.0, 0.0 ]],
                    ]).astype(np.float32)
                    
# sequence_length shape must be [batch_size]
sequence_length = np.int32([inputs.shape[0]])

# decoded : [top_paths, batch_size, max_timestep]
# log_probabilities : [batch_size, top_paths]
decoded, log_probabilities = ctc_beam_search_decoder(inputs, sequence_length,
                                                 beam_width=50, top_paths=10)
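
Given the output shapes noted in the comments above, each decoded path can be paired with its log-probability by indexing both arrays with the same path and batch indices. The sketch below uses placeholder arrays with those shapes rather than real decoder output. The helper name `paths_for_batch_item` and the padding value `-1` are assumptions made for illustration: the README does not state how unused positions in `decoded` are filled.

```python
import numpy as np

def paths_for_batch_item(decoded, log_probabilities, batch_index, pad_value=-1):
    """Pair each decoded path with its log-probability for one batch item.

    decoded           : [top_paths, batch_size, max_timestep]
    log_probabilities : [batch_size, top_paths]
    pad_value         : assumed filler for unused positions (hypothetical)
    """
    results = []
    for k in range(decoded.shape[0]):
        path = decoded[k, batch_index]
        labels = [int(x) for x in path if x != pad_value]
        results.append((labels, float(log_probabilities[batch_index, k])))
    return results

# Placeholder arrays with the documented shapes (top_paths=2, batch_size=1, max_timestep=5)
decoded = np.full((2, 1, 5), -1, dtype=np.int32)
decoded[0, 0, :3] = [0, 1, 3]
decoded[1, 0, :2] = [0, 3]
log_probabilities = np.array([[-0.5, -1.25]], dtype=np.float32)

results = paths_for_batch_item(decoded, log_probabilities, batch_index=0)
```

If the decoder pads with a different value, pass it as `pad_value`.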

C++

The following code skeleton shows how to call the decoder from C++. More details can be found in the cpp_demo directory.

#include "src/ctc_beam_search_decoder.h"

// Prepare Inputs
const int max_time      =  5;
const int batch_size    =  1;
const int num_classes   =  6;
const int beam_width    = 50;
const int top_paths     = 10;

float inputs[max_time][batch_size][num_classes] = {
        {{0.6, 0.0, 0.0, 0.4, 0.0, 0.0 }},
        {{0.0, 0.5, 0.0, 0.5, 0.0, 0.0 }},
        {{0.0, 0.4, 0.0, 0.6, 0.0, 0.0 }},
        {{0.0, 0.4, 0.0, 0.1, 0.0, 0.5 }},
        {{0.0, 0.5, 0.0, 0.5, 0.0, 0.0 }}
    };  

int sequence_length[batch_size] = {max_time};

// Prepare Outputs
int decoded[top_paths][batch_size][max_time];
float log_probabilities[batch_size][top_paths] = {{0.0f}};

int status = -1;
status = ctc_beam_search_decoder(   &inputs[0][0][0], 
                                    sequence_length,
                                    beam_width,
                                    top_paths,
                                    max_time,
                                    batch_size,
                                    num_classes,
                                    &decoded[0][0][0],
                                    &log_probabilities[0][0]);
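
The call above passes `&inputs[0][0][0]`, a pointer to the first element of a contiguous row-major array, together with the three dimensions. In such a buffer, element `[t][b][c]` sits at offset `(t * batch_size + b) * num_classes + c`. The following sketch checks that arithmetic in Python with NumPy, which uses the same row-major order by default; the helper name `flat_offset` is hypothetical and not part of the library.

```python
import numpy as np

def flat_offset(t, b, c, batch_size, num_classes):
    """Offset of element [t][b][c] in a contiguous row-major buffer."""
    return (t * batch_size + b) * num_classes + c

max_time, batch_size, num_classes = 5, 1, 6

# Same values as the C++ example, laid out as [max_time][batch_size][num_classes]
inputs = np.array([
    [[0.6, 0.0, 0.0, 0.4, 0.0, 0.0]],
    [[0.0, 0.5, 0.0, 0.5, 0.0, 0.0]],
    [[0.0, 0.4, 0.0, 0.6, 0.0, 0.0]],
    [[0.0, 0.4, 0.0, 0.1, 0.0, 0.5]],
    [[0.0, 0.5, 0.0, 0.5, 0.0, 0.0]],
], dtype=np.float32)

# Flatten in C (row-major) order, as the C++ array is stored in memory
flat = inputs.ravel(order="C")

# Timestep 3, batch item 0, class 5
offset = flat_offset(3, 0, 5, batch_size, num_classes)
value = flat[offset]
```

The same formula, with the corresponding dimensions, applies to the `decoded` and `log_probabilities` output buffers, since they are declared as plain C arrays as well.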
