skit-ai / speech-recognition

SDKs and docs for Skit's speech to text service

License: Apache License 2.0

Python 91.36% Java 8.64%
asr speech-recognition-api speech-recognition multilingual-speech-recognition speech-to-text

speech-recognition's Introduction

Speech-to-Text API

Converts audio to text

We support these ten Indian languages (language codes):

  • Hindi
  • English
  • Marathi
  • Kannada
  • Malayalam
  • Bengali
  • Gujarati
  • Punjabi
  • Telugu
  • Tamil

Authentication

To get access to our APIs, reach out to us at [email protected]. We no longer provide a public access token for the APIs.

Ways to use the Service

  • Transcribing short audio [up to 1 min]
  • Transcribing long audio [more than 1 min]
  • Transcribing audio from streaming input

We recommend that you call this service using the client libraries provided by Vernacular. If your application needs to call this service using your own libraries, use the HTTP endpoints.

Supported SDKs: Python

REST Reference

ServiceHost: https://asr.vernacular.ai

Speech Recognition

Name                  Description
recognize             Performs synchronous speech recognition: receive results after all audio has been sent and processed.
longrunningrecognize  Performs asynchronous speech recognition. Generally used for long audio.
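
As a rough sketch of calling the HTTP endpoints without the SDK, the snippet below builds (but does not send) a JSON POST request against the ServiceHost listed above. Only the host comes from this reference. The method path, the payload fields, and the `Authorization: Bearer` header scheme are assumptions made for illustration, and `build_request` is a hypothetical helper, not part of any Skit or Vernacular library.

```python
import json
import urllib.request

SERVICE_HOST = "https://asr.vernacular.ai"  # taken from the REST reference


def build_request(method_path, payload, access_token):
    """Build (without sending) a JSON POST request for a REST method.

    ASSUMPTIONS: the method path, the payload shape, and the bearer-token
    header are placeholders; the real values come from the service docs.
    """
    url = SERVICE_HOST.rstrip("/") + "/" + method_path.lstrip("/")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + access_token,  # assumed scheme
        },
    )


# Hypothetical path and payload fields, for illustration only.
req = build_request(
    "/recognize",
    {"audio": "<base64-encoded audio>", "language": "<language-code>"},
    access_token="YOUR_ACCESS_TOKEN",
)
print(req.get_method(), req.full_url)
```

Sending the request would be a call to `urllib.request.urlopen(req)`, which is left out here because the request shape above is unverified.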

RPC Reference

Speech Recognition

Methods               Description
Recognize             Performs synchronous speech recognition: receive results after all audio has been sent and processed.
LongRunningRecognize  Performs asynchronous speech recognition: receive results via the longrunning.Operations interface.
StreamingRecognize    Performs streaming speech recognition: receive results while sending audio. Supports both unidirectional and bidirectional streaming.
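
One way to pick between the three methods is sketched below, using the 1 minute boundary from "Ways to use the Service". The helper `choose_method` and the constant name are hypothetical, and this document does not say whether the service enforces the 1 minute limit or only recommends it.

```python
SHORT_AUDIO_LIMIT_SECONDS = 60  # "up to 1 min" for short audio


def choose_method(duration_seconds=None, streaming=False):
    """Return the name of the RPC method suited to the input.

    Hypothetical helper: it only maps the guidance in this document
    onto the method names from the RPC reference.
    """
    if streaming:
        # Live input: results arrive while audio is still being sent.
        return "StreamingRecognize"
    if duration_seconds is None:
        raise ValueError("duration_seconds is required for non-streaming audio")
    if duration_seconds <= SHORT_AUDIO_LIMIT_SECONDS:
        # Short audio: synchronous call, results after processing.
        return "Recognize"
    # Long audio: asynchronous call, results fetched later.
    return "LongRunningRecognize"


print(choose_method(30))
print(choose_method(300))
print(choose_method(streaming=True))
```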

speech-recognition's Issues

Making more models accessible

There are many experimental kaldi-serve models (like phoneme decoders) that are now widely used in the team. What kind of effort would be needed to make these models accessible through this API?
