This project forked from spokestack/spokestack-android

Spokestack speech recognition pipeline for Android

License: Apache License 2.0

Java 97.03% Makefile 0.61% C++ 2.36%

Spokestack Android

Spokestack provides an extensible speech recognition pipeline for the Android platform. It includes a variety of built-in speech processors for Voice Activity Detection (VAD) and Automatic Speech Recognition (ASR) via popular speech recognition services such as the Google Speech API and Azure Speech API.

This introduction is brief; see the documentation for full details.

Configuration

Google Speech API

SpeechPipeline pipeline = new SpeechPipeline.Builder()
    .useProfile("io.spokestack.spokestack.profile.VADTriggerGoogleASR")
    .setProperty("google-credentials", "<google-credentials>")
    .setProperty("locale", "en-US")
    .build();

This example uses a pre-built profile to create a speech recognition pipeline triggered by VAD that uses the Google Speech API for speech recognition. The google-credentials parameter should be the contents of a Google Cloud service account credentials file, in JSON format. For more information, see the documentation. See the javadoc for other component-specific configuration parameters.
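Because google-credentials expects the JSON text itself rather than a file path, the file has to be read into a string first. The helper below is a minimal sketch of one way to do that. The class name CredentialsLoader and its method are hypothetical and not part of Spokestack, and Files.readAllBytes assumes a JVM or Android API level 26 or later.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the Spokestack library.
final class CredentialsLoader {
    private CredentialsLoader() { }

    /** Reads the entire service account credentials file as a UTF-8 string. */
    static String readCredentials(Path jsonFile) throws IOException {
        return new String(Files.readAllBytes(jsonFile), StandardCharsets.UTF_8);
    }
}

// Assumed usage with the builder shown above:
//   .setProperty("google-credentials", CredentialsLoader.readCredentials(path))
```

On Android, credentials are often shipped as an asset or fetched at runtime instead, in which case the same string could be built from an InputStream.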

Wakeword Detection

SpeechPipeline pipeline = new SpeechPipeline.Builder()
    .useProfile("io.spokestack.spokestack.profile.TFWakewordGoogleASR")
    .setProperty("wake-filter-path", "<tensorflow-lite-filter-path>")
    .setProperty("wake-encode-path", "<tensorflow-lite-encode-path>")
    .setProperty("wake-detect-path", "<tensorflow-lite-detect-path>")
    .setProperty("wake-threshold", 0.85)
    .setProperty("google-credentials", "<google-credentials>")
    .setProperty("locale", "en-US")
    .build();

This example creates a wakeword-triggered pipeline with the Google Speech recognizer. The wakeword trigger uses three trained TensorFlow Lite models: a filter model for spectrum preprocessing, an encode model for autoregressive encoding, and a detect model for keyword classification. For more information on the wakeword detector and its configuration parameters, see the documentation.
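Since all three models are required, it can help to confirm that the files exist before building the pipeline. The following is a minimal sketch under stated assumptions: the class WakewordModelPaths is hypothetical, and the file names filter.tflite, encode.tflite, and detect.tflite are illustrative only. Only the wake-*-path property names come from the example above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Spokestack library.
final class WakewordModelPaths {
    private WakewordModelPaths() { }

    /**
     * Maps each wake-*-path property to a model file in modelDir,
     * failing fast if any of the three files is missing.
     */
    static Map<String, String> resolve(Path modelDir) {
        String[] stages = {"filter", "encode", "detect"};
        Map<String, String> props = new LinkedHashMap<>();
        List<String> missing = new ArrayList<>();
        for (String stage : stages) {
            // Assumed naming convention: <stage>.tflite
            Path model = modelDir.resolve(stage + ".tflite");
            if (Files.isRegularFile(model)) {
                props.put("wake-" + stage + "-path", model.toString());
            } else {
                missing.add(model.toString());
            }
        }
        if (!missing.isEmpty()) {
            throw new IllegalStateException("Missing wakeword models: " + missing);
        }
        return props;
    }
}
```

Each entry of the returned map could then be passed to setProperty on the builder.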

The wake-threshold property is set by the TFWakewordGoogleASR profile, but it is overridden here to emphasize that properties set after a profile is applied (either directly in the builder or by another profile) supersede those set by that profile.
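The precedence rule can be illustrated with a simplified stand-in. This sketch is not Spokestack's implementation: SketchBuilder, applyProfile, and the profile default of 0.9 are hypothetical, chosen only to show "last write wins" behavior with a plain map.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only; does not reflect how SpeechPipeline.Builder is written.
final class PropertyPrecedenceSketch {
    private PropertyPrecedenceSketch() { }

    static final class SketchBuilder {
        private final Map<String, Object> properties = new HashMap<>();

        /** Applies a profile's defaults, overwriting any earlier values. */
        SketchBuilder applyProfile(Map<String, Object> profileDefaults) {
            properties.putAll(profileDefaults);
            return this;
        }

        /** Sets one property, overwriting any earlier value for the key. */
        SketchBuilder setProperty(String key, Object value) {
            properties.put(key, value);
            return this;
        }

        Map<String, Object> build() {
            return new HashMap<>(properties);
        }
    }

    /** Applies a profile, then overrides one of its properties. */
    static Map<String, Object> demo() {
        Map<String, Object> profile = new HashMap<>();
        profile.put("wake-threshold", 0.9); // hypothetical profile default
        profile.put("locale", "en-US");
        return new SketchBuilder()
            .applyProfile(profile)
            .setProperty("wake-threshold", 0.85) // set later, so it wins
            .build();
    }
}
```

In this sketch, a property set before a profile is applied would be replaced by the profile's value, which mirrors the ordering rule described above.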

To use the demo "Spokestack" wakeword, download the TensorFlow Lite models: detect | encode | filter

Development

Maven is used for building/deployment, and the package is hosted at JCenter.

This package requires the Android NDK to be installed and the ANDROID_HOME and ANDROID_NDK_HOME environment variables to be set. On macOS, ANDROID_HOME is usually set to ~/Library/Android/sdk and ANDROID_NDK_HOME is usually set to ~/Library/Android/sdk/ndk/<version>.

ANDROID_NDK_HOME can also be specified in your local Maven settings.xml file as the android.ndk.path property.

Testing/Coverage

mvn test jacoco:report

Lint

mvn checkstyle:check

Release

Ensure that your Bintray credentials are in your user Maven settings.xml:

<servers>
    <server>
        <id>bintray-spokestack-io.spokestack</id>
        <username>username</username>
        <password>bintray_api_key</password>
    </server>
</servers>

On a non-master branch, remove the -SNAPSHOT suffix from the version in pom.xml, then run the following command. This will deploy the package to Bintray and JCenter.

mvn deploy

Revert the version change with git checkout ., then use the Maven release plugin to tag the release and advance the version number.

mvn release:clean release:prepare release:perform

For additional information about releasing, see http://maven.apache.org/maven-release/maven-release-plugin/

License

Copyright 2020 Spokestack, Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Contributors

brentspell, noelweichbrodt, space-pope
