
spokestack / spokestack-android


Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!

Home Page: https://spokestack.io

License: Apache License 2.0

Java 98.11% Makefile 0.38% C++ 1.51%
speech-recognition android voice-assistant wakeword asr voice-activity-detection text-to-speech nlu vad speech

spokestack-android's Issues

Crash in WordpieceTextEncoder

I looked for the library's minimum SDK version, and the manifest appears to declare API 8. However, the WordpieceTextEncoder class contains this call:

return this.vocabulary.getOrDefault(token, this.vocabulary.get(UNKNOWN));

I hit a crash in encodeSingle (line 90 of WordpieceTextEncoder) because Map.getOrDefault is only available on API 24+. It would be nice to use something like this instead (that's Kotlin):

return this.vocabulary[token] ?: this.vocabulary.get(UNKNOWN)

Version: 11.4.1
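For reference, the same fallback can be written in Java without Map.getOrDefault, so it also runs below API 24. This is only a minimal sketch, not the library's code: the VocabLookup class, the "[UNK]" key, and the sample vocabulary are hypothetical stand-ins for whatever WordpieceTextEncoder actually uses.

```java
import java.util.HashMap;
import java.util.Map;

final class VocabLookup {
    // Hypothetical unknown-token key; the real UNKNOWN value is not shown in the issue.
    static final String UNKNOWN = "[UNK]";

    /** Looks up a token ID using only Map.get, which exists on every API level. */
    static Integer encodeSingle(Map<String, Integer> vocabulary, String token) {
        Integer id = vocabulary.get(token);
        // Fall back to the unknown-token ID when the token is absent.
        return id != null ? id : vocabulary.get(UNKNOWN);
    }

    /** Builds a tiny made-up vocabulary for demonstration. */
    static Map<String, Integer> sampleVocabulary() {
        Map<String, Integer> vocabulary = new HashMap<>();
        vocabulary.put(UNKNOWN, 0);
        vocabulary.put("hello", 1);
        return vocabulary;
    }
}
```

One difference from getOrDefault: a key that is present but mapped to null also falls through to the unknown-token ID here, which is likely harmless for a vocabulary of integer IDs.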

Error when building Google Cloud ASR pipeline

Hi 👋

I'm trying to set up the Google Cloud ASR with this configuration:

var json: String? = null
try {
    val inputStream: InputStream = assets.open("service_account.json")
    json = inputStream.bufferedReader().use { it.readText() }
} catch (ex: Exception) {
    ex.printStackTrace()
}

val builder = Spokestack.Builder()
    .withoutWakeword()
    .withoutNlu()
    .setProperty("spokestack-id", "my id")
    .setProperty("spokestack-secret", "my secret")
    .withAndroidContext(this)
    .addListener(listener)
builder
    .pipelineBuilder
    .setProperty("google-credentials", json)
    .setProperty("language", "en-US")
    .useProfile("io.spokestack.spokestack.profile.VADTriggerGoogleASR")
return builder.build()

Unfortunately, this configuration throws the following exception(s):

E/AndroidRuntime: FATAL EXCEPTION: main
    Process: mypackagename, PID: 26259
    java.lang.RuntimeException: Unable to start activity ComponentInfo{mypackagename.MainActivity}: java.lang.reflect.InvocationTargetException
        at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3448)
        at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3595)
        at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:83)
        at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)
        at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)
        at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2147)
        at android.os.Handler.dispatchMessage(Handler.java:107)
        at android.os.Looper.loop(Looper.java:237)
        at android.app.ActivityThread.main(ActivityThread.java:7814)
        at java.lang.reflect.Method.invoke(Native Method)
        at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:493)
        at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1075)
     Caused by: java.lang.reflect.InvocationTargetException
        at java.lang.reflect.Constructor.newInstance0(Native Method)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:343)
        at io.spokestack.spokestack.SpeechPipeline.createComponents(SpeechPipeline.java:203)
        at io.spokestack.spokestack.SpeechPipeline.start(SpeechPipeline.java:182)
        at io.spokestack.spokestack.Spokestack.start(Spokestack.java:182)
        at mypackagename.MainActivity.onCreate(MainActivity.kt:54)
        at android.app.Activity.performCreate(Activity.java:7955)
        at android.app.Activity.performCreate(Activity.java:7944)
        at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1307)
        at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3423)
        at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3595) 
        at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:83) 
        at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135) 
        at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95) 
        at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2147) 
        at android.os.Handler.dispatchMessage(Handler.java:107) 
        at android.os.Looper.loop(Looper.java:237) 
        at android.app.ActivityThread.main(ActivityThread.java:7814) 
        at java.lang.reflect.Method.invoke(Native Method) 
        at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:493) 
        at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1075) 
     Caused by: java.lang.NoClassDefFoundError: Failed resolution of: Lcom/google/auth/oauth2/ServiceAccountCredentials;
        at io.spokestack.spokestack.google.GoogleSpeechRecognizer.<init>(GoogleSpeechRecognizer.java:66)
        at java.lang.reflect.Constructor.newInstance0(Native Method) 
        at java.lang.reflect.Constructor.newInstance(Constructor.java:343) 
        at io.spokestack.spokestack.SpeechPipeline.createComponents(SpeechPipeline.java:203) 
        at io.spokestack.spokestack.SpeechPipeline.start(SpeechPipeline.java:182) 
        at io.spokestack.spokestack.Spokestack.start(Spokestack.java:182) 
        at mypackagename.MainActivity.onCreate(MainActivity.kt:54) 
        at android.app.Activity.performCreate(Activity.java:7955) 
        at android.app.Activity.performCreate(Activity.java:7944) 
        at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1307) 
        at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3423) 
        at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3595) 
        at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:83) 
        at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135) 
        at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95) 
        at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2147) 
        at android.os.Handler.dispatchMessage(Handler.java:107) 
        at android.os.Looper.loop(Looper.java:237) 
        at android.app.ActivityThread.main(ActivityThread.java:7814) 
        at java.lang.reflect.Method.invoke(Native Method) 
        at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:493) 
        at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1075) 
     Caused by: java.lang.ClassNotFoundException: Didn't find class "com.google.auth.oauth2.ServiceAccountCredentials" on path: DexPathList[[zip file "/data/app/mypackagename-IVppXU7KnFHxIENF0_Db1w==/base.apk"],nativeLibraryDirectories=[/data/app/mypackagename-IVppXU7KnFHxIENF0_Db1w==/lib/arm64, /data/app/mypackagename-IVppXU7KnFHxIENF0_Db1w==/base.apk!/lib/arm64-v8a, /system/lib64]]
        at dalvik.system.BaseDexClassLoader.findClass(BaseDexClassLoader.java:196)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:379)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:312)
        at io.spokestack.spokestack.google.GoogleSpeechRecognizer.<init>(GoogleSpeechRecognizer.java:66) 
        at java.lang.reflect.Constructor.newInstance0(Native Method)

I'm using the .json file from the service account configured in GCP.
What could be the issue here?

Thank you! 🙏
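The innermost cause in the trace is a ClassNotFoundException for com.google.auth.oauth2.ServiceAccountCredentials, which suggests that class is not packaged in the APK. The issue does not say which dependency should provide it. As a hedged sketch, a small check like the one below could be run before starting the pipeline so the failure is reported clearly rather than as a nested InvocationTargetException; DependencyCheck and isClassAvailable are hypothetical names, not part of Spokestack.

```java
final class DependencyCheck {
    /**
     * Returns true if the named class can be found by this class's loader.
     * The class is located but not initialized.
     */
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, DependencyCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a LinkageError, so both failure modes land here.
            return false;
        }
    }
}
```

Calling isClassAvailable("com.google.auth.oauth2.ServiceAccountCredentials") before builder.build() would show whether the class from the trace is actually on the app's classpath.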

TFWakeWordAzureASR Profile

Hello,

Your docs indicate that TFWakewordAzureASR is a valid pipeline profile, but using it throws:

java.lang.IllegalArgumentException: TFWakewordAzureASR pipeline profile is invalid!

What is the correct way to call upon the profile?
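Another issue on this page passes a fully qualified name to useProfile ("io.spokestack.spokestack.profile.VADTriggerGoogleASR"), so one possible explanation is that the short name alone cannot be resolved. The sketch below assumes, without confirmation from the source, that TFWakewordAzureASR lives in the same package; ProfileNames and qualify are hypothetical helpers.

```java
final class ProfileNames {
    // Assumption: bundled profiles share the package seen in the
    // VADTriggerGoogleASR example; this is not confirmed for every profile.
    static final String PROFILE_PACKAGE = "io.spokestack.spokestack.profile";

    /** Prefixes a short profile name with the assumed package; leaves qualified names alone. */
    static String qualify(String name) {
        if (name.contains(".")) {
            return name;
        }
        return PROFILE_PACKAGE + "." + name;
    }
}
```

Under that assumption, the call would become useProfile(ProfileNames.qualify("TFWakewordAzureASR")).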

Network error while using VADTriggerAndroidASR Profile

Hi, I am trying to use the VADTriggerAndroidASR profile, which always seems to produce a NETWORK_ERROR after activation. Please find the log below.

Can you please suggest a solution?
A preliminary Google search turned up the following:

This might happen due to having an overlapping MediaRecorder or AudioRecord instance active at the same time (link)

{ isActive: true,
      error: 'io.spokestack.spokestack.android.SpeechRecognizerError: SpeechRecognizer error code 2: NETWORK_ERROR\n\tat AndroidSpeechRecognizer$SpokestackListener.onError(AndroidSpeechRecognizer.java:143)\n\tat android.speech.SpeechRecognizer$InternalListener$1.handleMessage(SpeechRecognizer.java:450)\n\tat android.os.Handler.dispatchMessage(Handler.java:106)\n\tat android.os.Looper.loop(Looper.java:216)\n\tat android.app.ActivityThread.main(ActivityThread.java:7266)\n\tat java.lang.reflect.Method.invoke(Native Method)\n\tat com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:494)\n\tat com.android.internal.os.ZygoteInit.main(ZygoteInit.java:975)\n',
      message: null,
      transcript: '',
      event: 'ERROR' }
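The "error code 2" in the log is the integer passed to Android's RecognitionListener.onError. As a small aid for reading such logs, the sketch below maps those integers to the names of the android.speech.SpeechRecognizer constants. The values are hard-coded so the snippet runs without the Android SDK; RecognizerErrors is a hypothetical helper, and Spokestack's own labels (such as NETWORK_ERROR above) differ slightly from the constant names.

```java
final class RecognizerErrors {
    /** Maps a SpeechRecognizer error code to the name of its Android constant. */
    static String describe(int code) {
        switch (code) {
            case 1: return "ERROR_NETWORK_TIMEOUT";
            case 2: return "ERROR_NETWORK";
            case 3: return "ERROR_AUDIO";
            case 4: return "ERROR_SERVER";
            case 5: return "ERROR_CLIENT";
            case 6: return "ERROR_SPEECH_TIMEOUT";
            case 7: return "ERROR_NO_MATCH";
            case 8: return "ERROR_RECOGNIZER_BUSY";
            case 9: return "ERROR_INSUFFICIENT_PERMISSIONS";
            default: return "UNKNOWN(" + code + ")";
        }
    }
}
```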

training tflite model

Hi,
Thanks for the pipeline. Are there any plans to release the code for training TFLite models?

Add proguard rules to keep spokestack even when used dynamically

When a project is minified using proguard, Spokestack classes can get removed unless they are loaded and used up front. Some apps may not want to initialize Spokestack until later (e.g. after authentication). I think there's a way to add proguard rules to the project to keep spokestack through minification. For instance, with -keep class com.pylon.spokestack.** { *; }.

Wakeword-only profile

For some use cases, it's helpful to have wakeword detection without ASR. This configuration is fully supported by Spokestack, but it would be convenient to have a premade pipeline profile that omits ASR to simplify setup.

Implementing this is as simple as copying the TFWakewordGoogleASR profile and omitting the ASR stage. It would probably also be helpful to configure low values for both wake-active-min and wake-active-max (used by the ActivationTimeout stage) since the pipeline shouldn't stay active for long, lest it miss a subsequent wakeword utterance.

Alternatively, the PreASRMicrophoneInput stage could be used as an input stage in conjunction with longer activation timeouts to allow the application to control the microphone while the pipeline is active.

A complete implementation will also add the new profile to the profile test for a sanity check.

Missing three trained TensorFlow Lite models for android

Hi, thank you for the voice pipelines. I couldn't find the three models that you mentioned on GitHub.

The wakeword trigger uses three trained TensorFlow Lite models: a filter model for spectrum preprocessing, an autoregressive encoder encode model, and a detect decoder model for keyword classification

Can you please point me to where I can download them?

Thanks
