vilassn / whisper_android Goto Github PK

Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android

License: MIT License

Java 2.18% CMake 0.60% C++ 54.34% C 30.98% Starlark 1.84% Shell 0.23% Python 3.76% NASL 0.22% JavaScript 0.71% Ruby 0.01% Swift 1.07% Kotlin 0.71% Dart 0.57% HTML 0.01% CSS 0.01% Go 0.23% TypeScript 0.94% C# 1.11% Lua 0.23% Nim 0.24%

asr openai texttospeech tts whisper text-to-speech speech-recognition tensorflow tflite offline

whisper_android's People

Contributors

Stargazers

Watchers

Forkers

flynn-x sang556 wydgetlabsadmin hangox neu070224 ttcluo veryquant guodong029 yongaru uzstudio ca4ti eltld anchangsu gianpaolof sergenes thestretivibity junglej1m

whisper_android's Issues

Error while run the app -> `java.lang.UnsatisfiedLinkError: dlopen failed: library "libtensorflowlite.so" not found`

Hello @ALL,
I am facing this error when I run whisper_android repo can anyone suggest me solution?

Is it possible to set input language with whisper-base.tflite with this code

Here's the thing, I speak Chinese but the result is in English with similar meaning. So I am wondering if it is possible to set the input language so that the output will be in the same language as the input.

How to fix error version tensorflow lite ?

"E/tflite: Model provided has model identifier 'ion ', should be 'TFL3'"

thanks!

Realtime use possible?

There are some whisper realtime libraries out there.
Is there any possible way to make this library realtime ?

Get timestamps at the segment or word level

Thanks for the port.

Can this output a transcript of the provided audio with timestamps at the segment, word level, or both. I'm trying to transcribe audio files for dubbing and i need timestamp precision for wav file transcripts. Basically the start and end times for words or texts .

Open ai provides an api for this through the [timestamp_granularities[] parameter](https://platform.openai.com/docs/api-reference/audio/createTranscription#audio-createtranscription-timestamp_granularities)

Can you add this feature?

It seems like Voice Activity Detection (VAD) isn't working very accurately?

https://github.com/vilassn/whisper_android/blob/master/app/src/main/cpp/vad.cpp

I tried it with sample rate 16000 and 44100 but both results were "Silence".

Do I need to adjust any parameters to suit different audio files?

Thanks!

How to generate whisper tflite model

I want to generate my own model.
I try :
https://colab.research.google.com/github/usefulsensors/openai-whisper/blob/main/notebooks/whisper_base_tflite_model.ipynb#scrollTo=TzCrY9Q5jVsg
But, it's not work for me.

I have an issue where when I am using real time transcription, when I am not talking, it seems like it parses random text.

I was able to setup model and it works really great. My code is:

`private fun testAudio() {
// Initialize Whisper
val mWhisper = Whisper(this) // Create Whisper instance

// Load model and vocabulary for Whisper
val basePath = Global.fileOperations.getOutputDirectory("/Models", this)!!.path
val modelPath = basePath + "/whisper-tiny.tflite" // Provide model file path

    val vocabPath: String = basePath +
        "/filters_vocab_multilingual.bin" // Provide vocabulary file path
    println("PATHS: ")
    println(modelPath)
    println(vocabPath)
    mWhisper.loadModel(modelPath, vocabPath, true) // Load model and set multilingual mode

// Set a listener for Whisper to handle updates and results

    mWhisper.setListener(object : IWhisperListener {
        override fun onUpdateReceived(message: String?) {
            Log.i("TRANSCRIBE_WHISPER", "New State: $message")
            // Handle Whisper status updates
        }

        override fun onResultReceived(result: String?) {
            Log.i("TRANSCRIBE_WHISPER", result ?: "")
            // Handle transcribed results
        }
    })
    // Initialize Recorder
    val mRecorder = Recorder(this) // Create Recorder instance

// Set a listener for Recorder to handle updates and audio data
mRecorder.setListener(object : IRecorderListener {
override fun onUpdateReceived(message: String) {
// Handle Recorder status updates
}

        override fun onDataReceived(samples: FloatArray) {
            // Handle audio data received during recording
            // You can forward this data to Whisper for live recognition using writeBuffer()
            mWhisper.writeBuffer(samples);
        }
    })

    mRecorder.start(); // Start recording

}`

and  override fun onResultReceived(result: String?) {
            Log.i("TRANSCRIBE_WHISPER", result ?: "")
            // Handle transcribed results
        }

seemed to return:

[audioRecordData][fine] 5s(f:5014 m:0 s:0) : pid 8824 uid 10419 sessionId 41305 sr 16000 ch 1 fmt 1

I'll make a hole in the hole.
2 times this:

[audioRecordData][fine] 10s(f:10000 m:0 s:0) : pid 8824 uid 10419 sessionId 41305 sr 16000 ch 1 fmt 1
then
I'll be back with a little .... <== repeated a lot

thanks for you hard work :P

Not working on virtual devices

I have tried the project as is on 2 virtual devices (Android 14 and 12) and one physical device (Android 12). It seems to not run on virtual devices, you might want to mention this in the readme.md.

Getting CMake exception while gradle sync/build

I cloned repository and started the gradle sync, but got following exception, can anyone help with that?:

1: Task failed with an exception.
-----------
* What went wrong:
Execution failed for task ':app:configureCMakeDebug[arm64-v8a]'.
> [CXX1429] error when building with cmake using C:\Users\15010\AndroidStudioProjects\whisper_android\app\src\main\cpp\CMakeLists.txt: C++ build system [configure] failed while executing:
      @echo off
      "C:\\Users\\15010\\AppData\\Local\\Android\\Sdk\\cmake\\3.22.1\\bin\\cmake.exe" ^
        "-HC:\\Users\\15010\\AndroidStudioProjects\\whisper_android\\app\\src\\main\\cpp" ^
        "-DCMAKE_SYSTEM_NAME=Android" ^
        "-DCMAKE_EXPORT_COMPILE_COMMANDS=ON" ^
        "-DCMAKE_SYSTEM_VERSION=26" ^
        "-DANDROID_PLATFORM=android-26" ^
        "-DANDROID_ABI=arm64-v8a" ^
        "-DCMAKE_ANDROID_ARCH_ABI=arm64-v8a" ^
        "-DANDROID_NDK=C:\\Users\\15010\\AppData\\Local\\Android\\Sdk\\ndk\\23.1.7779620" ^
        "-DCMAKE_ANDROID_NDK=C:\\Users\\15010\\AppData\\Local\\Android\\Sdk\\ndk\\23.1.7779620" ^
        "-DCMAKE_TOOLCHAIN_FILE=C:\\Users\\15010\\AppData\\Local\\Android\\Sdk\\ndk\\23.1.7779620\\build\\cmake\\android.toolchain.cmake" ^
        "-DCMAKE_MAKE_PROGRAM=C:\\Users\\15010\\AppData\\Local\\Android\\Sdk\\cmake\\3.22.1\\bin\\ninja.exe" ^
        "-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=C:\\Users\\15010\\AndroidStudioProjects\\whisper_android\\app\\build\\intermediates\\cxx\\Debug\\6v4z4y72\\obj\\arm64-v8a" ^
        "-DCMAKE_RUNTIME_OUTPUT_DIRECTORY=C:\\Users\\15010\\AndroidStudioProjects\\whisper_android\\app\\build\\intermediates\\cxx\\Debug\\6v4z4y72\\obj\\arm64-v8a" ^
        "-DCMAKE_BUILD_TYPE=Debug" ^
        "-BC:\\Users\\15010\\AndroidStudioProjects\\whisper_android\\app\\.cxx\\Debug\\6v4z4y72\\arm64-v8a" ^
        -GNinja
    from C:\Users\15010\AndroidStudioProjects\whisper_android\app

vilassn / whisper_android Goto Github PK

whisper_android's People

Contributors

Stargazers

Watchers

Forkers

whisper_android's Issues

Error while run the app -> `java.lang.UnsatisfiedLinkError: dlopen failed: library "libtensorflowlite.so" not found`

Is it possible to set input language with whisper-base.tflite with this code

How to fix error version tensorflow lite ?

Realtime use possible?

Get timestamps at the segment or word level

It seems like Voice Activity Detection (VAD) isn't working very accurately?

How to generate whisper tflite model

I have an issue where when I am using real time transcription, when I am not talking, it seems like it parses random text.

Not working on virtual devices

Getting CMake exception while gradle sync/build

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent