vilassn / whisper_android
Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android
License: MIT License
Hello all,
I get this error when I run the whisper_android repo. Can anyone suggest a solution?
Here's the thing: I speak Chinese, but the result comes out in English with a similar meaning. Is it possible to set the input language so that the output is in the same language as the input?
"E/tflite: Model provided has model identifier 'ion ', should be 'TFL3'"
thanks!
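One way to narrow down the `TFL3` error is to look at the file header directly. A TensorFlow Lite model is a FlatBuffers file whose 4-byte file identifier sits at byte offsets 4 to 7 and should read `TFL3`. An identifier of `'ion '` is what those offsets contain when a file starts with the text `version `, as a Git LFS pointer file does (`version https://git-lfs.github.com/spec/v1`), so one possible cause is that the model was cloned without Git LFS and the `.tflite` file is only a small text pointer. That is an assumption, not a confirmed diagnosis. The helper names below are hypothetical and are not part of whisper_android.

```kotlin
import java.io.File

/**
 * Returns the 4-character FlatBuffers file identifier stored at byte
 * offsets 4..7, or null if the file is shorter than 8 bytes.
 * (Hypothetical helper for diagnosis; not part of whisper_android.)
 */
fun readModelIdentifier(file: File): String? {
    val header = ByteArray(8)
    file.inputStream().use { input ->
        var offset = 0
        // read() may return fewer bytes than requested, so loop until full
        while (offset < header.size) {
            val n = input.read(header, offset, header.size - offset)
            if (n < 0) return null // reached end of file early
            offset += n
        }
    }
    return String(header, 4, 4, Charsets.US_ASCII)
}

/** True when the header carries the identifier TensorFlow Lite expects. */
fun looksLikeTfliteModel(file: File): Boolean =
    readModelIdentifier(file) == "TFL3"
```

If `looksLikeTfliteModel` returns false for the model file, it is also worth checking the file size and opening the file in a text editor. A file of a few hundred bytes containing readable text is not a real model.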
There are some real-time Whisper libraries out there.
Is there any way to make this library work in real time?
Thanks for the port.
Can this output a transcript of the provided audio with timestamps at the segment level, the word level, or both? I'm trying to transcribe audio files for dubbing, and I need precise timestamps for WAV file transcripts: basically the start and end times for words or text segments.
OpenAI provides an API for this through the [timestamp_granularities[] parameter](https://platform.openai.com/docs/api-reference/audio/createTranscription#audio-createtranscription-timestamp_granularities).
Can you add this feature?
https://github.com/vilassn/whisper_android/blob/master/app/src/main/cpp/vad.cpp
I tried it with sample rate 16000 and 44100 but both results were "Silence".
Do I need to adjust any parameters to suit different audio files?
Thanks!
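Before tuning any parameters, it may help to confirm that the samples reaching the detector are not effectively silent, for example because 16-bit PCM was not scaled as expected or the wrong channel was read. The sketch below assumes the audio is available as 16-bit PCM and computes the RMS and peak level after converting to floats in the range -1 to 1. It says nothing about how `vad.cpp` works internally or what thresholds it uses, and the function names are hypothetical.

```kotlin
import kotlin.math.abs
import kotlin.math.sqrt

/** Converts 16-bit PCM samples to floats in [-1.0, 1.0). */
fun pcm16ToFloat(pcm: ShortArray): FloatArray =
    FloatArray(pcm.size) { i -> pcm[i] / 32768.0f }

/** Root-mean-square level of the samples; 0 for empty input. */
fun rms(samples: FloatArray): Float {
    if (samples.isEmpty()) return 0f
    var sumOfSquares = 0.0
    for (s in samples) sumOfSquares += s.toDouble() * s
    return sqrt(sumOfSquares / samples.size).toFloat()
}

/** Largest absolute sample value; 0 for empty input. */
fun peak(samples: FloatArray): Float =
    samples.maxOfOrNull { abs(it) } ?: 0f
```

If both values are close to zero for a file that audibly contains speech, the problem is more likely in how the file is decoded than in the detector's parameters.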
I want to generate my own model.
I tried:
https://colab.research.google.com/github/usefulsensors/openai-whisper/blob/main/notebooks/whisper_base_tflite_model.ipynb#scrollTo=TzCrY9Q5jVsg
But it does not work for me.
I was able to set up the model and it works really well. My code is:
`private fun testAudio() {
    // Initialize Whisper
    val mWhisper = Whisper(this) // Create Whisper instance

    // Load model and vocabulary for Whisper
    val basePath = Global.fileOperations.getOutputDirectory("/Models", this)!!.path
    val modelPath = basePath + "/whisper-tiny.tflite" // Provide model file path
    val vocabPath: String = basePath + "/filters_vocab_multilingual.bin" // Provide vocabulary file path
    println("PATHS: ")
    println(modelPath)
    println(vocabPath)
    mWhisper.loadModel(modelPath, vocabPath, true) // Load model and set multilingual mode

    // Set a listener for Whisper to handle updates and results
    mWhisper.setListener(object : IWhisperListener {
        override fun onUpdateReceived(message: String?) {
            Log.i("TRANSCRIBE_WHISPER", "New State: $message")
            // Handle Whisper status updates
        }

        override fun onResultReceived(result: String?) {
            Log.i("TRANSCRIBE_WHISPER", result ?: "")
            // Handle transcribed results
        }
    })

    // Initialize Recorder
    val mRecorder = Recorder(this) // Create Recorder instance

    // Set a listener for Recorder to handle updates and audio data
    mRecorder.setListener(object : IRecorderListener {
        override fun onUpdateReceived(message: String) {
            // Handle Recorder status updates
        }

        override fun onDataReceived(samples: FloatArray) {
            // Handle audio data received during recording
            // You can forward this data to Whisper for live recognition using writeBuffer()
            mWhisper.writeBuffer(samples)
        }
    })

    mRecorder.start() // Start recording
}`
However, the `onResultReceived` callback seemed to return:
[audioRecordData][fine] 5s(f:5014 m:0 s:0) : pid 8824 uid 10419 sessionId 41305 sr 16000 ch 1 fmt 1
I'll make a hole in the hole.
Then this, twice:
[audioRecordData][fine] 10s(f:10000 m:0 s:0) : pid 8824 uid 10419 sessionId 41305 sr 16000 ch 1 fmt 1
then
I'll be back with a little .... <== repeated a lot
Thanks for your hard work :P
I have tried the project as is on two virtual devices (Android 14 and 12) and one physical device (Android 12). It does not seem to run on virtual devices; you might want to mention this in the README.md.
I cloned the repository and started the Gradle sync, but got the following exception. Can anyone help with that?
1: Task failed with an exception.
-----------
* What went wrong:
Execution failed for task ':app:configureCMakeDebug[arm64-v8a]'.
> [CXX1429] error when building with cmake using C:\Users\15010\AndroidStudioProjects\whisper_android\app\src\main\cpp\CMakeLists.txt: C++ build system [configure] failed while executing:
@echo off
"C:\\Users\\15010\\AppData\\Local\\Android\\Sdk\\cmake\\3.22.1\\bin\\cmake.exe" ^
"-HC:\\Users\\15010\\AndroidStudioProjects\\whisper_android\\app\\src\\main\\cpp" ^
"-DCMAKE_SYSTEM_NAME=Android" ^
"-DCMAKE_EXPORT_COMPILE_COMMANDS=ON" ^
"-DCMAKE_SYSTEM_VERSION=26" ^
"-DANDROID_PLATFORM=android-26" ^
"-DANDROID_ABI=arm64-v8a" ^
"-DCMAKE_ANDROID_ARCH_ABI=arm64-v8a" ^
"-DANDROID_NDK=C:\\Users\\15010\\AppData\\Local\\Android\\Sdk\\ndk\\23.1.7779620" ^
"-DCMAKE_ANDROID_NDK=C:\\Users\\15010\\AppData\\Local\\Android\\Sdk\\ndk\\23.1.7779620" ^
"-DCMAKE_TOOLCHAIN_FILE=C:\\Users\\15010\\AppData\\Local\\Android\\Sdk\\ndk\\23.1.7779620\\build\\cmake\\android.toolchain.cmake" ^
"-DCMAKE_MAKE_PROGRAM=C:\\Users\\15010\\AppData\\Local\\Android\\Sdk\\cmake\\3.22.1\\bin\\ninja.exe" ^
"-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=C:\\Users\\15010\\AndroidStudioProjects\\whisper_android\\app\\build\\intermediates\\cxx\\Debug\\6v4z4y72\\obj\\arm64-v8a" ^
"-DCMAKE_RUNTIME_OUTPUT_DIRECTORY=C:\\Users\\15010\\AndroidStudioProjects\\whisper_android\\app\\build\\intermediates\\cxx\\Debug\\6v4z4y72\\obj\\arm64-v8a" ^
"-DCMAKE_BUILD_TYPE=Debug" ^
"-BC:\\Users\\15010\\AndroidStudioProjects\\whisper_android\\app\\.cxx\\Debug\\6v4z4y72\\arm64-v8a" ^
-GNinja
from C:\Users\15010\AndroidStudioProjects\whisper_android\app