niedev / rtranslator
Open source real-time translation app for Android that runs locally
License: Apache License 2.0
Hi, this app is pretty cool, nice job. I was amazed by its speed even though I read it's using the small Whisper model. For this reason I wanted to explore switching to onnxruntime for running Whisper in my app Transcribro, to see whether I can move to a bigger model while keeping the same speed (I currently use tiny q8_0 with whisper.cpp). However, I couldn't find the code that uses the Whisper model, or how to use the Whisper model in onnxruntime. Could you direct me to an example, or to where this app uses the Whisper model? Thanks!
Hello, this project is excellent. Can you develop a desktop version?
I found that Chinese cannot be translated and exported as Vietnamese text, and Vietnamese cannot be translated and exported as Chinese text. How can I deal with this?
Hi,
NLLB is a good start; however, many other open-source models have been released in the last few years. The Wikimedia Foundation provides a machine translation service based on a collection of such models (all free and open source) that covers 250+ languages. See https://translate.wmcloud.org/ and https://diff.wikimedia.org/2023/06/13/mint-supporting-underserved-languages-with-open-machine-translation/
I wonder if it is possible to bring these powerful models, optimized for CPU, to this app. Disclaimer: I am the lead developer of that MT system at the Wikimedia Foundation.
"https://github.com/niedev/RTranslator/releases/download/2.0.0/NLLB_cache_initializer.onnx",
"https://github.com/niedev/RTranslator/releases/download/2.0.0/NLLB_decoder.onnx",
"https://github.com/niedev/RTranslator/releases/download/2.0.0/NLLB_embed_and_lm_head.onnx",
"https://github.com/niedev/RTranslator/releases/download/2.0.0/NLLB_encoder.onnx",
"https://github.com/niedev/RTranslator/releases/download/2.0.0/Whisper_cache_initializer.onnx",
"https://github.com/niedev/RTranslator/releases/download/2.0.0/Whisper_cache_initializer_batch.onnx",
"https://github.com/niedev/RTranslator/releases/download/2.0.0/Whisper_decoder.onnx",
"https://github.com/niedev/RTranslator/releases/download/2.0.0/Whisper_detokenizer.onnx",
"https://github.com/niedev/RTranslator/releases/download/2.0.0/Whisper_encoder.onnx",
"https://github.com/niedev/RTranslator/releases/download/2.0.0/Whisper_initializer.onnx"
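The files referenced in the code above could not be downloaded. After I obtained them another way, I put them on my own server and replaced the URLs with my own addresses. After launching the app, I get this error: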
There was an error loading the files of the models for translation and speech recognition, try restarting the app, if the problem persists then reset it.
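Also, the code contains intent.setData(Uri.parse("https://play.google.com/store/apps/details?id=com.google.android.tts")); I cannot connect to Google, so I downloaded and installed the engine myself. Does the app not work well in China?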
It is not real-time translation, just sentence-by-sentence translation triggered by VAD (voice activity detection).
I want to use RTranslator in one direction: speaking into my device so that it shows the translation on a computer's projected screen. This is for teaching or conference purposes. It seems this is not currently possible. Some other software allows this, for example Microsoft Translator (or it did in the past, at least). It would be amazing to have this in the future.
F-Droid is an installable catalogue of FOSS (Free and Open Source Software) applications for the Android platform. Any chance of adding this project to F-Droid?
After downloading all the components, the app said there is no Google TTS engine. I installed it from Google Play, but the app still asked me to install it.
I found that the Google TTS engine has a new name:
"Speech Recognition and Synthesis from Google".
My phone is a Samsung Galaxy F52 5G.
Thank you for wanting to share an idea! But before starting, ensure to check if this feature request respects the following requirements:
Can you implement this library in your app? Google's official APIs are costly, so implementing this library would give free and unlimited text translation.
Downloading the models through the app is too slow. I used a computer to download the 10 models for 2.0 and copied them to the corresponding location. Why does the app still restart the model download when I open it?
Hi,
It is an interesting project. What about using Mozilla DeepSpeech for TTS and Bergamot for translation? Mozilla voice is in an advanced state, since many languages already have a complete dataset.
Hello, niedev!
I found that the GalleryImageSelector seems to support only English. I have tried to add multilingual support to it (currently supporting both English and Simplified Chinese). You can find my fork here. I'm not sure if this will cause any other issues, but I have no problem running it on my own phone. Would you consider adding multilingual support to the selector as well?
In addition, the RTranslator notification still uses hard-coded strings, and I have attempted to translate it here.
I am still learning Java. If the above suggestions would cause any problems, please let me know.
Will there be support for dialects, e.g. Hokkien?
Sometimes I want it not to run in the background when I close it, which it always does.
This is a beautiful application, I wish I did not need to kill it to exit fully.
When will there be an iOS version? It is really needed.
Sorry for my poor Java. Could you provide some Python-friendly example code showing the workflow and how the models are used?
I have a great interest in running a lite model on a mobile device, but I am unfamiliar with onnx and onnxruntime, and I have no idea what to do next after loading an ONNX model. I could not write the corresponding Python code for Translator.java:
import onnx
import onnxruntime as ort
model_path = 'rtraslator/NLLB_encoder.onnx'
model = onnx.load(model_path)
sess = ort.InferenceSession(model.SerializeToString())
sess.run(??)
It would be a great help for so many pythoners to understand what is under the hood. Thanks~
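As a starting point for the question above, here is a minimal sketch of what can follow loading a model in onnxruntime: read the input names, shapes, and types from the session, then call `run` with a dictionary keyed by those names. It does not assume anything about how RTranslator's exported models are wired; the helper names (`concrete_shape`, `ORT_TO_NUMPY`, `run_with_dummy_inputs`) are made up for illustration, and the zero-filled inputs only check that the session executes. Real translation would need the tokenization and decoding loop from Translator.java.

```python
# Mapping from ONNX Runtime type strings to NumPy dtype names (common cases only).
ORT_TO_NUMPY = {
    "tensor(float)": "float32",
    "tensor(float16)": "float16",
    "tensor(int64)": "int64",
    "tensor(int32)": "int32",
    "tensor(bool)": "bool",
}


def concrete_shape(shape):
    """Replace symbolic or unknown dimensions (e.g. 'batch', None) with 1."""
    return [d if isinstance(d, int) and d > 0 else 1 for d in shape]


def run_with_dummy_inputs(model_path):
    """Print the model's inputs and outputs and run it once on zero tensors."""
    # Imported here so the helpers above can be used without these packages.
    import numpy as np
    import onnxruntime as ort

    # Passing the file path avoids serializing the whole model in memory.
    sess = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])

    feeds = {}
    for inp in sess.get_inputs():
        print("input ", inp.name, inp.shape, inp.type)
        # Hypothetical placeholder data: zeros of the declared type and shape.
        feeds[inp.name] = np.zeros(concrete_shape(inp.shape),
                                   dtype=ORT_TO_NUMPY[inp.type])

    # None means "return every output"; the result is a list of NumPy arrays.
    outputs = sess.run(None, feeds)
    for meta, value in zip(sess.get_outputs(), outputs):
        print("output", meta.name, value.shape, value.dtype)
    return outputs
```

Calling `run_with_dummy_inputs("NLLB_encoder.onnx")` on a downloaded file prints the names that the `sess.run(??)` call needs. The printed outputs of one model (for example the encoder) can then be matched against the printed inputs of the next (for example the decoder).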
Hi,
Would you please provide instructions, or a method, for sideloading of the components (onnx files) in an off-line mode, so that the app does not have to automatically download them upon first start.
This would most likely benefit everyone, for various reasons such as: lowering mobile data usage, ease of installation on multiple devices, privacy, etc.
Thanks!
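For the download half of this request, a minimal sketch using only the Python standard library is shown below. It fetches the ten files named in the release URLs quoted earlier on this page into a local folder on a computer. The function names and the destination folder are made up for illustration, and whether the app accepts manually copied files is not confirmed anywhere in this thread.

```python
import os
import urllib.request

# Release base URL and file names as quoted in the issue earlier on this page.
BASE_URL = "https://github.com/niedev/RTranslator/releases/download/2.0.0/"
MODEL_FILES = [
    "NLLB_cache_initializer.onnx",
    "NLLB_decoder.onnx",
    "NLLB_embed_and_lm_head.onnx",
    "NLLB_encoder.onnx",
    "Whisper_cache_initializer.onnx",
    "Whisper_cache_initializer_batch.onnx",
    "Whisper_decoder.onnx",
    "Whisper_detokenizer.onnx",
    "Whisper_encoder.onnx",
    "Whisper_initializer.onnx",
]


def model_urls(base_url=BASE_URL):
    """Build the full download URL for every model file."""
    return [base_url.rstrip("/") + "/" + name for name in MODEL_FILES]


def download_models(dest_dir, base_url=BASE_URL):
    """Download each model file into dest_dir, skipping files already there."""
    os.makedirs(dest_dir, exist_ok=True)
    for url in model_urls(base_url):
        target = os.path.join(dest_dir, url.rsplit("/", 1)[1])
        if os.path.exists(target):
            continue  # already downloaded in an earlier run
        # Write to a temporary name first so an interrupted download
        # is never mistaken for a complete file.
        tmp = target + ".part"
        urllib.request.urlretrieve(url, tmp)
        os.replace(tmp, target)
```

Running `download_models("rtranslator_models")` fetches the files once; passing a different `base_url` points the same code at a self-hosted mirror.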
This is on CalyxOS with eSpeak as the configured TTS engine. Let me know if I can provide any other information.
Hello,
As an end user, (My suggestions may be ridiculous because I have no software knowledge.)
"The model used is Whisper-Small-244M with KV cache."
Can Whisper-Large-V3 be used?
Can the user make a choice? (such as tiny, base, small, medium, large)
CPU and GPU are advancing rapidly in GSM phones.
For example, my phone is Qualcomm Snapdragon 8 Gen 2 and Adreno(TM) 740
Can corrections be made during the conversation to prevent people from understanding and translating the wrong word? (Walkie Talkie Mode).
(Walkie Talkie Mode) Can it be adapted for a single language?
Is it possible to input voice for Conversation Mode? (Without keyboard feature)
The title is rather self-explanatory, but the recommendation is to switch to it because faster whisper delivers significant speed advantages over OpenAI's standard whisper model.
Faster Whisper repository:
https://github.com/SYSTRAN/faster-whisper
Would be cool if the app could translate text directly on screen without the need to copy-paste it - or even voice transcribing and translating for voice messages.
Thanks for developing this amazing app. Is there a way to translate the app itself using https://weblate.org/en/? Also, does this app support RTL direction?
Nice job, niedev. But I find that devices are connected by Bluetooth, which is very restrictive. Can you add a feature that lets users have conversations over the internet, just like Microsoft Translator?
https://play.google.com/store/apps/details?id=com.microsoft.translator&hl=zh_CN&gl=CA
There was an error with the tts initialization, do you want to continue without tts?
May I ask whether you have a plan for iOS?
Any chance this app can be submitted in F-Droid?
deviceName: Redmi k50
osVersion: 1.0.6.0
whisper.cpp supports Hebrew. Could you clarify the limitations that prevent rtranslator from supporting it? I noticed that ctranslate inference with NLLB also supports Hebrew.
This is a very innovative project!
Thanks!
For language learning it would be necessary to hit the "speaker" icon to keep repeating the translation.
There was an error loading the files of the models for translation and speech recognition, try restarting the app, if the problem persists then reset it.
Describe the solution you'd like
Please add support for dark theme, which will improve the appearance on devices with black theme and reduce battery consumption on OLED screens.
In this device there is no google tts engine, download it to be able to use the tts
My phone is a Samsung Galaxy S23 Ultra.
I press continue, but it does not run.
The integration of AI-driven features into RTranslator has significantly improved its translation accuracy and efficiency. The recent enhancements leverage advanced machine learning models to deliver more precise translations across various languages. These upgrades are in line with our goal to continuously innovate and enhance user experience. For additional resources and tools that might complement RTranslator, check out this link.
As stated in the title, is it possible to add languages? For example, Hungarian (magyar)?
If it is at all possible, how time consuming is it?
In some regions the AI models cannot be downloaded directly at the initial step.
Would you please add a proxy setting at the first step so that the AI models can be downloaded through a proxy? Thanks.
Good repository
Is your feature request related to a problem? Please describe.
Transcription in walkie talkie mode does not work reliably in noisy environments. Even with adjustments to microphone sensitivity, it never stops listening for input, which means translation never begins.
Describe the solution you'd like
Either offer a user control for when to start translating the buffer, or switch to a streaming mechanism so that input doesn't need to end before translation starts.
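One way to picture the second option is a segmenter that ends a segment either after a run of quiet frames or after a maximum length, so that constant background noise cannot hold the buffer open forever. The sketch below is purely illustrative and is not how RTranslator detects speech: the per-frame energy values, the threshold, and the function name `segment_frames` are all assumptions.

```python
def segment_frames(energies, threshold, silence_frames, max_frames):
    """Yield (start, end) frame ranges, end exclusive, to hand to the recognizer.

    A segment starts at the first frame whose energy reaches the threshold.
    It ends after `silence_frames` consecutive quiet frames (normal end of
    speech) or once it is `max_frames` long (forced flush in noisy rooms).
    """
    energies = list(energies)
    start = None  # index of the first frame of the open segment, if any
    quiet = 0     # consecutive below-threshold frames inside the segment
    for i, energy in enumerate(energies):
        if start is None:
            if energy >= threshold:
                start, quiet = i, 0
            continue
        quiet = quiet + 1 if energy < threshold else 0
        if quiet >= silence_frames:
            yield (start, i - quiet + 1)  # drop the trailing quiet frames
            start = None
        elif i - start + 1 >= max_frames:
            yield (start, i + 1)          # forced flush, noise never stopped
            start = None
    if start is not None:
        # Input ended while a segment was still open.
        yield (start, len(energies) - quiet)
```

With `max_frames` set to a few seconds of audio, translation of the first chunk can begin even if the microphone never registers silence. The cost is that a forced flush may cut a sentence in the middle.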