stypox / dicio-android

Dicio assistant app for Android

License: GNU General Public License v3.0

Java 4.97% Ruby 0.47% Shell 0.27% Kotlin 94.29%

android assistant assistive-technology dicio dicio-assistant personal-assistant personal-assistant-framework voice-assistant vosk

dicio-android's Introduction

Ciao! I am Stypox:

Passionate contributor to free software, enjoys writing code to replace apps and services that do not respect the user
Member of MindsHub, a non-profit association focused on education and teamwork in the fields of informatics, electronics and 3D printing
Participates in competitive programming and cybersecurity competitions and is a tutor for younger students
Studies information engineering at University of Trento
User of Manjaro Linux with KDE and owner of a Fairphone 3+ with /e/OS

I'm currently working on:

Dicio: an Android voice assistant (available on F-Droid and Play Store)
NewPipe: an Android YouTube frontend with many cool features the official YouTube app doesn't have (available on F-Droid)
dicio-numbers: a Java library for multilanguage number parsing and formatting
Tridenta: an app to view public transport information in Trentino (available on F-Droid and Play Store)
Curriculust: a Rust program that allows writing a CV in YAML and turning that into LaTeX and PDF

Other projects I'm proud of are:

Crop detection: an AI model to detect crops in images, a building block of MindsHub's Cyberorto, an autonomous farmer
Plotter: some scripts and algorithms to print G-code, text or images with a custom-made 2D plotter
Quadermas: an Android app to access "Quaderno Elettronico Mastercom" (available on F-Droid)
Olympiad exercises: the competitive programming code I've produced to train for the Italian Olympiad in Informatics
Insigno: an app by MindsHub that gamifies reporting and collecting trash (available on F-Droid, Play Store and Apple Store)


dicio-android's Issues

Timer and clock skills should use default timer instead of Dicio internal timer

Currently
When setting a timer, Dicio internal timer is used.

Expected
Use the system default timer instead. That would allow the best integration and UX, making it possible to use notifications and system integration such as button presses to stop the timer. Currently the Dicio alarm cannot be stopped (at least in my limited testing), or only by closing Dicio, which is rather inconvenient.

Search results can't be opened

I only get a "View in..." Menu with "No personal apps can open this content".

The code seems to do something complicated in ShareUtils.java. I have LineageOS based on Android 11, no Google Apps, and NewPipe doesn't have issues opening in a browser.

New Skills

Some other skills which would be easy to implement are:

Unit conversions (e.g. feet to cm)
Time Zone conversions
Currency conversions
Setting Alarms

Downloading vosk model failed

I downloaded this app recently, and when I press the middle download button I get the following message:
Failed downloading vosk model

device language: english
App version: 0.5

[Bug] Unable to find difference between "to" and "two"

I don't know if this is an issue you can really fix, but when trying to use the calculator, it is unable to tell the difference between "to" and "two" by voice in English.

Maybe (not sure though), for now all calculations that need numbers could be required to start with the phrase "calculate". After that, numbers and operation phrases like "divided" or "by" would be accepted. This way "two" would be prioritised when needed.
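The suggestion can be sketched in plain Java. This is only an illustration of the idea, not how Dicio handles input: the class name `HomophoneNormalizer` and the homophone table are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper (not part of Dicio): rewrites number homophones in calculator input. */
final class HomophoneNormalizer {
    // Assumed homophone table; a real one would need tuning per language and speech model.
    private static final Map<String, String> HOMOPHONES = Map.of(
            "to", "two", "too", "two", "for", "four", "won", "one", "ate", "eight");

    /** Replaces homophones only when the utterance starts with the trigger word "calculate". */
    static String normalize(String utterance) {
        String[] words = utterance.trim().toLowerCase(Locale.ROOT).split("\\s+");
        if (!words[0].equals("calculate")) {
            return utterance; // not a calculation: leave "to" and "for" untouched
        }
        List<String> out = new ArrayList<>();
        for (String word : words) {
            out.add(HOMOPHONES.getOrDefault(word, word));
        }
        return String.join(" ", out);
    }
}
```

With this sketch, `HomophoneNormalizer.normalize("calculate to plus for")` yields `calculate two plus four`, while a sentence that does not start with "calculate" is returned unchanged. A blanket replacement like this would also rewrite a legitimate "to", as in "two to the power of three", so a real implementation would need more context than a word list.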

Love the project, hope it continues!

null object

Steps to reproduce
1. Tell Dicio 'search for'
2. You get an error message

Expected behavior
'I did not understand, could you repeat?' as an answer, or even better:
ask what you're searching for

Calculator Functions

I can see that there is some basic mathematics comprehension integrated, but what are the limits?

For example, could you throw algebraic formulae and expect an answer?

So far, I tried throwing intermediate multiplication (powers, worked fine) and division (square root, was ignored or misunderstood), however I would like to know what the limits are, so as to be able to push them farther.

Also, whenever I speak, "2" and "4" are recognized as "to" and "for", respectively.

Publish on F-Droid

~~Publishing on F-Droid requires alphacep/vosk-api#558 to be solved, since the repository maven { url 'https://alphacephei.com/maven/' } is not allowed in F-Droid.~~ alphacep/vosk-api#558 was solved, waiting for approval by F-Droid: https://gitlab.com/fdroid/fdroiddata/-/merge_requests/9657

Calculator skill unavailable in spanish despite available translation

At least from what I can see, there's a Spanish translation for the calculator skill available; despite that, the app shows it as unavailable in the settings. I could help to translate it if needed, but it seems to be slightly different from the English file, so I don't know exactly what to do.

Custom translation system for sentences files

Problems

Currently sentences files have to be translated by copying the *.dslf files from the en folder to the new language folder, and then translating them. This is cumbersome since:

GitHub pull requests are not the best tool to handle translations, since even a small update requires a branch, a dev approval, a merge, ... (that's why Weblate is used for app strings)
The sentence and capturing group ids and the section specificity should not be translated, but there is nothing making sure that this does not happen (not even when building the app!)
Syntax errors are only found out about when building the app, but we can't expect translators to be able to build the app by themselves (this would be partially solved by adding a github action to pull requests that reports build errors, though the feedback would still not be instant)
Translating a dslf file is more difficult than just translating a string, since you have to think about all possible ways to say something. It is easy to miss a sentence and only realize about it when testing out the app (and again, we can't expect translators to build the app by themselves)

Custom translation system

Adding the dslf files to Weblate is doable, though that would only solve point 1, since it would not ensure that there are no syntax/semantic errors. I don't think Weblate has any way to add custom plugins that would do that.

Therefore I think the way to go would be to create a custom translation system with these features (italic means "difficult to implement"/"maybe later"):

git integration like Weblate (but more basic)
a view that shows the file currently being translated, with syntax highlighting
a view that shows syntax errors, highlighting them in the code
a view that shows other errors, such as mismatched sentence ids, capturing group ids or section specificity
a view with suggestions on how to improve a sentence
a text field that allows inserting a user input and then shows whether it matches or not, the matched capturing groups and which sentence actually matched
when a skill marketplace will be setup, third-party skill developers should be able to get their skill sentences hosted on the same service
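The "other errors" check from the list above could look roughly like this. It is a sketch under stated assumptions: it does not parse `.dslf` files, and it assumes some parser has already produced a map from sentence id to capturing group ids for both the reference language and the translation. `TranslationChecker` and `findMismatches` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical checker: compares ids of a translated sentences file against the reference. */
final class TranslationChecker {
    /**
     * Both maps go from sentence id to the set of capturing group ids used by that sentence.
     * Returns human-readable error messages, sorted by sentence id.
     */
    static List<String> findMismatches(Map<String, Set<String>> reference,
                                       Map<String, Set<String>> translated) {
        List<String> errors = new ArrayList<>();
        for (String id : new TreeSet<>(reference.keySet())) {
            if (!translated.containsKey(id)) {
                errors.add("missing sentence id: " + id);
            } else if (!reference.get(id).equals(translated.get(id))) {
                errors.add("capturing groups differ for sentence id: " + id);
            }
        }
        for (String id : new TreeSet<>(translated.keySet())) {
            if (!reference.containsKey(id)) {
                errors.add("unknown (possibly translated) sentence id: " + id);
            }
        }
        return errors;
    }
}
```

An empty result means the ids of the translation line up with the reference. A translation tool could run a check like this on every keystroke, which would give translators the instant feedback that a build-time check cannot.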

Create a matrix room for the project

Feature Request: Search music apps such as Spotify

It would be nice to be able to say something such as "Play Hello by Adele" and have it search Spotify for a track with that name

Skill marketplace

Dicio should be able to download skills from a marketplace where users can upload their own skills, whose development would be separated from Dicio, such as Mycroft Marketplace. A skill would be packaged as a compiled java file (i.e. a jar file) and then loaded at runtime by Dicio, to ensure the best performance. An alternative to this would be to have users install an app for each skill, and then sending Android intents around to communicate with skills: this is the approach taken by Athena, but I don't think it would work out well for Dicio since skills would then be unable to natively show graphical output to the user.
I will not focus on creating a skill marketplace right now, since Dicio is in a pretty early stage and still requires a lot of work. When (if) it will become more popular, and more people will start creating skills, creating a skill marketplace will become important. I opened this issue to illustrate my plans for the future ;-)

Language translation skill

Lingva.ml is an open-source front-end for Google Translate that contains no Google trackers or trackers of any other kind.
It is already deployed and continuously updated.
It would be great if you could integrate it into Dicio, as a WebView or in some other way.

Ask me anything.

For example, I type: translate "something" to <language_name>.
And it shows the output produced by lingva.ml.

Also, the app opener is really annoying and buggy. It really pisses me off that whenever I ask it to open an app, it opens something else.
Example: I activate Dicio through an edge swipe action, as I cannot start it with my voice. I ask it to 'open dialer' and it opens 'camera'! Yes, the dialer app's name is 'Phone'. And I find it hard to say "open Phone" as I'm already using the phone.

It would be really helpful if you made it recognise all the keywords.
Or tell me what needs to be learnt; I'll learn it, try to implement that, and send a pull request. :D

Request: timer skill

I would love to see a timer skill 🙂
As I am not into Java I can sadly not contribute by myself.

Question/Feature request dump

Hello,

Here's a list of things I thought of while testing the app. If there was a gitter/discord/other platform for this project I would've reached out there before writing this big post. I can create separate issues for feature requests that prove viable:

Which vosk model is dicio actually downloading from https://alphacephei.com/vosk/models ? I'm just curious because there are dozens of models listed under different licenses and I wanted to know which dicio is using
Feature request: Passive listening with a wakeup word. How would you feel about dicio always using the microphone but only reacting after a keyphrase such as "Hey Dicio"
What's the long-term plan for dicio's skills if dicio becomes massively popular and receives many contributions to dicio-skills? I'm not sure how much space skills use currently, but not everyone will want every skill. This sort of leads into my next feature request:
Feature request: 3rd party app integration skills. There's a few automation related apps I think dicio could integrate with nicely. They could reduce work needed on some dicio skills because they implement different things related to device management. Here's a list of apps and their docs on what could be used by dicio:

Termux - arbitrary commands in a terminal emulator for android. Docs on its RUN_COMMAND intent

KeyMapper - trigger keymaps from other apps

Easer - I can't find good docs for this particular one, but it has a Receive Broadcast event condition in which users can specify received actions and categories. It can then be used to trigger what Easer calls "profiles", which are sets of actions such as toggling WiFi/Bluetooth, starting services of other apps, and sending other broadcasts.

Sorry again for the long post.

[Weather] fahrenheit

For us cavemen plzkthxbai

small issue with the calculator function.

As you can see, "what's" is not seen as the same as "what is"

Also, it generally struggled with my accent, but that is not really something you guys can fix.

Other than that, really cool! Can't wait to see more

[Add VOSK Model] Please add Indian English Voice Model

The Indian English voice model from VOSK detects my speech with a 90% accuracy rate...while the normal english model detects at 30-40% accuracy.

Adding the Indian-English speech model should be relatively straightforward.

Thanks :)

Possibility to train city names

It would be great if there were a possibility to train the recognition of cities.

I'm unable to get the model to recognize Baar or Zug. 🙄
And I'm sure this will be the case for other places as well.

Or is there any trick?

Dynamically define trigger sentences for skills

Hello,

thanks for the latest update, which introduces the telephone skill! :)

I want to implement a hands-free workflow to call a person using only my Bluetooth headset. It is almost complete, but unfortunately the microphone recording quality over the headset is bad. It can't recognize the word "call" when I speak it (most of the time I get "oh" instead).

Could you make it possible to additionally define synonyms for the trigger words?

Thanks for your efforts.

Other languages. Where to start?

Hello.
I would like to translate the current skills to Russian. How can I do it? I see Vosk has a Russian voice model, so how do I attach it to Dicio?

Add [Feature request]: Floating mic button/Improving car usage

Hello,
First of all, congratulations for your great work! I have spent a huge amount of time looking for a free and open source voice assistant. Awesome project, thank you!!

I know it's an early stage, but I'd like to suggest (no ETA) implementing, if possible in the future, something that could help when using the phone in the car, like a floating mic button, or quick opening of navigation or music apps. The open function already works well, and I see that it is possible to implement some skills by building the app.

Thanks and congratulation again

Problems with translation.

I've tried to translate but it won't let me, it says something about the repositories.
https://hosted.weblate.org/projects/dicio-android/strings/#translations

[Feature Request]: Take dictation

It would be good if Dicio could take dictation, composing and modifying text.

Crash when timer isn't visible (Dicio 0.6)

Hey there,

Thanks for Dicio 0.6! I'm glad to have a timer feature, which is the main feature I use in voice assistants :)

Sadly, the timer feature is quite easy to break. If you set a timer, click settings and then click back, the app's state resets and a crash happens on the next TTS update:

FATAL EXCEPTION: main
Process: org.dicio.dicio_android, PID: 24891
java.lang.NullPointerException: Attempt to invoke virtual method 'int android.speech.tts.TextToSpeech.speak(java.lang.CharSequence, int, android.os.Bundle, java.lang.String)' on a null object reference
 at org.dicio.dicio_android.output.speech.AndroidTtsSpeechDevice.speak(AndroidTtsSpeechDevice.java:71)
 at org.dicio.dicio_android.skills.timer.TimerOutput.lambda$setTimer$0$TimerOutput(TimerOutput.java:243)
 at org.dicio.dicio_android.skills.timer.-$$Lambda$TimerOutput$JVQKXD3e0_QZmKL-N9F6BPYovhc.accept(Unknown Source:14)
 at org.dicio.dicio_android.skills.timer.TimerOutput$SetTimer$1.onTick(TimerOutput.java:74)
 at android.os.CountDownTimer$1.handleMessage(CountDownTimer.java:130)
 at android.os.Handler.dispatchMessage(Handler.java:106)
 at android.os.Looper.loop(Looper.java:223)
 at android.app.ActivityThread.main(ActivityThread.java:7664)
 at java.lang.reflect.Method.invoke(Native Method)
 at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:592)
 at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:947)
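
The trace shows `speak` being called on a null `TextToSpeech` reference from a timer tick. One defensive pattern is sketched below in plain Java. It assumes the engine reference is cleared when the screen is torn down, which the trace suggests but does not prove. `SpeechEngine` is a hypothetical stand-in for `android.speech.tts.TextToSpeech`, and `SafeSpeechDevice` is not Dicio's actual class.

```java
/** Hypothetical stand-in for a TTS engine such as android.speech.tts.TextToSpeech. */
interface SpeechEngine {
    void speak(String text);
}

/** Sketch of a speech device that tolerates late callbacks after its engine was released. */
final class SafeSpeechDevice {
    private SpeechEngine engine; // assumed to become null once cleanup() has run

    SafeSpeechDevice(SpeechEngine engine) {
        this.engine = engine;
    }

    /** Releases the engine, e.g. when the owning screen is destroyed. */
    void cleanup() {
        engine = null;
    }

    /** Speaks the text if the engine is still available; returns false if it was dropped. */
    boolean speak(String text) {
        SpeechEngine current = engine; // read once so the check and the call see the same value
        if (current == null) {
            return false; // a timer tick arrived after cleanup: ignore instead of crashing
        }
        current.speak(text);
        return true;
    }
}
```

A null check only hides the symptom. The timer would still be ticking against a torn-down screen, so cancelling it on cleanup is the more complete fix.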

95a4dd62-7402-4bb0-af23-a493d43a1048.mp4

(Please ignore the Dicio notification, this comes from Scoop which I personally find the easiest way to collect stack traces, it doesn't come from your app)

Weather tomorrow

I typed this command but it searched for a city with the name tomorrow.

It should search current city and detect time/date if possible before searching city name.
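
A rough Java sketch of that order of operations: pull known day words out of the captured text first, and treat only what is left as the city name. The word list and the `WeatherQuery` class are assumptions for illustration. An empty city is meant to signal "use the current or default city".

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical parse result for the text captured after "weather". */
final class WeatherQuery {
    // Assumed, English-only list of day words; a real skill would need this per language.
    private static final Set<String> DAY_WORDS = Set.of("today", "tonight", "tomorrow");

    final String day;  // "today" when no day word was found
    final String city; // empty when no city was given

    private WeatherQuery(String day, String city) {
        this.day = day;
        this.city = city;
    }

    /** Extracts the day word first; the remaining words are treated as the city name. */
    static WeatherQuery parse(String captured) {
        String day = "today";
        StringBuilder city = new StringBuilder();
        for (String word : captured.trim().split("\\s+")) {
            String lower = word.toLowerCase(Locale.ROOT);
            if (DAY_WORDS.contains(lower)) {
                day = lower;
            } else if (!word.isEmpty()) {
                if (city.length() > 0) {
                    city.append(' ');
                }
                city.append(word);
            }
        }
        return new WeatherQuery(day, city.toString());
    }
}
```

With this sketch, `WeatherQuery.parse("tomorrow")` has `day` set to `tomorrow` and an empty `city`, so the skill would no longer look for a city called "tomorrow". Prepositions such as "in" are not handled here.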

Feature request: Tasker plugin

It would be great to have a tasker plugin that can recognise text and pass it to tasker variable.
With such a thing, a lot of people would be able to write Tasker scripts with offline voice recognition without needing to recompile Dicio.

Tasker: https://tasker.joaoapps.com
Yes, it's not open source or even free, but it's a quite popular utility for Android automation.

Android voice input

On my Lineage OS 18.1 I can set a default assistant app. Once that's set, I would like to set Dicio as voice input, so when I trigger the default assistant by an external source (like my Bluetooth headset by pressing a button), it opens and starts listening. Just like I do by mapping a long swipe (it opens the Dicio app already listening).

Thanks in advance!

Feature request: Custom Synonyms for App names and Places

Many non-English models have problems with opening apps that have an English name, because the name does not appear in the Vosk dictionary. Also, a lot of places and cities are not recognized correctly by the models. A relatively easy workaround would be a feature that makes it possible to define synonyms for app names. This could also be useful for English speakers who have problems remembering some app names, or for unknown names that are not part of the Vosk dictionary.

Some examples:

Semantic synonyms: Whatsapp -> Text Messenger
Adding wrong transcriptions that often happen: WhatsApp -> what's app, Wort Sepp (German),...
Switching the Alphabet: WhatsApp -> WхатсАпп

I don't think that a predefined list of Synonyms should be part of the App, just a list that every user can define for themselves.
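
A per-user synonym list like this could be as small as a lookup table consulted before the app search. The sketch below is plain Java with hypothetical names (`AppSynonyms`, `add`, `resolve`). How the list is stored and edited in settings is left open.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical user-defined synonym table, consulted before searching installed apps. */
final class AppSynonyms {
    private final Map<String, String> synonymToApp = new HashMap<>();

    /** Registers a phrase the recognizer tends to produce for the given app name. */
    void add(String synonym, String appName) {
        synonymToApp.put(key(synonym), appName);
    }

    /** Returns the mapped app name, or the spoken text unchanged if no synonym matches. */
    String resolve(String spoken) {
        return synonymToApp.getOrDefault(key(spoken), spoken);
    }

    // Lowercases and collapses whitespace so "Wort  Sepp" and "wort sepp" hit the same entry.
    private static String key(String text) {
        return text.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }
}
```

After `add("Wort Sepp", "WhatsApp")`, calling `resolve("wort sepp")` returns `WhatsApp`, and anything not in the table passes through untouched to the normal app matching.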

What do you think?

Use Dicio as system STT / voice recognition service

It is not an urgent thing,
but I think it would be very nice to be able to set the system STT in the Dicio's input mode as well.
There are many PoC projects to create FOSS STTs, some based on vosk, some on Mozilla Deepspeech or other.
At present none are really functional, but when they will be ready, I find it useless to download and save in two separate places the same vosk models for example.
I repeat that there is no hurry, but I think it should be done sooner or later.

Ends of words

Does the Dicio sentence syntax allow different endings for a word? I've found two issues with it; I can work around the first, but not the second.

Example with the 'open' skill in Russian. It's possible to say "запусти" (do run), "запустите" (polite form of "run") or "запустить" (to run). I found that I can use a line like
запусти|запустите|запустить .what.
But it would be more convenient to use something like
'запусти(те|ть)?'
Is it possible?
Harder to work around: word endings in capture groups. Example with the 'weather' skill. I try to ask for the weather in Moscow (Москва):
какая погода в Москве?
And openweathermap does not know a city 'Москве', it only knows 'Москва'.
So, is it possible to set up word forms in capture groups?

I suppose the second issue will cause worse trouble in a unit converter skill, because all units have different forms in singular and plural.
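
To make the two requests concrete, here is how they behave in plain Java. This is not dslf syntax, and whether dslf supports anything similar is exactly the open question. The regular expression uses the standard `java.util.regex` API. The inflection table is a hand-written assumption with a single entry, where a real solution would need a morphological dictionary or a stemmer.

```java
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helpers illustrating optional word endings and word-form normalization. */
final class RussianForms {
    // One pattern covers "запусти", "запустите" and "запустить"; group 1 is the app name.
    private static final Pattern OPEN = Pattern.compile(
            "^запусти(?:те|ть)?\\s+(.+)$",
            Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

    // Assumed lookup table from an inflected form to the nominative form.
    private static final Map<String, String> NOMINATIVE = Map.of("москве", "Москва");

    /** Returns the captured app name, or null if the input is not an "open" command. */
    static String appToOpen(String input) {
        Matcher matcher = OPEN.matcher(input.trim());
        return matcher.matches() ? matcher.group(1) : null;
    }

    /** Maps a captured city to its nominative form when the table knows it. */
    static String cityForLookup(String captured) {
        return NOMINATIVE.getOrDefault(captured.toLowerCase(Locale.ROOT), captured);
    }
}
```

Here all three verb forms lead to the same capture, and `cityForLookup("Москве")` returns `Москва`, the form the weather service recognizes.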

Spanish sentences

Hi, I downloaded dicio and noticed that some of the skills are not available in spanish. I can see that some .dslf files are not present in the spanish directory and I would like to contribute with those translations.

Is there a specific requirement that the sentences themselves should meet (to ensure that the engine will properly recognize them), or is it enough to just write what I consider the proper translation for a given skill? Sorry if this question seems a bit dumb, I haven't contributed to any STT-related project before and I want to do the best I can.

The project looks amazing so far btw, gratz!

Implement error activity

The NewPipe ErrorActivity and related files should be integrated into Dicio so that errors can be reported easily.

Also, the current default way to display errors generated by a skill being evaluated is to add a new view with the full stack trace to the output screen. This should be replaced with just an error message and a button allowing the user to open the error activity.

Add "retry" button to error messages

Currently when there is an error and a skill fails to do its job, there is no way to retry the same action. The only workaround is to tap on the last input, add a character somewhere, and then press Enter so that it is sent again.

This should be changed by keeping track of the last input (which could even be a list of inputs, if the user is in a conversation with multiple prompts, e.g. the telephone). Then the user should be able to retry the last action both by saying "retry" or by tapping retry on the error message.
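
The bookkeeping described here can be sketched as a small class that remembers every input of the current conversation. The names are hypothetical and this is not taken from Dicio's code. How a replay is fed back into skill evaluation is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical tracker for the inputs that make up the current conversation. */
final class InputHistory {
    private final List<String> conversation = new ArrayList<>();

    /** Call for the first input of a new conversation; forgets the previous one. */
    void startConversation(String input) {
        conversation.clear();
        conversation.add(input);
    }

    /** Call for answers to follow-up prompts, e.g. choosing a contact in the telephone skill. */
    void addFollowUp(String input) {
        conversation.add(input);
    }

    /** Inputs to replay, in order, when the user asks to retry. */
    List<String> inputsToRetry() {
        return List.copyOf(conversation);
    }

    /** True if the input is the spoken or typed retry command (English only in this sketch). */
    static boolean isRetryCommand(String input) {
        return input.trim().equalsIgnoreCase("retry");
    }
}
```

Both the retry button and the spoken "retry" would then replay `inputsToRetry()` in order. The retry command itself should not be recorded, otherwise a second retry would replay "retry".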

application does not start

the application does not start and has an error

Icon Dicio

I think Dicio's icon is way too simplistic (no offense to whoever made it), Dicio is a voice assistant, so its icon should inspire usability.
I tried to make an icon that would better fit the project, I started with the idea that when people see the icon, they should know right away that it's a voice assistant, or at least that it has something to do with voice.
Please tell me what you think about it and if it fits, you can use it as you want.

Wake up word / wakeword recognition

All assistants have a wake up word (e.g. "Hey Google"), so Dicio should have it, too. This should be doable with a service running in the background with Vosk that keeps listening. Athena already has this feature (video), we might take inspiration from it. The wake word recognizer should obviously be easy to enable/disable in settings, and should probably be implemented with a foreground service, so that newer Android versions do not force close it after a while.

Not able to download VOSK model

When clicking the microphone icon, I get a toast saying that the VOSK model is downloading, but then the spinner keeps going round and round and nothing ever happens. I tried it multiple times.
Help Please

Launching Apps

To test this feature (and joke around a bit), I tried telling Dicio to open itself (by saying "Open Dicio").

It took three tries, which I would probably attribute to my accent, however this issue is about the hilarious results of the first two tries (the last try brought up a toast about opening Dicio, though nothing else happened, good job on that!).

First try: Dicio interpreted my command as "Open CEO", and for some odd reason launched ProtonMail.

Second try: Dicio interpreted my command as "Open D C O if", and oddly enough launched Aegis.

I can understand my accent not going through, but not operations that are completely unrelated to the interpreted results.
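
One way to avoid launching an unrelated app is to score every candidate and refuse to open anything when even the best score is poor. The sketch below uses Levenshtein edit distance normalized by length. This is a swapped-in technique for illustration, not a description of how Dicio currently picks an app, and the threshold value in the usage note is an arbitrary assumption.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical matcher that rejects app names too far from what was heard. */
final class AppMatcher {
    /** Classic Levenshtein edit distance, computed with two rolling rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns the closest app name, or empty if its relative distance exceeds the limit. */
    static Optional<String> bestMatch(String spoken, List<String> appNames, double maxRelativeDistance) {
        String query = spoken.toLowerCase(Locale.ROOT);
        String best = null;
        double bestScore = Double.MAX_VALUE;
        for (String app : appNames) {
            String name = app.toLowerCase(Locale.ROOT);
            int longest = Math.max(1, Math.max(query.length(), name.length()));
            double score = (double) levenshtein(query, name) / longest;
            if (score < bestScore) {
                bestScore = score;
                best = app;
            }
        }
        return bestScore <= maxRelativeDistance ? Optional.ofNullable(best) : Optional.empty();
    }
}
```

With a limit of 0.4 and the apps ProtonMail, Aegis and Dicio, the misheard "CEO" matches nothing, so the assistant could answer that it found no such app instead of opening ProtonMail, while "dicio" still resolves to Dicio.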

feature request: start the Dicio listening service in background via intent

From a comment in #64 , I'd like to be able start the listening service in the background via an intent. Personally I'd send it via Termux's implementation of am and/or a KeyMapper action.

Popup instead of fullscreen when triggered from system buttons

It would be nice if when we open the application with the shortcuts we only have a pop-up of the microphone for example but not that the whole application opens. I think it would give an impression of speed and really of a more discreet assistant that activates in a corner and waits for instructions.

Add checkstyle

Similarly to NewPipe, some Java style should be enforced so that PRs follow a consistent style. NewPipe uses checkstyle for this purpose, and it seems to work well.

skill request: broadcast user defined intent at runtime

I'd like to integrate Dicio into my personal automation system which primarily uses KeyMapper and Termux.

KeyMapper can receive intents that are sent containing a UUID and I want Dicio to send them. As the UUIDs are randomly generated at runtime, I'd need a text box in Dicio to copy the UUIDs to, rather than hard coding them. Additionally, #62 would probably be a prerequisite to this skill as it'd require having pronounceable name(s) for the intents.

Running commands in Termux would require requesting a permission and a few other steps

Additionally both KeyMapper and Termux (through am) can send service, activity, and broadcast intents. Is it possible to start Dicio's "listening" in the background (I.e. without starting the main Dicio activity)?

Thank you

I translated the .dslf files into Spanish

Hello everyone, I have translated lyrics, search, open, calculator and weather into Spanish. My English is basic, but with my basic knowledge and a translator I was able to translate them. I wanted to upload the files but it won't let me. How could I contribute this translation?

Implement About screen

Something like this but in a fragment of its own

Add usage instructions somewhere in the app

How to set the app as the system assistant (i.e. that triggers when long pressing the circle button)
How to setup a TTS if the on-device one does not work, with a button that leads to the relevant system settings screen
& more

Not an issue, but can't figure out how to contact otherwise

I've been working on a really similar project, but I already have STT, TTS, and NLU working on-device (without network access). I've also created a plug-in system that doesn't require a user to fork or add skills to the existing project to run them. Would you like to work together to merge the projects, so we can bring a FOSS app to the community?

The primary project is https://github.com/Tadashi-Hikari/Sapphire-Assistant-Framework, but the most active version can be found at https://github.com/Tadashi-Hikari/Athena

Some ideas for skills

I have some ideas for skills:

You can add a hello skill (i.e. I say: Hello Dicio, Dicio says: Hi, may I help you?).
You can add a tell me a joke skill.
You can add a type a message skill. When you are in a car, you need this skill.
~~You can add wake up phrase.~~
~~You can add alarm skill.~~
~~You can add translator skill.~~
You can add a what is my name skill.
You can add a shopping list skill.
~~And many many other skills.~~

These are my ideas, you can add also your ideas.

Skill translation

Hello, I would like to participate in the translation of skills. I have already understood more or less how it works, but I don't know if there is any need to compile anything.
A little help would be nice