Comments (3)

Kaljurand commented on August 26, 2024

Interesting idea...

You can make a rewrite rule / button that calls: ee.ioc.phon.android.speak/.activity.GetPutPreferenceActivity with extras:

  • key: defaultRewriteTables
  • val: list of table names that you want to be active

E.g. "Command" = "activity" and "Arg1" =

{
  "component": "ee.ioc.phon.android.speak/.activity.GetPutPreferenceActivity",
  "extras": {
    "key": "defaultRewriteTables",
    "val": ["punctuation", "spelling"]
  }
}

I.e. you would need to list all the tables that should be active, so it's not quite the same as toggling a single table. I guess a toggling feature could be added to GetPutPreferenceActivity quite easily.
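To illustrate the difference between listing and toggling, here is a minimal sketch of how the full table list could be computed outside the app and turned into the "Arg1" JSON shown above. The component name and the extras key come from the comment; the helper names `toggle_table` and `build_arg1` are hypothetical, and none of this has been tested against K6nele itself.

```python
import json

# Component and extras key as given in the comment above.
COMPONENT = "ee.ioc.phon.android.speak/.activity.GetPutPreferenceActivity"


def toggle_table(active, name):
    """Return a new list: `name` removed if it was active, appended otherwise.

    Hypothetical helper; the order of the remaining tables is preserved.
    """
    if name in active:
        return [t for t in active if t != name]
    return active + [name]


def build_arg1(active):
    """Build the "Arg1" JSON for the "activity" command (hypothetical helper)."""
    return json.dumps(
        {
            "component": COMPONENT,
            "extras": {"key": "defaultRewriteTables", "val": active},
        },
        indent=2,
    )


# Turning "spelling" off leaves only "punctuation" in the list.
tables = toggle_table(["punctuation", "spelling"], "spelling")
print(build_arg1(tables))
```

Because the preference stores the complete list, whoever generates the rule has to know the current state, which is why a built-in toggle would be more convenient.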

So this would be possible already via the existing features (although I haven't tested it). Any improvements beyond that would require some thinking/testing, e.g.

  • how convenient the actual UX would be, e.g. you might have to "refresh" the tables by temporarily switching to a different IME tab
  • would you need to be able to reorder the tables in order to resolve ambiguous mappings in a certain way
  • would you want to switch between modes within the same utterance
  • ...

from k6nele.

devycarol commented on August 26, 2024

Honestly the more I think about it the more it becomes a STT-API-level problem. Because say we get the toggle buttons on the main keyboard interface, cool, but then we have to deal with getting the API to handle each individual 'mode'—not to mention multiple at once. Some of these potential modes in my "dream" scenario go beyond simple rewrites into telling the API to only return certain characters/phrases—I'm not sure about the open source ones, but I believe Google's is incompatible with such functionality.

But I imagine that if that API-level puzzle were solved, then having the buttons would simply be a matter of allowing their addition, linking the timestamps of the words outputted with the rules that were active at the time, as well as letting certain buttons disable others when enabled—"only allow words, forcing lowercase" and "allow punctuation only" don't exactly mix very well 😅
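A rough sketch of the two bookkeeping pieces mentioned above: buttons that disable conflicting ones, and looking up which modes were active when a word was spoken. Everything here is an assumption for illustration: the mode names, the `CONFLICTS` table, and the idea that the service returns a timestamp per word.

```python
from bisect import bisect_right

# Hypothetical mode names; each mode lists the modes it switches off.
CONFLICTS = {
    "words_lowercase": {"punctuation_only"},
    "punctuation_only": {"words_lowercase"},
}


def enable(active, mode):
    """Return the active set after enabling `mode`, dropping conflicting modes."""
    return frozenset((active - CONFLICTS.get(mode, set())) | {mode})


def modes_at(history, t):
    """Return the modes active at time `t`.

    `history` is a list of (time, modes) pairs sorted by time, one entry per
    button press. Times before the first entry map to the empty set.
    """
    times = [entry[0] for entry in history]
    i = bisect_right(times, t) - 1
    return history[i][1] if i >= 0 else frozenset()


# Button presses at 1.5 s and 3.0 s into the utterance (made-up numbers).
state = frozenset()
history = [(0.0, state)]
state = enable(state, "words_lowercase")
history.append((1.5, state))
state = enable(state, "punctuation_only")
history.append((3.0, state))

for word, start in [("hello", 0.4), ("world", 2.0), ("comma", 3.2)]:
    print(word, sorted(modes_at(history, start)))
```

The lookup only works if the recognizer reports word timings at all, which is part of the API-level question raised above.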

Kaljurand commented on August 26, 2024

Yes, applying the rewrite rules in post-processing to whatever text the service returns by default would maybe cover the simpler use cases, but wouldn't be expressive enough to deal with homophones etc. in general. Also, the rewrite rules only see the returned formatted text, not any meta-info that the service might send back via its API (such as timestamps, alternative hypotheses, unformatted results).

The rewrite rules can send queries to a REST API (via FetchUrlActivity.java as done in https://docs.google.com/spreadsheets/d/1lxvkGerd_WMljca0dsgxViw_5cnOEgDzneBL-uXI-xI/edit#gid=0, or by using the "getUrl" command). So, if the service exposes certain features via a REST API (e.g. switching between language models), then you might be able to have a button that switches these features on and off between utterances.
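As a sketch of what such a button would need to request, the snippet below builds the URL for a purely hypothetical service endpoint. The base URL and the `feature` / `enabled` parameter names are assumptions, not part of K6nele or of any real recognition service; the resulting URL is what a rewrite rule would then fetch.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; a real service would define its own URL and parameters.
BASE_URL = "https://example.org/asr/config"


def feature_url(feature, enabled):
    """Build the URL that switches `feature` on or off (hypothetical API)."""
    query = urlencode({"feature": feature, "enabled": "1" if enabled else "0"})
    return BASE_URL + "?" + query


print(feature_url("punctuation", True))
print(feature_url("punctuation", False))
```

One URL per state keeps each button stateless: an "on" button and an "off" button, rather than a single toggle that has to know the server's current setting.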

Or maybe you can set up multiple services, each possibly with the same backend server, but configured differently, and then use the existing service switching button(s) to effectively switch between features. In this case you'd probably have to implement each service as a separate lightweight app, because Android (probably) does not support spawning services dynamically.
