Comments (3)

ken107 commented on September 26, 2024

We try to break the text into paragraphs rather than sentences, because the latency of cloud voices can make the pause between sentences too long. On the other hand, some native voices, like "Google US English", require us to break the text into chunks no longer than 15 seconds because of a limit imposed by the voice engine. In other words, the chunk size varies depending on the voice, and we just have to accept that.
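As a rough illustration of the chunking described above (a sketch under assumptions, not the extension's actual code): `chunkText` and `maxChars` are hypothetical names, and a character budget is only an assumed stand-in for the 15-second limit, since real duration depends on the voice and speaking rate.

```typescript
// Hypothetical helper: split text into speech chunks.
// `maxChars` is an assumed proxy for a voice's duration limit;
// pass Infinity for voices that have no such limit.
function chunkText(text: string, maxChars: number): string[] {
  const chunks: string[] = [];
  // Paragraphs are separated by one or more blank lines.
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  for (const paragraph of paragraphs) {
    if (paragraph.length <= maxChars) {
      chunks.push(paragraph); // preferred case: one chunk per paragraph
      continue;
    }
    // Paragraph exceeds the budget: pack whole sentences into chunks.
    // A single sentence longer than maxChars is still emitted as one chunk.
    const sentences = paragraph.split(/(?<=[.!?])\s+/);
    let current = "";
    for (const sentence of sentences) {
      const candidate = current ? current + " " + sentence : sentence;
      if (current && candidate.length > maxChars) {
        chunks.push(current);
        current = sentence;
      } else {
        current = candidate;
      }
    }
    if (current) chunks.push(current);
  }
  return chunks;
}
```

Calling `chunkText(text, Infinity)` keeps whole paragraphs, which suits high-latency cloud voices, while a finite budget falls back to sentence groups for voices with a hard limit.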

So we need to highlight the text on the page that corresponds to the chunk currently being read. The challenge is that the boundaries of the chunk, i.e. its start/end indices, may not align with DOM element boundaries. The start or end index may fall in the middle of a span, which means we have to break that span into two spans just for the purpose of highlighting. This is slightly tricky to do, as the algorithm has to work in both light mode and dark mode, and on a wide variety of websites that use different markup and styling.
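To make the boundary problem concrete, here is a minimal sketch of the index arithmetic only. `NodeSlice` and `sliceNodesForChunk` are hypothetical names, and the input is assumed to be the lengths of the page's text nodes in document order, with chunk indices measured against their concatenated text.

```typescript
// One slice of a text node that falls inside the chunk being read.
interface NodeSlice {
  nodeIndex: number; // position of the text node in document order
  start: number;     // offset within that node (inclusive)
  end: number;       // offset within that node (exclusive)
}

// Hypothetical helper: given the lengths of consecutive text nodes and a
// chunk's [chunkStart, chunkEnd) range in the concatenated text, return
// the part of each node that the chunk covers.
function sliceNodesForChunk(
  nodeLengths: number[],
  chunkStart: number,
  chunkEnd: number
): NodeSlice[] {
  const slices: NodeSlice[] = [];
  let offset = 0; // global index at which the current node begins
  nodeLengths.forEach((length, nodeIndex) => {
    const start = Math.max(chunkStart, offset) - offset;
    const end = Math.min(chunkEnd, offset + length) - offset;
    // Nodes entirely outside the chunk produce an empty or negative span.
    if (start < end) slices.push({ nodeIndex, start, end });
    offset += length;
  });
  return slices;
}
```

In a content script, each slice could then be wrapped by building a `Range` with `setStart`/`setEnd` on the text node, or by calling `Text.splitText` at the two offsets; a slice whose `start` is greater than 0, or whose `end` is less than the node length, is exactly the "middle of a span" case.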

For this reason I have not invested the time in this, though I've seen a few other extensions do it fairly successfully.

from read-aloud.

darvon123 commented on September 26, 2024

I'll try exploring the source code of the extensions you mention to see what makes them tick. Maybe I can find some function they may have used to accomplish this.

The errors I have encountered with text highlighters in the past:

  1. column shifting
  2. metadata bleed [1]
  3. poor highlighter timing [2]
  4. desyncing issues
  5. weirdly bad CSS priority, leading to washed-out or hard-to-read highlighted text

The other extensions: first glance

At first glance, the extensions you mention seem to have "solved" the problem of highlighting text. On closer examination, though, it seems to me that they sidestep the issue altogether by reconstructing the site's main content in their own style while keeping most of the site's formatting "the same". The main culprits I found doing this are "Speechify" and "Natural Reader". It looks perfect at a glance, but look a little closer and it feels off in some way. I find that it leads to some jarring formatting errors on text-heavy sites that I visit frequently, like some text having a slightly smaller or larger font size than usual. It's easy to miss if you're not looking for it. It does allow for some cool visual aid features, though, like a dyslexia font changer and a clean reader mode.


Edge: first glance

I think the best approach is to follow how Edge does its text highlighting: change only what needs to be changed and leave everything else alone. This does seem to cause a little column shifting occasionally.

It seems to me that Edge tends to keep it simple, highlighting the paragraph it's currently reading in light blue and the individual words in yellow. The words themselves aren't completely in sync with what's being said, but that's kind of okay, since audio sync is easy to ignore if you're not looking for it. Edge's approach is still imperfect and suffers from desyncing, especially over large spans of text.
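For the per-word part, a small sketch of how a character index can be turned into a word range. This is not Edge's implementation: `wordRangeAt` is a hypothetical helper, and it assumes the index comes from something like the `charIndex` of a `SpeechSynthesisUtterance` "boundary" event, which not every voice reports.

```typescript
// Hypothetical helper: find the [start, end) range of the word that
// contains `charIndex` within the text of the paragraph being read.
function wordRangeAt(text: string, charIndex: number): [number, number] | null {
  if (charIndex < 0 || charIndex >= text.length) return null;
  const isSpace = (ch: string): boolean => /\s/.test(ch);
  // An index that lands on whitespace sits between words.
  if (isSpace(text.charAt(charIndex))) return null;

  let start = charIndex;
  while (start > 0 && !isSpace(text.charAt(start - 1))) start--;
  let end = charIndex;
  while (end < text.length && !isSpace(text.charAt(end))) end++;
  return [start, end];
}
```

Since the word is derived from whatever index the engine last reported, any delay in that report shows up directly as the kind of drift described above.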


That said, I believe most of the internet's webpages are pretty standardized, with much of the content being very simple HTML documents with very little CSS standing in the way.

If you're worried about CSS priority fighting, why not place the highlighter under a custom tag and set it to a high priority, like Edge does?
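A minimal sketch of what I mean, assuming a made-up tag name and colors (none of this is taken from Edge): `buildHighlightCss` is a hypothetical helper that produces the stylesheet text an extension could inject.

```typescript
// Hypothetical helper: build a stylesheet for a custom highlight element.
// The tag name and colors are illustrative assumptions.
function buildHighlightCss(tagName: string, background: string, color: string): string {
  // Simplified check: valid custom element names must contain a hyphen.
  if (!/^[a-z][a-z0-9]*(-[a-z0-9]+)+$/.test(tagName)) {
    throw new Error(`Invalid custom element name: ${tagName}`);
  }
  return [
    `${tagName} {`,
    // Set both colors so the text stays readable in light and dark themes.
    `  background-color: ${background} !important;`,
    `  color: ${color} !important;`,
    // Stay inline so wrapping a word does not shift the surrounding layout.
    `  display: inline !important;`,
    `}`,
  ].join("\n");
}
```

For example, `buildHighlightCss("ra-highlight", "yellow", "black")` yields a rule that only touches the wrapper element. A page rule that is also `!important` and more specific could still win, so this reduces the fighting rather than eliminating it.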

Besides, nobody expects perfection from a free open-source extension.

Footnotes

  1. This is what I came across when trying to make my own highlighter. It's basically what happens when your highlighter function clips a span element, so the highlight bleeds into your main content. This also leads to poor highlighter timing and desyncing.

  2. When your highlighter function is delayed by metadata or by a connection timeout or other connection issue.

nhan000 commented on September 26, 2024

If this ever gets implemented, please include an opt-out option. I like how this extension lets me interact with the text (highlight, copy) without changing the reading position, unlike other extensions and Edge's built-in Read Aloud feature. I'm willing to trade that for the text highlighting capability.

The best thing for my use would be to have 2 modes:

  • Highlight-only mode: highlight the text, but don't change the reading location when I interact with the text.
  • Regular mode: change the reading location when I click a word (as with other extensions).
