Comments (3)

jaacoppi commented on June 13, 2024

from espeak-ng.

Rick-McCoy commented on June 13, 2024

Huh, I think that's the true extent of this bug; the dakuten itself isn't the cause.
Since there are overlapping rules for both るう and うぃ, espeak-ng converts the former first, and fails upon encountering .

From what I can glean, espeak-ng walks the input sequentially and, at each position, consumes the longest grapheme sequence that matches a rule, i.e. it uses a greedy algorithm.
If that is the case, we could handle these anomalies by specifying all possible corner cases:

かあぁ -> ka a:
しいぃ -> s\\i i:
つうぅ -> t_su u:
ねえぇ -> ne e:
...

Alternatively, we could just add rules for all the small versions of the vowels and call it a day:

ぁ -> a
ぁー -> a:
ぃ -> i
...
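
To compare the two options above, here is a toy greedy longest-match translator. It is only a sketch of the behaviour described in this comment, not espeak-ng's actual rule engine; `greedy_translate`, the rule dictionaries, and the かあ -> ka: base rule are all hypothetical and exist only for illustration.

```python
def greedy_translate(text, rules):
    """Toy model: at each position, consume the longest key found in `rules`."""
    max_len = max(len(key) for key in rules)
    phonemes, unmatched = [], []
    pos = 0
    while pos < len(text):
        # Try the longest candidate first, then shrink.
        for length in range(min(max_len, len(text) - pos), 0, -1):
            chunk = text[pos:pos + length]
            if chunk in rules:
                phonemes.append(rules[chunk])
                pos += length
                break
        else:
            # No rule starts here: record the character and skip it.
            unmatched.append(text[pos])
            pos += 1
    return phonemes, unmatched


# Hypothetical base rules (assumed for this sketch, not copied from ja_rules).
base_rules = {"か": "ka", "あ": "a", "かあ": "ka:"}

# Option 1: spell out the corner case explicitly.
corner_case_rules = {**base_rules, "かあぁ": "ka a:"}

# Option 2: give the small kana rules of their own.
small_kana_rules = {**base_rules, "ぁ": "a", "ぁー": "a:"}

print(greedy_translate("かあぁ", base_rules))         # (['ka:'], ['ぁ'])
print(greedy_translate("かあぁ", corner_case_rules))  # (['ka a:'], [])
print(greedy_translate("かあぁ", small_kana_rules))   # (['ka:', 'a'], [])
```

In this toy model both options get rid of the unmatched ぁ, but they do not produce the same phonemes: the corner-case rule yields `ka a:`, while the small-kana rule leaves the greedy かあ match in place and yields `ka: a`.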

Of course, this still leaves the problem of the dakuten (and handakuten), which by definition doesn't have a fixed sound.

I propose a mixed strategy: remove the separation of dakuten/handakuten and treat graphemes such as as one grapheme.
Then we could add the smaller versions separately.
We would still need to rewrite most rules, but I think this would minimize the work necessary.
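
As an illustration of what "one grapheme" means at the Unicode level, the sketch below composes a base kana plus a combining dakuten/handakuten into its single precomposed code point using NFC normalization. This is an assumption-laden aside: the thread does not say whether espeak-ng ever receives decomposed input or normalizes it, and `compose_voiced_marks` is a hypothetical helper, not part of espeak-ng.

```python
import unicodedata


def compose_voiced_marks(text):
    """Return `text` in NFC, so kana + combining (han)dakuten become one code point."""
    return unicodedata.normalize("NFC", text)


# か (U+304B) + combining dakuten (U+3099) composes to が (U+304C).
ga = compose_voiced_marks("\u304b\u3099")
# は (U+306F) + combining handakuten (U+309A) composes to ぱ (U+3071).
pa = compose_voiced_marks("\u306f\u309a")

print(len("\u304b\u3099"), len(ga))  # 2 1
print(ga == "\u304c", pa == "\u3071")  # True True
```

With input in this form, a rule table would only need entries keyed on the composed kana, which is the spirit of the mixed strategy above.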

Rick-McCoy commented on June 13, 2024

Hmm, this isn't limited to small kana, either. The long vowel indicator (chōonpu) causes this too:

$ espeak-ng -v ja とおー -X
Translate 'とおー'
 36     と      [to]
 57     とお   [to:]

Translate ''
 36     と      [to]

Translate ''
 36     お      [o]

Translate ''
Found: '_ja' [dZ'ap@ni:z]  
t'o 'o _:(en)dZ'ap@ni:z(ja)l'et@

Unlike the samples above, which are admittedly pretty niche, this is a very common combination.
