espeak-ng / espeak-ng
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
License: GNU General Public License v3.0
The voice files are closer to language files. These languages should be organised by the closest ISO 639-5 language family code (e.g. voices/europe/cy moving to language/cel/cy -- Celtic/Welsh).
Additionally, the languages should be BCP 47 compliant (e.g. en-GB-scotland instead of en-sc). Where extensions are needed, the private use tags from Cainteoir Text-to-Speech should be used. These should be described in a privateuse.dat file included in the espeak-ng project.
The command line interfaces to espeak-ng and speak-ng, as well as the C API, should all be documented in markdown that generates man pages and HTML.
The MBROLA voice phoneme maps should be removed -- these should be replaced by MBROLA voice specific phoneme tables. These should support:
It should be possible to modify the phoneme sequence to account for missing diphones (e.g. the h-6 diphone on the German voices).
When build 32655d is compiled and installed, it produces only a few beeps and clicks and then halts.
(Built on Ubuntu 14.04.3 LTS, kernel x86_64 3.13.0-74-lowlatency.)
This code is not used by the program and is accessible by the source control history. Therefore, this code should be removed to make the code more readable.
When trying to compile a language with advanced debugging, e.g. from the src directory (and similarly from other directories):
./espeak-ng --compile-debug=lv
It returns:
Can't access (r) file 'lv_rules'
Even though generated files are generally not included in version control, adding src/espeak-ng.1.html (and/or src/speak-ng.1.html?) would allow it to be referenced from other documents.
The ucd-tools project is being used in the eSpeak for Android project. It is also needed on platforms like Windows Mobile. Additionally, different versions of platforms support different versions of Unicode. Using ucd-tools would make the Unicode support consistent.
These files should use the BCP47 code for the language (en-GB-scotland, pt-BR, da, etc.) and should be separate from the voice files.
The binary data files are difficult to work with and version properly. These should be stored in a text format and compiled to binary during the build, like the language dictionaries are.
Currently, not all code paths inherited from the espeak codebase return the correct error code. The code should be checked so that errors are reported through an espeak_ng_STATUS code and an espeak_ng_ERROR_CONTEXT object.
This will ensure that the underlying error is correctly preserved.
Currently at master 36be9ac src/libespeak-ng/speech.h line 41: PLATFORM_POSIX is a hardcoded define.
This makes it impossible to compile on non-POSIX platforms such as Windows, even when contrary PLATFORM_* defines are provided on the compiler command line.
Perhaps moving the code across to some kind of configure.h.in or equivalent is a work in progress, but for now this completely breaks compilation of libespeak-ng on Windows for NVDA.
Note that after commenting this line out, addressing the fclose bug in LoadSpectSeq, handling NULL log FILE arguments in compilePhonemeData and compileIntonations, and addressing the phsource directory check in compilePhonemeData2, libespeak-ng compiles successfully and is used by NVDA.
spect.h
This should add Visual C++ project files to support Windows-based builds.
The mkdictlist file is missing; I want to use it to add new voices for Arabic.
The program, library, environment variables, etc. should reflect the project being espeak-ng. This will help differentiate it from the upstream espeak project, should Jonathan continue to make releases to it, allowing both to run on a system. It also helps to avoid confusion.
Currently, the espeak portability is a mix of PLATFORM_* checks and HAVE_* checks.
The portability approach should be like how libressl handles portability. That is, compatibility headers under src/include/compat and src/compat/API_NAME.c compatibility implementations.
This will keep the main source code clean and free of #if ... checks.
Various places in the codebase (e.g. the async thread code) call assert on errors. This will cause the calling application to exit abruptly.
These assert checks should be made into if checks that return espeak_ng_STATUS error codes that are propagated to the caller.
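A minimal sketch of what such a conversion could look like. The status enum, queue type and function names below are hypothetical stand-ins chosen for illustration; they are not the real espeak_ng.h definitions or the actual async thread code.

```c
#include <stddef.h>

/* Hypothetical stand-in for espeak_ng_STATUS; the real values live in espeak_ng.h. */
typedef enum {
    SKETCH_OK = 0,
    SKETCH_INVALID_ARGUMENT,
    SKETCH_QUEUE_FULL
} sketch_status;

#define SKETCH_QUEUE_CAPACITY 4

/* Hypothetical fixed-size queue, used only to give the checks something to guard. */
typedef struct {
    int items[SKETCH_QUEUE_CAPACITY];
    size_t count;
} sketch_queue;

/* Before: assert(q != NULL); assert(q->count < SKETCH_QUEUE_CAPACITY);
 * After:  each failed precondition is returned to the caller as a status code,
 *         so the calling application can decide how to recover. */
static sketch_status sketch_queue_push(sketch_queue *q, int item)
{
    if (q == NULL)
        return SKETCH_INVALID_ARGUMENT;
    if (q->count >= SKETCH_QUEUE_CAPACITY)
        return SKETCH_QUEUE_FULL;
    q->items[q->count++] = item;
    return SKETCH_OK;
}
```

A caller would test the return value against SKETCH_OK and pass any other value up its own call chain rather than aborting.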
The speak-ng program should link to the static library.
This is an artifact of the Windows Visual Studio build support for pre-compiled headers. The pre-compiled headers are only used for including a lean and mean version of windows.h. Thus, in the places where windows.h is actually used, it should be included within an #ifdef PLATFORM_WINDOWS block. The Visual Studio project files should use the "not using pre-compiled headers" option.
The SAPI bindings should be rewritten to avoid the Copyright assignments to Microsoft. The rewrite should also look to improve the SAPI binding in general.
General Infrastructure: midl, etc.
SAPI Interfaces:
ISpTTSEngine::Speak SPVTEXTFRAG state actions:
Voice Management:
The existing espeak phoneme table voice data should be restructured into a src/voices/jonathan voice profile. It should be capable of speaking as many IPA phonemes as possible.
Once the project has been restructured to make use of this voice profile, the existing voice data should be removed.
SMS encoded Greek is uppercase Latin and Greek characters, where the Latin characters are visually identical to the Greek characters (e.g. using A for alpha).
NOTE: This will require heuristics as the Android platform, etc. will map the SMS characters to Unicode, mixing the Latin and Greek scripts.
The eSpeak project has a lot of issues identified by Coverity. The espeak-ng code should pass a Coverity scan.
These functions are duplicated in libespeak-ng, espeak-ng and speak-ng. There should only be one version of each of these functions.
The espeak codebase should pass the clang static analysis checks.
The Text to Phoneme Translation documentation says that rule groups can range from 01 to 25, but in the dictsource .._rules files the group numbers go up to 38.
Please update the documentation to state the range that is actually allowed.
This requires porting the code from wxWidgets to C.
This allows the espeak program to support building the phoneme table and intonation data, making it possible to build espeak without espeakedit on a headless (GUI-less) system.
The voice data is the phoneme tables and intonations. This would allow selecting a different location for this data (espeak-data path).
Access to this functionality should be provided by the C API and command-line interface.
This will allow newer versions of sonic to be used and incorporate fixes from distributions.
Tested on commit 5d295bd.
Listen to http://odo.lv/ftp/temp/broken.wav.mp3
It looks like the lengthening phoneme : is much too long for all vowels.
The last good pronunciation is at commit 5bbc0d3.
The documentation for espeak contains a description of available languages, along with an assessment of their quality. This is not maintainable, as the supported languages are continually being updated, and new languages added.
This change will add metadata to the language files that describes their quality/maturity, to indicate their level of assessment by native speakers for how good the pronunciations are. The current maintainer (or unmaintained if none is currently provided) will indicate whether the voice is being actively improved.
Various functions inconsistently have {//===... at the start and } // end of ... comments. These make the code harder to read and should be removed.
The way that eSpeak is designed, it uses a modified version of praat to determine the spectral parameters for a phoneme file. This should not be needed -- it should be possible to do this within espeak-ng itself (or the associated GUI voice editor).
The phoneme definitions should provide:
The ASCII transcription used should be specified within the voice definition file.
NOTE: The kirshenbaum/cainteoir phoneme features should be documented in a markdown file in the docs directory.
The eSpeak codebase uses old-style C89 variable declaration constraints. It should declare the variables at the point they are first used to make the code easier to read and maintain.
On systems that need getopt (e.g. Windows), the getopt compatibility helper should be moved to a separate src/compat/getopt.[hc] file. This allows the compatibility code to be shared between espeak-ng and speak-ng, and allows a standard (well tested) implementation of the compatibility code to be used.
The espeak command line and API should provide the following input modes:
On (L)ubuntu 14.04, when autogen.sh is invoked, the following warnings are shown:
libtoolize: Consider adding `-I m4' to ACLOCAL_AMFLAGS in Makefile.am.
Makefile.am:54: warning: '%'-style pattern rules are a GNU make extension
Makefile.am:58: warning: '%'-style pattern rules are a GNU make extension
The espeakedit program requires wxWidgets to build. This dependency should be removed, such that:
Implementing the editor functionality from scratch avoids the complexities of the espeakedit code (it relies on accessing internal APIs) and of porting from wxWidgets to Qt. It also allows the editor to be redesigned so that it is easy to create, edit and test voices and languages graphically.
NOTE: It should be possible to create, edit and test voices and languages on a command line without needing to use the GUI. The GUI should just make the process easier.
The main issue here is reworking the event logic to use Mac compatible versions of the sem_ functions, e.g. via POSIX APIs.
The speak_lib.h API should be redesigned to better fit the usage, provide more detailed status codes, etc. and be placed in espeak_ng.h. This will allow the eSpeak NG API to be used independently of the eSpeak API, and make it possible to evolve the API to meet the needs of eSpeak NG to provide new features.
For eSpeak compatibility, the speak_lib.h methods should be implemented in speak_lib.c -- these should forward to the espeak_ng.h APIs, map the status codes, etc.
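The forwarding idea can be sketched as follows. Every type, enumerator and function name here is a hypothetical placeholder, not the real speak_lib.h or espeak_ng.h API, and the accepted rate range is an arbitrary value picked for the example.

```c
/* Hypothetical stand-ins for the legacy and new status types. */
typedef enum { LEGACY_OK = 0, LEGACY_INTERNAL_ERROR = -1, LEGACY_BUFFER_FULL = 1 } legacy_error;
typedef enum { NG_OK = 0, NG_BUFFER_FULL, NG_INVALID_VALUE, NG_NOT_INITIALIZED } ng_status;

static int sketch_rate = 175; /* state owned by the hypothetical new-style API */

/* New-style function: reports a detailed status code. */
static ng_status ng_set_rate(int words_per_minute)
{
    if (words_per_minute < 80 || words_per_minute > 450) /* arbitrary example range */
        return NG_INVALID_VALUE;
    sketch_rate = words_per_minute;
    return NG_OK;
}

/* Collapse the detailed codes into the coarser legacy set. */
static legacy_error map_status(ng_status status)
{
    switch (status) {
    case NG_OK:
        return LEGACY_OK;
    case NG_BUFFER_FULL:
        return LEGACY_BUFFER_FULL;
    default:
        return LEGACY_INTERNAL_ERROR;
    }
}

/* Legacy entry point: contains no logic of its own, it only forwards and maps. */
legacy_error legacy_set_rate(int words_per_minute)
{
    return map_status(ng_set_rate(words_per_minute));
}
```

Keeping the legacy functions as thin wrappers means behaviour changes only need to be made in one place.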
The code should be reformatted with respect to keywords such as if, return, etc., and to use return x instead of return(x).
Other style improvements should be applied to reflect modern C practices.
The eSpeak project uses int/0/1 for boolean values. The stdbool.h header and bool/true/false should be used instead. This will help with the readability and maintainability of the code.
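As a small illustration of the change, here is a sketch using a hypothetical helper that is not taken from the eSpeak sources:

```c
#include <stdbool.h>

/* Before: int is_ascii_vowel(char c) returning 0 or 1.
 * After:  the return type states that the result is a truth value. */
static bool is_ascii_vowel(char c)
{
    switch (c) {
    case 'a': case 'e': case 'i': case 'o': case 'u':
        return true;
    default:
        return false;
    }
}
```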
Although some improvements in the Dutch language have been included in the original espeak in 2012 and 2013, Dutch is still far from ideal. Things which have to be improved:
In this branch, I started work on ch/g. The only thing I have done so far is rename the [x2] phoneme to [x], making words with [x] use the [x2] phoneme in Afrikaans. Please let me know whether this is the desired behavior, or whether I should change nl-rules and nl-list to have [x2] instead, thereby abandoning [x].
There are 3 types of emoticons/emoji that can be supported: Unicode characters; punctuation sequences such as :); and emoji shortcodes such as :smile:.
Ideally, the punctuation sequences and emoji shortcodes should be mapped to the Unicode characters, and those characters specify the text of how they are pronounced (e.g. "smiling face"). The Unicode character support should be possible, but I suspect the others would need modifications to espeak's text analysis logic to detect the emoticon/emoji sequences -- I don't currently understand this logic too well, so would need some time understanding how it works.
The other issue is sharing the punctuation sequence and emoji shortcode mappings, so the different languages don't need to duplicate those definitions. I don't currently know how possible that would be.
Pronouncing the Unicode characters can be tricky as well ("emoji" technically also cover emoticons and other characters like playing cards that are not necessarily part of the emoji block).
The Unicode characters can combine in complex ways. The flag "emoji" are an encoding of the 2-letter country code that flag represents (IT for the Italian flag, US for the American flag, GB for the British flag, etc.) -- all these permutations need supporting. Another complexity is the recent addition of skin tone modifiers.
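To illustrate the flag encoding: each letter of the country code is represented by a regional indicator symbol in the range U+1F1E6 (A) to U+1F1FF (Z). The sketch below decodes a pair of already-decoded code points back to the 2-letter code; the function names are hypothetical and this is not how espeak-ng's text analysis is currently structured.

```c
#include <stdbool.h>
#include <stdint.h>

/* Regional indicator symbols run from U+1F1E6 ('A') to U+1F1FF ('Z'). */
#define REGIONAL_INDICATOR_A 0x1F1E6u
#define REGIONAL_INDICATOR_Z 0x1F1FFu

static bool is_regional_indicator(uint32_t cp)
{
    return cp >= REGIONAL_INDICATOR_A && cp <= REGIONAL_INDICATOR_Z;
}

/* Decode a two-code-point flag sequence into its 2-letter code.
 * out must have room for 3 bytes. Returns false if either code point
 * is not a regional indicator symbol. */
static bool flag_to_country_code(uint32_t first, uint32_t second, char out[3])
{
    if (!is_regional_indicator(first) || !is_regional_indicator(second))
        return false;
    out[0] = (char)('A' + (first - REGIONAL_INDICATOR_A));
    out[1] = (char)('A' + (second - REGIONAL_INDICATOR_A));
    out[2] = '\0';
    return true;
}
```

The decoded code could then be looked up in a per-language table of country names, so that each language only needs to supply the names rather than every flag permutation.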
Windows does not provide strings.h, needed for strcasecmp. This should be checked via configure checks and wrapped in #ifdef HAVE_STRINGS_H.
The strcasecmp function should also be wrapped in a configure check. On Windows systems, _stricmp should be used instead, e.g. via:
#define strcasecmp _stricmp
in a Windows-specific config.h file.
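One possible shape for this is sketched below. Instead of the #define shown above, it uses a wrapper function with the hypothetical name compat_strcasecmp, so that a portable fallback can be supplied when neither variant is available. HAVE_STRINGS_H is assumed to be set by the configure checks.

```c
#include <ctype.h>
#include <string.h>
#if defined(HAVE_STRINGS_H)
#include <strings.h>
#endif

/* Case-insensitive comparison with the same return convention as strcmp. */
static int compat_strcasecmp(const char *a, const char *b)
{
#if defined(_WIN32)
    return _stricmp(a, b);
#elif defined(HAVE_STRINGS_H)
    return strcasecmp(a, b);
#else
    /* Portable fallback: compare lower-cased bytes until they differ
     * or both strings end. */
    for (;; a++, b++) {
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);
        if (ca != cb || ca == '\0')
            return ca - cb;
    }
#endif
}
```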
The current directory checking function in src/compiledata.c needs to be made portable.
On POSIX systems, it should use access, then use stat and S_IFDIR to check if it is actually a directory. On Windows, it should use GetFileAttributes.
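A sketch of such a check, assuming a hypothetical helper named path_is_directory (the actual function in src/compiledata.c may be named and structured differently). On POSIX it uses the S_ISDIR macro, which tests the same file-type bits as comparing against S_IFDIR.

```c
#include <stdbool.h>
#if defined(_WIN32)
#include <windows.h>
#else
#include <sys/stat.h>
#include <unistd.h>
#endif

/* Returns true only if path exists, is readable, and is a directory. */
static bool path_is_directory(const char *path)
{
#if defined(_WIN32)
    DWORD attrs = GetFileAttributesA(path);
    return attrs != INVALID_FILE_ATTRIBUTES
        && (attrs & FILE_ATTRIBUTE_DIRECTORY) != 0;
#else
    struct stat st;
    if (access(path, R_OK) != 0)   /* missing or not readable */
        return false;
    if (stat(path, &st) != 0)      /* could not query the file type */
        return false;
    return S_ISDIR(st.st_mode);    /* file-type bits equal S_IFDIR */
#endif
}
```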
The docs directory contains documentation in HTML format. This should be converted to markdown, with the index linked from the README.md file. The docs can be built using kramdown.
Latest espeak doesn't compile:
...
src/libespeak-ng/speak_lib.c:148:4: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
espeak_ERROR a_error = event_declare(event);
^
src/libespeak-ng/speak_lib.c: In function 'sync_espeak_terminated_msg':
src/libespeak-ng/speak_lib.c:225:2: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
int finished=0;
^
src/libespeak-ng/speak_lib.c: In function 'MarkerEvent':
src/libespeak-ng/speak_lib.c:570:2: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
espeak_EVENT *ep;
^
src/libespeak-ng/speak_lib.c: In function 'sync_espeak_Synth':
src/libespeak-ng/speak_lib.c:632:2: error: 'for' loop initial declarations are only allowed in C99 mode
for (int i=0; i < N_SPEECH_PARAM; i++)
^
src/libespeak-ng/speak_lib.c:632:2: note: use option -std=c99 or -std=gnu99 to compile your code
src/libespeak-ng/speak_lib.c: In function 'espeak_Initialize':
src/libespeak-ng/speak_lib.c:772:2: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
int param;
^
src/libespeak-ng/speak_lib.c: In function 'espeak_Synth':
src/libespeak-ng/speak_lib.c:854:2: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
espeak_ERROR a_error=EE_INTERNAL_ERROR;
^
src/libespeak-ng/speak_lib.c:870:2: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
t_espeak_command* c1 = create_espeak_text(text, size, position, position_type, end_position, flags, user_data);
^
src/libespeak-ng/speak_lib.c:876:2: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
t_espeak_command* c2 = create_espeak_terminated_msg(*unique_identifier, user_data);
^
src/libespeak-ng/speak_lib.c: In function 'espeak_Synth_Mark':
src/libespeak-ng/speak_lib.c:935:2: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
t_espeak_command* c1 = create_espeak_mark(text, size, index_mark, end_position,
^
src/libespeak-ng/speak_lib.c:942:2: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
t_espeak_command* c2 = create_espeak_terminated_msg(*unique_identifier, user_data);
^
src/libespeak-ng/speak_lib.c: In function 'espeak_Key':
src/libespeak-ng/speak_lib.c:977:2: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
espeak_ERROR a_error = EE_OK;
^
src/libespeak-ng/speak_lib.c:986:2: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
t_espeak_command* c = create_espeak_key( key, NULL);
^
src/libespeak-ng/speak_lib.c: In function 'espeak_Char':
src/libespeak-ng/speak_lib.c:1009:2: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
espeak_ERROR a_error;
^
src/libespeak-ng/speak_lib.c:1017:2: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
t_espeak_command* c = create_espeak_char( character, NULL);
^
src/libespeak-ng/speak_lib.c: In function 'espeak_SetParameter':
src/libespeak-ng/speak_lib.c:1109:2: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
espeak_ERROR a_error;
^
src/libespeak-ng/speak_lib.c:1117:2: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
t_espeak_command* c = create_espeak_parameter(parameter, value, relative);
^
src/libespeak-ng/speak_lib.c: In function 'espeak_SetPunctuationList':
src/libespeak-ng/speak_lib.c:1138:2: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
espeak_ERROR a_error;
^
src/libespeak-ng/speak_lib.c:1146:2: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
t_espeak_command* c = create_espeak_punctuation_list( punctlist);
^
src/libespeak-ng/speak_lib.c: In function 'espeak_Cancel':
src/libespeak-ng/speak_lib.c:1219:2: error: 'for' loop initial declarations are only allowed in C99 mode
for (int i=0; i < N_SPEECH_PARAM; i++)
^
make[1]: *** [src/libespeak-ng/src_libespeak_la-speak_lib.lo] Error 1
make[1]: Leaving directory `/home/valdis/code/espeak-ng'
make: *** [all] Error 2
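The two hard errors in this log come from declaring the loop counter inside the for statement while the compiler is in C90 mode. The compiler's own note suggests building with -std=c99 or -std=gnu99. A source-level alternative is sketched below; the array and function are hypothetical, and the value given to N_SPEECH_PARAM is only a placeholder.

```c
/* Placeholder value; the real N_SPEECH_PARAM is defined in the espeak-ng sources. */
#define N_SPEECH_PARAM 15

static int param_defaults[N_SPEECH_PARAM];

/* C90-compatible form of "for (int i=0; i < N_SPEECH_PARAM; i++)":
 * the counter is declared at the top of the block instead. */
static void reset_params_c90(void)
{
    int i;
    for (i = 0; i < N_SPEECH_PARAM; i++)
        param_defaults[i] = 0;
}
```

The remaining messages about mixed declarations and code are warnings and do not stop the build on their own.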