apertium-tyv's Issues

Double <perf> tag

In my generated paradigm for кел, some forms have a double <perf> tag:

келиптипкен:кел<v><iv><perf><perf><ger_perf><nom>
келивитпепкен:кел<v><iv><perf><neg><perf><ger_perf><nom>
келиптипкеш:кел<v><iv><perf><perf><gna_perf>
келивитпепкеш:кел<v><iv><perf><neg><perf><gna_perf>
келиптиптер:кел<v><iv><perf><perf><p3><sg>
келивитпептер:кел<v><iv><perf><neg><perf><p3><sg>

plus all of their inflections by case, person, etc. Looks like a double -{I}pt{I}, not sure if Tuvan allows this.
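
A quick way to find every affected form in a generated paradigm is to scan for analyses in which the same tag occurs twice. The following is only a sketch: it assumes the paradigm is a list of `form:analysis` lines as shown above, and the function names (`repeated_tags`, `flag_paradigm`) are made up for illustration. Compound analyses joined with `+` are checked part by part, so a tag that appears once in each part is not flagged.

```python
import re
from collections import Counter

# Matches one tag such as <perf> and captures its name.
TAG_RE = re.compile(r"<([^<>]+)>")

def repeated_tags(analysis):
    """Return the sorted tag names that occur more than once within any
    single '+'-separated part of an analysis string."""
    repeated = set()
    for part in analysis.split("+"):
        counts = Counter(TAG_RE.findall(part))
        repeated.update(tag for tag, n in counts.items() if n > 1)
    return sorted(repeated)

def flag_paradigm(lines):
    """Yield (form, analysis, repeated_tags) for each 'form:analysis'
    line whose analysis contains a repeated tag."""
    for line in lines:
        line = line.strip()
        if ":" not in line:
            continue
        form, analysis = line.split(":", 1)
        dup = repeated_tags(analysis)
        if dup:
            yield form, analysis, dup

# Sample lines copied from this issue tracker.
SAMPLE = [
    "келиптипкен:кел<v><iv><perf><perf><ger_perf><nom>",
    "келивитпепкен:кел<v><iv><perf><neg><perf><ger_perf><nom>",
    "чор:чор<v><iv><aor><p3><sg>",
]

if __name__ == "__main__":
    for form, analysis, dup in flag_paradigm(SAMPLE):
        print(form, analysis, dup)
```

Whether a flagged form is actually wrong is still a linguistic question; the script only narrows down which forms to check.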

чор + -Iр form

Opening an issue per @ftyers' request.

чор:чор<v><iv><aor><p3><sg>

Is this inflection correct, or should it be чоор?

More mismatches and missing inflections

Two more forms in Iskhakov & Pal'mbakh are not being generated:

  • Past narrative tense in -п-тыр (Прошедшее повествовательное время на -п-тыр), also called past historical/unwitnessed/unexpected (прошедшее историческое/заглазное/неожиданное); Tuvan term: эрткен үэниң медээ хевири (I&P 373). The book describes it as a past tense used for a sudden occurrence.

    кээп-тир мен
    кээп-тир сен
    кээп-тир
    кээп-тир бис
    кээп-тир силер
    кээп-тирлер
    

    Without the hyphen, the analyzer parses кээптир мен as кел<v><iv><perf><aor><p1><sg>, but it
    generates келиптир мен for that analysis.

  • Past-present tense in -пышаан (Прошедшее-настоящее время на -пышаан, I&P 379). It appears to denote an action that started in the past and is still going on; Anderson & Harrison annotate it as durative.

    келбишаан мен
    келбишаан сен
    келбишаан
    келбишаан бис
    келбишаан силер
    келбишааннар
    

    There is an analysis for келбишаан as a verbal adverb though: кел<v><iv><gna_still>. Are these forms
    considered analytic?

some remaining imperative forms

Some imperatives are still broken.

This includes the following regressions because of #2:

>       2 ^чугаалаваайн/*чугаалаваайн$
>       2 ^сагындырбаайн/*сагындырбаайн$
>       1 ^кортпаайн/*кортпаайн$
>       1 ^чугаалаваайн/*чугаалаваайн$
>       1 ^чорбаайн/*чорбаайн$
>       1 ^чажырбаайн/*чажырбаайн$
>       1 ^узуткаваайн/*узуткаваайн$
>       1 ^барбаайн/*барбаайн$
>       1 ^чажырбаайн/*чажырбаайн$
>       1 ^тайылбырлаваайн/*тайылбырлаваайн$
>       1 ^адаваайн/*адаваайн$

And the following form from tests/verbs.yaml:

[1/3][FAIL] саг<v><tv><imp><p1><du> => Missing results: саалы
[1/3][FAIL] саг<v><tv><imp><p1><du> => Unexpected results: сааалы
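
The regression list above is in Apertium stream format, where `^surface/*surface$` marks a token the analyser could not analyse. As a rough sketch for tracking such lists across commits, the snippet below sums the counts per unknown form. It assumes the input looks like the quoted lines (an optional count followed by the token); `unknown_counts` is a hypothetical helper, not part of any Apertium tool.

```python
import re
from collections import Counter

# Optional leading count, then an unknown token: ^surface/*surface$
UNKNOWN_RE = re.compile(r"(?:(\d+)\s+)?\^([^/^$]+)/\*[^$]*\$")

def unknown_counts(text):
    """Sum occurrence counts per unanalysed surface form.
    A token without a leading count is counted once."""
    totals = Counter()
    for count, surface in UNKNOWN_RE.findall(text):
        totals[surface] += int(count) if count else 1
    return totals

# A few lines copied from the regression list above.
SAMPLE = """\
>       2 ^чугаалаваайн/*чугаалаваайн$
>       1 ^кортпаайн/*кортпаайн$
>       1 ^чугаалаваайн/*чугаалаваайн$
"""

if __name__ == "__main__":
    for surface, n in unknown_counts(SAMPLE).most_common():
        print(n, surface)
```

Tokens that do receive an analysis, such as `^кел/кел<v><iv><imp><p2><sg>$`, are ignored because the pattern requires the `*` marker.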

Generator errors identified through the shared task

This is a list of generator errors that Aziyana Bayyr-ool identified while working on the error analysis for the shared task.

Incorrect inflections:

| Generated form | Correct form |
| --- | --- |
| ижиарлар:ижик<v><TD><aor><p3><pl> | ижигерлер |
| көрдүнүүлү:көрдүн<v><iv><imp><p1><du> | көрдүнээли |
| садырлар:сад<v><tv><aor><p3><pl> | садарлар |
| садыылы:сад<v><tv><imp><p1><du> | садаалы |
| тырылыйн:тырыл<v><TD><imp><p1><sg> | тырлыйн |
| тырылырлар:тырыл<v><TD><aor><p3><pl> | тырлырлар |
| ужуаалы:ужук<v><TD><imp><p1><du> | ужаалы |
| холужуптур бис:холуш<v><iv><perf><aor><p1><pl> | холужуптар бис |
| холужуптурлар:холуш<v><iv><perf><aor><p3><pl> | холужуптарлар |
| хоорулур:хоорул<v><iv><aor><p3><sg> | хоорлур |
| хоорулур мен:хоорул<v><iv><aor><p1><sg> | хоорлур мен |
| хоорулур сен:хоорул<v><iv><aor><p2><sg> | хоорлур сен |
| чыглыңар:чыыл<v><iv><imp><p2><pl> | чыглыылыӊар (see Note below) |
| шымыныр силер:шымын<v><TD><aor><p2><pl> | шымныр силер |
| мөгеейн:мөгей<v><iv><imp><p1><sg> | мөгейээйн (rare/unusual) |
| мөгееалы:мөгей<v><iv><imp><p1><du> | мөгейээли (rare/unusual) |

Note on чыглыңар:чыыл<v><iv><imp><p2><pl>: Aziyana says this form exists (meaning 'вы собирайтесь', 'you (pl.) gather'), but it does not correspond to this lemma. The correct form for чыыл should be чыглыылыӊар ('давайте соберемся', 'let's gather').

Incorrect lemmas:

| Lemma in the lexicon | Correct lemma |
| --- | --- |
| номчун<v> | номчуттун |
| өпей<v> | өпейле (Aziyana says өпей exists too but as a name) |

Forms that are plausible but rarely or never used, so Aziyana has doubts about them:

мөгеейн:мөгей<v><iv><imp><p1><sg>
мөгееалы:мөгей<v><iv><imp><p1><du>
аржаяйн:аржай<v><TD><imp><p1><sg>
арзаяйн:арзай<v><TD><imp><p1><sg>
мажаяйн:мажай<v><TD><imp><p1><sg>

Installed modes are missing files

modes.xml includes some modes with install="yes", but the required
files aren't installed.

Some generic suggestions:

  • -lexc and -twol modes probably aren't useful to users

  • -spell modes should depend on --enable-ospell

  • .deps files are never installed, so any modes using them shouldn't be
    installed.

    Messages for package app-dicts/apertium-tyv-9999:

    Failed to find '/usr/share/apertium/apertium-tyv/.deps/tyv.twol.hfst' in install image.
    QA: missing files required for mode tyv-twol.
    Failed to find '/usr/share/apertium/apertium-tyv/.deps/tyv.LR.lexc.hfst' in install image.
    QA: missing files required for mode tyv-lexc.
    Failed to find '/usr/share/apertium/apertium-tyv/tyv.zhfst' in install image.
    QA: missing files required for mode tyv-spell.
    Failed to find '/usr/share/apertium/apertium-tyv/.deps/acceptor.default.hfst' in install image.
    QA: missing files required for mode tyv-tokenise.
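
The check behind these QA messages can be approximated with a short script that reads modes.xml, collects the files referenced by every mode marked install="yes", and reports those that depend on `.deps/` or are absent from an install directory. This is a sketch only: the sample XML is a made-up fragment in the usual modes.xml layout (`mode` / `pipeline` / `program` / `file`), its program names are illustrative, and the helper names are hypothetical.

```python
import os
import xml.etree.ElementTree as ET

def installed_mode_files(modes_xml):
    """Map each mode with install="yes" to the file names it references."""
    root = ET.fromstring(modes_xml)
    result = {}
    for mode in root.iter("mode"):
        if mode.get("install") == "yes":
            result[mode.get("name")] = [f.get("name") for f in mode.iter("file")]
    return result

def modes_using_deps(modes_xml):
    """Installed modes that reference files under .deps/ (never installed)."""
    return sorted(
        name
        for name, files in installed_mode_files(modes_xml).items()
        if any(f.startswith(".deps/") for f in files)
    )

def missing_files(modes_xml, install_dir):
    """Map installed modes to referenced files absent from install_dir."""
    missing = {}
    for name, files in installed_mode_files(modes_xml).items():
        absent = [f for f in files
                  if not os.path.exists(os.path.join(install_dir, f))]
        if absent:
            missing[name] = absent
    return missing

# Hypothetical modes.xml fragment for illustration only.
SAMPLE = """\
<modes>
  <mode name="tyv-morph" install="yes">
    <pipeline>
      <program name="hfst-proc -w">
        <file name="tyv.automorf.hfst"/>
      </program>
    </pipeline>
  </mode>
  <mode name="tyv-twol" install="yes">
    <pipeline>
      <program name="hfst-compose-intersect">
        <file name=".deps/tyv.twol.hfst"/>
      </program>
    </pipeline>
  </mode>
  <mode name="tyv-debug" install="no">
    <pipeline>
      <program name="hfst-proc -w">
        <file name="tyv.automorf.hfst"/>
      </program>
    </pipeline>
  </mode>
</modes>
"""

if __name__ == "__main__":
    print(modes_using_deps(SAMPLE))
    print(missing_files(SAMPLE, "/usr/share/apertium/apertium-tyv"))
```

Running `modes_using_deps` on the real modes.xml would list the modes that should either lose install="yes" or stop pointing into `.deps/`.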

Possible inflection errors

I've been comparing Apertium-generated paradigms with those in the Iskhakov & Pal'mbakh 1961 grammar (Ф. Г. Исхаков, А. А. Пальмбах. Грамматика тувинского языка: Фонетика и морфология [Grammar of the Tuvan Language: Phonetics and Morphology]) and found some mismatches.
Disclaimer: I am not a speaker of Tuvan.

  1. Some Apertium-generated imperative forms for кел:

    келеалыңар:кел<v><iv><imp><p1><pl>
    келейн:кел<v><iv><imp><p1><sg>
    келеалы:кел<v><iv><imp><p1><du>
    

    The I&P book has келиилиңер, келийн, келиили respectively (pp. 391-392).

  2. Some <p3><pl> forms have a double -лер. I haven't seen this in the literature, and it looks suspicious.

    келдилер:кел<v><iv><ifi><p3><pl>
    келдилерлер:кел<v><iv><ifi><p3><pl>
    

    I&P has келдилер for this analysis (I&P 365), and Harrison (2000) has keldi(ler). The same pattern appears in other tenses:

    келгеннер:кел<v><iv><ger_past><nom>+э<cop><aor><p3><pl>
    келгеннерлер:кел<v><iv><ger_past><nom>+э<cop><aor><p3><pl>
    келгендирлер:кел<v><iv><ger_past><nom>+э<cop><aor><evid><p3><pl>
    келгендирлерлер:кел<v><iv><ger_past><nom>+э<cop><aor><evid><p3><pl>
    ...
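
Since each suspicious form shares its analysis with a shorter variant, one way to list all such cases is to group the generated paradigm by analysis and report analyses that have more than one surface form. The sketch below assumes `form:analysis` lines as above; `competing_forms` is a hypothetical helper name.

```python
from collections import defaultdict

def competing_forms(lines):
    """Group 'form:analysis' lines by analysis and return only those
    analyses that are realised by more than one distinct surface form."""
    by_analysis = defaultdict(list)
    for line in lines:
        line = line.strip()
        if ":" not in line:
            continue
        form, analysis = line.split(":", 1)
        if form not in by_analysis[analysis]:
            by_analysis[analysis].append(form)
    return {a: forms for a, forms in by_analysis.items() if len(forms) > 1}

# Sample lines copied from this issue, plus one unambiguous form.
SAMPLE = [
    "келдилер:кел<v><iv><ifi><p3><pl>",
    "келдилерлер:кел<v><iv><ifi><p3><pl>",
    "келгеннер:кел<v><iv><ger_past><nom>+э<cop><aor><p3><pl>",
    "келгеннерлер:кел<v><iv><ger_past><nom>+э<cop><aor><p3><pl>",
    "келейн:кел<v><iv><imp><p1><sg>",
]

if __name__ == "__main__":
    for analysis, forms in competing_forms(SAMPLE).items():
        print(analysis, forms)
```

Multiple forms per analysis are not always an error, since free variants may be intended, so the output is a list of candidates to review against the grammar.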
    
