поши, пошиең (apertium-tat issue, open)

apertium commented on June 23, 2024
поши, пошиең

from apertium-tat.

Comments (14)

jonorthwash commented on June 23, 2024

Some of these will require changes to the lexc also. Could you create a yaml file for these in tests/morphophonology/ (maybe something like final-i.yaml)?

mansayk commented on June 23, 2024

I have already modified the lexc. What else can I do there? Could you please check the twol rules?
OK, I will try to create the yaml file later.

mansayk commented on June 23, 2024

I created that file with some rules:
c1f843a

jonorthwash commented on June 23, 2024

Could you add the абый, сеңел, and зур examples too? For analyses that have multiple "correct" forms, you can put two lines.

mansayk commented on June 23, 2024

Done

mansayk commented on June 23, 2024

@jonorthwash could you fix twol rules according to this table, please?

| Base form | Current | Correct |
|---|---|---|
| поши | пошисы | пошие |
| тарихи | тарихисы | тарихие |
| гади | гадисе | гадие |
| песи | песисе | песие |
| ки | - | киеп |
| музей | музейе | музее |
| ансамбль | - | ансамбле |
| гаеп | гаепы | гаебе |
| гает | гаеты | гаете |
| каникул | каникулләр | каникуллар |
| суд | судга | судка |
| ю | - | юа |
| җәмгыять | җәмгыятьны | җәмгыятьне |
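
The table above can also be kept as data and checked mechanically. Below is a minimal Python sketch, assuming a hypothetical `generate` callable that stands in for whatever actually runs the generator (it is not part of apertium-tat). The grammatical form of each row is implied by the table, so the stand-in takes only the base form. Rows whose current form is "-" are left out.

```python
# Rows from the table above: base form -> (currently generated, correct).
# Rows whose current form is "-" in the table are omitted.
TABLE = {
    "поши": ("пошисы", "пошие"),
    "тарихи": ("тарихисы", "тарихие"),
    "гади": ("гадисе", "гадие"),
    "песи": ("песисе", "песие"),
    "музей": ("музейе", "музее"),
    "гаеп": ("гаепы", "гаебе"),
    "гает": ("гаеты", "гаете"),
    "каникул": ("каникулләр", "каникуллар"),
    "суд": ("судга", "судка"),
    "җәмгыять": ("җәмгыятьны", "җәмгыятьне"),
}


def failing_rows(generate):
    """Return (base, got, expected) for each row where generate(base) is wrong.

    `generate` is a hypothetical stand-in for the real generator.
    """
    failures = []
    for base, (_current, correct) in TABLE.items():
        got = generate(base)
        if got != correct:
            failures.append((base, got, correct))
    return failures


def current_behaviour(base):
    """Stand-in that reproduces the 'Current' column of the table."""
    return TABLE[base][0]


def fixed_behaviour(base):
    """Stand-in for the behaviour once the rules are fixed ('Correct' column)."""
    return TABLE[base][1]
```

With `current_behaviour` every row is reported as failing; with `fixed_behaviour` the list is empty, which is the state the twol fix should reach.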

IlnarSelimcan commented on June 23, 2024

Hi!

The correct third-person possessive form of "поши" is "пошие" (пошиең, пошиемны...). But currently it is processed as "пошисы", which I think is not correct, @IlnarSelimcan, right?

Right, "пошие" is the correct form.

It also applies to other words ending with "и": тарихи (тарихиена, тарихисы?), бриджи (бриджиен, бриджисы?), гади (гадиен), песи (песиең), ралли (раллиена, раллисы?), verb ки (киеп)...

The form -сы is also used in some cases: абый абые абыйсы, сеңел сеңеле сеңлесе, зур зурысы...

jonorthwash commented on June 23, 2024

The form -сы is also used in some cases: абый абые абыйсы, сеңел сеңеле сеңлесе, зур зурысы...

You're saying that there are multiple correct realisations for these possessive forms, right? If so, which should be the default (at least for абый and сеңел)?

mansayk commented on June 23, 2024

According to the Tatar orthographic dictionary (2017): абые.
According to the Tatar explanatory dictionary (2013): both абые and абыйсы.

I don't know how to choose the default one :) Personally I use абыйсы, сеңлесе. @IlnarSelimcan what do you think?

mansayk commented on June 23, 2024

Maybe the default should be the one formed according to the rules, абые in this case?

jonorthwash commented on June 23, 2024

Maybe the default should be the one formed according to the rules

I think it should be the most common one, whichever that is. It'll be just as hard to implement either way—it's just a matter of which entry isn't included in the generation transducer.
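
To illustrate the point about the generation transducer: both forms stay analysable, and the choice of default only decides which one is produced when generating. Here is a rough Python sketch with plain dictionaries standing in for the transducers. The tag string, the `default` flag, and the choice of which entry carries it are illustrative assumptions, not the actual lexc entries.

```python
# Each entry: (analysis, surface form, is_default_for_generation).
# The tag string and the choice of default are illustrative only.
ENTRIES = [
    ("абый<n><px3sp><nom>", "абые", False),
    ("абый<n><px3sp><nom>", "абыйсы", True),
]


def build_analyser(entries):
    """Map every surface form to its analyses; all entries are included."""
    analyser = {}
    for analysis, surface, _default in entries:
        analyser.setdefault(surface, []).append(analysis)
    return analyser


def build_generator(entries):
    """Map each analysis to one surface form; non-default entries are skipped."""
    generator = {}
    for analysis, surface, default in entries:
        if default:
            generator[analysis] = surface
    return generator


analyser = build_analyser(ENTRIES)
generator = build_generator(ENTRIES)
```

Moving the flag from one entry to the other changes only `generator`; `analyser` is identical either way, which is why neither default is harder to implement.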

mansayk commented on June 23, 2024

To be frank, I cannot decide which one is the most common: абые or абыйсы. I suppose @IlnarSelimcan thinks the same way. The only option is to consult a corpus. According to the Corpus of Written Tatar, which consists of ~350 million word occurrences:
абые* - 12,998 occurrences
абыйсы* - 20,067 occurrences
Although the second form does not follow the grammatical rules, it is used significantly more often. So should we choose it as the default?
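
For a sense of scale, the two counts can be turned into shares and per-million frequencies. This is a small Python sketch that uses only the figures quoted above; the corpus size is the approximate ~350 million given there, so the per-million values are approximate too.

```python
# Counts quoted above from the Corpus of Written Tatar (wildcard searches).
counts = {"абые*": 12_998, "абыйсы*": 20_067}
corpus_size = 350_000_000  # approximate size as stated above

total = sum(counts.values())
# Share of each form among the two forms combined.
shares = {form: n / total for form, n in counts.items()}
# Occurrences per million word tokens of the corpus.
per_million = {form: n / corpus_size * 1_000_000 for form, n in counts.items()}
# How many times more frequent the second form is than the first.
ratio = counts["абыйсы*"] / counts["абые*"]

for form in counts:
    print(f"{form}: {shares[form]:.1%} of the two forms, "
          f"{per_million[form]:.1f} per million words")
print(f"абыйсы* is about {ratio:.2f} times as frequent as абые*")
```

That works out to roughly 61% versus 39%, so the difference is clear but neither form is rare.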

jonorthwash commented on June 23, 2024

Maybe a better criterion than "which is used more often?" would be "which do native speakers find least unexpected?"

@mansayk, you use абыйсы and it's more common in the corpus, which would argue in favour of that form. You say you don't consider абые wrong, but does it feel in any way less common? Like, when you encounter it in text, does it feel jarring, old-fashioned, or somehow unusual? That would be the best argument for going with абыйсы as the default.

@IlnarSelimcan, we're still waiting for your input on this.

mansayk commented on June 23, 2024

you use абыйсы and it's more common in the corpus—which would argue in favour of that form.

OK, I agree, I will take it into account next time. But the second form (абые) doesn't feel less common to me in any way: it is not old-fashioned, it is definitely not unusual, it doesn't feel jarring, and, as I pointed out before, it is the only form given in the Tatar orthographic dictionary (2017)... Personally, I consider it just an alternative form, a kind of absolute synonym. @IlnarSelimcan, we really need your help here.
