Comments (14)
Some of these will require changes to the lexc also. Could you create a yaml file for these in tests/morphophonology/ (maybe something like final-i.yaml)?
from apertium-tat.
I already modified lexc. What else can I do there? Could you please check twol rules?
Ok, I will try to create yaml later.
from apertium-tat.
I created that file with some rules:
c1f843a
from apertium-tat.
Could you add the абый, сеңел, and зур examples too? For analyses that have multiple "correct" forms, you can put two lines.
from apertium-tat.
Done
from apertium-tat.
@jonorthwash could you fix twol rules according to this table, please?
Base form | Current | Correct |
---|---|---|
поши | пошисы | пошие |
тарихи | тарихисы | тарихие |
гади | гадисе | гадие |
песи | песисе | песие |
ки | - | киеп |
музей | музейе | музее |
ансамбль | - | ансамбле |
гаеп | гаепы | гаебе |
гает | гаеты | гаете |
каникул | каникулләр | каникуллар |
суд | судга | судка |
ю | - | юа |
җәмгыять | җәмгыятьны | җәмгыятьне |
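
The table above could also be kept as a small regression list. What follows is only a sketch in Python, not part of the apertium-tat tooling: the `ROWS` mapping copies the "Current" and "Correct" columns as they stand, and the helper names `failing_rows` and `missing_rows` are made up for illustration.

```python
# Rows copied from the table above: base form -> (current output, correct output).
# A "-" in the "Current" column (no output at all) is represented as None.
ROWS = {
    "поши": ("пошисы", "пошие"),
    "тарихи": ("тарихисы", "тарихие"),
    "гади": ("гадисе", "гадие"),
    "песи": ("песисе", "песие"),
    "ки": (None, "киеп"),
    "музей": ("музейе", "музее"),
    "ансамбль": (None, "ансамбле"),
    "гаеп": ("гаепы", "гаебе"),
    "гает": ("гаеты", "гаете"),
    "каникул": ("каникулләр", "каникуллар"),
    "суд": ("судга", "судка"),
    "ю": (None, "юа"),
    "җәмгыять": ("җәмгыятьны", "җәмгыятьне"),
}


def failing_rows(rows):
    """Return the base forms whose current output differs from the correct one."""
    return [base for base, (current, correct) in rows.items() if current != correct]


def missing_rows(rows):
    """Return the base forms that currently produce no output at all."""
    return [base for base, (current, _) in rows.items() if current is None]


if __name__ == "__main__":
    for base in failing_rows(ROWS):
        current, correct = ROWS[base]
        print(f"{base}: got {current or 'no output'}, expected {correct}")
```

Once a row is fixed in twol or lexc, its "Current" value would be updated by hand here; wiring the check to the real generator is left out on purpose.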
from apertium-tat.
Hi!
The correct 3rd person possessive form of "поши" is "пошие" (пошиең, пошиемны...). But currently it is processed as "пошисы", which I think is not correct. @IlnarSelimcan, right?
Right, "пошие" is the correct form.
It also applies to other words ending with "и": тарихи (тарихиена, тарихисы?), бриджи (бриджиен, бриджисы?), гади (гадиен), песи (песиең), ралли (раллиена, раллисы?), verb ки (киеп)...
The form -сы is also used in some cases: абый абые абыйсы, сеңел сеңеле сеңлесе, зур зурысы...
from apertium-tat.
> The form -сы is also used in some cases: абый абые абыйсы, сеңел сеңеле сеңлесе, зур зурысы...
You're saying that there are multiple correct realisations for these possessive forms, right? If so, which should be the default (at least for абый and сеңел)?
from apertium-tat.
According to the Tatar orthographic dictionary (2017): абые.
According to the Tatar explanatory dictionary (2013): both абые and абыйсы.
I don't know how to choose the default one :) Personally, I use абыйсы and сеңлесе. @IlnarSelimcan, what do you think?
from apertium-tat.
Maybe the default one should be that made according to rules: абые in this case?
from apertium-tat.
> Maybe the default one should be that made according to rules
I think it should be the most common one, whichever that is. It'll be just as hard to implement either way—it's just a matter of which entry isn't included in the generation transducer.
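
A toy way to picture this asymmetry: analysis accepts every listed form, while generation emits only one of them. The Python sketch below is an illustration only, not how the transducers are built. The tag strings are placeholders rather than a claim about the real apertium-tat tagset, and putting абыйсы first is arbitrary, since the default is exactly what is still undecided.

```python
# Toy model: each analysis maps to all surface forms accepted in analysis.
# The FIRST form in each list is treated as the generation default.
# Tag strings are illustrative placeholders, and the ordering is arbitrary.
FORMS = {
    "абый<n><px3sp>": ["абыйсы", "абые"],
    "сеңел<n><px3sp>": ["сеңлесе", "сеңеле"],
}


def analyse(surface):
    """Return every analysis that lists this surface form."""
    return [analysis for analysis, forms in FORMS.items() if surface in forms]


def generate(analysis):
    """Return only the default (first listed) surface form."""
    return FORMS[analysis][0]


if __name__ == "__main__":
    print(analyse("абые"))             # both variants are analysed
    print(generate("абый<n><px3sp>"))  # only the default is generated
```

Switching the default then amounts to reordering one list, which matches the point that either choice is equally easy to implement.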
from apertium-tat.
To be frank, I cannot decide which one is more common: абые or абыйсы. I suppose @IlnarSelimcan thinks the same way. The only option is to check a corpus. According to the Corpus of Written Tatar, which contains ~350 million word occurrences:
абые* - 12,998 occurrences
абыйсы* - 20,067 occurrences
Although the second form does not follow the grammatical rules, it is used significantly more often. So should we choose it as the default?
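
To put the two counts on one scale, here is a minimal Python sketch that turns them into relative shares (roughly 61% against 39%). It uses only the two figures quoted above. One caveat, stated as an assumption: since these are wildcard queries, each count presumably also includes other wordforms that merely begin with the same string.

```python
# Counts quoted above from the Corpus of Written Tatar (wildcard queries).
counts = {"абые*": 12_998, "абыйсы*": 20_067}


def shares(counts):
    """Return each query's share of the combined number of occurrences."""
    total = sum(counts.values())
    return {query: n / total for query, n in counts.items()}


if __name__ == "__main__":
    for query, share in shares(counts).items():
        print(f"{query}: {share:.1%}")
```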
from apertium-tat.
Maybe a better criterion than "which is used more often?" would be "which do native speakers find least unexpected?"
@mansayk, you use абыйсы and it's more common in the corpus, which would argue in favour of that form. You say you don't consider абые wrong, but does it feel in any way less common? Like, when you encounter it in text, does it feel jarring, old-fashioned, or somehow unusual? That would be the best argument for going with абыйсы as the default.
@IlnarSelimcan, we're still waiting for your input on this.
from apertium-tat.
> you use абыйсы and it's more common in the corpus, which would argue in favour of that form.
Ok, I agree, I will take it into account next time. But the second form doesn't feel less common to me in any way: it is not old-fashioned, it is definitely not unusual, it doesn't feel jarring, and, as I pointed out before, it is the only form given in the Tatar orthographic dictionary (2017)... Personally, I consider it just an alternative form, a kind of absolute synonym. @IlnarSelimcan, we really need your help here.
from apertium-tat.