Comments (9)
This isn't necessarily wrong. A lemma can be anything, and here we chose the morphological root instead of the citation form. @IlnarSelimcan, what do you think makes the most sense?
from apertium-tat.
Btw, @mansayk, the "invalid" label means that an issue isn't valid and that we shouldn't pay attention to it. I think the label you're after is just "bug".
Oh, thank you, I didn't know that about the "invalid" label.
About the lemma: the basic form of that word is "ал", which is a normal word meaning "front side", so we can't lemmatize it as "алд", right? I think lemma "ал" is a better choice here.
I think lemma "ал" is a better choice here.
@IlnarSelimcan, what do you think? I'm happy either way, and it's trivial to change, but I want to make sure there wasn't a reason it's алд. One possibility that comes to mind is translation: with алд as the lemma, there's no possibility of confusion with the adjective(?), verb, and auxiliary ал.
Closed 4395924
Historically, these three have been "алд", "аст", "өст", but they seem to be shifting more and more towards variants without д/т in all forms, at least in speech.
In forms without possessives (аска, өскә, алга, алны, алдан...) or in the plural (алларына, өсләренә, асларыннан), д/т doesn't surface. Based on that, "ас", "өс" and "ал" seem more appropriate as lemmas, but I'm not sure whether we have listed all the arguments for and against here.
Here are some excerpts from suzlek.antat.ru:
Тэтимол 2015
АЛ III, archaic, bookish алд "front, front part; front (adj.)" < Common Turkic алд, алт "front, bottom", from the ancient Common Turkic root meaning "underfoot" (the semantic shift ал > аст is explained by the direction you are heading in, your path, ending up under your feet: the root ал is compared with the Finno-Ugric root *ul "bottom"). In general, the etymology of these words is highly conjectural. The shared element -д/-т of the words алд, аст, өст, арт is also viewed in various ways (see ЭСТЯ I: 140–141).
In the earlier (Arabic-script) orthography, the -т part of the word алт was usually not dropped, and this also seems to be correct (it was written алд як, etc.); in any case, it is consistent with the general system.
Ал is a productive stem: алгы, алдын, etc. See Алын.
Тэтимол 2015
ӨС I: essentially an incorrect spelling of the word өст "top; outer clothing" < Common Turkic and Old Turkic üst. According to F. Iskhakov (see Ал, Арт, Ас), it arose through the development < үстү < үснү < үсүн-ү from the ancient word *үсүн ~ Khakas, Tuvan үзүн "upper side, growing side (?)", and this үсүн ~ үзүн in turn may be formed from the verb *үс- ~ *үз- "to cover" (cf. Өсәк); see Будагов I: 135–136; ЭСТЯ I: 638–639. Formerly (in Arabic script) people wrote өст-баш, өсткә, өстке, and they were right to do so. After the switch to Latin and Cyrillic, writing only өs, өс was a retreat from the scientific foundations of orthography (supposedly "the people's way"). Cf.: Russian orthography very often writes sounds that are not pronounced, yet no one calls for "correcting" this.
Derivatives: өсле, өссез; in paired words: өсте-башы (not өсе?), өсте-асты, etc. See Өстәр, Өстә-ү, Өстен.
+1 for "ал, ас, өс", because they are orthographically correct and understandable for everyone.
I think it makes sense to have ал, ас, and өс as the lemmas (as long as there aren't other nouns these would become ambiguous with), since they are orthographically correct on their own and are dictionary headwords. Also, this is in line with how we treat дус.
For the record, the argument that the /d/ and /t/ don't surface when the forms are on their own (but do before a vowel) could be used for either position: since they are there underlyingly (from a generativist standpoint), the forms with them could make more sense as the lemmas.