cuny-cl / latin_scansion
License: Apache License 2.0
I'm reading Cser's The Phonology of Classical Latin and here are some things he claims we might want to support (or enforce)
Pynini 2.1.5 was just released and we may want to update. It doesn't have any major API or performance changes, so I think we can just edit the environment.yml and/or requirements.txt files.
Thrax 1.3.7 has a trivial but fatal registration bug with StringFile. I will make an upstream release of a new version, get it into Conda-Forge, and then migrate this library to it. This should clear the CI failure. I am assigning myself.
CI should run the scansion test, if that's possible (see https://stackoverflow.com/questions/58849994/running-tests-at-circleci-in-conda-environment for how to get Conda running in CircleCI).
In compound words whose first element is a preposition such as ad, ab, sub, dē, in, etc., I think the preposition should be treated as its own syllable (but not always). As it is now, the meter grammar attaches the last consonant of the preposition as the onset of the next syllable. The difficulty is knowing, when the following syllable begins with a vowel, in which cases the prefix should be kept as its own syllable and in which cases its final consonant should become the onset of the next syllable.
Example:
"ex numerō subit; ac magnō tellūris amōre"
subit is [su.bit] (i.e. the "b" is the onset)
"Ūnus abest, mediō in flūctū quem vīdimus ipsī"
abest is [ab.est] (i.e. the "b" is the coda)
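The two outcomes above can be made concrete with a minimal sketch. This is not the library's grammar: `syllabify`, its `prefix` argument, and the naive cluster rule are all hypothetical, and the sketch ignores diphthongs and muta cum liquida. It only shows how an explicit prefix boundary changes the result; deciding which words should receive one is the open question.

```python
VOWELS = set("aeiouyāēīōūȳ")


def syllabify(word: str, prefix: str = "") -> list:
    """Naive syllabifier (illustrative only).

    A single intervocalic consonant becomes the onset of the next syllable.
    If `prefix` is given, the word is split there first, so the prefix's
    final consonant stays in its coda.
    """
    if prefix and word.startswith(prefix) and len(word) > len(prefix):
        return syllabify(prefix) + syllabify(word[len(prefix):])
    # Every vowel is treated as its own nucleus (assumption: no diphthongs).
    nuclei = [i for i, ch in enumerate(word) if ch in VOWELS]
    if not nuclei:
        return [word]
    pieces, start = [], 0
    for left, right in zip(nuclei, nuclei[1:]):
        cluster = right - left - 1  # consonants between the two nuclei
        # Zero or one consonant: cut right after the left vowel.
        # Two or more: the first consonant closes the left syllable.
        cut = left + 1 if cluster <= 1 else left + 2
        pieces.append(word[start:cut])
        start = cut
    pieces.append(word[start:])
    return pieces
```

With no prefix, `syllabify("subit")` gives `["su", "bit"]` and `syllabify("abest")` gives `["a", "best"]`; passing `prefix="ab"` gives `["ab", "est"]`. Since subit is also etymologically sub + it, a fixed prefix list would not be enough, and some per-word information would be needed.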
If #94 is accepted, we will be checking in textprotos into the repo. We should perhaps add a test to the CI system that automatically runs latin_validate over any and all textprotos checked in to make sure that new PRs don't introduce syntax errors in them.
n/t
A script should be used to confirm that a Document textproto file is parseable. (As a side effect it could optionally reserialize the data in question to ensure it's as canonical as possible.)
This will allow us to manually add comments to the files (e.g., a note to the effect that some verse requires hypermetric), but ensure we don't break anything adding these annotations.
Currently the protobuf output by the Python scansion engine does not emit information about syllable boundaries, syllable weight, or foot structure. Only a slight redesign should be necessary to preserve this information and to store it in the proto.
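As a rough illustration of the information that would need to be preserved, here is a sketch using plain Python dataclasses. The real proto schema is not shown here, so every name below (`Weight`, `Syllable`, `Foot`, `Verse`, and their fields) is hypothetical and only indicates one possible shape for the nested structure.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Weight(Enum):
    LIGHT = "L"
    HEAVY = "H"


@dataclass
class Syllable:
    text: str        # the syllable's segments, e.g. "ar"
    weight: Weight   # metrical weight of the syllable


@dataclass
class Foot:
    kind: str        # e.g. "dactyl" or "spondee" (hypothetical labels)
    syllables: List[Syllable] = field(default_factory=list)


@dataclass
class Verse:
    text: str
    feet: List[Foot] = field(default_factory=list)

    def weights(self) -> str:
        """Flatten the foot structure into a string of syllable weights."""
        return "".join(s.weight.value for f in self.feet for s in f.syllables)
```

Syllable boundaries fall out of the list of `Syllable` objects, and foot boundaries out of the list of `Foot` objects, so a message with the same nesting would carry all three kinds of information.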
My colleague Richard (politely) requests that we consider also supporting hendecasyllables, as in some of Catullus's poems.
Technically this is not impossible but would be slightly annoying in that these are not traditionally described as having foot structure, just verse structure. Probably easiest to have a separate pipeline but reuse the lower-level (below the foot) grammar definitions, and then have a radio button to select the type of verse.
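To illustrate verse-level structure without feet, the sketch below checks a string of syllable weights against the Phalaecian hendecasyllable pattern as Catullus uses it: a two-syllable base (spondee, trochee, or iamb), a fixed core, and a final syllable of either weight. It assumes the lower-level grammar has already produced one weight per syllable, encoded as `H` (heavy) or `L` (light). That encoding and the function name are hypothetical, and the project itself would presumably express this as a grammar rather than a regular expression.

```python
import re

# Base: spondee (HH), trochee (HL), or iamb (LH); then the fixed core;
# then a final syllable of either weight. Eleven syllables in total.
HENDECASYLLABLE = re.compile(r"(HH|HL|LH)HLLHLHLH[HL]")


def is_hendecasyllable(weights: str) -> bool:
    """Return True if the weight string fits the hendecasyllable pattern."""
    return HENDECASYLLABLE.fullmatch(weights) is not None
```

For example, "cui dōnō lepidum novum libellum" scans as `HHHLLHLHLHH`, which the function accepts; a seventeen-syllable hexameter weight string is rejected.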
Just putting this here for later ;)