Comments (3)
@bertsky thanks for debugging.
- The METS problem should be fixed.
- That's right. There is no separate example for recording WORD in Page.
- That's correct, all possibilities can occur. I'm sorry, glyph coordinates based only on squares do not represent reality.
- All Unicode characters from private use areas should now be mapped back to "standard" Unicode. In the case of ligatures there are now sometimes two characters (e.g. the ligature ch).
  But see: https://ocr-d.github.io/gt//trans_documentation/ocr_d_koordinationsgremium_codierung.html
- Readme.md added and corrected.
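To check whether any private-use code points remain in a transcription, something like the following could be used. It is a minimal sketch: the function name `find_private_use` is hypothetical, and it relies only on the Unicode general category `Co` (private use) from Python's standard `unicodedata` module, not on any OCR-D tooling.

```python
import unicodedata


def find_private_use(text):
    """Return (index, 'U+XXXX') for every private-use code point in text.

    Hypothetical helper for illustration; 'Co' is the Unicode general
    category for private-use characters.
    """
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Co"
    ]


if __name__ == "__main__":
    # U+EADA lies in the BMP private use area (U+E000..U+F8FF).
    print(find_private_use("a\uEADAb"))
```

An empty result means the text contains only assigned, non-private code points; it does not say anything about whether ligatures were split.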
Thanks @tboenig – that was fast!
Regarding 4, now I am puzzled: I still find ſt U+FB05 (the ſt ligature from Unicode-Ll) in the new version. It does appear in the IMPACT table you referenced, but how is that relevant here? As far as I understand, our GT guidelines are different: ligatures are to be split up in all but transcription level 3, but all the other GT files seem to be level 2. Which one is correct?
BTW, the table on ligatures in your guidelines has a typo: for the ſt ligature it says U+EADA instead of U+FB05.
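For illustration, splitting the Latin presentation-form ligatures (U+FB00 to U+FB06, which includes U+FB05) could look like the sketch below. The helper names are hypothetical, and whether this exact mapping is what the level 2 guidelines require is an assumption. It uses the one-step compatibility mapping from `unicodedata.decomposition` rather than NFKC, because NFKC would also fold the long s (U+017F) into a plain s.

```python
import unicodedata


def split_ligature(ch):
    """Expand one character via its one-step <compat> mapping, if it has one."""
    decomp = unicodedata.decomposition(ch)
    if not decomp.startswith("<compat>"):
        return ch
    # The mapping looks like '<compat> 017F 0074'; drop the tag, keep the hex values.
    return "".join(chr(int(cp, 16)) for cp in decomp.split()[1:])


def split_presentation_ligatures(text):
    """Split only the Latin ligatures U+FB00..U+FB06 (hypothetical helper)."""
    return "".join(
        split_ligature(ch) if "\uFB00" <= ch <= "\uFB06" else ch
        for ch in text
    )


if __name__ == "__main__":
    print(unicodedata.name("\uFB05"))
    print(split_presentation_ligatures("Fen\uFB05er"))
```

After splitting, the long s is kept as U+017F followed by t, so the historical spelling survives while the ligature code point disappears.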
@bertsky Thanks,
- The table on ligatures: the error in the table is corrected.
- Thank you again for reading carefully: the current version has to be corrected again, because as it stands it would correspond to a Level 3 version. I'm working on a correct version for Level 2 right now. This version will also be published.