
Comments (4)

BartChris avatar BartChris commented on June 25, 2024

@markusweigelt The linked feature sounds great. But I think it would be very useful to have the following functionality, which is in a way also the basis of the quite fancy functionality described there:

  • show the recognized OCR as plain text inside of Kitodo (by parsing the ALTO and filtering out the XML tags)
  • enable the OCR of a single page or multiple pages from Kitodo (from the metadata editor)
  • give an indication in the Kitodo editor which pages are OCR processed and which are not

The main use case for me would be to allow OCR to run at the beginning of a workflow, even before people have done any quality assurance (missing pages etc.), so that the OCR does not have to wait and people can use the OCR results while structuring. If people then make corrections in Kitodo, OCR could be enabled only for newly added pages, for example.

I am not quite sure whether those features could be addressed in the KITODO-OCRD project or whether they are something for the Kitodo development fund. What do you think?

from ocrd_kitodo.

bertsky avatar bertsky commented on June 25, 2024

Most of the things which kitodo/kitodo-production#5476 describes are new Kitodo UI features – out of scope for our OCR-D integration project, so yes, that would mean Kitodo development fund.

  • show the recognized OCR as plain text inside of Kitodo (by parsing the ALTO and filtering out the XML tags)

What we can do here is previewing OCR results with OCR-D browser.

On the Kitodo side, for the intended extension, I think you're right – a simple plain text editor would suffice (one line per TextLine with all its ./String/@CONTENT concatenated).
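As a rough illustration of that plain-text view, here is a minimal Python sketch using only the standard library. It is not Kitodo or ocrd_kitodo code: the function name `alto_to_plain_text` is made up for this example, and joining the String contents with a single space is an assumption (a real implementation would probably also want to handle SP and HYP elements). Tags are matched by local name so that the ALTO namespace version does not matter.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def alto_to_plain_text(alto_xml: str) -> str:
    """Return one text line per ALTO TextLine (hypothetical helper).

    Each output line joins the CONTENT attributes of the String
    elements that are direct children of that TextLine.
    Assumption: words are separated by a single space.
    """
    root = ET.fromstring(alto_xml)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        words = [
            child.get("CONTENT", "")
            for child in elem
            if local_name(child.tag) == "String"
        ]
        lines.append(" ".join(words))
    return "\n".join(lines)
```

Called on the ALTO file of a page, this would give the text that a simple read-only (or editable) text widget could display.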

  • enable the OCR of a single page or multiple pages from Kitodo (from the metadata editor)

Already possible (see --page-id option for for_production.sh and for_presentation.sh). The syntax is explained here (notice multi-value / range / regex support).

  • give an indication in the Kitodo editor which pages are OCR processed and which are not

That's also something we (as integration project) have little control over, since it's a genuine UI feature. All we can do is ensure the filesystem side (FULLTEXT subdirectory and file names) fits Kitodo's conventions.

The main use case for me would be to allow OCR to run at the beginning of a workflow, even before people have done any quality assurance (missing pages etc.), so that the OCR does not have to wait and people can use the OCR results while structuring. If people then make corrections in Kitodo, OCR could be enabled only for newly added pages, for example.

Yes, these are valid use-cases, too. But renaming pages adds the difficulty of ensuring consistency (as long as OCR is still running). I'll try to reformulate under kitodo/kitodo-production#5476.

For ocrd_kitodo IMO we can already close (as it's already supported from our side).


bertsky avatar bertsky commented on June 25, 2024

For ocrd_kitodo IMO we can already close (as it's already supported from our side).

Except perhaps the feature that we should skip pages which have already been processed earlier (an ALTO file exists).
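To make the idea concrete, a minimal Python sketch of such a check could look like the following. Everything in it is an assumption for illustration: the helper name `pages_needing_ocr`, the idea that each page has one ALTO file named after its page ID with an `.xml` suffix, and the directory layout. The actual file naming would have to follow Kitodo's conventions.

```python
from pathlib import Path
from typing import Iterable, List


def pages_needing_ocr(page_ids: Iterable[str], fulltext_dir: str) -> List[str]:
    """Return the page IDs that have no ALTO file yet (hypothetical helper).

    Assumption: the ALTO result for page ID "X" is stored as
    "<fulltext_dir>/X.xml". Pages with an existing file are skipped.
    """
    directory = Path(fulltext_dir)
    return [
        page_id
        for page_id in page_ids
        if not (directory / f"{page_id}.xml").is_file()
    ]
```

The resulting list could then be used to restrict processing to the remaining pages, for example via the --page-id option mentioned above.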


BartChris avatar BartChris commented on June 25, 2024

Great, thanks for your detailed answer!
