lectaurep / lepidemo

LECTAUREP Pipeline demonstration to TEI Publisher

License: Creative Commons Attribution 4.0 International

Python 8.26% Jupyter Notebook 91.74%
tei pagexml pipeline htr escriptorium tei-publisher

lepidemo's Introduction

LEPIDEMO: LECTAUREP PIPELINE DEMONSTRATOR

Going from eScriptorium to TEI-Publisher

This demonstration shows an implementation of a pipeline going from PAGE XML to TEI Publisher, created as part of the LECTAUREP project.

LECTAUREP is a project jointly led by Inria (ALMAnaCH) and the Archives nationales de France (DMC). Its purpose is to facilitate the exploration of thousands of pages of directories listing minutes and deeds drawn up by Parisian notaries between the beginning of the 19th century and the mid-20th century. To do so, LECTAUREP relies on automatic transcription performed with Kraken via the eScriptorium web application.

Images are loaded on the platform, then transcribed and annotated, and finally exported to PAGE XML files. The last section of the pipeline aims at offering users a platform to visualise, query and read the pages of the directories. An almost ready-to-use solution consists in using TEI-Publisher, which requires transforming the PAGE XML files into compliant TEI XML.

LEPIDEMO demonstrates how this transformation can be plugged into eScriptorium as a simple Python script.
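
To give an idea of what such a transformation involves, here is a minimal sketch using only the Python standard library. It is not the script shipped in this repository: the function name `page_to_tei`, the choice of `<ab>` and `<lb/>` elements, and the placeholder header text are assumptions made for illustration, and the output may need further work before TEI-Publisher accepts it. The sketch reads the text lines of each `TextRegion` in a PAGE XML document and writes them into a small TEI document with a minimal `teiHeader`.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def _t(tag: str) -> str:
    """Qualify a tag name with the TEI namespace."""
    return f"{{{TEI_NS}}}{tag}"


def page_to_tei(page_xml: str, title: str = "Untitled page") -> str:
    """Hypothetical helper: turn one PAGE XML document into a minimal TEI string."""
    root = ET.fromstring(page_xml)
    # PAGE XML namespaces differ between schema versions, so reuse the one
    # declared by the file instead of hard-coding it.
    if not root.tag.startswith("{"):
        raise ValueError("expected a namespaced PAGE XML document")
    ns = {"pc": root.tag[1:].split("}")[0]}

    tei = ET.Element(_t("TEI"))

    # Minimal header: titleStmt, publicationStmt and sourceDesc are mandatory.
    file_desc = ET.SubElement(ET.SubElement(tei, _t("teiHeader")), _t("fileDesc"))
    title_stmt = ET.SubElement(file_desc, _t("titleStmt"))
    ET.SubElement(title_stmt, _t("title")).text = title
    publication = ET.SubElement(file_desc, _t("publicationStmt"))
    ET.SubElement(publication, _t("p")).text = "Unpublished demonstration output"
    page = root.find("pc:Page", ns)
    image = page.get("imageFilename", "unknown") if page is not None else "unknown"
    source = ET.SubElement(file_desc, _t("sourceDesc"))
    ET.SubElement(source, _t("p")).text = f"Transcribed from {image}"

    # One <ab> per text region, one <lb/> per transcribed line.
    body = ET.SubElement(ET.SubElement(tei, _t("text")), _t("body"))
    for region in root.iterfind(".//pc:TextRegion", ns):
        block = ET.SubElement(body, _t("ab"))
        if region.get("id"):
            block.set("n", region.get("id"))
        for number, line in enumerate(region.iterfind("pc:TextLine", ns), start=1):
            text = line.findtext("pc:TextEquiv/pc:Unicode", default="", namespaces=ns)
            line_break = ET.SubElement(block, _t("lb"))
            line_break.set("n", str(number))
            line_break.tail = text  # the line's text follows its <lb/>

    # Serialise with TEI as the default namespace (no prefixes on tags).
    ET.register_namespace("", TEI_NS)
    return ET.tostring(tei, encoding="unicode")


if __name__ == "__main__":
    # Tiny made-up PAGE XML document, for illustration only.
    sample = (
        '<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">'
        '<Page imageFilename="page_001.jpg"><TextRegion id="region_1">'
        "<TextLine><TextEquiv><Unicode>First line</Unicode></TextEquiv></TextLine>"
        "</TextRegion></Page></PcGts>"
    )
    print(page_to_tei(sample, title="Demo"))
```

Layout information such as coordinates and baselines is deliberately dropped here to keep the example short. A real pipeline would probably carry it over, for instance into a TEI `facsimile` section.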

A Jupyter notebook

The demonstration can be followed step by step using the lepidemo.ipynb Jupyter notebook.

Installation

  • Create a Python virtual environment: `virtualenv -p python3 [ENVIRONMENT NAME]`
  • Activate it: `source [ENVIRONMENT NAME]/bin/activate`
  • Then launch Jupyter with `jupyter notebook`
  • Open `lepidemo.ipynb` in the Jupyter browser, then follow the instructions in the cells.

Cite this work

Chagué, A., & Scheithauer, H. LEPIDEMO, a Pipeline Demonstrator for LECTAUREP to go from eScriptorium to TEI-Publisher [Computer software]

lepidemo's People

Contributors

alix-tz, hugoschtr

Forkers

hugoschtr

lepidemo's Issues

Sync with Zenodo

  • add DOI badge
  • add DOI in citation
  • add license
  • add Zenodo metadata

Update the TEI Header

Given PR #1, the way the TEI Header is composed needs to be updated.
