archivesnationalesfr / rico-converter

A tool to convert EAC-CPF and EAD 2002 XML files to RDF datasets conforming to Records in Contexts Ontology (RiC-O)

Home Page: https://archivesnationalesfr.github.io/rico-converter/

License: Other

Java 31.54% XSLT 66.44% Batchfile 0.33% Shell 1.69%
eac-cpf ead xml xslt ric-o rdf archives

rico-converter's Introduction

RiC-O converter

A tool to convert EAC-CPF and EAD 2002 XML files to RDF datasets conforming to Records in Contexts Ontology (RiC-O).

This repository includes the converter, some example files and documentation.

The converter is written mostly using XSLT stylesheets, wrapped in a Java command-line application.

Documentation

The documentation can be found at https://archivesnationalesfr.github.io/rico-converter

rico-converter's Issues

Check for empty values when generating literal properties (everywhere)

A query to check for empty literal values in the POC data:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?p (COUNT(?s) AS ?count) WHERE {
  ?s ?p ?o
  FILTER(STR(?o) = "")
} GROUP BY ?p

Yields the following results:

1 rdfs:label "29"^^xsd:integer
2 rico:descriptiveNote "3"^^xsd:integer
3 rico:textualValue "14"^^xsd:integer
4 rico:identifier "6390"^^xsd:integer

Most of these appear to be rico:identifier values on Instantiation resources, such as http://data.archives-nationales.culture.gouv.fr/instantiation/041185-Grandrye_Pierre_0-i1
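
One way to avoid such empty literals at generation time is to test the normalized value before emitting the property. The sketch below is in Python for brevity (the converter itself is written in XSLT, where the equivalent test would wrap each literal-producing instruction). The helper name literal_triples and the example URI and identifier are hypothetical.

```python
from typing import Iterable, Iterator

RICO = "https://www.ica.org/standards/RiC/ontology#"


def literal_triples(subject: str, predicate: str, values: Iterable[str]) -> Iterator[str]:
    """Yield one N-Triples line per value, skipping empty or whitespace-only values."""
    for value in values:
        # Trim and collapse whitespace, similar to XPath normalize-space().
        text = " ".join(value.split())
        if not text:
            continue  # emit nothing rather than an empty literal
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        yield f'<{subject}> <{predicate}> "{escaped}" .'


# Hypothetical input: two empty values and one real identifier.
lines = list(
    literal_triples(
        "http://example.org/instantiation/1",  # hypothetical URI
        RICO + "identifier",
        ["", "   ", "FRAN_0001"],
    )
)
```

With this guard, only the non-empty value produces a triple, so the SPARQL check above would no longer count it.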

Generate shortcut properties

Systematically generate the 'shortcut' triple, using the object property declared for this purpose in the ontology, whenever a complex relation is involved (that is, one involving an instance of a subclass of Relation).
See for example https://www.ica.org/standards/RiC/ontology#performsOrPerformed
This change mainly concerns the conversion of EAC-CPF records. It will also require modifying the unit tests.
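
As an illustration of the idea, the following Python sketch emits the direct shortcut triple alongside the n-ary Relation instance. It is only a sketch under assumptions: the SHORTCUTS table, the function name relation_triples, the example URIs, and the use of PerformanceRelation, relationHasSource and relationHasTarget are assumptions for illustration and are not taken from the converter's stylesheets.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RICO = "https://www.ica.org/standards/RiC/ontology#"

# Hypothetical lookup: Relation subclass -> shortcut object property.
SHORTCUTS = {
    RICO + "PerformanceRelation": RICO + "performsOrPerformed",
}


def relation_triples(relation_uri, relation_class, source_uri, target_uri):
    """Return the triples for an n-ary relation, plus its shortcut triple if one is known."""
    triples = [
        (relation_uri, RDF_TYPE, relation_class),
        (relation_uri, RICO + "relationHasSource", source_uri),  # assumed property name
        (relation_uri, RICO + "relationHasTarget", target_uri),  # assumed property name
    ]
    shortcut = SHORTCUTS.get(relation_class)
    if shortcut is not None:
        # The shortcut links source and target directly, bypassing the Relation node.
        triples.append((source_uri, shortcut, target_uri))
    return triples


# Hypothetical URIs for an agent performing an activity.
example = relation_triples(
    "http://example.org/relation/1",
    RICO + "PerformanceRelation",
    "http://example.org/agent/1",
    "http://example.org/activity/1",
)
```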

hard-coded h4 in ead2rico.xslt

Thanks for the great work. I noticed that there are many hard-coded <html:h4>...</html:h4> elements in French in ead2rico.xslt, for example at lines 121, 127 and 159. Is there a workaround for other languages? Many thanks.

Fix hasDocumentaryFormType

Correct the use of hasDocumentaryFormType (to be used only for the Record and Record Part classes).
For Record Set, use the properties analogous to the new properties whose range is Language (see above).

Express sequences of Records or Record Sets in a Record Set

We forgot to do so :(, and it is important of course: if we do not do this, we shall lose some information compared to what the EAD sequence of c elements means.
So I think we should do it in the v3 / RiC-O 1.0 version.
We could at least use the new RiC-O rico:directlyPrecedesInSequence object property to connect a Record Resource (whose description is in an EAD c element) to the one that is described by the first following-sibling c element. The same goes for connecting the Record Resource to the one described by the last preceding-sibling c element (using rico:directlyFollowsInSequence).
See also this issue, opened a few days ago about RiC-O: ICA-EGAD/RiC-O#97
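
The proposal can be sketched as follows, in Python for brevity. The sketch assumes the URIs of the sibling Record Resources are already available in document order (the order of the EAD c elements); the function name sequence_triples and the example URIs are hypothetical.

```python
RICO = "https://www.ica.org/standards/RiC/ontology#"


def sequence_triples(sibling_uris):
    """Link each resource to its immediate neighbour in the sequence, in both directions."""
    triples = []
    # zip pairs each item with the one that directly follows it.
    for current, following in zip(sibling_uris, sibling_uris[1:]):
        triples.append((current, RICO + "directlyPrecedesInSequence", following))
        triples.append((following, RICO + "directlyFollowsInSequence", current))
    return triples


# Hypothetical URIs for three sibling c elements, in document order.
siblings = [
    "http://example.org/record/a",
    "http://example.org/record/b",
    "http://example.org/record/c",
]
seq = sequence_triples(siblings)
```

A sequence of n siblings yields 2 × (n − 1) triples; a single child yields none.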

Fix includesOrIncluded

Correction needed in the use of includesOrIncluded: replace it with hasOrHadConstituent
when the target is a RecordPart (and likewise for the inverse relation).
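
The rule amounts to choosing the property from the class of the target. A minimal Python sketch, covering only the forward direction; the function name containment_property is hypothetical, and the converter would apply the same test in XSLT:

```python
RICO = "https://www.ica.org/standards/RiC/ontology#"


def containment_property(target_class: str) -> str:
    """Pick the containment property according to the class of the target resource."""
    if target_class == RICO + "RecordPart":
        return RICO + "hasOrHadConstituent"
    return RICO + "includesOrIncluded"


part_property = containment_property(RICO + "RecordPart")
record_property = containment_property(RICO + "Record")
```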

Update language controlled vocabulary file

See comments on #31:

So we should also modify the VOCABULARY_LANGUAGES param and the eac2rico:URI-Language function in the eac2rico-uris.xslt file + upload the new version of the vocab and delete the previous one. So, do we do this in the same branch or should we rather create a new issue and branch to handle this?

  • Take the updated file from vocabulary repo
  • Update the reading of the vocabulary to read skos:Concept and not rico:Language
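
The second step can be illustrated with a short Python sketch that reads language entries typed as skos:Concept and ignores those typed as rico:Language. The sample RDF/XML, its URIs and the function name language_concepts are hypothetical; the real vocabulary file may be structured differently, and the converter reads it from XSLT rather than Python.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical vocabulary fragment mixing the old and the new typing.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#"
    xmlns:rico="https://www.ica.org/standards/RiC/ontology#">
  <skos:Concept rdf:about="http://example.org/language/fra">
    <skos:prefLabel xml:lang="en">French</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/language/lat">
    <skos:prefLabel xml:lang="en">Latin</skos:prefLabel>
  </skos:Concept>
  <rico:Language rdf:about="http://example.org/language/old"/>
</rdf:RDF>"""


def language_concepts(rdf_xml: str) -> dict:
    """Map the URI of each skos:Concept to its first skos:prefLabel."""
    root = ET.fromstring(rdf_xml)
    concepts = {}
    for concept in root.iter(f"{{{SKOS}}}Concept"):
        uri = concept.get(f"{{{RDF}}}about")
        label = concept.findtext(f"{{{SKOS}}}prefLabel", default="")
        if uri:
            concepts[uri] = label
    return concepts


concepts = language_concepts(SAMPLE)
```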

Typo in ricoconverter.sh

Hi! I'm testing the converter in a Mac environment and it seems to me that there is a little typo in the "ricoconverter.sh" file.
Line 23 contains an "@" that causes an error when launching a command: it seems to be treated as part of the file path, giving for example the error:

"ERROR : Could not read file @parameters/convert_eac.properties: java.nio.file.NoSuchFileException: @parameters/convert_eac.properties"

I tried deleting the "@" at line 23, and the script then works correctly. But I'm not an expert at all, so I wanted to check with you whether this change is correct.
Thank you very much!
Lucia

Renaming and refactoring to generate new properties hasContentOfType, hasOrHadLanguage, hasOrHadLegalStatus, hasRecordState

This necessitates a refactoring of RecordSet.
See also #3

Particular attention must be paid to the following properties: hasContentOfType,
hasOrHadLanguage (which replaces hasLanguage), hasOrHadLegalStatus (which replaces
hasLegalStatus), hasRecordState (which replaces hasRecordResourceState). The domain of these
properties has been narrowed to the Record and Record Part classes (plus Agent for
hasOrHadLanguage and hasOrHadLegalStatus), whereas it was previously set to RecordResource.

And also:

For the RecordSet class, new properties have been introduced. For example, to link a
RecordSet and a Language, the following properties are now used:
hasOrHadAllMembersWithLanguage
hasOrHadSomeMembersWithLanguage
On this subject, see ticket ICA-EGAD/RiC-O#20
In RiC-O Converter, this will mean rewriting the corresponding instructions.
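
The resulting choice of language property can be sketched in Python as below. This is only an illustration of the rule described in this issue: the function name language_property and the all_members flag are hypothetical, and the issue does not say which of the two RecordSet properties the converter should emit in a given case.

```python
def language_property(subject_class: str, all_members: bool = True) -> str:
    """Return the local name of the property linking a resource to a Language."""
    if subject_class == "RecordSet":
        # RecordSet uses the new member-oriented properties.
        if all_members:
            return "hasOrHadAllMembersWithLanguage"
        return "hasOrHadSomeMembersWithLanguage"
    if subject_class in {"Record", "RecordPart", "Agent"}:
        # The narrowed domain of hasOrHadLanguage.
        return "hasOrHadLanguage"
    raise ValueError(f"no language property for class {subject_class!r}")


record_set_property = language_property("RecordSet", all_members=False)
record_property = language_property("Record")
```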
