oeg-upm / morph-csv

Enhancing virtual KG access over tabular data with RML and CSVW

Home Page: https://morph.oeg.fi.upm.es/tool/morph-csv

License: Apache License 2.0

Python 69.30% Dockerfile 0.68% Shell 5.16% HTML 7.80% JavaScript 16.29% CSS 0.77%
data-integration csv knowledge-graph obda

Morph-CSV

Morph-CSV is an open-source tool for querying tabular data sources using SPARQL. It exploits information from the query, the RML+FnO mappings and the CSVW metadata to enhance the performance and completeness of traditional OBDA systems (SPARQL-to-SQL translators). At the moment, it can be embedded on top of any R2RML-compliant system. For more details, watch the introductory video about Morph-CSV. If you have any questions about how to create RML+FnO or CSVW annotations, please ask the W3C Community Group on Knowledge Graph Construction.

Morph-csv workflow

Citing Morph-CSV: If you use Morph-CSV in your work, please cite it as:

@article{chaves2021enhancing,
  author    = {Chaves{-}Fraga, David and Ruckhaus, Edna and Priyatna, Freddy and Vidal, Maria{-}Esther and Corcho, Oscar},
  title     = {Enhancing Virtual Ontology Based Access over Tabular Data with Morph-CSV},
  journal   = {Semantic Web},
  year      = {2021},
  doi       = {10.3233/SW-210432},
  publisher = {IOS Press}
}

How to use it?

First, clone the repository:

git clone https://github.com/oeg-upm/morph-csv.git
cd morph-csv

The best way to run Morph-CSV is through its user interface, which can be deployed with Docker*:

 docker-compose up -d

A user interface, as shown in the following image, will be available at localhost:5000.

Morph-csv demo

If you prefer a CLI tool, there are two ways to run Morph-CSV: using the Docker image or running it directly with Python 3:

  • Using docker and docker-compose*:

    docker-compose up -d
    docker exec -it morphcsv python3 /morphcsv/morphcsv.py -c /configs/config-file.json -q /queries/query-file.rq
  • Using python3 (under a UNIX system):

    pip3 install -r requirements.txt
    python3 morphcsv.py -c path-to-config-file.json -q path-to-query-file.rq

*If you want to use a local resource, copy it to the corresponding shared volume (folders: data, mappings, configs or queries).
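
For scripted runs, the Python 3 invocation above can be assembled programmatically. This is only a minimal sketch: the helper name `build_morphcsv_command` and the example file paths are hypothetical, and only the `-c` and `-q` flags shown above are used.

```python
import shlex


def build_morphcsv_command(config_path, query_path, script="morphcsv.py"):
    """Return the argument list for a local Morph-CSV run.

    Only the -c (config) and -q (query) flags from the README are used.
    """
    return ["python3", script, "-c", str(config_path), "-q", str(query_path)]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = build_morphcsv_command("configs/config-file.json", "queries/query-file.rq")
    print(shlex.join(cmd))
    # To execute it from the repository root:
    #   import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list (rather than a single string) avoids shell-quoting problems when paths contain spaces.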

Define your config.json file

The paths of the data sources in the CSVW and YARRRML annotations must be the same.

{
  "csvw":"PATH OR URL to CSVW annotations",
  "yarrrml": "PATH OR URL TO YARRRML+FnO Mapping"
}
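
The config file can also be generated from a script. The sketch below writes the two keys shown above and reports any that are empty; the helper names and the example paths are hypothetical, and treating both keys as mandatory is an assumption based on the template, not a documented rule.

```python
import json

# Keys taken from the config template above.
EXPECTED_KEYS = ("csvw", "yarrrml")


def make_config(csvw, yarrrml):
    """Build the config dictionary with the two keys from the template."""
    return {"csvw": csvw, "yarrrml": yarrrml}


def missing_keys(config):
    """Return the expected keys that are absent or empty (assumed mandatory)."""
    return [key for key in EXPECTED_KEYS if not config.get(key)]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    config = make_config("/mappings/example.csvw.json", "/mappings/example.yarrrml.yml")
    problems = missing_keys(config)
    if problems:
        raise SystemExit(f"Missing config keys: {problems}")
    with open("config.json", "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
```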

Evaluation

Morph-CSV has been tested on three use cases: BSBM, Madrid-GTFS-Bench and the Bio2RDF project. The resources used and the results obtained are available in the evaluation branch.

Publications:

  • David Chaves-Fraga, Edna Ruckhaus, Freddy Priyatna, Maria-Esther Vidal, Oscar Corcho: Enhancing Virtual Ontology Based Access over Tabular Data with Morph-CSV. Semantic Web Journal, 2021.
  • David Chaves-Fraga, Freddy Priyatna, Idafen Santana-Pérez, Oscar Corcho: Virtual Statistics Knowledge Graph Generation from CSV files. Emerging Topics in Semantic Technologies: ISWC2018 Satellite Events. Vol. 36. Studies on the Semantic Web. IOS Press, 2018, pp. 235–244.
  • Oscar Corcho, Freddy Priyatna, David Chaves-Fraga: Towards a New Generation of Ontology Based Data Access. Semantic Web Journal, 2020.
  • Ana Iglesias-Molina, David Chaves-Fraga, Freddy Priyatna, Oscar Corcho: Enhancing the Maintainability of the Bio2RDF project Using Declarative Mappings. 12th International Semantic Web Applications and Tools for Health Care and Life Sciences Conference, 2019.
  • David Chaves-Fraga, Luis Pozo, Jhon Toledo, Edna Ruckhaus, Oscar Corcho: Morph-CSV: Virtual Knowledge Graph Access for Tabular Data. 19th International Semantic Web Conference P&D, 2020.

Authors and Contact

Ontology Engineering Group - Data Integration:

Acknowledgements

The development of Morph-CSV has been supported by the Spanish national project Datos 4.0

morph-csv's People

Contributors

anaigmo, dachafra, dependabot[bot], fpriyatna, jatoledo, ocorcho, w0xter

morph-csv's Issues

Formalization

Based on the constraints, normalize the data when:

  • csvw:separator is defined in the CSVW annotations
  • two TriplesMaps (TM) have the same source
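
One possible reading of the first case: a CSVW column that declares a separator holds a list of values in each cell, so normalizing means producing one row per value. The following is a hedged sketch of that idea only, not Morph-CSV's implementation; the function name and the sample rows are hypothetical.

```python
def split_multivalued(rows, column, separator):
    """Expand each row into one row per value found in a multi-valued cell.

    rows: list of dicts (e.g. produced by csv.DictReader)
    column: name of the column whose cells contain separated values
    separator: the separator string declared for that column
    """
    normalized = []
    for row in rows:
        for value in row[column].split(separator):
            new_row = dict(row)  # copy so the input rows are left untouched
            new_row[column] = value.strip()
            normalized.append(new_row)
    return normalized


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample = [
        {"id": "1", "tags": "bus;metro"},
        {"id": "2", "tags": "tram"},
    ]
    for row in split_multivalued(sample, "tags", ";"):
        print(row)
```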

Implement rr:class support

We can add the rr:class of the subject maps to the generated R2RML by applying a regex to the TTL file: search for the names of the ObjectMaps corresponding to each subject, and include the rdf:type of the subject using the rr:class predicate.

Review why FK are not being applied

For example, in Q9 of GTFS-Madrid-Bench, FKs should be created (they are declared in the CSVW), but the morph-csv output does not contain any PK:
https://morph.oeg.fi.upm.es/demo/morphcsv/run/gtfs/9

"foreignKey": [
          {
            "columnReference": "route_id",
            "reference": {
              "resource": "/data/ROUTES.csv",
              "columnReference": "route_id"
            }
          },
          {
            "columnReference": "shape_id",
            "reference": {
              "resource": "/data/SHAPES.csv",
              "columnReference": "shape_id"
            }
          },
          {
            "columnReference": "service_id",
            "reference": {
              "resource": "/data/CALENDAR.csv",
              "columnReference": "service_id"
            }
          },
          {
            "columnReference": "service_id",
            "reference": {
              "resource": "/data/CALENDAR_DATES.csv",
              "columnReference": "service_id"
            }
          }
        ],
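
When debugging a case like this, it can help to list which foreign keys a table schema actually declares. The sketch below is an illustration with a hypothetical helper name, not part of Morph-CSV. Note that the W3C CSVW metadata vocabulary names this property `foreignKeys`, while the snippet above uses `foreignKey`, so the helper reads either spelling.

```python
def foreign_key_targets(table_schema):
    """Return (column, referenced resource, referenced column) triples.

    Reads the W3C spelling "foreignKeys" first and falls back to the
    "foreignKey" spelling used in the snippet above.
    """
    keys = table_schema.get("foreignKeys") or table_schema.get("foreignKey") or []
    return [
        (
            fk["columnReference"],
            fk["reference"]["resource"],
            fk["reference"]["columnReference"],
        )
        for fk in keys
    ]


# Two of the entries from the snippet above, wrapped in a minimal schema dict.
EXAMPLE_SCHEMA = {
    "foreignKey": [
        {
            "columnReference": "route_id",
            "reference": {"resource": "/data/ROUTES.csv", "columnReference": "route_id"},
        },
        {
            "columnReference": "shape_id",
            "reference": {"resource": "/data/SHAPES.csv", "columnReference": "shape_id"},
        },
    ]
}

if __name__ == "__main__":
    for column, resource, target in foreign_key_targets(EXAMPLE_SCHEMA):
        print(f"{column} -> {resource}#{target}")
```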

Remove duplicates

After applying the rules, remove duplicates from the raw data with the uniq command.
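
Keep in mind that uniq only collapses adjacent duplicate lines, so the data usually has to be sorted first. As an alternative sketch (hypothetical function name, not the project's code), the same effect can be obtained in Python while preserving the original row order:

```python
def drop_duplicate_lines(lines):
    """Return the lines with duplicates removed, keeping first occurrences in order."""
    seen = set()
    unique = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            unique.append(line)
    return unique


if __name__ == "__main__":
    # Hypothetical raw rows, for illustration only.
    raw = ["r1,Line 1", "r2,Line 2", "r1,Line 1"]
    for line in drop_duplicate_lines(raw):
        print(line)
```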
