maastrichtu-cds / datafairifier

A system that supports the creation and validation of mappings and the creation of RDF data from relational data.

License: Apache License 2.0

Shell 15.54% PLpgSQL 2.27% JavaScript 0.64% Jupyter Notebook 17.65% Python 55.32% Dockerfile 8.37% Batchfile 0.20%
fair docker rdf

datafairifier's People

Contributors

bpmweel, fdiblen, jvsoest, martinedevos

datafairifier's Issues

View db table header in Jupyter nb to support SQL query writing

In the Jupyter notebook:
When writing the SQL query, the user also maps the column headers to the required variable names.
It would be convenient to display a list of these column headers in the notebook itself. Otherwise, the user has to open and inspect the database elsewhere.
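One way this could look in a notebook cell is sketched below. It relies on the standard Python DB-API behaviour that `cursor.description` is filled in even when a query returns no rows. The in-memory SQLite table and the `patient` schema are stand-ins chosen so the example runs on its own; against PostgreSQL the same function should work with a DB-API driver connection, but that is an assumption, not something taken from the project.

```python
import sqlite3


def list_columns(connection, table_name):
    """Return the column names of table_name without fetching any rows."""
    cursor = connection.cursor()
    # LIMIT 0 returns no data, but the driver still populates cursor.description.
    # The table name is interpolated directly, so only pass trusted names.
    cursor.execute(f'SELECT * FROM "{table_name}" LIMIT 0')
    return [column[0] for column in cursor.description]


# Stand-in for the real database: a hypothetical in-memory table.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE patient (id INTEGER, age TEXT, gender TEXT)")

print(list_columns(connection, "patient"))  # ['id', 'age', 'gender']
```

Printing this list in the cell above the SQL query would let the user see the available headers while writing the mapping.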

R2RML breaks with NaN values in data

When the age column, stored as a string in PostgreSQL, contains NaN values, the R2RML conversion tool breaks on these values and discards all subsequent rows. As a result, patients are missing from the converted data.
The same happens with NULL values in the database.
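A possible workaround, not confirmed as the project's fix, is to normalise such values before they reach the conversion step. The sketch below maps NULL, NaN, and other non-numeric strings to `None` so they can be skipped or filtered explicitly. The function name `clean_age` and the sample rows are hypothetical.

```python
import math


def clean_age(value):
    """Map NULL, NaN, and non-numeric strings to None; otherwise return an int."""
    if value is None:
        return None
    try:
        number = float(str(value).strip())
    except ValueError:
        return None
    if math.isnan(number) or math.isinf(number):
        return None
    # int() truncates towards zero, e.g. 64.9 becomes 64.
    return int(number)


# Hypothetical rows as they might come out of the age column (stored as text).
rows = [("p1", "64"), ("p2", "NaN"), ("p3", None), ("p4", "71")]
cleaned = [(patient_id, clean_age(age)) for patient_id, age in rows]

print(cleaned)  # [('p1', 64), ('p2', None), ('p3', None), ('p4', 71)]
```

Rows after the NaN entry (here `p4`) are kept, which is the behaviour the issue asks for.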

Test Ontop performance vs Blazegraph

Although we have defined R2RML mappings, we need to test the performance of Ontop in comparison to constructing all triples. Is there a performance gain/reduction?
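A comparison like this could start from a small timing harness such as the sketch below. It only measures wall-clock time for an arbitrary callable. The two placeholder workloads are hypothetical and would need to be replaced by functions that send the same SPARQL query to the Ontop endpoint and to the triple store holding the constructed triples.

```python
import statistics
import time


def time_query(run_query, repeats=5):
    """Call run_query() repeatedly and return timing statistics in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    return {
        "runs": repeats,
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


# Placeholder workloads (hypothetical); swap in real calls to each endpoint.
def query_virtual_graph():
    sum(range(10_000))


def query_materialised_graph():
    sum(range(1_000))


for label, workload in [("virtual", query_virtual_graph),
                        ("materialised", query_materialised_graph)]:
    print(label, time_query(workload, repeats=3))
```

Reporting the median alongside the minimum and maximum makes it easier to spot warm-up effects such as caching on the first run.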

Make data mapping easier using UI

This is possible in Protégé, but that is an Ontop-proprietary solution.

The question is how we can make an R2RML mapping using a graphical tool which helps in selecting the right tables/columns/values.

Start GraphDB with empty db and default r2rml

Currently, the user has to manually create two new repositories in GraphDB when starting the system:

  • One for the database
  • One for the R2RML file

This should be automated and made part of starting GraphDB.

MIA Docker Image

A Docker image containing all MIA micro-services except the UniversalWorker.

ReadMe - move "configuration of the infrastructure"?

The paragraph on configuration of the infrastructure is not really clear to me, @jvsoest: is this targeted at people who attend a workshop or hackathon at Maastro?
Maybe we could move this paragraph to the documentation files, and keep only generic, high-level install, run, and use instructions in the ReadMe.

Automatic import csv data in Virtuoso

We start the Jupyter notebook assuming that the data is already loaded in Virtuoso. For automatic import, we assume there is a CSV file in the same directory as the docker-compose file.
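The discovery half of this could be sketched as below: look for CSV files directly inside a given directory. The function name and the temporary demo directory are hypothetical, and the actual load into Virtuoso is deliberately left out because the issue does not describe it.

```python
import tempfile
from pathlib import Path


def find_csv_files(directory):
    """Return the CSV files directly inside directory, sorted by path."""
    return sorted(
        path for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() == ".csv"
    )


# Demo in a temporary directory standing in for the docker-compose directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("docker-compose.yml", "patients.csv", "notes.txt"):
        (Path(tmp) / name).write_text("")
    found = [path.name for path in find_csv_files(tmp)]

print(found)  # ['patients.csv']
```

If more than one CSV file is found, the import step would need a rule for which file to load, which the issue leaves open.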

Link between one object and two separate string subjects is visualised wrong

I created a link between an object and two different subjects through two separate predicates. However, the visualization function assigns both to the same Literal in this case. See figure below:
[image]

This should be visualized as one object with two separate LITERAL strings, connected by one arrow each.

Blazegraph Docker

We need to reuse/test an available Blazegraph Docker image or create a new one ourselves.

keyDb postgreSQL docker

For the imaging and database pathway we need a Docker image with a PostgreSQL server containing the CAT key database.

MIA Universal Worker Docker Image

To build the Universal Worker for Docker we need to make a Linux-compatible MATLAB JAR.
This should be used to create a Linux UW, which is then stored in a Docker container.
We need to add a Linux build agent to our Bamboo.

load used ontologies in separate repository

Placeholder for enhancement.

Currently the Jupyter notebook reads the ontology to feed the terminology mapping. However, we could also load the ontology into a separate database when importing the R2RML.

Advantages:

  • we don't need to reload the ontology every time
  • we can use the ontology during querying (using SERVICE <…> options in SPARQL)
  • we can use it in the mapping tool, including inferencing on subClassOf* reasoning. Some terminology options are a logical statement between concepts, such as "delineation" AND "specific organ/tissue".
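The second and third advantages could be combined as in the sketch below, which builds a SPARQL 1.1 query that asks a separate ontology repository (through `SERVICE`) for all subclasses of a concept (through the `rdfs:subClassOf*` property path) and then matches instances in the data repository. The endpoint URL and class IRI are hypothetical placeholders, and whether the project's triple store is configured for federated queries is an assumption.

```python
def build_subclass_query(ontology_endpoint, parent_class_iri):
    """Build a federated SPARQL query for instances of a class or its subclasses."""
    return f"""PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?instance ?cls WHERE {{
  SERVICE <{ontology_endpoint}> {{
    ?cls rdfs:subClassOf* <{parent_class_iri}> .
  }}
  ?instance rdf:type ?cls .
}}"""


# Hypothetical endpoint and class IRI, for illustration only.
query = build_subclass_query(
    "http://localhost:7200/repositories/ontology",
    "http://example.org/onto#Delineation",
)
print(query)
```

Because the subclass lookup runs against the ontology repository, the ontology does not have to be reloaded alongside the data each time.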
