rdfio / rdf2smw

Convert RDF to Semantic MediaWiki facts in MediaWiki XML format, with a standalone commandline tool

License: MIT License

rdf mediawiki xml-format rdf-data commandline rdf-triples semantic-mediawiki smw

rdf2smw's Introduction

RDFIO Extension for Semantic MediaWiki


Screenshot of the SPARQL Endpoint shipped with RDFIO

Updates

Sep 4, 2017: Our paper on RDFIO was just published! If you use RDFIO in scientific work, please cite:
Lampa S, Willighagen E, Kohonen P, King A, Vrandečić D, Grafström R, Spjuth O
RDFIO: extending Semantic MediaWiki for interoperable biomedical data management
Journal of Biomedical Semantics. 8:35 (2017). DOI: 10.1186/s13326-017-0136-y.

Introduction

This extension extends the RDF import and export functionality in Semantic MediaWiki by providing import of arbitrary RDF triples (not only OWL ontologies, as before; see Ontology import) and a SPARQL endpoint that allows write operations.

Technically, RDFIO builds on the PHP/MySQL-based triple store (and its accompanying SPARQL endpoint) provided by the ARC2 library. For updating wiki pages with new triples on import or SPARQL update, the WOM extension is used.

The RDF import stores the original URI of every imported RDF entity (in a special property), which the SPARQL endpoint can later use instead of SMW's internal URIs. This makes it possible to expose the imported RDF data "in its original format", with its original URIs, and thus to use SMW as a collaborative RDF editor in workflows with other semantic tools: data can be exported, collaboratively edited, and imported again, to and from SMW.

This extension was initially developed as part of a Google Summer of Code 2010 project, and further extended as part of a FOSS OPW 2014 project.

Installation

Easiest: Use the ready-made Virtual Machine

The absolute easiest way to try out RDFIO is to import the Ready-made Virtual Machine with RDFIO 3.0.2 (with MW 1.29 and SMW 2.5) into VirtualBox or VMWare, and just start browsing the local wiki installation.

Steps:

  1. Download the .ova file from doi.org/10.6084/m9.figshare.5383966.v1
  2. In VirtualBox (should be similar in VMWare), select "File > Import appliance"
  3. Click the folder icon
  4. Locate the .ova file you downloaded
  5. Click "Next", "Agree" to the license, and finally "Import", to start the import
  6. Start the virtual machine
  7. Click "Log in" (no password required)
  8. Click the icon on the desktop
  9. A browser opens with the local, RDFIO-enabled wiki installation!
  10. Enjoy!

Easy: Vagrant box

Another fairly easy option is the RDFIO Vagrant box, which automatically sets up MediaWiki, Semantic MediaWiki and RDFIO in a virtual machine in under 20 minutes.

Medium-hard: Install semi-manually using composer

Install dependencies

$smwgShowFactbox = SMW_FACTBOX_NONEMPTY;

Installation steps

Assuming you have followed the steps above to install the dependencies for RDFIO:

  1. Install RDFIO by executing the following commands in a terminal:

    cd <wiki_folder>
    composer require rdfio/rdfio --update-no-dev
  2. Log in to your wiki as a super user

  3. Browse to http://[your-domain]/wiki/Special:RDFIOAdmin

  4. Click the "Setup" button to set up ARC2 database tables.

  5. If you already have semantic annotations in your wiki, go to the "Special:SMWAdmin" page, click "Start updating data", and let it complete, so that the existing data becomes available in the SPARQL endpoint.

Optional but recommended steps

  • Edit the MediaWiki:Sidebar page and add the following wiki snippet as an extra menu (I usually place it just before the "* SEARCH" line). This will give you links to the main RDFIO functionality in the left sidebar of the wiki:

    * Semantic Tools
    ** Special:RDFIOAdmin|RDFIO Admin
    ** Special:RDFImport|RDF Import
    ** Special:SPARQLEndpoint|SPARQL Endpoint
    ** Special:SPARQLImport|SPARQL Import
    
  • Create the article "MediaWiki:Smw_uri_blacklist" and make sure it is empty (you might need to add some nonsense content like {{{<!--empty-->}}}).

Test that it works

  • Access the SPARQL endpoint at http://[url-to-your-wiki]/Special:SPARQLEndpoint
  • Access the RDF Import page at http://[url-to-your-wiki]/Special:RDFImport
  • Access the SPARQL Import page at http://[url-to-your-wiki]/Special:SPARQLImport
  • Optionally, to really see that it works, try adding some semantic data to wiki pages, then check the database (e.g. with phpMyAdmin) to verify that triples appear in the table named arc2store_triple.
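Beyond clicking through the special pages, you can also probe the SPARQL endpoint from a script. A minimal Go sketch of building such a request URL; the `query` parameter name follows the SPARQL 1.1 Protocol convention, and the hostname is a placeholder — check your wiki's Special:SPARQLEndpoint form for the exact parameter names it accepts:

```go
package main

import (
	"fmt"
	"net/url"
)

// buildSPARQLURL builds a GET request URL for a SPARQL endpoint.
// The "query" parameter name is the SPARQL 1.1 Protocol convention;
// verify it against your wiki's actual endpoint form.
func buildSPARQLURL(endpoint, query string) string {
	v := url.Values{}
	v.Set("query", query)
	return endpoint + "?" + v.Encode()
}

func main() {
	u := buildSPARQLURL(
		"http://localhost/wiki/Special:SPARQLEndpoint", // hypothetical wiki URL
		"SELECT * WHERE { ?s ?p ?o } LIMIT 10",
	)
	fmt.Println(u)
}
```

Fetching the resulting URL with any HTTP client should then return the query results, if the endpoint is set up correctly.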

Additional configuration

These configuration options go into your LocalSettings.php file and can be adjusted to your specific use case. Below is a template with the default options, which you can copy into LocalSettings.php and modify to your liking:

# ---------------------------------------------------------------
#  RDFIO Configuration
# ---------------------------------------------------------------
# An associative array with base uris as keys and corresponding 
# prefixes as the items. Example:
# array( 
#       "http://example.org/someOntology#" => "ont1",
#       "http://example.org/anotherOntology#" => "ont2"
#      );
# $rdfiogBaseURIs = array();
# ---------------------------------------------------------------
# Query by / output equivalent URIs in the SPARQL Endpoint
# (overrides settings in the HTML form)
# 
# $rdfiogQueryByEquivURI = false;
# $rdfiogOutputEquivURIs = false;
#
# $rdfiogTitleProperties = array(
#  'http://semantic-mediawiki.org/swivt/1.0#page',
#  'http://www.w3.org/2000/01/rdf-schema#label',
#  'http://purl.org/dc/elements/1.1/title',
#  'http://www.w3.org/2004/02/skos/core#preferredLabel',
#  'http://xmlns.com/foaf/0.1/name',
#  'http://www.nmrshiftdb.org/onto#spectrumId'
# );
# ---------------------------------------------------------------
# Allow edit operations via SPARQL from remote services
#
# $rdfiogAllowRemoteEdit = false;
# ---------------------------------------------------------------

Dependencies

Known limitations

Bugs, feature requests and contact information

Please report bugs and feature requests in the issue tracker here on GitHub.

Links

Related work

rdf2smw's People

Contributors: samuell

rdf2smw's Issues

Use as specific a template name as possible

If a page is categorized into multiple categories, try to use the most specific one, e.g. by checking whether there is an owl:subClassOf or rdf:type relation between any of the categories, and choosing the most "sub-classy" of them!
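One way to approach this: treat the subclass relations between the candidate categories as a directed graph, and pick a candidate that no other candidate lists as its superclass. A minimal Go sketch (the function name and data layout are hypothetical, not from the rdf2smw codebase, and it only considers direct owl:subClassOf links, not the transitive closure):

```go
package main

import "fmt"

// mostSpecific picks, from a page's categories, one that no other
// candidate declares as its superclass. superOf maps a category to
// its direct superclasses (e.g. collected from owl:subClassOf triples).
func mostSpecific(categories []string, superOf map[string][]string) string {
	if len(categories) == 0 {
		return ""
	}
	// Mark every category that appears as someone's superclass.
	isSuper := map[string]bool{}
	for _, c := range categories {
		for _, s := range superOf[c] {
			isSuper[s] = true
		}
	}
	// The first candidate nobody points to as a superclass wins.
	for _, c := range categories {
		if !isSuper[c] {
			return c
		}
	}
	return categories[0] // no subclass info: fall back to the first
}

func main() {
	superOf := map[string][]string{"Dog": {"Animal"}}
	fmt.Println(mostSpecific([]string{"Animal", "Dog"}, superOf)) // → Dog
}
```

A fuller implementation would also follow owl:subClassOf chains transitively before comparing candidates.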

Missing Triples and Export Ontology

Hi,
I have two issues:

  1. When I convert the ontology file into XML files using rdf2smw, not all of the triples from the ontology are included. I also tried one of the suggestions from the issues in this repository, sorting the ontology file before conversion with cat triples.nt | sort -k2,2 -k1,1 > sorted.triples.nt, but it didn't help in my case.
  2. Is there a way to export the imported ontology from SMW?

Here is the ontology file: Ontology

XML batch import failure

I would like to import a list of RDF files into my Semantic MediaWiki. As instructed in the guide, I have converted my triples into XML files. When I attempt to import the first XML file (templates) using the command php wiki/htdocs/maintenance/importDump.php semantic_mediawiki_pages_templates.xml, I get the following error message:

[e52017568cfd4bfeb7a15a95] [no req]   Wikimedia\Rdbms\DBConnectionError from line 769 of /wiki/htdocs/includes/libs/rdbms/database/Database.php: Cannot access the database: No such file or directory (localhost:3306)
Backtrace:
#0 /wiki/htdocs/includes/libs/rdbms/loadbalancer/LoadBalancer.php(995): Wikimedia\Rdbms\Database->reportConnectionError(string)
#1 /wiki/htdocs/includes/libs/rdbms/loadbalancer/LoadBalancer.php(666): Wikimedia\Rdbms\LoadBalancer->reportConnectionError()
#2 /wiki/htdocs/includes/GlobalFunctions.php(3062): Wikimedia\Rdbms\LoadBalancer->getConnection(integer, array, boolean)
#3 /wiki/htdocs/includes/cache/localisation/LCStoreDB.php(45): wfGetDB(integer)
#4 /wiki/htdocs/includes/cache/localisation/LocalisationCache.php(414): LCStoreDB->get(string, string)
#5 /wiki/htdocs/includes/cache/localisation/LocalisationCache.php(460): LocalisationCache->isExpired(string)
#6 /wiki/htdocs/includes/cache/localisation/LocalisationCache.php(336): LocalisationCache->initLanguage(string)
#7 /wiki/htdocs/includes/cache/localisation/LocalisationCache.php(273): LocalisationCache->loadItem(string, string)
#8 /wiki/htdocs/languages/Language.php(600): LocalisationCache->getItem(string, string)
#9 /wiki/htdocs/includes/title/MediaWikiTitleCodec.php(86): Language->needsGenderDistinction()
#10 /wiki/htdocs/includes/title/MediaWikiTitleCodec.php(217): MediaWikiTitleCodec->getNamespaceName(integer, string)
#11 /wiki/htdocs/includes/cache/LinkCache.php(234): MediaWikiTitleCodec->getPrefixedDBkey(Title)
#12 /wiki/htdocs/includes/Title.php(3233): LinkCache->addLinkObj(Title)
#13 /wiki/htdocs/includes/Title.php(4320): Title->getArticleID(integer)
#14 /wiki/htdocs/extensions/SemanticMediaWiki/src/PermissionPthValidator.php(80): Title->exists()
#15 /wiki/htdocs/extensions/SemanticMediaWiki/src/PermissionPthValidator.php(57): SMW\PermissionPthValidator->checkUserPermissionOn(Title, User, string, array)
#16 /wiki/htdocs/extensions/SemanticMediaWiki/src/MediaWiki/Hooks/HookRegistry.php(544): SMW\PermissionPthValidator->checkQuickPermissionOn(Title, User, string, array)
#17 [internal function]: SMW\MediaWiki\Hooks\HookRegistry->SMW\MediaWiki\Hooks\{closure}(Title, User, string, array, boolean, boolean)
#18 /wiki/htdocs/includes/Hooks.php(186): call_user_func_array(Closure, array)
#19 /wiki/htdocs/includes/Title.php(1987): Hooks::run(string, array)
#20 /wiki/htdocs/includes/Title.php(2533): Title->checkQuickPermissions(string, User, array, string, boolean)
#21 /wiki/htdocs/includes/Title.php(1936): Title->getUserPermissionsErrorsInternal(string, User, string, boolean)
#22 /wiki/htdocs/includes/import/WikiImporter.php(1068): Title->userCan(string)
#23 /wiki/htdocs/includes/import/WikiImporter.php(755): WikiImporter->processTitle(string, string)
#24 /wiki/htdocs/includes/import/WikiImporter.php(577): WikiImporter->handlePage()
#25 /wiki/htdocs/maintenance/importDump.php(322): WikiImporter->doImport()
#26 /wiki/htdocs/maintenance/importDump.php(269): BackupReader->importFromHandle(resource)
#27 /wiki/htdocs/maintenance/importDump.php(106): BackupReader->importFromFile(string)
#28 /wiki/htdocs/maintenance/doMaintenance.php(111): BackupReader->execute()
#29 /wiki/htdocs/maintenance/importDump.php(327): require_once(string)
#30 {main}

Can someone please help me understand this error message?

Thanks,
TT


List of tools that I am using:

  • Apache Web Server - Port 8080
  • MediaWiki 1.29.1
  • Semantic MediaWiki 2.5.4
  • PHP 7.0.22
  • MySQL 5.6.37 - Port 3306
  • RDFIO 3.0.2
  • rdf2smw v0.6

Create data listing pages for categories

For each category, one could create, either on the category page itself or on a page named "List of {{{category}}}s", a listing with the following content:

{{#ask:[[Category:Compound]]
 |?HaspKa
 |?HasDOIUrl
 |format=table
 |link=all
 |headers=show
 |searchlabel=... further results
 |class=sortable wikitable smwtable
 |sort=HaspKa
 |limit=250
}}

Auto-create template pages

Template pages could be auto-created with all the facts that are written through them, since they follow a fairly standard pattern:

... table code ...
!propertyname
|[[propertyname::{{{propertyname|}}}]]
... more table code ...

... or possibly:

... table code ...
!propertyname
|{{#arraymap:{{{propertyname|}}}|,|x|[[propertyname::x]]|,}}
... more table code ...

... to account for multi-argument variables.

Allow "assume sorted" option

We could use a slightly different processing algorithm if we can assume the input data is sorted (for N-Triples files, sorting is much more efficiently done with a plain text sorting tool anyway); this would require far less memory and probably be faster.

TripleAggregator fails nondeterministically (race condition?)

How to reproduce:

  1. run go test multiple times.

Result:

Sometimes:

$ go test

--- FAIL: TestTripleAggregator (0.00s)
        tripleaggregator_test.go:55: Subject of first aggregate is wrong
        tripleaggregator_test.go:60: Subject in triple 1 of first aggregate is wrong
        tripleaggregator_test.go:63: Subject in triple 1 of first aggregate is wrong
        tripleaggregator_test.go:66: Subject in triple 1 of first aggregate is wrong
        tripleaggregator_test.go:60: Subject in triple 2 of first aggregate is wrong
        tripleaggregator_test.go:63: Subject in triple 2 of first aggregate is wrong
        tripleaggregator_test.go:66: Subject in triple 2 of first aggregate is wrong
        tripleaggregator_test.go:60: Subject in triple 3 of first aggregate is wrong
        tripleaggregator_test.go:63: Subject in triple 3 of first aggregate is wrong
        tripleaggregator_test.go:66: Subject in triple 3 of first aggregate is wrong
        tripleaggregator_test.go:72: Subject of second aggregate is wrong
        tripleaggregator_test.go:77: Subject in triple 4 of second aggregate is wrong
        tripleaggregator_test.go:80: Subject in triple 4 of second aggregate is wrong
        tripleaggregator_test.go:83: Subject in triple 4 of second aggregate is wrong
        tripleaggregator_test.go:77: Subject in triple 5 of second aggregate is wrong
        tripleaggregator_test.go:80: Subject in triple 5 of second aggregate is wrong
        tripleaggregator_test.go:83: Subject in triple 5 of second aggregate is wrong
        tripleaggregator_test.go:77: Subject in triple 6 of second aggregate is wrong
        tripleaggregator_test.go:80: Subject in triple 6 of second aggregate is wrong
        tripleaggregator_test.go:83: Subject in triple 6 of second aggregate is wrong
FAIL
exit status 1
FAIL    github.com/samuell/rdf2smw      0.006s

Expected result:

Always:

$ go test

PASS
ok      github.com/samuell/rdf2smw      0.005s
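A plausible cause for this kind of flakiness is emitting aggregates from a Go map, whose iteration order is deliberately randomized between runs; whether that is the actual bug here is an assumption. One way to make the output deterministic is to sort the subjects before emitting, as in this sketch (types and names are illustrative, not the real TripleAggregator):

```go
package main

import (
	"fmt"
	"sort"
)

type Triple struct{ Subj, Pred, Obj string }

// aggregateDeterministic groups triples by subject and emits the groups
// in sorted subject order, so repeated runs produce identical output
// even though Go map iteration order is randomized.
func aggregateDeterministic(triples []Triple) [][]Triple {
	bySubj := map[string][]Triple{}
	for _, t := range triples {
		bySubj[t.Subj] = append(bySubj[t.Subj], t)
	}
	subjs := make([]string, 0, len(bySubj))
	for s := range bySubj {
		subjs = append(subjs, s)
	}
	sort.Strings(subjs) // fixed order instead of random map order
	out := make([][]Triple, 0, len(subjs))
	for _, s := range subjs {
		out = append(out, bySubj[s])
	}
	return out
}

func main() {
	groups := aggregateDeterministic([]Triple{
		{"b", "p", "x"}, {"a", "p", "y"}, {"a", "q", "z"},
	})
	for _, g := range groups {
		fmt.Println(g[0].Subj, len(g)) // → "a 2" then "b 1", every run
	}
}
```

Alternatively, the test could compare aggregates as sets rather than relying on emission order.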
