iherman / canonical_rdf
Proof-of-concept implementation of Aidan Hogan's RDF canonicalization algorithm in node.js
License: Other
Hi,
I am currently trying to understand the algorithms better - your code helps a lot! Thank you very much.
While implementing my own interpretation of the algorithm, I stumbled upon the ordering of graphs, which you have implemented in the Dataset as isSmaller.
As I understood from the paper, the ordering itself was not specified so you had to come up with your own:
G < H if and only if G ⊂ H or there exists a triple t ∈ G \ H such that no triple t' ∈ H \ G exists where t' < t .
You implemented a shortcut with
// If HmG, i.e., H\G is empty then the second predicate is trivially true, so we are also done
if( HmG.length === 0 ) return true;
While your implementation does exactly what you defined, I was wondering, more generally, about the following:
Let G = { (s,p,o), (s,p,x) } be a graph consisting of two triples.
Let H = { (s,p,o) } be a graph consisting of one triple, such that H is a true subgraph of G.
Then:
GmH = {(s,p,x)} and HmG = { }.
Now, isSmaller(G,H) will return true as HmG is empty, just as you defined.
However, isSmaller(H,G) will also return true as H is a true subgraph of G, just as you defined.
Intuitively, I would have expected to first check for true subgraphs in both directions and, only if neither is a true subgraph of the other, to then check whether all triples in GmH are (lexicographically) smaller than all triples in HmG; otherwise G is not smaller than H.
So, in a nutshell, I was expecting a check for whether H is a subgraph of G instead of the shortcut. The rest is fine to my understanding.
So: if( HmG.length === 0 ) return false;
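To make the example concrete, here is a minimal sketch of both variants. It is not the repository's code: the function names difference, isSmallerAsDefined and isSmallerProposed are hypothetical, triples are modelled as plain strings compared lexicographically, and graphs are plain arrays of such strings.

```javascript
// Hypothetical sketch, not the repository's code. Triples are modelled as plain
// strings compared lexicographically, graphs as arrays of such strings.
function difference(a, b) {
  const inB = new Set(b);
  return a.filter((t) => !inB.has(t));
}

// Literal reading of the ordering quoted above:
// G < H iff G is a proper subset of H, or some t in G\H has no smaller t' in H\G.
function isSmallerAsDefined(G, H) {
  const GmH = difference(G, H);
  const HmG = difference(H, G);
  if (GmH.length === 0 && HmG.length > 0) return true; // G is a proper subset of H
  // When HmG is empty this holds vacuously for any t in GmH,
  // which is the case the shortcut covers.
  return GmH.some((t) => !HmG.some((u) => u < t));
}

// Variant suggested in this issue: rule out "H is a subset of G" before comparing.
function isSmallerProposed(G, H) {
  const GmH = difference(G, H);
  const HmG = difference(H, G);
  if (GmH.length === 0 && HmG.length > 0) return true; // G is a proper subset of H
  if (HmG.length === 0) return false; // H is a subset of G (or the graphs are equal)
  return GmH.some((t) => !HmG.some((u) => u < t));
}

const G = ["s p o", "s p x"];
const H = ["s p o"];

console.log(isSmallerAsDefined(G, H), isSmallerAsDefined(H, G)); // true true
console.log(isSmallerProposed(G, H), isSmallerProposed(H, G)); // false true
```

With the literal definition both directions report "smaller", whereas the suggested variant orders H before G only.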
Am I missing something? I would appreciate any clarifications and explanations why I am totally wrong about this :)
Cheers
Christoph
Feeding the output of the tool into RDFlib's rdfpipe yields Turtle with bnodes. The question is whether this was intended, i.e. to assign unique labels to bnodes in a stable way but not to replace them with IRIs that have a path starting with /.well-known/genid/ (as per https://www.w3.org/TR/rdf11-concepts/#section-skolemization)?
Also, what term could be used to distinguish the two? https://www.w3.org/2011/rdf-wg/wiki/Skolemisation left me with the impression that if bnode ids are not replaced with a permanent URI, the process is not really a skolemization.
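To illustrate the difference I mean, here is a rough sketch of what replacing stable bnode labels with skolem IRIs could look like. Everything in it is an assumption for illustration: the helper names skolemizeTerm and skolemize, the label c14n0, the authority http://example.org, and the representation of a quad as an array of N-Quads-style term strings. It is not something the tool provides.

```javascript
// Hypothetical helpers, not part of the tool. A term is an N-Quads-style string;
// blank nodes are the terms that start with "_:".
function skolemizeTerm(term, authority) {
  if (!term.startsWith("_:")) return term; // IRIs and literals stay untouched
  return `<${authority}/.well-known/genid/${term.slice(2)}>`;
}

// A quad is modelled as an array of terms; a dataset as an array of quads.
function skolemize(quads, authority) {
  return quads.map((quad) => quad.map((term) => skolemizeTerm(term, authority)));
}

// Assumed input: canonical labels such as "c14n0" are made up for this example.
const relabelled = [["_:c14n0", "<http://example.org/p>", "_:c14n1"]];
const skolemized = skolemize(relabelled, "http://example.org");
// skolemized[0][0] is "<http://example.org/.well-known/genid/c14n0>"
```

In this sketch the relabelled input still contains bnodes, while the output contains only IRIs, which is the distinction I am asking about.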
Hi @iherman! Thanks for doing this PoC!
Do you think you could run this implementation against some of the tests from the RDF normalization test suite? The expected output will of course differ from that of the existing algorithms, but a number of the tests relabel blank nodes in isomorphic datasets such that the same output should be generated.
It would be especially interesting to see how it does with the "evil" tests:
https://github.com/json-ld/normalization/blob/gh-pages/tests/test044-in.nq
https://github.com/json-ld/normalization/blob/gh-pages/tests/test045-in.nq
https://github.com/json-ld/normalization/blob/gh-pages/tests/test046-in.nq
Each of these should produce the same output.