kg-construct / rml-cc
RML-CC: Collections and Containers definitions for RML
Home Page: https://w3id.org/rml/cc/spec
License: Creative Commons Attribution 4.0 International
Section 3.2.2 (rml:strategy) defines the following domain and range for rml:strategy:
- The domain of rml:strategy is rml:GatherMap.
- The range of rml:strategy is an IRI.
In general, I think we should be cautious about specifying the domain. In this case, I can imagine that other constructs may also need to specify a strategy, so I think it's best to leave the domain open for this property.
The range is specified to be an IRI. This should be a class. I think it makes sense to define rml:Strategy as a class for the strategy constants.
E.g., the example currently contains identifiers, which may be too complex for a simple example.
Rename rml:Append and rml:CartesianProduct to rml:append and rml:cartesianProduct to stick to the common rule that only class names start with a capital letter.
Would be nice to have test cases covering the complete specification:
I think we had this discussion in the past, but I'd like to bring it up again. I see in this spec that we have rml:allowEmptyListAndContainer. Does this refer to an empty list or container, or to an empty cell/element/object in the list? If the former, should we clarify? If the latter, shouldn't we use the same property as the core specification, so that all term maps are handled in the same way, e.g., rml:allowEmpty?
@chrdebru listed multiple cases of lists:
- 1-collecting-values-from-the-same-term-map
- 2-collecting-values-from-different-term-maps
  - collecting the values from different term maps (simple)
  - collecting values from reference object maps
- 2b-collecting-values-from-different-term-maps-with-multi-valued-term-maps
- 3-processing-empty-collections-and-containers
- 4-nested-collections-and-containers
- 5-collections-and-containers-as-subjects
- 6-identifying-collections-and-containers
I think that, so far, the Excel sheet of kg-construct/rml-core#26 covers only cases 1, 3 and 5; at the very least, it does not cover the cases where lists are generated from different term maps, which still need to be incorporated. I leave the outline of cases here for future reference.
(It also needs to be disambiguated whether term maps are indeed meant, rather than expression maps, or both.)
While we have not yet come up with an example, nothing would prevent us from generating the following:
:a :b :c .
:b a rdf:Bag ;
rdf:_1 :foo ;
rdf:_2 :bar .
The same holds for lists, where the IRI used as the predicate is also the IRI of the first cons-pair. Should we add an example to demonstrate this possibility?
The following file, ./ontology/documentation/sections/introduction-en.html, has two ontology NS prefixes. Shouldn't we use rml: for the second? I do not want to propose a change, as maybe @anaigmo used a WIDOCO config file.
Add a note about rr:column using an example. The section should be self-contained so that it can be removed in case rr:column is removed from the core specification.
This example should be added only when the fields specification is released.
In any case, this issue does not prevent us from releasing a first version of the specification.
See existing example: #10 (comment)
This question was raised by @chrdebru:
How do we handle the generation of null lists?
This issue might be related to kg-construct/rml-core#16.
How do we handle the two cases?
Containers/collections may be generated from different "term maps" or from a single "term map" that returns multiple values. How do we handle the two cases? What are their similarities and differences?
Some classes and properties still use the prefix rr:; replace it with rml:.
Example: rr:TermMap
Create examples about strategies using an example with names (to demonstrate the utility of a cartesian product). Also, note that the cartesian product generates multiple lists/containers; therefore, they are identified by iteration + some "sentinel value."
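As a starting point for such an example, the difference between the two strategies can be sketched in plain Python. This is only an illustration of the intended semantics, not an RML mapping: the function names `gather_append` and `gather_cartesian_product` and the sample names are assumptions made for this sketch.

```python
from itertools import product


def gather_append(*term_map_values):
    """Concatenate the values of every term map into a single list."""
    return [value for values in term_map_values for value in values]


def gather_cartesian_product(*term_map_values):
    """Build one list per combination, taking one value from each term map."""
    return [list(combo) for combo in product(*term_map_values)]


# Hypothetical values produced by two term maps within one iteration.
first_names = ["Ada", "Grace"]
last_names = ["Lovelace", "Hopper"]

appended = gather_append(first_names, last_names)
combined = gather_cartesian_product(first_names, last_names)

print(appended)   # one list holding all four values
print(combined)   # four lists, one per (first name, last name) pair
```

With names, the Cartesian product yields several lists from a single iteration, which is why each of them needs its own identifier.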
Is it me, or can "A rml:GatherMap MAY have exactly one rml:strategy property." be interpreted as "it can have multiple"?
Isn't the following wording more precise: "A rml:GatherMap MUST have at most one rml:strategy property."?
We should follow the proposed standard RFC 9535 JSONPath for all expressions used in examples in the spec and in test cases.
Most notably, this requires all expressions to start with a $, as per https://www.rfc-editor.org/rfc/rfc9535.html#section-2.2.1.
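A quick way to find expressions that need updating is to check for the leading root identifier. The sketch below is a hypothetical helper (`starts_at_root` and the sample expressions are assumptions, not part of the spec or test cases); it checks only this one necessary condition and does not validate the rest of the JSONPath syntax.

```python
def starts_at_root(expression):
    """Return True if the expression begins with the root identifier '$'."""
    return expression.startswith("$")


# Hypothetical expressions as they might appear in examples or test cases.
expressions = ["$.values[*].parentId", "values[*].parentId", "$"]

non_conforming = [e for e in expressions if not starts_at_root(e)]
print(non_conforming)
```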
For my clarification, could you provide an example if I want to convert
{
  "values": [
    {
      "parentId": "a",
      "values": ["1", "2", "3"]
    },
    {
      "parentId": "a",
      "values": ["4", "5", "6"]
    },
    {
      "parentId": "b",
      "values": ["7", "8", "9"]
    }
  ]
}
into
<list/a> rdf:_1 ("1" "2" "3") ; rdf:_2 ("4" "5" "6") .
<list/b> rdf:_1 ("7" "8" "9") .
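Independent of how the RML mapping itself would be written, the requested grouping can be illustrated in plain Python. This is only a sketch of the target output, not an RML-CC mapping: the function names `group_values` and `to_turtle` are hypothetical, and the input is assumed to be valid JSON with a comma after each "parentId" entry.

```python
import json

# The input document from the question, written as valid JSON.
document = json.loads("""
{
  "values": [
    {"parentId": "a", "values": ["1", "2", "3"]},
    {"parentId": "a", "values": ["4", "5", "6"]},
    {"parentId": "b", "values": ["7", "8", "9"]}
  ]
}
""")


def group_values(doc):
    """Group the inner lists by parentId, keeping document order."""
    grouped = {}
    for item in doc["values"]:
        grouped.setdefault(item["parentId"], []).append(item["values"])
    return grouped


def to_turtle(grouped):
    """Render each group as a container whose members are RDF lists."""
    lines = []
    for parent_id, lists in grouped.items():
        members = " ; ".join(
            "rdf:_{} ({})".format(i, " ".join('"{}"'.format(v) for v in values))
            for i, values in enumerate(lists, start=1)
        )
        lines.append("<list/{}> {} .".format(parent_id, members))
    return lines


turtle_lines = to_turtle(group_values(document))
print("\n".join(turtle_lines))
```

The key point is that membership positions (rdf:_1, rdf:_2) are assigned per parentId across iterations, while each inner list is built within a single iteration.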
I'm wondering whether the collection-containers specification extends RML or R2RML. I see that rr:column is used, which gives me the impression that it extends R2RML, but could we phrase it in a way that covers both?
The property gatherAs has as its range one of the following: bag, list, ... It is not the class of bags union lists union ...
<http://w3id.org/rml/gatherAs> rdf:type owl:ObjectProperty ;
    rdfs:domain <http://w3id.org/rml/GatherMap> ;
    rdfs:range [ rdf:type owl:Class ;
                 owl:oneOf ( rdf:Alt
                             rdf:Bag
                             rdf:List
                             rdf:Seq
                           )
               ] ;
    rdfs:comment "Relates a GatherMap with the desired result type of collection or container."@en ;
    rdfs:isDefinedBy <http://w3id.org/rml/cc/> ;
    rdfs:label "gather as" .
This is minor, but just here to keep track and not forget: https://https:// typos.
New structure:
- spec, with all the resources for the specification (the current content of the repo)
- ontology, which I think is coming from #31
- shapes, coming from #32
- test-cases, for #33
Other changes:
- rml-cc
@frmichel proposed the ability to change the iterator inside a term map. This would allow one to "manipulate" the "input" prior to applying the term map. A use case could be to flatten a list of values, for instance. Examples were included in the context of RML containers and collections, BUT I believe that this proposal would need to be discussed in the RML spec. What do you think?
I'm wondering whether it would make sense to include a visual overview of this extension, to get the gist very quickly, something like below:
classDiagram
    direction BT
    class GatherMap {
        @type TermMap
        TermMap[] rml:gather 1
        [rml:Append rml:CartesianProduct] rml:strategy 0..1 rml:Append
        [rdf:Seq, rdf:Bag, rdf:Alt, rdf:List] rml:gatherAs 1
        xsd:boolean rml:allowEmptyListAndContainer 0..1 "false"
    }
Question posed by @chrdebru.
It is not explicitly mentioned in the spec, but in most cases it is a blank node.
There are cases of RDF containers that have IRIs, but it is not common.
Add examples of templates and references used to identify lists and containers that lead to invalid lists and containers but are still valid RDF. These examples are useful to indicate the pitfalls.
In the case of the source/target spec, we use a different namespace, but here we still use rml. What would be the best strategy? I leave it as a comment here, but whatever we decide should hold for all specs.
This spec is copied from the Target one, but keeps the goatcounter URL in dev.html to count visitors.
https://github.com/kg-construct/collection-containers-spec/blob/main/dev.html#L209
Goatcounter is a privacy safe alternative for Google Analytics.
Please update the URL otherwise the visitor counter will be wrong soon :)
@andimou @chrdebru @dachafra @pmaria, here is a summary of what I tried to explain during the call.
In the current specification, we have assumed that creating a list/container through two separate iterations would yield the property rdf:first/rdf:_1 twice, leading to an ill-formed collection/container. The given example generates this:
_:b0 rdf:first 1 , 3 ;
    rdf:rest ( 2 ) ;
    rdf:rest ( 4 ) .
whereas we wanted to generate this:
_:b0 rdf:first 1 ;
    rdf:rest ( 2 3 4 ) .
Nevertheless, I wonder whether this is a conceptual issue or just an implementation issue. For now, I'd favor the latter: I think this is a question about how the RDF library that is being used will behave.
If so, instead of assuming that the implementation will behave wrongly, the specification could simply state what the right behavior must be: whenever, during an iteration, we create a collection/container that happens to already exist (the head node IRI or blank node id already exists), the processor needs to append the term(s) to the existing ones.
Is this feasible in terms of implementation, or am I missing a conceptual hurdle?
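The proposed behavior can be sketched as follows, assuming the processor keeps an in-memory index from head node to members. The names `add_to_collection` and `to_cons_cells` and the dictionary index are hypothetical and not taken from any RML processor; the sketch only illustrates that appending to an existing head node yields a single well-formed list.

```python
def add_to_collection(index, head, terms):
    """Append terms to the collection identified by head, creating it if needed."""
    index.setdefault(head, []).extend(terms)
    return index


def to_cons_cells(head, members):
    """Serialize a non-empty member list as (node, rdf:first, rdf:rest) cells."""
    cells = []
    node = head
    for position, member in enumerate(members):
        is_last = position == len(members) - 1
        # Hypothetical blank node labels derived from the head node.
        rest = "rdf:nil" if is_last else "{}_{}".format(head, position + 1)
        cells.append((node, member, rest))
        node = rest
    return cells


index = {}
add_to_collection(index, "_:b0", [1, 2])  # first iteration
add_to_collection(index, "_:b0", [3, 4])  # second iteration, same head node

cells = to_cons_cells("_:b0", index["_:b0"])
for node, first, rest in cells:
    print("{} rdf:first {} ; rdf:rest {} .".format(node, first, rest))
```

Each node ends up with exactly one rdf:first and one rdf:rest, which is the property the ill-formed output above violates. A real processor would have to defer serialization until all iterations have run, or update triples already emitted.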
https://kg-construct.github.io/rml-resources/portal/
- rml:cartessianProduct is misspelled: should be rml:cartesianProduct (one s)
- rml:Append should be rml:append (lowercase)
- rml:CartesianProduct should be rml:cartesianProduct (lowercase and one s)
README still points to the Target spec