
pyshacl's Introduction

RDFLib

RDFLib is a pure Python package for working with RDF. RDFLib contains most things you need to work with RDF, including:

  • parsers and serializers for RDF/XML, N3, NTriples, N-Quads, Turtle, TriX, Trig and JSON-LD
  • a Graph interface which can be backed by any one of a number of Store implementations
  • store implementations for in-memory, persistent on disk (Berkeley DB) and remote SPARQL endpoints
  • a SPARQL 1.1 implementation - supporting SPARQL 1.1 Queries and Update statements
  • SPARQL function extension mechanisms

RDFlib Family of packages

The RDFlib community maintains many RDF-related Python code repositories with different purposes. For example:

  • rdflib - the RDFLib core
  • sparqlwrapper - a simple Python wrapper around a SPARQL service to remotely execute your queries
  • pyLODE - An OWL ontology documentation tool using Python and templating, based on LODE.
  • pyrdfa3 - RDFa 1.1 distiller/parser library: can extract RDFa 1.1/1.0 from (X)HTML, SVG, or XML in general.
  • pymicrodata - A module to extract RDF from an HTML5 page annotated with microdata.
  • pySHACL - A pure Python module which allows for the validation of RDF graphs against SHACL graphs.
  • OWL-RL - A simple implementation of the OWL2 RL Profile which expands the graph with all possible triples that OWL RL defines.

Please see the list for all packages/repositories here: https://github.com/RDFLib/

Help with maintenance of all of the RDFLib family of packages is always welcome and appreciated.

Versions & Releases

See https://rdflib.dev for the release overview.

Documentation

See https://rdflib.readthedocs.io for our documentation built from the code. Note that there are latest, stable 5.0.0 and 4.2.2 documentation versions, matching releases.

Installation

The stable release of RDFLib may be installed with Python's package management tool pip:

$ pip install rdflib

Some features of RDFLib require optional dependencies which may be installed using pip extras:

$ pip install rdflib[berkeleydb,networkx,html,lxml]

Alternatively, manually download the package from the Python Package Index (PyPI) at https://pypi.python.org/pypi/rdflib

The current version of RDFLib is 7.0.0, see the CHANGELOG.md file for what's new in this release.

Installation of the current main branch (for developers)

With pip you can also install rdflib from the git repository with one of the following options:

$ pip install git+https://github.com/rdflib/rdflib@main

or

$ pip install -e git+https://github.com/rdflib/rdflib@main#egg=rdflib

or from your locally cloned repository you can install it with one of the following options:

$ poetry install  # installs into a poetry-managed venv

or

$ pip install -e .

Getting Started

RDFLib aims to be a pythonic RDF API. RDFLib's main data object is a Graph, which is a Python collection of RDF subject, predicate, object triples.

To create a graph, load it with RDF data from DBpedia, and then print the results:

from rdflib import Graph
g = Graph()
g.parse('http://dbpedia.org/resource/Semantic_Web')

for s, p, o in g:
    print(s, p, o)

The components of the triples are URIs (resources) or Literals (values).

URIs are grouped together by namespace; common namespaces are included in RDFLib:

from rdflib.namespace import DC, DCTERMS, DOAP, FOAF, SKOS, OWL, RDF, RDFS, VOID, XMLNS, XSD

You can use them like this:

from rdflib import Graph, URIRef, Literal
from rdflib.namespace import FOAF, RDFS, XSD

g = Graph()
semweb = URIRef('http://dbpedia.org/resource/Semantic_Web')
# g.value returns None until g holds data, e.g. after g.parse(...)
label = g.value(semweb, RDFS.label)

Where RDFS is the RDFS namespace, XSD the XML Schema Datatypes namespace and g.value returns an object of the triple-pattern given (or an arbitrary one if multiple exist).

Or like this, adding a triple to a graph g:

g.add((
    URIRef("http://example.com/person/nick"),
    FOAF.givenName,
    Literal("Nick", datatype=XSD.string)
))

The triple (in n-triples notation) <http://example.com/person/nick> <http://xmlns.com/foaf/0.1/givenName> "Nick"^^<http://www.w3.org/2001/XMLSchema#string> . is created where the property FOAF.givenName is the URI <http://xmlns.com/foaf/0.1/givenName> and XSD.string is the URI <http://www.w3.org/2001/XMLSchema#string>.

You can bind namespaces to prefixes to shorten the URIs for RDF/XML, Turtle, N3, TriG, TriX & JSON-LD serializations:

g.bind("foaf", FOAF)
g.bind("xsd", XSD)

This will allow the n-triples triple above to be serialised like this:

print(g.serialize(format="turtle"))

With these results:

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://example.com/person/nick> foaf:givenName "Nick"^^xsd:string .

New Namespaces can also be defined:

from rdflib import Namespace

dbpedia = Namespace('http://dbpedia.org/ontology/')

# keep only the English-language abstracts
abstracts = [x for x in g.objects(semweb, dbpedia['abstract']) if x.language == 'en']

See also ./examples

Features

The library contains parsers and serializers for RDF/XML, N3, NTriples, N-Quads, Turtle, TriX, JSON-LD, RDFa and Microdata.

The library presents a Graph interface which can be backed by any one of a number of Store implementations.

This core RDFLib package includes store implementations for in-memory storage and persistent storage on top of the Berkeley DB.

A SPARQL 1.1 implementation is included - supporting SPARQL 1.1 Queries and Update statements.

RDFLib is open source and is maintained on GitHub. RDFLib releases, current and previous, are listed on PyPI.

Multiple other projects are contained within the RDFlib "family"; see https://github.com/RDFLib/.

Running tests

Running the tests on the host

Run the test suite with pytest.

poetry install
poetry run pytest

Running test coverage on the host with coverage report

Run the test suite and generate an HTML coverage report with pytest and pytest-cov.

poetry run pytest --cov

Viewing test coverage

Once tests have produced HTML output of the coverage report, view it by running:

poetry run pytest --cov --cov-report term --cov-report html
python -m http.server --directory=htmlcov

Contributing

RDFLib survives and grows via user contributions! Please read our contributing guide and developers guide to get started. Please consider lodging Pull Requests here:

To get a development environment consider using Gitpod or Google Cloud Shell.

You can also raise issues here:

Support & Contacts

For general "how do I..." queries, please use https://stackoverflow.com and tag your question with rdflib. Existing questions:

If you want to contact the rdflib maintainers, please do so via:

pyshacl's Issues

Add new option for passing in an ontology specification document

It can sometimes (often) be the case that the combination of the SHACL Shape file and the Data File together do not give the pySHACL validation engine enough information to generate a correct validation result, even if inferencing is run across the input data file.

For example:

  1. I have a shape file which asserts that for all instances of the class Human, if they have a property called hasPet, the target object of that property must be an instance of the class Animal.

  2. I have a data file containing statements:

  • Person1 Instance of Human named "Amy", she has a property hasPet with the target Pet1.
  • Pet1 Instance of Lizard named "Sebastian"

If I run the validator across those inputs, it will return a validation result indicating failure because the pet is not of type Animal. Even if inferencing is run on the data file, there is no way for the validator to know that Lizard is a subclass of Animal, so the validation still returns that failure.

In order for this validation to work, there needs to be a statement of (Lizard, rdfs:subClassOf, Animal) included in the data file before submitting it to the validator, and basic RDFS inferencing must be run on the data graph before validating, to ensure the (Pet1, rdf:type, Animal) triple is created in the data graph.

This is a very simple example but hopefully highlights the problem faced, where any extra ontological information required for inferencing needs to be added into the data file before passing it to the validator. This is inconvenient because in most practical applications of pySHACL, the data file is an isolated data snippet, without any accompanying ontological information.

It is sometimes the case that extra ontological information is added into the SHACL Shape file, or indeed that the SHACL Shapes are included as part of an ontology document itself. This does not help in this situation, because the file passed into the validator and parsed into the SHACL Shapes graph does not get mixed into the data graph, so those extra ontological statements do not take effect in the inferencing step on the data graph (and inferencing is never applied to the SHACL graph).

I propose an extra feature for pySHACL where you can optionally specify the location to an extra static ontology document, which gets ingested and mixed into the data graph prior to the inferencing step.

This will be a new feature in the python module, and exposed as an option on the command line tool, and as an optional field on the web tool.

Command-line use does not work in Windows

The path is not getting interpreted correctly:
file://c:\my\full\path\test.ttl/ does not look like a valid URI, trying to serialize this will break.

I've tried with forward slashes, backslashes, no slashes (all files in current directory), full filespec (with and without c:), etc. Couldn't get any of them to work.

Windows 10.
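
The error message above shows a URI built from file:// plus the raw backslash path. A well-formed file URI for a Windows drive path uses three slashes and forward slashes, and Python's standard pathlib can produce one. This is only a sketch of the conversion (windows_path_to_file_uri is a hypothetical helper); whether the command-line tool accepts the result is not confirmed on this page.

```python
from pathlib import PureWindowsPath


def windows_path_to_file_uri(path: str) -> str:
    """Convert an absolute Windows path into a file: URI."""
    # PureWindowsPath applies Windows path rules on any operating system.
    return PureWindowsPath(path).as_uri()


uri = windows_path_to_file_uri(r"c:\my\full\path\test.ttl")
print(uri)
```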

FocusNode and ValueNode in ReportGraph should be able to point to same BNode

Related to #55
When building the report graph, if a validation result's valueNode and focusNode are the same Blank Node, they never end up with the same ID in the Report Graph. In the current implementation they are copied over separately, so they become two new blank nodes. I could add a simple check: if they are the same node, copy it into the report graph only once and use that copy for both valueNode and focusNode, so that they share the same ID.

cannot import name 'convert_graph' from 'owlrl'

Hey out there.
I get a really confusing error while trying to set up pySHACL.

I just try to import from pyshacl import validate in a python script and get the following error.

Traceback (most recent call last):
  File ".\shaclCheck.py", line 1, in <module>
    from pyshacl import validate
  File "C:\Python37x64\lib\site-packages\pyshacl\__init__.py", line 3, in <module>
    from pyshacl.validate import validate, Validator
  File "C:\Python37x64\lib\site-packages\pyshacl\validate.py", line 5, in <module>
    import owlrl
  File "C:\Python37x64\Scripts\owlrl.py", line 4, in <module>
    from owlrl import convert_graph, RDFXML, TURTLE, JSON, AUTO, RDFA
ImportError: cannot import name 'convert_graph' from 'owlrl' (C:\Python37x64\Scripts\owlrl.py)

I think I set up all the Path variables related to the packages, but I can't get this error fixed.

My system is a Windows 10 machine. I don't know if this causes the problem?
Can somebody help me here?

Best regards
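
The last frame of the traceback shows the name owlrl being resolved to C:\Python37x64\Scripts\owlrl.py, a script, rather than to the installed owlrl package, which would explain why convert_graph cannot be imported from it. A quick way to check which file an import name resolves to, using only the standard library (module_origin is a hypothetical helper name):

```python
import importlib.util


def module_origin(name: str):
    """Return the file a top-level import of `name` would load, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# A standard-library package resolves to its package __init__ file.
print(module_origin("json"))
# On an affected machine this would print the path of the shadowing script.
print(module_origin("owlrl"))
```

If the printed path points into a Scripts directory instead of site-packages, the script is shadowing the package.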

improve error message when using sh:ignoredProperties without sh:closed

This is clearly a low priority, minor issue, but considering the amount of time I spent trying to figure out what was wrong with my rather large SHACL data, I thought it was worth considering and suggesting a change.

A minimal example to illustrate the issue is:

import rdflib
from pyshacl import validate

data = """
@prefix asdf: <http://example.org/asdf/> .
@prefix ex: <http://example.org/> .

asdf:e2e a ex:termA ;
    ex:child asdf:23e .

asdf:23e a ex:termB .
"""

shaclData = """
@prefix ex: <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

ex:termShape a sh:NodeShape ;
    sh:ignoredProperties ( rdf:type ) ;
    sh:targetClass ex:termB .
"""

dataGraph = rdflib.Graph().parse( data = data, format = 'ttl' )
shaclGraph = rdflib.Graph().parse( data = shaclData, format = 'ttl' )
report = validate( dataGraph, shacl_graph = shaclGraph, abort_on_error = False, meta_shacl = False, debug = False, advanced = True, do_owl_imports = True )

This generates what I found to be a confusing error message:

ConstraintLoadError: ClosedConstraintComponent must have at least one sh:closed predicate.
https://www.w3.org/TR/shacl/#ClosedConstraintComponent

The issue is that when using sh:ignoredProperties, sh:closed is expected.

pySHACL is reporting that one is using something related to closed shapes without having a closed shape.

If possible, I would love to see some improvement to the clarity of the error.

Validating files containing multiple named graphs

The load.py script currently loads all passed files into an rdflib Graph object. For JSON-LD files that contain multiple named graphs, this means that the resulting graph object g in the referenced line [1] will be empty, and the validation will succeed without warning.

I know that pySHACL currently does not support TriG or NQuads files, but if you allow for JSON-LD, you should allow for these as well as the big difference is the support for named graphs.

There are three ways around this:

  • Show a warning that files containing named graphs cannot currently be validated (not ideal, as this should be easy to fix).
  • Simple quick fix is to load files into a ConjunctiveGraph. This disregards all named graph information, but makes all triples available for validation. This is the behaviour of the SHACL playground implementation. This is not ideal either, as we would like to validate individual graphs (as per the SHACL spec).
  • Better is to load the file into a Dataset and then iterate over each contained graph for validation purposes. This is a more involved fix that requires the loader to always load into a Dataset, and then the validator should iterate over each graph contained in the dataset.

[1]

if g is None:

validation with superclass constraints

pyshacl seems to ignore constraints defined on a superclass when validating an instance of a subclass. For example, given the SHACL:

@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.com/ex#> .

ex: a owl:Ontology ;
    rdfs:label "Example"@en ;
    rdfs:comment "Example"@en ;
    owl:versionInfo "" ;
    sh:declare [ sh:namespace "http://example.com/ex#" ;
            sh:prefix "ex" ] .

ex:Parent a rdfs:Class ;
    rdfs:isDefinedBy ex: ;
    rdfs:comment "The parent class"@en ;
    rdfs:subClassOf owl:Thing .

ex:ParentShape a sh:NodeShape ;
    rdfs:isDefinedBy ex: ;
    sh:property [
        sh:datatype xsd:string ;
        sh:path ex:name ;
        sh:maxCount 1 ;
        sh:minCount 1 ;
    ] ;
    sh:targetClass ex:Parent .

ex:Child a rdfs:Class ;
    rdfs:isDefinedBy ex: ;
    rdfs:comment "The child class"@en ;
    rdfs:subClassOf ex:Parent .

ex:ChildShape a sh:NodeShape ;
    rdfs:isDefinedBy ex: ;
    rdfs:subClassOf ex:ParentShape ;
    sh:property [
        sh:datatype xsd:integer ;
        sh:path ex:age ;
        sh:maxCount 1 ;
        sh:minCount 1 ;
    ] ;
    sh:targetClass ex:Child .

Validating a json-ld instance of Child that is missing the name property from Parent against the above SHACL

{
    "@context": {
        "@vocab": "http://example.com/ex#"
    },
    "@type": "Child",
    "age": 3
}

does not find a violation. I had expected that validating a subclass instance would also include constraints from the superclass.

Is this a misunderstanding of SHACL on my part or an issue with pyshacl?

Feature Request: Variables in validation reports of SPARQLConstraintComponent

I want to request a feature for validation reports of SPARQLConstraint(Component) as described in the SHACL Recommendation.
I'd really appreciate the possibility to use variables from SELECT queries in the sh:message, i.e.:

:VerifyPowerAdapterSupplyShape
  a sh:NodeShape ;
  sh:targetClass ex:Computer ;
  sh:sparql [
    a sh:SPARQLConstraint ;
    sh:message "The power adapter ({?availablePower} W) must provide more power than the parts of the computer consume ({?requiredPower} W)." ;
    sh:prefixes ex: ;
    sh:select """
      SELECT $this ?availablePower ?requiredPower
      WHERE {
        $this ex:hasPowerAdapter ?powerAdapter .
        ?powerAdapter ex:hasPowerSupply ?availablePower .
        {
          SELECT (SUM(?power) as ?requiredPower)
          WHERE { 
            $this ex:hasPart ?device .
            ?device ex:hasRequiredPower ?power .
          }
        }
        FILTER(?availablePower < ?requiredPower) .
      }
    """ ;
  ] .

That would help me a lot to debug ontologies that rely on complex SHACL-SPARQL constraints :)
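
The substitution being requested can be sketched in a few lines of standard-library Python. The SHACL specification describes {?varName} and {$varName} placeholders in sh:message; render_message and the sample binding values are assumptions for the illustration, not pySHACL code.

```python
import re

# Matches {?name} and {$name} placeholders.
_PLACEHOLDER = re.compile(r"\{[?$]([A-Za-z_][A-Za-z0-9_]*)\}")


def render_message(template, bindings):
    """Replace each placeholder with the bound value; leave unbound ones untouched."""

    def substitute(match):
        name = match.group(1)
        return str(bindings[name]) if name in bindings else match.group(0)

    return _PLACEHOLDER.sub(substitute, template)


template = (
    "The power adapter ({?availablePower} W) must provide more power "
    "than the parts of the computer consume ({?requiredPower} W)."
)
# Hypothetical values for one SELECT solution.
message = render_message(template, {"availablePower": 300, "requiredPower": 450})
print(message)
```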

shacl advanced features

Hi,
I'm just trying to clarify whether pySHACL supports advanced features such as sh:target, sh:filterShapeNode, etc. It looks like it doesn't support those properties currently.

Thanks,
Yi

pySHACL for yaml?

Hey RDFlib! I'm working on validation functions for schema.org content, and specifically we have yaml (and front matter of html) definitions of specifications (that load nicely into JSON). I'm wondering if there would be some logical way to use pySHACL to validate these inputs? See our discussion here --> schemaorg/schemaorg#2069 (comment) and here is an example input with yaml as front matter (that can be loaded as json of course). I'm also wondering if there is development space to be able to define tests / criteria in yaml, since this is the current language of many continuous integration services like Travis, Circle, etc. I started some thinking about this, but before implementing something new I wanted to check which standards are used in the community. Generally the criteria I am looking for are:

  • Python based (for easy use by the scientific community)
  • For the same reason, yaml or json-ld (but probably not rdf natively)
  • simple in that it doesn't have extra dependencies beyond what is already used

Thanks for your feedback! Please join in on the first issue listed above if you have thoughts! I'm very happy to contribute something here (with guidance) or to create a simplified version that goes from a yaml criteria to a validated specification.

how do i validate a property which is an object?

I have a jsonld data file which looks something like this:

{
    "@type": "ex:Activity",
    "schema:description": "example schema",
    "ui": {
        "order": [
            "john",
            "mark",
            "lisy"
        ],
        "shuffle": false
    }
}

How do I write a constraint for the property ui since it is an object?

Access to inferred triples

According to 8.4 General Execution Instructions for SHACL Rules, implementations modify the data graph if triples get inferred, and/or may "construct a logical data graph that has the original data as one subgraph and a dedicated inferences graph as another subgraph, and where the inferred triples get added to the inferences graph only."

I've been following the Classification With SHACL Rules article and I would like to extract the graph of inferred triples, which would include <http://bakery.com/ns#AppleTartC> a <http://bakery.com/ns#NonGlutenFreeBakedGood>, <http://bakery.com/ns#VeganBakedGood> . either merged into the data graph or as an inference graph.

Is this feature available?

Output with anonymous focus nodes, could it print whole node?

I'm doing some validation in data with anonymous nodes. The output of validate (e.g., results_text) shows:

Focus Node: [ ]
Value Node: [ ]

That makes it pretty hard to know which node has the issue. Any thoughts on how to identify them? I'm wondering if there could be an option to print either the datafile:line_number (hard, I know) or perhaps the whole anonymous node (ours are small, I know they could be very big, but even a few lines would probably help locate them).

Unexpected violation when using sh:qualifiedMinCount and sh:qualifiedValueShape

pyshacl is giving an unexpected violation, one that I'm not seeing on the javascript https://shacl.org/playground/ (and pyshacl is also not showing the sh:message of the only property).

Data:

@prefix ex: <http://example.org/ns#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:Document
    a schema:Document ;
    schema:isTargetOf  [ a schema:HasAuthor ;
                         schema:isPresent true ] ;
    schema:isTargetOf  [ a schema:otherClass ;
                         schema:isPresent true ] ;
.

shacl constraints

@prefix dash: <http://datashapes.org/dash#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

schema:DocumentShape
    a sh:NodeShape ;
    sh:targetClass schema:Document ;
    sh:property [
        sh:message "At least one Author" ;
        sh:path schema:isTargetOf ;
        sh:qualifiedMinCount 1 ;
        sh:qualifiedValueShape [
            sh:class schema:HasAuthor ;
        ]
    ] ;
.

Python code used:

import rdflib
from pyshacl import validate
data_filename = "data/shacl/example_data_value.ttl"
data_graph = rdflib.Graph()
data_graph.parse(data_filename, format='n3')

constraints_filename = "data/shacl/shacl_constraints_value.ttl"
constraints_graph = rdflib.Graph()
constraints_graph.parse(constraints_filename, format='n3')

r = validate(data_graph,
            shacl_graph=constraints_graph,
            # ont_graph=og,
             inference='rdfs', abort_on_error=False,
             meta_shacl=False, debug=True, advanced=True)
conforms, results_graph, results_text = r
conforms

What I'm seeing in the terminal (note the absence of the sh:message)

$ python3 data/shacl/validate_transition.py
Constraint Violation in ClassConstraintComponent (http://www.w3.org/ns/shacl#ClassConstraintComponent):
        Severity: sh:Violation
        Source Shape: [ sh:class schema:HasAuthor ]
        Focus Node: [ rdf:type rdfs:Resource, schema:otherClass ; schema:isPresent Literal("true" = True, datatype=xsd:boolean) ]
        Value Node: [ rdf:type rdfs:Resource, schema:otherClass ; schema:isPresent Literal("true" = True, datatype=xsd:boolean) ]
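
For reference, the expected outcome follows from how the SHACL specification words sh:qualifiedMinCount: the constraint counts the value nodes that conform to the qualified shape and compares that count with the minimum, so a value that fails the qualified shape is simply not counted. A simplified plain-Python sketch of that counting rule (not pySHACL code; the dictionaries stand in for the two blank nodes in the data):

```python
def qualified_min_count_satisfied(value_nodes, conforms_to_shape, qualified_min_count):
    """True when enough value nodes conform to the qualified value shape."""
    conforming = [v for v in value_nodes if conforms_to_shape(v)]
    return len(conforming) >= qualified_min_count


# Stand-ins for the two schema:isTargetOf values in the data above.
values = [
    {"types": {"schema:HasAuthor"}},
    {"types": {"schema:otherClass"}},
]


def is_has_author(value):
    """Plays the role of the qualified shape [ sh:class schema:HasAuthor ]."""
    return "schema:HasAuthor" in value["types"]


ok = qualified_min_count_satisfied(values, is_has_author, 1)
print(ok)
```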

error using python module

I am trying to validate a data graph with its corresponding shapes graph. When I use the command-line method it works fine. However, I get an error when using the Python module.

I am doing this:

r = validate(data_graph, shacl_graph='./validation/ActivityShape.ttl', ont_graph=None, advanced=True, inference='rdfs', abort_on_error=False)
conforms, results_graph, results_text = r

I get the following error:

Traceback (most recent call last):
  File "validation/test1.py", line 64, in <module>
    r = validate(data_graph=data_graph, shacl_graph='./validation/ActivityShape.ttl', ont_graph=None, advanced=True, inference='rdfs', abort_on_error=False)
  File "/Users/sanuann/validation-env/lib/python3.6/site-packages/pyshacl/validate.py", line 253, in validate
    do_owl_imports=False)  # no imports on data_graph
  File "/Users/sanuann/validation-env/lib/python3.6/site-packages/pyshacl/rdfutil/load.py", line 110, in load_from_source
    first_char = source[0]
IndexError: string index out of range

What am I doing wrong?

Debian/Ubuntu package for pySHACL

PySHACL is maturing and becoming an increasingly powerful and relevant tool for validating SHACL. I believe it is the go-to tool for SHACL validation on the command line, and should be easily accessible for as many users as possible.

I want to get pySHACL packaged as a Debian package and available from the official Debian repositories, and in turn into Ubuntu repositories.

PySHACL has two dependencies, RDFLib and OWL-RL. RDFLib is already packaged and available in the Debian repositories, so I need to get owlrl in too before I can package and publish a pySHACL Debian package.

I've already submitted an ITP (Intent to Package) for both owlrl and pySHACL, to the Debian WNPP list.
I've created an Uploader account on the Debian Mentors site, so that I can request a sponsor to sponsor the package (to authorize it on my behalf) once the package is uploaded to the Mentors site staging area.

CLI: -m option produces an exception

When using pyshacl with -m option, pyshacl reports a traceback about a ValueError: read of closed file.

# pyshacl -m -s shape.ttl data.ttl
Traceback (most recent call last):
  File "/usr/local/bin/pyshacl", line 71, in <module>
    is_conform, v_graph, v_text = validate(args.data, **validator_kwargs)
  File "/usr/local/lib/python3.7/site-packages/pyshacl/validate.py", line 194, in validate
    rdf_format=shacl_graph_format)
  File "/usr/local/lib/python3.7/site-packages/pyshacl/util.py", line 176, in load_into_graph
    data = target.read()
ValueError: read of closed file

shape.ttl and data.ttl are valid files with valid shapes and RDF data.
When using pyshacl -s shape.ttl data.ttl (that is, without -m), pyshacl works as expected.

Support for recursion

According to the SHACL spec:

The validation with recursive shapes is not defined in SHACL and is left to SHACL processor implementations.

I was wondering whether pySHACL already supports recursive shapes, or has any plans to support them?

SPARQLFunction support

I tried using the SHACL found at http://datashapes.org/schema.ttl to validate some data and received the following error:

NotImplementedError: SHACL Advanced Feature SPARQLFunction is not yet supported.

I'll add my vote to getting this feature implemented.

SPARQL Target Select not working as expected

Hi,

we are trying to use SPARQL-based targets in our SHACL-Tests.
Our test should use all non-anonymous instances of owl:Class as focus nodes, but it seems it's not working:

The Test:

@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .


<#LODE-class-comment-violation>
    a sh:Shape ;
    sh:target [
        a sh:SPARQLTarget ;
        sh:select """
        SELECT ?this WHERE {
            ?this <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
            FILTER ( !isBlank(?this) )
        }
        """;
    ];
    sh:severity sh:Violation;
    sh:path rdfs:comment;
    sh:nodeKind sh:Literal;
    sh:minCount 1;
    sh:name "comment not correctly specified"@en;
    sh:message "rdfs:comment is missing or is no Literal"@en .

The result conforms as true with this ontology as data graph in pyshacl (with advanced=true in the validate function), but does not conform if we try the same in the shacl play service.
Is this a bug or did we miss something?

Thanks in advance,
Denis

validation showing true in spite of errors in data shape

If the data graph is this:

{
    "@context": { "@vocab": "http://schema.org/" },
    "@id": "http://example.org/ns#Bob",
    "@type": "Person",
    "givenName": "Robert",
    "familyName": "Junior",
    "birthDate": "1971-07-07",
    "deathDate": "1968-09-10",
    "address": {
        "@id": "http://example.org/ns#BobsAddress",
        "streetAddress": "1600 Amphitheatre Pkway",
        "postalCode": 9404
    }
}

and the shapes graph is this:

@prefix dash: <http://datashapes.org/dash#> .
@prefix rdf: <https://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <https://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <https://www.w3.org/ns/shacl#> .
@prefix xsd: <https://www.w3.org/2001/XMLSchema#> .

schema:PersonShape
    a sh:NodeShape ;
    sh:targetClass schema:Person ;
    sh:property [
        sh:path schema:givenName ;
        sh:datatype xsd:string ;
        sh:name "given name" ;
    ] ;
    sh:property [
        sh:path schema:birthDate ;
        sh:lessThan schema:deathDate ;
        sh:maxCount 1 ;
    ] ;
    sh:property [
        sh:path schema:gender ;
        sh:in ( "female" "male" ) ;
    ] ;
    sh:property [
        sh:path schema:address ;
        sh:node schema:AddressShape ;
    ] .

schema:AddressShape
    a sh:NodeShape ;
    sh:closed true ;
    sh:property [
        sh:path schema:streetAddress ;
        sh:datatype xsd:string ;
    ] ;
    sh:property [
        sh:path schema:postalCode ;
        sh:or ( [ sh:datatype xsd:string ] [ sh:datatype xsd:integer ] ) ;
        sh:minInclusive 10000 ;
        sh:maxInclusive 99999 ;
    ] .

when I do this:
pyshacl -s /path/to/shapesGraph.ttl -m -i rdfs -a -f human /path/to/dataGraph.json-ld -df json-ld

Why doesn't it show validation errors? (As we can clearly see, there are errors in address and birthDate in the data graph.)
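
One detail stands out in the shapes graph above: the rdf:, rdfs:, sh: and xsd: prefixes are declared with https:// IRIs, while the W3C namespace IRIs for these vocabularies begin with http://. Namespace IRIs are compared character by character, so terms under the https:// variants are different terms. Whether that fully explains the result here is not confirmed on this page; the sketch below only flags the mismatch (suspicious_prefixes is a hypothetical helper).

```python
# Standard namespace IRIs for a few common vocabularies.
WELL_KNOWN = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "sh": "http://www.w3.org/ns/shacl#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def suspicious_prefixes(declared):
    """Return {prefix: expected IRI} for declarations that differ from the standard IRI."""
    return {
        prefix: WELL_KNOWN[prefix]
        for prefix, iri in declared.items()
        if prefix in WELL_KNOWN and iri != WELL_KNOWN[prefix]
    }


# The prefix declarations from the shapes graph above.
declared = {
    "dash": "http://datashapes.org/dash#",
    "rdf": "https://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "https://www.w3.org/2000/01/rdf-schema#",
    "schema": "http://schema.org/",
    "sh": "https://www.w3.org/ns/shacl#",
    "xsd": "https://www.w3.org/2001/XMLSchema#",
}

flagged = suspicious_prefixes(declared)
print(sorted(flagged))
```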

Measurement of prevalence in the shacl-report

Hello,

is there any way to measure the prevalence of executed SHACL-tests, like getting the total number of instances of the sh:targetClass or a percentage like 0.95 of the instances of the given sh:targetClass fulfill the restrictions? If not I think it would be nice to have.

Best Regards

Validator runtime error

The validator reports a runtime error, and turning on debug gives no additional details:

$ pyshacl -s 03-Network.ttl -e 03-Network.ttl sample-network.ttl
Validator encountered a Runtime Error.

Info:

$ python3.6
Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyshacl
INFO:rdflib:RDFLib Version: 4.2.2
>>> print(pyshacl.__version__)
0.11.3.post1

Turtle files attached as text files.
03-Network.ttl.txt
sample-network.ttl.txt

PySHACL considers non-conforming datagraph to be conforming

The attached files illustrate the behavior I am seeing. The W3C validator (https://shacl.org/playground/) does flag this as non-conforming, so I am trusting this is not operator error on my part.

The shacl graph includes a property shape defined as follows:

 ex:Func  a       owl:Class , sh:NodeShape ;
        rdfs:label       "Func" ;
        rdfs:subClassOf   ex:Function ;
        sh:property      [ a         sh:PropertyShape ;
                           sh:class   ex:FuncParam_Func_a ;
                           sh:path    ex:hasParameter ;
                           sh:minCount 1;
                           sh:name	 "Func_a"
                         ] .

and the graph being validated includes

test:FuncNode	a	ex:Func;
	ex:hasParameter test:FuncParam_b .
	
test:FuncParam_a	a	ex:FuncParam_Func_a .
test:FuncParam_b	a	ex:FuncParam_Func_b .

simpleOnto.zip
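
To make the expected result concrete, here is a simplified sketch of what the sh:class constraint checks for the data above. It is not pySHACL's implementation: triples are modelled as plain tuples, the helper `class_violations` is hypothetical, and rdfs:subClassOf reasoning is deliberately ignored.

```python
# Triples from the report, written as (subject, predicate, object) tuples.
DATA = {
    ("test:FuncNode", "rdf:type", "ex:Func"),
    ("test:FuncNode", "ex:hasParameter", "test:FuncParam_b"),
    ("test:FuncParam_a", "rdf:type", "ex:FuncParam_Func_a"),
    ("test:FuncParam_b", "rdf:type", "ex:FuncParam_Func_b"),
}


def class_violations(data, focus, path, required_class):
    """Return the values of `path` on `focus` that lack a direct rdf:type of `required_class`.

    Simplification: subclass relationships are not followed.
    """
    values = sorted(o for s, p, o in data if s == focus and p == path)
    return [v for v in values if (v, "rdf:type", required_class) not in data]


print(class_violations(DATA, "test:FuncNode", "ex:hasParameter", "ex:FuncParam_Func_a"))
# ['test:FuncParam_b']
```

The only value of ex:hasParameter is typed ex:FuncParam_Func_b, so under this reading the focus node should not conform.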

Validating using a shacl graph

[Python 3.7.0, rdflib 4.2.2, pyshacl 0.9.8.post1]

I am passing a graph object as shacl_graph, as shown below.

conforms, v_graph, v_text = validate(g, shacl_graph=g2,
                                     data_graph_format='turtle',
                                     shacl_graph_format='turtle',
                                     inference='rdfs', debug=True,
                                     serialize_report_graph=True)

Validation Report
Conforms: True

The g2 graph I use is the following:

@prefix hei: <http://hei.org/customer/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

hei:HeiAddressShape a sh:NodeShape ;
    sh:property [ rdfs:comment "Street constraint" ;
            sh:datatype xsd:string ;
            sh:minlength 30 ;
            sh:path hei:Ship_to_street ] ;
    sh:targetClass hei:Hei_customer .

Data validated is:

hei:hei_cust_1281 a hei:Sfg_customer ;
    rdfs:label "XYZHorecagroothandel" ;
    hei:Klant_nummer 1281 ;
    hei:Ship_to_City "Middenmeer" ;
    hei:Ship_to_postcode "1799 AB" ;
    hei:Ship_to_street "Industrieweg" 

The issue is that when I pass a graph object, no validation is done, whereas passing the g2 validation graph as a string works fine. I expected both options to work.

owl imports of other shape graphs

So, very new to all this... but have a question.

Is it possible to define an owl:imports in a shape file to pull in previously defined shapes from another file? Ref https://github.com/ESIPFed/science-on-schema.org/blob/master/tools/sospy/shapegraphs/reqrec.ttl

I'm looking at https://book.validatingrdf.com/bookHtml011.html section 5.4 for inspiration.

Note: Maybe I'm being too cute trying to import from a github raw URL?

I get a proper violation from the recomended.ttl file, but I cannot import it and use it. I don't know whether this is not possible or (more likely) I'm doing it wrong.

Thanks

pyshacl -s ./shapegraphs/recomendShape.ttl  -m  -f human -df json-ld ./datagraphs/dataset-minimal.json-ld
Validation Report
Conforms: False
Results (1):
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
        Severity: sh:Violation
        Source Shape: [ sh:maxCount Literal("1", datatype=xsd:integer) ; sh:minCount Literal("1", datatype=xsd:integer) ; sh:path <http://schema.org/citation> ]
        Focus Node: [ ]
        Result Path: <http://schema.org/citation>

pyshacl -s ./shapegraphs/reqrec.ttl  -m  -f human -df json-ld ./datagraphs/dataset-minimal.json-ld
Validation Report
Conforms: True

Enforcing minimum number of instances doesn't work

Following the trick mentioned in https://www.w3.org/wiki/SHACL/Examples, I wanted to write a shape to validate the existence of a node.

The shape

{
    "@context": {
       "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
       "sh": "http://www.w3.org/ns/shacl#",
       "schema": "http://schema.org/"
    },
    "@graph": [
        {
            "@id": "_:forceDatasetShape",
            "@type": "sh:NodeShape",
            "sh:targetNode": "schema:DigitalDocument",
            "sh:property": [
                {
                    "sh:path": [
                        {
                            "sh:inversePath": [{
                                "@id": "rdf:type",
                                "@type": "@id"
                             }]
                        }
                    ],
                    "sh:minCount": 1
                }
            ]
        }
    ]
}

with the graph

{}

throws a validation error in the SHACL playground https://shacl.org/playground/

But pyshacl says that it's conforming. Does this inversePath trick not work with pySHACL?

Command I'm using is: pyshacl -a -m -s shape.json graph.json -sf json-ld -df json-ld
With pyshacl version 0.11.3


On a side note, SHACL playground validates successfully with

{
    "@context": { "schema": "http://schema.org/", "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#" },

    "@id": "http://example.org/ns#Bob",
    "rdf:type": "http://schema.org/DigitalDocument"
}

but not with

{
    "@context": { "schema": "http://schema.org/", "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#" },

    "@id": "http://example.org/ns#Bob",
    "@type": "http://schema.org/DigitalDocument"
}

or

{
    "@context": { "schema": "http://schema.org/", "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#" },

    "@id": "http://example.org/ns#Bob",
    "rdf:type": "schema:DigitalDocument"
}

which is weird. I was under the impression that @type is an alias for rdf:type.

Enable sh:pattern on IRIs

It is quite a common requirement to test an IRI to check if it is in a specific namespace, or contains a path element which is a specific character string or pattern. While the SHACL spec appears to restrict application of sh:pattern to string literals, it would be helpful to allow a 'relaxed' mode where it can also apply to IRIs (which are, after all, just a sequence of characters).

Note that the TopBraid SHACL engine (maintained by the SHACL editor @HolgerKnublauch ) does operate in this mode - see https://groups.google.com/forum/?utm_source=digest&utm_medium=email#!topic/topbraid-users/BUoROZt0BhM
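
As an illustration of the requested "relaxed" mode, the sketch below treats an IRI as its plain string form and applies a regular expression to it. This is an assumption about how such a mode could behave, not existing pySHACL behaviour; `iri_matches` is a hypothetical helper, and Python's `re` syntax differs in places from the SPARQL REGEX semantics that sh:pattern is defined against.

```python
import re


def iri_matches(iri: str, pattern: str, flags: str = "") -> bool:
    """Relaxed sh:pattern check: match the pattern against the IRI's string form."""
    # Only the "i" (case-insensitive) flag is handled in this sketch.
    re_flags = re.IGNORECASE if "i" in flags else 0
    return re.search(pattern, iri, re_flags) is not None


namespace_pattern = r"^http://example\.org/ns/"
print(iri_matches("http://example.org/ns/thing/42", namespace_pattern))  # True
print(iri_matches("http://other.org/ns/thing/42", namespace_pattern))    # False
```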

Guidance request

I've been happily using pyshacl (installed via pip3) to work on shacl rules, and in the process have broken my rules file in a manner I can't seem to correct, so was hoping you may have some tips for a newcomer ...

The error I get is simply

Validator encountered a Runtime Error:
Shape pointed to by sh:property does not exist or is not a well-formed SHACL PropertyShape.If you believe this is a bug in pyshacl, open an Issue on the pyshacl github page.

I don't believe it is a bug in pyshacl; the same two files in the shacl playground produce only VALIDATION FAILURE: Missing subject -- when I load the shacl into RDFlib and query, I can't find any sh:property triple with an unbound subject, but that may be a very naive approach.

Running with meta-SHACL shows nothing terribly helpful, only unrelated things that I know work in some engines, such as using rdf:list items instead of spelling out a first/rest list.

In the -d debug trace, do the last few Constraint Report/Violations give clues to the bad rule?
How might I get more information about what I've messed up in the shacl?

No module named 'pyldapi'

Hi,
This validator arrived just at the right time to enable more adoption of SHACL. Thank you for this effort.
I'm using a jupyter notebook running on python 3.6 and I installed the pyshacl module with:

!pip install git+https://github.com/RDFLib/[email protected]#egg=pyshacl

As suggested in the 'Use' section of the README file, I tried a basic validation by running:

from pyldapi import validate
validate(target_graph, shacl_graph, inference='rdfs', abort_on_error=False)

But I got a ModuleNotFoundError: No module named 'pyldapi'

I guess it's because the validate function is actually part of the pyshacl module. I got it working by running:

from pyshacl import validate 
validate(target_graph, shacl_graph, inference='rdfs', abort_on_error=False)

Thanks.

Regular expression in sh:pattern not processed correctly

I have the following:

graph_data = """
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sch:  <http://schema.org/> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix ex:  <http://example.org/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

ex:JohnDoe a ex:XXXX .
ex:JohnDoe ex:name "hello.txt" .
"""

shape_data = """
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sch:  <http://schema.org/> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix ex:   <http://example.org/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

ex:PersonShape
  a sh:NodeShape ;
  sh:targetClass ex:XXXX ;
  sh:property ex:PersonShape-name .

ex:PersonShape-name
  a sh:PropertyShape ;
  sh:path ex:name ;
  sh:minCount 1 ;
  sh:pattern  ".*.txt" .
"""
        
data  = rdflib.Graph().parse( data = graph_data, format = 'ttl' )
shape = rdflib.Graph().parse( data = shape_data, format = 'ttl' )

print( f"{data.serialize( format = 'ttl' ).decode( 'utf8' )}" )

report = validate( data, shacl_graph=shape, abort_on_error = False, meta_shacl = False, debug = True, advanced = True )

print( report[2] )

The sh:pattern should be ".*\.txt", but when I do that, the following errors are generated:

... notation3.py", line 1591, in strconst  "bad escape")
... notation3.py", line 1615, in BadSyntax  raise BadSyntax(self._thisDoc, self.lines, argstr, i, msg)

  File "<string>", line unknown
BadSyntax

At least according to http://www.datypic.com/books/xquery/chapter19.html, I am using the escape correctly.
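
One likely reading of the BadSyntax error: a Turtle string literal accepts only a fixed set of backslash escapes, and `\.` is not among them, so the backslash itself has to be written as `\\` in the Turtle text. When the Turtle is embedded in a normal (non-raw) Python string, each backslash must be doubled once more. The sketch below walks through those layers with the standard library only; the `replace` call is a hypothetical stand-in for the Turtle parser's unescaping step, not rdflib code.

```python
import re

# The literal as it must appear in the Turtle text: the backslash is doubled.
turtle_literal = r'".*\\.txt"'

# The same text written as a normal Python string needs four backslashes.
assert turtle_literal == '".*\\\\.txt"'

# Stand-in for Turtle unescaping: "\\" in the source becomes one backslash.
regex = turtle_literal.strip('"').replace("\\\\", "\\")

print(regex)                                  # .*\.txt
print(bool(re.search(regex, "hello.txt")))    # True
print(bool(re.search(regex, "hellotxt")))     # False

# The unescaped pattern from the report also accepts names without a dot.
print(bool(re.search(".*.txt", "hellotxt")))  # True
```

In short, writing the shapes graph in a raw Python string (`r"""..."""`) with `".*\\.txt"` inside it is one way to get the intended regular expression through both layers.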

Validation does not work for classes that are also node shapes

If I run the following code:

shapes = rdf.Graph()
shapes.parse(data="""
    @prefix sh: <http://www.w3.org/ns/shacl#> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix ex: <http://example.org/ns#> .

    ex:Person
          a owl:Class ;
          a sh:NodeShape ;
          sh:property ex:NameConstraint ;
    .

    ex:NameConstraint
          a sh:PropertyShape ;
          sh:path ex:name ;
          sh:minCount 1 ;
        .
""",format="ttl")

data = rdf.Graph()
data.parse(data="""
    @prefix ex: <http://example.org/ns#> .

    ex:Bob
          a ex:Person ;
    .
""",format="ttl")

r = sh.validate(data_graph=data,shacl_graph=shapes,inference='rdfs')
print(r[2])

no validation errors are reported. In order to force the error to be recognized, I have to explicitly declare ex:Person sh:targetClass ex:Person in the shapes graph which shouldn't be necessary.

This is how TopQuadrant products represent classes and node shapes by default, so it would be great if pyshacl could support this.

RDFClosure dependency in the code breaks at least pyshacl CLI completely

It seems like RDFClosure has been renamed to OWL-RL and that RDFClosure has been removed from the online repositories. Thus it's not installed as a dependency for pyshacl and thus pyshacl throws a lot of ModuleNotFoundErrors. E.g.:

### pyshacl --help
Traceback (most recent call last):
  File "/usr/local/bin/pyshacl", line 17, in <module>
    from pyshacl import validate
  File "/usr/local/lib/python3.7/site-packages/pyshacl/__init__.py", line 3, in <module>
    from pyshacl.validate import validate
  File "/usr/local/lib/python3.7/site-packages/pyshacl/validate.py", line 12, in <module>
    from pyshacl.inference import CustomRDFSSemantics, CustomRDFSOWLRLSemantics
  File "/usr/local/lib/python3.7/site-packages/pyshacl/inference/__init__.py", line 2, in <module>
    from .custom_rdfs_closure import CustomRDFSSemantics, CustomRDFSOWLRLSemantics
  File "/usr/local/lib/python3.7/site-packages/pyshacl/inference/custom_rdfs_closure.py", line 2, in <module>
    from RDFClosure.RDFSClosure import RDFS_Semantics as OrigRDFSSemantics
ModuleNotFoundError: No module named 'RDFClosure'

I'm using version 0.9.7 of pyshacl from PyPI, and the issue occurs with every pyshacl command I have tested so far.

Windows binary for pySHACL cli

It would be good to package pySHACL as a Windows EXE so that Windows users can run the CLI without having to install Python.

Does pySHACL support SHACL-JS?

Hello,

I noticed that pySHACL supports SHACL Advanced Features (SPARQL).

I wonder whether you also plan to support the SHACL JavaScript Extensions (SHACL-JS)?

Best Regards,

Angelo

SPARQL targets giving wrong inference

When I try to do SPARQL-based SHACL validation, I get the wrong results. I am trying to filter out processes (Testsparql:Process) where Testsparql:Cranecapacity is less than Testsparql:Moduleweight. I get the desired output when my data and shapes are in a single RDF file. However, when I split them into two RDF files, I do not get the correct result.

Two-file case:

from pyshacl import validate
shapes_file = '''
@prefix Testsparql: <http://semanticprocess.x10host.com/Ontology/Testsparql#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

Testsparql:PrefixDeclaration
  rdf:type sh:PrefixDeclaration ;
  sh:namespace "http://semanticprocess.x10host.com/Ontology/Testsparql#"^^xsd:anyURI ;
  sh:prefix "Testsparql" ;
.

Testsparql:Processshape
  rdf:type rdfs:Class ;
  rdf:type sh:NodeShape ;
  rdfs:subClassOf owl:Class ;
  sh:sparql [
      sh:message "Invalid process" ;
      sh:prefixes <http://semanticprocess.x10host.com/Ontology/Testsparql> ;
      sh:select """SELECT $this 
        WHERE {
			 $this  rdf:type Testsparql:Process.
			$this Testsparql:hasResource ?crane.
			$this Testsparql:hasAssociation ?module.
			?crane Testsparql:Cranecapacity ?cc.
			?module Testsparql:Moduleweight ?mw.
					FILTER (?cc <= ?mw).

     }""" ;
    ] ;
.

'''
shapes_file_format = 'turtle'

data_file = '''
@prefix Testsparql: <http://semanticprocess.x10host.com/Ontology/Testsparql#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://semanticprocess.x10host.com/Ontology/Testsparql>
  rdf:type owl:Ontology ;
  owl:imports <http://datashapes.org/dash> ;
  owl:versionInfo "Created with TopBraid Composer" ;
  sh:declare Testsparql:PrefixDeclaration ;
.
Testsparql:Crane
  rdf:type rdfs:Class ;
  rdfs:subClassOf owl:Class ;
.
Testsparql:Crane_1
  rdf:type Testsparql:Crane ;
  Testsparql:Cranecapacity "500"^^xsd:decimal ;
.
Testsparql:Crane_2
  rdf:type Testsparql:Crane ;
  Testsparql:Cranecapacity "5000"^^xsd:decimal ;
.
Testsparql:Cranecapacity
  rdf:type owl:DatatypeProperty ;
  rdfs:domain Testsparql:Crane ;
  rdfs:range xsd:decimal ;
  rdfs:subPropertyOf owl:topDataProperty ;
.
Testsparql:Module
  rdf:type rdfs:Class ;
  rdfs:subClassOf owl:Class ;
.
Testsparql:Module_1
  rdf:type Testsparql:Module ;
  Testsparql:Moduleweight "800"^^xsd:decimal ;
.
Testsparql:Moduleweight
  rdf:type owl:DatatypeProperty ;
  rdfs:domain Testsparql:Module ;
  rdfs:range xsd:decimal ;
  rdfs:subPropertyOf owl:topDataProperty ;

.
Testsparql:Process
  rdf:type rdfs:Class ;
  
  rdfs:subClassOf owl:Class ;
  .
Testsparql:ProcessID
  rdf:type owl:DatatypeProperty ;
  rdfs:domain Testsparql:Process ;
  rdfs:range xsd:string ;
  rdfs:subPropertyOf owl:topDataProperty ;
.
Testsparql:Process_1
  rdf:type Testsparql:Process ;
  Testsparql:ProcessID "P1" ;
  Testsparql:hasAssociation Testsparql:Module_1 ;
  Testsparql:hasResource Testsparql:Crane_1 ;
.
Testsparql:Process_2
  rdf:type Testsparql:Process ;
  Testsparql:ProcessID "P2" ;
  Testsparql:hasAssociation Testsparql:Module_1 ;
  Testsparql:hasResource Testsparql:Crane_2 ;
.
Testsparql:hasAssociation
  rdf:type owl:ObjectProperty ;
  rdfs:domain Testsparql:Process ;
  rdfs:range Testsparql:Module ;
  rdfs:subPropertyOf owl:topObjectProperty ;
.
Testsparql:hasResource
  rdf:type owl:ObjectProperty ;
  rdfs:domain Testsparql:Process ;
  rdfs:range Testsparql:Crane ;
  rdfs:subPropertyOf owl:topObjectProperty ;
.

'''
data_file_format = 'turtle'

conforms, v_graph, v_text = validate(data_file, shacl_graph=shapes_file,
                                     target_graph_format=data_file_format,
                                     shacl_graph_format=shapes_file_format,
                                     inference='rdfs', debug=True,
                                     serialize_report_graph=True)
print(conforms)
print(v_graph)
print(v_text)

The result is:

True
b'@prefix Testsparql: <http://semanticprocess.x10host.com/Ontology/Testsparql#> .\n@prefix owl: <http://www.w3.org/2002/07/owl#> .\n@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n@prefix sh: <http://www.w3.org/ns/shacl#> .\n@prefix xml: <http://www.w3.org/XML/1998/namespace> .\n@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n\n[] a sh:ValidationReport ;\n    sh:conforms true .\n\n'
Validation Report
Conforms: True 

However, if the same data is given in a single file:

from pyshacl import validate
data_file = '''
# baseURI: http://semanticprocess.x10host.com/Ontology/Testsparql
# imports: http://datashapes.org/dash
# prefix: Testsparql

@prefix Testsparql: <http://semanticprocess.x10host.com/Ontology/Testsparql#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://semanticprocess.x10host.com/Ontology/Testsparql>
  rdf:type owl:Ontology ;
  owl:imports <http://datashapes.org/dash> ;
  owl:versionInfo "Created with TopBraid Composer" ;
  sh:declare Testsparql:PrefixDeclaration ;
.
Testsparql:Crane
  rdf:type rdfs:Class ;
  rdfs:subClassOf owl:Class ;
.
Testsparql:Crane_1
  rdf:type Testsparql:Crane ;
  Testsparql:Cranecapacity "500"^^xsd:decimal ;
.
Testsparql:Crane_2
  rdf:type Testsparql:Crane ;
  Testsparql:Cranecapacity "5000"^^xsd:decimal ;
.
Testsparql:Cranecapacity
  rdf:type owl:DatatypeProperty ;
  rdfs:domain Testsparql:Crane ;
  rdfs:range xsd:decimal ;
  rdfs:subPropertyOf owl:topDataProperty ;
.
Testsparql:Module
  rdf:type rdfs:Class ;
  rdfs:subClassOf owl:Class ;
.
Testsparql:Module_1
  rdf:type Testsparql:Module ;
  Testsparql:Moduleweight "800"^^xsd:decimal ;
.
Testsparql:Moduleweight
  rdf:type owl:DatatypeProperty ;
  rdfs:domain Testsparql:Module ;
  rdfs:range xsd:decimal ;
  rdfs:subPropertyOf owl:topDataProperty ;
.
Testsparql:PrefixDeclaration
  rdf:type sh:PrefixDeclaration ;
  sh:namespace "http://semanticprocess.x10host.com/Ontology/Testsparql#"^^xsd:anyURI ;
  sh:prefix "Testsparql" ;
.
Testsparql:Process
  rdf:type rdfs:Class ;
  rdf:type sh:NodeShape ;
  rdfs:subClassOf owl:Class ;
  sh:sparql [
      sh:message "Invalid process" ;
      sh:prefixes <http://semanticprocess.x10host.com/Ontology/Testsparql> ;
      sh:select """SELECT $this 
        WHERE {
			 $this  rdf:type Testsparql:Process.
			$this Testsparql:hasResource ?crane.
			$this Testsparql:hasAssociation ?module.
			?crane Testsparql:Cranecapacity ?cc.
			?module Testsparql:Moduleweight ?mw.
					FILTER (?cc <= ?mw).

     }""" ;
    ] ;
.
Testsparql:ProcessID
  rdf:type owl:DatatypeProperty ;
  rdfs:domain Testsparql:Process ;
  rdfs:range xsd:string ;
  rdfs:subPropertyOf owl:topDataProperty ;
.
Testsparql:Process_1
  rdf:type Testsparql:Process ;
  Testsparql:ProcessID "P1" ;
  Testsparql:hasAssociation Testsparql:Module_1 ;
  Testsparql:hasResource Testsparql:Crane_1 ;
.
Testsparql:Process_2
  rdf:type Testsparql:Process ;
  Testsparql:ProcessID "P2" ;
  Testsparql:hasAssociation Testsparql:Module_1 ;
  Testsparql:hasResource Testsparql:Crane_2 ;
.
Testsparql:hasAssociation
  rdf:type owl:ObjectProperty ;
  rdfs:domain Testsparql:Process ;
  rdfs:range Testsparql:Module ;
  rdfs:subPropertyOf owl:topObjectProperty ;
.
Testsparql:hasResource
  rdf:type owl:ObjectProperty ;
  rdfs:domain Testsparql:Process ;
  rdfs:range Testsparql:Crane ;
  rdfs:subPropertyOf owl:topObjectProperty ;
.
'''
data_file_format = 'turtle'

conforms, v_graph, v_text = validate(data_file, shacl_graph=None,
                                     target_graph_format=data_file_format,
                                     shacl_graph_format=shapes_file_format,
                                     inference='rdfs', debug=True,
                                     serialize_report_graph=True)
print(conforms)
print(v_graph)
print(v_text)

It gives the correct inference.

False
b'@prefix Testsparql: <http://semanticprocess.x10host.com/Ontology/Testsparql#> .\n@prefix owl: <http://www.w3.org/2002/07/owl#> .\n@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n@prefix sh: <http://www.w3.org/ns/shacl#> .\n@prefix xml: <http://www.w3.org/XML/1998/namespace> .\n@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n\n[] a sh:ValidationReport ;\n    sh:conforms false ;\n    sh:result [ a sh:ValidationResult ;\n            sh:focusNode Testsparql:Process_1 ;\n            sh:resultMessage "Invalid process" ;\n            sh:resultSeverity sh:Violation ;\n            sh:sourceConstraint [ sh:message "Invalid process" ;\n                    sh:prefixes <http://semanticprocess.x10host.com/Ontology/Testsparql> ;\n                    sh:select """SELECT $this \n        WHERE {\n\t\t\t $this  rdf:type Testsparql:Process.\n\t\t\t$this Testsparql:hasResource ?crane.\n\t\t\t$this Testsparql:hasAssociation ?module.\n\t\t\t?crane Testsparql:Cranecapacity ?cc.\n\t\t\t?module Testsparql:Moduleweight ?mw.\n\t\t\t\t\tFILTER (?cc <= ?mw).\n\n     }""" ] ;\n            sh:sourceConstraintComponent sh:SPARQLConstraintComponent ;\n            sh:sourceShape Testsparql:Process ;\n            sh:value Testsparql:Process_1 ] .\n\n'
Validation Report
Conforms: False
Results (1):
Constraint Violation in SPARQLConstraintComponent (http://www.w3.org/ns/shacl#SPARQLConstraintComponent):
	Severity: sh:Violation
	Source Shape: Testsparql:Process
	Focus Node: Testsparql:Process_1
	Value Node: Testsparql:Process_1
	Source Constraint: [ sh:message Literal("Invalid process") ; sh:prefixes <http://semanticprocess.x10host.com/Ontology/Testsparql> ; sh:select Literal("SELECT $this 
        WHERE {
			 $this  rdf:type Testsparql:Process.
			$this Testsparql:hasResource ?crane.
			$this Testsparql:hasAssociation ?module.
			?crane Testsparql:Cranecapacity ?cc.
			?module Testsparql:Moduleweight ?mw.
					FILTER (?cc <= ?mw).

     }") ]
	Message: Invalid process

Can you help me understand why this is giving the wrong inference?

ConstraintLoadError: sh:namespace value must be an RDF Literal with type xsd:anyURI.

This may be related to the changes made for #59

Using the script below and the SHACL from http://datashapes.org/schema.ttl, I get the following error:

ConstraintLoadError: sh:namespace value must be an RDF Literal with type xsd:anyURI.
https://www.w3.org/TR/shacl/#sparql-prefixes

However, running pyshacl from the command line appears to work correctly.

pyshacl -s ./schema_org_validation.ttl ./test_data.ttl

Validation Report
Conforms: False
Results (1):
Constraint Violation in ClassConstraintComponent (http://www.w3.org/ns/shacl#ClassConstraintComponent):
	Severity: sh:Violation
	Source Shape: schema:CommunicateAction-about
	Focus Node: ex:asdgjkj
	Value Node: [ rdf:type sch:GameServer ; sch:playersOnline Literal("42", datatype=xsd:integer) ]
	Result Path: schema:about
	Message: Value does not have class schema:Thing

(I did not include the schema.org schema, hence the validation error.)

Python script:

Archive.zip

import rdflib
from pyshacl import validate

data = """
@prefix ex: <http://example.org/> .
@prefix sch: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:asdgjkj a sch:CommunicateAction ;
    sch:about [ a sch:GameServer ;
            sch:playersOnline "42"^^xsd:integer ] .
"""

dataGraph = rdflib.Graph().parse( data = data, format = 'ttl' )
print( dataGraph.serialize( format='ttl' ).decode( 'utf8' ) )

shaclData = open( "./schema_org_validation.ttl", "r" ).read()
shaclGraph = rdflib.Graph().parse( data = shaclData, format = 'ttl' )

report = validate( dataGraph, shacl_graph = shaclGraph, abort_on_error = False, meta_shacl = False, debug = False, advanced = True, do_owl_imports = True )

print( report[2] )

super(type, obj): obj must be an instance or subtype of type

Updated the library to pull in some recent changes, and ran into this error: super(type, obj): obj must be an instance or subtype of type. The function below was running fine before the update. Any idea what could be causing this?

try: 
    conforms, v_graph, v_text = validate(places, shacl_graph=places_shape,
                                     data_graph_format=data_file_format,
                                     shacl_graph_format=shapes_file_format,
                                     inference='rdfs', debug=True,
                                     serialize_report_graph=True)
    print(conforms)
    
except Exception as e:
    print(e)
    pass

Resource of http://www.w3.org/ns/shacl#value is empty in validation report

In the report, the resource referenced by http://www.w3.org/ns/shacl#value is empty; see the JSON below.

[
  {
    ...
    "@type": [
      "http://www.w3.org/ns/shacl#ValidationResult"
    ],
    "http://www.w3.org/ns/shacl#focusNode": [
      {
        "@id": "http://vangoghmuseum.nl/data/artwork/d0005V1962"
      }
    ],
    ...
    "http://www.w3.org/ns/shacl#value": [
      {
        "@id": "_:N6087b61f1f1d44e08519420c185ba3f2"
      }
    ]
  },
  {
    "@id": "_:N6087b61f1f1d44e08519420c185ba3f2"
  },

This report is the result of a propertyShape with a sh:node constraint. The first validation result in the example contains the information of the shape containing the sh:node. This is fine. The value (N6087b61f1f1d44e08519420c185ba3f2) should contain the information of the result for the sh:node. I confirmed this in TopBraid.

validation with sh:closed

Given the Shapes Graph:

@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.com/ex#> .

ex:Parent a rdfs:Class ;
    rdfs:isDefinedBy ex: ;
    rdfs:comment "The parent class"@en ;
    rdfs:subClassOf owl:Thing .

ex:ParentShape a sh:NodeShape ;
    rdfs:isDefinedBy ex: ;
    sh:property [
        sh:datatype xsd:string ;
        sh:path ex:name ;
        sh:maxCount 1 ;
        sh:minCount 1 ;
    ] ;
    sh:closed true ; 
    sh:ignoredProperties ( rdf:type ) ;    
    sh:targetClass ex:Parent .

and the Data Graph:

{
    "@context": {
        "@vocab": "http://example.com/ex#"
    },
    "@type": "Parent",
    "name": "Father",
    "dummy": "Dummy value"
}

I expect to see a sh:ClosedConstraintComponent validation failure because of (ex:ParentShape sh:closed, true) in the Shapes Graph and the presence of the property "dummy": "Dummy value" in the Data Graph.

However, when using the pyshacl (0.9.5) validate function, no such validation failure is generated. Instead, the text result is:

Validation Report
Conforms: True

In http://shacl.org/playground/ the expected validation failure is produced.

[Discussion] PySHACL Alternate Modes

PySHACL was originally built to be a basic (but fully standards compliant) SHACL validator. That is, it uses SHACL shapes to check conformance of a data graph, and gives you the result (True/False, plus a ValidationReport).
PySHACL does that job quite well. It can be called from python or from the command line, and it delivers the results users expect.

Over the last 12 months, I've been slowly implementing more of the SHACL Advanced Features spec, and pySHACL is now almost AF-complete.

The Advanced features add capability to SHACL which extends beyond that of just validating. Eg, the SHACL Rules allow you to run SHACL-based entailment on your data graph. SHACL Functions allow you to execute parameterised custom SPARQL Functions over the data graph. Custom Targets allow you to bypass the standard SHACL node-targeting mechanism and use SPARQL to select targets.

These features can be useful for executing validation in a more customisable way, but their major benefit is in general use beyond just validating a data graph against constraints.

With these new features I see the possibility of PySHACL operating in additional alternative modes, besides that of just validating. Eg, expansion mode could run SHACL-AF Functions and Rules on the data graph, then return the expanded data graph (without validating).

Related to #20

Monolithic file generates report, split .ttl files do not.

I have a gist with all of the relevant files at: https://gist.github.com/James-Hudson3010/2588d9b17dd33e15922122b8b5cf1bd7

If I execute:

$ pyshacl -a -f human employees.ttl

I get the following, correct validation report...

Validation Report
Conforms: False
Results (3):
Constraint Violation in MaxInclusiveConstraintComponent (http://www.w3.org/ns/shacl#MaxInclusiveConstraintComponent):
	Severity: sh:Violation
	Source Shape: hr:jobGradeShape
	Focus Node: d:e4
	Value Node: Literal("8", datatype=xsd:integer)
	Result Path: hr:jobGrade
Constraint Violation in DatatypeConstraintComponent (http://www.w3.org/ns/shacl#DatatypeConstraintComponent):
	Severity: sh:Violation
	Source Shape: hr:jobGradeShape
	Focus Node: d:e3
	Value Node: Literal("3.14", datatype=xsd:decimal)
	Result Path: hr:jobGrade
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
	Severity: sh:Violation
	Source Shape: hr:jobGradeShape
	Focus Node: d:e2
	Result Path: hr:jobGrade

However, if I split employees.ttl into three files containing the schema, shape, and instance data and run:

pyshacl -s shape.ttl -e schema.ttl -a -f human instance.ttl

the result is:

Validation Report
Conforms: True

I assume I am calling pyshacl correctly.

Could not install pyshacl using pip

I am getting the following error when I try to install pyshacl:

Could not find a version that satisfies the requirement RDFClosure (from pyshacl) (from versions: )
No matching distribution found for RDFClosure (from pyshacl)
