semiceu / dcat-ap

This is the issue tracker for the maintenance of DCAT-AP

Home Page: https://joinup.ec.europa.eu/solution/dcat-application-profile-data-portals-europe

JavaScript 1.49% HTML 98.51%
dcat-ap data-portals europe dcat open-data data-specification application-profile

dcat-ap's People

Contributors

addragan, aleksandralavreneva, barthelemyf, bertvannuffelen, emidiostani, emielpwc, fsantiagoec, jakubklimek, jensscheerlinck, makxdekkers, natasasofou, nishad, sethvanhooland, williamverbeeck

dcat-ap's Issues

dcat:Dataset - harmonise the use of dct:accrualPeriodicity

The DCAT-AP usage analysis on the European Data Portal showed the following:

dct:accrualPeriodicity:
• Usage: 44.98%
• Issue: some datasets are using the URIs of the MDR Frequencies NAL while other datasets use the label of the same authority list.
• Is there a need for guidelines to harmonise usage across implementations?
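
The two encodings reported above might look as follows in Turtle. This is only a sketch: the dataset IRIs are hypothetical, and the literal "monthly" is an assumed example of a label that a portal might publish instead of the URI.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .

# Hypothetical dataset that points to a concept URI in the MDR Frequency NAL.
<http://example.org/dataset/a>
    a dcat:Dataset ;
    dct:accrualPeriodicity <http://publications.europa.eu/resource/authority/frequency/MONTHLY> .

# Hypothetical dataset that carries a plain label instead of the URI.
<http://example.org/dataset/b>
    a dcat:Dataset ;
    dct:accrualPeriodicity "monthly" .
```

A consumer cannot compare the two values without a label-to-URI mapping, which is the harmonisation problem raised in this issue.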

dcat:Distribution - usage of dct:title and dct:format

The DCAT-AP usage analysis on the European Data Portal showed the following:

dct:title and dct:format:
• Usage: 77.17% and 12.17% respectively
• Issue: Some distributions use dct:title instead of dct:format to provide information about the format of the distribution.
• Is there a need for guidelines to explain how to use dct:title and dct:format?
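
As an illustration of the pattern described above, the sketch below contrasts a distribution whose format only appears in dct:title with one that separates the two properties. The distribution IRIs, file URL and title text are hypothetical; the use of a File Type NAL URI for dct:format is an assumption about what a guideline could recommend, not a decision recorded in this issue.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .

# Pattern reported in the issue: the format is only stated in the title.
<http://example.org/distribution/a>
    a dcat:Distribution ;
    dcat:accessURL <http://example.org/files/air-quality-2016.csv> ;
    dct:title "CSV" .

# One possible alternative: a descriptive title plus a separate format value.
<http://example.org/distribution/b>
    a dcat:Distribution ;
    dcat:accessURL <http://example.org/files/air-quality-2016.csv> ;
    dct:title "Air quality measurements 2016"@en ;
    dct:format <http://publications.europa.eu/resource/authority/file-type/CSV> .
```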

dcat:Dataset - how to encourage the use of dct:description?

The DCAT-AP usage analysis on the European Data Portal showed the following:

dct:description (mandatory property)

  • Usage: 96.46% of the datasets queried
  • Should we encourage adding a description to the 26,328 datasets that don’t use this mandatory property?

Proposal to use sh:targetClass in the SHACL shapes

In a project for the French Ministry of Health I am reusing the DCAT-AP SHACL shapes.

The current SHACL shapes (at least the mandatory ones) apply only to exactly the classes indicated. With sh:targetClass, the shapes would also apply to subclasses (declared with rdfs:subClassOf in the model), which would improve the reusability of the SHACL shapes.

The alternative would be to define the subclasses directly in the shapes, but this is not good for modularity.
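
A minimal sketch of the proposal, under stated assumptions: the ex: namespace, the shape name ex:DatasetShape and the subclass ex:HealthDataset are hypothetical, and the single dct:title constraint stands in for the full set of DCAT-AP constraints. Note that SHACL resolves class targets through rdfs:subClassOf triples found in the data graph, so the subclass declaration has to be available there.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.org/ns#> .

# Shapes graph: the shape targets dcat:Dataset and its subclasses.
ex:DatasetShape
    a sh:NodeShape ;
    sh:targetClass dcat:Dataset ;
    sh:property [
        sh:path dct:title ;
        sh:minCount 1
    ] .

# Data graph: a hypothetical subclass declared by the reusing profile.
ex:HealthDataset rdfs:subClassOf dcat:Dataset .

# This instance has no dct:title, so the shape above would report it,
# even though it is typed only with the subclass.
ex:dataset1 a ex:HealthDataset .
```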

Support for data integration

By Bert van Nuffelen, created from comment at #4:

In general I like the number of fields: most have good definitions and are there to support specific use cases.

They are rightly mostly marked as optional, because providing (and maintaining) good values for them requires effort.
For me, the statistics shown in each of the issues indicate two problems:
a) the definition did not fit well, and hence the property was not used
b) the data could not be collected (for whatever reason)

Reducing the properties to those that most portals provide has a potential impact on the usability of the catalogue.
I think we should consider who the target audiences for the catalogue are and then see whether the data provided helps to achieve that purpose. Today's open data portals are mostly oriented towards accidental visitors who want to know what data exists and browse through it a bit. The catalogues are seldom used as a navigator by technical audiences who would use them as a source for combining the actual data.
To do so, technical properties such as checksums, formats, sizes and API descriptions are needed.
And maybe this is worth a discussion: how can DCAT-AP handle the different dissemination forms more consistently and provide a high-level description for them, so that this data integration can be realized?
It would be great if we could arrive at an extension approach so that this can also lead to good interoperability. I am aware that this has its limits from the perspective of DCAT-AP, but from the perspective of developers, self-descriptive services/distributions are what they ask for. In my opinion, it would be fantastic if it were clarified how this bridge is made.

dcat:Distribution - remove adms:sample?

The DCAT-AP usage analysis on the European Data Portal showed the following:

adms:sample
• Usage: 0%
• Should this property be removed/withdrawn/deprecated?

dcat:Distribution - remove dcat:byteSize?

The DCAT-AP usage analysis on the European Data Portal showed the following:

dcat:byteSize:
• Usage: near 0%
• Should this property be removed/withdrawn/deprecated?

Solve ambiguity issues when properties are used by different classes in JSON-LD

The following properties are affected

dcat:Dataset - remove dct:hasVersion?

The DCAT-AP usage analysis on the European Data Portal showed the following:

dct:hasVersion:
• Usage: 0%
• Should this property be removed/withdrawn/deprecated?

dcat:Distribution - remove dct:conformsTo?

The DCAT-AP usage analysis on the European Data Portal showed the following:

dct:conformsTo:
• Usage: near 0%
• Should this property be removed/withdrawn/deprecated?

dcat:Distribution - use of dct:format and dcat:mediaType

The DCAT-AP usage analysis on the European Data Portal showed the following:

dct:format and dcat:mediaType
• Usage: 12.17% and 25.52% respectively
• Issue: dct:format is a recommended property but there are Distributions that use the optional property dcat:mediaType, which is more restrictive (IANA media type).
• Is there a need for guidelines explaining how to use dct:format and dcat:mediaType similar to guideline “How to use accessURL and downloadURL?”, e.g. if specific media type(s) can be provided, duplicate value of the media type(s) in both dcat:mediaType and dct:format?
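
The duplication suggested in the last bullet could look like the sketch below. The distribution IRI and file URL are hypothetical, and expressing the media type as an IANA registry URI (rather than as a literal such as "text/csv") is an assumption; the issue does not say which encoding a guideline would prescribe.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .

# Hypothetical distribution that repeats the same IANA media type
# in both dcat:mediaType and dct:format.
<http://example.org/distribution/c>
    a dcat:Distribution ;
    dcat:accessURL <http://example.org/files/air-quality-2016.csv> ;
    dcat:mediaType <https://www.iana.org/assignments/media-types/text/csv> ;
    dct:format     <https://www.iana.org/assignments/media-types/text/csv> .
```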

dcat:Dataset - remove dct:isVersionOf?

The DCAT-AP usage analysis on the European Data Portal showed the following:

dct:isVersionOf:
• Usage: 0%
• Should this property be removed/withdrawn/deprecated?

dcat:Dataset - remove dct:source?

The DCAT-AP usage analysis on the European Data Portal showed the following:

dct:source:
• Usage: 0%
• Should this property be removed/withdrawn/deprecated?

dcat:Dataset - remove dct:type?

The DCAT-AP usage analysis on the European Data Portal showed the following:

dct:type:
• Usage: 0%
• Should this property be removed/withdrawn/deprecated?

dcat:Dataset - remove adms:versionNotes?

The DCAT-AP usage analysis on the European Data Portal showed the following:

adms:versionNotes:
• Usage: 0%
• Should this property be removed/withdrawn/deprecated?

dcat:Dataset - use of dct:publisher and dcat:contactPoint

The DCAT-AP usage analysis on the European Data Portal showed the following:

dct:publisher and dcat:contactPoint
• Usage: 43.85% and 45.44% respectively; 5.2% use both properties
• How could we promote the usage of these recommended properties, in addition to the guideline “How are publisher and contact point modelled”?

dcat:Distribution - remove dct:language?

The DCAT-AP usage analysis on the European Data Portal showed the following:

dct:language:
• Usage: near 0%
• Should this property be removed/withdrawn/deprecated?

dcat:CatalogRecord - remove dct:title?

The DCAT-AP usage analysis on the European Data Portal showed the following:

dct:title:

  • Usage: 0%
  • Should this property be removed/withdrawn/deprecated?

dcat:Dataset - remove owl:versionInfo?

The DCAT-AP usage analysis on the European Data Portal showed the following:

owl:versionInfo:
• Usage: 0%
• Should this property be removed/withdrawn/deprecated?

dcat:Catalog - remove dct:language?

The DCAT-AP usage analysis on the European Data Portal showed the following:

dct:language
• Usage: 0%
• Should this property be removed/withdrawn/deprecated?

Comments on version 1.2

  1. Section 9 needs to be updated.
  2. Annex II needs to be updated. The MDR Data Theme NAL was published in 2016.
  3. The change log also includes the changes from version 1.0 to 1.1 which looks unnecessary.
  4. The description of the latest update in the change log states that the "property can be repeated for multiple licences" which is incorrect. It should say something like "can be repeated in the case that multiple licence types apply to a licence document".

dcat:Catalog - remove dct:rights?

The DCAT-AP usage analysis on the European Data Portal showed the following:

dct:rights
• Usage: 0%
• Should this property be removed/withdrawn/deprecated?

dcat:Catalog - dct:spatial recommended?

The DCAT-AP usage analysis on the European Data Portal showed the following:

dct:spatial
• Usage: 83.75% with almost all countries represented
• Should this property be made recommended?

licenseDocument

Hi,

In the context of the Flemish Open Data portal we would like to have advice on the creation of a specific instance of a license document.

Q: Is there a LicenseDocument vocabulary that allows a license document to be described by extending an existing license with instances of clauses?

The example case is as follows: the Flemish government has defined 2 licenses which state in short: you can use the data if you
a) mention the source, or
b) pay a fee

Now we would like to create a license document that instantiates the general license document with the specific name of the source (or the fee).

Pointers to how we best address this topic are welcome.

Bert

dcat:Distribution - remove foaf:page?

The DCAT-AP usage analysis on the European Data Portal showed the following:

foaf:page:
• Usage: 0%
• Should this property be removed/withdrawn/deprecated?

Proposal to use IRI nodes for properties in SHACL shapes instead of blank nodes

To make the SHACL shapes easier to reuse, it would be nice if the property restrictions were written as IRI nodes instead of blank nodes, so that reusers can disable certain shapes if they do not fit their application profile.

E.g.
dcat:Distribution
    rdf:type sh:NodeShape ;
    sh:property dcat:accessURLShape .

dcat:accessURLShape
    rdf:type sh:PropertyShape ;
    sh:path dcat:accessURL ;
    sh:class rdfs:Resource ;
    sh:minCount 1 ;
    sh:severity sh:Violation .

Such a shape requires the access URL to be explicitly typed as an rdfs:Resource, whereas most of the time just a URL would be enough.

If the property shape is identified by an IRI, reusers can deactivate it for their own needs:

dcat:accessURLShape
    sh:deactivated true .
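
To illustrate why a reuser might want to deactivate the property shape above, the sketch below shows data that would be affected by the sh:class rdfs:Resource constraint. It assumes a SHACL processor that applies no RDFS inference; the distribution IRI and file URL are hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Hypothetical distribution. With only these triples, the access URL has no
# rdf:type in the data graph, so it would not satisfy sh:class rdfs:Resource.
<http://example.org/distribution/d>
    a dcat:Distribution ;
    dcat:accessURL <http://example.org/files/data.csv> .

# Adding this explicit type would make the value conform,
# which is the extra statement the issue considers unnecessary.
<http://example.org/files/data.csv> a rdfs:Resource .
```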

Missing link between checksum class (value) and sum-checked distribution (accessURL?)

When working on https://www.dcat-ap.de/ we encountered a missing link between the binary file described in the distribution and its checksum value in RDF:
If you serialize into RDF (that is, if you do not rely on the hierarchical XML structure), it is not clear which checksum value is associated with which file in which distribution. The problem occurs because the XML processor cannot guarantee the order of things, and Distributions and Checksums are two independent classes with a link currently missing between them; the rdf:nodeID is optional, and it is not clear what to put into it to create this linkage.

Perhaps using the dataset's dcterms:identifier or the distribution's accessURL (following the logic in https://joinup.ec.europa.eu/release/dcat-ap-how-use-identifiers-datasets-and-distributions ) as the rdf:nodeID of the checksum class might be a workaround for this. Could it be added when reworking the spdx vocabulary, or otherwise be considered in the current ISA² DCAT-AP review?
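
For comparison, the sketch below shows how the association might be expressed if the spdx:checksum property (listed for dcat:Distribution elsewhere in this tracker) is used on the distribution itself. In that case the triple ties the checksum to its distribution regardless of document order. The distribution IRI and file URL are hypothetical, and the checksum value is a placeholder (the SHA-1 digest of empty input). Whether this addresses the RDF/XML situation described above is not confirmed by the issue.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix spdx: <http://spdx.org/rdf/terms#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# Hypothetical distribution that links directly to its checksum node.
<http://example.org/distribution/e>
    a dcat:Distribution ;
    dcat:accessURL <http://example.org/files/data.csv> ;
    spdx:checksum [
        a spdx:Checksum ;
        spdx:algorithm spdx:checksumAlgorithm_sha1 ;
        # Placeholder value: the SHA-1 digest of empty input.
        spdx:checksumValue "da39a3ee5e6b4b0d3255bfef95601890afd80709"^^xsd:hexBinary
    ] .
```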

dcat:Dataset - remove dct:relation?

The DCAT-AP usage analysis on the European Data Portal showed the following:

dct:relation:
• Usage: 0%
• Should this property be removed/withdrawn/deprecated?

DCAT: Distribution – suggestion attribute for record – ‘planned availability’

Def:
Uses a list of guaranteed availabilities, like

http://dcat-ap.de/def/plannedAvailability/temporary
http://dcat-ap.de/def/plannedAvailability/experimental
http://dcat-ap.de/def/plannedAvailability/available
http://dcat-ap.de/def/plannedAvailability/stable

Datatype: rdfs:Resource

Reasoning for change:
This concept was added to dcat-ap.de and seems to be a good extension for dcat-ap itself; perhaps even W3C dcat.
Indicating a dataset's planned lifetime close to its other metadata seems a good idea for fostering reuse and for decoupling the dataset metadata from its technical publishing channel.
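
A sketch of how such a value could be attached to a distribution. The property name ex:plannedAvailability and the ex: namespace are hypothetical placeholders, since the issue does not name the property; only the value URI is taken from the list above.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ex:   <http://example.org/ns#> .

# Hypothetical distribution with a planned availability taken from the list above.
<http://example.org/distribution/f>
    a dcat:Distribution ;
    dcat:accessURL <http://example.org/files/data.csv> ;
    ex:plannedAvailability <http://dcat-ap.de/def/plannedAvailability/stable> .
```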

dcat:Dataset - remove adms:sample?

The DCAT-AP usage analysis on the European Data Portal showed the following:

adms:sample:
• Usage: 0%
• Should this property be removed/withdrawn/deprecated?

DCAT: Distribution – Suggestion attribute for record - ‘temporal granularity’

def: the temporal granularity is the temporal resolution of the contained data

Datatype: rdfs:Literal (string), enumeration of: second, minute, hour, day, week, month, quarter, year, 5-years

Reasoning for change:
This metadata was already present in the former CKAN / OGD schema; it is needed in part, e.g. by the statistical domain or for the metadata attribution of fiscal data. Note: it is not the same as temporal coverage.
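
The enumeration above could be checked with a SHACL constraint along the following lines. This is a sketch under assumptions: the property ex:temporalGranularity, the shape name and the ex: namespace are hypothetical, and only the list of allowed strings comes from the suggestion.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/ns#> .

# Hypothetical shape restricting the granularity to the proposed values.
ex:TemporalGranularityShape
    a sh:NodeShape ;
    sh:targetClass dcat:Distribution ;
    sh:property [
        sh:path ex:temporalGranularity ;
        sh:datatype xsd:string ;
        sh:in ( "second" "minute" "hour" "day" "week"
                "month" "quarter" "year" "5-years" )
    ] .

# Hypothetical distribution with monthly data.
<http://example.org/distribution/g>
    a dcat:Distribution ;
    ex:temporalGranularity "month" .
```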

DCAT: Dataset – Suggestion attribute for record - ‘maintainer’

Def:
The person who maintains a dataset and is responsible for it and its publication

Datatype: rdfs:Literal string

Reasoning for change:
The maintainer (such as the curator) cares for the dataset and is in charge of keeping the metadata up to date. This concept was added in dcat-ap.de because the existing properties were found not to be appropriate.

dcat:Distribution - remove dct:rights?

The DCAT-AP usage analysis on the European Data Portal showed the following:

dct:rights:
• Usage: near 0%
• Should this property be removed/withdrawn/deprecated?

Cardinality dct:type on Licence Document

Bert van Nuffelen, TenForce

Why is the cardinality for dct:type on licenceDocument set to max 1? For some licenses, more than one concept from the ADMS:LicenceType controlled vocabulary can apply.
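
A licence document with two types might look like the sketch below, which a maximum cardinality of 1 would reject. The licence IRI is hypothetical, and the two concept URIs are assumed to be members of the ADMS licence type vocabulary; they should be checked against the published vocabulary before reuse.

```turtle
@prefix dct: <http://purl.org/dc/terms/> .

# Hypothetical licence to which two licence types apply.
<http://example.org/licence/example>
    a dct:LicenseDocument ;
    dct:type <http://purl.org/adms/licencetype/Attribution> ,
             <http://purl.org/adms/licencetype/NonCommercialUseOnly> .
```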

dcat:Distribution - remove spdx:checksum?

The DCAT-AP usage analysis on the European Data Portal showed the following:

spdx:checksum:
• Usage: near 0%
• Should this property be removed/withdrawn/deprecated?

dcat:Dataset - remove dct:accessRights?

The DCAT-AP usage analysis on the European Data Portal showed the following:

dct:accessRights:
• Usage: 0%
• Should this property be removed/withdrawn/deprecated?

dcat:Catalog - dcat:themeTaxonomy mandatory?

The DCAT-AP usage analysis on the European Data Portal showed the following:

dcat:themeTaxonomy
• Usage: 100% of the catalogues queried by the European Data Portal
• Should this property be made mandatory?
