mjanez / ckanext-schemingdcat

This project forked from opendatagis/ckanext-facet_scheming

Improved ckanext-scheming with DCAT, DCAT-AP and GeoDCAT-AP/INSPIRE custom schemas and tools.

Home Page: https://github.com/mjanez/ckan-docker

License: GNU Affero General Public License v3.0

Python 53.27% CSS 10.28% HTML 34.74% JavaScript 1.71%
ckan ckanext-dcat dcat dcat-ap faceted-search geodcat-ap inspire metadata ckan-schema ckanext-scheming

ckanext-schemingdcat's Issues

DCAT - Custom DCAT-AP/GeoDCAT-AP (ES/EU) profiles in schemingdcat

ckanext-dcat

ckanext-scheming

  • Enhance schemas to ensure compatibility with new profiles:

Feature - Add ATOM Feed and refactor controller

  • Refactor all controllers to controller.py:

  • ATOM Feed

    • Add Feed plugin:
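      # Module-level import needed by before_map (Pylons-era CKAN routing).
      from routes.mapper import SubMapper

      def before_map(self, map):
          # Serve organization feeds through the custom feed controller.
          with SubMapper(map, controller='ckanext.scheming_dcat.controller:SchemingDcatFeedController') as m:
              m.connect('/feeds/organization/{id}.atom', action='organization')
          # Per-dataset ATOM feed, handled by SchemingDcatFeedController.dataset().
          map.connect(
              'general', '/feeds/dataset/{pkg_id}.atom',
              controller='ckanext.scheming_dcat.controller:SchemingDcatFeedController',
              action='dataset',
          )
          return map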
    • Add FeedController for ATOM Feed:
      class SchemingDcatFeedController(FeedController):
          def general(self):
              data_dict, params = self._parse_url_params()
              data_dict['q'] = '*:*'
      
              item_count, results = _package_search(data_dict)
      
              navigation_urls = self._navigation_urls(params,
                                                      item_count=item_count,
                                                      limit=data_dict['rows'],
                                                      controller='feed',
                                                      action='general')
      
              feed_url = self._feed_url(params,
                                        controller='feed',
                                        action='general')
      
              alternate_url = self._alternate_url(params)
      
              return self.output_feed(
                  results,
                  feed_title=_(u'Open Government Dataset Feed'),
                  feed_description='',
                  feed_link=alternate_url,
                  feed_guid=_create_atom_id(
                      u'/feeds/dataset.atom'),
                  feed_url=feed_url,
                  navigation_urls=navigation_urls
              )
      
          def dataset(self, pkg_id):
              try:
                  context = {'model': model, 'session': model.Session,
                             'user': c.user, 'auth_user_obj': c.userobj}
                  get_action('package_show')(context,
                                              {'id': pkg_id})
              except NotFound:
                  abort(404, _('Dataset not found'))
      
              data_dict, params = self._parse_url_params()
      
              data_dict['fq'] = '{0}:"{1}"'.format('id', pkg_id)
      
              item_count, results = _package_search(data_dict)
      
              navigation_urls = self._navigation_urls(params,
                                                      item_count=item_count,
                                                      limit=data_dict['rows'],
                                                      controller='feed',
                                                      action='dataset',
                                                      id=pkg_id)
      
              feed_url = self._feed_url(params,
                                        controller='feed',
                                        action='dataset',
                                        id=pkg_id)
      
              alternate_url = self._alternate_url(params, id=pkg_id)
      
              return self.output_feed(
                  results,
                  feed_title=_(u'Open Government Dataset Feed'),
                  feed_description='',
                  feed_link=alternate_url,
                  feed_guid=_create_atom_id(
                      u'/feeds/dataset/{0}.atom'.format(pkg_id)),
                  feed_url=feed_url,
                  navigation_urls=navigation_urls
              )
      
          def output_feed(self, results, feed_title, feed_description,
                          feed_link, feed_url, navigation_urls, feed_guid):
      
              author_name = config.get('ckan.feeds.author_name', '').strip() or \
                  config.get('ckan.site_id', '').strip()
              author_link = config.get('ckan.feeds.author_link', '').strip() or \
                  config.get('ckan.site_url', '').strip()
      
              feed = _FixedAtom1Feed(
                  title=feed_title,
                  link=feed_link,
                  description=feed_description,
                  language=u'en',
                  author_name=author_name,
                  author_link=author_link,
                  feed_guid=feed_guid,
                  feed_url=feed_url,
                  previous_page=navigation_urls['previous'],
                  next_page=navigation_urls['next'],
                  first_page=navigation_urls['first'],
                  last_page=navigation_urls['last'],
              )
      
              for pkg in results:
                  feed.add_item(
                      title=h.get_translated(pkg, 'title'),
                      link=h.url_for(controller='package', action='read', id=pkg['id']),
                      description=h.get_translated(pkg, 'notes'),
                      updated=date_str_to_datetime(pkg.get('metadata_modified')),
                      published=date_str_to_datetime(pkg.get('metadata_created')),
                      unique_id=_create_atom_id(u'/dataset/%s' % pkg['id']),
                      author_name=pkg.get('author', ''),
                      author_email=pkg.get('author_email', ''),
                      categories=''.join(e['value']
                                         for e in pkg.get('extras', [])
                                         if e['key'] == lx('keywords')).split(','),
                      enclosure=webhelpers.feedgenerator.Enclosure(
                          self.base_url + url(str(
                              '/api/action/package_show?id=%s' % pkg['name'])),
                          unicode(len(json.dumps(pkg))),   # TODO fix this
                          u'application/json')
                  )
              response.content_type = feed.mime_type
              return feed.writeString('utf-8')

Schema - Improve spatial_uri field

Migrated from: mjanez/ckanext-scheming#10

Steps

GeoDCAT-AP Schema

DCAT-AP Schema

DCAT-AP Info

Property URI: dct:spatial

  • Used for Class: Catalogue, Dataset
  • Vocabulary name: EU Vocabularies Continents Named Authority List, EU Vocabularies Countries Named Authority List, EU Vocabularies Places Named Authority List, Geonames
  • Vocabulary URI: http://publications.europa.eu/resource/authority/continent, http://publications.europa.eu/resource/authority/country, http://publications.europa.eu/resource/authority/place, http://sws.geonames.org/
  • Usage Note: The EU Vocabularies Named Authority Lists must be used for continents, countries and places that are in those lists; if a particular location is not in one of the mentioned Named Authority Lists, Geonames URIs must be used.
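
The dct:spatial usage note above amounts to a prefix check on the URI. A minimal sketch of such a check, assuming a hypothetical helper named classify_spatial_uri (it is not part of ckanext-schemingdcat) and using only the vocabulary URIs listed above:

```python
# Vocabulary prefixes copied from the dct:spatial entry above.
EU_NAL_PREFIXES = {
    "continent": "http://publications.europa.eu/resource/authority/continent",
    "country": "http://publications.europa.eu/resource/authority/country",
    "place": "http://publications.europa.eu/resource/authority/place",
}
GEONAMES_PREFIX = "http://sws.geonames.org/"


def classify_spatial_uri(uri):
    """Return the vocabulary a spatial URI belongs to, or None if unknown.

    Hypothetical helper for illustration only.
    """
    uri = uri.strip()
    for name, prefix in EU_NAL_PREFIXES.items():
        # Require the path separator so ".../country" does not match
        # an unrelated ".../countryX" URI.
        if uri.startswith(prefix + "/"):
            return name
    if uri.startswith(GEONAMES_PREFIX):
        return "geonames"
    return None
```

A spatial_uri validator could reject values for which this returns None. It only checks the URI prefix, not whether the term actually exists in the vocabulary.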

Add temporal_resolution field

  1. xsd:duration (based on ISO 8601): https://www.w3.org/TR/xmlschema11-2/#duration

duration is a datatype that represents durations of time. The concept of duration being captured is drawn from those of [ISO 8601], specifically durations without fixed endpoints. For example, "15 days" (whose most common lexical representation in duration is "'P15D'") is a duration value; "15 days beginning 12 July 1995" and "15 days ending 12 July 1995" are not duration values. duration can provide addition and subtraction operations between duration values and between duration/dateTime value pairs, and can be the result of subtracting dateTime values. However, only addition to dateTime is required for XML Schema processing and is defined in the function ·dateTimePlusDuration·.
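
A temporal_resolution value could be checked against the duration lexical form quoted above. The following is a rough sketch, not the extension's validator: the function name is_xsd_duration is an assumption, and the regular expression is a simplified approximation of the xsd:duration grammar (it checks the shape only).

```python
import re

# Simplified xsd:duration pattern: optional sign, 'P', date components,
# then an optional 'T' section with time components. The lookaheads make
# sure at least one component is present and that 'T' is never left empty.
_DURATION_RE = re.compile(
    r"-?P(?=\d|T\d)"
    r"(?:\d+Y)?(?:\d+M)?(?:\d+D)?"
    r"(?:T(?=\d)(?:\d+H)?(?:\d+M)?(?:\d+(?:\.\d+)?S)?)?"
)


def is_xsd_duration(value):
    """Return True if value looks like an xsd:duration such as 'P15D'."""
    return isinstance(value, str) and _DURATION_RE.fullmatch(value) is not None
```

With this sketch, "P15D" and "PT1H" are accepted, while "15 days", "P" and "PT" are rejected.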

Schema - Fix theme/theme_es/theme_eu fields

Keep the naming of field_names consistent: use the EU INSPIRE themes as the default and add suffixes to any national values.

The following components also need to be adapted:

  • ckan-docker (Solr fields, .env, etc.)
  • ckan-ogc theme/theme_inspire (Harvester/Ingester fields)
  • ckan-pycsw (theme == <gmd:keyword> from Thesaurus.)

Fix - Fix schema/components output_validators for multiple fields

Internally all extra fields are stored as strings. If you are attempting to save and restore other types of data you will need to use output validators.

For example, if you use a simple "yes/no" question, you will need to let ckanext-scheming know that this data needs to be stored and retrieved as a boolean. This is achieved using the validators and output_validators keys.

 - field_name: is_camel_friendly
   label: Is this camel friendly?
   required: true
   preset: select
   choices:
     - value: false
       label: "No"
     - value: true
       label: "Yes"
   validators: scheming_required boolean_validator
   output_validators: boolean_validator

Info: https://github.com/ckan/ckanext-scheming#output_validators
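
The string round trip described above can be illustrated without CKAN. This is a standalone sketch under stated assumptions: to_stored and boolean_output are hypothetical stand-ins, not the functions ckanext-scheming or CKAN actually use, and the set of accepted "true" strings is an illustrative choice.

```python
# Strings treated as true when reading a stored value back (assumed set).
TRUE_STRINGS = {"true", "yes", "t", "y", "1"}


def to_stored(value):
    """Mimic how an extra field ends up in storage: always as text."""
    return str(value)


def boolean_output(value):
    """Hypothetical output validator: turn stored text back into a bool."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in TRUE_STRINGS


stored = to_stored(True)            # the boolean is now the text 'True'
restored = boolean_output(stored)   # the output step recovers the bool
```

Without the output step, API consumers would receive the text rather than a boolean, which is the failure the issue describes for fields that hold multiple values.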

Schemas

  • Fix all multiple fields in custom schemas.
    • DCAT
    • DCAT-AP
    • GeoDCAT-AP (EU/ES)

Components of ckan-docker deployment

Fix - organizations/groups multilang fields output

The API output for groups and organisations needs to be fixed so that it shows more information; alternatively, a specific action could be created for this.

The same applies to tags.

Specifically, the multilingual fields are affected.

Example

https://hermes.tragsatec.es/catalogo/api/3/action/group_list?all_fields=True

{
            "approval_status": "approved",
            "created": "2023-10-18T16:12:32.494529",
            "description": "",
            "display_name": "",
            "id": "64b56bd3-27da-4d20-b3c7-4cd3de945789",
            "image_display_url": "https://hermes.tragsatec.es/catalogo/dataset/6439fa33-1fe9-43ce-adf3-3a2e3177246a/resource/523a3d6a-9dbd-4059-b74e-20cce785f9c9/download/af_2.jpg",
            "image_url": "https://hermes.tragsatec.es/catalogo/dataset/6439fa33-1fe9-43ce-adf3-3a2e3177246a/resource/523a3d6a-9dbd-4059-b74e-20cce785f9c9/download/af_2.jpg",
            "is_organization": false,
            "name": "7_alternative-fuels",
            "num_followers": 0,
            "package_count": 10,
            "state": "active",
            "title": "",
            "type": "group"
        }
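
One way the empty title and display_name above could be filled is sketched below. It assumes the group record also carries title_translated and description_translated dictionaries keyed by language code; that layout, the helper names and the sample translations are all assumptions made for illustration, not values from the catalogue.

```python
def pick_translation(record, field, lang, default_lang="en"):
    """Pick the best available text for a multilingual field."""
    translated = record.get(field + "_translated") or {}
    # Prefer the requested language, then the default language.
    for code in (lang, default_lang):
        if translated.get(code):
            return translated[code]
    # Otherwise fall back to any non-empty translation.
    for text in translated.values():
        if text:
            return text
    return record.get(field, "")


def fill_display_fields(group, lang):
    """Return a copy of the group with title/display_name filled in."""
    out = dict(group)
    out["title"] = pick_translation(group, "title", lang)
    out["description"] = pick_translation(group, "description", lang)
    # Use the URL name as a last resort so display_name is never empty.
    out["display_name"] = out["title"] or group.get("name", "")
    return out


# Sample record: name taken from the example above, translations invented.
sample_group = {
    "name": "7_alternative-fuels",
    "title": "",
    "display_name": "",
    "title_translated": {"en": "Alternative fuels", "es": "Combustibles alternativos"},
}
```

Calling fill_display_fields(sample_group, "es") yields a display_name in Spanish, and an unsupported language code falls back to the default language.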

tags

{
            "id": "96d4a6d8-b9d2-470e-aad6-57689396882d",
            "name": "theme",
            "tags": [
                {
                    "id": "e50c57e9-54c5-4400-806f-3e30b313abd9",
                    "name": "ac",
                    "vocabulary_id": "96d4a6d8-b9d2-470e-aad6-57689396882d",
                    "display_name": "ac"
                },
                {
                    "id": "228741b0-c86c-4ee4-8f8a-3c83e45aaa71",
                    "name": "ad",
                    "vocabulary_id": "96d4a6d8-b9d2-470e-aad6-57689396882d",
                    "display_name": "ad"
                },
                {
                    "id": "a5f92a8d-0f07-4c8b-8241-516fb985d8d2",
                    "name": "af",
                    "vocabulary_id": "96d4a6d8-b9d2-470e-aad6-57689396882d",
                    "display_name": "af"
                },
                ...
            ]
}

Schema - Add graphic_overview field

Context info

Description:
Graphic that provides an illustration of the dataset

  • Pygeometa MCF Core schema

  • XML Schema ISO19139 gmd:graphicOverview:

    <gmd:identificationInfo>
    ...
      <gmd:graphicOverview>
        <gmd:MD_BrowseGraphic>
          <gmd:fileName>
            <gco:CharacterString>{{ record['identification']['browsegraphic']|e }}</gco:CharacterString>
          </gmd:fileName>
        </gmd:MD_BrowseGraphic>
      </gmd:graphicOverview>
    </gmd:identificationInfo>

Tasks

Basic

  • Add element to schemas.
- field_name: graphic_overview
  label:
    en: Graphic overview of the dataset
    es: Descripción gráfica del conjunto de datos
  display_snippet: link_name.html
  form_placeholder: http://example.com/dataset.jpg
  help_text:
    en: "Graphic that provides an illustration of the dataset."
    es: "Gráfico que ilustra el conjunto de datos."

Enhancements

Bug - CKAN 2.10 - Fix request.args()

2024-03-12 14:23:33,493 WARNI [ckan.lib.maintain] Function params() in module ckan.common has been deprecated since CKAN v2.10.0 and will be removed in a later release of ckan. Use `request.args` instead of `request.params`
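
The warning above asks for request.args in place of request.params. Where the same code has to run on CKAN versions before and after the deprecation, a small compatibility helper is one option. This is a sketch only: query_args is a hypothetical name, and the two SimpleNamespace objects stand in for real request objects.

```python
from types import SimpleNamespace


def query_args(request):
    """Return the query-string mapping, preferring `args` over `params`.

    `args` is read first, so the deprecated `params` attribute is only
    touched on request objects that do not provide `args`.
    """
    args = getattr(request, "args", None)
    if args is not None:
        return args
    return request.params


# Stand-ins for a newer and an older request object (illustration only).
new_style = SimpleNamespace(args={"q": "water"})
old_style = SimpleNamespace(params={"q": "water"})
```

In code that only targets CKAN 2.10 or later, replacing request.params with request.args directly is simpler than a helper.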

Enhanced DCAT-AP profiles to ensure MQA DCAT-AP compliance

References

DCAT-AP:

EU Vocabularies: https://op.europa.eu/en/web/eu-vocabularies/dcat-ap

Validator:

DCAT-AP Country profile:

SHACLs: https://github.com/ISAITB/validator-resources-dcat-ap/blob/baca3adf63d31ee415fa5e769249053ae211414c/resources/config.properties

DCAT-AP Validator Validation Cases

The different cases to validate in the DCAT-AP Validator are based on the level of completeness of the checks and the incorporation of background knowledge (vocabularies). Each case is designed for a specific data exchange scenario.
The following describes each case and recommends which one you should use for a CKAN catalog:

Case 1: DCAT-AP Base Zero (no background knowledge)

Includes all constraints required for technical coherence, excluding range class membership constraints and controlled vocabulary usage.

SHACL Profiles:

Case 2: DCAT-AP Ranges Zero (no background knowledge)

Includes all range class membership constraints.

SHACL Profiles:

Case 3: DCAT-AP Base (with background knowledge)

Extends Case 1 with background knowledge, including all vocabularies used in DCAT-AP.

SHACL Profiles:

Case 4: DCAT-AP Ranges (with background knowledge)

Extends Case 2 with background knowledge, adding validation of range class membership and vocabulary standards compliance.

SHACL Profiles:

Case 5: DCAT-AP Recommendations (with background knowledge)

Includes all constraints related to recommended properties.

SHACL Profiles:

Case 6: DCAT-AP Controlled Vocabularies

Includes all constraints related to controlled vocabularies.

SHACL Profiles:

Case 7: DCAT-AP Full (with background knowledge)

The union of Cases 3, 4, 5, and 6.

SHACL Profiles:

Recommendation:

For most use cases, Case 3: DCAT-AP Base (with background knowledge) is recommended. It provides comprehensive validation of basic coherence and vocabulary standards compliance.
If your CKAN catalog uses controlled vocabularies, consider using Case 6: DCAT-AP Controlled Vocabularies or Case 7: DCAT-AP Full (with background knowledge) for more exhaustive validation.
Remember, the choice of the appropriate validation case depends on your specific needs and data exchange context.

Schema - Improve dcat_type with

Schema - Create a DCAT-AP Schema (EU)

Create a new schema based on the Spanish GeoDCAT-AP YAML for an international context without the mandatory elements of the NTI-RISP standard, only DCAT-AP.

Note
Info about the latest mapping: Schema GeoDCAT-AP.

Steps to complete a ckan_geodcatap_eu.yaml version from ckan_geodcatap.yaml:

Metadata properties:

https://github.com/mjanez/ckanext-scheming/blob/9dcb4e3bacc86efa1accd2853d4a520d740ebf6e/ckanext/scheming/ckan_geodcatap.yaml#L1339-L1857
