mjanez / ckanext-schemingdcat

This project forked from opendatagis/ckanext-facet_scheming

Improved ckanext-scheming with DCAT, DCAT-AP and GeoDCAT-AP/INSPIRE custom schemas and tools.

Home Page: https://github.com/mjanez/ckan-docker

License: GNU Affero General Public License v3.0

Python 53.27% CSS 10.28% HTML 34.74% JavaScript 1.71%

ckan ckanext-dcat dcat dcat-ap faceted-search geodcat-ap inspire metadata ckan-schema ckanext-scheming

ckanext-schemingdcat. LOD/INSPIRE metadata enhancement for ckanext-scheming

Overview • Installation • Configuration • Schemas • Harvesters • Running the Tests

Overview

This CKAN extension provides functions and templates specifically designed to extend ckanext-scheming and includes DCAT and Harvest enhancements to adapt CKAN Schema to GeoDCAT-AP.

Warning

Requires mjanez/ckanext-dcat, ckan/ckanext-scheming and ckan/ckanext-spatial to work properly.

Tip

It is recommended to use this extension with the ckan-docker deployment, or to use only ckan-pycsw to deploy a CSW Catalog.

Enhancements:

Provide schemas for ckanext-scheming within the plugin, such as the CKAN GeoDCAT-AP custom schemas.
Improve the search functionality in CKAN for custom schemas. It uses the fields defined in a scheming file to provide a set of tools to use these fields for scheming, and a way to include icons in their labels when displaying them. More info: ckanext-schemingdcat
Add improved harvesters for custom metadata schemas integrated with ckanext-harvest in CKAN using mjanez/ckan-ogc.
Add Metadata downloads for Linked Open Data formats (mjanez/ckanext-dcat) and Geospatial Metadata (ISO 19139, Dublin Core, etc. with mjanez/ckan-pycsw)
Add custom i18n translations to datasets, groups, organizations in schemas, e.g: GeoDCAT-AP (ES).¹
Add a set of useful helpers and templates to be used with Metadata Schemas.
Update the base theme of CKAN to use with the enhancements of this extension.
Modern UI inspired by datopian/ckanext-datopian.
LOD/OGC Endpoints based on available profiles (DCAT) and CSW capabilities with mjanez/ckan-pycsw.

Requirements

This plugin is compatible with CKAN 2.9 or later and needs the following plugins to work properly:

# Install latest stable release of:
## ckan/ckanext-scheming: https://github.com/ckan/ckanext-scheming/tags (e.g. release-3.0.0)
pip install -e git+https://github.com/ckan/ckanext-scheming.git@release-3.0.0#egg=ckanext-scheming

## mjanez/ckanext-dcat: https://github.com/mjanez/ckanext-dcat/tags (e.g. 1.2.0-geodcatap)
pip install -e git+https://github.com/mjanez/ckanext-dcat.git@1.2.0-geodcatap#egg=ckanext-dcat
pip install -r https://raw.githubusercontent.com/mjanez/ckanext-dcat/master/requirements.txt

## ckan/ckanext-spatial: https://github.com/ckan/ckanext-spatial/tags (e.g. v2.1.1)
pip install -e git+https://github.com/ckan/ckanext-spatial.git@v2.1.1#egg=ckanext-spatial
pip install -r https://raw.githubusercontent.com/ckan/ckanext-spatial/v2.1.1/requirements.txt

## ckan/ckanext-harvest: https://github.com/ckan/ckanext-harvest/tags (e.g. v1.5.6)
pip install -e git+https://github.com/ckan/ckanext-harvest.git@v1.5.6#egg=ckanext-harvest
pip install -r https://raw.githubusercontent.com/ckan/ckanext-harvest/v1.5.6/requirements.txt

Installation

cd $CKAN_VENV/src/

# Install the schemingdcat plugin
pip install -e "git+https://github.com/mjanez/ckanext-schemingdcat.git#egg=ckanext-schemingdcat"

Configuration

Set the plugin:

# Add the plugin to the list of plugins
ckan.plugins = ... spatial_metadata ... dcat ... schemingdcat

Warning

When using the schemingdcat extension, scheming should not appear in the list of plugins loaded in CKAN, but dcat and spatial should.

Scheming DCAT

Set the schemas you want to use with configuration options:

# Each of the plugins is optional depending on your use
ckan.plugins = schemingdcat_datasets schemingdcat_groups schemingdcat_organizations

To use CSW Endpoint in ckanext-schemingdcat:

schemingdcat.geometadata_base_uri = http://localhost:81/csw
ckanext.dcat.base_uri = http://localhost:81/catalog

To use custom schemas in ckanext-scheming:

# module-path:file to schemas being used
scheming.dataset_schemas = ckanext.schemingdcat:schemas/geodcatap/geodcatap_dataset.yaml
scheming.group_schemas = ckanext.schemingdcat:schemas/geodcatap/geodcatap_group.json
scheming.organization_schemas = ckanext.schemingdcat:schemas/geodcatap/geodcatap_org.json

#   URLs may also be used, e.g:
#
# scheming.dataset_schemas = http://example.com/spatialx_schema.yaml

#   Preset files may be included as well. The default preset setting is:
scheming.presets = ckanext.schemingdcat:schemas/geodcatap/geodcatap_presets.json

#   The is_fallback setting may be changed as well. Defaults to false:
scheming.dataset_fallback = false

Harvest

Add the custom Harvesters to the list of plugins as you need:

ckan.plugins = ... spatial_metadata ... dcat ... schemingdcat ... harvest ... schemingdcat_ckan_harvester schemingdcat_csw_harvester ...

Endpoints

You can update the endpoints.yaml file to add your custom OGC/LOD endpoints. Only two endpoint types are supported, lod and ogc, and each endpoint uses a profile available in ckanext-dcat. Preferably define between 4 and 8 endpoints.

Examples:

LOD endpoint: A Linked Open Data endpoint is a DCAT endpoint that provides access to RDF data. More information about the catalogue endpoint and how to use it (e.g. https://{ckan-instance-host}/catalog.{format}?[page={page}]&[modified_since={date}]&[profiles={profile1},{profile2}]&[q={query}]&[fq={filter query}]) at ckanext-dcat.
```
  - name: euro_dcat_ap_2_rdf
    display_name: RDF DCAT-AP
    type: lod
    format: rdf
    image_display_url: /images/icons/endpoints/euro_dcat_ap_2.svg
    description: RDF DCAT-AP Endpoint for european data portals.
    profile: euro_dcat_ap_2
    profile_label: DCAT-AP
    version: null
```
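The catalogue URL pattern quoted for the LOD endpoint can be assembled programmatically. The following is a minimal sketch, not code from this extension: `build_catalog_url` and the host name are hypothetical, and only the query parameters named in the pattern above are used.

```python
from urllib.parse import urlencode


def build_catalog_url(host, fmt="rdf", profiles=None, page=None,
                      modified_since=None, q=None, fq=None):
    """Build a catalogue URL following the pattern quoted above (hypothetical helper)."""
    params = {}
    if page is not None:
        params["page"] = page
    if modified_since:
        params["modified_since"] = modified_since
    if profiles:
        # Several profiles are passed as one comma-separated value.
        params["profiles"] = ",".join(profiles)
    if q:
        params["q"] = q
    if fq:
        params["fq"] = fq
    url = f"https://{host}/catalog.{fmt}"
    # safe="," keeps the profile separator unescaped in the query string.
    return f"{url}?{urlencode(params, safe=',')}" if params else url


# ckan.example.org is a placeholder host.
print(build_catalog_url("ckan.example.org", "rdf", profiles=["euro_dcat_ap_2"], page=2))
```

All optional parameters are left out of the URL when they are not given, which matches the square brackets in the pattern.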

OGC Endpoint: An OGC CSW endpoint provides a standards-based interface to discover, browse, and query metadata about spatial datasets and data services. More info about the endpoint at OGC: Catalogue Service.

  - name: csw_inspire
    display_name: CSW INSPIRE 2.0.2
    type: ogc
    format: xml
    image_display_url: /images/icons/endpoints/csw_inspire.svg
    description: OGC-INSPIRE Endpoint for spatial metadata.
    profile: spain_dcat
    profile_label: INSPIRE
    version: 2.0.2

Facet Scheming

To configure facets, there are no mandatory sets in the config file for this extension. The following sets can be used:

schemingdcat.facet_list: [list of fields]      # List of fields in scheming file to use for faceting. Use CKAN defaults if not provided.
schemingdcat.default_facet_operator: [AND|OR]  # OR if not defined
schemingdcat.icons_dir: (dir)                  # images/icons if not defined

As an example for facet list, we could suggest:

schemingdcat.facet_list = "theme groups theme_es dcat_type owner_org res_format publisher_name publisher_type frequency tags tag_uri conforms_to spatial_uri"

The same custom fields for faceting can be used when browsing organizations and groups data:

schemingdcat.organization_custom_facets = true
schemingdcat.group_custom_facets = true

These last two settings are not mandatory. You can omit one or both (or set them to false), and the default fields for faceting will be used instead.
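To make the defaults concrete, here is a small sketch of how these options could be read from a config dictionary. It is an illustration only: `parse_facet_config` is a hypothetical helper, and the extension's own parsing may differ.

```python
def parse_facet_config(config):
    """Read the facet options from a CKAN-style config dict (hypothetical helper)."""
    raw = config.get("schemingdcat.facet_list", "")
    # The value is a space-separated list, optionally wrapped in double quotes.
    facet_list = raw.strip().strip('"').split()
    operator = config.get("schemingdcat.default_facet_operator", "OR").upper()
    if operator not in ("AND", "OR"):
        operator = "OR"  # fall back to the documented default
    return {
        "facet_list": facet_list or None,  # None means: use the CKAN defaults
        "operator": operator,
        "icons_dir": config.get("schemingdcat.icons_dir", "images/icons"),
    }


print(parse_facet_config({"schemingdcat.facet_list": '"theme groups theme_es"'}))
```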

Facet Scheming integration with Solr

Clear the index in solr:

ckan -c [route to your .ini ckan config file] search-index clear

Modify the schema file on Solr (schema or managed schema) to add the multivalued fields added in the scheming extension used for faceting. You can add any field defined in the schema file used in the ckanext-scheming extension that you want to use for faceting. You must define each field with these parameters:

type: string - to avoid splitting the text into tokens that would each be faceted individually.
uninvertible: false - as recommended by Solr's documentation
docValues: true - to ease recovering faceted resources
indexed: true - to let CKAN recover resources under this facet
stored: true - to let the value be recovered by queries
multiValued: depends on whether the field is multivalued (several values for one resource) or regular (just one value). Use "true" or "false" respectively.

E.g. the ckanext-iepnb extension is ready to use these multivalued fields. Add this configuration fragment to the Solr schema in order to use them:

<!-- Extra fields -->
  <field name="tag_uri" type="string" uninvertible="false" docValues="true" indexed="true" stored="true" multiValued="true"/>
  <field name="conforms_to" type="string" uninvertible="false" docValues="true" indexed="true" stored="true" multiValued="true"/>
  <field name="lineage_source" type="string" uninvertible="false" docValues="true" indexed="true" stored="true" multiValued="true"/>
  <field name="lineage_process_steps" type="string" uninvertible="false" docValues="true" indexed="true" stored="true" multiValued="true"/>
  <field name="reference" type="string" uninvertible="false" docValues="true" indexed="true" stored="true" multiValued="true"/>
  <field name="theme" type="string" uninvertible="false" docValues="true" indexed="true" stored="true" multiValued="true"/>
  <field name="theme_es" type="string" uninvertible="false" docValues="true" multiValued="true" indexed="true" stored="true"/>
  <field name="metadata_profile" type="string" uninvertible="false" docValues="true" multiValued="true" indexed="true" stored="true"/>
  <field name="resource_relation" type="string" uninvertible="false" docValues="true" indexed="true" stored="true" multiValued="true"/>
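Since every declaration follows the same pattern, the fragment can be generated from a list of field names. This is a minimal sketch; `solr_field` is a hypothetical helper and the field names are taken from the fragment above.

```python
def solr_field(name, multi_valued=True):
    """Return a Solr <field> declaration with the parameters listed above."""
    return (
        f'<field name="{name}" type="string" uninvertible="false" '
        f'docValues="true" indexed="true" stored="true" '
        f'multiValued="{str(multi_valued).lower()}"/>'
    )


# Print the declarations for a few of the facet fields.
for field_name in ["tag_uri", "conforms_to", "theme"]:
    print(solr_field(field_name))
```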

Note

You can omit any field you're not going to use for faceting, but the best policy could be to add all values at the beginning.

The extra fields depend on your schema.

Be sure to restart Solr after modifying the schema.

Restart CKAN.
Reindex solr index:

ckan -c [route to your .ini ckan config file] search-index rebuild-fast

Sometimes solr can issue an error while reindexing. In that case I'd try to restart solr, delete index ("search-index clear"), restart solr, rebuild index, and restart solr again.

Ckan needs to "fix" multivalued fields to be able to recover values correctly for faceting, so this step must be done in order to use faceting with multivalued fields.

Icons

Icons for each field option in the scheming file can be set in multiple ways:

Set a root directory path for icons for each field using the icons_dir key in the scheming file.
If icons_dir is not defined, the directory path is guessed starting from the value provided for the schemingdcat.icons_dir parameter in the CKAN config file, adding the name of the field as an additional step to the path (public/images/icons/{field_name}).
For each option, use the icon setting to provide the last steps of the icon path from the field's root path defined before. This value may be just a file name or include a path to add to the icon's root directory.
If icon is not used, a directory and file name are guessed from the option's value.
Icons files are tested for existence when using schemingdcat_schema_icon function to get them. If the file doesn't exist, the function returns None. Icons can be provided by any CKAN extension in its public directory.
Set a default icon for a field using the default_icon setting in the scheming file. You can get it using schemingdcat_schema_get_default_icon function, and it is your duty to decide when and where to get and use it in a template.
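The lookup order described above can be sketched as a small function. This is an illustration under assumptions, not the extension's implementation: `icon_path` is a hypothetical name, the rule used to guess a file name from the option value (last URL segment plus `.svg`) is an assumption, and the existence check performed by schemingdcat_schema_icon is left out.

```python
import posixpath


def icon_path(field, option, config_icons_dir="images/icons"):
    """Return the candidate icon path for one option of a scheming field."""
    # 1. Field root: explicit icons_dir, else <config icons dir>/<field_name>.
    root = field.get("icons_dir") or posixpath.join(config_icons_dir, field["field_name"])
    # 2. Option part: explicit icon, else a guess from the option value.
    #    ASSUMPTION: the guess is "last URL segment + .svg"; the real rule may differ.
    tail = option.get("icon") or option["value"].rstrip("/").rsplit("/", 1)[-1] + ".svg"
    return posixpath.join(root, tail)


print(icon_path({"field_name": "theme"}, {"value": "http://example.org/theme/env"}))
print(icon_path({"field_name": "theme", "icons_dir": "custom/icons"},
                {"value": "x", "icon": "sub/a.png"}))
```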

New theme

Update the base theme of CKAN to use with the enhancements of this extension.

Schemas

With this plugin, you can customize the group, organization, and dataset entities in CKAN. Adding and enabling a schema will modify the forms used to update and create each entity, as indicated by the respective type property at the root level, such as group_type, organization_type, and dataset_type. Non-default types are supported properly, as indicated throughout the examples.

A number of custom schemas are available for use with this extension. More info: schemas/README.md

Schema Enhancements: We've made several improvements to our schema to provide a better metadata and metadata group management. Here are some of the key changes:

Form Groups: We've introduced the use of form_groups and improve form_pages in our schemas. This allows us to group related fields into the same form, making it easier to navigate and manage metadata.
Metadata Management Improvements: We've improved how we manage metadata in our schema. It's now easier to add, remove, and modify metadata, allowing us to keep our data more organized and accessible.
Metadata Group Updates: We've made changes to how we handle metadata groups (form_groups). It's now easier to group related metadata, helping us keep our data more organized and making it easier to find specific information.

For more details on these enhancements, check the Form Groups documentation and refer to the schema files in ckanext/schemingdcat/schemas.

GeoDCAT-AP (ES)

schemas/geodcatap_es with specific extensions for spatial data and GeoDCAT-AP/INSPIRE metadata profiles.

Note

RDF to CKAN dataset mapping: GeoDCAT-AP (ES) to CKAN

DCAT

schemas/dcat based on: DCAT.

Note

RDF to CKAN dataset mapping: DCAT to CKAN

DCAT-AP (EU)

schemas/dcatap based on: DCAT-AP for the european context.

Note

RDF to CKAN dataset mapping: DCAT-AP (EU) to CKAN

GeoDCAT-AP (EU)

schemas/geodcatap based on: GeoDCAT-AP for the european context.

Note

RDF to CKAN dataset mapping: GeoDCAT-AP (EU) to CKAN

Form Groups

Form groups are a way to group related fields together in the same form. This makes it easier to navigate and manage metadata. A form group is defined with the following elements:

form_group_id: A unique identifier for the form group. For example, contact.
label: A human-readable label for the form group. This can be provided in multiple languages. For example:
```
label: 
  en: Contact information 
  es: Información de contacto
```
fa_icon: An optional Font Awesome icon that can be used to visually represent the form group. For example, fa-address-book.

Here is an example of a form group definition:

form_group_id: contact 
label: 
  en: Contact information 
  es: Información de contacto 
fa_icon: fa-address-book

Adding Fields to Form Groups

Fields can be added to a form group by specifying the form_group_id in the field definition. The form_group_id should match the form_group_id of the form group that the field should be part of.

Here is an example of a field that is part of the general_info form group:

field_name: owner_org
label:
  en: Organization
  es: Organización
required: True
help_text:
  en: Entity (organisation) responsible for making the Dataset available.
  es: Entidad (organización) responsable de publicar el conjunto de datos.
preset: dataset_organization
form_group_id: general_info

In this example, the owner_org field will be part of the general_info form group.
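The relationship between form groups and fields can be illustrated by grouping field names under their form_group_id. This is a sketch only: `fields_by_form_group` is a hypothetical helper, and the schema is represented as plain dictionaries rather than a loaded scheming file.

```python
def fields_by_form_group(form_groups, fields):
    """Group field names under the form group named by their form_group_id."""
    grouped = {group["form_group_id"]: [] for group in form_groups}
    for field in fields:
        group_id = field.get("form_group_id")
        if group_id in grouped:
            grouped[group_id].append(field["field_name"])
    return grouped


form_groups = [
    {"form_group_id": "general_info", "label": {"en": "General information"}},
    {"form_group_id": "contact", "label": {"en": "Contact information"}},
]
fields = [
    {"field_name": "owner_org", "form_group_id": "general_info"},
    {"field_name": "contact_email", "form_group_id": "contact"},  # illustrative field
]
print(fields_by_form_group(form_groups, fields))
```

Fields whose form_group_id does not match any defined group are skipped in this sketch; how the extension treats them is not covered here.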

Harvesters

Basic usage

In production, when the gather and consumer processes are running, the following command is used to start and stop the background processes:

ckan harvester run: Starts any harvest jobs that have been created by putting them onto the gather queue. Also checks running jobs - if finished it changes their status to Finished.

To test harvesters in development, you can use the following command:

ckan harvester run-test {source-id/name}: This does all the stages of the harvest (creates job, gather, fetch, import) without involving the web UI or the queue backends. This is useful for testing a harvester without having to fire up gather/fetch_consumer processes, as is done in production.

Warning

After running the run-test command, you should stop all background processes for gather and consumer to avoid conflicts.

Scheming DCAT CKAN Harvester: CKAN Harvester for custom schemas

The plugin includes a harvester for remote CKAN instances using the custom schemas provided by schemingdcat and ckanext-scheming. This harvester is a subclass of the CKAN Harvester provided by ckanext-harvest and is designed to work with the schemingdcat plugin to provide a more versatile and customizable harvester for CKAN instances.

To use it, you need to add the schemingdcat_ckan_harvester plugin to your options file:

  ckan.plugins = harvest schemingdcat schemingdcat_datasets ... schemingdcat_ckan_harvester

The Scheming DCAT CKAN Harvester supports the same configuration options as the CKAN Harvester, plus the following additional options:

dataset_field_mapping/distribution_field_mapping (Optional): Mapping field names from local to remote instance, all info at: CKAN Harvester Field mapping structure
field_mapping_schema_version (Mandatory if dataset_field_mapping/distribution_field_mapping is set): Schema version of the field_mapping to ensure compatibility with older schemas. The default is 2.
schema (Optional): The name of the schema to use for the harvested datasets. This is the schema_name as defined in the scheming file. The remote and local instances must have the same dataset schema. If not provided, the dataset_field_mapping/distribution_field_mapping is needed to map fields.
allow_harvest_datasets (Optional): If true, the harvester will create new records even if the package type is from the harvest source. If false, the harvester will only create records that originate from the instance. Default is false.
remote_orgs (Optional): [WIP]. Only only_local.
remote_groups (Optional): [WIP]. Only only_local.
clean_tags: By default, tags are stripped of accent characters, spaces and capital letters for display. Setting this option to False will keep the original tag names. Default is True.
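As a rough illustration of what clean_tags does to a tag name, the sketch below strips accents, spaces and capital letters. It only approximates the behaviour described above; `clean_tag` is a hypothetical function, and replacing spaces with hyphens is an assumption.

```python
import unicodedata


def clean_tag(tag):
    """Approximate clean_tags: drop accents, spaces and capital letters."""
    # Decompose accented characters, then drop the combining marks.
    ascii_tag = unicodedata.normalize("NFKD", tag).encode("ascii", "ignore").decode("ascii")
    # ASSUMPTION: spaces become hyphens; the harvester may use another separator.
    return ascii_tag.strip().lower().replace(" ", "-")


print(clean_tag("Información Geográfica"))
```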

An example configuration might look like this:

    {
    "api_version": 2,
    "clean_tags": false,
    "default_tags": [{"name": "inspire"}, {"name": "geodcatap"}],
    "default_groups": ["transportation", "hb"],
    "default_extras": {"encoding":"utf8", "harvest_description":"Harvesting from Sample Catalog", "harvest_url": "{harvest_source_url}/dataset/{dataset_id}"},
    "organizations_filter_include": ["remote-organization"],
    "groups_filter_include":[],
    "override_extras":false,
    "user":"harverster-user",
    "api_key":"<REMOTE_API_KEY>",
    "read_only": true,
    "remote_groups": "only_local",
    "remote_orgs": "only_local",
    "schema": "geodcatap",
    "allow_harvest_datasets":false,
    "field_mapping_schema_version":2,
    "dataset_field_mapping": {
      "title": {
          "field_name": "my_title"
        },
      "title_translated": {
          "languages": {
              "en": {
                  "field_name": "my_title-en"
              },
              "es": {
                  "field_name": "my_title"
              }
          }
      },
      "private": {
          "field_name": "private"
      },
      "tag_string": {
          "field_name": ["theme_a", "theme_b", "theme_c"]
      },
      "theme_es": {
          "field_value": "http://datos.gob.es/kos/sector-publico/sector/medio-ambiente"
      },
      "tag_uri": {
          "field_name": "keyword_uri",
          // "field_value" extends the original list of values retrieved from the remote file for all records.
          "field_value": ["https://www.example.org/codelist/a","https://www.example.org/codelist/b", "https://www.example.org/codelist/c"] 
      },
      "my_custom_field": {
          // If you need to map a field in a remote dict to the "extras" dict, use the "extras_" prefix to indicate that the field is there.
          "field_name": "extras_remote_custom_field"
      }
    }
    }

Field mapping structure

The dataset_field_mapping/distribution_field_mapping is structured as follows (multilingual version):

{
  ...
  "field_mapping_schema_version": 2,
  "<dataset_field_mapping>/<distribution_field_mapping>": {
    "<schema_field_name>": {
      "languages": {
        "<language>":  {
          <"field_value": "<fixed_value>/<fixed_value_list>">,/<"field_name": "<excel_field_name>/<excel_field_name_list>">
        },
        ...
      },
      ...
    },
    ...
  }
}

<schema_field_name>: The name of the field in the CKAN schema.
- <language>: (Optional) The language code for multilingual fields. This should be a valid ISO 639-1 language code. This is now nested under the languages key.
<fixed_value>/<fixed_value_list>: (Optional) A fixed value or a list of fixed values that will be assigned to the field for all records.
Field labels: Field name:
- <field_name>/<field_name_list>: (Optional) The name of the field in the remote file or a list of field names.

For fields that are not multilingual, you can directly use field_name without the languages key. For example:

{
  ...
  "field_mapping_schema_version": 2,
  "<dataset_field_mapping>/<distribution_field_mapping>": {
    "<schema_field_name>": {
      <"field_value": "<fixed_value>/<fixed_value_list>">,/<"field_name": "<excel_field_name>/<excel_field_name_list>">
    },
    ...
  }
}

Important

The field mapping can be done either at the dataset level using dataset_field_mapping or at the resource level using distribution_field_mapping. The structure and options are the same for both. The field_mapping_schema_version is 2 by default, but needs to be set to avoid errors.
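The requirement that field_mapping_schema_version accompanies any field mapping can be checked before a harvest source is saved. This is a minimal sketch with a hypothetical `check_mapping_version` helper; it expects strict JSON, so comments inside the configuration must be removed first.

```python
import json

MAPPING_KEYS = ("dataset_field_mapping", "distribution_field_mapping")


def check_mapping_version(config_text):
    """Raise ValueError if a field mapping is given without its schema version."""
    config = json.loads(config_text)
    has_mapping = any(key in config for key in MAPPING_KEYS)
    if has_mapping and "field_mapping_schema_version" not in config:
        raise ValueError(
            "field_mapping_schema_version must be set when a field mapping is used"
        )
    return config


valid = '{"field_mapping_schema_version": 2, "dataset_field_mapping": {"title": {"field_name": "my_title"}}}'
print(check_mapping_version(valid)["field_mapping_schema_version"])
```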

Field Types

There are two types of fields that can be defined in the configuration:

Regular fields: These fields have a field label to define the mapping or a fixed value for all its records.
- Properties: A field can have one of these properties:
  - Fixed value fields (field_value): These fields have a fixed value that is assigned to all records. This is defined using the field_value property. If field_value is a list, field_name could be set at the same time, and the field_value extends the list obtained from the remote field.
  - Field labels: Field name:
    - Name based fields (field_name): These fields are defined by their name in the Excel file. This is defined using the field_name property, or if you need to map a field in a remote dict to the extras dict, use the extras_ prefix to indicate that the field is there.
Multilingual Fields (languages): These fields have different values for different languages. Each language is represented as a separate object within the field object (es, en, ...). The language object can have field_value and field_name properties, just like a normal field.
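The field types above can be read as a small resolution procedure. The sketch below applies a mapping to one remote dataset represented as a dictionary. It is an illustration under assumptions rather than the harvester's code: `resolve` and `apply_field_mapping` are hypothetical names, and the remote "extras" are assumed to be a plain dictionary.

```python
def resolve(spec, remote):
    """Resolve one mapping entry against a remote dataset dict."""
    if "languages" in spec:
        # Multilingual field: resolve each language object on its own.
        return {lang: resolve(sub, remote) for lang, sub in spec["languages"].items()}

    names = spec.get("field_name")
    fixed = spec.get("field_value")
    if names is None:
        return fixed  # fixed value assigned to every record

    def lookup(name):
        # The "extras_" prefix points at the remote "extras" dict (assumed to be a dict).
        if name.startswith("extras_"):
            return remote.get("extras", {}).get(name[len("extras_"):])
        return remote.get(name)

    if isinstance(names, list):
        value = [v for v in (lookup(n) for n in names) if v is not None]
    else:
        value = lookup(names)
    if isinstance(fixed, list):
        # A list field_value extends what came from the remote field.
        base = value if isinstance(value, list) else ([] if value is None else [value])
        value = base + fixed
    return value


def apply_field_mapping(mapping, remote):
    """Build the local dataset fields from a remote record."""
    return {field: resolve(spec, remote) for field, spec in mapping.items()}


remote = {"my_title": "Título", "my_title-en": "Title", "keyword_uri": ["https://www.example.org/codelist/x"]}
mapping = {
    "title": {"field_name": "my_title"},
    "title_translated": {"languages": {"en": {"field_name": "my_title-en"}, "es": {"field_name": "my_title"}}},
    "tag_uri": {"field_name": "keyword_uri", "field_value": ["https://www.example.org/codelist/a"]},
}
print(apply_field_mapping(mapping, remote))
```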

Example

Here are some examples of configuration files:

Field names: With field_name to define the mapping based on names of attributes in the remote sheet (my_title, org_identifier, keywords).

{
  "api_version": 2,
  "clean_tags": false,

  ...
  # other properties
  ...

  "field_mapping_schema_version": 2,
  "dataset_field_mapping": {
    "title": {
        "field_name": "my_title"
      },
    "title_translated": {
        "languages": {
            "en": {
                "field_name": "my_title-en"
            },
            "de": {
                "field_value": ""
            },
            "es": {
                "field_name": "my_title"
            }
        }
    },
    "private": {
        "field_name": "private"
    },
    "theme": {
        "field_name": ["theme", "theme_eu"]
    },
    "tag_custom": {
        "field_name": "keywords"
    },
    "tag_string": {
        "field_name": ["theme_a", "theme_b", "theme_c"]
    },
    "theme_es": {
        "field_value": "http://datos.gob.es/kos/sector-publico/sector/medio-ambiente"
    },
    "tag_uri": {
        "field_name": "keyword_uri",
        // "field_value" extends the original list of values retrieved from the remote file for all records.
        "field_value": ["https://www.example.org/codelist/a","https://www.example.org/codelist/b", "https://www.example.org/codelist/c"] 
    },
    "my_custom_field": {
        // If you need to map a field in a remote dict to the "extras" dict, use the "extras_" prefix to indicate that the field is there.
        "field_name": "extras_remote_custom_field"
    }
  }
}

TODO: Scheming DCAT CSW INSPIRE Harvester

A harvester for remote CSW catalogues using the INSPIRE ISO 19139 metadata profile. This harvester is a subclass of the CSW Harvester provided by ckanext-spatial and is designed to work with the schemingdcat plugin to provide a more versatile and customizable harvester for CSW endpoints and GeoDCAT-AP CKAN instances.

To use it, you need to add the schemingdcat_csw_harvester plugin to your options file:

  ckan.plugins = harvest schemingdcat schemingdcat_datasets ... schemingdcat_csw_harvester

Remote Google Sheet/Onedrive Excel metadata upload Harvester

A harvester for remote Google spreadsheets and Onedrive Excel files with Metadata records. This harvester is a subclass of the Scheming DCAT Base Harvester provided by ckanext-schemingdcat to provide a more versatile and customizable harvester for Excel files that have metadata records in them.

To use it, you need to add the schemingdcat_xls_harvester plugin to your options file:

ckan.plugins = harvest schemingdcat schemingdcat_datasets ... schemingdcat_xls_harvester

Remote Google Sheet/Onedrive Excel metadata upload Harvester supports the following options:

storage_type - Mandatory: The type of storage to use for the harvested datasets as onedrive or gspread. Default is onedrive.
dataset_sheet - Mandatory: The name of the sheet in the Excel file that contains the dataset records.
field_mapping_schema_version: Schema version of the field_mapping to ensure compatibility with older schemas. The default is 2.
dataset_field_mapping/distribution_field_mapping: Mapping field names from local to remote instance, all info at: Field mapping structure
credentials: The credentials parameter should be used to provide the authentication credentials. The credentials depends on the storage_type used.
- For onedrive: The credentials parameter should be a dictionary with the following keys: username: A string representing the username. password: A string representing the password.
- For gspread or gdrive: The credentials parameter should be a string containing the credentials in JSON format. You can obtain the credentials by following the instructions provided in the Google Workspace documentation.
distribution_sheet: The name of the sheet in the Excel file that contains the distribution records. If not provided, the harvester will only create records for the dataset sheet.
datadictionary_sheet: The name of the sheet in the Excel file that contains the data dictionary records. If not provided, the harvester will only create records for the dataset sheet.
api_version: You can force the harvester to use either version 1 or 2 of the CKAN API. Default is 2.
default_tags: A list of tags that will be added to all harvested datasets. Tags don't need to previously exist. This field takes a list of tag dicts which allows you to optionally specify a vocabulary. Default is [].
default_groups: A list of group IDs or names to which the harvested datasets will be added to. The groups must exist in the local instance. Default is [].
default_extras: A dictionary of key value pairs that will be added to extras of the harvested datasets. You can use the following replacement strings, that will be replaced before creating or updating the datasets:
- {dataset_id}
- {harvest_source_id}
- {harvest_source_url} Will be stripped of trailing forward slashes (/)
- {harvest_source_title}
- {harvest_job_id}
- {harvest_object_id}
override_extras: Assign default extras even if they already exist in the remote dataset. Default is False (only non existing extras are added).
user: User who will run the harvesting process. Please note that this user needs to have permission for creating packages, and if default groups were defined, the user must have permission to assign packages to these groups.
read_only: Create harvested packages in read-only mode. Only the user who performed the harvest (the one defined in the previous setting or the 'harvest' sysadmin) will be able to edit and administer the packages created from this harvesting source. Logged in users and visitors will be only able to read them.
force_all: By default, after the first harvesting, the harvester will gather only the packages modified on the remote site since the last harvesting. Setting this property to true will force the harvester to gather all remote packages regardless of the modification date. Default is False.
clean_tags: By default, tags are stripped of accent characters, spaces and capital letters for display. Setting this option to False will keep the original tag names. Default is True.
source_date_format: By default the harvester uses dateutil to parse the date, but if the date format of the strings is particularly different you can use this parameter to specify the format, e.g. %d/%m/%Y. Accepted formats are: COMMON_DATE_FORMATS
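The replacement strings accepted by default_extras behave like named placeholders. The sketch below shows one way to fill them for a single dataset. `render_default_extras` is a hypothetical helper, the sample values are placeholders, and every placeholder used in a template must be present in the context for this simple version to work.

```python
def render_default_extras(default_extras, context):
    """Fill the replacement strings in default_extras for one dataset."""
    values = dict(context)
    # The source URL is stripped of trailing forward slashes, as noted above.
    values["harvest_source_url"] = values.get("harvest_source_url", "").rstrip("/")
    rendered = {}
    for key, value in default_extras.items():
        # Only strings can carry placeholders; other values pass through.
        rendered[key] = value.format(**values) if isinstance(value, str) else value
    return rendered


extras = {"encoding": "utf8", "harvest_url": "{harvest_source_url}/dataset/{dataset_id}"}
context = {"harvest_source_url": "https://remote.example.org/", "dataset_id": "abc"}
print(render_default_extras(extras, context))
```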

Field mapping structure (Sheets harvester)

The dataset_field_mapping/distribution_field_mapping is structured as follows (multilingual version):

{
  ...
  "field_mapping_schema_version": 2,
  "<dataset_field_mapping>/<distribution_field_mapping>": {
    "<schema_field_name>": {
      "languages": {
        "<language>":  {
          <"field_value": "<fixed_value>/<fixed_value_list>">,/<"field_name": "<excel_field_name>/<excel_field_name_list>">/< "field_position": "<excel_column>/<excel_column_list>">
        },
        ...
      },
      ...
    },
    ...
  }
}

<schema_field_name>: The name of the field in the CKAN schema.
- <language>: (Optional) The language code for multilingual fields. This should be a valid ISO 639-1 language code. This is now nested under the languages key.
<fixed_value>/<fixed_value_list>: (Optional) A fixed value or a list of fixed values that will be assigned to the field for all records.
Field labels: Field position or field name:
- <field_position>/<field_position_list>: (Optional) The position of the field in the remote file, represented as a letter or a list of letters (e.g., "A", "B", "C").
- <field_name>/<field_name_list>: (Optional) The name of the field in the remote file or a list of field names.

For fields that are not multilingual, you can directly use field_name or field_position without the languages key. For example:

{
  ...
  "field_mapping_schema_version": 2,
  "<dataset_field_mapping>/<distribution_field_mapping>": {
    "<schema_field_name>": {
      <"field_value": "<fixed_value>/<fixed_value_list>">,/<"field_name": "<excel_field_name>/<excel_field_name_list>">/< "field_position": "<excel_column>/<excel_column_list>">
    },
    ...
  }
}

Field Types

There are two types of fields that can be defined in the configuration:

Regular fields: These fields have a field label/position to define the mapping or a fixed value for all its records.
- Properties: A field can have one of these three properties:
  - Fixed value fields (field_value): These fields have a fixed value that is assigned to all records. This is defined using the field_value property. If field_value is a list, field_name or field_position could be set at the same time, and the field_value extends the list obtained from the remote field.
  - Field labels: Field position or field name:
    - Position based fields (field_position): These fields are defined by their position in the Excel file. This is defined using the field_position property.
    - Name based fields (field_name): These fields are defined by their name in the Excel file. This is defined using the field_name property.
Multilingual Fields (languages): These fields have different values for different languages. Each language is represented as a separate object within the field object (es, en, ...). The language object can have field_value, field_position and field_name properties, just like a normal field.
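A field_position value such as "A" or "AC" has to be turned into a column number before a sheet row can be read. The conversion is the usual base-26 letter arithmetic, sketched below. `column_index` is a hypothetical helper and zero-based indexing is an assumption about how the rows are stored.

```python
def column_index(letters):
    """Convert a spreadsheet column such as "A" or "AC" to a zero-based index."""
    cleaned = letters.strip().upper()
    if not cleaned:
        raise ValueError("empty column reference")
    index = 0
    for char in cleaned:
        if not "A" <= char <= "Z":
            raise ValueError(f"invalid column: {letters!r}")
        # Treat the letters as digits of a base-26 number where A=1 ... Z=26.
        index = index * 26 + (ord(char) - ord("A") + 1)
    return index - 1


row = ["Title ES"] + [""] * 27 + ["Title EN"]  # illustrative row with 29 cells
print(column_index("A"), column_index("AA"), column_index("AC"))
print(row[column_index("AC")])
```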

Example

Here are some examples of configuration files:

Field positions: With field_position to define the mapping based on positions of attributes in the remote sheet (A, B, AA, etc.).

{
  "storage_type": "gspread",
  "dataset_sheet": "Dataset",
  "distribution_sheet": "Distribution",

  ...
  # other properties
  ...

  "field_mapping_schema_version": 2,
  "dataset_field_mapping": {
    "title": {
        "field_position": "A"
      },
    "title_translated": {
        "languages": {
            "en": {
                "field_position": "AC"
            },
            "de": {
                "field_value": ""
            },
            "es": {
                "field_position": "A"
            }
        }
    },
    "private": {
        "field_position": "F"
    },
    "theme": {
        "field_position": ["G", "AA"]
    },
    "tag_custom": {
        "field_position": "B"
    },
    "tag_string": {
        "field_position": ["A", "B", "AC"]
    },
    "theme_es": {
        "field_value": "http://datos.gob.es/kos/sector-publico/sector/medio-ambiente"
    },
    "tag_uri": {
        "field_position": "Z",
        // "field_value" extends the original list of values retrieved from the remote file for all records.
        "field_value": ["https://www.example.org/codelist/a","https://www.example.org/codelist/b", "https://www.example.org/codelist/c"] 
    }
  }
}
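
Column letters such as "A", "AA" or "AC" follow the usual spreadsheet convention. A minimal sketch of turning such a letter into a zero-based column index is shown below; column_index is a hypothetical helper for illustration and is not claimed to be how the harvester reads positions.

```python
def column_index(letters):
    """Convert a spreadsheet column label ("A", "Z", "AA", ...) to a zero-based index."""
    if not letters:
        raise ValueError("empty column label")
    index = 0
    for ch in letters.upper():
        if not "A" <= ch <= "Z":
            raise ValueError("invalid column label: %r" % letters)
        # Bijective base-26: A=1 ... Z=26, then AA=27, AB=28, ...
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


positions = [column_index(col) for col in ["A", "B", "Z", "AA", "AC"]]
```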

Field names: With field_name to define the mapping based on names of attributes in the remote sheet (my_title, org_identifier, keywords).

{
  "storage_type": "gspread",
  "dataset_sheet": "Dataset",
  "distribution_sheet": "Distribution",

  ...
  # other properties
  ...

  "field_mapping_schema_version": 2,
  "dataset_field_mapping": {
    "title": {
        "field_name": "my_title"
      },
    "title_translated": {
        "languages": {
            "en": {
                "field_name": "my_title-en"
            },
            "de": {
                "field_value": ""
            },
            "es": {
                "field_name": "my_title"
            }
        }
    },
    "private": {
        "field_name": "private"
    },
    "theme": {
        "field_name": ["theme", "theme_eu"]
    },
    "tag_custom": {
        "field_name": "keywords"
    },
    "tag_string": {
        "field_name": ["theme_a", "theme_b", "theme_c"]
    },
    "theme_es": {
        "field_value": "http://datos.gob.es/kos/sector-publico/sector/medio-ambiente"
    },
    "tag_uri": {
        "field_name": "keyword_uri",
        // "field_value" extends the original list of values retrieved from the remote file for all records.
        "field_value": ["https://www.example.org/codelist/a","https://www.example.org/codelist/b", "https://www.example.org/codelist/c"] 
    }
  }
}

Important

Every *_translated field needs its non-suffixed fallback field (e.g. title for title_translated) defined as a simple field, e.g.:

...
   "title": {
        "field_position": "A"
     },
   "title_translated": {
       "languages": {
           "en": {
               "field_value": ""
           },
           "es": {
               "field_position": "A"
           }
      }
   },
...
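
This rule can be checked before a configuration is used. The sketch below is a minimal check written for illustration, assuming the mapping has already been loaded into a Python dict; missing_fallbacks is a hypothetical name and not part of the extension.

```python
def missing_fallbacks(field_mapping, suffix="_translated"):
    """Return the fallback field names that *_translated fields require but the mapping lacks."""
    return sorted(
        name[: -len(suffix)]
        for name in field_mapping
        if name.endswith(suffix) and name[: -len(suffix)] not in field_mapping
    )


dataset_field_mapping = {
    "title": {"field_position": "A"},
    "title_translated": {"languages": {"es": {"field_position": "A"}}},
    # "notes" is deliberately absent to show a failing case.
    "notes_translated": {"languages": {"es": {"field_position": "C"}}},
}

problems = missing_fallbacks(dataset_field_mapping)
```

An empty list means every *_translated field has its simple fallback field.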

Running the Tests

To run the tests:

pytest --ckan-ini=test.ini ckanext/schemingdcat/tests

¹ An improvement to ckanext-fluent (https://github.com/ckan/ckanext-fluent) that allows more versatility in multilingual schema creation and metadata validation.

ckanext-schemingdcat's People

Contributors

Forkers

apteksdi pabrojast

ckanext-schemingdcat's Issues

Schema - Add hvd_category (High Value Datasets)

To do

Update metadata element in schemas.
Update schemas info.

Info

mjanez/ckanext-dcat@d9a823e
https://semiceu.github.io/DCAT-AP/releases/2.2.0-hvd/

i18n - Fix translations

To do

Improve multilingual jinja texts with i18n. Info: https://docs.ckan.org/en/2.9/extensions/translating-extensions.html

Fixes

Filter by location

Templates - Improve

Fix duplicate attribution in widget map. #44
Update the facet templates to include labels of the schema URIs like: tag_uri, conforms_to, etc. instead of raw URI names.

https://github.com/mjanez/ckanext-scheming_dcat/blob/919e4bd63284071e3d4ee6ca014bd039fbd40af9/ckanext/scheming_dcat/schemas/geodcatap_es/geodcatap_es_dataset.yaml#L645-L658
Improve metadata forms:

Feature - Add ATOM Feed and refactor controller

Refactor all controllers to controller.py:
- package_controller.py

ATOM Feed

Add Feed plugin:

# Requires: from routes.mapper import SubMapper
def before_map(self, map):
    with SubMapper(map, controller='ckanext.scheming_dcat.controller:SchemingDcatFeedController') as m:
        m.connect('/feeds/organization/{id}.atom', action='organization')
    map.connect(
        'general', '/feeds/dataset/{pkg_id}.atom',
        controller='ckanext.scheming_dcat.controller:SchemingDcatFeedController',
        action='dataset',
    )
    return map

Add FeedController for ATOM Feed:

class SchemingDcatFeedController(FeedController):
    def general(self):
        data_dict, params = self._parse_url_params()
        data_dict['q'] = '*:*'

        item_count, results = _package_search(data_dict)

        navigation_urls = self._navigation_urls(params,
                                                item_count=item_count,
                                                limit=data_dict['rows'],
                                                controller='feed',
                                                action='general')

        feed_url = self._feed_url(params,
                                  controller='feed',
                                  action='general')

        alternate_url = self._alternate_url(params)

        return self.output_feed(
            results,
            feed_title=_(u'Open Government Dataset Feed'),
            feed_description='',
            feed_link=alternate_url,
            feed_guid=_create_atom_id(
                u'/feeds/dataset.atom'),
            feed_url=feed_url,
            navigation_urls=navigation_urls
        )

    def dataset(self, pkg_id):
        try:
            context = {'model': model, 'session': model.Session,
                       'user': c.user, 'auth_user_obj': c.userobj}
            get_action('package_show')(context,
                                        {'id': pkg_id})
        except NotFound:
            abort(404, _('Dataset not found'))

        data_dict, params = self._parse_url_params()

        data_dict['fq'] = '{0}:"{1}"'.format('id', pkg_id)

        item_count, results = _package_search(data_dict)

        navigation_urls = self._navigation_urls(params,
                                                item_count=item_count,
                                                limit=data_dict['rows'],
                                                controller='feed',
                                                action='dataset',
                                                id=pkg_id)

        feed_url = self._feed_url(params,
                                  controller='feed',
                                  action='dataset',
                                  id=pkg_id)

        alternate_url = self._alternate_url(params, id=pkg_id)

        return self.output_feed(
            results,
            feed_title=_(u'Open Government Dataset Feed'),
            feed_description='',
            feed_link=alternate_url,
            feed_guid=_create_atom_id(
                u'/feeds/dataset/{0}.atom'.format(pkg_id)),
            feed_url=feed_url,
            navigation_urls=navigation_urls
        )

    def output_feed(self, results, feed_title, feed_description,
                    feed_link, feed_url, navigation_urls, feed_guid):

        author_name = config.get('ckan.feeds.author_name', '').strip() or \
            config.get('ckan.site_id', '').strip()
        author_link = config.get('ckan.feeds.author_link', '').strip() or \
            config.get('ckan.site_url', '').strip()

        feed = _FixedAtom1Feed(
            title=feed_title,
            link=feed_link,
            description=feed_description,
            language=u'en',
            author_name=author_name,
            author_link=author_link,
            feed_guid=feed_guid,
            feed_url=feed_url,
            previous_page=navigation_urls['previous'],
            next_page=navigation_urls['next'],
            first_page=navigation_urls['first'],
            last_page=navigation_urls['last'],
        )

        for pkg in results:
            feed.add_item(
                title=h.get_translated(pkg, 'title'),
                link=h.url_for(controller='package', action='read', id=pkg['id']),
                description=h.get_translated(pkg, 'notes'),
                updated=date_str_to_datetime(pkg.get('metadata_modified')),
                published=date_str_to_datetime(pkg.get('metadata_created')),
                unique_id=_create_atom_id(u'/dataset/%s' % pkg['id']),
                author_name=pkg.get('author', ''),
                author_email=pkg.get('author_email', ''),
                categories=''.join(e['value']
                                   for e in pkg.get('extras', [])
                                   if e['key'] == lx('keywords')).split(','),
                enclosure=webhelpers.feedgenerator.Enclosure(
                    self.base_url + url(str(
                        '/api/action/package_show?id=%s' % pkg['name'])),
                    unicode(len(json.dumps(pkg))),   # TODO fix this
                    u'application/json')
            )
        response.content_type = feed.mime_type
        return feed.writeString('utf-8')

Fix - Shorten urls in snippets

publisher_info template
contact_info template

Enhanced DCAT-AP profiles to ensure MQA DCAT-AP compliance

References

DCAT-AP:

https://github.com/SEMICeu/DCAT-AP/tree/master/releases

https://semiceu.github.io/DCAT-AP/releases/3.0.0/#validation-of-dcat-ap

EU Vocabularies: https://op.europa.eu/en/web/eu-vocabularies/dcat-ap

Validator:

https://www.itb.ec.europa.eu/shacl/dcat-ap/upload

https://github.com/ISAITB/validator-resources-dcat-ap/tree/master#

DCAT-AP Country profile:

https://github.com/diggsweden/DCAT-AP-SE

https://github.com/opendata-swiss/dcat_ap_ch

SHACLs: https://github.com/ISAITB/validator-resources-dcat-ap/blob/baca3adf63d31ee415fa5e769249053ae211414c/resources/config.properties

DCAT-AP Validator Validation Cases

The different cases to validate in the DCAT-AP Validator are based on the level of completeness of the checks and the incorporation of background knowledge (vocabularies). Each case is designed for a specific data exchange scenario.
The following describes each case and recommends which one you should use for a CKAN catalog:

Case 1: DCAT-AP Base Zero (no background knowledge)

Includes all constraints required for technical coherence, excluding range class membership constraints and controlled vocabulary usage.

SHACL Profiles:

2.1.1
3.0.0

Case 2: DCAT-AP Ranges Zero (no background knowledge)

Includes all range class membership constraints.

SHACL Profiles:

2.1.1
3.0.0

Case 3: DCAT-AP Base (with background knowledge)

Extends Case 1 with background knowledge, including all vocabularies used in DCAT-AP.

SHACL Profiles:

2.1.1: shapes and imports
3.0.0: shapes and imports

Case 4: DCAT-AP Ranges (with background knowledge)

Extends Case 2 with background knowledge, adding validation of range class membership and vocabulary standards compliance.

SHACL Profiles:

2.1.1: range and imports
3.0.0: range and imports

Case 5: DCAT-AP Recommendations (with background knowledge)

Includes all constraints related to recommended properties.

SHACL Profiles:

2.1.1: shapes recommended and imports
3.0.0: shapes recommended and imports

Case 6: DCAT-AP Controlled Vocabularies

Includes all constraints related to controlled vocabularies.

SHACL Profiles:

2.1.1: vocabularies shape and imports
3.0.0: vocabularies shape and mdr imports

Case 7: DCAT-AP Full (with background knowledge)

The union of Cases 3, 4, 5, and 6.

SHACL Profiles:

2.1.1: shapes, shapes recommended, imports, range and deprecateduris
3.0.0: shapes, shapes recommended, imports, range and deprecateduris

Recommendation:

For most use cases, Case 3: DCAT-AP Base (with background knowledge) is recommended. It provides comprehensive validation of basic coherence and vocabulary standards compliance.
If your CKAN catalog uses controlled vocabularies, consider using Case 6: DCAT-AP Controlled Vocabularies or Case 7: DCAT-AP Full (with background knowledge) for more exhaustive validation.
Remember, the choice of the appropriate validation case depends on your specific needs and data exchange context.

Schema - Add graphic_overview field

Context info

Description:
Graphic that provides an illustration of the dataset

Pygeometa MCF Core schema

XML Schema ISO19139 gmd:graphicOverview:

<gmd:identificationInfo>
...
  <gmd:graphicOverview>
    <gmd:MD_BrowseGraphic>
      <gmd:fileName>
        <gco:CharacterString>{{ record['identification']['browsegraphic']|e }}</gco:CharacterString>
        </gmd:fileName>
    </gmd:MD_BrowseGraphic>
  </gmd:graphicOverview>
</gmd:identificationInfo>

Tasks

Basic

Add element to schemas.

- field_name: graphic_overview
  label:
    en: Graphic overview of the dataset
    es: Descripción gráfica del conjunto de datos
  display_snippet: link_name.html
  form_placeholder: http://example.com/dataset.jpg
  help_text:
    en: "Graphic that provides an illustration of the dataset."
    es: "Gráfico que ilustra el conjunto de datos."

Add element to ckan-pycsw ckan GeoDCAT-AP schema
Update ckan-pycsw pygeometa INSPIRE ISO19139 schema

Enhancements

Add a custom snippet to upload image from web or use /form_snippet/upload.html

Schema - Fix theme/theme_es/theme_eu fields

Maintain consistency in the naming of field_names, utilizing EU INSPIRE themes as a default, and adding suffixes to any national values.

theme: INSPIRE Themes in all schemas. Spatial data themes, as defined in the Annexes of the INSPIRE Directive, compatible with DCAT and also DCAT-AP/GeoDCAT-AP. (E.g: https://www.w3.org/TR/vocab-dcat-3/#ex-spatial-coverage-bbox)
theme_{code}: For instance, theme_eu for MDR Themes or theme_es for Spanish NTI-RISP vocabulary themes.

Adaptation is also required:

ckan-docker (Solr fields, .env, etc.)
ckan-ogc theme/theme_inspire (Harvester/Ingester fields)
ckan-pycsw (theme == <gmd:keyword> from Thesaurus.)

Schema - Improve languages codelist

Expand the list of available metadata languages based on the controlled list: http://publications.europa.eu/resource/authority/language

Schemas:

https://github.com/mjanez/ckanext-scheming_dcat/blob/main/ckanext/scheming_dcat/schemas/README.md

Example:
https://github.com/mjanez/ckanext-scheming_dcat/blob/919e4bd63284071e3d4ee6ca014bd039fbd40af9/ckanext/scheming_dcat/schemas/geodcatap_es/geodcatap_es_dataset.yaml#L2202-L2218

CKAN 2.10 Update

Tip

Document common issues when upgrading extensions from CKAN 2.9 to CKAN 2.10.

Develop feature/form-tabs

Fix forms and validations to include tabs instead of stages in package_form templates.

Add temporal_resolution field

ISO 8601: https://www.w3.org/TR/xmlschema11-2/#duration

duration is a datatype that represents durations of time. The concept of duration being captured is drawn from those of [ISO 8601], specifically durations without fixed endpoints. For example, "15 days" (whose most common lexical representation in duration is "'P15D'") is a duration value; "15 days beginning 12 July 1995" and "15 days ending 12 July 1995" are not duration values. duration can provide addition and subtraction operations between duration values and between duration/dateTime value pairs, and can be the result of subtracting dateTime values. However, only addition to dateTime is required for XML Schema processing and is defined in the function ·dateTimePlusDuration·.
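
A value for temporal_resolution could be checked against the duration lexical form before it is saved. The regular expression below is a simplified approximation written for this note, not the full XML Schema grammar and not a validator shipped with the extension; is_duration is a hypothetical name.

```python
import re

# Simplified xsd:duration pattern: optional sign, "P", date part, optional "T" time part.
# The lookaheads reject a bare "P" and a dangling "T" with no time components.
DURATION_RE = re.compile(
    r"-?P(?=\d|T\d)"
    r"(\d+Y)?(\d+M)?(\d+D)?"
    r"(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?"
)


def is_duration(value):
    """Return True if value looks like an ISO 8601 / xsd:duration string."""
    return DURATION_RE.fullmatch(value) is not None


checks = {
    "P15D": is_duration("P15D"),        # the "15 days" example from the quoted text
    "PT1H30M": is_duration("PT1H30M"),
    "P": is_duration("P"),
    "P1YT": is_duration("P1YT"),
    "15 days": is_duration("15 days"),
}
```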

Schema - Improve spatial_uri field

Migrated from: mjanez/ckanext-scheming#10

Steps

GeoDCAT-AP Schema

Add generic continents to the spatial_uri field.
Add a generic icon for non-Spanish URIs to the spatial_uris.html template.
Add icons for countries.

DCAT-AP Schema

Add generic continents and European countries to the spatial_uri field.

DCAT-AP Info

Property URI	Used for Class	Vocabulary name	Vocabulary URI	Usage Note
dct:spatial	Catalogue, Dataset	EU Vocabularies Continents Named Authority List, EU Vocabularies Countries Named Authority List, EU Vocabularies Places Named Authority List, Geonames	http://publications.europa.eu/resource/authority/continent, http://publications.europa.eu/resource/authority/country, http://publications.europa.eu/resource/authority/place, http://sws.geonames.org/	The EU Vocabularies Name Authority Lists must be used for continents, countries and places that are in those lists; if a particular location is not in one of the mentioned Named Authority Lists, Geonames URIs must be used.
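
The usage note can be read as a prefix check on the value of dct:spatial. The sketch below classifies a URI by the vocabulary prefixes listed in the table; spatial_vocabulary is a hypothetical helper, and it only inspects the URI prefix, so it does not verify that the URI actually exists in the authority list.

```python
# Vocabulary prefixes taken from the table above.
EU_NAL_PREFIXES = {
    "continent": "http://publications.europa.eu/resource/authority/continent",
    "country": "http://publications.europa.eu/resource/authority/country",
    "place": "http://publications.europa.eu/resource/authority/place",
}
GEONAMES_PREFIX = "http://sws.geonames.org/"


def spatial_vocabulary(uri):
    """Return the vocabulary a dct:spatial URI belongs to, or None if it matches none."""
    for name, prefix in EU_NAL_PREFIXES.items():
        if uri.startswith(prefix + "/"):
            return name
    if uri.startswith(GEONAMES_PREFIX):
        return "geonames"
    return None


examples = {
    uri: spatial_vocabulary(uri)
    for uri in [
        "http://publications.europa.eu/resource/authority/country/ESP",
        "http://sws.geonames.org/2510769/",
        "https://www.geonames.org/2510769/kingdom-of-spain.html",
    ]
}
```

The last example is the human-readable Geonames page rather than the sws.geonames.org URI, so it is not recognised.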

Schema - Improve dcat_type with

Improve DCAT schemas to include all codelist elements from MARC Genre Terms: https://docs.google.com/spreadsheets/d/1p4h5y82-jiH68qGY_WeCV6HuyJKmuyC1XLjVbwjsAdg/edit#gid=1165291034

id	label	language
http://id.loc.gov/vocabulary/marcgt/art	Article	en
http://id.loc.gov/vocabulary/marcgt/bda	Bibliographic data	en
http://id.loc.gov/vocabulary/marcgt/bib	Bibliography	en
http://id.loc.gov/vocabulary/marcgt/doc	Document (computer)	en
http://id.loc.gov/vocabulary/marcgt/dtb	Database	en
http://id.loc.gov/vocabulary/marcgt/gov	Government publication	en
http://id.loc.gov/vocabulary/marcgt/jou	Journal	en
http://id.loc.gov/vocabulary/marcgt/leg	Legislation	en
http://id.loc.gov/vocabulary/marcgt/rpt	Reporting	en
http://id.loc.gov/vocabulary/marcgt/stp	Standard or specification	en
http://id.loc.gov/vocabulary/marcgt/ter	Technical report	en

Harvester - Custom INSPIRE CSW Catalogs harvesters

Adapt custom OGC harvester for INSPIRE Endpoints.

Fix - Fix schema/components output_validators for multiple fields

Internally all extra fields are stored as strings. If you are attempting to save and restore other types of data you will need to use output validators.

For example, if you use a simple "yes/no" question, you will need to let ckanext-scheming know that this data needs to be stored and retrieved as a boolean. This is achieved using the validators and output_validators keys.
 - field_name: is_camel_friendly
   label: Is this camel friendly?
   required: true
   preset: select
   choices:
     - value: false
       label: "No"
     - value: true
       label: "Yes"
   validators: scheming_required boolean_validator
   output_validators: boolean_validator
Info: https://github.com/ckan/ckanext-scheming#output_validators
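
Why the output validator is needed can be seen in a small round trip. The sketch below is a simplified stand-in written for illustration; to_bool is a hypothetical helper and not CKAN's boolean_validator, and the set of accepted truthy strings is an assumption.

```python
# Assumed set of strings treated as true (illustrative only).
TRUTHY = {"true", "yes", "t", "y", "1"}


def to_bool(value):
    """Convert a stored string back to a boolean (hypothetical helper)."""
    if isinstance(value, bool):
        return value
    if value is None:
        return False
    return str(value).strip().lower() in TRUTHY


stored = str(False)          # extras are stored as strings, so False becomes "False"
naive = bool(stored)         # any non-empty string is truthy in Python
restored = to_bool(stored)   # explicit conversion recovers the original value
```

Without a conversion step on output, the stored "False" would be treated as true by a plain truthiness check.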

Schemas

Fix all multiple fields in custom schemas.
- DCAT
- DCAT-AP
- GeoDCAT-AP (EU/ES)

Components of ckan-docker deployment

ckan-docker: Check API actions like package_show.
ckan-ogc: Check ingested JSON Lists.
ckan-pycsw: Check harvested JSON Lists.

Schema - Create a DCAT-AP Schema (EU)

Create a new schema based on the Spanish GeoDCAT-AP YAML for an international context without the mandatory elements of the NTI-RISP standard, only DCAT-AP.

Note
Info about the latest mapping: Schema GeoDCAT-AP.

Steps to complete a ckan_geodcatap_eu.yaml version from ckan_geodcatap.yaml:

Metadata properties:

Update theme_es of the NTI-RISP (ES) to theme_eu (MDR Data Themes) https://github.com/mjanez/ckanext-scheming/blob/9dcb4e3bacc86efa1accd2853d4a520d740ebf6e/ckanext/scheming/ckan_geodcatap.yaml#L142-L495
Change spatial_uri from the NTI-RISP (ES) multi-value codelist to a value stored as text, for example to allow Geonames values: https://www.geonames.org/2510769/kingdom-of-spain.html

dct:spatial

Vocabs: MDR Continents Named Authority List, MDR Countries Named Authority List, MDR Places Named Authority List, Geonames.

Definition: The MDR Name Authority Lists must be used for continents, countries and places that are in those lists; if a particular location is not in one of the mentioned Named Authority Lists, Geonames URIs must be used.

DCAT-AP

https://github.com/mjanez/ckanext-scheming/blob/9dcb4e3bacc86efa1accd2853d4a520d740ebf6e/ckanext/scheming/ckan_geodcatap.yaml#L1339-L1857

Add european languages from MDR Authority Language https://github.com/mjanez/ckanext-scheming/blob/9dcb4e3bacc86efa1accd2853d4a520d740ebf6e/ckanext/scheming/ckan_geodcatap.yaml#L513-L532
#2

Bug - CKAN 2.10 - Fix request.args()

2024-03-12 14:23:33,493 WARNI [ckan.lib.maintain] Function params() in module ckan.common has been deprecated since CKAN v2.10.0 and will be removed in a later release of ckan. Use `request.args` instead of `request.params`

DCAT - Custom DCAT-AP/GeoDCAT-AP (ES/EU) profiles in schemingdcat

ckanext-dcat

ckanext-scheming

Enhance schemas to ensure compatibility with new profiles:
- geodcat_ap_es to GeoDCAT-AP 3.0.0
- geodcat_ap to GeoDCAT-AP 3.0.0
- dcat_ap to DCAT-AP 3.0.0

Specifically, the multilingual fields.

Example

https://hermes.tragsatec.es/catalogo/api/3/action/group_list?all_fields=True

{
            "approval_status": "approved",
            "created": "2023-10-18T16:12:32.494529",
            "description": "",
            "display_name": "",
            "id": "64b56bd3-27da-4d20-b3c7-4cd3de945789",
            "image_display_url": "https://hermes.tragsatec.es/catalogo/dataset/6439fa33-1fe9-43ce-adf3-3a2e3177246a/resource/523a3d6a-9dbd-4059-b74e-20cce785f9c9/download/af_2.jpg",
            "image_url": "https://hermes.tragsatec.es/catalogo/dataset/6439fa33-1fe9-43ce-adf3-3a2e3177246a/resource/523a3d6a-9dbd-4059-b74e-20cce785f9c9/download/af_2.jpg",
            "is_organization": false,
            "name": "7_alternative-fuels",
            "num_followers": 0,
            "package_count": 10,
            "state": "active",
            "title": "",
            "type": "group"
        }

`tags`

{
            "id": "96d4a6d8-b9d2-470e-aad6-57689396882d",
            "name": "theme",
            "tags": [
                {
                    "id": "e50c57e9-54c5-4400-806f-3e30b313abd9",
                    "name": "ac",
                    "vocabulary_id": "96d4a6d8-b9d2-470e-aad6-57689396882d",
                    "display_name": "ac"
                },
                {
                    "id": "228741b0-c86c-4ee4-8f8a-3c83e45aaa71",
                    "name": "ad",
                    "vocabulary_id": "96d4a6d8-b9d2-470e-aad6-57689396882d",
                    "display_name": "ad"
                },
                {
                    "id": "a5f92a8d-0f07-4c8b-8241-516fb985d8d2",
                    "name": "af",
                    "vocabulary_id": "96d4a6d8-b9d2-470e-aad6-57689396882d",
                    "display_name": "af"
                },
                ...
            ]
}

Schema - Transform contact, publisher, author and maintainer to list of dicts (repeating subfields)

Ref: ckanext-dcat/dcat_ap_full

DCAT - Improve codelists

Allow downloading of EU Publications codelists. See ckan-mqa
