sdmx-twg / sdmx-ml


This repository is used for maintaining the SDMX-ML format specification

format iso sdmx specification standard xml


sdmx-ml's Issues

New SDMX-ML Message with default dimensions on top (old issue name: Disabling of Dimensions)

It is common in any one series that not all dimensions play an active role: some are “not applicable”, yet all dimensions must be present and contain a value in the data set. There is no syntactic way of indicating that a Dimension plays no role.
The stop-gap solution of arbitrary codes (_Z, _T) isn't supported by the information model.

Geo component and concept role name

(in case it is not fixed already)
It concerns an inconsistency between the public draft Technical Notes and a sample here in sdmx-ml regarding the concept role id.
(relates to #9)

Line 1172 of the Technical Notes states that the role is GEO_FEATURE_SET:

1171 Any Component used for representing a Geographical Feature Set, i.e., used to
1172 describe geographical characteristics, must have a “GEO_FEATURE_SET” role. Its
1173 Representation would be of textType="GeospatialInformation".

However, the sdmx-ml sample samples/Geospatial/geospatial_geocomponents.xml#L58 uses the role GEO.

Possible error in SDMX to VTL date/time conversion specifications

In Section 6, page 94, it is stated that the GregorianYear, GregorianYearMonth and GregorianMonth must be converted to a VTL date.

Since either the day or the month parts are not present, this poses a challenge in choosing an exact date.

Perhaps the converted VTL type should be time_period?

That would allow representing the entire year and/or month, with formatting that adheres to the SDMX representation.
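To make the ambiguity concrete, here is a minimal Python sketch. It assumes that a period-like target type should cover the whole interval named by the SDMX value. The helper name `gregorian_to_interval` and the (start, end) return shape are illustrative only and are not taken from the SDMX or VTL specifications; only the year and year-month forms are handled.

```python
import calendar
import re
from datetime import date
from typing import Tuple


def gregorian_to_interval(value: str) -> Tuple[date, date]:
    """Return the first and last day covered by a GregorianYear ("2020")
    or GregorianYearMonth ("2020-02") value.

    Hypothetical helper: it only shows that such values denote a span of
    days, so mapping them to one date requires an arbitrary choice.
    """
    match = re.fullmatch(r"(\d{4})(?:-(\d{2}))?", value)
    if match is None:
        raise ValueError(f"unsupported value: {value!r}")
    year = int(match.group(1))
    if match.group(2) is None:
        # A whole year: 1 January to 31 December.
        return date(year, 1, 1), date(year, 12, 31)
    month = int(match.group(2))
    # monthrange returns (weekday of first day, number of days in month).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1), date(year, month, last_day)


start, end = gregorian_to_interval("2020-02")
print(start.isoformat(), end.isoformat())
```

Any conversion to a single date has to pick one point inside this interval (first day, last day, or something else), which is the choice the issue argues the specification should not have to make.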

Unclear representation of geospatial artefacts

As far as I can tell, the schema definitions allow multiple representations of geospatial artefacts; all of these paths seem valid:

/m:Structure/m:Structures/s:Codelists[n]/s:Codelist[n]/s:GeoFeatureSetCode[n]/@value
/m:Structure/m:Structures/s:GeographicCodelists[n]/s:GeographicCodelist[n]/s:GeoFeatureSetCode[n]/@value

/m:Structure/m:Structures/s:Codelists[n]/s:Codelist[n]/s:GeoGridCode[n]/s:GeoCell
/m:Structure/m:Structures/s:GeoGridCodelists[n]/s:GeoGridCodelist[n]/s:GeoGridCode[n]/s:GeoCell

The SDMX 3.0.0 Registry specification also does not define special URN prefixes for geospatial artefacts. This suggests they use codelist.Codelist, which in turn suggests that the representation under /m:Structure/m:Structures/s:Codelists[n]/ is the intended one. The examples, however, use the specialized representation, in which case using codelist.Codelist seems inconsistent, and one would expect URN prefixes of codelist.GeographicCodelist and codelist.GeoGridCodelist.

(If in doubt, please handle this as a public review comment on SDMX 3.1 once the comment period begins.)

Migrate the documentation to readthedocs.org

Decision has been taken to migrate the documentation to readthedocs.org.

We should take this opportunity to consolidate and improve the current documentation.
This repository will detail the XML format part of the full documentation, which is glued together in sdmx-im.

SDMX 3.0: implement "feature 010 Improve mapping by enhancing the Structure Set artefact" for SDMX-ML messages

https://metadatatechnology.com/sdmx3/designs/010/baseline/SDMX3_StructureSet_Feature_Solution_v2.1.0.docx

Possible bug/breaking change in parent items regular expression

Hello,

Just in case it is a bug: it seems that the item Parent restriction doesn't accept values starting with a number, e.g. CL_REF_AREA.

https://github.com/sdmx-twg/sdmx-ml/blob/master/schemas/SDMXStructureCodelist.xsd#L77

sdmx-ml/schemas/SDMXCommonReferences.xsd

Line 1597 in 29f1a3d

<xs:simpleType name="SingleNCNameIDType">

Value '1' is not facet-valid with respect to pattern '[A-Za-z][A-Za-z0-9_\-]*(\.[A-Za-z][A-Za-z0-9_\-]*)*' for type 'SingleNCNameIDType'.
line/column: 2/1846343
cvc-type.3.1.3: The value '1' of element 'str:Parent'

At the same time, a code is allowed to start with a number:
https://github.com/sdmx-twg/sdmx-ml/blob/master/schemas/SDMXStructureBase.xsd#L49
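The behaviour can be reproduced outside a schema validator. The sketch below copies the pattern from the validation error quoted above and checks a few candidate ids with Python's `re` module; the function name `is_valid_parent_id` and the candidate list are illustrative assumptions. XSD patterns are implicitly anchored to the whole value, hence `fullmatch`.

```python
import re

# Pattern copied verbatim from the error message for SingleNCNameIDType.
PARENT_ID = re.compile(r"[A-Za-z][A-Za-z0-9_\-]*(\.[A-Za-z][A-Za-z0-9_\-]*)*")


def is_valid_parent_id(value: str) -> bool:
    """Return True if the whole value matches the quoted pattern."""
    return PARENT_ID.fullmatch(value) is not None


# Illustrative candidates: plain ids, a numeric id, and dotted paths.
for candidate in ("1", "A1", "A.B", "A.1"):
    print(candidate, is_valid_parent_id(candidate))
```

Every dot-separated segment must start with a letter, so a parent whose code id is purely numeric (or merely starts with a digit) is rejected even though the code itself is accepted.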

SDMX 3.0: implement "feature 029 Improving API data queries" for SDMX-ML messages

https://metadatatechnology.com/sdmx3/designs/029/baseline/029%20Improving%20API%20data%20queries%20v1.0.docx

If the 'context' is wildcarded (or has multiple selections) and the same data are defined for DSD, Dataflow and PA, would the data then have to be returned multiple times, each time in its context? If not, which context should be used by default or in preference?

Delete messages

Handling incremental deletion of messages in SDMX-ML is described in Section 3A, Part IV, page 69.

The last sentence in that section looks sub-optimal. It reads as follows: "Finally, to delete a data attribute or observation value it is recommended that the value to be deleted be supplied; however, it is only required that any valid value be provided."

This looks sub-optimal considering that, in order to delete a particular attribute value, all I need to know is the attribute ID and the key of the element to which that attribute is attached. This simple, logical way is how SDMX-EDI works by the way. So, why, in SDMX-ML, are we asked to supply the attribute value? Worse, why is it OK to supply "any valid value", which is even more confusing?

For structure-specific messages, the XML specification allows empty attribute values (e.g. CONF_STATUS=""), so there is no technical reason why the attribute value must be provided.

For generic messages, the current syntax is as follows:

<generic:Value id="BIS_TOPIC" value="ABBA"/>

In that case as well, it would be sufficient in delete messages to write:

<generic:Value id="BIS_TOPIC" />

The only reason we could think of is that the schema generated for structure specific messages would need to be dependent on the action, i.e. there would be one schema for delete messages and one schema for the other action types. This can easily be addressed in the RESTful API though, by adding an action parameter to schema queries.

Maybe this could be addressed within the scope of SDMX 3.0?
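As a rough illustration of the proposal, the Python sketch below reads a generic fragment and collects the ids of the attributes to delete, ignoring any `value` that happens to be present. The helper name `attributes_to_delete`, the sample fragment, and the namespace URI (assumed here to be the SDMX-ML 2.1 generic data namespace) are assumptions made for the example, not a description of how any existing SDMX tool processes delete messages.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the SDMX-ML 2.1 generic data format.
GENERIC_NS = "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic"


def attributes_to_delete(fragment: str) -> list:
    """Collect the ids of attribute values named in a delete fragment.

    Hypothetical helper: a 'value' attribute, if present, is ignored,
    because the id (plus the key of the parent element) already
    identifies what should be deleted.
    """
    root = ET.fromstring(fragment.strip())
    return [v.get("id") for v in root.iter(f"{{{GENERIC_NS}}}Value")]


fragment = f"""
<generic:Attributes xmlns:generic="{GENERIC_NS}">
  <generic:Value id="BIS_TOPIC"/>
  <generic:Value id="CONF_STATUS" value="F"/>
</generic:Attributes>
"""
print(attributes_to_delete(fragment))
```

Both forms lead to the same deletion request in this sketch, which is the point of the issue: the supplied value carries no information the receiver needs.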

Review isMultiLingual default in Data attributes/measures

Hi,

In SDMX 2.1 the default for the isMultiLingual XML attribute was true, but it was only used by metadata attributes.

In SDMX 3.0.0 it continues to default to true in XML, but now it is also used by data attributes/measures. This means that in SDMX 3.0.0 all data attributes with text type String allow multilingual values by default.

Is this the expected behavior?

The reason I ask is that it complicates backwards compatibility with SDMX 2.1, and also with JSON v1/v2, which default to false.
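The compatibility problem can be sketched as follows: the same structure, with the attribute left unspecified, resolves differently depending on which format's default is applied. The helper `is_multilingual` and the bare `TextFormat` element are illustrative assumptions; the two defaults are the ones stated in this issue.

```python
import xml.etree.ElementTree as ET


def is_multilingual(text_format: ET.Element, default: bool) -> bool:
    """Resolve isMultiLingual, falling back to the given format default."""
    raw = text_format.get("isMultiLingual")
    if raw is None:
        return default
    # "true" and "1" are the xs:boolean lexical forms for true.
    return raw in ("true", "1")


# A String representation that does not state isMultiLingual explicitly.
fmt = ET.fromstring('<TextFormat textType="String"/>')

xml_view = is_multilingual(fmt, default=True)    # SDMX-ML default per the issue
json_view = is_multilingual(fmt, default=False)  # SDMX-JSON default per the issue
print(xml_view, json_view)
```

A converter between the two formats would therefore have to write the attribute explicitly to preserve the meaning.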

Schema Typo

sdmx-ml/schemas/SDMXStructure.xsd

Line 230 in 29f1a3d

<xs:selector xpath="structure:MetadataConstraing"/>

MetadataConstraing should be MetadataConstraint

Metadata in Dataset question

Hello,

I am trying to build some examples of a Dataset that includes metadata attributes for SDMX 3.0.0.
In the Dataset, I see that I can include the <Metadata> level under the Dataset, Series, Group and Observation levels.
In SDMXDataStructureSpecific.xsd, Metadata has type http://www.sdmx.org/resources/sdmxml/schemas/v3_0/metadata/generic:MetadataSetType, e.g.

https://github.com/sdmx-twg/sdmx-ml/blob/master/schemas/SDMXDataStructureSpecific.xsd#L161

This means that for each <Metadata> we need to provide id, agencyID, Name, Metadataflow Ref and Target, as they are mandatory for MetadataSetType,
e.g.

<Series FREQ="A" CURRENCY="CAD" TIME_FORMAT="P1Y">
    <Obs TIME_PERIOD="1999" MEASURE1="123.456" />
    <Obs TIME_PERIOD="2000" MEASURE1="13.456" />
    <Obs TIME_PERIOD="2001" MEASURE1="34.56" />
    <Metadata agencyID="EXAMPLE" id="SERIES_LEVEL_MS">
        <common:Name xml:lang="en">Metadata at series level</common:Name>
        <ref:Metadataflow>urn:sdmx:org.sdmx.infomodel.metadatastructure.Metadataflow=EXAMPLE:TEST_MDF(1.0.0-draft)</ref:Metadataflow>
        <ref:Target>urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=EXAMPLE:DF_EXAMPLE(2.0.0-draft)</ref:Target>
        <ref:Attribute id="CONTACT">
            <ref:Attribute id="CONTACT_ORGANISATION">
                <ref:Value>BIG_ORG</ref:Value>
            </ref:Attribute>
        </ref:Attribute>
    </Metadata>
</Series>

Is this correct?

Or does this apply only to datasets that use a DSD that doesn't reference an MSD and metadata attributes, e.g. when they are used ad hoc, like annotations?

Just in case: the example on page 14 of https://sdmx.org/wp-content/uploads/SDMX_3-0-0_Major_Changes_FINAL-1_0.pdf doesn't include any of the mandatory maintainable XML attributes/elements.

SDMX 3.0: implement "feature 013 Reorganising constraints" for SDMX-ML messages

https://metadatatechnology.com/sdmx3/designs/013/approved/TF-2%20Reorganising%20Constraints%20v0.1.0.docx

Data message to fully replace data

Allow for an easy way to request an SDMX store to delete all data within a specific scope prior to adding the submitted data.
This could be through a new action type or through a more flexible wildcard syntax in the data itself.
It remains to be clarified whether such changes are to be logged and made available through the updatedAfter or includeHistory parameters, in case the implementation supports those parameters.

SDMX 3.0: implement "feature 002 Support reference metadata in the Restful APIs" for SDMX-ML messages

https://metadatatechnology.com/sdmx3/designs/002/draft/002%20Reference%20Metadata%20API%20V2.1.0%20no%20markup.docx

SDMX 3.0: implement "feature 028 Simplify DSD dimensions" for SDMX-ML messages

https://metadatatechnology.com/sdmx3/designs/028/approved/Simplify%20DSD%20Dimension%20definition%20v0.0.2.docx

Harmonize indenting across XSD files

Received from IMF

Currently there is a mix of spaces and tabs across the XSD files, which can break the indenting in various systems.
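A small check along these lines could be used to find the affected files before harmonizing them. This is a minimal sketch in Python; the function names and the directory argument are assumptions for illustration, and nothing here is part of the repository's tooling.

```python
from pathlib import Path


def indent_styles(text: str) -> set:
    """Return the indent styles used in `text`: 'tabs', 'spaces',
    or 'mixed' (tabs and spaces inside one line's leading whitespace)."""
    styles = set()
    for line in text.splitlines():
        # Leading run of spaces and tabs for this line.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if not indent:
            continue
        if " " in indent and "\t" in indent:
            styles.add("mixed")
        elif "\t" in indent:
            styles.add("tabs")
        else:
            styles.add("spaces")
    return styles


def inconsistent_files(directory: str) -> list:
    """List the .xsd files in `directory` that use more than one style."""
    result = []
    for path in sorted(Path(directory).glob("*.xsd")):
        styles = indent_styles(path.read_text(encoding="utf-8"))
        if len(styles) > 1 or "mixed" in styles:
            result.append(path.name)
    return result
```

Calling `inconsistent_files` on a local checkout's schema directory would list the files to normalize; which style to standardize on is a separate decision that the issue leaves open.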

SDMX 3.0: implement "feature 006 Discriminated union of codelists" for SDMX-ML messages

https://metadatatechnology.com/sdmx3/designs/006/approved/SDMX3%20Discriminated%20union%20of%20codelists%20v29062020.docx

SDMX 3.0: implement "feature 005 Codelist extension / composition" for SDMX-ML messages

https://metadatatechnology.com/sdmx3/designs/005/approved/005%20Code%20List%20Extension%20V1.0.0.docx

GeospatialInformation and WKT question

Hello

(My apologies if this is not the correct place to post questions about SDMX 3.0.0)

I am having problems understanding the GeospatialInformation text type as described in the Technical Notes, which I need in order to produce a sample dataset. If I understood correctly, the format is one of the following:

{ <WKT> } <: free text>?
<crs code>?, <precision>?: { <WKT> } <: free text>?

Is this correct?

For example, given the following WKT

POLYGON ((5.756835937499999 49.46633911082605, 6.51763916015625 49.46633911082605, 6.51763916015625 49.84860975344834, 5.756835937499999 49.84860975344834, 5.756835937499999 49.46633911082605))

and the DSD from the sample, would the expected format in a dataset look something like the following?

<Series INDICATOR="COB1" AREA="{( POLYGON ((5.756835937499999 49.46633911082605, 6.51763916015625 49.46633911082605, 6.51763916015625 49.84860975344834, 5.756835937499999 49.84860975344834, 5.756835937499999 49.46633911082605)) ) }: Part of Luxembourg and surrounding areas">
    <Obs TIME_PERIOD="1999" OBS_VALUE="1.583993822393823" />
</Series>
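For discussion purposes, here is a minimal Python sketch that assembles a value following the first form quoted above, `{ <WKT> } <: free text>?`. It encodes only the reading proposed in this question, which is unconfirmed; the helper name `geo_value` is hypothetical and the coordinates are shortened for readability.

```python
from typing import Optional
from xml.sax.saxutils import quoteattr


def geo_value(wkt: str, free_text: Optional[str] = None) -> str:
    """Compose '{ <WKT> }: free text' under the unconfirmed reading
    of the GeospatialInformation format discussed in this question."""
    value = "{ " + wkt + " }"
    if free_text:
        value += ": " + free_text
    return value


# Shortened coordinates; a WKT polygon ring ends on its starting point.
wkt = "POLYGON ((5.75 49.46, 6.51 49.46, 6.51 49.84, 5.75 49.84, 5.75 49.46))"
area = geo_value(wkt, "Part of Luxembourg and surrounding areas")

# quoteattr escapes the value and wraps it in quotes for use as an XML attribute.
print('<Series INDICATOR="COB1" AREA=' + quoteattr(area) + ">")
```

Whether the braces may contain additional parentheses around the WKT, as in the example in the question, is exactly what needs confirmation from the Technical Notes.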

SDMX 3.0: implement "feature 004 Standardise geospatial data exchange" for SDMX-ML messages

https://metadatatechnology.com/sdmx3/designs/004/approved/SDMX%20Geospatial%20Information%2020201014+EG+DB+EG.docx

Merge SDMXMessageFooter.xsd and SDMXMessage.xsd

Received from IMF

Currently, the header specifications are stored in SDMXMessage along with the message specifications, while the footer specifications are split into a separate file.

We’d like to treat headers and footers similarly, and to simplify the XSD structures as well as the SDMX message structure.

Because of this split, many SDMX files today need to declare an additional namespace just for footers.

Simplify DSD Dimensions

After agreeing to keep a single way of defining Dimensions within a DSD (see here), we need to specify the Dimension definition in XML.

In addition, it has to be decided how the available dataset formats will be managed. Considering the deprecation of the TimeDimension, it seems that the time-series specific dataset formats (generic and structure specific) are no longer relevant.

Handling of missing values

From @sosna
According to previous SDMX User Guides, NaN should be used. The guide from 2012 (i.e. published after SDMX 2.1) stated: “The first in the first has an OBS_VALUE of “NaN”. This is an XML expression that declared a value of “not a number”, thus allowing a “missing value” to be declared”. More recent versions of the Guide have seen significant rewriting and that section has disappeared in the process (I don’t know why).

The SDMX-ML 2.0 spec states: “In some of the SDMX-ML documents, an Observation is required (as in the Utility format) or it is desirable to indicate that a numerical value does not exist. While this information may be captured in an Observation-level attribute such as OBS_STATUS, with a code indicating that the value for the observation is missing, there is also a way to reliably indicate this state in the data itself. For this purpose, missing observation values – when included in an SDMX-ML data file – should be indicated using “NaN”. In XML, this indicates “not a number”, but is still valid in numeric fields. This avoids having to use a number (such as “-9999999” or “0”), along with a status code of “missing” (or similar construct) to indicate missing numeric values”.

The appropriate approach for missing values in SDMX-XML should be clarified.
The solution should also distinguish between “values to be set to 'missing'” and “values not to be changed when appending”.

Related ticket in SDMX-JSON: sdmx-twg/sdmx-json#122
Related ticket in SDMX-CSV: sdmx-twg/sdmx-csv#27
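The NaN convention quoted in this issue can be illustrated with a short Python sketch: "NaN" parses as a floating-point value, so a reader can map it to "missing" without a sentinel number. The function name `parse_obs_value` and the choice of `None` as the missing marker are assumptions for the example; the sketch does not address the separate question of "values not to be changed when appending".

```python
import math


def parse_obs_value(raw: str):
    """Parse an OBS_VALUE string; return None for the "NaN" missing marker."""
    value = float(raw)  # float() accepts "NaN", which is also valid for xs:double
    return None if math.isnan(value) else value


print(parse_obs_value("123.456"))
print(parse_obs_value("NaN"))
```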