sdmx-ml's People

Contributors

dosse, sdmx3mdt, tzaphkiel

sdmx-ml's Issues

Geo component and concept role name

Hi

(in case it is not fixed already)
It is about an inconsistency between the public draft Technical Notes and a sample here in sdmx-ml concerning the concept role ID.
(relates to #9 )

The Technical Notes line 1172 notes that the role is GEO_FEATURE_SET:

1171 Any Component used for representing a Geographical Feature Set, i.e., used to
1172 describe geographical characteristics, must have a “GEO_FEATURE_SET” role. Its
1173 Representation would be of textType="GeospatialInformation".

But the sdmx-ml sample samples/Geospatial/geospatial_geocomponents.xml#L58 uses the role GEO.

Possible error in SDMX to VTL date/time conversion specifications

In Section 6, page 94, it is stated that the GregorianYear, GregorianYearMonth and GregorianMonth must be converted to a VTL date.

Since either the day or the month parts are not present, this poses a challenge in choosing an exact date.

Perhaps the converted VTL type should be time_period?

That would allow representing the entire year and/or month, with formatting that adheres to the SDMX representation.
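
To make the ambiguity concrete, here is a minimal Python sketch (not taken from the specification) that expands a year or year-month string into the range of dates it covers. The helper name `period_bounds` and the accepted formats (`YYYY`, `YYYY-MM`) are assumptions made for illustration only.

```python
import calendar
import re
from datetime import date


def period_bounds(value: str) -> tuple[date, date]:
    """Return the first and last calendar day covered by 'YYYY' or 'YYYY-MM'.

    Hypothetical helper for illustration; the accepted formats are assumptions.
    """
    match = re.fullmatch(r"(\d{4})(?:-(\d{2}))?", value)
    if match is None:
        raise ValueError(f"unsupported period: {value!r}")
    year = int(match.group(1))
    if match.group(2) is None:
        # A year covers every day from 1 January to 31 December.
        return date(year, 1, 1), date(year, 12, 31)
    month = int(match.group(2))
    # monthrange returns (weekday of the first day, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1), date(year, month, last_day)


print(period_bounds("2020"))     # a 366-day span, not a single date
print(period_bounds("2020-02"))  # a 29-day span
```

Any single date picked from such a span is an arbitrary choice, which is why a period type looks like the more natural target.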

Unclear representation of geospatial artefacts

As far as I can tell, the schema definitions allow multiple representations for geospatial artefacts; all of these paths seem valid:

/m:Structure/m:Structures/s:Codelists[n]/s:Codelist[n]/s:GeoFeatureSetCode[n]/@value
/m:Structure/m:Structures/s:GeographicCodelists[n]/s:GeographicCodelist[n]/s:GeoFeatureSetCode[n]/@value

/m:Structure/m:Structures/s:Codelists[n]/s:Codelist[n]/s:GeoGridCode[n]/s:GeoCell
/m:Structure/m:Structures/s:GeoGridCodelists[n]/s:GeoGridCodelist[n]/s:GeoGridCode[n]/s:GeoCell

The SDMX 3.0.0 Registry specification also does not define special URN prefixes for geospatial artefacts, which suggests that they use codelist.Codelist and that the representation under /m:Structure/m:Structures/s:Codelists[n]/ is the intended one. The examples, however, use the specialized representation. In that case codelist.Codelist seems inconsistent, and one would expect URN prefixes of codelist.GeographicCodelist and codelist.GeoGridCodelist.

(In doubt, please handle this as a public review comment on SDMX 3.1 once the comment period begins.)

Migrate the documentation to readthedocs.org

The decision has been taken to migrate the documentation to readthedocs.org.

We should take this opportunity to consolidate and improve the current documentation.
This repository will detail the XML format part of the full documentation, which is glued together in sdmx-im.

Possible bug/breaking change in parent items regular expression

Hello,

Just in case it is a bug: it seems that the item Parent restriction doesn't accept values starting with a number, e.g. CL_REF_AREA.

https://github.com/sdmx-twg/sdmx-ml/blob/master/schemas/SDMXStructureCodelist.xsd#L77

<xs:simpleType name="SingleNCNameIDType">

Value '1' is not facet-valid with respect to pattern '[A-Za-z][A-Za-z0-9_\-]*(\.[A-Za-z][A-Za-z0-9_\-]*)*' for type 'SingleNCNameIDType'.
line/column: 2/1846343
cvc-type.3.1.3: The value '1' of element 'str:Parent'

At the same time, a code is allowed to start with a number:
https://github.com/sdmx-twg/sdmx-ml/blob/master/schemas/SDMXStructureBase.xsd#L49
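
The mismatch can be reproduced outside an XML validator. The sketch below is only an illustration: `PARENT_PATTERN` is copied from the error message above, while `CODE_ID_PATTERN` is an assumed, more permissive pattern standing in for the code ID type (check SDMXStructureBase.xsd for the actual definition). XSD patterns are implicitly anchored, hence `fullmatch`.

```python
import re

# Pattern copied from the validation error quoted above.
PARENT_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_\-]*(\.[A-Za-z][A-Za-z0-9_\-]*)*")

# Assumption: a permissive pattern standing in for the code ID type, which
# also allows a leading digit. The real definition lives in the schema.
CODE_ID_PATTERN = re.compile(r"[A-Za-z0-9_@$\-]+")


def accepted(pattern: re.Pattern, value: str) -> bool:
    """XSD pattern facets must match the whole value, so use fullmatch."""
    return pattern.fullmatch(value) is not None


for value in ("1", "1A", "A1"):
    print(value, accepted(PARENT_PATTERN, value), accepted(CODE_ID_PATTERN, value))
```

Under these assumptions a code such as `1` is a valid ID but cannot be referenced as a parent, which is the inconsistency reported here.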

Delete messages

Handling incremental deletion of messages in SDMX-ML is described in Section 3A, Part IV, page 69.

The last sentence in that section looks sub-optimal. It reads as follows: "Finally, to delete a data attribute or observation value it is recommended that the value to be deleted be supplied; however, it is only required that any valid value be provided."

This looks sub-optimal considering that, in order to delete a particular attribute value, all I need to know is the attribute ID and the key of the element to which that attribute is attached. This simple, logical way is how SDMX-EDI works by the way. So, why, in SDMX-ML, are we asked to supply the attribute value? Worse, why is it OK to supply "any valid value", which is even more confusing?

For structure-specific messages, the XML specification allows empty attribute values (e.g. CONF_STATUS=""), so there is no technical reason why the attribute value must be provided.

For generic messages, the current syntax is as follows:

<generic:Value id="BIS_TOPIC" value="ABBA"/>

In that case as well, it would be sufficient in delete messages to write:

<generic:Value id="BIS_TOPIC" />

The only reason we could think of is that the schema generated for structure specific messages would need to be dependent on the action, i.e. there would be one schema for delete messages and one schema for the other action types. This can easily be addressed in the RESTful API though, by adding an action parameter to schema queries.

Maybe this could be addressed within the scope of SDMX 3.0?
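
As a rough sketch of the proposal for generic messages, the Python snippet below builds a `generic:Value` element with or without the `value` attribute. The helper name `attribute_value` is hypothetical, the namespace URI is an assumption used only for illustration, and the id-only form is the one proposed in this issue, which the current schema may not accept.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI used for illustration only; take the real one
# from the schema version you are targeting.
GENERIC_NS = "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic"
ET.register_namespace("generic", GENERIC_NS)


def attribute_value(attribute_id, value=None):
    """Build a generic:Value element; leave out 'value' when none is given."""
    element = ET.Element(f"{{{GENERIC_NS}}}Value", {"id": attribute_id})
    if value is not None:
        element.set("value", value)
    return element


with_value = attribute_value("BIS_TOPIC", "ABBA")  # current syntax
id_only = attribute_value("BIS_TOPIC")             # proposed delete syntax
print(ET.tostring(with_value, encoding="unicode"))
print(ET.tostring(id_only, encoding="unicode"))
```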

Review isMultiLingual default in Data attributes/measures

Hi,

In SDMX 2.1 the default for the isMultiLingual XML attribute was true, but it was only used by metadata attributes.

In SDMX 3.0.0 it continues to default to true in XML, but now it is also used by data attributes/measures. This means that in SDMX 3.0.0 all data attributes with text type String allow multilingual values by default.

Is this the expected behavior?

I ask because it complicates backwards compatibility with SDMX 2.1, and also with JSON v1/v2, which defaults to false.
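
The compatibility concern can be sketched as follows. The defaults in the table are the ones described in this issue, not values verified against each specification, and the format keys and the helper name `effective_is_multilingual` are hypothetical.

```python
# Defaults as described in this issue; treat them as assumptions rather than
# as a statement of what each specification says.
IS_MULTILINGUAL_DEFAULT = {
    "sdmx-ml-3.0": True,
    "sdmx-json": False,
}


def effective_is_multilingual(declared, fmt):
    """Return the declared flag, or the format's default when it is omitted."""
    if declared is not None:
        return declared
    return IS_MULTILINGUAL_DEFAULT[fmt]


# The same attribute definition, with the flag omitted, resolves differently
# depending on the format it is read from.
print(effective_is_multilingual(None, "sdmx-ml-3.0"))
print(effective_is_multilingual(None, "sdmx-json"))
```

A converter between the two formats would therefore have to write the flag explicitly to preserve the meaning.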

Metadata in Dataset question

Hello,

I am trying to build some examples of a Dataset that includes metadata attributes for SDMX 3.0.0.
In the Dataset I see that I can include the <Metadata> level under the Dataset, Series, Group and Observation levels.
In SDMXDataStructureSpecific.xsd, Metadata has the type MetadataSetType from the namespace http://www.sdmx.org/resources/sdmxml/schemas/v3_0/metadata/generic, e.g.:

https://github.com/sdmx-twg/sdmx-ml/blob/master/schemas/SDMXDataStructureSpecific.xsd#L161

This means that for each <Metadata> we need to provide id, agencyID, Name, Metadataflow Ref and Target, as they are mandatory for MetadataSetType.
For example:

<Series FREQ="A" CURRENCY="CAD" TIME_FORMAT="P1Y">
    <Obs TIME_PERIOD="1999" MEASURE1="123.456" />
    <Obs TIME_PERIOD="2000" MEASURE1="13.456" />
    <Obs TIME_PERIOD="2001" MEASURE1="34.56" />
    <Metadata agencyID="EXAMPLE" id="SERIES_LEVEL_MS">
        <common:Name xml:lang="en">Metadata at series level</common:Name>
        <ref:Metadataflow>urn:sdmx:org.sdmx.infomodel.metadatastructure.Metadataflow=EXAMPLE:TEST_MDF(1.0.0-draft)</ref:Metadataflow>
        <ref:Target>urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=EXAMPLE:DF_EXAMPLE(2.0.0-draft)</ref:Target>
        <ref:Attribute id="CONTACT">
            <ref:Attribute id="CONTACT_ORGANISATION">
                <ref:Value>BIG_ORG</ref:Value>
            </ref:Attribute>
        </ref:Attribute>
    </Metadata>
</Series>

Is this correct?

Or does this apply only to datasets whose DSD does not reference an MSD and metadata attributes, e.g. when they are used ad hoc, like annotations?

Note that the example on page 14 of https://sdmx.org/wp-content/uploads/SDMX_3-0-0_Major_Changes_FINAL-1_0.pdf does not include any of the mandatory maintainable XML attributes/elements.

Data message to fully replace data

Allow for an easy way to request an SDMX store to delete all data within a specific scope prior to adding the submitted data.
This could be through a new action type or through a more flexible wildcard syntax in the data itself.
It should be clarified whether such changes are to be logged and made available through the updatedAfter or includeHistory parameters, in case the implementation supports those parameters.

GeospatialInformation and WKT question

Hello

(My apologies if this is not the correct place to post questions about SDMX 3.0.0)

I am having problems understanding the GeospatialInformation text type as described in the Technical Notes in order to produce a sample dataset. If I understood correctly the format is one of the following:

  1. { <WKT> } <: free text>?
  2. <crs code>?, <precision>?: { <WKT> } <: free text>?

Is this correct?

For example, given the following WKT

POLYGON ((5.756835937499999 49.46633911082605, 6.51763916015625 49.46633911082605, 6.51763916015625 49.84860975344834, 5.756835937499999 49.84860975344834, 5.756835937499999 49.46633911082605))

and the DSD from the sample, would the expected format in a dataset look something like the following?

<Series INDICATOR="COB1" AREA="{( POLYGON ((5.756835937499999 49.46633911082605, 6.51763916015625 49.46633911082605, 6.51763916015625 49.84860975344834, 5.756835937499999 49.84860975344834, 5.756835937499999 49.46633911082605)) ) }: Part of Luxembourg and surrounding areas">
            <Obs TIME_PERIOD="1999" OBS_VALUE="1.583993822393823"  />
</Series>

Merge SDMXMessageFooter.xsd and SDMXMessage.xsd

Received from IMF

Currently the header specifications are stored in SDMXMessage along with message specifications while footer specifications are split into a separate file.

We’d like to treat headers and footers similarly, simplify the XSD structures, as well as the SDMX message structure.

Because of this split, many SDMX files today need to declare an additional namespace just for footers.

Simplify DSD Dimensions

After agreeing on keeping one way of defining Dimensions within a DSD (see here) we need to specify the Dimension definition in XML.

In addition, managing the dataset formats that will be available has to be decided. Considering the deprecation of the TimeDimension, it seems that the time-series specific dataset formats (generic and structure specific) are no longer relevant.

Handling of missing values

From @sosna
According to previous SDMX User Guides, NaN should be used. The guide from 2012 (i.e. published after SDMX 2.1) stated: “The first in the first has an OBS_VALUE of “NaN”. This is an XML expression that declared a value of “not a number”, thus allowing a “missing value” to be declared”. More recent versions of the Guide have seen significant rewriting and that section has disappeared in the process (I don’t know why).

The SDMX-ML 2.0 spec states: “In some of the SDMX-ML documents, an Observation is required (as in the Utility format) or it is desirable to indicate that a numerical value does not exist. While this information may be captured in an Observation-level attribute such as OBS_STATUS, with a code indicating that the value for the observation is missing, there is also a way to reliably indicate this state in the data itself. For this purpose, missing observation values – when included in an SDMX-ML data file – should be indicated using “NaN”. In XML, this indicates “not a number”, but is still valid in numeric fields. This avoids having to use a number (such as “-9999999” or “0”), along with a status code of “missing” (or similar construct) to indicate missing numeric values”.

The appropriate approach for missing values in SDMX-XML should be clarified.
The solution should also distinguish between “values to be set to 'missing'” and “values not to be changed when appending”.

Related ticket in SDMX-JSON: sdmx-twg/sdmx-json#122
Related Ticket in SDMX-CSV: sdmx-twg/sdmx-csv#27
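
For illustration, a small Python sketch of how a reader could tell the cases apart. The XML fragment is hypothetical, and treating an absent OBS_VALUE as a separate case from "NaN" is an assumption about one possible convention, not something the specification currently states.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical fragment for illustration; attribute names follow the examples
# used elsewhere in these issues.
SERIES = """
<Series FREQ="A">
  <Obs TIME_PERIOD="1999" OBS_VALUE="123.456"/>
  <Obs TIME_PERIOD="2000" OBS_VALUE="NaN"/>
  <Obs TIME_PERIOD="2001"/>
</Series>
"""


def classify(obs):
    """Classify an observation as 'value', 'missing' (NaN) or 'absent'."""
    raw = obs.get("OBS_VALUE")
    if raw is None:
        # Assumption: no OBS_VALUE at all is kept distinct from an explicit NaN.
        return "absent"
    # "NaN" is a valid literal for float(), so numeric parsing still succeeds.
    return "missing" if math.isnan(float(raw)) else "value"


observations = {obs.get("TIME_PERIOD"): classify(obs) for obs in ET.fromstring(SERIES)}
print(observations)
```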
