sdmx-twg / sdmx-ml
This repository is used for maintaining the SDMX-ML format specification
In any given series, it is common that not all dimensions play an active role: some are “not applicable”. Nevertheless, all dimensions must be present and contain a value in the data set, and there is no syntactic way of indicating that a Dimension plays no role.
The stop-gap solution of arbitrary codes (_Z, _T) is not supported by the information model.
Hi
(in case it is not fixed already)
It is about an inconsistency between public draft Technical Notes and a sample here in sdmx-ml concerning the concept role id.
(relates to #9 )
The Technical Notes (line 1172) state that the role is GEO_FEATURE_SET:
1171 Any Component used for representing a Geographical Feature Set, i.e., used to
1172 describe geographical characteristics, must have a “GEO_FEATURE_SET” role. Its
1173 Representation would be of textType="GeospatialInformation".
But in the sdmx-ml sample samples/Geospatial/geospatial_geocomponents.xml#L58 the role is GEO.
In Section 6, page 94, it is stated that the GregorianYear, GregorianYearMonth and GregorianMonth must be converted to a VTL date.
Since either the day or the month parts are not present, this poses a challenge in choosing an exact date.
Perhaps the converted VTL type should be time_period?
That would allow the entire year and/or month to be represented, with formatting that adheres to the SDMX representation.
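To illustrate why a single date is a lossy target, here is a minimal Python sketch that maps a year or year-month value to the span of calendar days it covers. The function name `period_bounds` and the accepted lexical forms ("YYYY" and "YYYY-MM") are assumptions made for illustration; they are not taken from the SDMX or VTL specifications.

```python
import calendar
from datetime import date
from typing import Tuple


def period_bounds(value: str) -> Tuple[date, date]:
    """Return the first and last calendar day covered by 'YYYY' or 'YYYY-MM'.

    Hypothetical helper: it only shows that such values denote a span of
    days, so converting them to one date means picking an arbitrary day.
    """
    parts = value.split("-")
    year = int(parts[0])
    if len(parts) == 1:
        # A whole year: 1 January to 31 December.
        return date(year, 1, 1), date(year, 12, 31)
    if len(parts) == 2:
        month = int(parts[1])
        # monthrange returns (weekday of first day, number of days in month).
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    raise ValueError(f"unsupported period value: {value!r}")


if __name__ == "__main__":
    print(period_bounds("2021"))
    print(period_bounds("2020-02"))
```

Any conversion to a single date has to choose one point inside the returned span, which is the ambiguity raised above.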
As far as I can tell, the schema definitions allow multiple representations for geospatial artefacts; all of these paths seem valid:
/m:Structure/m:Structures/s:Codelists[n]/s:Codelist[n]/s:GeoFeatureSetCode[n]/@value
/m:Structure/m:Structures/s:GeographicCodelists[n]/s:GeographicCodelist[n]/s:GeoFeatureSetCode[n]/@value
/m:Structure/m:Structures/s:Codelists[n]/s:Codelist[n]/s:GeoGridCode[n]/s:GeoCell
/m:Structure/m:Structures/s:GeoGridCodelists[n]/s:GeoGridCodelist[n]/s:GeoGridCode[n]/s:GeoCell
The SDMX 3.0.0 Registry specification also does not define special URN prefixes for geospatial artefacts, suggesting they use codelist.Codelist, which in turn would suggest that the representation under /m:Structure/m:Structures/s:Codelists[n]/ is the intended one. However, the examples use the specialized representation, in which case using codelist.Codelist seems inconsistent, and one would expect URN prefixes of codelist.GeographicCodelist and codelist.GeoGridCodelist.
(In doubt, please handle this as a public review comment on SDMX 3.1 once the comment period begins.)
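As a rough illustration of the ambiguity, the following Python sketch looks for GeoFeatureSetCode values under both of the container paths listed above. The namespace URIs are assumed to follow the v3_0 pattern, and the sample document, its ids and its WKT value are invented for this example; none of it is taken from the repository's samples.

```python
import xml.etree.ElementTree as ET

# Assumed SDMX 3.0 namespace URIs (an assumption for this sketch).
NS = {
    "m": "http://www.sdmx.org/resources/sdmxml/schemas/v3_0/message",
    "s": "http://www.sdmx.org/resources/sdmxml/schemas/v3_0/structure",
}

# The two container/child pairs that both appear to be schema-valid.
CONTAINERS = [
    ("s:Codelists", "s:Codelist"),
    ("s:GeographicCodelists", "s:GeographicCodelist"),
]

# Invented sample using the specialized representation.
SAMPLE = """
<m:Structure xmlns:m="http://www.sdmx.org/resources/sdmxml/schemas/v3_0/message"
             xmlns:s="http://www.sdmx.org/resources/sdmxml/schemas/v3_0/structure">
  <m:Structures>
    <s:GeographicCodelists>
      <s:GeographicCodelist id="CL_GEO">
        <s:GeoFeatureSetCode id="LU" value="POINT (6.13 49.61)"/>
      </s:GeographicCodelist>
    </s:GeographicCodelists>
  </m:Structures>
</m:Structure>
"""


def geo_feature_values(xml_text):
    """Collect GeoFeatureSetCode/@value per container path."""
    root = ET.fromstring(xml_text)  # root is m:Structure
    found = {}
    for container, codelist in CONTAINERS:
        path = f"m:Structures/{container}/{codelist}/s:GeoFeatureSetCode"
        found[container] = [code.get("value") for code in root.findall(path, NS)]
    return found


if __name__ == "__main__":
    print(geo_feature_values(SAMPLE))
```

A consumer that wants to be tolerant has to check every candidate path like this, which is the practical cost of the schema allowing more than one location.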
Decision has been taken to migrate the documentation to readthedocs.org.
We should take this opportunity to consolidate and improve the current documentation.
This repository will detail the XML format part of the full documentation, which is glued together in sdmx-im.
Hello,
Just in case it is a bug: it seems that the item Parent restriction does not accept values starting with a number, e.g. CL_REF_AREA.
https://github.com/sdmx-twg/sdmx-ml/blob/master/schemas/SDMXStructureCodelist.xsd#L77
sdmx-ml/schemas/SDMXCommonReferences.xsd
Line 1597 in 29f1a3d
Value '1' is not facet-valid with respect to pattern '[A-Za-z][A-Za-z0-9_\-]*(\.[A-Za-z][A-Za-z0-9_\-]*)*' for type 'SingleNCNameIDType'.
line/column: 2/1846343
cvc-type.3.1.3: The value '1' of element 'str:Parent'
At the same time, a code is allowed to start with a number:
https://github.com/sdmx-twg/sdmx-ml/blob/master/schemas/SDMXStructureBase.xsd#L49
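The mismatch can be reproduced outside a validator with a short Python sketch. The first pattern is copied from the error message above; the second, looser pattern for code ids is an assumption about what the referenced schema line permits, and both function names are hypothetical.

```python
import re

# Pattern quoted in the validation error for SingleNCNameIDType.
SINGLE_NCNAME_ID = re.compile(
    r"[A-Za-z][A-Za-z0-9_\-]*(\.[A-Za-z][A-Za-z0-9_\-]*)*"
)

# Assumption: a looser id pattern that lets a code id start with a digit.
CODE_ID = re.compile(r"[A-Za-z0-9_@$\-]+")


def accepted_as_parent(value: str) -> bool:
    """XSD patterns are implicitly anchored, hence fullmatch."""
    return SINGLE_NCNAME_ID.fullmatch(value) is not None


def accepted_as_code_id(value: str) -> bool:
    return CODE_ID.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("1", "A1", "1A"):
        print(candidate, accepted_as_code_id(candidate), accepted_as_parent(candidate))
```

Under these assumptions, a code with id "1" can be declared but can never be referenced as a Parent.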
If the 'context' is wildcarded (or has multiple selections) and the same data are defined for DSD, Dataflow and PA, would the data then have to be returned multiple times, each time in its context? If not, which context should be used by default or in preference?
Handling incremental deletion of messages in SDMX-ML is described in Section 3A, Part IV, page 69.
The last sentence in that section looks sub-optimal. It reads as follows: "Finally, to delete a data attribute or observation value it is recommended that the value to be deleted be supplied; however, it is only required that any valid value be provided."
This looks sub-optimal considering that, in order to delete a particular attribute value, all I need to know is the attribute ID and the key of the element to which that attribute is attached. This simple, logical way is how SDMX-EDI works by the way. So, why, in SDMX-ML, are we asked to supply the attribute value? Worse, why is it OK to supply "any valid value", which is even more confusing?
For structure-specific messages, the XML specification allows empty attribute values (e.g. CONF_STATUS=""), so there is no technical reason why the attribute value must be provided.
For generic messages, the current syntax is as follows:
<generic:Value id="BIS_TOPIC" value="ABBA"/>
In that case as well, it would be sufficient in delete messages to write:
<generic:Value id="BIS_TOPIC" />
The only reason we could think of is that the schema generated for structure-specific messages would need to depend on the action, i.e. there would be one schema for delete messages and one schema for the other action types. This can easily be addressed in the RESTful API though, by adding an action parameter to schema queries.
Maybe this could be addressed within the scope of SDMX 3.0?
Hi,
In SDMX 2.1 the default for the isMultiLingual XML attribute was true, but it was only used by Metadata attributes.
In SDMX 3.0.0 it continues to default to true in XML, but now it is also used by data attributes/measures. This means that in SDMX 3.0.0 all data attributes that have text type String allow multilingual values by default.
Is this the expected behavior?
The reason I ask is that it complicates backwards compatibility with SDMX 2.1 and also with JSON v1/v2, which default to false.
sdmx-ml/schemas/SDMXStructure.xsd
Line 230 in 29f1a3d
MetadataConstraing should be MetadataConstraint
Hello,
I am trying to build some examples of a Dataset that includes metadata attributes for SDMX 3.0.0.
In the Dataset, I see that I can include the <Metadata> level under the Dataset, Series, Group and Observation levels.
In SDMXDataStructureSpecific.xsd, Metadata has the type http://www.sdmx.org/resources/sdmxml/schemas/v3_0/metadata/generic:MetadataSetType
e.g.
https://github.com/sdmx-twg/sdmx-ml/blob/master/schemas/SDMXDataStructureSpecific.xsd#L161
This means that for each <Metadata> we need to provide id, agencyID, Name, Metadataflow Ref and Target, as they are mandatory for MetadataSetType.
e.g.
<Series FREQ="A" CURRENCY="CAD" TIME_FORMAT="P1Y" >
  <Obs TIME_PERIOD="1999" MEASURE1="123.456" />
  <Obs TIME_PERIOD="2000" MEASURE1="13.456" />
  <Obs TIME_PERIOD="2001" MEASURE1="34.56" />
  <Metadata agencyID="EXAMPLE" id="SERIES_LEVEL_MS">
    <common:Name xml:lang="en">Metadata at series level</common:Name>
    <ref:Metadataflow>urn:sdmx:org.sdmx.infomodel.metadatastructure.Metadataflow=EXAMPLE:TEST_MDF(1.0.0-draft)</ref:Metadataflow>
    <ref:Target>urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=EXAMPLE:DF_EXAMPLE(2.0.0-draft)</ref:Target>
    <ref:Attribute id="CONTACT">
      <ref:Attribute id="CONTACT_ORGANISATION">
        <ref:Value>BIG_ORG</ref:Value>
      </ref:Attribute>
    </ref:Attribute>
  </Metadata>
</Series>
Is this correct?
Or does this apply only to datasets that use a DSD that doesn't reference an MSD and Metadata attributes, e.g. where they are used ad hoc, like annotations?
Note that the example on page 14 of https://sdmx.org/wp-content/uploads/SDMX_3-0-0_Major_Changes_FINAL-1_0.pdf does not include any of the mandatory maintainable XML attributes/elements.
Allow for an easy way to request an SDMX store to delete all data within a specific scope prior to adding the submitted data.
This could be through a new action type or through a more flexible wildcard syntax in the data itself.
It is to be clarified whether such changes are to be logged and made available through the updatedAfter or includeHistory parameters, in case the implementation supports those parameters.
Received from IMF
Currently there is a mix of spaces and tabs across the XSD, which can break the indentation in various systems.
Hello
(My apologies if this is not the correct place to post questions about SDMX 3.0.0)
I am having problems understanding the GeospatialInformation text type as described in the Technical Notes, which I need in order to produce a sample dataset. If I understood correctly, the format is one of the following:
{ <WKT> } <: free text>?
<crs code>?, <precision>?: { <WKT> } <: free text>?
Is this correct?
For example, given the following WKT
POLYGON ((5.756835937499999 49.46633911082605, 6.51763916015625 49.46633911082605, 6.51763916015625 49.84860975344834, 5.756835937499999 49.84860975344834, 5.756835937499999 49.46633911082605))
and the DSD from the sample, would the expected format in a dataset look something like the following?
<Series INDICATOR="COB1" AREA="{( POLYGON ((5.756835937499999 49.46633911082605, 6.51763916015625 49.46633911082605, 6.51763916015625 49.84860975344834, 5.756835937499999 49.84860975344834, 5.756835937499999 49.46633911082605)) ) }: Part of Luxembourg and surrounding areas">
<Obs TIME_PERIOD="1999" OBS_VALUE="1.583993822393823" />
</Series>
Received from IMF
Currently the header specifications are stored in SDMXMessage along with the message specifications, while the footer specifications are split into a separate file.
We’d like to treat headers and footers similarly and simplify both the XSD structures and the SDMX message structure.
Because of this split, many SDMX files today need to declare an additional namespace just for footers.
After agreeing on keeping one way of defining Dimensions within a DSD (see here), we need to specify the Dimension definition in XML.
In addition, it has to be decided how to manage the dataset formats that will be available. Considering the deprecation of the TimeDimension, it seems that the time-series specific dataset formats (generic and structure specific) are no longer relevant.
From @sosna
According to previous SDMX User Guides, NaN should be used (The guide from 2012 (i.e. published after SDMX 2.1) stated “The first in the first has an OBS_VALUE of “NaN”. This is an XML expression that declared a value of “not a number”, thus allowing a “missing value” to be declared"). More recent versions of the Guide have seen significant rewriting and that section has disappeared in the process (I don’t know why). The SDMX-ML 2.0 spec states: “In some of the SDMX-ML documents, an Observation is required (as in the Utility format) or it is desirable to indicate that a numerical value does not exist. While this information may be captured in an Observation-level attribute such as OBS_STATUS, with a code indicating that the value for the observation is missing, there is also a way to reliably indicate this state in the data itself. For this purpose, missing observation values – when included in an SDMX-ML data file – should be indicated using “NaN”. In XML, this indicates “not a number”, but is still valid in numeric fields. This avoids having to use a number (such as “-9999999” or “0”), along with a status code of “missing” (or similar construct) to indicate missing numeric values”.
The appropriate approach for missing values in SDMX-XML should be clarified.
The solution should also distinguish between “values to be set to 'missing'” and “values not to be changed when appending”.
Related ticket in SDMX-JSON: sdmx-twg/sdmx-json#122
Related Ticket in SDMX-CSV: sdmx-twg/sdmx-csv#27
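For reference, here is a minimal Python sketch of how a consumer might read a structure-specific series in which a missing observation is marked with "NaN", as the SDMX-ML 2.0 text quoted above recommends. The sample series, the attribute names and both helper functions are invented for illustration and do not settle the question raised in this ticket.

```python
import math
import xml.etree.ElementTree as ET

# Invented sample: the second observation uses "NaN" to mark a missing value.
SAMPLE = """
<Series FREQ="A">
  <Obs TIME_PERIOD="1999" OBS_VALUE="1.5"/>
  <Obs TIME_PERIOD="2000" OBS_VALUE="NaN"/>
</Series>
"""


def read_observations(xml_text):
    """Map TIME_PERIOD to a float; "NaN" parses to a float NaN."""
    observations = {}
    for obs in ET.fromstring(xml_text).iter("Obs"):
        observations[obs.get("TIME_PERIOD")] = float(obs.get("OBS_VALUE"))
    return observations


def missing_periods(observations):
    """Periods whose value is NaN, i.e. explicitly reported as missing."""
    return [period for period, value in observations.items() if math.isnan(value)]


if __name__ == "__main__":
    data = read_observations(SAMPLE)
    print(missing_periods(data))
```

Note that this covers only an explicitly reported missing value. An observation that is absent from the message entirely is a different case, which is why the distinction requested above between "set to missing" and "not to be changed when appending" still needs its own syntax.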
Reference Metadata attributes may only have one single value because each attribute value may have different child attributes.
See full description in closed ticket sdmx-twg/sdmx-json#123