meaningful-data / sdmxthon
Library with SDMX to Pandas, Pandas to SDMX, SDMX validation and SDMX metadata validation
Home Page: https://docs.sdmxthon.meaningfuldata.eu/
License: Apache License 2.0
Add Changelog file and github link using Read The Docs theme
Implement methods to write XML from these classes and then submit them to FMR.
Hi, thank you for this library.
I am trying to read an sdmx file (at this url: 'https://www.i14y.admin.ch/api/CodeLists/CL_HGDE_KT/exports/SDMX-ML/2.1?annotations=false') and I get errors due to the date string format. The library only allows the format %Y-%m-%d but I have date time format of the type 1978-12-31T23:00:00 in my file. Would it be possible to update the library to allow this format ?
Something like this in 'set_date_from_string' in model/utils.py
from datetime import datetime

def set_date_from_string(value: str, format_: str = "%Y-%m-%dT%H:%M:%S"):
    """Generic function to format a string to datetime

    Args:
        value: The value to be validated.
        format_: A strptime format string the value is expected to follow

    Returns:
        A datetime object

    Raises:
        ValueError: If the value violates the format constraint.
    """
    if value is None:
        return None
    # Try the requested format first, then the two supported fallbacks
    for fmt in (format_, "%Y-%m-%d", "%Y-%m-%dT%H:%M:%S"):
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            pass
    raise ValueError(f"Wrong date string format. The formats {format_} "
                     f"or %Y-%m-%d or %Y-%m-%dT%H:%M:%S "
                     f"should be followed. {str(value)} passed")
Thank you :)
Fix bug on get_codes as it raises an unhandled exception if the attribute descriptor is None.
message = read_sdmx('https://stats.bis.org/api/v1/data/BIS,WS_CBPOL_D,1.0/all/all?lastNObservations=1&detail=full')
dataflow_code = list(message.content.keys())[0]
dataset = message.content[dataflow_code]
print(dataset.dataflow)
returns None, shouldn't it return the dataflow?
Implement method on DataSet class to validate data using SDMX-CSV from the Pandas Dataframe.
The method signature should allow the FMR host to be defined (domain and port as separate arguments; the port must be an integer between 1 and 65535). Even though the remote validation is asynchronous, the process should wait until the validation has status completed and then return the response from FMR (in the future we may change this).
Endpoints to be used:
https://fmrwiki.sdmxcloud.org/Asynchronous_Data_Validation_and_Transformation_Web_Service
Process:
https://fmrwiki.sdmxcloud.org/Data_Validation_Web_Service#Dataset_with_Errors
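The port check and the wait-until-completed behaviour described above can be sketched as follows. This is only an illustration, not the library's implementation: check_port and wait_until_complete are hypothetical helper names, the status lookup is injected as a callable so that no FMR endpoint is assumed, and the "Status" key and "Complete" value are placeholders that would have to be aligned with the FMR documentation linked above.

```python
import time
from typing import Callable


def check_port(port: int) -> int:
    """Return the port unchanged if it is an integer in 1..65535."""
    # bool is a subclass of int, so reject it explicitly
    if isinstance(port, bool) or not isinstance(port, int):
        raise TypeError(f"port must be an integer, got {type(port).__name__}")
    if not 1 <= port <= 65535:
        raise ValueError(f"port must be between 1 and 65535, got {port}")
    return port


def wait_until_complete(fetch_status: Callable[[], dict],
                        interval: float = 0.5,
                        max_attempts: int = 20,
                        done_statuses: tuple = ("Complete",)) -> dict:
    """Poll fetch_status() until it reports a terminal status.

    fetch_status is a hypothetical callable that asks the server for the
    current state of the validation and returns the decoded response.
    The "Status" key and "Complete" value are placeholders.
    """
    for _ in range(max_attempts):
        response = fetch_status()
        if response.get("Status") in done_statuses:
            return response
        time.sleep(interval)
    raise TimeoutError("validation did not reach a terminal status in time")
```

Injecting the status lookup keeps the waiting logic testable without a running FMR instance, and the attempt limit avoids blocking forever if the server never reports completion.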
Add the webservices docstrings in functions and a brief description with .rst files.
Currently, parameters are mandatory (for instance, the Agency Code for BIS), even though most of the time we want to query a specific option (e.g., for BIS, if nothing is passed, the BIS agency should be selected).
Add support for SDMX-CSV in get_pandas_df and get_datasets. Make sure the limitations of these methods are still present (data file checking and same output)
Add tests for the following items:
Adapt read_sdmx to digest files and use convenience method to read SDMX-ML, SDMX-CSV and SDMX-JSON
Running the following code returns a NoneType. Is this expected?
import sdmxthon
url = "https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/dataflow/ESTAT/MET_EDAT_LFSE4/1.0?detail=full"
message = sdmxthon.read_sdmx(url)
resource = message.payload[list(message.payload)[0]]
df = resource[list(resource)[0]]
df.structure
I would expect it to return <s:Structure>, for example:
<Ref id="MET_EDAT_LFSE4" version="18.0" agencyID="ESTAT" package="datastructure" class="DataStructure"/>
Is this correct or is there another way to handle it?
Thank you very much in advance.
Change message payload to retrieve the sole object if only one of one kind is found:
Add flag use_dataset_id on get_pandas_df
Check the error raised at "sdmxthon/parsers/data_read.py", line 135, in create_dataset:
df = pd.DataFrame(dataset[OBS]).replace(np.nan, '')
It should be able to go through when the DataFrame has only 1 record.
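One possible cause, stated here as an assumption and not confirmed by the report, is that the parser yields a single dict instead of a list of dicts when the dataset holds exactly one observation. pandas cannot build a DataFrame from a dict of scalar values without an index, so normalising the observations to a list first would avoid that case. as_record_list is a hypothetical helper name used only for this sketch.

```python
from typing import Any, Dict, List, Union

Record = Dict[str, Any]


def as_record_list(obs: Union[Record, List[Record], None]) -> List[Record]:
    """Normalise parsed observations to a list of records.

    Assumption: a lone observation arrives as a single dict,
    while several observations arrive as a list of dicts.
    """
    if obs is None:
        return []
    if isinstance(obs, dict):
        # Wrap the single record so it becomes one row, not scalar columns
        return [obs]
    return list(obs)
```

The result could then be passed to pd.DataFrame in place of dataset[OBS], giving one row per record regardless of how many observations were read.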
Add functionality to validate a dataset column with the codes of a Codelist, passed as argument to the function.
It must support unique_id as argument and download of the codelist from the WebService, if available, as well as passing the Codelist object.
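The core check could look like the sketch below. It is deliberately independent of the library: find_invalid_codes is a hypothetical name, the column is taken as any iterable of values (a pandas Series would qualify), and the valid codes are assumed to be already extracted from the Codelist into a set of strings. Resolving a unique_id and downloading the codelist from the web service are left out.

```python
from typing import Iterable, List, Set


def find_invalid_codes(values: Iterable, valid_codes: Set[str]) -> List[str]:
    """Return the distinct values that are not codes of the codelist.

    Values are compared as strings and reported once each,
    in order of first appearance.
    """
    seen = set()
    invalid = []
    for value in values:
        code = str(value)
        if code not in valid_codes and code not in seen:
            seen.add(code)
            invalid.append(code)
    return invalid
```

An empty result means every value in the column is a code of the codelist.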
For reader, use lxml.etree to iterate over data and get pandas dataframe. Add Streaming option.
For writer, use this: https://stackoverflow.com/a/43567758
Explore how to handle streaming reading and writing.
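A minimal sketch of the streaming read is shown below. It uses iterparse from the standard library's xml.etree.ElementTree so that it runs without extra dependencies; lxml.etree provides an iterparse function with a similar interface. The element name "Obs" and the attribute-based layout are assumptions for illustration, and iter_observations is a hypothetical name.

```python
import io
import xml.etree.ElementTree as ET
from typing import Dict, Iterator


def iter_observations(source, tag: str = "Obs") -> Iterator[Dict[str, str]]:
    """Yield the attributes of each matching element as it is parsed.

    source is a filename or a binary file object. Only "end" events are
    used, so every element is complete when it is inspected.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        # Drop the "{namespace}" prefix, if any, to compare local names
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == tag:
            yield dict(elem.attrib)
            # Release the element's content once it has been consumed
            elem.clear()


sample = io.BytesIO(
    b'<DataSet><Series FREQ="A">'
    b'<Obs TIME_PERIOD="2020" OBS_VALUE="1.5"/>'
    b'<Obs TIME_PERIOD="2021" OBS_VALUE="2.0"/>'
    b'</Series></DataSet>'
)
rows = list(iter_observations(sample))
```

Because the generator yields one record at a time, rows can be collected in chunks and turned into DataFrames incrementally instead of loading the whole file first. Series-level attributes such as FREQ are not merged into the rows in this simplified version.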
Running the following does not return any codes:
import sdmxthon
message = sdmxthon.read_sdmx("https://www.ilo.org/sdmx/rest/codelist/ILO/CL_AREA/1.0?detail=full")
message.content["Codelists"]['ILO:CL_AREA(1.0)'].items
However, running the same code for Eurostat seems to work properly:
message = sdmxthon.read_sdmx("https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/GEO/1.0?detail=full")
message.content["Codelists"]['ESTAT:GEO(1.0)'].items
Could you kindly help with this issue? Thank you
message = read_sdmx('https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/data/BOP_C6_A/?lastNObservations=1', validate=True)
If a dataset does not have an associated structure, calling the semantic_validation method should raise a specific exception.
Review the dataset attributes bug in _check_DA_keys when removing a key from the Dataset attributes.
Review SDMX-CSV v2 to include the action column (default value I -> Inform)
Implement method in Message class that allows a Structure message to be submitted into FMR.
Documentation on FMR endpoint: https://fmrwiki.sdmxcloud.org/Submit_Structures_Web_Service
Requires basic authentication and an admin account.
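As a rough sketch of the transport side only, the request could be prepared with the standard library as shown below. build_submit_request is a hypothetical name, the endpoint URL is passed in by the caller and should be taken from the FMR documentation linked above, and the "application/xml" content type is an assumption. The request is built but not sent.

```python
import base64
import urllib.request


def build_submit_request(url: str, xml_payload: bytes,
                         user: str, password: str) -> urllib.request.Request:
    """Prepare a POST request carrying an SDMX-ML structure message.

    Basic authentication is sent as a base64-encoded "user:password"
    token in the Authorization header.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=xml_payload,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            # Assumed content type for an SDMX-ML payload
            "Content-Type": "application/xml",
        },
    )
```

The prepared request could then be sent with urllib.request.urlopen, with the response body inspected for the submission result.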
Change read_sdmx to use webservices by downloading any possible metadata we may find using the query builder.
I'm following the notebook for SDMX comprehensive example, but I keep getting an error while initializing the Dataset object. (See attached)
Any thoughts on how this can be resolved?
Add support for XML files as string on all methods, focusing on read_sdmx