sdmx's People

Contributors

amattioc, bowerth, charphi, dependabot[bot], lmorandini, ma-bdi, pdgilbert, sosna, viguice, vpinna80

sdmx's Issues

Allow for specification of encoding in getCodes() (RJSDMX)

When I request codes from Eurostat in R using RJSDMX, non-ASCII characters are not displayed correctly. See the following example:

> c_GEO <- getCodes("EUROSTAT","demo_r_pjanaggr3","GEO")
> c_GEO[[2]] # should return "Thüringen"
[1] "Thüringen"

I ran the result through all available encodings, but did not manage to retrieve the correct value:

> str <- sapply(iconvlist(),function(x){iconv(c_GEO[[2]], from=x, to="")[[1]]})
> grep("Thüringen",str,value = T)
named character(0)

Would it be possible to pass an encoding as argument to getCodes() and similar functions to remedy this situation?

I am aware that this might be a problem of my specific setup, however I am unsure which parameters to report. Please let me know if you need additional session-info.

Apart from that: Great package! Thanks for the effort!
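
One possible explanation, offered as an assumption rather than a diagnosis: output such as "Thüringen" is what UTF-8 text looks like after it has been read as Latin-1 and then encoded to UTF-8 a second time. If that is what happens here, a single iconv() call with to="" may not repair it, but converting back to Latin-1 and then re-marking the bytes as UTF-8 can. A minimal sketch in base R, with the sample strings built from code points so the script itself stays pure ASCII:

```r
# Build the garbled value from code points: "Th", U+00C3, U+00BC, "ringen".
garbled <- intToUtf8(c(84, 104, 195, 188, 114, 105, 110, 103, 101, 110))

# Step 1: undo the second encoding pass (UTF-8 text -> Latin-1 bytes).
fixed <- iconv(garbled, from = "UTF-8", to = "latin1")
# Step 2: those bytes are themselves valid UTF-8, so declare them as such.
Encoding(fixed) <- "UTF-8"

# The intended name, again from code points (U+00FC is the u-umlaut).
expected <- intToUtf8(c(84, 104, 252, 114, 105, 110, 103, 101, 110))
print(fixed == expected)
```

If the codes returned by getCodes() are garbled in a different way (for example, correct UTF-8 bytes that are merely marked with the wrong encoding), only the second step would be needed.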

Selecting Value for FREQ or REF_AREA dimension renders Worldbank IDs invalid

With the World Bank as provider, it is possible to create valid queries only as long as no value has been selected for FREQ or REF_AREA. For example, testing the query WDI/.BX_KLT_DINV_WD_GD_ZS. identifies 248 time series, while WDI/.BX_KLT_DINV_WD_GD_ZS.GBR returns an error.

I don't know if this is a problem with the World Bank (beta provider) or if it is related to RJSDMX.

minor enhancement of Wiki

On the RJSDMX Wiki page the install_github() command should be changed from

install_github(repo = "SDMX", username = "amattioc", subdir = "RJSDMX")

to

install_github(repo = "amattioc/SDMX", subdir = "RJSDMX")

as the username parameter has been deprecated.

This is clearly a minor point, and filing an issue for it is somewhat excessive. What would be your suggestion for reporting textual changes in the Wiki? I am still new to git / GitHub and do not yet know how to address these things properly in the given framework.

suppress messages in an R session

It should be possible to completely suppress messages in an R session, but I think the Java code prints output even when the configuration file sets the levels to WARNING. (Perhaps level OFF should be available?)

z <- try(getSDMX("OECD", 'G20_PRICES.CAB.CP.IXOB.M'), silent=TRUE)
Dec 05, 2014 11:46:49 AM it.bankitalia.reri.sia.sdmx.client.RestSdmxClient runQuery
SEVERE: Connection failed. HTTP error code : 404, message: Not Found
SDMX meaning: No results matching the query.

The warning does not need to be printed; it is available to the R programmer:

attr(z,"condition")$message
[1] "it.bankitalia.reri.sia.util.SdmxException: Connection failed. HTTP error code : 404, message: Not Found\nSDMX meaning: No results matching the query."

sdmxHelp enhancement request

It would be a lot more useful if there were a way to show the remaining dimensions of a DSD in the sdmxHelp app once one dimension has been selected. For example, if I am looking at a specific ECB dataflow and limit it to quarterly FREQ, it would be nice at least to get a full codelist of the remaining dimensions, and also to know which other dimensions become unavailable when quarterly frequency is selected. As it is now, there is no real way to determine whether you have constructed a valid dataflow code, as each dimension of a DSD is given separately. This is not a problem for most of the other European government websites because their web interfaces are usable, but the ECB's web interface for searching available codes is so clunky as to be unusable for serious researchers who need multiple series. BTW, thanks for this package. I had been trying to find some other way of getting code definitions by installing all the SDMX tools provided through sdmx.org, and the documentation is not very helpful.

make getSDMX() return error code for non-successful retrieval

Sometimes getSDMX() is not able to retrieve the requested data. For Eurostat this is usually the case in the following two scenarios:

There is no data available for the specified ID e.g.
getSDMX(provider = "EUROSTAT",id = "teilm310.A.NSA.JOBRATE.TOTAL.C.")

The amount of data is beyond a specific size limit e.g.
getSDMX(provider = "EUROSTAT",id = "nrg_110a.A.KTOE.2410..")

In both cases I get an error message that reads The query: XXX did not match any time series on the provider.

While this is ok for downloading a single dataset it is a problem when I want to download a list of dataflow-ids using lapply(). Once an error is encountered, the lapply() breaks. Would it be possible to return an error code in these cases which I could use in a condition to avoid breaking the lapply() code?

An example of this situation is illustrated in the following gist, in which some information is extracted for a list of dataflows:
https://gist.github.com/Tungurahua/a1aa7044a3a46c8b8eec

In both cases the EUROSTAT server returns an HTML page containing an error message. It would be ideal if this message could be shown in the R console. But as usual I don't know if this behaviour can be generalized across different REST providers.
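
As a workaround on the R side, each call can be wrapped in tryCatch() so that a failure becomes a value rather than aborting lapply(). The following is only a sketch: fetch() is a hypothetical stand-in for getSDMX() that fails for one ID so the example runs offline, and the "fetch_error" class name is likewise made up for illustration.

```r
# Hypothetical stand-in for getSDMX(): fails for one id so the sketch runs offline.
fetch <- function(id) {
  if (id == "bad.flow") {
    stop("The query: ", id, " did not match any time series on the provider.")
  }
  list(id = id)
}

# Wrap each call so that an error is returned as a tagged message
# instead of interrupting the surrounding lapply().
safe_fetch <- function(id) {
  tryCatch(
    fetch(id),
    error = function(e) structure(conditionMessage(e), class = "fetch_error")
  )
}

ids <- c("good.flow.1", "bad.flow", "good.flow.2")
results <- lapply(ids, safe_fetch)
names(results) <- ids

# Flag the failed ids so they can be filtered out or reported afterwards.
failed <- vapply(results, inherits, logical(1), what = "fetch_error")
print(names(results)[failed])
```

With the real function, replacing fetch(id) by getSDMX(provider = "EUROSTAT", id = id) inside safe_fetch() should behave the same way, provided the failure is raised as an ordinary R error.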

Best
Albrecht

How to un-select a dimension?

Is there a way to un-select a dimension? Once I have selected e.g. an annual frequency, the only way to clear the frequency dimension is to select another data flow. The expected behaviour would be that Ctrl + Click not only clears the selection field but also the selection in the REST-ID. This does not work if a single item is to be de-selected. This is especially cumbersome when working with WB indicators as they do not distinguish individual indicators by FlowID but by the SERIES dimension.

See example below:

Annual frequency selected turns up in REST-ID
[Screenshot: snip01]

Annual frequency de-selected does not clear in REST-ID
[Screenshot: snip02]

Default proxy

Handle a new conf key for default proxy, useful with single proxy environments

Manage SDMX Compact/Generic data files

Add a new function for importing SDMX data files from disk or from network URL. Useful for providers that do not provide web services but disseminate SDMX data.

SDMXHelper2 - generic issue for tracking the evolution of the new helper

For data providers with a large number of flows (e.g. Eurostat/ILO) it is almost impossible to find a specific flow as they are in random order. From a user perspective I could think of the following improvements:

  • sort dataflows alphabetically by code
  • display the dataflows in a grid that's sortable either by code or description. Displaying them in a grid would also improve readability.
  • Make the list of dataflows searchable

However, I don't know how difficult each of them would be to implement.

When using the package in R, I think that getProviders() and getFlows() do a good job of identifying a dataflow or a group of dataflows. However, the selection of dimensions and codes is where a GUI is really helpful. Thus another idea would be to allow the helper to accept a combination of data provider and dataflow as input parameters. Something like sdmxHelper(provider="EUROSTAT", flow="nrg105") would open the helper with this selection set.

Java OECD client returns HTTP error 400

I was testing your Java components, but it seems to me that the OECD client doesn't work. I tried this simple request:

try {
    GenericSDMXClient client = SDMXClientFactory.createClient("OECD");
    client.getDataflows();
} catch (SdmxException e) {
    e.printStackTrace();
}

and I got

III 18, 2015 10:39:39 DOP. it.bancaditalia.oss.sdmx.client.RestSdmxClient runQuery
INFO: Contacting web service with query: http://stats.oecd.org/restsdmx/sdmx.ashx//GetDataStructure/ALL
III 18, 2015 10:39:39 DOP. it.bancaditalia.oss.sdmx.client.RestSdmxClient runQuery
SEVERE: Connection failed. HTTP error code : 400, message: Bad Request
SDMX meaning: there is a problem with the syntax of the query
it.bancaditalia.oss.sdmx.util.SdmxException: Connection failed. HTTP error code : 400, message: Bad Request
SDMX meaning: there is a problem with the syntax of the query
    at it.bancaditalia.oss.sdmx.client.RestSdmxClient.runQuery(RestSdmxClient.java:288)
    at it.bancaditalia.oss.sdmx.client.custom.DotStat.getDataflows(DotStat.java:109)
    at eu.keyup.kejml.sdmx.App.main(App.java:14)

ABS wildcard white space error

works:
tts <- getSDMX('ABS', 'CPI.1.50.10001.10.Q')
names(tts)
[1] "CPI.1.50.10001.10.Q"

fails:
tts <- getSDMX('ABS', 'CPI.1.*.10001.10.Q')
Dec 12, 2014 2:10:41 PM it.bankitalia.reri.sia.sdmx.client.custom.RestSdmx20Client getTimeSeries
SEVERE: Exception caught parsing results from call to provider ABS
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
it.bankitalia.reri.sia.util.SdmxException: Exception. Class: javax.xml.stream.XMLStreamException .Message: ParseError at [row,col]:[1,63]
Message: White spaces are required between publicId and systemId.

getFlows('IMF') not working.

getFlows('IMF')
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
it.bancaditalia.oss.sdmx.util.SdmxException: Connection failed. HTTP error code : 404, message: Not Found
SDMX meaning: No results matching the query.

It appears this web-services query from the SDMX Helper Tool no longer works:
http://sdmxws.imf.org/RestSDMX2/sdmx.ashx/GetKeyFamily/ALL

This link seems to suggest that there is no RESTful way of obtaining this list.
http://sdmxws.imf.org/Gateway/Help.aspx

Any thoughts on how to get this? Thanks.

updatedAfter support

I was wondering whether RJSDMX can handle SDMX queries with an updatedAfter parameter. I haven't been able to find any worked examples.
Does it require that the publisher is already using version 2.1 of SDMX? The link below is a replica of section 7 of the official SDMX standards, which suggests that you should be able to run a query that just returns the deltas since the last query ... any new INSERTS, UPDATES and DELETES.
https://github.com/sdmx-twg/sdmx-rest/blob/master/v2_1/ws/rest/docs/4_4_data_queries.md

As a use case, take today's IMF global growth revisions: could a query to their REST service return just those values that have been revised, if the last forecast date were used as the 'updatedAfter' parameter?
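
For reference, in the SDMX 2.1 REST syntax updatedAfter is passed as a query-string parameter on a data query, and the timestamp has to be percent-encoded. The sketch below only builds such a URL in base R; the endpoint, flow and key are hypothetical placeholders, and whether a given provider honours the parameter is a separate question.

```r
# Hypothetical endpoint, flow and key; only the parameter name
# comes from the SDMX 2.1 REST data query syntax.
base_url <- "https://example.org/sdmx/rest"
flow <- "SOME_FLOW"
key <- "A.B.C"
last_run <- "2015-01-20T00:00:00+01:00"

# reserved = TRUE also escapes ':' and '+', which would otherwise
# be misinterpreted inside a query string.
query <- paste0("updatedAfter=", utils::URLencode(last_run, reserved = TRUE))
url <- paste0(base_url, "/data/", flow, "/", key, "?", query)
print(url)
```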

Thanks for your insight.

Scope of RJSDMX

This is a question related to my side note in #46. Have you developed an idea for the overall scope of the SDMX package and the RJSDMX package in general? Is it intended as a low-level entry point to SDMX data which other packages are supposed to build upon, or is the intention to create a one-stop SDMX data provider?

I am asking this as for the sources I work with (EUROSTAT) I think it would make sense to wrap the retrieved data (including dictionaries) into an object structure (S3, S4). I think this would have the following advantages:

  • Plot labels could be extracted from the same object as the data
  • A separate slot could be used to store metadata like the date and time of download
  • The structure could be leveraged by a fortify method for ggplot2
  • It would allow for an overall reproducible way to extract and visualize Eurostat data.

While this might make sense for Eurostat data (which is mostly of annual resolution), it could be problematic for data with higher resolution like exchange rates, so I don't know if such an object structure should be tackled in RJSDMX or rather in a downstream package tailored to a specific data provider (e.g. RJEUROSTAT).

If you have any thoughts on that matter I would be keen to hear your opinion. My overview of the SDMX process as well as R-programming is currently quite limited, so I would be happy to get an expert opinion.

Best
Albrecht

Does ILO server work?

I have not been able to get anything from ILO:
require("RJSDMX")

z <- getFlows('ILO')
Using central configuration: /home/paul/.SdmxClient
Dec 05, 2014 11:05:16 AM it.bankitalia.reri.sia.sdmx.client.RestSdmxClient runQuery
SEVERE: Connection failed. HTTP error code : 502, message: Bad Gateway

Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
it.bankitalia.reri.sia.util.SdmxException: Connection failed. HTTP error code : 502, message: Bad Gateway

tts <- getSDMX("ILO", 'EAP_TEAP_SEX_AGE_NB.AUS...*')
Dec 05, 2014 11:20:31 AM it.bankitalia.reri.sia.sdmx.client.RestSdmxClient runQuery
SEVERE: Connection failed. HTTP error code : 404, message: Not Found
SDMX meaning: No results matching the query.
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
it.bankitalia.reri.sia.util.SdmxException: Connection failed. HTTP error code : 404, message: Not Found
SDMX meaning: No results matching the query.

empty result should throw error

In the case of no series being returned I think rather than returning an empty list it may be better to throw an error, "series ___ not found" or "no data available for series ___", if those can be distinguished. e.g.

z <- getSDMX('IMF', 'PGI.CA.BIS.FOSLB.A.L_M')
str(z)
Named list()

length(z)
[1] 0
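
Until the package itself raises an error, a thin check on the caller's side can do so. This is a minimal sketch with a hypothetical helper name, require_series(); it cannot distinguish "series not found" from "no data available", since an empty list carries no such information.

```r
# Hypothetical helper: raise an error when a query result contains no series.
require_series <- function(result, id) {
  if (length(result) == 0) {
    stop("no series returned for query '", id, "'", call. = FALSE)
  }
  result
}

# Mimics the empty "Named list()" shown above, without contacting a provider.
empty <- structure(list(), names = character(0))

outcome <- tryCatch(
  require_series(empty, "PGI.CA.BIS.FOSLB.A.L_M"),
  error = function(e) conditionMessage(e)
)
print(outcome)
```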

sdmxHelp() does not clear dimensions when moving to unretrievable dataflow

If you switch from a functioning dataflow1 to an erroneous dataflow2, the "Select Dimensions" box retains the dimensions of dataflow1. The box should be empty instead.
To illustrate the behaviour, select Eurostat as provider, sort Code ID alphabetically and change the code ID from sts_trtugr_q to t2020_10. As you can see in the following screenshots, the code at the beginning of the REST-ID stays unchanged, as does the content of the "Select Dimensions" box.

Note that t2020_10 returns a server 404 error, which seems to be due to an error at Eurostat that affects all t2020 indicators at the moment. The behaviour will not be reproducible once the server is reachable again.

Graying out the Dimensions and Code selection boxes if the server is unreachable would be a great feature.

[Screenshot: screen shot 2015-03-24 at 21 29 18]
[Screenshot: screen shot 2015-03-24 at 21 30 27]

Edit | Copy should always grab REST ID

I could be wrong here, but I think the right behaviour for the Edit | Copy menu would be to always copy the current REST-ID to the clipboard. Right now it returns the ID only if it has been selected beforehand.

sdmxHelp() does not start on OS X

sdmxHelp() does not work under OS X. When I run the command I get the following error message:

> sdmxHelp()
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl,  : 
  java.awt.HeadlessException

If I understand it correctly, R tries to run RJSDMX/java/SDMX.jar. I found the .jar but couldn't run it from the terminal:

> RJSDMX/java$ java -jar SDMX.jar
no main manifest attribute, in SDMX.jar

I am stabbing in the dark here, so it is quite possible that I am completely on the wrong path. Can you provide any hints on how to get sdmxHelp() to work?

With best regards
Albrecht

Build Failed: Unexpected element "{}html" {antlib:org.apache.tools.ant}html

This seems to be a pretty obscure issue. I've failed to resolve it through Google and following similar instances of the issue for others. But any idea why I might be getting the below error message when trying to build the Java library?

c:\E\New\MATLAB\work\SDMX>ant SDMX
Buildfile: c:\E\New\MATLAB\work\SDMX\build.xml

BUILD FAILED
c:\E\New\MATLAB\work\SDMX\build.xml:6: Unexpected element "{}html" {antlib:org.apache.tools.ant}html

Total time: 0 seconds

getSDMX bug in request having one empty series

There is an error in this:
tts0 <- getSDMX('IMF', 'PGI.CA....')
[1] "Number of observations and time slots equal to zero,
or not matching: 0 0"
Error in FUN(X[[534L]], ...) : object 'tts' not found

This is happening in the call
result = lapply(X = rList, FUN = convertSingleTS)
rList[[534L]] seems to have no data so (I'm guessing) in
convertSingleTS it prints the message "Number of observations..."
and does not assign tts, but then tries to return tts.

I think you may want an empty series here, so everything else retrieved is not discarded. If not, this really should be an error message not a print statement.

possibility of 'getSDMXDelta' function for usage in an ETL system

@amattioc As per your suggestion in #25, I am proposing a dedicated function for deriving relevant metadata and data from an SDMX query that specifies the 'updatedAfter' and 'includeHistory' parameters. In #25 I highlighted the absence of any contextual metadata in the result of a query that includes these parameters. The use case is mainly a system that looks for the latest changes applied to a dataset since a previous query (whose date is set as the updatedAfter base date).

There is some relevant discussion on the implementation of these here: sdmx-twg/sdmx-rest#17

In conjunction with the data values, a user might specify what other meta-data pertaining to those values should be returned. Perhaps as a default, it would include the following:
'action', 'validFromDate', 'OBS_STATUS', 'OBS_CONF' and 'OBS_COM'.

Perhaps others have additional ideas.

Thanks. Colum

sdmxHelp() not starting under Windows

sdmxHelp() no longer starts under Windows.

> sdmxHelp()
Warning message:
running command 'java -classpath C:/Program Files/R/R-3.1.2/library/RJSDMX/java/SDMX.jar    
it.bancaditalia.oss.sdmx.helper.SDMXHelper' had status 127 

I could open last week's version without problems.

Also, how can I start sdmxHelp2() under Windows? Setting up a system command does not work here, as java is not recognized as a command in the shell.

Any hints would be appreciated.

Best regards
Albrecht

query with + or |

It would be nice if a query with + or | would return results in the order of the query specification. (Enhancement request)
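
In the meantime, the ordering can be restored on the caller's side by expanding the key into the series names it stands for and reordering the result by those names. This is a rough sketch only: expand_key() is a hypothetical helper, it assumes every dimension in the key is non-empty and free of wildcards, and it assumes the returned list is named by series key (as in names(tts) elsewhere in these issues). The result list here is faked so the example runs offline.

```r
# Hypothetical helper: expand a key containing '+' or '|' into the individual
# series keys, in query order (the first dimension varies slowest).
expand_key <- function(key) {
  dims <- strsplit(key, ".", fixed = TRUE)[[1]]
  parts <- strsplit(dims, "[+|]")
  # expand.grid() varies its first column fastest, so feed it the
  # dimensions reversed and restore the original column order afterwards.
  grid <- expand.grid(rev(parts), stringsAsFactors = FALSE)
  cols <- unname(as.list(grid[rev(seq_along(parts))]))
  do.call(paste, c(cols, sep = "."))
}

# Faked result, deliberately in a different order than the query below.
result <- list("CPI.1.50.Q" = 1, "CPI.1.60.Q" = 2)

wanted <- expand_key("CPI.1.60+50.Q")
ordered <- result[intersect(wanted, names(result))]
print(names(ordered))
```

Using intersect() rather than direct indexing means that series which the provider did not return are silently skipped instead of producing NULL entries.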

adding new Provider

I am trying to add FAO SDMX as a new provider using addProvider(). However, I am unsure what an "endpoint" is in this context. My first attempt was:

# no Error received
addProvider("FAO","http://data.fao.org/sdmx")

# lists FAO as new source
getProviders()

# throws error
getFlows("FAO")

I suppose the URL "http://data.fao.org/sdmx" is wrong. I tried to look at the other sources, but I understand that they are somewhere in the Java code (?). Is there some resource that explains in more detail what I have to look out for when entering a new "endpoint"? Any hints would be appreciated.

ILO takes an extra year off when end is specified for annual data

tts <- getSDMX("ILO", "DF_YI_ALL_EMP_TEMP_SEX_AGE_NB/YI.MEX.A.463.EMP_TEMP_NB.SEX_F.AGE_10YRBANDS_TOTAL", start="1995", end="2012")
2012 specified but:
end(tts[[1]])
[1] 2011

This does not happen on OECD:
tts <- getSDMX('OECD', '7HA_A_Q.CAN.AF411LI.ST.C.A', start="2001", end="2009")
end(tts[[1]])
[1] 2009

uncertain if INEGI works

tts <- getSDMX("INEGI", 'DF_COMTRADE.Q.MX.TOTAL.CAN...USD.Z.CAN.*')
returns an empty list, as does everything else I have tried with fields replaced by *, up to the next query, which is very slow and fails with:

tts <- getSDMX("INEGI", 'DF_COMTRADE..........')
Dec 04, 2014 8:12:40 PM it.bankitalia.reri.sia.sdmx.client.RestSdmxClient runQuery
SEVERE: Exception caught calling provider INEGI
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
it.bankitalia.reri.sia.util.SdmxException: Exception. Class: java.net.SocketException .Message: Connection reset

I have not managed to get any data from INEGI, do you have an example that works?
