
cdo-api-py's Introduction


cdo-api-py

Python interface to the NOAA Climate Data Online (CDO) API, which is described in full detail here. Built to allow quick and easy queries for weather data, returned as pandas DataFrame objects.

NOTICE

In communication with the team maintaining the API this package points to, I've learned that this service is scheduled for deprecation, though no announcement has been made. The new services are located here; if no suitable Python interface is available for them, I may write one myself, time permitting.

Installation

pip install cdo-api-py

or, for Python 3:

pip3 install cdo-api-py

Example Use

To start, you'll need a token, which you can request here.

Import a few libraries and instantiate a client. default_units and default_limit are optional keyword arguments.

from cdo_api_py import Client
import pandas as pd
from datetime import datetime
from pprint import pprint
token = "my_token_here"     # be sure not to share your token publicly
my_client = Client(token, default_units=None, default_limit=1000)

# See issue #3: explicit use of the `units` argument currently returns a 500 error; this is a bug in the API.
# Once the bug is fixed, 'metric' or 'standard' should work as below:
# my_client = Client(token, default_units='metric', default_limit=1000)

Once a client has been initialized, we can define a few variables to outline what we want. Since this repo is just a Python client for the CDO API, keyword arguments are passed directly to the API and aren't detailed here, so you may need to browse the options available for your dataset of choice.

The example we will use is the very common GHCN-Daily (GHCND) weather dataset. We have found the north, south, east, and west lat/lon coordinates that describe a bounding box around the general Washington, DC area. Next we define the dates we're interested in (optional) and the dataset id. We also only want specific values from the dataset, so let's save those in a list as datatypeid (optional).

extent = {
    "north": 39.14,
    "south": 38.68,
    "east": -76.65,
    "west": -77.35,
}

startdate = datetime(2016, 12, 1)
enddate = datetime(2016, 12, 31)

datasetid='GHCND'
datatypeid=['TMIN', 'TMAX', 'PRCP']

Now we pass all these into a single function call to our client my_client to find stations of interest. We can use return_dataframe=True to automatically assemble the information into a dataframe.

stations = my_client.find_stations(
    datasetid=datasetid,
    extent=extent,
    startdate=startdate,
    enddate=enddate,
    datatypeid=datatypeid,
    return_dataframe=True)
pprint(stations)
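
Because extra keyword arguments are handed directly to the CDO API (as noted above), you can also mix in any parameter the stations endpoint accepts. Below is a minimal sketch: sortfield is a standard CDO API option, but whether the client forwards it unchanged is an assumption, so verify against your results.

# hypothetical: the same search as above, but asking the API to sort stations by name
# (assumes extra kwargs like `sortfield` are forwarded to the CDO stations endpoint)
stations_sorted = my_client.find_stations(
    datasetid=datasetid,
    extent=extent,
    startdate=startdate,
    enddate=enddate,
    datatypeid=datatypeid,
    sortfield='name',
    return_dataframe=True)
pprint(stations_sorted)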

Now that we have a list of stations with data useful to us, we can iterate through them and pass each station's id as the stationid argument to the get_data_by_station method.

for rowid, station in stations.iterrows():  # remember this is a pandas dataframe!
    station_data = my_client.get_data_by_station(
        datasetid=datasetid,
        stationid=station['id'],
        startdate=startdate,
        enddate=enddate,
        return_dataframe=True,
        include_station_meta=True   # flatten station metadata with ghcnd readings
    )
    pprint(station_data)

We can modify this slightly to concatenate all the small dataframes into one big dataframe and save it as a CSV.

big_df = pd.DataFrame()
for rowid, station in stations.iterrows():  # remember this is a pandas dataframe!
    station_data = my_client.get_data_by_station(
        datasetid=datasetid,
        stationid=station['id'],
        startdate=startdate,
        enddate=enddate,
        return_dataframe=True,
        include_station_meta=True   # flatten station metadata with ghcnd readings
    )
    pprint(station_data)
    big_df = pd.concat([big_df, station_data])

print(big_df)
big_df = big_df.sort_values(by='date').reset_index()
big_df.to_csv('dc_ghcnd_example_output.csv')
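
The concatenated dataframe is in long format, with one row per station, date, and datatype. If you'd rather have one column per datatype (TMIN, TMAX, PRCP), a pivot works well. A minimal sketch, assuming the returned dataframe has 'station', 'date', 'datatype', and 'value' columns; inspect big_df.columns to confirm before running.

# reshape long-format readings into one column per datatype
# column names ('station', 'date', 'datatype', 'value') are assumptions about the response
wide_df = big_df.pivot_table(index=['station', 'date'],
                             columns='datatype',
                             values='value')
print(wide_df.head())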

See all of the example code here: DC weather data example.

It may take a bit of manual searching to familiarize yourself with the NOAA CDO offerings, but once you figure out the arguments you'd like to use, this client should make it quite easy to automate weather data retrievals. The server imposes many requirements and limits on requests; this client automatically determines when a request must be split into multiple smaller pieces, creates and sends them, and stitches the results back together into a single coherent response without any additional effort.
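
For intuition, that splitting behavior is roughly equivalent to chunking a long date range into smaller windows and merging the responses. The sketch below is illustrative only, not the library's actual implementation, and you do not need to do this yourself:

import pandas as pd
from datetime import timedelta

def fetch_in_chunks(client, stationid, startdate, enddate, chunk_days=30, **kwargs):
    # Illustrative only: break a long date range into smaller windows, request each
    # window separately, and stitch the results back into one dataframe. The client
    # already handles this internally; this just shows the idea.
    frames = []
    chunk_start = startdate
    while chunk_start <= enddate:
        chunk_end = min(chunk_start + timedelta(days=chunk_days - 1), enddate)
        frames.append(client.get_data_by_station(
            datasetid='GHCND',
            stationid=stationid,
            startdate=chunk_start,
            enddate=chunk_end,
            return_dataframe=True,
            **kwargs))
        chunk_start = chunk_end + timedelta(days=1)
    return pd.concat(frames, ignore_index=True)

# e.g. fetch_in_chunks(my_client, stations.iloc[0]['id'], startdate, enddate)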


You can explore the endpoints available, either at the CDO documentation site or quickly with

pprint(my_client.list_endpoints())

# returned at time of writing
{'data': 'A datum is an observed value along with any ancillary attributes at '
         'a specific place and time.',
 'datacategories': 'A data category is a general type of data used to group '
                   'similar data types.',
 'datasets': 'A dataset is the primary grouping for data at NCDC',
 'datatypes': 'A data type is a specific type of data that is often unique to '
              'a dataset.',
 'locationcategories': 'A location category is a grouping of similar '
                       'locations.',
 'locations': 'A location is a geopolitical entity.',
 'stations': 'A station is a any weather observing platform where data is '
             'recorded.'}

At the time of writing, there are 11 available datasets: ['GHCND', 'GSOM', 'GSOY', 'NEXRAD2', 'NEXRAD3', 'NORMAL_ANN', 'NORMAL_DLY', 'NORMAL_HLY', 'NORMAL_MLY', 'PRECIP_15', 'PRECIP_HLY']. View the full details with:

pprint(my_client.list_datasets())

There are more than 1000 datatypes, but you can see them all with

pprint(my_client.list_datatypes())

TODO:

  • Another example or two for non-GHCND datasets
  • Build a gh-pages branch with sphinx

cdo-api-py's People

Contributors

jwely, remram44


cdo-api-py's Issues

The library does not support Python 2

From the readme, I thought the library supported both Python 2 and 3, but after installing it under Python 2.7 I found that 'from cdo_api_py import Client' produced a number of Python-3-only syntax errors, e.g. 'yield from' and 'join([..., *args])'.

It would be great if you could modify the code to support Python 2, or otherwise update the readme to inform potential users of this limitation; it would save people a lot of time.

Cannot access January/February data

Using the suggested my_client.get_data_by_station() method, data is only pulled for Feb 28-Dec 31 of any given calendar year, even if the entire year was specified with startdate and enddate.

Using an alternative method with requests, I am able to access all of the data (see below), but this has more limited capabilities.

import requests
resp = requests.get(f"https://www.ncei.noaa.gov/access/services/data/v1?dataset=daily-summaries&dataTypes={datatypeid}&stations={stationid}&startDate={start_date}&endDate={end_date}&units=standard", headers={"Token": token})
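
A more complete sketch of that workaround, with placeholder values filled in for the December 2016 example above. The comma-separated dataTypes/stations formats, the station id shown, and parsing the response as CSV are all assumptions about the NCEI service, not behavior confirmed here:

import io
import requests
import pandas as pd

token = "my_token_here"
url = (
    "https://www.ncei.noaa.gov/access/services/data/v1"
    "?dataset=daily-summaries"
    "&dataTypes=TMIN,TMAX,PRCP"     # comma-separated datatype ids (assumed format)
    "&stations=USW00013743"         # one DC-area GHCND station, shown for illustration
    "&startDate=2016-12-01"
    "&endDate=2016-12-31"
    "&units=standard"
)
resp = requests.get(url, headers={"Token": token})
df = pd.read_csv(io.StringIO(resp.text))  # assumes the service returns CSV by default
print(df.head())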
