Climatology functions (esmlab issue, open)

ncar commented on July 1, 2024
Climatology functions

from esmlab.

Comments (4)

ahuang11 commented on July 1, 2024

I have a to_seasons function which converts integer months into seasonal acronyms:

import numpy as np
from operator import itemgetter
MONTHS_STR = 'JFMAMJJASOND' * 3

def to_seasons(months, interval=3, rolling=True):
    """
    Converts months as integers into groups of seasonal abbreviations.
    
    Args:
        months (arr): months represented as integers
        interval (int): the number of months to aggregate together
        rolling (bool): whether to use a sliding window of seasons

    Examples:
        >>> print(to_seasons([1, 2, 3, 4]))
        ['JFM' 'FMA' 'MAM' 'AMJ']

        >>> print(to_seasons([1, 2, 3, 4], interval=2))
        ['JF' 'FM' 'MA' 'AM']

        >>> print(to_seasons([1, 2, 3, 4], rolling=False))
        ['DJF' 'DJF' 'MAM' 'MAM']

        >>> print(to_seasons([1, 2, 3, 4], interval=2, rolling=False))
        ['DJ' 'FM' 'FM' 'AM']
    """
    month_arr = np.array(months).astype(int)
    if rolling:
        season_mapping = {
            m + 1: MONTHS_STR[m:m+interval] for m in range(12)}
        return np.array(itemgetter(*month_arr)(season_mapping))
    else:
        season_arr = np.array(
            [MONTHS_STR[m:m+interval] for m in range(
                11, 23, interval)])
        # selecting 11 to 23 because I want to start with DJF
        return season_arr[(month_arr // interval) % (12 // interval)]

months = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

print('### rolling=True')
print(months)
to_seasons(months, rolling=True)

print('### rolling=False')
print(months)
to_seasons(months, rolling=False)
### rolling=True
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
array(['JFM', 'FMA', 'MAM', 'AMJ', 'MJJ', 'JJA', 'JAS', 'ASO', 'SON',
       'OND', 'NDJ', 'DJF'], dtype='<U3')
### rolling=False
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
array(['DJF', 'DJF', 'MAM', 'MAM', 'MAM', 'JJA', 'JJA', 'JJA', 'SON',
       'SON', 'SON', 'DJF'], dtype='<U3')

These can be used to create rolling/grouped seasonal mean functions.

import xarray as xr

def month_to_season_rolling(ds, interval=3):
    ds_roll_sson = ds.rolling(
        time=interval, center=True).mean().dropna('time')
    ds_roll_sson.coords['month'] = ds_roll_sson['time.month']
    ds_roll_sson['season'] = ('time', to_seasons(
        ds_roll_sson['month'], interval=interval))
    return ds_roll_sson

def month_to_season_group(ds, interval=3):
    ds_group_sson = ds.copy()
    ds_group_sson.coords['month'] = ds_group_sson['time.month']
    ds_group_sson['season'] = ('time', to_seasons(
        ds_group_sson['month'], interval=interval, rolling=False))
    # `...` reduces over all dimensions (older xarray spelled this xr.ALL_DIMS)
    ds_group_sson = ds_group_sson.groupby('season').mean(...)
    return ds_group_sson

ds = xr.tutorial.open_dataset('air_temperature').resample(time='1MS').mean()
print('### rolling')
month_to_season_rolling(ds)
print('### group')
month_to_season_group(ds)
### rolling
<xarray.Dataset>
Dimensions:  (lat: 25, lon: 53, time: 22)
Coordinates:
  * time     (time) datetime64[ns] 2013-02-01 2013-03-01 ... 2014-11-01
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
    month    (time) int64 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11
Data variables:
    air      (time, lat, lon) float32 244.49214 244.36497 ... 298.7022 298.74713
    season   (time) <U3 'FMA' 'MAM' 'AMJ' 'MJJ' ... 'ASO' 'SON' 'OND' 'NDJ'
### group
<xarray.Dataset>
Dimensions:  (season: 4)
Coordinates:
  * season   (season) object 'DJF' 'JJA' 'MAM' 'SON'
Data variables:
    air      (season) float32 273.63284 289.18707 278.98642 283.02075

Does this meet the following criteria? I suppose it's missing a how keyword which accepts left, center, and right?

month_to_season() : Computes a user-specified three-month seasonal mean (DJF, JFM, FMA, MAM, AMJ, MJJ, JJA, JAS, ASO, SON, OND, NDJ).
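
Whichever labelling is chosen, the label should name the months that actually went into each mean. Below is a minimal NumPy-only sketch of that pairing, assuming a plain 1-D monthly series. The names rolling_seasonal_mean and how are hypothetical and not part of esmlab or xarray, and 'center' is assumed to mean position interval // 2 in the window.

```python
import numpy as np

MONTH_INITIALS = 'JFMAMJJASOND'


def rolling_seasonal_mean(values, months, interval=3, how='center'):
    """Hypothetical helper: moving average plus a label naming its window.

    how says where the anchor month sits inside the window:
    'left' = first month, 'center' = middle month, 'right' = last month.
    """
    values = np.asarray(values, dtype=float)
    months = np.asarray(months, dtype=int)
    # Position of the anchor month inside each window (assumed convention).
    position = {'left': 0, 'center': interval // 2, 'right': interval - 1}[how]
    # mode='valid' keeps only windows that lie fully inside the series.
    means = np.convolve(values, np.ones(interval) / interval, mode='valid')
    n_windows = len(means)
    anchor_months = months[position:position + n_windows]
    # Each label is spelled from the first month of its window.
    labels = [
        ''.join(MONTH_INITIALS[(m - 1 + k) % 12] for k in range(interval))
        for m in months[:n_windows]
    ]
    return anchor_months, labels, means
```

With how='center' the first window of a January to December series is anchored on February and labelled JFM, so the label and the averaged months agree.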

andersy005 commented on July 1, 2024

@ahuang11, thank you for putting this together!

Does this meet the following criteria?

This looks great to me. It would be valuable to have this in esmlab.
Would you be interested in putting together a pull request to esmlab to do this?

@matt-long, any thoughts?

ahuang11 commented on July 1, 2024

Sure! However, I think I should figure out how to implement the how keyword first?

e.g.

how = 'left'
1 -> JFM

how = 'center'
1 -> DJF

how = 'right'
1 -> NDJ

Currently, it's kind of inconsistent with rolling=False and rolling=True because

rolling=False is center: 1 -> DJF

while

rolling=True is left: 1 -> JFM

Or am I overcomplicating this? Maybe I could just drop the month coordinates? Or what's the standard?
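
One way to make the three alignments concrete is to treat how as the number of months by which the window starts before the labelled month. This is only a sketch under that assumption: window_start_shift and rolling_season_label are hypothetical names, and for even intervals 'center' is ambiguous, so interval // 2 is an arbitrary choice here.

```python
MONTH_INITIALS = 'JFMAMJJASOND'


def window_start_shift(interval, how):
    """Hypothetical helper: months between window start and labelled month."""
    shifts = {'left': 0, 'center': interval // 2, 'right': interval - 1}
    try:
        return shifts[how.lower()]
    except KeyError:
        raise ValueError(f"how must be 'left', 'center' or 'right', got {how!r}")


def rolling_season_label(month, interval=3, how='left'):
    """Label the window anchored on month (1-12)."""
    start = (month - 1 - window_start_shift(interval, how)) % 12
    # Modulo 12 handles the wrap from December back to January.
    return ''.join(MONTH_INITIALS[(start + k) % 12] for k in range(interval))


for how in ('left', 'center', 'right'):
    print(how, rolling_season_label(1, how=how))
# left JFM
# center DJF
# right NDJ
```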

ahuang11 commented on July 1, 2024

Okay, I implemented left/center/right, but I dislike how I implemented it. The offsets don't make much intuitive sense; I arrived at them by brute force and experimentation. I'm also wondering whether to_seasons is more of a metpy tool, and whether month_to_season_group_average and month_to_season_rolling_average should then live under esmlab?

import numpy as np
from operator import itemgetter

# repeating the string 3 times makes it cyclic, so slices can wrap
# around the year; this is easier than many if statements for the edges
MONTHS_STR = 'JFMAMJJASOND' * 3
HOW_OFFSET = {'left': -1, 'center': 0, 'right': 1}


def to_seasons(months, interval=3, how='left', rolling=True):
    """
    Converts months as integers into groups of seasonal abbreviations.

    Args:
        months (arr): months represented as integers
        interval (int): the number of months to aggregate together
        how (str): left, center, or right, i.e. JFM, DJF, or NDJ
        rolling (bool): whether to use a sliding window of seasons

    Examples:
        >>> print(to_seasons([1, 2, 3, 4]))
        ['JFM' 'FMA' 'MAM' 'AMJ']

        >>> print(to_seasons([1, 2, 3, 4], how='center'))
        ['DJF' 'JFM' 'FMA' 'MAM']

        >>> print(to_seasons([1, 2, 3, 4], interval=2))
        ['JF' 'FM' 'MA' 'AM']

        >>> print(to_seasons([1, 2, 3, 4], rolling=False))
        ['JFM' 'JFM' 'JFM' 'AMJ']
    """
    month_arr = np.array(months).astype(int)

    if interval == 1:
        # simple mapping; no need to calculate offsets if it's only 1 month
        # need to subtract 1 though because 1 = January = index 0
        return np.array(itemgetter(*month_arr - 1)(MONTHS_STR))
    elif interval == 0:
        raise ValueError('The interval cannot be 0!')
    elif interval > 4 and not rolling:
        # it doesn't make sense to have overlapping seasons
        raise NotImplementedError('Intervals > 4 are unreasonable '
                                  'for rolling=False!')
    elif interval > 7 and rolling:
        raise NotImplementedError('Intervals > 7 are not supported '
                                  'for rolling=True!')

    how_offset = HOW_OFFSET[how.lower()]

    interval_offset = (3 - interval) // 3
    if rolling:
        interval_offset -= (interval - 1) % 2
    elif how != 'left' and interval > 3:
        interval_offset += interval_offset % 2 + how_offset

    if interval > 3:
        interval_offset -= how_offset * (interval // 3)

    total_offset = how_offset - interval_offset

    # selecting 11 to 23 because I want to start with DJF for center
    # if I start at 0, there would be list index out of range error
    ini = 11 - total_offset
    end = 23 - total_offset

    if rolling:
        # + 1 because January -> 1
        season_map = {
            m + 1: MONTHS_STR[m:m + interval]
            for m in range(ini, end)
        }
        # itemgetter is the built-in equivalent to doing pd.Series().map()
        return np.array(itemgetter(*month_arr + ini)(season_map))
    else:
        season_arr = np.array(
            [MONTHS_STR[m:m + interval]
             for m in range(ini, end, interval)]
        )
        index = ((month_arr + how_offset) // interval) % (12 // interval)
        index = np.clip(index, 0, len(season_arr) - 1)
        return season_arr[index]

It's quite fast though: 2920 months in 1.69 ms.

import xarray as xr

ds = xr.tutorial.open_dataset('air_temperature')
%timeit to_seasons(ds['time.month'])
# 1.69 ms ± 18.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

110 ms for 2920 * 100 months.

ds = xr.tutorial.open_dataset('air_temperature')
%timeit to_seasons(ds['time.month'].values.tolist() * 100)
# 110 ms ± 834 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
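
For long inputs, the itemgetter call does one Python-level lookup per month. A possible alternative is a 12-entry lookup table indexed with NumPy fancy indexing. This is an untimed sketch covering only the rolling case; rolling_lookup_table, to_seasons_lookup and the shift argument are hypothetical names, not part of the function above.

```python
import numpy as np

MONTH_INITIALS = 'JFMAMJJASOND'


def rolling_lookup_table(interval=3, shift=0):
    """Hypothetical helper: 12 labels, entry i belongs to month i + 1.

    shift is how many months the window starts before the labelled month
    (for interval=3, 0 gives the 'left' labels and 1 the 'center' labels).
    """
    return np.array([
        ''.join(MONTH_INITIALS[(m - shift + k) % 12] for k in range(interval))
        for m in range(12)
    ])


def to_seasons_lookup(months, interval=3, shift=0):
    table = rolling_lookup_table(interval, shift)
    # Fancy indexing maps every month to its label in one vectorized step.
    return table[np.asarray(months, dtype=int) - 1]
```

The table is built once per call from only 12 strings, so the cost of the lookup itself no longer depends on a per-element Python loop.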

The tests:

DATA = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
HOWS = ['left', 'center', 'right']

expected = ['J', 'F', 'M', 'A', 'M', 'J', 'J', 'A', 'S', 'O', 'N', 'D']
for how in HOWS:
    actual = to_seasons(DATA, rolling=True, interval=1, how=how)
    np.testing.assert_array_equal(actual, expected)

expecteds = {
    'left': ['JFM', 'FMA', 'MAM', 'AMJ', 'MJJ', 'JJA',
             'JAS', 'ASO', 'SON', 'OND', 'NDJ', 'DJF'],
    'center': ['DJF', 'JFM', 'FMA', 'MAM', 'AMJ', 'MJJ',
               'JJA', 'JAS', 'ASO', 'SON', 'OND', 'NDJ'],
    'right': ['NDJ', 'DJF', 'JFM', 'FMA', 'MAM', 'AMJ',
              'MJJ', 'JJA', 'JAS', 'ASO', 'SON', 'OND']
}
for how in HOWS:
    actual = to_seasons(DATA, rolling=True, interval=3, how=how)
    np.testing.assert_array_equal(actual, expecteds[how])

expecteds = {
    'left': ['JFMAM', 'FMAMJ', 'MAMJJ', 'AMJJA', 'MJJAS', 'JJASO',
             'JASON', 'ASOND', 'SONDJ', 'ONDJF', 'NDJFM', 'DJFMA'],
    'center': ['NDJFM', 'DJFMA', 'JFMAM', 'FMAMJ', 'MAMJJ', 'AMJJA',
               'MJJAS', 'JJASO', 'JASON', 'ASOND', 'SONDJ', 'ONDJF'],
    'right': ['SONDJ', 'ONDJF', 'NDJFM', 'DJFMA', 'JFMAM', 'FMAMJ',
              'MAMJJ', 'AMJJA', 'MJJAS', 'JJASO', 'JASON', 'ASOND']
}
for how in HOWS:
    actual = to_seasons(DATA, rolling=True, interval=5, how=how)
    np.testing.assert_array_equal(actual, expecteds[how])

expecteds = {
    'left': [
        'JFMAMJJ', 'FMAMJJA', 'MAMJJAS', 'AMJJASO', 'MJJASON', 'JJASOND',
        'JASONDJ', 'ASONDJF', 'SONDJFM', 'ONDJFMA', 'NDJFMAM', 'DJFMAMJ'
    ],
    'center': [
        'ONDJFMA', 'NDJFMAM', 'DJFMAMJ', 'JFMAMJJ', 'FMAMJJA', 'MAMJJAS',
        'AMJJASO', 'MJJASON', 'JJASOND', 'JASONDJ', 'ASONDJF', 'SONDJFM'
    ],
    'right': [
        'JASONDJ', 'ASONDJF', 'SONDJFM', 'ONDJFMA', 'NDJFMAM', 'DJFMAMJ',
        'JFMAMJJ', 'FMAMJJA', 'MAMJJAS', 'AMJJASO', 'MJJASON', 'JJASOND'
    ]
}
for how in HOWS:
    actual = to_seasons(DATA, rolling=True, interval=7, how=how)
    np.testing.assert_array_equal(actual, expecteds[how])


expected = ['J', 'F', 'M', 'A', 'M', 'J', 'J', 'A', 'S', 'O', 'N', 'D']
for how in HOWS:
    actual = to_seasons(DATA, rolling=False, interval=1, how=how)
    np.testing.assert_array_equal(actual, expected)

expecteds = {
    'left': ['JFM', 'JFM', 'JFM', 'AMJ', 'AMJ', 'AMJ',
             'JAS', 'JAS', 'JAS', 'OND', 'OND', 'OND'],
    'center': ['DJF', 'DJF', 'MAM', 'MAM', 'MAM', 'JJA',
               'JJA', 'JJA', 'SON', 'SON', 'SON', 'DJF'],
    'right': ['NDJ', 'FMA', 'FMA', 'FMA', 'MJJ', 'MJJ',
              'MJJ', 'ASO', 'ASO', 'ASO', 'NDJ', 'NDJ']
}
for how in HOWS:
    actual = to_seasons(DATA, rolling=False, interval=3, how=how)
    np.testing.assert_array_equal(actual, expecteds[how])
    

expecteds = {
    'left': ['JFMA', 'JFMA', 'JFMA', 'JFMA', 'MJJA', 'MJJA',
             'MJJA', 'MJJA', 'SOND', 'SOND', 'SOND', 'SOND'],
    'center': ['DJFM', 'DJFM', 'DJFM', 'AMJJ', 'AMJJ', 'AMJJ',
               'AMJJ', 'ASON', 'ASON', 'ASON', 'ASON', 'DJFM'],
    'right': ['NDJF', 'NDJF', 'MAMJ', 'MAMJ', 'MAMJ', 'MAMJ',
              'JASO', 'JASO', 'JASO', 'JASO', 'NDJF', 'NDJF']
}
for how in HOWS:
    actual = to_seasons(DATA, rolling=False, interval=4, how=how)
    np.testing.assert_array_equal(actual, expecteds[how])
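
The rolling=False expectations above can also be read as tiling the year into equal blocks that begin at a chosen month: in these tests 'left', 'center' and 'right' correspond to blocks starting in January, December and November. Here is a sketch of that reading, assuming a hypothetical grouped_season_labels function with a first_month argument (neither exists in esmlab), restricted to intervals that divide 12:

```python
import numpy as np

MONTH_INITIALS = 'JFMAMJJASOND'


def grouped_season_labels(months, interval=3, first_month=12):
    """Hypothetical helper: label each month with the block that holds it.

    The year is tiled into 12 // interval blocks, the first of which
    starts at first_month (1-12).
    """
    if 12 % interval != 0:
        raise ValueError('interval must divide 12 evenly')
    months = np.asarray(months, dtype=int)
    # Months elapsed since the most recent first_month (0-11).
    elapsed = (months - first_month) % 12
    # 0-based index of the month that opens each month's block.
    block_start = (first_month - 1 + (elapsed // interval) * interval) % 12
    # One label per possible starting month, looked up by block_start.
    labels = np.array([
        ''.join(MONTH_INITIALS[(s + k) % 12] for k in range(interval))
        for s in range(12)
    ])
    return labels[block_start]


print(grouped_season_labels([12, 1, 2, 3]))
# ['DJF' 'DJF' 'DJF' 'MAM']
```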
