
Comments (7)

andersy005 avatar andersy005 commented on August 17, 2024

Changing aspects of ds in compute_ann_mean(ds) is unexpected behavior (to me).

We probably shouldn't alter the dataset time index without letting the user know.

As a temporary fix, after calling compute_ann_mean you can try the following:

import esmlab

ds = esmlab.utils.time.uncompute_time_var(ds, 'time')

I will patch this issue some time today.

from esmlab.

matt-long avatar matt-long commented on August 17, 2024

@klindsay28 thanks for raising this issue. I think we need to decide what the desired behavior is. At present, esmlab is geared toward operating on datasets where time has not been decoded, but returning datasets with time decoded.

Time must be decoded internally to enable application of the groupby methods that are the core of the data-manipulations.

I think I agree that a more intuitive behavior would be to restore the time axis to its un-decoded state by default.


klindsay28 avatar klindsay28 commented on August 17, 2024

Can the internal decoding of the time variable be done on a copy of the function arguments, instead of on the function arguments themselves? This would avoid the need to restore the time axis, because it would never have been changed in the first place.

Following up on the suggestion of @andersy005, I took a look at the related function esmlab.utils.time.compute_time_var. I see that in addition to returning an xr.Dataset with an encoded time coordinate, it also encodes the time coordinate in dset, its xr.Dataset argument. I would not have guessed this from the function name. Maybe I would have if the function were named encode_time_var.
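
The copy-based idea above can be illustrated with a small sketch. This is not esmlab code: decode_time_mutating and decode_time_on_copy are hypothetical helpers, and the reference date 2000-01-01 and the toy dataset are assumptions made only for the illustration. It shows that assigning to dset["time"] changes the caller's dataset, while doing the same on a shallow copy leaves it untouched.

```python
import numpy as np
import xarray as xr


def make_dataset():
    # Toy dataset with an un-decoded time axis (days since an assumed reference date).
    return xr.Dataset(
        {"tas": ("time", [1.0, 2.0, 3.0])},
        coords={"time": ("time", [15.0, 45.0, 75.0])},
    )


def decode_time_mutating(dset):
    """Hypothetical helper: overwrites `time` on its argument (side effect)."""
    days = dset["time"].values.astype("int64") * np.timedelta64(1, "D")
    decoded = (np.datetime64("2000-01-01") + days).astype("datetime64[ns]")
    dset["time"] = ("time", decoded)  # in-place assignment on the caller's object
    return dset


def decode_time_on_copy(dset):
    """Hypothetical helper: same decoding, applied to a shallow copy."""
    work = dset.copy()  # new Dataset object; the underlying arrays are shared
    return decode_time_mutating(work)


ds_a = make_dataset()
decode_time_mutating(ds_a)
# ds_a["time"] now holds datetime64 values: the caller's dataset was changed.

ds_b = make_dataset()
out = decode_time_on_copy(ds_b)
# ds_b["time"] is still float; only `out` carries the decoded axis.
```

A shallow copy is enough here because the time variable is replaced rather than modified element by element, so the large data arrays are not duplicated.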


matt-long avatar matt-long commented on August 17, 2024

@klindsay28, first, regarding the name of the function, I see your point, though we are doing more than encoding time: the function does actually compute time as the mid-point of the time_bounds.

We are using the xarray.core.groupby functionality for much of the computation. Your suggestion of not modifying the original dataset is a good one, but as far as I know this won't work with groupby. We would need to compute the groupby objects using the modified time axis and then subsequently apply them, but I don't think that's supported explicitly.

One option would be to keep the original time_coord_var on the dataset:

  1. rather than replacing "time", we would use xarray's set_coords and reset_coords methods to make our "new" time object the time coordinate (and the old time coordinate a variable defined on that coordinate);
  2. then apply the groupby using this coordinate;
  3. then set/reset to restore the old time coordinate.

I have mixed feelings about this: it's nice to have the dataset returned with all the functionality of time-handling enabled (i.e., with time decoded). However, I recognize that the current behavior is somewhat mysterious and counterintuitive. I think carrying the old time coord along for the computation is a good idea regardless, so maybe we can have a flag (or better yet a config option) for how to return time.


klindsay28 avatar klindsay28 commented on August 17, 2024

It looks like the calls to groupby in compute_ann_mean use the argument group=time.year. The docs for groupby at xarray.Dataset.groupby state that group 'must be the name of a variable contained in this dataset'. There are no stated restrictions that it needs to be a coordinate.

I wonder if compute_time_var could add to dset a new variable computed_time_mid that is encoded, and this could be used in the groupby calls.
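
A minimal sketch of this idea, not the esmlab implementation: add_time_mid is a hypothetical helper, and the bounds variable name time_bound, its second dimension d2, the reference date 2000-01-01, and the toy monthly data are all assumptions. Here computed_time_mid is stored as datetime values in a non-coordinate variable, and its year is passed to groupby as a DataArray, so the original time coordinate is never replaced.

```python
import numpy as np
import xarray as xr

# Toy monthly data: 24 months, time stored as float "days since 2000-01-01",
# with each time value at the END of its interval.
edges = np.arange("2000-01", "2002-02", dtype="datetime64[M]").astype("datetime64[D]")
edge_days = (edges - np.datetime64("2000-01-01")) / np.timedelta64(1, "D")
bounds = np.stack([edge_days[:-1], edge_days[1:]], axis=1)  # shape (24, 2)

ds = xr.Dataset(
    {
        "tas": ("time", np.arange(24.0)),
        "time_bound": (("time", "d2"), bounds),
    },
    coords={"time": ("time", bounds[:, 1])},
)


def add_time_mid(dset, bounds_name="time_bound", bounds_dim="d2"):
    """Hypothetical helper: return a NEW dataset with a datetime mid-point variable."""
    mid_days = dset[bounds_name].mean(bounds_dim).values
    seconds = np.round(mid_days * 86400).astype("int64") * np.timedelta64(1, "s")
    decoded = (np.datetime64("2000-01-01", "s") + seconds).astype("datetime64[ns]")
    # assign() returns a new Dataset; `dset` itself is left as it was.
    return dset.assign(computed_time_mid=("time", decoded))


work = add_time_mid(ds)

# Group by the year of the mid-point variable instead of the time coordinate.
ann = work["tas"].groupby(work["computed_time_mid"].dt.year).mean("time")
```

In this toy case the result has a year dimension with one unweighted mean per calendar year, and ds keeps its float time axis and gains no new variables.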


matt-long avatar matt-long commented on August 17, 2024

Good point!


andersy005 avatar andersy005 commented on August 17, 2024

"I wonder if compute_time_var could add to dset a new variable computed_time_mid that is encoded, and this could be used in the groupby calls."

This sounds like a great approach and could come in handy for other purposes too.

