
Comments (16)

matt-long commented on July 20, 2024

I think our focus should remain on an end-to-end workflow and usability in the near term, but we should keep performance through parallelism on the radar.

We could consider prototyping an MPI implementation as a standalone script, analogous to that shown here.

@andersy005, you are correct. The weights files are sparse matrices and are handled well by scipy.sparse.
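For illustration, here is a minimal sketch of how such weights map onto scipy.sparse. It assumes the ESMF-style layout, in which the weights are stored as 1-based `row`/`col` index arrays plus a value array `S`. The tiny arrays below are made up and simply stand in for what would be read from a weights file.

```python
import numpy as np
import scipy.sparse as sps

# Made-up stand-ins for the arrays in a weights file (assumed ESMF-style
# layout): 1-based destination index, 1-based source index, weight value.
row = np.array([1, 1, 2, 3, 3])
col = np.array([1, 2, 3, 4, 5])
S = np.array([0.5, 0.5, 1.0, 0.25, 0.75])
n_src, n_dst = 5, 3  # number of source and destination grid cells

# Build the (n_dst, n_src) sparse matrix; subtract 1 for 0-based indexing.
weights = sps.coo_matrix((S, (row - 1, col - 1)), shape=(n_dst, n_src)).tocsr()

# Regridding a flattened source field is a sparse matrix-vector product.
src_field = np.array([10.0, 20.0, 30.0, 40.0, 80.0])
dst_field = weights @ src_field
print(dst_field)  # [15. 30. 70.]
```

Only the non-zero weights are stored, which is why a matrix that would be enormous in dense form stays manageable.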

from esmlab-regrid.

andersy005 commented on July 20, 2024

The esmlab.regrid function failed.

I am curious what kind of error it was (MemoryError, etc.), or was it just too slow?

matt-long commented on July 20, 2024

Pretty sure it was a memory error, but I don't recall the specific message. I had to use several nodes to get over the memory hurdle with MPI.

andersy005 commented on July 20, 2024

Per xesmf documentation: https://xesmf.readthedocs.io/en/latest/limitations.html

xESMF currently only runs in serial. Parallel options are being investigated.

JiaweiZhuang/xESMF#3

I just found out about it.

matt-long commented on July 20, 2024

We are currently using xESMF, but don't have to. ESMPy does support MPI:
http://www.earthsystemmodeling.org/esmf_releases/last_built/esmpy_doc/html/examples.html?highlight=mpi

though it's not clear how to integrate with dask.

andersy005 commented on July 20, 2024

> though it's not clear how to integrate with dask.

Introducing MPI and ESMPy's complicated interface :) and then integrating these with Xarray and Dask would definitely be a conundrum.

I am curious: what is the highest priority for esmlab-regrid? Is it usability? Performance? Do we want users to be able to perform regridding with one line of code? If usability is not the highest priority, it would be worth looking into the MPI and ESMPy functionality.

andersy005 commented on July 20, 2024

It looks like the Dask folks are looking into this kind of workflow: "Running Dask and MPI programs together: an experiment".

andersy005 commented on July 20, 2024

@matt-long, correct me if I'm wrong: this kind of parallelism is only needed when generating the weights. Once you have the weights, you don't need the ESMPy/MPI machinery anymore. Applying the weights, which is a matrix multiplication, could be done without this heavy machinery, using SciPy/Dask/Xarray, right?
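As a rough sketch of what applying the weights could look like with only NumPy and SciPy: the function name `apply_weights` and the toy weight matrix below are hypothetical and are not part of esmlab-regrid or xESMF.

```python
import numpy as np
import scipy.sparse as sps


def apply_weights(weights, data, shape_out):
    """Regrid `data` (..., ny_in, nx_in) with a (n_out, n_in) sparse matrix."""
    extra_dims = data.shape[:-2]
    n_in = data.shape[-2] * data.shape[-1]
    # Flatten the horizontal dims so each row is one 2-D field.
    flat = data.reshape(-1, n_in)
    # (n_out, n_in) @ (n_in, n_fields) -> (n_out, n_fields)
    out = (weights @ flat.T).T
    return out.reshape(*extra_dims, *shape_out)


# Toy weights: average pairs of neighbouring columns, 2x4 grid -> 2x2 grid.
n_in, n_out = 8, 4
rows = np.repeat(np.arange(n_out), 2)
cols = np.arange(n_in)
vals = np.full(n_in, 0.5)
weights = sps.csr_matrix((vals, (rows, cols)), shape=(n_out, n_in))

data = np.arange(3 * 2 * 4, dtype=float).reshape(3, 2, 4)  # (time, lat, lon)
regridded = apply_weights(weights, data, shape_out=(2, 2))
print(regridded.shape)  # (3, 2, 2)
```

Nothing here touches ESMF/ESMPy or MPI; the only requirement is that the weight matrix already exists.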

andersy005 commented on July 20, 2024

@matt-long, was the work you were doing to generate WEIGHT_FILE=/glade/work/mclong/esmlab-regrid/etopo1_to_POP_tx0.1v3_conservative.nc connected to the content of this notebook https://gist.github.com/matt-long/87630e97dc787ffc27b33e944dcd1473 ?

matt-long commented on July 20, 2024

Yes

andersy005 commented on July 20, 2024

Since you are not using xesmf and ESMF/ESMPy, and the code deals with raw NumPy, I was thinking of exploring some optimization with numba and dask. Do you see any value in this or am I missing anything before I end up going down a rabbit hole :) ?

matt-long commented on July 20, 2024

By "connected" I mean that that code was used in the same project. It does not compute the weight files, but rather only the grid file. It's fast enough as is, I'd say. Not a high priority for optimization.

andersy005 commented on July 20, 2024

> By "connected" I mean that that code was used in the same project. It does not compute the weight files, but rather only the grid file.

Good point. Does this mean that the failing component is the _gen_weights method?

def _gen_weights(self, overwrite_existing):
    """ Generate regridding weights """
    grid_file_dir = esmlab.config.get('regrid.gridfile-directory')
    weights_dir = f'{grid_file_dir}/weights'

matt-long commented on July 20, 2024

Yes.

andersy005 commented on July 20, 2024

Thank you for the clarification! Speaking of high priority, is there anything on your plate I can help with? :)

JiaweiZhuang commented on July 20, 2024

Not sure if related to JiaweiZhuang/xESMF#29. Parallel weight generation is very hard (if possible at all) to rewrite in a non-MPI way. But after the weights are generated, applying them to data using dask is much easier.

My plan is to clearly separate between "weight generation" and "weight application" phases:

  • The latter phase doesn't depend on ESMF/ESMPy (you don't even need to have it installed), and it is easy to rewrite with pure dask/xarray/scipy/numba/cython or whatever modern Python libraries. Chunking in lev/time can be trivially implemented (xESMF v0.2 already supports it); chunking in the horizontal (for extremely large grids) still seems doable, as it is just a parallel sparse matrix multiplication problem.
  • Parallelizing the first phase probably has to rely on ESMPy-MPI, as no one would want to reinvent the wheel that ESMF already has (and that has been developed for decades). Although configuring MPI is much more annoying than configuring Dask, this laborious task only needs to be done once. The weights can be reused and even shared between platforms and users. Public clouds actually have decent support for MPI (think about all the cloud-HPC business), so in principle everyone should be able to generate giant regridding weight files, even without access to NCAR supercomputers.
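To make the two kinds of chunking concrete, here is a small sketch under simplifying assumptions: the weight matrix is a made-up pair-averaging operator rather than real regridding weights, and plain Python loops stand in for whatever scheduler (for example dask tasks) would actually run the chunks. It only shows that each chunk can be computed independently and that the pieces reassemble to the same result.

```python
import numpy as np
import scipy.sparse as sps

n_time, n_in, n_out = 6, 40, 20

# Made-up weights: each destination cell averages two source cells.
rows = np.repeat(np.arange(n_out), 2)
cols = np.arange(n_in)
vals = np.full(n_in, 0.5)
weights = sps.csr_matrix((vals, (rows, cols)), shape=(n_out, n_in))

rng = np.random.default_rng(0)
data = rng.random((n_time, n_in))  # (time, flattened source grid)

# Reference: apply the weights to everything at once.
full = (weights @ data.T).T

# Chunking in time: each block of time steps is regridded independently.
time_chunks = np.array_split(data, 3, axis=0)
by_time = np.concatenate([(weights @ c.T).T for c in time_chunks], axis=0)

# Chunking in the horizontal: split the weight matrix into row blocks, so
# each block produces a disjoint slice of the destination grid.
row_blocks = [weights[i:i + 5] for i in range(0, n_out, 5)]
by_space = np.concatenate([(w @ data.T).T for w in row_blocks], axis=1)

print(np.allclose(by_time, full), np.allclose(by_space, full))  # True True
```

In the horizontal case every row block still reads the full source field here; a real implementation for very large grids would presumably also restrict each block to the source cells it actually references.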

Such separation will be much clearer after resolving JiaweiZhuang/xESMF#11. My plan is to have a "mini-xesmf" installation that doesn't depend on ESMPy -- it will just construct a complete regridder from existing weight files, generated by an ESMPy program running elsewhere (potentially a huge MPI run, potentially with an xesmf wrapper for better usability).
