Coder Social home page Coder Social logo

Comments (11)

danielkoll avatar danielkoll commented on July 2, 2024 1

Interesting! Yes, the Absorption_Crosssections script is in urgent need of some rewriting and cleaning up. The code is essentially Ray Pierrehumbert's old PyTran script, and could probably be made much more efficient by leveraging modern Python capabilities like numpy. Many of the functions in there can, and should be, decluttered. If you have time to look at it, that'd be a great contribution!

from pyrads.

danielkoll avatar danielkoll commented on July 2, 2024 1

ClimateGraphics should go. These are all pre-numpy/matplotlib legacy libraries.

from pyrads.

danielkoll avatar danielkoll commented on July 2, 2024 1

In terms of rewriting Absorption_Crosssections: the important parts are loadSpectralLines() and computeAbsorption(). loadSpectralLines() reads in all the individual spectral lines from a hitran file, using a big for-loop. computeAbsorption() then loops over all the saved lines and adds up their contribution to the absorption crossection 'absGrid'.

Currently loadSpectralLines() is nice in that it requires minimal memory (reading one spectral line at a time), but the downside is that it appends that data to an ever-growing list. It might be possible to speed up loadSpectralLines() by first loading the entire hitran file into memory, then only keeping the subsection of lines we're actually interested in? One issue with this is that spectral files can become ridiculously large -- e.g., the full H2O line list from HITRAN2016 is only ~23MB, but other line lists can be >>100MB, so loading all lines isn't always an option.

computeAbsorption() then loops over the lines, calculates a lorentz line shape for each one, and adds the contribution of that line to the overall absorption grid. Since this only operates on line data that's already stored in memory, there might be a way of replacing the for-loop with numpy?

Alternatively, loadSpectralLines() and computeAbsorption() could be good targets for numba.

from pyrads.

danielkoll avatar danielkoll commented on July 2, 2024

Here is a minimal script that reproduces the error:

import numpy as np
import pyrads

from pyrads.Absorption_Crosssections_HITRAN2016 import getKappa_HITRAN
from pyrads.Absorption_Crosssections_HITRAN2016_numba import getKappa_HITRAN_numba

n0,n1,dn = 350.,400.,20.
n = np.arange(n0,n1,dn)
T = 300.
p = 1e5
pself = 1e3

kappa0 = getKappa_HITRAN(n,n0,n1,dn,"CO2", broadening="mixed",
press=p,press_self=pself,temp=T,
lineWid=25.,cutoff_option="fixed",remove_plinth=True)
kappa0_numba = getKappa_HITRAN_numba(n,n0,n1,dn,"CO2", broadening="mixed",
press=p,press_self=pself,temp=T,
lineWid=25.,cutoff_option="fixed",remove_plinth=True)
print( "kappaCO2 (no numba)=",kappa0 )
print( "kappaCO2 (numba)=",kappa0_numba )
print( "\n" )

kappa0 = getKappa_HITRAN(n,n0,n1,dn,"H2O", broadening="mixed",
press=p,press_self=pself,temp=T,
lineWid=25.,cutoff_option="fixed",remove_plinth=True)
kappa0_numba = getKappa_HITRAN_numba(n,n0,n1,dn,"H2O", broadening="mixed",
press=p,press_self=pself,temp=T,
lineWid=25.,cutoff_option="fixed",remove_plinth=True)
print( "kappaH2O (no numba)=",kappa0 )
print( "kappaH2O (numba)=",kappa0_numba )
print( "\n" )

from pyrads.

danielkoll avatar danielkoll commented on July 2, 2024

Resulting output is:

kappaCO2 (no numba)= [5.80495351e-05 3.91946490e-06 0.00000000e+00]
kappaCO2 (numba)= [5.80495351e-05 3.91946490e-06 0.00000000e+00]

kappaH2O (no numba)= [22725.4463143 7063.90460525 0. ]
kappaH2O (numba)= [5.80495351e-05 3.91946490e-06 0.00000000e+00]

The values produced by getKappa_HITRAN_numba() are identical for CO2 and H2O.

from pyrads.

danielkoll avatar danielkoll commented on July 2, 2024

@AndrewWilliams3142: getKappa_HITRAN_numba seems to produce incorrect results when calling it for a second time with another gas species. Presumably an issue with numba's caching? Manually setting @jit(cache=False) doesn't fix things though.

from pyrads.

AndrewILWilliams avatar AndrewILWilliams commented on July 2, 2024

Hi @ddbkoll ! I'll take a look now, it could well be something wrong with my numba interface (still getting used to it).

+1 for the tests!

from pyrads.

AndrewILWilliams avatar AndrewILWilliams commented on July 2, 2024

@ddbkoll very weird! I thiiiink this might be to do with how numba deals with global instances, see here.

Global variables are treated as constants. The cache will remember the value in the global variable used at compilation. On cache load, the cached function will not rebind to the new value of the global variable.

One such global variable is molName! So I think that when numba compiles the function for the first time it's remembering this, which is why you always get the values for CO2 (or whatever else you ran first).

Absorption_Crosssections_HITRAN2016_numba.py clearly needs a bit more work! On this topic, I was going to ask, there are lots of functions in Absorption_Crosssections_HITRAN2016.py (and the numba version) which aren't used and just seem to be for plotting? If so, I think it might make sense to either remove these functions (to reduce clutter and make debugging easier) or put them in a separate file.

from pyrads.

AndrewILWilliams avatar AndrewILWilliams commented on July 2, 2024

Sweet! Yeah I recognized a few of the functions from PyTran actually. I'll start thinking about a clean-up this week! (Need to make some lists of what's being used)

Another question, do you use the stuff in ClimateGraphics/ClimateGraphics_MPL etc? As far as I can tell most of this is redundant now that matplotlib is so advanced anyways, so there might be an argument for removing it from the package.

from pyrads.

AndrewILWilliams avatar AndrewILWilliams commented on July 2, 2024

Interesting! Thanks for the overview!

Maybe I'm missing something here, but in DATA/HITRAN_DATA/HITRAN2016/ThermalOnly_0-5000cm.MainIsotopesOnly the files are all quite small? for example 01_hit16.par is only 5.3Mb? I'm a bit confused by these files though, I'm not entirely sure how to load the entire thing in haha

from pyrads.

AndrewILWilliams avatar AndrewILWilliams commented on July 2, 2024

Another option could be to call loadSpectralLines() in the OpticalThickness calculation, before it loops over the temperatures/pressures. Currently it loads in the same spectral lines every iteration of the loop. I'm just putting together a PR now which should tidy up lots of the code

from pyrads.

Related Issues (13)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.