Comments (11)
Otherwise a kernel along the lines of
using KernelAbstractions.Extras.LoopInfo: @unroll
@kernel invert_column!(ψh, qh, S⁻¹, nz)
i, j = @index(Global, NTuple)
@unroll for k = 1:nz
@inbounds ψh[i, j, k] = 0
@unroll for m = 1:nz
@inbounds ψh[i, j, k] += S⁻¹[i, j][k, m] * qh[i, j, m]
end
end
end
might work, alternatively. Or maybe my indices are screwed up --- whichever is correct.
Nothing is too difficult, it's just a matter of trying it out.
from geophysicalflows.jl.
The first step is to write a kernel, which will look something like
@kernel invert_column!(ψh, qh, S⁻¹)
i, j = @index(Global, NTuple)
@inbounds ψh[i, j] .= S⁻¹[i, j] * qh[i, j]
end
The next step is to create a work layout over which the kernel is launched. If we restrict attention to models that always have more than 32 grid points, we can use something like
# Larger workgroups are generally more efficient. For more generality, we could put an if statement that incurs
# different behavior when either nkl or nl are less than 16
workgroup = 16, 16
# The size determines how many times the kernel is run
worksize = grid.nkr, grid.nl
# This (and its useage below) will ensure the kernel is not run _before_ the data in qh is available
barrier = Event(dev)
# Creates a loop over the specified worksize, using workgroup to organize the computation
loop_invert_column! = invert_column!(dev, workgroup, worksize)
# Launch the kernel
event = loop_invert_column!(ψh, qh, params.invS, dependencies=barrier)
# This will ensure that no other operations occur until the kernel has finished
wait(dev, event)
from geophysicalflows.jl.
One thing I am not totally sure about is whether KernelAbstractions
will compile away the matrix multiplication in @inbounds ψh[i, j] .= S⁻¹[i, j] * qh[i, j]
. I think that it will. If not, we may have to unroll our own loop.
from geophysicalflows.jl.
By the way, I think this optimization also requires the columns of ψh[i, j]
to be stored as StaticArrays
. It looks like ψh
is a 3D array right now. Other parts of the code may also have to converted to kernels if this change is made, since broadcasting over the 3D array would no longer work.
from geophysicalflows.jl.
With this last suggestion would x, y FFTs work nicely?
from geophysicalflows.jl.
With this last suggestion would x, y FFTs work nicely?
Oof, good point.
Hmm, maybe we need to hand-write the matrix matrix multiply then. Not sure.
from geophysicalflows.jl.
yes it's been coming to haunt us either way...
(I remember a similar discussion some months ago...)
from geophysicalflows.jl.
Something like
@kernel invert_column!(ψh, qh, S⁻¹)
i, j = @index(Global, NTuple)
ψh_column = view(ψh, i, j, :)
qh_column = view(qh, i, j, :)
@inbounds ψh_column .= S⁻¹[i, j] * qh_column
end
might work.
from geophysicalflows.jl.
I should resurrect this..
from geophysicalflows.jl.
What about https://github.com/mcabbott/Tullio.jl to the rescue? (just a random thought)
from geophysicalflows.jl.
There's probably a lot of solutions! I think I gave two, but there might be more.
from geophysicalflows.jl.
Related Issues (20)
- `MultiLayerQG` tests fail on Julia v1.6
- 3D Navier-Stokes HOT 1
- Better docstring for module `Problem` constructors
- Avoid scalar operations every time-step
- Should we enforce dealiasing before any calculation of nonlinear terms? HOT 7
- No need to import `FFTW` in examples
- `Plots` no longer accepts functions as `clims` HOT 7
- MultilayerQG example crashed at about 35000 steps HOT 7
- Optimized PV inversion for two layer case in `MultilayerQG` HOT 4
- Demonstrate save/load output functionality in an example
- Convert diagnostics output for `MultiLayerQG` from Arrays to Tuples? HOT 1
- م
- Potential missing factor in `MultiLayerQG.energies` HOT 1
- construct a random field with prescribed spectral slope HOT 2
- enstrophy and cuda problem HOT 7
- Should we be passing `forcing_spectrum` as argument into `calcF!`? HOT 2
- Option to specify PV gradients explicitly rather than computing them spectrally HOT 4
- Sign error for mean PV gradient `Qy` in `MultiLayerQG` module HOT 11
- Add citations in the Docs via DocumenterCitations.jl
- Mistake in stretching matrix in multilayerqg.jl HOT 6
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from geophysicalflows.jl.