Comments (7)
I've altered my script so that it permits single-line IF statements inside KERNELS regions. I've also done some experimenting and identified some files that I can now process that I couldn't previously. This makes the profile more informative if nothing else:
Most of the remaining white space is due to either global sums (especially in stp_ctl in stpctl.f90) or the packed halo exchanges (lbc_lnk_ptr in lbclnk.f90).
from psycloned_nemo.
An update: the support for the NVTX profiling API is now on master in PSyclone.
I've (manually) tweaked traldf_iso and tra_nxt_vvl to put KERNELS in more sensible/performant locations. They've now disappeared from the profile :-)
I've (manually) optimised the global sums in stp_ctl - the source of the white-space on the RHS of the profiles before this one. I've also introduced a heuristic that puts KERNELS inside loops over levels when they contain 2 or more loops. The latter is essential in a couple of the big kernels but the overall performance benefit is questionable. Still, it's only a small change to the script :-)
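For anyone reading along, the heuristic could be sketched roughly as below. This is only an illustration and not the actual script: the `Node` class, the `kernels_placement` helper and the `"inside"`/`"around"` labels are all hypothetical, it does not use the PSyclone API, and it assumes that a loop over levels is one whose loop variable is `jk` and that "2 or more loops" means immediate child loops.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy stand-in for a node of a parsed Fortran tree (hypothetical)."""
    kind: str                      # "loop" or "stmt"
    var: str = ""                  # loop variable, e.g. "jk"
    children: List["Node"] = field(default_factory=list)


# Assumption: loops over vertical levels use the loop variable jk.
LEVEL_VAR = "jk"


def kernels_placement(loop: Node) -> str:
    """Decide where a KERNELS region would go relative to `loop`.

    Returns "inside" when the loop is over levels and its body holds
    two or more immediate child loops, otherwise "around".
    """
    if loop.kind != "loop":
        raise ValueError("expected a loop node")
    inner_loops = [c for c in loop.children if c.kind == "loop"]
    if loop.var == LEVEL_VAR and len(inner_loops) >= 2:
        return "inside"
    return "around"


def make_loop(var: str, *children: Node) -> Node:
    """Small helper to build nested loops for the example."""
    return Node("loop", var, list(children))


# A level loop containing two separate horizontal sweeps ...
two_sweeps = make_loop(
    "jk",
    make_loop("jj", make_loop("ji")),
    make_loop("jj", make_loop("ji")),
)
# ... and one containing a single sweep.
one_sweep = make_loop("jk", make_loop("jj", make_loop("ji")))

print(kernels_placement(two_sweeps))
print(kernels_placement(one_sweep))
```

With this rule only the first example gets its KERNELS region moved inside the loop over levels; the second keeps the region around the whole nest.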
Realised I had a bug in the script that meant that KERNELS were not being put in lower branches of CASE statements. Also realised that PSyclone can now process icetab.f90, although the resulting code is slower...
Have got NEMO compiling with the latest version of PSyclone and PGI 19.10. Since the profiling API has changed I've created a new branch (profiling_new_api) in this repo. You will need to build the latest version of the NVIDIA wrapper library distributed with PSyclone. (See description of this Issue.) Resulting code is fastest yet: