Coder Social home page Coder Social logo

parthenon-hpc-lab / athenapk Goto Github PK

View Code? Open in Web Editor NEW
39.0 12.0 10.0 9.4 MB

AthenaPK: a performance portable version of Athena++ built on Parthenon and Kokkos

License: BSD 3-Clause "New" or "Revised" License

CMake 1.51% C++ 74.63% Python 19.22% Jupyter Notebook 4.64%
adaptive-mesh-refinement amr astrophysics kokkos magnetohydrodynamics mhd performance-portability parthenon cfd computational-fluid-dynamics

athenapk's People

Contributors

benwibking avatar forrestglines avatar jmstone avatar kayarre avatar pgrete avatar tbhaxor avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

athenapk's Issues

Check `prim`/`cons` handling for sources

Depending on the type of source (split/unsplit), we need to take special care of the content in the the prim and cons registers and their update (so that multiple split sources get the right input [original/updated] and generate the correct output).

Unable compute output in the Kelvin-Helmholtz problem with iprob=1

I have made only the following change in the kh-shear-lecoanet_2d.in input file

<problem/kh>
- iprob  = 4
+ iprob  = 1
+ drat   = 1.3
amp    = 0.1             # amplitude of initial vy perturbation

When I run the function using

OUTDIR=$(mktemp -d)
build/bin/athenaPK -i inputs/kh-shear-lecoanet_2d.in -d "$OUTDIR"

I am getting the following error

### PARTHENON ERROR
  Condition:   u_d > 0.0 || density_floor_ > 0.0
  Message:     Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.
  File:        /mnt/Projects/athenapk/src/eos/adiabatic_hydro.cpp
  Line number: 68
### PARTHENON ERROR
  Condition:   u_d > 0.0 || density_floor_ > 0.0
  Message:     Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.
  File:        /mnt/Projects/athenapk/src/eos/adiabatic_hydro.cpp
  Line number: 68
### PARTHENON ERROR
  Condition:   u_d > 0.0 || density_floor_ > 0.0
  Message:     Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.
  File:        /mnt/Projects/athenapk/src/eos/adiabatic_hydro.cpp
  Line number: 68
### PARTHENON ERROR
  Condition:   (null)
[h3ll:103321] *** Process received signal ***
[h3ll:103321] Signal: Segmentation fault (11)
[h3ll:103321] Signal code:  (128)
[h3ll:103321] Failing at address: (nil)
[h3ll:103321] [ 0] /usr/lib/libc.so.6(+0x38f50)[0x7f78e4051f50]
[h3ll:103321] [ 1] /usr/lib/libcuda.so.1(+0x3dc4d5)[0x7f78e4ddc4d5]
[h3ll:103321] [ 2] /usr/lib/libcuda.so.1(+0x2763bc)[0x7f78e4c763bc]
[h3ll:103321] [ 3] /usr/lib/libcuda.so.1(+0x34db9f)[0x7f78e4d4db9f]
[h3ll:103321] [ 4] /usr/lib/libcuda.so.1(+0x295798)[0x7f78e4c95798]
[h3ll:103321] [ 5] /usr/lib/libc.so.6(+0x85bb5)[0x7f78e409ebb5]
[h3ll:103321] [ 6] /usr/lib/libc.so.6(+0x107d90)[0x7f78e4120d90]
[h3ll:103321] *** End of error message ***

Static mesh refinement and GLMMHD fail on Delta GPU A40s

I've run into a bug on Delta GPU while trying out the A40's using static mesh refinement GLMMHD. The code runs into negative densities in the conserved to primitive step even when fluid evolution and all source terms are disabled. The failure occurs at different times and on different blocks.

Example Failure
cycle=28 time=1.3124988050904851e-01 dt=4.6874957324660183e-03 zone-cycles/wsec_step=7.51e+06 wsec_total=1.01e+00 wsec_step=3.87e-02
### PARTHENON ERROR
  Condition:   u_d > 0.0 || density_floor_ > 0.0
  Message:     Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.
  File:        /u/glines/code/athenapk-project/athenapk/src/eos/adiabatic_glmmhd.cpp
  Line number: 80
### PARTHENON ERROR
  Condition:   u_d > 0.0 || density_floor_ > 0.0
  Message:     Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.
  File:        /u/glines/code/athenapk-project/athenapk/src/eos/adiabatic_glmmhd.cpp
  Line number: 80
### PARTHENON ERROR
  Condition:   u_d > 0.0 || density_floor_ > 0.0
  Message:     Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.
  File:        /u/glines/code/athenapk-project/athenapk/src/eos/adiabatic_glmmhd.cpp
  Line number: 80
### PARTHENON ERROR
  Condition:   u_d > 0.0 || density_floor_ > 0.0
  Message:     Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.
  File:        /u/glines/code/athenapk-project/athenapk/src/eos/adiabatic_glmmhd.cpp
  Line number: 80
### PARTHENON ERROR
  Condition:   u_d > 0.0 || density_floor_ > 0.0
  Message:     Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.
  File:        /u/glines/code/athenapk-project/athenapk/src/eos/adiabatic_glmmhd.cpp
  Line number: 80
### PARTHENON ERROR
  Condition:   u_d > 0.0 || density_floor_ > 0.0
  Message:     Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.
  File:        /u/glines/code/athenapk-project/athenapk/src/eos/adiabatic_glmmhd.cpp
  Line number: 80
### PARTHENON ERROR
  Condition:   u_d > 0.0 || density_floor_ > 0.0
  Message:     Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.
  File:        /u/glines/code/athenapk-project/athenapk/src/eos/adiabatic_glmmhd.cpp
  Line number: 80
### PARTHENON ERROR
  Condition:   u_d > 0.0 || density_floor_ > 0.0
  Message:     Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.
  File:        /u/glines/code/athenapk-project/athenapk/src/eos/adiabatic_glmmhd.cpp
  Line number: 80
:0: : block: [1096,0,0], thread: [0,2,0] Assertion `Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.` failed.
:0: : block: [1096,0,0], thread: [0,3,0] Assertion `Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.` failed.
:0: : block: [1096,0,0], thread: [0,14,0] Assertion `Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.` failed.
:0: : block: [1096,0,0], thread: [0,15,0] Assertion `Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.` failed.
:0: : block: [1096,0,0], thread: [0,146,0] Assertion `Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.` failed.
:0: : block: [1096,0,0], thread: [0,147,0] Assertion `Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.` failed.
:0: : block: [1096,0,0], thread: [0,158,0] Assertion `Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.` failed.
:0: : block: [1096,0,0], thread: [0,159,0] Assertion `Got negative density. Consider enabling first-order flux correction or setting a reasonble density floor.` failed.

Using this parameter file with a linear wave:
glmmd_smr_linear_wave3d.txt

More outputs:

out_1.txt
out_2.txt
out_3.txt

With this environtment ( loading the anaconda3_gpu openmpi/4.1.4, cuda/11.7.0, and hdf5 modules.)

$ module list

Currently Loaded Modules:
  1) cue-login-env/1.0   2) gcc/11.2.0   3) modtree/gpu   4) default   5) cudnn/8.4.1.50   6) anaconda3_gpu/4.13.0   7) ucx/1.12.1   8) openmpi/4.1.4   9) cuda/11.7.0  10) hdf5/1.12.2

Smaller meshblocks seem to increase the frequency of the error. In the above example I used 8x8x8 meshblocks and it never ran to completion, using 16x16x16 meshblocks it failed 1 out of 5 times. Changing the fluid method to euler or removing the static mesh refinement zone seems to avoid the issue, although it might just reduce the likely hood of the issue.

The problem occurs with a RelWithDebInfo build and not with a Debug build, making this tricky to debug.

No CUDA aware MPI

I am trying to run AthenaPK on a machine that does not support CUDA aware MPI (does not have Infiband).
I am compiling with
cmake -S. -Bbuild-gpu -DKokkos_ARCH_SKX=ON -DKokkos_ENABLE_CUDA=ON -DKokkos_ARCH_AMPERE80=ON -D CMAKE_CXX_COMPILER=${PWD}/external/Kokkos/bin/nvcc_wrapper -DPARTHENON_ENABLE_HOST_COMM_BUFFERS=O

so I already use -DPARTHENON_ENABLE_HOST_COMM_BUFFERS=O but still I am getting the following error message when running

--------------------------------------------------------------------------
By default, for Open MPI 4.0 and later, infiniband ports on a device
are not used by default.  The intent is to use UCX for these devices.
You can override this policy by setting the btl_openib_allow_ib MCA parameter
to true.

  Local host:              freyag08
  Local adapter:           hfi1_0
  Local port:              1

--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.

  Local host:   freyag08
  Local device: hfi1_0
--------------------------------------------------------------------------
Kokkos::Cuda::initialize ERROR: likely mismatch of architecture

Backtrace:
                        [0x88a053]
                        [0x8813e8]
                        [0x88141b]
                        [0x89634c]
                        [0x899fa8]
                        [0x87cfce]
                        [0x7cf4ef]
                        [0x43dd68]
__libc_start_main [0x149fd7a3f2bd]
                        [0x45320a]
[freyag08:09950] *** Process received signal ***
[freyag08:09950] Signal: Aborted (6)
[freyag08:09950] Signal code:  (-6)
[freyag08:09950] [ 0] /lib64/libpthread.so.0(+0x168c0)[0x149fd878c8c0]
[freyag08:09950] [ 1] /lib64/libc.so.6(gsignal+0x10d)[0x149fd7a54cdb]
[freyag08:09950] [ 2] /lib64/libc.so.6(abort+0x177)[0x149fd7a56375]
[freyag08:09950] [ 3] /u/maxbg/athenapk/athenapk-public/build-gpu/bin/athenaPK[0x881420]
[freyag08:09950] [ 4] /u/maxbg/athenapk/athenapk-public/build-gpu/bin/athenaPK[0x89634c]
[freyag08:09950] [ 5] /u/maxbg/athenapk/athenapk-public/build-gpu/bin/athenaPK[0x899fa8]
[freyag08:09950] [ 6] /u/maxbg/athenapk/athenapk-public/build-gpu/bin/athenaPK[0x87cfce]
[freyag08:09950] [ 7] /u/maxbg/athenapk/athenapk-public/build-gpu/bin/athenaPK[0x7cf4ef]
[freyag08:09950] [ 8] /u/maxbg/athenapk/athenapk-public/build-gpu/bin/athenaPK[0x43dd68]
[freyag08:09950] [ 9] /lib64/libc.so.6(__libc_start_main+0xef)[0x149fd7a3f2bd]
[freyag08:09950] [10] /u/maxbg/athenapk/athenapk-public/build-gpu/bin/athenaPK[0x45320a]
[freyag08:09950] *** End of error message ***
srun: error: freyag08: task 0: Aborted
srun: launch/slurm: _step_signal: Terminating StepId=485895.1

submodules and athenapk are up to date.

Modules loaded are

  1. cuda/11.4 2) gcc/11 3) openmpi/4 4) hdf5-mpi/1.12.1

[Cluster] Negative pressure

Hey,

I'm currently starting my PhD project and am setting up a simulation starting from the cluster.in environment. I've slightly modified the default set up so that I start with a perturbed density field (basically a clumpy atmosphere). When I let the simulation run, everything goes fine (the gas is cooling down, and start collapsing due to gravity).

However at some point the pressure reach negative values in some cells for a reason I could not identify. I added some couts in the code to check for the values of the variables involved in the computation of the pressure, and it look like the energy density is balancing the kinetic term so that they cancel out (w_p being defined in the code as gm1*(u_e - e_k)) and the result is slightly negative.

Any idea where this could come from ? I work with periodic boundary conditions by the way.

Thanks a lot !

Regression tests fail because it can't load phdf

Inside a virtualenv with Python 3.9.6 on HPCC:

*****************************************************************
Preparing Test Case Step 39
*****************************************************************

*****************************************************************
Running Driver
*****************************************************************

Command to execute driver
/mnt/home/wibkingb/athenapk/build/bin/athenaPK -i /mnt/home/wibkingb/athenapk/inputs/sod.in parthenon/mesh/nx1=64 parthenon/meshblock/nx1=64 parthenon/time/integrator=rk3 hydro/reconstruction=wenoz parthenon/mesh/nghost=3 hydro/riemann=hllc parthenon/output0/id=39 problem/sod/rho_l=1.4 problem/sod/pres_l=1.0 problem/sod/u_l=0.1 problem/sod/rho_r=1.0 problem/sod/u_r=0.1 problem/sod/pres_r=1.0 problem/sod/x_discont=0.5 parthenon/time/tlim=2.0 parthenon/output0/dt=2.0
Running with coverage
False
*****************************************************************
Analysing Driver Output
*****************************************************************

Couldn't find module to read Parthenon hdf5 files.
Traceback (most recent call last):
  File "/mnt/home/wibkingb/athenapk/external/parthenon/tst/regression/run_test.py", line 204, in <module>
    main(**vars(args))
  File "/mnt/home/wibkingb/athenapk/external/parthenon/tst/regression/run_test.py", line 93, in main
    raise TestError("Test " + test_manager.test + " failed")
__main__.TestError: Test riemann_hydro failed

Hydrostatic equilibrium

Hey,

I'm trying to set up an hydrostatic equilibrium so that its stable on scales of 1e8-1e9 yrs. I made some tries with the already implemented initial conditions provided in the input folder (hse.in), which is by default set with a time limit of 1e6 yrs.

When trying to extend it to a few 1e7 yrs, I quickly observe a square pattern appearing in the box at roughly 50 Myr, which leads to a dramatic collapse of the profile (see file attached). I tried to increase resolution, change the boundary conditions or some profile parameters but did not manage to extend stability much beyond 10 Myr. Can I have any hint on how you manage to sustain equilibrium on larger timescales ?

Thanks by advance !

density.10.mp4

Adding "physics-based" reconstruction

For various things, it would be useful to reconstruct a different set of variables other than the primitive variables. For example, we might want to reconstruct (density, velocity, temperature) instead, or we might we to reconstruct using the characteristic waves.

How can this best be accomplished with the current machinery? Can we do this cell-by-cell, e.g., with an inline function within the reconstruction kernel?

enhanced first-order flux correction

Two things are not clear to me:

  1. Does first-order flux correction (as implemented here) drop to first order fluxes for each stage separately? (i.e., does it still do higher-order time integration if you attempt to use it with RK integrators?)

  2. What happens if there is a bad cell at the edge of a MeshBlock? Are the new fluxes communicated to neighboring MeshBlocks?

bad XDMF metadata after first component

When running a lower resolution version of the cloud problem (-i inputs/cloud_lowres.in) on V100 at commit 0584231

[wibkingb@dev-amd20-v100 inputs]$ diff cloud.in cloud_lowres.in
16c16
< nx1 = 192
---
> nx1 = 48
22c22
< nx2 = 320
---
> nx2 = 80
28c28
< nx3 = 192
---
> nx3 = 48
37,39c37,39
< nx1=64
< nx2=64
< nx3=64
---
> nx1=16
> nx2=16
> nx3=16

the outputs at the first timestep for all variables except the density are bogus, e.g.:
AMR0002

XDMF outputs are invalid and cannot be read by VisIt or Paraview

Opening XDMF outputs in VisIt or Paraview causes an immediate crash of both programs.

This appears to be because the XDMF outputs do not successfully validate against the XDMF specification DTD file (https://gitlab.kitware.com/xdmf/xdmf/blob/master/Xdmf.dtd):

In [1]: import lxml

In [2]: from lxml import etree

In [3]: parser = etree.XMLParser(dtd_validation=True)

In [4]: tree = etree.parse("./parthenon.prim.00000.phdf.xdmf", parser)
Traceback (most recent call last):

  File ~/Library/Python/3.10/lib/python/site-packages/IPython/core/interactiveshell.py:3378 in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  Cell In [4], line 1
    tree = etree.parse("./parthenon.prim.00000.phdf.xdmf", parser)

  File src/lxml/etree.pyx:3538 in lxml.etree.parse

  File src/lxml/parser.pxi:1876 in lxml.etree._parseDocument

  File src/lxml/parser.pxi:1902 in lxml.etree._parseDocumentFromURL

  File src/lxml/parser.pxi:1805 in lxml.etree._parseDocFromFile

  File src/lxml/parser.pxi:1177 in lxml.etree._BaseParser._parseDocFromFile

  File src/lxml/parser.pxi:615 in lxml.etree._ParserContext._handleParseResultDoc

  File src/lxml/parser.pxi:725 in lxml.etree._handleParseResult

  File src/lxml/parser.pxi:654 in lxml.etree._raiseParseError

  File ./parthenon.prim.00000.phdf.xdmf:10
    <Topology Type="3DRectMesh" NumberOfElements="65 65 65"/>
                                                          ^
XMLSyntaxError: No declaration for attribute Type of element Topology, line 10, column 62

HSE init in ghost zone

@mfournier01 discovered an issue when applying the (initialized) density field as a weight to the scalar potential that initialized the magnetic field.

The MWE is the default cluster.in (just with periodic BC and perturbations): cluster.txt
with these changes (so loop bounds are already adjusted to also init one layer into the ghost cells): main...pgrete/hse-ghost-mwe

Note, that using

u(IDN, k, j, i) = 1/(1+r);

instead of

u(IDN, k, j, i) = rho_r;

results in a correctly initialized field.

The magnetic field divergence for the initial dump looks like:

slc = yt.SlicePlot(ds,"x",("gas","magnetic_field_divergence_absolute"))
slc.annotate_grids()
slc.show()

image

so there are artifacts at the block boundaries (and more specifically, it looks like at boundaries where the coordinate do not cross 0).
(Note, the artifacts at the outermost edge are related to yt domain boundary handling, so should not be a concern.)

We went through the hse code, but didn't spot anything that looks like being related to this kind of behavior.
Any ideas, @forrestglines ?

Origin of Schure cooling table

Where does the data in the Schure cooling table actually come from, @forrestglines ?
The comment say

# Cooling table for solar metallicity, 1/2 solar metallicity
#   (from Chris Loken, computed with Sarazin & White's analytic expression)
#  temperature if log(K), cooling rate/ne^3 (erg cm^3/s)
# This is the new cooling table based on http://arxiv.org/pdf/0909.5204v2

but when I look at the pdf
a) there are slight discrepancies in the numbers for solar metallicity (table 2)
b) there's no table for the half-solar metallicity column

To be clear, I'm not concerned that those numbers are "wrong" (or that the slight deviation have a significant impact), I just want to understand where they come from.

Doubt in problem output (Not generating the output)

I am facing problem with output of the Rayleigh-Taylor problem.

C++ Code

//! \file rt.cpp
//! \brief Rayleigh Taylor problem generator
//! Problem domain should be -.05 < x < .05; -.05 < y < .05, -.1 < z < .1, gamma=5/3 to
//! match Dimonte et al.  Interface is at z=0; perturbation added to Vz. Gravity acts in
//! z-dirn. Special reflecting boundary conditions added in x3.  A=1/2.  Options:
//!    - iprob = 1 -- Perturb V3 using single mode
//!    - iprob = 2 -- Perturb V3 using multiple mode
//!    - iprob = 3 -- B rotated by "angle" at interface, multimode perturbation
//!
// C headers

// C++ headers
#include <algorithm> // min, max
#include <cmath>     // log
#include <cstring>   // strcmp()
#include <sstream>

// Parthenon headers
#include "mesh/mesh.hpp"
#include "parthenon/driver.hpp"
#include "parthenon/package.hpp"
#include "utils/error_checking.hpp"

// AthenaPK headers
#include "../main.hpp"

namespace rt {
  using namespace parthenon::driver::prelude;

  void SetupSingleMode(MeshBlock *pmb, parthenon::ParameterInput *pin) {
    if (pmb->pmy_mesh->ndim == 1) {
      PARTHENON_THROW("This problem should be either in 2d or 3d.");
      return;
    }

    Real kx = 2.0*(M_PI)/(pmb->pmy_mesh->mesh_size.x1max - pmb->pmy_mesh->mesh_size.x1min);
    Real ky = 2.0*(M_PI)/(pmb->pmy_mesh->mesh_size.x2max - pmb->pmy_mesh->mesh_size.x2min);
    Real kz = 2.0*(M_PI)/(pmb->pmy_mesh->mesh_size.x3max - pmb->pmy_mesh->mesh_size.x3min);

    Real amp = pin->GetReal("problem/rt","amp");
    Real drat = pin->GetOrAddReal("problem/rt","drat",3.0);
    Real grav_acc = pin->GetReal("hydro","const_accel_val");

    auto ib = pmb->cellbounds.GetBoundsI(IndexDomain::interior);
    auto jb = pmb->cellbounds.GetBoundsJ(IndexDomain::interior);
    auto kb = pmb->cellbounds.GetBoundsK(IndexDomain::interior);
    
    auto gam = pin->GetReal("hydro", "gamma");
    auto gm1 = (gam - 1.0);
    Real p0 = 1.0/gam;

    // initialize conserved variables
    auto &rc = pmb->meshblock_data.Get();
    auto &u_dev = rc->Get("cons").data;
    auto &coords = pmb->coords;
    // initializing on host
    auto u = u_dev.GetHostMirrorAndCopy();

    for (size_t k = kb.s; k < kb.e; k++) {
      for (size_t j = jb.s; j < jb.e; j++) {
        for (size_t i = ib.s; j < ib.e; i++) {
            auto x1v = coords.Xc<1>(i);
            auto x2v = coords.Xc<2>(j);
            auto x3v = coords.Xc<3>(k);
            
            switch (pmb->pmy_mesh->ndim) {
              case 2:{
                Real den=1.0;
                if (x2v > 0.0) den *= drat;

                u(IM2,k,j,i) = (1.0 + cos(kx*x1v))*(1.0 + cos(ky*x2v))/4.0;
                u(IDN,k,j,i) = den;
                u(IM1,k,j,i) = 0.0;
                u(IM2,k,j,i) *= (den*amp);
                u(IM3,k,j,i) = 0.0;
                u(IEN,k,j,i) = (p0 + grav_acc*den*x2v)/gm1 + 0.5*SQR(u(IM2,k,j,i))/den;
              }
              break;
              case 3: {
                Real den=1.0;
                if (x3v > 0.0) den *= drat;
                u(IM3,k,j,i) = (1.0+cos(kx*x1v))*(1.0+cos(ky*x2v))*(1.0+cos(kz*x3v))/8.0;
                u(IDN,k,j,i) = den;
                u(IM1,k,j,i) = 0.0;
                u(IM2,k,j,i) = 0.0;
                u(IM3,k,j,i) *= (den*amp);
                u(IEN,k,j,i) = (p0 + grav_acc*den*x3v)/gm1 + 0.5*SQR(u(IM3,k,j,i))/den;
              }
              break;
            }
        }
      }
    }
  }

  void ProblemGenerator(MeshBlock *pmb, parthenon::ParameterInput *pin) {
    auto iprob = pin->GetOrAddInteger("problem/rt", "iprob", 1);

    switch (iprob) {
      case 1:
        SetupSingleMode(pmb, pin);
        break;

      default:
        std::stringstream msg;
        msg << "Problem type " << iprob << " is not supported.";
        PARTHENON_THROW(msg.str());
    }
  }
} // namespace rt

Input file

<comment>
problem = Kelvin-Helmholtz instability
reference = Lecoanet et al., MNRAS 455, 4274-4288, 2016

<job>
problem_id = rt

<problem/rt>
iprob  = 1
amp   = 0.01
drat  = 2.0

<parthenon/mesh>
refinement = adaptive
numlevel = 3
nghost = 4

nx1       = 100         # Number of zones in X1-direction
x1min     = -0.16666667 # minimum value of X1
x1max     = 0.16666667  # maximum value of X1
ix1_bc    = periodic    # inner-X1 boundary flag
ox1_bc    = periodic    # outer-X1 boundary flag

nx2       = 200         # Number of zones in X2-direction
x2min     = -0.5        # minimum value of X2
x2max     = 0.5         # maximum value of X2
ix2_bc    = reflecting     # inner-X2 boundary flag
ox2_bc    = reflecting     # outer-X2 boundary flag

nx3       = 1           # Number of zones in X3-direction
x3min     = -0.5        # minimum value of X3
x3max     = 0.5         # maximum value of X3
ix3_bc    = periodic    # inner-X3 boundary flag
ox3_bc    = periodic    # outer-X3 boundary flag


<parthenon/meshblock>
nx1       = 100         # Number of cells in each MeshBlock, X1-dir
nx2       = 200         # Number of cells in each MeshBlock, X2-dir
nx3       = 1           # Number of cells in each MeshBlock, X3-dir

<parthenon/time>
integrator = vl2
cfl = 0.4
tlim = 10.0
nlim = 100000
perf_cycle_offset = 2 # number of inital cycles not to be included in perf calc
ncycle_out_mesh = -1000

<hydro>
fluid = euler
eos = adiabatic
riemann = hlle
reconstruction = ppm
gamma = 1.666666666666667 # gamma = C_p/C_v
const_accel     = true   # add constant accleration source term
const_accel_val = -0.1   # value of constant acceleration
const_accel_dir = 2      # direction of constant acceleration
first_order_flux_correct = true 
dfloor = 0.00000001                            # unused, in [code units]
pfloor = 0.00000001                            # unused, in [code units]

<parthenon/output0>
file_type = hdf5
dt = 1.0
variables = prim

<parthenon/output1>
file_type = hst
dt = 0.1

Contents of the history file are changing but neither paraview, nor movie2d script is showing the evolution of fluid.

#  History data
# [1]=time     [2]=dt       [3]=mass    [4]=1-mom   [5]=2-mom   [6]=3-mom   [7]=KE      [8]=tot-E   
  0.00000e+00  1.03280e-03  3.33333e-09  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  5.00000e-09
  1.00181e-01  1.03280e-03  3.33333e-09 -5.31666e-25 -4.29764e-24  0.00000e+00  2.81286e-39  5.00000e-09
  2.00362e-01  1.03280e-03  3.33333e-09 -1.06333e-24 -8.59528e-24  0.00000e+00  1.12514e-38  5.00000e-09
  3.00544e-01  1.03280e-03  3.33333e-09 -1.59500e-24 -1.28929e-23  0.00000e+00  2.53157e-38  5.00000e-09
  4.00725e-01  1.03280e-03  3.33333e-09 -2.12666e-24 -1.71906e-23  0.00000e+00  4.50058e-38  5.00000e-09
  5.00906e-01  1.03280e-03  3.33333e-09 -2.65833e-24 -2.14882e-23  0.00000e+00  7.03215e-38  5.00000e-09
  6.00054e-01  1.03280e-03  3.33333e-09 -3.18451e-24 -2.57415e-23  0.00000e+00  1.00915e-37  5.00000e-09
  7.00235e-01  1.03280e-03  3.33333e-09 -3.71618e-24 -3.00392e-23  0.00000e+00  1.37424e-37  5.00000e-09
  8.00417e-01  1.03280e-03  3.33333e-09 -4.24785e-24 -3.43368e-23  0.00000e+00  1.79559e-37  5.00000e-09
  9.00598e-01  1.03280e-03  3.33333e-09 -4.77951e-24 -3.86345e-23  0.00000e+00  2.27320e-37  5.00000e-09
  1.00078e+00  1.03280e-03  3.33333e-09 -5.31118e-24 -4.29321e-23  0.00000e+00  2.80706e-37  5.00000e-09
  1.10096e+00  1.03280e-03  3.33333e-09 -5.84284e-24 -4.72298e-23  0.00000e+00  3.39718e-37  5.00000e-09
  1.20011e+00  1.03280e-03  3.33333e-09 -6.36903e-24 -5.14831e-23  0.00000e+00  4.03661e-37  5.00000e-09
  1.30029e+00  1.03280e-03  3.33333e-09 -6.90070e-24 -5.57807e-23  0.00000e+00  4.73867e-37  5.00000e-09
  1.40047e+00  1.03280e-03  3.33333e-09 -7.43236e-24 -6.00784e-23  0.00000e+00  5.49698e-37  5.00000e-09
  1.50065e+00  1.03280e-03  3.33333e-09 -7.96403e-24 -6.43760e-23  0.00000e+00  6.31155e-37  5.00000e-09
  1.60083e+00  1.03280e-03  3.33333e-09 -8.49569e-24 -6.86737e-23  0.00000e+00  7.18237e-37  5.00000e-09
  1.70101e+00  1.03280e-03  3.33333e-09 -9.02736e-24 -7.29713e-23  0.00000e+00  8.10946e-37  5.00000e-09
  1.80016e+00  1.03280e-03  3.33333e-09 -9.55354e-24 -7.72246e-23  0.00000e+00  9.08237e-37  5.00000e-09
  1.90034e+00  1.03280e-03  3.33333e-09 -1.00852e-23 -8.15223e-23  0.00000e+00  1.01214e-36  5.00000e-09
... stripped ...

It created 10 image files, and all of them has same plot

image

KOKKOS_ENABLE_CUDA defined but the compiler is not defining the __CUDACC__ macro as expected

Getting this error in <mesh/mesh.hpp> of kh.cpp file.

File external/Kokkos/core/src/setup/Kokkos_Setup_Cuda.hpp

#if defined(KOKKOS_ENABLE_CUDA) && !defined(__CUDACC__)
#error \
    "KOKKOS_ENABLE_CUDA defined but the compiler is not defining the __CUDACC__ macro as expected"
// Some tooling environments will still function better if we do this here.
#define __CUDACC__
#endif /* defined(KOKKOS_ENABLE_CUDA) && !defined(__CUDACC__) */

I have also tried -D__CUDACC__=external/Kokkos/bin/nvcc_wrapper, but still same error. Though code works fine and compiles without any error but it complains (false positive messages) in the code editor while working with the codebase.

fix reflecting boundary conditions

The Parthenon reflecting boundary conditions do not work correctly in AthenaPK, because AthenaPK allocates variables in a different way from other Parthenon codes.

I will add custom boundary conditions for this case.

CMake warning: empty CUDA_ARCHITECTURES not allowed

Not sure if this is a Parthenon issue or AthenaPK issue. Only appears with very recent versions of CMake.

CMake Warning (dev) in src/CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "athenaPK".
This warning is for project developers.  Use -Wno-dev to suppress it.

Why are you using problem/[id] for initial conditions?

Hi, I am wondering why did you plan to use <problem/[id]> instead of <problem> like athena++ code?

Also are you planning to use different AMR solutions along with parthenon? If not, then why are you using <parthenon/....> in blocks?

Implement xsPPM7 reconstruction

@BenWibking opened on GitLab https://gitlab.com/theias/hpc/jmstone/athena-parthenon/athenapk/-/issues/15:

This method is described in: Rider, Greenough, and Kamm (JCP, Vol. 225, 2, p. 1827-1848): https://ui.adsabs.harvard.edu/abs/2007JCoPh.225.1827R/abstract.
I tested this method in conjunction with RK2, but it requires a very low CFL number to be stable. It may work better with RK3.
The numerical dissipation is extremely low, so physical dissipation may be required for practical use. For instance, you cannot simulate a Sedov point explosion without extra dissipation.
Note: It requires the same number of ghost zones as PPM+flattening (4).

Cluster pgen Cleanup

We identified a few aspects of the cluster setup in #2 that could be cleaned up, although they are low priority

  • Unify the calculation mean molecular masses from helium mass fraction
  • Simplify/avoid the passing around of hydro_pkg for the different cluster functions
  • k_0 and k_100 have bad orders of magnitude, <-100. This could be partially rectified by dividing these quantities by mean particle mass
  • Make determining the radial points where the 1D HSE profile is integrated more robust without the r_sampling parameter

Split of cooling time cfl

The adaptive and Townsend integrators do not require a limitation on the timestep a priori.
Nevertheless, from a physical point of view it may be desired to limit the timestep anyway so that the thermal dynamics and other dynamics don't decouple.
While this is currently supported/tied to the adaptive rk12, and rk45 cooling integrators, we should separate that option and properly document it.

Mass is not conserved for reflecting boundary conditions - Sod tube problem

Hello! We are currently doing some tests with Parthenon in another code in development and thought to test AthenaPK as well if the problem occurs there. It seems that both mass and energy is not conserved for a Sod tube problem with periodic boundary conditions along x2 and x3, and reflecting along the x1 direction. We also encounter this problem for an RT test we wrote for a private code we have where we set reflecting boundary conditions along the x1 direction as well.

We compile the code using these commands
cmake -DKokkos_ENABLE_OPENMP=off -DKokkos_ARCH_BDW=on ..
and
make -j

and run the Sod tube problem with the parameter file attached. Is there a fix for this problem or is it an unresolved issue for Parthenon? I run with commit 540deb9 in the main branch.

Parameter file used
sod.txt

Run with ix1_bc and ox1_bc set to periodic
parthenon_periodic.txt

Run with ix1_bc and ox1_bc set to reflecting
parthenon_reflecting.txt

Any insight about this would be really helpful!

MPI-related error message at end of run

On macOS 13.1 + OpenMPI 4.1.4:

walltime used = 8.22e+01
zone-cycles/wallsecond = 7.10e+06
--------------------------------------------------------------------------
A system call failed during shared memory initialization that should
not have.  It is likely that your MPI job will now either abort or
experience performance degradation.

  Local host:  Bens-MacBook-Pro.local
  System call: unlink(2) /var/folders/lz/n6jhm4hj4_l73w0m2yky5m340000gn/T//ompi.Bens-MacBook-Pro.501/pid.89453/1/vader_segment.Bens-MacBook-Pro.501.bdf40001.6
  Error:       No such file or directory (errno 2)
--------------------------------------------------------------------------

Build error when using cuda

I have used the following command to build the project for cuda

cmake -Bbuild-gpu -DKokkos_ARCH_TURING75=ON  -DKokkos_ENABLE_CUDA=ON  -DCUDA_ROOT=/opt/cuda
make -j $(nproc)

now it is failing at

/usr/include/c++/12.2.1/bits/locale_facets_nonio.tcc: In member function ‘_InIter std::__cxx11::time_get<_CharT, _InIter>::get(iter_type, iter_type, std::ios_base&, std::ios_base::iostate&, tm*, const char_type*, const char_type*) const’:
/usr/include/c++/12.2.1/bits/locale_facets_nonio.tcc:1477:77: error: invalid type argument of unary ‘*’ (have ‘int’)
 1477 |       if ((void*)(this->*(&time_get::do_get)) == (void*)(&time_get::do_get))
      | 

Is it specific to g++ version, if yes which one is recommended?

ascent slices do not work on GPU (likely ascent bug)

The Ascent outputs do not work on GPU for the precipitator pgen. It works fine on host runs.

On GPU runs, we instead get:

s1/p1 pseudocolor plot yielded no data, i.e., no cells remains2/p1 pseudocolor plot yielded no data, i.e., no cells remaincycle=1465 time=2.6308698410348811e+03 dt=1.7657586286923035e+00 zone-cycles/wsec_step=4.56e+05 wsec_total=1.23e+03 wsec_step=1.15e+00

and

dens00000

Unable to run cloud in wind simulation because of cuda allocation error.

build/bin/athenaPK -i inputs/cloud.in -d cloud

The error I am getting is

######################################
###### Cloud in wind problem generator
#### Input parameters
## Cloud radius: 0.1 kpc
## Cloud density: 1e-24 g/cm^3
## Wind density: 1e-27 g/cm^3
## Wind temperature: 1e+07 K
## Wind velocity: 1.7e+03 km/s
#### Derived parameters
## Cloud temperature (from pressure equ.): 1e+04 K
## Cloud to wind density ratio: 1e+03
## Cloud to wind temperature ratio: 0.001
## Uniform pressure (code units): 2.2
## Wind sonic Mach: 3.5
## Cloud crushing time: 1.8 Myr
#### INFO:
## Interpreted time limits (partenon/time/tlim and dt for outputs) as in multiples of the cloud crushing time.
## Simulation will now run for 3.6 [code_time] corresponding to 2 [t_cc].
######################################
#Variables in use:
# Package: parthenon::resolved_state
# ---------------------------------------------------
# Variables:
# Name  Metadata flags
# ---------------------------------------------------
prim                      Cell,Provides,Real,Derived
cons                      Cell,Provides,Real,Independent,FillGhost,WithFluxes
# ---------------------------------------------------
# Sparse Variables:
# Name  sparse id       Metadata flags
# ---------------------------------------------------
# ---------------------------------------------------
# Swarms:
# Swarm Value   metadata
# ---------------------------------------------------

terminate called after throwing an instance of 'std::runtime_error'
  what():  Kokkos failed to allocate memory for label "cons".  Allocation using MemorySpace named "Cuda" failed with the following error:  Allocation of size 17.09 M failed, likely due to insufficient memory.  (The allocation mechanism was cudaMalloc().  The Cuda allocation returned the error code ""cudaErrorMemoryAllocation".)

These are my GPU details

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Quadro P4000"
  CUDA Driver Version / Runtime Version          12.0 / 11.1
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 8106 MBytes (8499363840 bytes)
  (14) Multiprocessors, (128) CUDA Cores/MP:     1792 CUDA Cores
  GPU Max Clock rate:                            1480 MHz (1.48 GHz)
  Memory Clock rate:                             3802 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 2097152 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total shared memory per multiprocessor:        98304 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Managed Memory:                Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 3 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.0, CUDA Runtime Version = 11.1, NumDevs = 1
Result = PASS

Add shock flattener

@BenWibking reported on https://gitlab.com/theias/hpc/jmstone/athena-parthenon/athenapk/-/issues/14:

It would be very useful to implement the shock flattener from Miller & Colella (2002): https://ui.adsabs.harvard.edu/abs/2002JCoPh.183...26M/abstract

Thread:

@pgrete FYI (not sure if you've seen it), first order flux correction is implemented to fix troubled cells.
@BenWibking Yes, I've seen that. In my tests with Quokka, the shock flattener appeared necessary to suppress some post-shock oscillations in strong shocks. I remember this being a particular issue with the Zel'dovich spike in radiative shocks. Do you see any of these kinds of issues currently? I didn't do characteristic reconstruction in Quokka, so that might be a difference.

@pgrete I see. I didn't think of the post-shock oscillations. For those the flattener might be the better choice.

@BenWibking Is first-order flux correction implemented for RK2/RK3?

@pgrete We call first order flux correction in each stage at the moment (so not just at the very end). I'm not sure which is the better approach, but I went for this one as I think that "each stage" might be better because no neighboring cells will be updated during an entire cycle with potentially nonphysical data from a previous stage.
@BenWibking That's what I did in Quokka as well. If I didn't do this at the end of each stage, then in the later stages, the Riemann solver would produce NANs from the imaginary sound speeds. So this makes sense to me.

@BenWibking The oscillations can also be reproduced with a pure-hydro shock tube, such as this one: https://github.com/BenWibking/quokka/blob/development/src/HydroShocktube/test_hydro_shocktube.cpp based on this test problem: https://cococubed.com/code_pages/ppm_1d.shtml

Namespace is not showing in the main.cpp

It's been long time I worked on athenapk. I have defined a problem in pgen folder with name rt.cpp and here is the boilerplate

//! \file rt.cpp
//! \brief Rayleigh Taylor problem generator

// C headers

// C++ headers
#include <algorithm> // min, max
#include <cmath>     // log
#include <cstring>   // strcmp()

// Parthenon headers
#include "mesh/mesh.hpp"
#include <parthenon/driver.hpp>
#include <parthenon/package.hpp>
#include <random>

// AthenaPK headers
#include "../main.hpp"

namespace rt {
  using namespace parthenon::driver::prelude;

  void ProblemGenerator(MeshBlock *pmb, parthenon::ParameterInput *pin) {

  }
} // namespace rt

I have also added the file in src/pgen/CMakeLists.txt target source . Now when I try to use the namespace in main.cpp and compile it I am getting error

/home/tbhaxor/athenapk/src/main.cpp:102:44: error: use of undeclared identifier 'rt'
    pman.app_input->MeshProblemGenerator = rt::ProblemGenerator;
                                           ^
1 error generated when compiling for sm_61.
make[2]: *** [src/CMakeFiles/athenaPK.dir/build.make:63: src/CMakeFiles/athenaPK.dir/main.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:1736: src/CMakeFiles/athenaPK.dir/all] Error 2
make: *** [Makefile:141: all] Error 2

Note Other namespace like kh or advection are working. Problem is when I define new problem.

...
  } else if (problem == "rt") {
    pman.app_input->MeshProblemGenerator = rt::ProblemGenerator;
  } else {
...

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.