ceed / laghos

High-order Lagrangian Hydrodynamics Miniapp

Home Page: http://ceed.exascaleproject.org/miniapps

License: BSD 2-Clause "Simplified" License

C++ 93.69% Makefile 6.31%

ceed finite-elements high-order hpc hydrodynamics lagrangian miniapp proxy-application

laghos's Issues

Laghos in a workflow?

Hi! I was wondering if you have any example workflows that use Laghos that are beyond a single command (e.g., the mpirun examples at the bottom of the repository). We are looking for something that has a bit of complexity in terms of workflow steps. Thanks!

Error messages from glibc

I got error messages from glibc while running the newest version of Laghos on Titan. I did not see these errors with commit bdd3fe4, which dates back to Apr 28, 2018.

After launching Laghos with the command aprun -n1 -S1 ./laghos -p 0 -m data/square01_quad.mesh -rs 3 -tf 0.75 -no-vis -pa, I got the following output. The error messages appeared at the very end of the run.

       __                __                 
      / /   ____  ____  / /_  ____  _____   
     / /   / __ `/ __ `/ __ \/ __ \/ ___/ 
    / /___/ /_/ / /_/ / / / / /_/ (__  )    
   /_____/\__,_/\__, /_/ /_/\____/____/  
               /____/                       

Options used:
   --mesh data/square01_quad.mesh
   --refine-serial 3
   --refine-parallel 0
   --problem 0
   --order-kinematic 2
   --order-thermo 1
   --ode-solver 4
   --t-final 0.75
   --cfl 0.5
   --cg-tol 1e-08
   --cg-max-steps 300
   --max-steps -1
   --partial-assembly
   --no-visualization
   --visualization-steps 5
   --no-visit
   --no-print
   --outputfilename results/Laghos
   --partition 111
Zones min/max: 256 256
Number of kinematic (position, velocity) dofs: 2178
Number of specific internal energy dofs: 1024
Repeating step 1
step     5,	t = 0.0419,	dt = 0.008389,	|e| = 49.5149494919
Repeating step 7
step    10,	t = 0.0789,	dt = 0.007131,	|e| = 49.5161128627
Repeating step 14
step    15,	t = 0.1124,	dt = 0.006061,	|e| = 49.5177753959
step    20,	t = 0.1427,	dt = 0.006061,	|e| = 49.5198024925
Repeating step 23
step    25,	t = 0.1703,	dt = 0.005152,	|e| = 49.5220105238
step    30,	t = 0.1960,	dt = 0.005152,	|e| = 49.5243916346
Repeating step 33
step    35,	t = 0.2195,	dt = 0.004379,	|e| = 49.5267998761
step    40,	t = 0.2414,	dt = 0.004379,	|e| = 49.5293004423
Repeating step 44
step    45,	t = 0.2619,	dt = 0.003722,	|e| = 49.5317784718
step    50,	t = 0.2806,	dt = 0.003722,	|e| = 49.5341809725
step    55,	t = 0.2992,	dt = 0.003722,	|e| = 49.5367381548
Repeating step 59
step    60,	t = 0.3167,	dt = 0.003164,	|e| = 49.5392442749
step    65,	t = 0.3325,	dt = 0.003164,	|e| = 49.5415947028
step    70,	t = 0.3483,	dt = 0.003164,	|e| = 49.5440447714
Repeating step 75
step    75,	t = 0.3637,	dt = 0.002689,	|e| = 49.5465122122
step    80,	t = 0.3771,	dt = 0.002689,	|e| = 49.5487380142
step    85,	t = 0.3905,	dt = 0.002689,	|e| = 49.5510287175
step    90,	t = 0.4040,	dt = 0.002689,	|e| = 49.5533891554
Repeating step 94
step    95,	t = 0.4166,	dt = 0.002286,	|e| = 49.5556691604
step   100,	t = 0.4281,	dt = 0.002286,	|e| = 49.5577980923
step   105,	t = 0.4395,	dt = 0.002286,	|e| = 49.5599883308
step   110,	t = 0.4509,	dt = 0.002286,	|e| = 49.5622452152
step   115,	t = 0.4624,	dt = 0.002286,	|e| = 49.5645931286
Repeating step 117
step   120,	t = 0.4724,	dt = 0.001943,	|e| = 49.5667451456
step   125,	t = 0.4821,	dt = 0.001943,	|e| = 49.5689048459
step   130,	t = 0.4918,	dt = 0.001943,	|e| = 49.5711623989
step   135,	t = 0.5016,	dt = 0.001943,	|e| = 49.5735320029
step   140,	t = 0.5113,	dt = 0.001943,	|e| = 49.5760335422
Repeating step 143
step   145,	t = 0.5201,	dt = 0.001652,	|e| = 49.5784372156
step   150,	t = 0.5284,	dt = 0.001652,	|e| = 49.5808038787
step   155,	t = 0.5366,	dt = 0.001652,	|e| = 49.5832994621
step   160,	t = 0.5449,	dt = 0.001652,	|e| = 49.5859386014
step   165,	t = 0.5532,	dt = 0.001652,	|e| = 49.5887330840
step   170,	t = 0.5614,	dt = 0.001652,	|e| = 49.5916942113
Repeating step 174
step   175,	t = 0.5692,	dt = 0.001404,	|e| = 49.5946319207
step   180,	t = 0.5762,	dt = 0.001404,	|e| = 49.5974264428
step   185,	t = 0.5832,	dt = 0.001404,	|e| = 49.6003578700
step   190,	t = 0.5902,	dt = 0.001404,	|e| = 49.6034325065
step   195,	t = 0.5973,	dt = 0.001404,	|e| = 49.6066528310
step   200,	t = 0.6043,	dt = 0.001404,	|e| = 49.6100156658
step   205,	t = 0.6113,	dt = 0.001404,	|e| = 49.6135120436
Repeating step 210
step   210,	t = 0.6181,	dt = 0.001193,	|e| = 49.6170328503
step   215,	t = 0.6241,	dt = 0.001193,	|e| = 49.6202288539
step   220,	t = 0.6300,	dt = 0.001193,	|e| = 49.6235256499
step   225,	t = 0.6360,	dt = 0.001193,	|e| = 49.6269117345
step   230,	t = 0.6420,	dt = 0.001193,	|e| = 49.6303789972
step   235,	t = 0.6479,	dt = 0.001193,	|e| = 49.6339237950
step   240,	t = 0.6539,	dt = 0.001193,	|e| = 49.6375377515
step   245,	t = 0.6599,	dt = 0.001193,	|e| = 49.6412152823
step   250,	t = 0.6658,	dt = 0.001193,	|e| = 49.6449452229
Repeating step 251
step   255,	t = 0.6709,	dt = 0.001014,	|e| = 49.6481459027
step   260,	t = 0.6760,	dt = 0.001014,	|e| = 49.6513683314
step   265,	t = 0.6810,	dt = 0.001014,	|e| = 49.6546071849
step   270,	t = 0.6861,	dt = 0.001014,	|e| = 49.6578542919
step   275,	t = 0.6912,	dt = 0.001014,	|e| = 49.6611021143
step   280,	t = 0.6963,	dt = 0.001014,	|e| = 49.6643387071
step   285,	t = 0.7013,	dt = 0.001014,	|e| = 49.6675524246
step   290,	t = 0.7064,	dt = 0.001014,	|e| = 49.6707385505
step   295,	t = 0.7115,	dt = 0.001014,	|e| = 49.6738866402
step   300,	t = 0.7165,	dt = 0.001014,	|e| = 49.6769911272
Repeating step 301
step   305,	t = 0.7209,	dt = 0.000862,	|e| = 49.6795843556
step   310,	t = 0.7252,	dt = 0.000862,	|e| = 49.6821283607
step   315,	t = 0.7295,	dt = 0.000862,	|e| = 49.6846152846
step   320,	t = 0.7338,	dt = 0.000862,	|e| = 49.6870395550
step   325,	t = 0.7381,	dt = 0.000862,	|e| = 49.6893989005
step   330,	t = 0.7424,	dt = 0.000862,	|e| = 49.6916916363
step   335,	t = 0.7467,	dt = 0.000862,	|e| = 49.6939068701
step   339,	t = 0.7500,	dt = 0.000702,	|e| = 49.6955373491

CG (H1) total time: 9.7055324290
CG (H1) rate (megadofs x cg_iterations / second): 4.4836736489

CG (L2) total time: 1.2748883630
CG (L2) rate (megadofs x cg_iterations / second): 3.4116116565

Forces total time: 1.3472846380
Forces rate (megadofs x timesteps / second): 3.3653111392

UpdateQuadData total time: 5.2018385210
UpdateQuadData rate (megaquads x timesteps / second): 1.1275767166

Major kernels total time (seconds): 16.2546555880
Major kernels total rate (megadofs x time steps / second): 0.1897332111

Energy  diff: 1.97e-07
L_inf  error: 2.78e-02
L_1    error: 6.10e-03
L_2    error: 7.37e-03
*** glibc detected *** ./laghos_error: double free or corruption (!prev): 0x0000000000af1820 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x790e8)[0x2aaaac3510e8]
/lib64/libc.so.6(cfree+0x6c)[0x2aaaac35618c]
./laghos_error[0x485529]
./laghos_error[0x53cfab]
./laghos_error[0x416b53]
./laghos_error[0x4097c9]
/lib64/libc.so.6(__libc_start_main+0xe6)[0x2aaaac2f6c36]
./laghos_error[0x40de9d]
======= Memory map: ========
00400000-007fa000 r-xp 00000000 00:0e 800493                             /var/opt/cray/alps/spool/19089102/laghos_error
009f9000-009fa000 r-xp 003f9000 00:0e 800493                             /var/opt/cray/alps/spool/19089102/laghos_error
009fa000-009fc000 rwxp 003fa000 00:0e 800493                             /var/opt/cray/alps/spool/19089102/laghos_error
009fc000-00a3a000 rwxp 00000000 00:00 0                                  [heap]
00a3a000-00be3000 rwxp 00000000 00:00 0                                  [heap]
2aaaaaaab000-2aaaaaaca000 r-xp 00000000 00:1e 31589004                   /lib64/ld-2.11.3.so
2aaaaaaca000-2aaaaaacb000 r-xp 00000000 00:00 0                          [vdso]
2aaaaaacb000-2aaaaaacc000 rwxp 00000000 00:00 0 
2aaaaaacc000-2aaaaaad8000 rwxs 00000000 00:04 844043                     /dev/zero (deleted)
2aaaaacca000-2aaaaaccb000 r-xp 0001f000 00:1e 31589004                   /lib64/ld-2.11.3.so
2aaaaaccb000-2aaaaaccd000 rwxp 00020000 00:1e 31589004                   /lib64/ld-2.11.3.so
2aaaaaccd000-2aaaaacd5000 r-xp 00000000 00:1e 30860254                   /lib64/librt-2.11.3.so
2aaaaacd5000-2aaaaaed4000 ---p 00008000 00:1e 30860254                   /lib64/librt-2.11.3.so
2aaaaaed4000-2aaaaaed5000 r-xp 00007000 00:1e 30860254                   /lib64/librt-2.11.3.so
2aaaaaed5000-2aaaaaed6000 rwxp 00008000 00:1e 30860254                   /lib64/librt-2.11.3.so
2aaaaaed6000-2aaaaaed7000 rwxp 00000000 00:00 0 
2aaaaaed7000-2aaaaaeda000 r-xp 00000000 00:1e 17687334                   /opt/cray/atp/2.1.1/libAppDebug/libAtpSigHandler.so.0.0.0
2aaaaaeda000-2aaaab0d9000 ---p 00003000 00:1e 17687334                   /opt/cray/atp/2.1.1/libAppDebug/libAtpSigHandler.so.0.0.0
2aaaab0d9000-2aaaab0da000 r-xp 00002000 00:1e 17687334                   /opt/cray/atp/2.1.1/libAppDebug/libAtpSigHandler.so.0.0.0
2aaaab0da000-2aaaab0db000 rwxp 00003000 00:1e 17687334                   /opt/cray/atp/2.1.1/libAppDebug/libAtpSigHandler.so.0.0.0
2aaaab0db000-2aaaab0e1000 rwxp 00000000 00:00 0 
2aaaab0e1000-2aaaab0e4000 r-xp 00000000 00:1e 18164156                   /opt/cray/rca/1.0.0-2.0502.60530.1.63.gem/lib64/librca.so.0.0.0
2aaaab0e4000-2aaaab2e3000 ---p 00003000 00:1e 18164156                   /opt/cray/rca/1.0.0-2.0502.60530.1.63.gem/lib64/librca.so.0.0.0
2aaaab2e3000-2aaaab2e4000 r-xp 00002000 00:1e 18164156                   /opt/cray/rca/1.0.0-2.0502.60530.1.63.gem/lib64/librca.so.0.0.0
2aaaab2e4000-2aaaab2e5000 rwxp 00003000 00:1e 18164156                   /opt/cray/rca/1.0.0-2.0502.60530.1.63.gem/lib64/librca.so.0.0.0
2aaaab2e5000-2aaaab65b000 r-xp 00000000 00:1e 17295358                   /opt/cray/mpt/7.6.3/gni/mpich-gnu/5.1/lib/libmpich_gnu_51.so.3.0.1
2aaaab65b000-2aaaab85b000 ---p 00376000 00:1e 17295358                   /opt/cray/mpt/7.6.3/gni/mpich-gnu/5.1/lib/libmpich_gnu_51.so.3.0.1
2aaaab85b000-2aaaab86c000 r-xp 00376000 00:1e 17295358                   /opt/cray/mpt/7.6.3/gni/mpich-gnu/5.1/lib/libmpich_gnu_51.so.3.0.1
2aaaab86c000-2aaaab873000 rwxp 00387000 00:1e 17295358                   /opt/cray/mpt/7.6.3/gni/mpich-gnu/5.1/lib/libmpich_gnu_51.so.3.0.1
2aaaab873000-2aaaab8a1000 rwxp 00000000 00:00 0 
2aaaab8a1000-2aaaab8fc000 r-xp 00000000 00:1e 30860807                   /lib64/libm-2.11.3.so
2aaaab8fc000-2aaaabafb000 ---p 0005b000 00:1e 30860807                   /lib64/libm-2.11.3.so
2aaaabafb000-2aaaabafc000 r-xp 0005a000 00:1e 30860807                   /lib64/libm-2.11.3.so
2aaaabafc000-2aaaabb1a000 rwxp 0005b000 00:1e 30860807                   /lib64/libm-2.11.3.so
2aaaabb1a000-2aaaabb31000 r-xp 00000000 00:1e 30860247                   /lib64/libpthread-2.11.3.so
2aaaabb31000-2aaaabd31000 ---p 00017000 00:1e 30860247                   /lib64/libpthread-2.11.3.so
2aaaabd31000-2aaaabd32000 r-xp 00017000 00:1e 30860247                   /lib64/libpthread-2.11.3.so
2aaaabd32000-2aaaabd33000 rwxp 00018000 00:1e 30860247                   /lib64/libpthread-2.11.3.so
2aaaabd33000-2aaaabd37000 rwxp 00000000 00:00 0 
2aaaabd37000-2aaaabeb1000 r-xp 00000000 00:1e 224448                     /opt/gcc/6.3.0/snos/lib64/libstdc++.so.6.0.22
2aaaabeb1000-2aaaac0b0000 ---p 0017a000 00:1e 224448                     /opt/gcc/6.3.0/snos/lib64/libstdc++.so.6.0.22
2aaaac0b0000-2aaaac0ba000 r-xp 00179000 00:1e 224448                     /opt/gcc/6.3.0/snos/lib64/libstdc++.so.6.0.22
2aaaac0ba000-2aaaac0bc000 rwxp 00183000 00:1e 224448                     /opt/gcc/6.3.0/snos/lib64/libstdc++.so.6.0.22
2aaaac0bc000-2aaaac0c1000 rwxp 00000000 00:00 0 
2aaaac0c1000-2aaaac0d7000 r-xp 00000000 00:1e 224229                     /opt/gcc/6.3.0/snos/lib64/libgcc_s.so.1
2aaaac0d7000-2aaaac2d6000 ---p 00016000 00:1e 224229                     /opt/gcc/6.3.0/snos/lib64/libgcc_s.so.1
2aaaac2d6000-2aaaac2d7000 r-xp 00015000 00:1e 224229                     /opt/gcc/6.3.0/snos/lib64/libgcc_s.so.1
2aaaac2d7000-2aaaac2d8000 rwxp 00016000 00:1e 224229                     /opt/gcc/6.3.0/snos/lib64/libgcc_s.so.1
2aaaac2d8000-2aaaac44a000 r-xp 00000000 00:1e 30860153                   /lib64/libc-2.11.3.so
2aaaac44a000-2aaaac64a000 ---p 00172000 00:1e 30860153                   /lib64/libc-2.11.3.so
2aaaac64a000-2aaaac64e000 r-xp 00172000 00:1e 30860153                   /lib64/libc-2.11.3.so
2aaaac64e000-2aaaac64f000 rwxp 00176000 00:1e 30860153                   /lib64/libc-2.11.3.so
2aaaac64f000-2aaaac654000 rwxp 00000000 00:00 0 
2aaaac654000-2aaaac655000 r-xp 00000000 00:1e 13738189                   /opt/cray/xpmem/0.1-2.0502.64982.7.19.gem/lib64/libxpmem.so.0.0.0
2aaaac655000-2aaaac855000 ---p 00001000 00:1e 13738189                   /opt/cray/xpmem/0.1-2.0502.64982.7.19.gem/lib64/libxpmem.so.0.0.0
2aaaac855000-2aaaac856000 r-xp 00001000 00:1e 13738189                   /opt/cray/xpmem/0.1-2.0502.64982.7.19.gem/lib64/libxpmem.so.0.0.0
2aaaac856000-2aaaac857000 rwxp 00002000 00:1e 13738189                   /opt/cray/xpmem/0.1-2.0502.64982.7.19.gem/lib64/libxpmem.so.0.0.0
2aaaac857000-2aaaac858000 rwxp 00000000 00:00 0 
2aaaac858000-2aaaac8ac000 r-xp 00000000 00:1e 18164527                   /opt/cray/ugni/6.0-1.0502.10863.8.28.gem/lib64/libugni.so.0.6.0
2aaaac8ac000-2aaaacaab000 ---p 00054000 00:1e 18164527                   /opt/cray/ugni/6.0-1.0502.10863.8.28.gem/lib64/libugni.so.0.6.0
2aaaacaab000-2aaaacaac000 r-xp 00053000 00:1e 18164527                   /opt/cray/ugni/6.0-1.0502.10863.8.28.gem/lib64/libugni.so.0.6.0
2aaaacaac000-2aaaacaad000 rwxp 00054000 00:1e 18164527                   /opt/cray/ugni/6.0-1.0502.10863.8.28.gem/lib64/libugni.so.0.6.0
2aaaacaad000-2aaaacab5000 r-xp 00000000 00:1e 18164804                   /opt/cray/udreg/2.3.2-1.0502.10518.2.17.gem/lib64/libudreg.so.0.2.3
2aaaacab5000-2aaaaccb4000 ---p 00008000 00:1e 18164804                   /opt/cray/udreg/2.3.2-1.0502.10518.2.17.gem/lib64/libudreg.so.0.2.3
2aaaaccb4000-2aaaaccb5000 r-xp 00007000 00:1e 18164804                   /opt/cray/udreg/2.3.2-1.0502.10518.2.17.gem/lib64/libudreg.so.0.2.3
2aaaaccb5000-2aaaaccb6000 rwxp 00008000 00:1e 18164804                   /opt/cray/udreg/2.3.2-1.0502.10518.2.17.gem/lib64/libudreg.so.0.2.3
2aaaaccb6000-2aaaacce9000 r-xp 00000000 00:1e 17309877                   /opt/cray/pmi/5.0.12/lib64/libpmi.so.0.5.0
2aaaacce9000-2aaaacee8000 ---p 00033000 00:1e 17309877                   /opt/cray/pmi/5.0.12/lib64/libpmi.so.0.5.0
2aaaacee8000-2aaaacee9000 r-xp 00032000 00:1e 17309877                   /opt/cray/pmi/5.0.12/lib64/libpmi.so.0.5.0
2aaaacee9000-2aaaaceea000 rwxp 00033000 00:1e 17309877                   /opt/cray/pmi/5.0.12/lib64/libpmi.so.0.5.0
2aaaaceea000-2aaaacefd000 rwxp 00000000 00:00 0 
2aaaacefd000-2aaaad023000 r-xp 00000000 00:1e 224234                     /opt/gcc/6.3.0/snos/lib64/libgfortran.so.3.0.0
2aaaad023000-2aaaad223000 ---p 00126000 00:1e 224234                     /opt/gcc/6.3.0/snos/lib64/libgfortran.so.3.0.0
2aaaad223000-2aaaad224000 r-xp 00126000 00:1e 224234                     /opt/gcc/6.3.0/snos/lib64/libgfortran.so.3.0.0
2aaaad224000-2aaaad226000 rwxp 00127000 00:1e 224234                     /opt/gcc/6.3.0/snos/lib64/libgfortran.so.3.0.0
2aaaad226000-2aaaad265000 r-xp 00000000 00:1e 224263                     /opt/gcc/6.3.0/snos/lib64/libquadmath.so.0.0.0
2aaaad265000-2aaaad464000 ---p 0003f000 00:1e 224263                     /opt/gcc/6.3.0/snos/lib64/libquadmath.so.0.0.0
2aaaad464000-2aaaad465000 r-xp 0003e000 00:1e 224263                     /opt/gcc/6.3.0/snos/lib64/libquadmath.so.0.0.0
2aaaad465000-2aaaad466000 rwxp 0003f000 00:1e 224263                     /opt/gcc/6.3.0/snos/lib64/libquadmath.so.0.0.0
2aaaad466000-2aaaad467000 rwxp 00000000 00:00 0 
2aaaad467000-2aaaad469000 r-xp 00000000 00:1e 30860179                   /lib64/libdl-2.11.3.so
2aaaad469000-2aaaad669000 ---p 00002000 00:1e 30860179                   /lib64/libdl-2.11.3.so
2aaaad669000-2aaaad66a000 r-xp 00002000 00:1e 30860179                   /lib64/libdl-2.11.3.so
2aaaad66a000-2aaaad66b000 rwxp 00003000 00:1e 30860179                   /lib64/libdl-2.11.3.so
2aaaad66b000-2aaaad670000 rwxp 00000000 00:00 0 
2aaaad670000-2aaaad672000 r-xp 00000000 00:1e 20169207                   /opt/cray/alps/5.2.4-2.0502.9950.37.1.gem/lib64/libalpsutil.so.0.0.0
2aaaad672000-2aaaad871000 ---p 00002000 00:1e 20169207                   /opt/cray/alps/5.2.4-2.0502.9950.37.1.gem/lib64/libalpsutil.so.0.0.0
2aaaad871000-2aaaad872000 r-xp 00001000 00:1e 20169207                   /opt/cray/alps/5.2.4-2.0502.9950.37.1.gem/lib64/libalpsutil.so.0.0.0
2aaaad872000-2aaaad873000 rwxp 00002000 00:1e 20169207                   /opt/cray/alps/5.2.4-2.0502.9950.37.1.gem/lib64/libalpsutil.so.0.0.0
2aaaad873000-2aaaad877000 r-xp 00000000 00:1e 20169205                   /opt/cray/alps/5.2.4-2.0502.9950.37.1.gem/lib64/libalpslli.so.0.0.0
2aaaad877000-2aaaada76000 ---p 00004000 00:1e 20169205                   /opt/cray/alps/5.2.4-2.0502.9950.37.1.gem/lib64/libalpslli.so.0.0.0
2aaaada76000-2aaaada77000 r-xp 00003000 00:1e 20169205                   /opt/cray/alps/5.2.4-2.0502.9950.37.1.gem/lib64/libalpslli.so.0.0.0
2aaaada77000-2aaaada78000 rwxp 00004000 00:1e 20169205                   /opt/cray/alps/5.2.4-2.0502.9950.37.1.gem/lib64/libalpslli.so.0.0.0
2aaaada78000-2aaaada79000 r-xp 00000000 00:1e 18164344                   /opt/cray/wlm_detect/1.0-1.0502.64649.2.2.gem/lib64/libwlm_detect.so.0.0.0
2aaaada79000-2aaaadc78000 ---p 00001000 00:1e 18164344                   /opt/cray/wlm_detect/1.0-1.0502.64649.2.2.gem/lib64/libwlm_detect.so.0.0.0
2aaaadc78000-2aaaadc79000 r-xp 00000000 00:1e 18164344                   /opt/cray/wlm_detect/1.0-1.0502.64649.2.2.gem/lib64/libwlm_detect.so.0.0.0
2aaaadc79000-2aaaadc7a000 rwxp 00001000 00:1e 18164344                   /opt/cray/wlm_detect/1.0-1.0502.64649.2.2.gem/lib64/libwlm_detect.so.0.0.0
2aaaadc7a000-2aaaae0a8000 rwxp 00000000 00:00 0 
2aaab0000000-2aaab0021000 rwxp 00000000 00:00 0 
2aaab0021000-2aaab4000000 ---p 00000000 00:00 0 
7fffffed6000-7ffffffff000 rwxp 00000000 00:00 0                          [stack]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
_pmiu_daemon(SIGCHLD): [NID 08132] [c20-7c1s2n2] [Wed Oct 31 11:44:52 2018] PE RANK 0 exit signal Aborted
Application 19089102 exit codes: 134
Application 19089102 resources: utime ~24s, stime ~1s, Rss ~26356, inblocks ~18227, outblocks ~50239

GPU run of verification result #5 does not match values in table

In the README.md, there is a table of expected values for different MPI runs. After the table, it also gives commands that are expected to work with GPU support and return the same expected values. With my own testing against Nvidia and AMD GPUs, the verification results match except for run 5.

Here are my results for CPU run 5:

vers@ss10grizzlypeak001:~> srun -n8 -p workqss10 ${LAGHOS_ROOT}/bin/laghos -p 2 -dim 1 -rs 5 -tf 0.2 -fa
<snip>
step   410,	t = 0.1986,	dt = 0.000470,	|e| = 3.1987264362e+01
step   413,	t = 0.2000,	dt = 0.000470,	|e| = 3.2012077410e+01

CG (H1) total time: 0.1181249440
CG (H1) rate (megadofs x cg_iterations / second): 18.3783452206

CG (L2) total time: 0.0031617820
CG (L2) rate (megadofs x cg_iterations / second): 34.0061395757

Forces total time: 0.0076223940
Forces rate (megadofs x timesteps / second): 56.6436214134

UpdateQuadData total time: 0.0268307880
UpdateQuadData rate (megaquads x timesteps / second): 16.1056768068

Major kernels total time (seconds): 0.1513464110
Major kernels total rate (megadofs x time steps / second): 20.0522032861

Energy  diff: 2.78e-06

This is what I see when I run on a node with an Nvidia A100:

vers@ss10grizzlypeak001:~> ${LAGHOS_GPU_ROOT}/bin/laghos -p 2 -dim 1 -rs 5 -tf 0.20 -fa
<snip>
step   135,	t = 0.1981,	dt = 0.001467,	|e| = 2.8284271247e+01
step   137,	t = 0.2000,	dt = 0.000459,	|e| = 2.8284271247e+01

CG (H1) total time: 0.3403743400
CG (H1) rate (megadofs x cg_iterations / second): 2.2845787964

CG (L2) total time: 0.0077113670
CG (L2) rate (megadofs x cg_iterations / second): 4.5480911491

Forces total time: 0.0142958950
Forces rate (megadofs x timesteps / second): 9.8514993290

UpdateQuadData total time: 0.0554843500
UpdateQuadData rate (megaquads x timesteps / second): 2.5330385956

Major kernels total time (seconds): 0.4101545850
Major kernels total rate (megadofs x time steps / second): 2.5819338336

Energy  diff: 0.00e+00

If I run with my binary built against ROCm 5.4.1 for an AMD Mi250, I get the same results as I did for the Nvidia GPU. And for both GPUs, all of the other GPU verification results do match the values in the table.

To me, it seems like the wrong command has been listed for GPU verification run 5: I find it hard to believe that a genuine GPU problem would leave the other 7 listed results matching.

I built Laghos, matching tag 'v3.1', MFEM matching 'v4.5.2', metis 5.1.0, and hypre 'v2.28.0'.
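
The comparison described in this report can be automated. The sketch below is a minimal, hypothetical helper (the names `final_energy` and `energies_match` are not part of Laghos) that reads the `|e|` value from the last `step` line of a captured log and compares two runs with a relative tolerance. It assumes the log format shown in the output above; the tolerance is a free parameter, not a value taken from the README.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <sstream>
#include <string>

// Hypothetical helper: extract |e| from the last "step ..." line of a log.
// Returns false when the log contains no such line.
bool final_energy(const std::string& log, double& e_out)
{
    bool found = false;
    std::istringstream in(log);
    std::string line;
    while (std::getline(in, line))
    {
        if (line.rfind("step", 0) != 0) { continue; }  // not a step line
        const std::size_t pos = line.find("|e| =");
        if (pos == std::string::npos) { continue; }
        double e = 0.0;
        if (std::sscanf(line.c_str() + pos, "|e| = %lf", &e) == 1)
        {
            e_out = e;   // keep overwriting so the last step line wins
            found = true;
        }
    }
    return found;
}

// True when two energies agree to within a relative tolerance.
bool energies_match(double a, double b, double rel_tol)
{
    assert(rel_tol >= 0.0);
    const double scale = std::fmax(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= rel_tol * scale;
}
```

Fed the two logs above, the helper would pick up 3.2012077410e+01 for the CPU run and 2.8284271247e+01 for the GPU run, which differ by far more than any round-off tolerance. The GPU value is also identical at steps 135 and 137, which fits the reported energy diff of 0.00e+00.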

Cylindrical co-ordinate implementation with FA

Dear All,

I have implemented cylindrical coordinates with FA. The change also includes some modifications to the MFEM/fem/bilininteg.cpp and MFEM/fem/bilininteg.hpp files. Please find the files attached; could someone comment on whether it is worth incorporating into the actual code?

laghosRZ.zip
MFEM_RZ.zip

Regards
--Sijoy C. D.

CUDA-awareness question

Hello,
Yesterday I was able to build and run the Laghos-CUDA version and got the following output:

       __                __                 
      / /   ____  ____  / /_  ____  _____   
     / /   / __ `/ __ `/ __ \/ __ \/ ___/ 
    / /___/ /_/ / /_/ / / / / /_/ (__  )    
   /_____/\__,_/\__, /_/ /_/\____/____/  
               /____/                       

Options used:
   --mesh ../data/square01_quad.mesh
   --refine-serial 3
   --refine-parallel 0
   --problem 0
   --order-kinematic 2
   --order-thermo 1
   --ode-solver 4
   --t-final 0.75
   --cfl 0.5
   --cg-tol 1e-08
   --cg-max-steps 300
   --max-steps -1
   --partial-assembly
   --no-visualization
   --visualization-steps 5
   --no-visit
   --no-print
   --outputfilename results/Laghos
   --no-uvm
   --no-aware
   --no-hcpo
   --no-sync
   --no-share
�[32m[laghos] MPI is �[31;1mNOT�[32m CUDA aware�[m
�[32m[laghos] CUDA device count: 1�[m
�[32m[laghos] Rank_0 => Device_0 (Tesla K80:sm_3.7)�[m
�[32m[laghos] �[32;1mCartesian�[m�[32m partitioning will be used�[m
�[32m[laghos] pmesh->GetNE()=256�[m
Number of kinematic (position, velocity) dofs: 2178
Number of specific internal energy dofs: 1024

Repeating step 1
step     5,	t = 0.0419,	dt = 0.008389,	|e| = 49.5149494918
Repeating step 7
step    10,	t = 0.0789,	dt = 0.007131,	|e| = 49.5161128627
Repeating step 14
step    15,	t = 0.1124,	dt = 0.006061,	|e| = 49.5177753959
step    20,	t = 0.1427,	dt = 0.006061,	|e| = 49.5198024925
Repeating step 23
step    25,	t = 0.1703,	dt = 0.005152,	|e| = 49.5220105237
step    30,	t = 0.1960,	dt = 0.005152,	|e| = 49.5243916346
Repeating step 33
... (lines erased for clarity purposes)
step   320,	t = 0.7396,	dt = 0.000862,	|e| = 49.6902277095
step   325,	t = 0.7440,	dt = 0.000862,	|e| = 49.6924934237
step   330,	t = 0.7483,	dt = 0.000862,	|e| = 49.6946809472
step   333,	t = 0.7500,	dt = 0.000008,	|e| = 49.6955373330

CG (H1) total time: 16.3832961330
CG (H1) rate (megadofs=2178 x cg_iterations=92304 / second): 12.2709197446

CG (L2) total time: 0.6757104250
CG (L2) rate (megadofs x cg_iterations / second): 6.3284860523

Forces total time: 0.0423018780
Forces rate (megadofs x timesteps / second): 105.3661021858

UpdateQuadData total time: 0.6663413440
UpdateQuadData rate (megaquads x timesteps / second): 8.6549754896

Major kernels total time (seconds): 17.0919393550
Major kernels total rate (megadofs x time steps / second): 0.1773804562

I didn't give it too much thought yesterday, but it seems to indicate that MPI is not CUDA-aware (a suggestion would be to use an output format that could also be more easily read with a standard text editor). Anyhow, this was on a node with one GPU, and the results seemed to be within the admissible error. The problem arises when I try to run on two nodes (one GPU per node): the code quickly stops and reports:

       __                __                 
      / /   ____  ____  / /_  ____  _____   
     / /   / __ `/ __ `/ __ \/ __ \/ ___/ 
    / /___/ /_/ / /_/ / / / / /_/ (__  )    
   /_____/\__,_/\__, /_/ /_/\____/____/  
               /____/                       

Options used:
   --mesh ../data/square01_quad.mesh
   --refine-serial 3
   --refine-parallel 0
   --problem 0
   --order-kinematic 2
   --order-thermo 1
   --ode-solver 4
   --t-final 0.75
   --cfl 0.5
   --cg-tol 1e-08
   --cg-max-steps 300
   --max-steps -1
   --partial-assembly
   --no-visualization
   --visualization-steps 5
   --no-visit
   --no-print
   --outputfilename results/Laghos
   --no-uvm
   --no-aware
   --no-hcpo
   --no-sync
   --no-share
�[32m[laghos] MPI is �[31;1mNOT�[32m CUDA aware�[m
�[32m[laghos] CUDA device count: 1�[m
�[32m[laghos] Rank_1 => Device_0 (Tesla K80:sm_3.7)�[m
�[32m[laghos] Rank_0 => Device_0 (Tesla K80:sm_3.7)�[m

The system is built with OpenMPI 4.0.2 and reports CUDA-awareness as shown in the snapshot. Any suggestions on what might be missing or why Laghos might be reporting that MPI is not CUDA-aware? Thank you.

R-Z coordinates with partial assembly

Hi,

I was trying to get an idea of what it would take to add an r-z coordinate system to laghos. From this paper, it looks like modifications need to happen to

det(J)(0)rho(0) (eq. 16)
The artificial viscosity
The force matrix (eq. 24)
Both mass matrices (eq. 20 and 23)

where the modifications appear to mostly be multiplications/divisions by r. What's the right way to get the coordinate information into the partial assembly kernels? I think the following code should be right, but I wanted to make sure I understood what's going on correctly:

const mfem::QuadratureInterpolator* coord_interpolator = H1.GetQuadratureInterpolator(integration_rule);
mfem::Vector coordsQ(NQ*NE*DIM); 
coord_interpolator->Values(coord_gridfunction, coordsQ);
auto X = mfem::Reshape(coordsQ.Read(), NQ, NE, DIM);

and then I could access the radius as double radius = X(quadrature_point, element, 0) at some quadrature point of some element.

Thanks,
Varchas
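
To make the indexing question concrete, here is a self-contained sketch that does not depend on MFEM. `View3D` is a hypothetical stand-in for a reshaped flat vector, indexed with the call operator `X(q, e, 0)` (a comma-separated subscript would not select a 3D entry in C++). The index order (quadrature point, element, component) simply follows the snippet in the question and is an assumption: the layout actually written by the interpolator depends on its configured output layout and should be checked against the MFEM documentation before use.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical column-major 3D view over a flat array: first index fastest.
struct View3D
{
    const double* data;
    int n0, n1, n2;

    double operator()(int i, int j, int k) const
    {
        assert(i >= 0 && i < n0 && j >= 0 && j < n1 && k >= 0 && k < n2);
        return data[i + n0 * (j + static_cast<std::size_t>(n1) * k)];
    }
};

// Multiply a per-quadrature-point quantity by the radius (component 0).
// Assumed layout of coords: (quadrature point, element, component).
std::vector<double> weight_by_radius(const std::vector<double>& coords,
                                     const std::vector<double>& values,
                                     int NQ, int NE, int DIM)
{
    assert(coords.size() == static_cast<std::size_t>(NQ) * NE * DIM);
    assert(values.size() == static_cast<std::size_t>(NQ) * NE);
    const View3D X{coords.data(), NQ, NE, DIM};
    std::vector<double> out(values.size());
    for (int e = 0; e < NE; ++e)
    {
        for (int q = 0; q < NQ; ++q)
        {
            const std::size_t idx = q + static_cast<std::size_t>(NQ) * e;
            const double radius = X(q, e, 0);
            out[idx] = radius * values[idx];
        }
    }
    return out;
}
```

If the interpolator stores components before elements, only the argument order passed to the view and the position of the component index would change; the loop body stays the same.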

"spack install laghos" on LLNL quartz with gcc/8.3.1 is failing

Hi: I am Mahesh Rajan, supporting the COE CI/CD effort.

I would like your help with building Laghos on LLNL Quartz.
My attempt today (3/22/2021) with "spack install laghos" fails with the following error messages:
cd .; /p/lustre1/mrajan/spack/opt/spack/linux-rhel7-broadwell/gcc-8.3.1/openmpi-4.0.5-6nhmalbbnbdxc2qmun5yztjoud3e7vnf/bin/mpic++ -O3 -std=c++11 -I/p/lustre1/mrajan/spack/opt/spack/linux-rhel7-broadwell/gcc-8.3.1/mfem-4.2.0-6eq367cxc55rdkuh6xtg4j7tmqbah4m2/include -I/p/lustre1/mrajan/spack/opt/spack/linux-rhel7-broadwell/gcc-8.3.1/hypre-2.20.0-xiuhkbvqpuwkkogpsrqxq4ggzo76re5w/include -I/p/lustre1/mrajan/spack/opt/spack/linux-rhel7-broadwell/gcc-8.3.1/metis-5.1.0-bdz23neyyvkgzqxjzpildqzfd2skem3v/include -I/p/lustre1/mrajan/spack/opt/spack/linux-rhel7-broadwell/gcc-8.3.1/zlib-1.2.11-c7zv5ftnxfe4h3237t42mshbrwd4ecv7/include -I/p/lustre1/mrajan/spack/opt/spack/linux-rhel7-broadwell/gcc-8.3.1/mfem-4.2.0-6eq367cxc55rdkuh6xtg4j7tmqbah4m2/include/mfem -c laghos_solver.cpp
laghos.cpp: In function 'int main(int, char**)':
laghos.cpp:525:39: error: no matching function for call to 'mfem::FunctionCoefficient::FunctionCoefficient()'
FunctionCoefficient mat_coeff(gamma);
^

FYI, my attempts to build Laghos with the instructions on GitHub have also been unsuccessful, both with module Intel/19.0.4 and with module gcc. If you have new instructions for a successful build (directly, not with spack), I would appreciate a tested build process that works on an LLNL CTS system like Quartz.
Thanks

Spack package

Spack package PR for Laghos thanks to @goxberry and @bhatele:

spack/spack#5956

A quick question

Hi all,

I was wondering why Laghos negates the force: rhs.Neg() produces -F in SolveVelocity.
I see no reason to do that, because F = ma. However, if I take it out, the code does not run correctly,
so there must be a reason for the negative sign.
Does anyone know what it is?

Best,

Sungho
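
One possible reading of the sign, sketched as a short derivation. It assumes that F in SolveVelocity is the force matrix of the semi-discrete scheme, with w_i the kinematic basis functions, \phi_j the thermodynamic basis functions (summing to one), and the boundary term vanishing under the boundary conditions. None of this is confirmed in the thread, so treat it as a hedged sketch and not as the maintainers' answer.

```latex
% Continuum momentum balance, with \sigma the total stress
% (\sigma = -p I plus artificial viscosity):
\rho \frac{dv}{dt} = \nabla \cdot \sigma

% Test with a kinematic basis function w_i and integrate by parts
% (boundary term assumed to vanish):
\int_{\Omega(t)} \rho \frac{dv}{dt} \cdot w_i
  = \int_{\Omega(t)} (\nabla \cdot \sigma) \cdot w_i
  = - \int_{\Omega(t)} \sigma : \nabla w_i

% With F_{ij} = \int_{\Omega(t)} (\sigma : \nabla w_i) \, \phi_j
% and \sum_j \phi_j = 1, the right-hand side is -(F \cdot \mathbf{1})_i:
M_v \frac{dv}{dt} = - F \cdot \mathbf{1}
```

Under these assumptions the matrix F is not the physical force itself: the minus sign comes from the integration by parts, so Newton's law still holds with -F·1 playing the role of the force on the velocity degrees of freedom.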

Question about performance (mainly cuda code)

Hello,
I'm performing some tests (problem 1) with the MPI versions but the results are not as expected. Increasing the number of MPI ranks does not change the simulation parameters (e.g. dofs or cg iterations) as far as I can tell, but the job is split into the number of ranks. Then, the computational times multiply (in some cases roughly by the number of ranks) and the rates decrease. I was wondering if this is the expected behavior or if the problem size should increase according to the number of ranks in which case I'd be missing something.
Thanks.
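
The observation that the dof counts do not change with the number of ranks can be checked with a small count. The sketch below assumes a uniform mesh with n zones per side, continuous kinematic elements and discontinuous thermodynamic elements; the function names are hypothetical. For the 256-zone runs logged elsewhere on this page (taken here as 16 zones per side, kinematic order 2, thermodynamic order 1) it reproduces the reported 2178 and 1024 dofs.

```cpp
#include <cassert>

// Integer power for small non-negative exponents.
long ipow(long base, int exp)
{
    assert(exp >= 0);
    long r = 1;
    for (int i = 0; i < exp; ++i) { r *= base; }
    return r;
}

// Continuous vector dofs on a uniform mesh with n zones per side:
// (order*n + 1)^dim nodes, each carrying dim components.
long kinematic_dofs(int n, int order, int dim)
{
    assert(n > 0 && order > 0 && dim > 0);
    return dim * ipow(static_cast<long>(order) * n + 1, dim);
}

// Discontinuous scalar dofs: (order + 1)^dim per zone, n^dim zones.
long thermo_dofs(int n, int order, int dim)
{
    assert(n > 0 && order >= 0 && dim > 0);
    return ipow(n, dim) * ipow(order + 1, dim);
}

// Average dofs per rank when one fixed mesh is split over more ranks
// (dofs shared on rank interfaces are ignored in this estimate).
double dofs_per_rank(long total_dofs, int ranks)
{
    assert(ranks > 0);
    return static_cast<double>(total_dofs) / ranks;
}
```

The totals depend only on the mesh, the refinement level and the element orders, so adding ranks to the same command line shrinks the work per rank instead of growing the problem. Keeping the per-rank size roughly constant would require more refinement (for example through the -rs or -rp options shown in the logs) as ranks are added.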

Question about Non-AMR and AMR version of Laghos

Hi all,

Laghos has a separate AMR version, and I was wondering if you have any specific reasons for that.
I got the impression that the AMR version is not updated as much as the non-AMR version.
For example, the AMR version does not provide RK2average time stepping method.

Best,

Sungho

Failing tests with Hypre-cuda and full assembly

With mfem master and hypre built with CUDA support, make tests show failing tests.

NC or NURBS meshes

Hi,

I am a little new to this, so apologies if these are naive questions. My understanding is that MFEM generally supports nonconforming or NURBS meshes. However, it looks like all the meshes used by laghos (either generated on the fly or in the data/ directory) are conforming and in an H1 basis.

With regards to non-conforming meshes - the experimental AMR version (which should be NC, right?) does not appear to support partial assembly. Is this a general issue with NC meshes (i.e. that they do not support partial assembly)?

With regards to NURBS meshes, while there doesn't seem to be any explicit usage of it right now, since it's just a different basis for the nodes grid function, it would seem that they should be compatible with laghos. Is this accurate, or are there any gotchas I ought to be aware of?

Thanks!

Energy growth in all OCCA modes

Hi guys,

I'm trying out Laghos with the new OCCA 1.0 on both a Titan V with CUDA and a Radeon Frontier with HIP. After sorting out some issues (fyi OCCA_PI is not defined in kernels/gpu/quadratureData.okl and kernels/cpu/quadratureData.okl) I managed to get it running. However, I'm finding for all test cases the energy |e| begins to increase dramatically after around 100 time-steps. I've observed this for Serial, OpenMP, CUDA, and HIP modes on both a Titan V and a Radeon Frontier. Each of these runs also gives slightly different outputs on which time-steps get repeated, what dt is, and what the energy is.

Any help tracking down the cause would be great!

Path in Visit, no curved meshes seen in Visit plots despite higher order calculations

The files to load in the Visit plotting software are written to the directory results/Laghos, which is specified in laghos.cpp through the character string "basename". However, when I try to open the files "Laghos_00**.mfem_root" in the results directory with Visit, the path detected by Visit conflicts with the path defined inside the "Laghos_00***.mfem_root" files. I got rid of this issue in a crude way by changing the value of "*basename" in laghos.cpp to "Laghos". The results dumped in the Laghos directory are then moved to the "results" directory, and Visit is opened directly in that directory to plot. This avoids the path conflict.

Another issue is that after running a higher-order simulation with a command such as "mpirun -np 4 laghos -p 1 -m data/square01_quad.mesh -rs 4 -tf 0.5 -ok 4 -ot 3 -visit", the mesh plots do not show higher-order (curved) elements in Visit. Is there something I am missing when running the code?

Thanks in advance for the help

Bernstein basis for kinematics

Hello,
I've tried to set the kinematic shape functions to Bernstein polynomials by making the following change in laghos.cpp:
H1_FECollection H1FEC(order_v, dim, BasisType::Positive);

However when I try to run e.g. Taylor-Green test, there is an error message:

mpirun -np 1 ../../laghos -p 0 -m ../../data/square01_quad.mesh -rs 3 -tf 0.5 -s 7 -vis

       __                __                 
      / /   ____  ____  / /_  ____  _____   
     / /   / __ `/ __ `/ __ \/ __ \/ ___/ 
    / /___/ /_/ / /_/ / / / / /_/ (__  )    
   /_____/\__,_/\__, /_/ /_/\____/____/  
               /____/                       

Options used:
   --mesh ../../data/square01_quad.mesh
   --refine-serial 3
   --refine-parallel 0
   --problem 0
   --order-kinematic 2
   --order-thermo 1
   --ode-solver 7
   --t-final 0.5
   --cfl 0.5
   --cg-tol 1e-08
   --cg-max-steps 300
   --max-steps -1
   --partial-assembly
   --visualization
   --visualization-steps 5
   --no-visit
   --no-print
   --outputfilename results/Laghos
   --partition 111
Zones min/max: 256 256
Number of kinematic (position, velocity) dofs: 2178
Number of specific internal energy dofs: 1024


FiniteElement::Project (...) (vector) is not overloaded !
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD 
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[1786,1],0]
  Exit code:    1

ResetTimeStepEstimate logic

I think it would be better to have ResetTimeStepEstimate() also set qdata_is_current to false. Currently, it only sets dt_est to infinity. Since ResetTimeStepEstimate() is called between ode_solver steps, qdata_is_current is already set to false by LagrangianHydroOperator::Mult, so there is no bug. However, if someone introduces code that calls UpdateQuadratureData, which is a public function, then it will set qdata_is_current to true, and the infinite dt_est will be used rather than recomputing it.
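
The concern can be illustrated with a stripped-down mock. This is only a sketch: `HydroOperatorMock` is a hypothetical stand-in, not the Laghos class. Only the names `dt_est`, `qdata_is_current`, `ResetTimeStepEstimate` and `UpdateQuadratureData` come from the description above, and the recompute logic is an assumption.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for the operator described above (not Laghos code).
class HydroOperatorMock
{
public:
   // Proposed behavior: reset the estimate AND invalidate the cached data,
   // so the next update is forced to recompute dt_est.
   void ResetTimeStepEstimate()
   {
      dt_est = std::numeric_limits<double>::infinity();
      qdata_is_current = false;
   }

   // Recomputes the quadrature data (and dt_est) only when the cache is stale.
   void UpdateQuadratureData(double new_dt_est)
   {
      if (qdata_is_current) { return; }
      dt_est = new_dt_est;   // placeholder for the real computation
      qdata_is_current = true;
   }

   double GetTimeStepEstimate() const { return dt_est; }

private:
   double dt_est = std::numeric_limits<double>::infinity();
   bool qdata_is_current = false;
};

// Scenario from the issue: update, reset, update again, then read dt_est.
inline double EstimateAfterResetAndUpdate()
{
   HydroOperatorMock op;
   op.UpdateQuadratureData(0.01);  // cache becomes current
   op.ResetTimeStepEstimate();     // estimate reset, cache invalidated
   op.UpdateQuadratureData(0.02);  // recomputed because the cache is stale
   return op.GetTimeStepEstimate();
}
```

If `ResetTimeStepEstimate()` left `qdata_is_current` untouched in this mock, the second update would return early and the infinite estimate would be read back.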

Excessive device memory wastage

Hi developers, our team has developed a GPU device memory profiling tool and found excessive device memory wastage in Laghos. For example, the vectors q_dt_est, q_e, e_vec, q_dx, and q_dv in class QUpdate are used only at a very early stage but are not released until the QUpdate object is destroyed. I tried some naive optimizations, such as explicitly releasing in advance objects that are used only early on yet stay alive for the whole application. This gives a good reduction in peak device memory (at least 35%). Could you please address these kinds of inefficiencies?
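
One way to picture the optimization described above is to release scratch storage as soon as its last use has passed, instead of waiting for the owning object's destructor. The sketch below is host-only and hypothetical: `ScratchBuffer` and `PeakBytes` are illustrative names, and plain heap arrays stand in for device allocations.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical scratch buffer; a heap array stands in for device memory.
class ScratchBuffer
{
public:
   explicit ScratchBuffer(std::size_t n) : data(new double[n]()), size(n) {}

   // Free the storage now rather than at destruction time.
   void Release() { data.reset(); size = 0; }

   std::size_t Bytes() const { return size * sizeof(double); }

private:
   std::unique_ptr<double[]> data;
   std::size_t size;
};

// Peak live bytes over two phases: 'early' is only needed in phase 1,
// 'late' is only needed in phase 2.
inline std::size_t PeakBytes(std::size_t n_early, std::size_t n_late,
                             bool release_early)
{
   ScratchBuffer early(n_early);
   std::size_t peak = early.Bytes();          // phase 1
   if (release_early) { early.Release(); }    // last use of 'early' is over
   ScratchBuffer late(n_late);
   peak = std::max(peak, early.Bytes() + late.Bytes());  // phase 2
   return peak;
}
```

With 100 early and 50 late entries, the peak drops from 150 to 100 doubles when the early buffer is released before the late one is allocated.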

Improve exception safety with smart pointers

Would you like to wrap any pointers with the class template “std::unique_ptr”?
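
As a concrete picture of what this suggestion could look like, the sketch below replaces a raw owning pointer with `std::unique_ptr`. `Solver`, `MakeSolver` and `LiveAfterException` are hypothetical names used only for illustration, not types or functions from Laghos or MFEM.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical resource type; counts live instances to make leaks visible.
struct Solver
{
   static int live;
   Solver() { ++live; }
   ~Solver() { --live; }
};
int Solver::live = 0;

// Owning pointer: the Solver is destroyed automatically, even on exceptions.
inline std::unique_ptr<Solver> MakeSolver()
{
   return std::unique_ptr<Solver>(new Solver);
}

// Returns the number of live Solver objects after an exception unwinds.
inline int LiveAfterException()
{
   try
   {
      std::unique_ptr<Solver> s = MakeSolver();
      throw std::runtime_error("setup failed");  // s is freed during unwinding
   }
   catch (const std::runtime_error &) {}
   return Solver::live;
}
```

With a raw `new` and a `delete` placed after the throwing statement, the same scenario would leave one `Solver` alive.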

Unknown kernel in partial assembly

Hi,

I have a quick question. I tested Laghos using high-order elements, but it terminated when I used elements of order higher than 4 and 3 for the H1 and L2 spaces, respectively, with partial assembly. It showed an error message saying "Unknown kernel".

Partial assembly is supposed to benefit high-order methods, but it seems not to be working in this case. This is strange to me.

Sungho

Something like this,
Unknown kernel 0x2c
MFEM abort: Unknown kernel
... in function: void mfem::hydrodynamics::QUpdate::UpdateQuadratureData(const mfem::Vector&, mfem::hydrodynamics::QuadratureData&)
... in file: laghos_solver.cpp:1327

L2 space for the energy variable discretization

We are trying to add higher-order conforming mesh adaptation to the Laghos problem using the PUMI framework. Is the discretization of the energy variable in the L2 space a requirement for the solver? Our approach so far has been to express the solution variables in the Bernstein basis to ease the solution transfer onto the curved adapted mesh. Are the piecewise-constant energy values from the previous time step used in a coupled fashion to solve for the next step? We would like to understand the necessity of the chosen discretization and the details of the time update step before moving forward.

Definition of Q1D

Hi,

I have a question about how the number of quadrature points is defined, since there are two definitions that I'm not sure are always the same.

If I understood it right, in Rho0DetJ0Vol, Q1D is defined by using the integration order for the problem, assuming it is used for a line segment, and then getting the number of quadrature points.

In LagrangianHydroOperator and I think all its members, Q1D is defined by int(floor(0.7 + pow(ir.GetNPoints(), 1.0 / dim))), which I'm not sure I understand. I'm guessing the N-root part is taking the number of quadrature points in the N-dimensional space and taking the N-root to get the effective number of quadrature points in 1D. What then is the purpose of adding 0.7 and taking the floor? Alternatively, why not use the same definition as in Rho0DetJ0Vol?

Thanks,
Varchas
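
One plausible reading of the `0.7` offset, offered here as an interpretation rather than a statement from the Laghos developers, is that it guards the integer conversion against floating-point round-off: `pow(N, 1.0 / dim)` can land slightly below the exact root, and a bare `floor` would then drop a point. The sketch assumes a tensor-product rule with `Q1D^dim` points in total; `Q1DFromTotal`, `TotalPoints` and `RoundTrips` are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>

// Expression quoted in the issue: tolerant of small round-off in pow().
inline int Q1DFromTotal(int npoints, int dim)
{
   return int(std::floor(0.7 + std::pow(npoints, 1.0 / dim)));
}

// Total number of points of a tensor-product rule with q1d points
// per direction.
inline int TotalPoints(int q1d, int dim)
{
   int n = 1;
   for (int d = 0; d < dim; d++) { n *= q1d; }
   return n;
}

// Checks that the expression recovers q1d for every q1d in [1, max_q1d].
inline bool RoundTrips(int dim, int max_q1d)
{
   for (int q = 1; q <= max_q1d; q++)
   {
      if (Q1DFromTotal(TotalPoints(q, dim), dim) != q) { return false; }
   }
   return true;
}
```

In exact arithmetic any offset in [0, 1) would give the same result; a value such as 0.7 simply leaves a margin on both sides of the exact root.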

multi-material implementation

    I am trying to implement multi-material hydrodynamics.
    I have a few questions and problems regarding its implementation in Laghos.

    1) Suppose a ParGridFunction, say "rhom", and a VectorFunctionCoefficient, say
       "rhom_coeff", are set for material densities, and similarly a ParGridFunction
       "eta" and a VectorFunctionCoefficient "eta_coeff" for material volume fractions.
       Is there any module/class/function in MFEM that can be used to obtain a derived
       ParGridFunction and GridFunctionCoefficient, say "rho" and "rho_coeff",
       in which rho = sum_m (eta * rhom)? Basically, I want to know how to obtain a
       derived ParGridFunction and GridFunctionCoefficient that is a sum and/or product
       of already defined ParGridFunctions and GridFunctionCoefficients.

    2) For the multi-material formulation, what would be the best or most efficient way
       of computing the force matrix F_k (for each material k) and F = sum_k F_k
       in AssembleForceMatrix() --> ForceIntegrator for the full assembly case?
       Is it necessary to call AssembleForceMatrix() once for F = sum_k F_k for the
       velocity update and several more times (once per k) for the F_k used in the
       material energy (ek) update? I want an efficient way in which
       AssembleForceMatrix() is called only once and both F = sum_k F_k and F_k (for each k)
       are computed and stored. Can anyone please suggest one?

    3) Also, the energy mass matrices should be a function of the material index "k".
       How can the energy mass matrix Me(i) and its inverse Me_inv(i) be made a function
       of the material index "k"? Is it efficient to define the index "i" --> "nzones * nmc()",
       where nmc is the total number of materials in a zone?

    4) Is there any simple template already available for the full assembly case to start with,
       or is there any plan to extend Laghos with this capability in the near future?
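
For question 1, the pointwise relation rho = sum_m (eta * rhom) can be sketched independently of MFEM. The code below is a plain C++ illustration with hypothetical names (`MixDensity`, per-material arrays indexed by point); it is not an MFEM class. Within MFEM, a custom class derived from `Coefficient` that overrides `Eval`, or the library's sum and product coefficient classes if your version provides them, may be a more direct route.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// eta[m][i]:  volume fraction of material m at point i (assumed layout).
// rhom[m][i]: density of material m at point i (assumed layout).
// Returns rho[i] = sum_m eta[m][i] * rhom[m][i].
inline std::vector<double> MixDensity(
   const std::vector<std::vector<double>> &eta,
   const std::vector<std::vector<double>> &rhom)
{
   assert(eta.size() == rhom.size());
   const std::size_t npoints = eta.empty() ? 0 : eta[0].size();
   std::vector<double> rho(npoints, 0.0);
   for (std::size_t m = 0; m < eta.size(); m++)
   {
      assert(eta[m].size() == npoints && rhom[m].size() == npoints);
      for (std::size_t i = 0; i < npoints; i++)
      {
         rho[i] += eta[m][i] * rhom[m][i];
      }
   }
   return rho;
}
```

The points here could be quadrature points or nodal values; which is appropriate depends on how the mixed density enters the assembly, and that choice is left open in this sketch.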

Errors in OpenCL mode

Hi, I'm encountering errors with Laghos using OCCA 1.0 in OpenCL mode. The command is:

./laghos -p 1 -m data/square01_quad.mesh -rs 3 -tf 0.8 -no-vis -d "mode: 'OpenCL', device_id: 0, platform_id: 0 "

Output on a Titan V:

       __                __                 
      / /   ____  ____  / /_  ____  _____   
     / /   / __ `/ __ `/ __ \/ __ \/ ___/ 
    / /___/ /_/ / /_/ / / / / /_/ (__  )    
   /_____/\__,_/\__, /_/ /_/\____/____/  
               /____/                       

Options used:
   --mesh data/square01_quad.mesh
   --refine-serial 3
   --refine-parallel 0
   --problem 1
   --order-thermo 1
   --order-kinematic 2
   --ode-solver 4
   --t-final 0.8
   --cfl 0.1
   --no-visualization
   --no-visit
   --visualization-steps 5
   --outputfilename Laghos
   --device-info mode: 'OpenCL', device_id: 0, platform_id: 0 
   --occa-config 
   --no-occa-verbose
   --no-timings
Number of kinematic (position, velocity) dofs: 2178
Number of specific internal energy dofs: 1024
Repeating step 1
Repeating step 2
Repeating step 2
pure virtual method called
terminate called without an active exception
[paripp1:85275] *** Process received signal ***
[paripp1:85275] Signal: Aborted (6)
[paripp1:85275] Signal code:  (-6)
[paripp1:85275] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fa889019390]
[paripp1:85275] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38)[0x7fa888c73428]
[paripp1:85275] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x16a)[0x7fa888c7502a]
[paripp1:85275] [ 3] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x16d)[0x7fa8897d384d]
[paripp1:85275] [ 4] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x8d6b6)[0x7fa8897d16b6]
[paripp1:85275] [ 5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x8d701)[0x7fa8897d1701]
[paripp1:85275] [ 6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x8e23f)[0x7fa8897d223f]
[paripp1:85275] [ 7] /home/nchalmer/laghos/occa/lib/libocca.so(_ZN4occa6device20removeDHandleRefFromERPNS_8device_vE+0x21)[0x7fa88a142721]
[paripp1:85275] [ 8] /home/nchalmer/laghos/occa/lib/libocca.so(_ZN4occa6kernel16removeKHandleRefEv+0x48)[0x7fa88a08f428]
[paripp1:85275] [ 9] ./laghos(_ZNSt8_Rb_treeINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt4pairIKS5_N4occa6kernelEESt10_Select1stISA_ESt4lessIS5_ESaISA_EE8_M_eraseEPSt13_Rb_tree_nodeISA_E+0x1ce)[0x60ed9e]
[paripp1:85275] [10] /home/nchalmer/laghos/occa/lib/libocca.so(_ZN4occa8device_vD1Ev+0x33)[0x7fa88a143753]
[paripp1:85275] [11] /home/nchalmer/laghos/occa/lib/libocca.so(_ZN4occa6opencl6deviceD0Ev+0x9)[0x7fa88a1b4f89]
[paripp1:85275] [12] /home/nchalmer/laghos/occa/lib/libocca.so(_ZN4occa6device20removeDHandleRefFromERPNS_8device_vE+0x2f)[0x7fa88a14272f]
[paripp1:85275] [13] /home/nchalmer/laghos/occa/lib/libocca.so(_ZN4occa6memory16removeMHandleRefEv+0x4b)[0x7fa88a16cb7b]
[paripp1:85275] [14] ./laghos[0x420d82]
[paripp1:85275] [15] ./laghos[0x419b4f]
[paripp1:85275] [16] ./laghos[0x411b61]
[paripp1:85275] [17] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7fa888c5e830]
[paripp1:85275] [18] ./laghos[0x4149a9]
[paripp1:85275] *** End of error message ***
Aborted (core dumped)

and output on a Radeon Frontier:

       __                __                 
      / /   ____  ____  / /_  ____  _____   
     / /   / __ `/ __ `/ __ \/ __ \/ ___/ 
    / /___/ /_/ / /_/ / / / / /_/ (__  )    
   /_____/\__,_/\__, /_/ /_/\____/____/  
               /____/                       

Options used:
   --mesh data/square01_quad.mesh
   --refine-serial 3
   --refine-parallel 0
   --problem 1
   --order-thermo 1
   --order-kinematic 2
   --ode-solver 4
   --t-final 0.8
   --cfl 0.1
   --no-visualization
   --no-visit
   --visualization-steps 5
   --outputfilename Laghos
   --device-info mode: 'OpenCL', device_id: 0, platform_id: 0 
   --occa-config 
   --no-occa-verbose
   --no-timings
Number of kinematic (position, velocity) dofs: 2178
Number of specific internal energy dofs: 1024

---[ Error ]------------------------------------------------
    File     : /home/noel/laghos/occa/src/modes/opencl/memory.cpp
    Function : addOffset
    Line     : 71
    Message  : Device: clCreateSubBuffer
    Error   : OpenCL Error [ -13 ]: CL_MISALIGNED_SUB_BUFFER_OFFSET
    Stack    :
      8 occa::opencl::error(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
      7 occa::opencl::memory::addOffset(long, bool&)
      6 occa::memory::slice(long, long) const
      5 occa::memory::operator+(long) const
      4 ./laghos()
      3 ./laghos()
      2 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)
      1 ./laghos()
============================================================
terminate called after throwing an instance of 'int'
Aborted (core dumped)

Any suggestions would be helpful.

Scaling Laghos / Picking number of processes/tasks

(Copied from an email thread)

I’m working with @taufer and her student @smobo on modeling the performance of LLNL’s scheduler Flux on heterogeneous workflows. One of the proxy apps we would like to use is Laghos. We have been running several of the examples in the repo readme. These examples all use 8 processes, and the scaling example also scales up by a factor of 8. Due to the number of cores on the machine we are using (36 cores), it would result in a better node packing for us to only use 6 cores per task (or multiples of 6).

Our ultimate question is: are processes in multiples of 8 a requirement of the domain decomposition of Laghos? Or is it safe and reasonable for us to use a variable number of processes to better match the hardware on the system?

@vladotomov responded with

Using multiples of 8 is not a requirement. You can use any number of tasks.

However, to get good scaling, you probably need to partition the mesh uniformly, with an equal number of mesh elements per task. See the comment at the beginning of laghos.cpp.

For 36 tasks in 3D, you should use the mesh ./data/cube_922_hex.mesh.
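
The advice above comes down to simple arithmetic: the number of mesh elements should divide evenly among the tasks. A rough sketch follows, assuming each uniform refinement level splits every element into 2^dim children (as for quadrilateral and hexahedral meshes). The helper names are hypothetical, and the element counts in the note below are made-up inputs that should be replaced with the counts of the actual mesh file.

```cpp
#include <cassert>

// Elements after 'levels' uniform refinements of a quad/hex mesh:
// each level multiplies the element count by 2^dim.
inline long ElementsAfterRefinement(long base_elements, int dim, int levels)
{
   long n = base_elements;
   for (int l = 0; l < levels; l++) { n *= (1L << dim); }
   return n;
}

// True when every task can receive the same number of elements.
inline bool DividesEvenly(long elements, int ntasks)
{
   assert(ntasks > 0);
   return elements % ntasks == 0;
}
```

For example, a hypothetical 3D base mesh of 8 elements refined twice has 512 elements, which split evenly over 8 tasks but not over 36. Divisibility is only a necessary condition: the partitioner still has to produce the even split.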

Other Equation of State in Laghos

Hi,
I tried to modify the equation of state in Laghos, but ran into trouble. Taking the Sod shock tube as an example, the result is correct with the original ideal-gas equation of state. But when it is changed to the NASG equation for liquid water, there are many numerical oscillations, as in the following images.

I don't quite understand why. I am not sure whether Laghos is suitable for an EOS with real parameters. Looking forward to your response.

Thanks,
Liu

'laghos' cores when run against 'box01_hex.mesh' with perfect cube of ranks

Following the guidance from the README.md, I tried to run laghos with the 'box01_hex.mesh' data file on 64 ranks. It resulted in a segfault, with a core file created. Investigating this, I found that if the value of ranks was a perfect cube such as 27, 64 or 125, it would segfault (with the exception of '8'; that works). Set the number of ranks to be one more or less than the perfect cube value, and it works without issue.

Here is an example of the command used:
srun --ntasks=27 laghos -p 3 -m box01_hex.mesh -rs 1 -tf 3.0 -pa

I've only observed this issue with the 'box01_hex.mesh' input file; all other data files seem to work without issue for any number of ranks.

I built Laghos against hypre 2.7.0, metis 5.1.0, and mfem 4.0.

Requesting assistance from the Laghos Team: partial assembly Implementation Inquiry

Hi,

I'm working on developing a tectonic solver using Laghos. It still has many areas for improvement, but now I can use it for some applications.

The main features I've added to Laghos are elasticity and (brittle) plasticity.
For much of this, I relied on "Dobrev, V.A. et al. (2014), High order curvilinear finite elements for elastic–plastic Lagrangian dynamics".

However, I was only able to implement the assembly for stress rate in full assembly mode.
Therefore, I'm unable to fully leverage the benefits of Laghos.
I'm wondering if I can get assistance from the Laghos team to implement stress assembly in partial assembly mode.

Sungho

getting a makefile error when trying to build laghos-raja

I want to run Laghos with Raja.
However I am not able to build the application. After a host of warnings, I see these errors.

make[2]: *** [makefile:344: laghos] Error 252
make[2]: Leaving directory '/g/g92/ashwin/Laghos1/Laghos' 
make[1]: *** [makefile:340: all] Error 2
make[1]: Leaving directory '/g/g92/ashwin/Laghos1/Laghos'
make: *** [makefile:185: raja] Error 2

I would really appreciate any help on this.
Thank you.

access to mfem

I am trying to run git clone git@github.com:mfem/mfem.git ./mfem, but I get: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

blast wave propagation

Hi!
I wonder whether this software can be used to simulate blast wave propagation in an urban environment.
Thanks in advance!

Getting velocity vectors

Dear all,

Can someone please tell me which function/class gives the physical velocity vectors (say Vx and Vy in the 2D case) at each quadrature point? I need this in particular in "laghos_solver.cpp", in the quadrature data update.

This is to add the additional terms needed for the axisymmetric conversion, apart from the modifications made to the bilinear integrators in the MFEM library. As a first step, it will be tried in the full assembly case.

Thanks in advance.

Incorrect assertion in laghos_assembly.cpp

In laghos_assembly.cpp (line 124 in the raja-dev branch), an array is compared against an integer. Compiling with Clang reports this error:

/usr/workspace/wsa/laguna/laghos_cuda/laghos/Laghos/laghos_assembly.cpp:124:20: 
error: ordered comparison between pointer and zero ('int *' and 'int')
   assert(ess_tdofs>0);

I believe the solution is assert(ess_tdofs.Size()>0);
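
The difference between the two assertions can be reproduced without MFEM. In the sketch below, `IntArray` is a hypothetical minimal stand-in for an array class that converts implicitly to `int *` and has a `Size()` method, which is the behavior the compiler diagnostic suggests; it is not the MFEM type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in: converts to int* like the array in the diagnostic.
class IntArray
{
public:
   explicit IntArray(int n) : data(static_cast<std::size_t>(n)) {}
   int Size() const { return static_cast<int>(data.size()); }
   operator int *() { return data.data(); }

private:
   std::vector<int> data;
};

// The intended check: the array holds at least one entry.
// Writing `arr > 0` instead would compare the converted pointer with zero,
// which Clang rejects as an ordered pointer/integer comparison.
inline bool HasEntries(const IntArray &arr) { return arr.Size() > 0; }
```

Compilers that accept the pointer comparison test the address rather than the length, so checking `Size()` states the intent explicitly on every compiler.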

Multi GPU runs

How are you intending for users to assign multiple ranks to multiple devices when using a GPU backend?

Unless I am missing something, that isn't possible in the code as written. For example, you could edit laghos.cpp by adding "dev=myid;" before:

https://github.com/CEED/Laghos/blob/master/laghos.cpp#L205

This works fine, but maybe this is not what was intended? Please let me know if I am missing something! Thank you.
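
A common convention for this situation, shown here as an assumption rather than the approach intended by the Laghos developers, is to assign devices round-robin by rank. `DeviceForRank` is a hypothetical helper; in a real run the rank would come from MPI and the device count from the GPU runtime, and the result would play the role of the `dev` value mentioned above.

```cpp
#include <cassert>

// Round-robin mapping of an MPI rank to a device index on its node.
// Assumes every node exposes the same number of devices; how evenly the
// devices are loaded depends on how the launcher places ranks on nodes.
inline int DeviceForRank(int rank, int devices_per_node)
{
   assert(rank >= 0 && devices_per_node > 0);
   return rank % devices_per_node;
}
```

Unlike `dev=myid;`, the modulo keeps the index valid when there are more ranks than devices.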

occa-dev branch compile error

I want to run Laghos on NVIDIA P100, so I am trying to compile the occa-dev branch.
However, the compiler complains at declarations with occa involved.
For example:

laghos_assembly.hpp:34:7: error: 'occa' does not name a type
       occa::device device;
       ^

Is it because my makefile is not well configured?
I look forward to instructions on how to compile the occa-dev branch properly.
Thanks in advance!

Add Travis testing

We should add simple testing with Travis CI, at least for the official benchmark version.

Laghos fails with "Unknown kernel" in higher-order tests with PA; higher-order runs with -ok >= 8 fail with FA

Dear all,

I was trying to test the three-material triple-point problem using the following command.

mpirun -np 1 ./laghos -p 3 -m data/rectangle01_quad.mesh -rs 1 -tf 5.0 -pa -ok 5 -ot 4

and it shows the following error message and stops.

Unknown kernel 0x2a

MFEM abort: Unknown kernel
... in function: void mfem::hydrodynamics::QUpdate::UpdateQuadratureData(const mfem::Vector&, mfem::hydrodynamics::QuadratureData&)
... in file: laghos_solver.cpp:1329

This happens when I choose PA and -ok greater than 4 (order of the kinematic field).

Similarly, with FA, for the same problem with -ok 8 -ot 7, it shows the following error.

Verification failed: (Q1D <= MQ) is false.
--> Quadrature rules more than 14 1D points are not supported!
... in function: void mfem::internal::quadrature_interpolator::TensorValues(int, int, const mfem::DofToQuad&, const mfem::Vector&, mfem::Vector&) [with mfem::QVectorLayout::byVDIM]
... in file: fem/qinterp/eval_by_vdim.cpp:74

Please advise.

Thank you in advance..

Sijoy C.D.

Building Laghos on Crusher

I'm currently in the process of "upgrading" the examples in the HPCToolkit tutorial examples repo, which includes Laghos. How do I build Laghos for AMD GPUs on Crusher (notably, the flags to pass onto MFEM_BUILD)?
