adaptive-cfd / wabbit Goto Github PK

Wavelet Adaptive Block-Based solver for Interactions with Turbulence

Home Page: https://www.cfd.tu-berlin.de/

License: GNU General Public License v3.0

Fortran 97.85% Makefile 0.84% Shell 0.13% Python 1.17%

multiresolution cfd navier-stokes fluid-dynamics high-performance-computing fortran turbulence flapping-flight insect-flight tree-structure

wabbit's Introduction

WABBIT v2.0beta5 (newBiorthogonal)

(W)avelet (A)daptive (B)lock-(B)ased solver for (I)nteractions with (T)urbulence

❗ New in 05/2021: please see this video for an introduction to the code's datastructures: https://www.youtube.com/watch?v=qBBIW2-ktgo

With WABBIT it is possible to solve partial differential equations (PDE) on block-based adaptive grids. Simulations in 2D and 3D are possible and performed fully parallel on CPU. The set of PDE is encapsulated from the code handling the adaptive grid, and thus existing monobloc solvers can be adapted easily for this solver. WABBIT can handle PDEs of the following type:

$\partial_t \phi = N\left(\phi\right)$

and $N\left(\phi\right)$ can be defined. This implementation is handled by the "physics-modules". Note the current version of the code does not handle elliptic PDE, such as the Poisson equation that typically arises in incompressible fluid dynamics. Instead, we use a quasi-hyperbolic approximation in that case, the "artificial compressibility method".

Installation of WABBIT

How to get a copy of WABBIT and compile the code:

git clone https://github.com/adaptive-cfd/WABBIT.git

Unpack the file and run the compilation and tests with make, make sure that all necessary dependencies are loaded:

make all
make test

⚠️ since 15 Aug 2023, the unit testing framework has evolved. It now stores full HDF5 files in the TESTING directory, which makes it easier to visualize the reference data and current results, should they be different. We now calculate the L2 error of the field, if the grid is identical. This new framework requires the WABBIT Python Tools repository for comparing two WABBIT HDF5 files.

Dependencies

WABBIT needs several packages installed in order to compile and run, the main dependencies are:

MPI
HDF5
BLAS + LAPACK
Python-scripts found in the "python-tools" repository - these scripts are used when we perform WABBIT unit tests

Further information on the installation and compilation of all pre-requesites can be found in the wiki under Install-WABBIT-with-requirements.

A list of all environment variables to be set can be find in the wiki under Loading-prerequesites. Ensure that the WABBIT-specific variables for HDF5 are set in order for the compilation to finish successfully.

Running WABBIT

Customize the template .ini-file and rename file to [your_filename.ini], run WABBIT and pass it the .ini-file as well as the total amount of memory used:

mpirun -n 1 ./wabbit [your_filename.ini] --memory=2.0GB

alternatively, you can specify the amount of memory per core that the code may use:

mpirun -n 1 ./wabbit [your_filename.ini] --mem-per-core=2.0GB

where the --memory options allows you to control how much memory is globally allocated, i.e., on all CPUs. Note that WABBIT does not free memory during runtime. This is because the code is intented to run on clusters/supercomputers, where the available memory is reserved for the execution of the code alone. This is quite typical for supercomputing.

Additional Information

In case you have problems with the preparation to use WABBIT, have a look if you can find anything in the wiki

More information can also be found in the documentation build with doxygen. Therefore it is necessary to have Doxygen installed. Create the documentation with

doxygen doc/doc_configuration

and display doc/output/html/index.html with your browser. You can also locally display all files with

python3 -m http.server --directory doc/output/html

and then open in a browser localhost:8000

Publications

"Wavelet adaptive proper orthogonal decomposition for large-scale flow data"; Krah, Engels, Schneider, Reiss; Advances in Comput. Math. Volume 48, Article number: 10, 2022
"A wavelet-adaptive method for multiscale simulation of turbulent flows in flying insects"; T. Engels, K. Schneider, J. Reiss and M. Farge; Commun. Comput. Phys., 30, 1118-1149, 2021
"An Open and Parallel Multiresolution Framework Using Block-Based Adaptive Grids"; Sroka, Engels, Krah, Mutzel, Schneider, Reiss; Active Flow and Combustion Control 2018

wabbit's People

Contributors

Stargazers

Watchers

wabbit's Issues

Is it a memory issue?

Code aborts by itself! I am using 64 GB memory on a small cluster

HDF5 crashes unexpectedly on cluster12 tnt.tu-berlin

I had this issue now for several times. It always happens unexpectedly after a while. It seams like HDF5 is crashing.
The Errormessage:


IO: Saving data triggered, time= 0.26300000E-02
IO: writing data for time =      0.00263000 file = rho_000000002630.h5 active blocks= 4096
IO: writing data for time =      0.00263000 file = Ux_000000002630.h5 active blocks= 4096
IO: writing data for time =      0.00263000 file = Uy_000000002630.h5 active blocks= 4096
IO: writing data for time =      0.00263000 file = p_000000002630.h5 active blocks= 4096
IO: writing data for time =      0.00263000 file = vort_000000002630.h5 active blocks= 4096
IO: writing data for time =      0.00263000 file = mask_000000002630.h5 active blocks= 4096
HDF5-DIAG: Error detected in HDF5 (1.8.17) MPI-process 37:
  #000: H5D.c line 194 in H5Dcreate2(): unable to create dataset
    major: Dataset
    minor: Unable to initialize object
  #001: H5Dint.c line 455 in H5D__create_named(): unable to create and link to dataset
    major: Dataset
    minor: Unable to initialize object
  #002: H5L.c line 1638 in H5L_link_object(): unable to create new link to object
    major: Links
    minor: Unable to initialize object
  #003: H5L.c line 1882 in H5L_create_real(): can't insert link
    major: Symbol table
    minor: Unable to insert object
  #004: H5Gtraverse.c line 861 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #005: H5Gtraverse.c line 641 in H5G_traverse_real(): traversal operator failed
    major: Symbol table
    minor: Callback failed
  #006: H5L.c line 1685 in H5L_link_cb(): unable to create object
    major: Object header
    minor: Unable to initialize object
  #007: H5O.c line 3016 in H5O_obj_create(): unable to open object
    major: Object header
    minor: Can't open object
  #008: H5Doh.c line 293 in H5O__dset_create(): unable to create dataset
    major: Dataset
    minor: Unable to initialize object
  #009: H5Dint.c line 1140 in H5D__create(): can't update the metadata cache
    major: Dataset
    minor: Unable to initialize object
  #010: H5Dint.c line 854 in H5D__update_oh_info(): unable to update layout/pline/efl header message
    major: Dataset
    minor: Unable to initialize object
  #011: H5Dlayout.c line 238 in H5D__layout_oh_create(): unable to initialize storage
    major: Dataset
    minor: Unable to initialize object
  #012: H5Dint.c line 1822 in H5D__alloc_storage(): unable to initialize dataset with fill value
    major: Dataset
    minor: Unable to initialize object
  #013: H5Dint.c line 1914 in H5D__init_storage(): unable to allocate all chunks of dataset
    major: Dataset
    minor: Unable to initialize object
  #014: H5Dchunk.c line 3575 in H5D__chunk_allocate(): unable to write raw data to file
    major: Low-level I/O
    minor: Write failed
  #015: H5Dchunk.c line 3745 in H5D__chunk_collective_fill(): unable to write raw data to file
    major: Low-level I/O
    minor: Write failed
  #016: H5Fio.c line 171 in H5F_block_write(): write through metadata accumulator failed
    major: Low-level I/O
    minor: Write failed
  #017: H5Faccum.c line 825 in H5F__accum_write(): file write failed
    major: Low-level I/O
    minor: Write failed
  #018: H5FDint.c line 260 in H5FD_write(): driver write request failed
    major: Virtual File Layer
    minor: Write failed
  #019: H5FDmpio.c line 1846 in H5FD_mpio_write(): MPI_File_write_at_all failed
    major: Internal error (too specific to document in detail)
    minor: Some MPI function failed
  #020: H5FDmpio.c line 1846 in H5FD_mpio_write(): Other I/O error , error stack:
(unknown)(): Other I/O error 
    major: Internal error (too specific to document in detail)
    minor: MPI Error String
HDF5-DIAG: Error detected in HDF5 (1.8.17) MPI-process 37:
  #000: H5D.c line 460 in H5Dget_space(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.8.17) MPI-process 37:
  #000: H5S.c line 791 in H5Sget_simple_extent_ndims(): not a dataspace
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.8.17) MPI-process 37:
  #000: H5Dio.c line 228 in H5Dwrite(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.8.17) MPI-process 37:
  #000: H5S.c line 392 in H5Sclose(): not a dataspace
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.8.17) MPI-process 37:
  #000: H5D.c line 415 in H5Dclose(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.8.17) MPI-process 37:
  #000: H5D.c line 358 in H5Dopen2(): not found
    major: Dataset
    minor: Object not found
  #001: H5Gloc.c line 430 in H5G_loc_find(): can't find object
    major: Symbol table
    minor: Object not found
  #002: H5Gtraverse.c line 861 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #003: H5Gtraverse.c line 641 in H5G_traverse_real(): traversal operator failed
    major: Symbol table
    minor: Callback failed
  #004: H5Gloc.c line 385 in H5G_loc_find_cb(): object 'blocks' doesn't exist
    major: Symbol table
    minor: Object not found
HDF5-DIAG: Error detected in HDF5 (1.8.17) MPI-process 37:
  #000: H5A.c line 1640 in H5Aexists(): not a location
    major: Invalid arguments to routine
    minor: Inappropriate type
  #001: H5Gloc.c line 253 in H5G_loc(): invalid object ID
    major: Invalid arguments to routine
    minor: Bad value
HDF5-DIAG: Error detected in HDF5 (1.8.17) MPI-process 37:
  #000: H5A.c line 247 in H5Acreate2(): not a location
    major: Invalid arguments to routine
    minor: Inappropriate type
  #001: H5Gloc.c line 253 in H5G_loc(): invalid object ID
    major: Invalid arguments to routine
    minor: Bad value
HDF5-DIAG: Error detected in HDF5 (1.8.17) MPI-process 37:
  #000: H5A.c line 591 in H5Awrite(): not an attribute
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.8.17) MPI-process 37:
  #000: H5A.c line 1602 in H5Aclose(): not an attribute
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.8.17) MPI-process 37:
  #000: H5D.c line 415 in H5Dclose(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type

on [email protected], unit tests fail

on reynolds@tnt, two unit tests fail mysteriously: the run appears to be successful, but the analysis phase fails:

sum: 158.621

FREE: Beginning freeying of memory.
All memory is cleared!

END: cpu-time = 103.4640 s

THE RUN SEEMS TO BE OKAY

============================
run done, analyzing data now

STARTING wabbit-post
MPI: using 1 processes

INIT: running 3D case
Starting postprocessing in --keyvalues mode
analyzing file phi_000000000000.h5 for keyvalues

INIT: Beginning memory allocation and initialization.
INIT: mpisize= 1
INIT: Bs= 17 blocks-per-rank= 2592 total blocks= 2592
INIT: Allocating a 3D case.
INIT: ALLOCATED hvy_block MEM= 0.1019GB SHAPE= 17 17 17 1 2592
INIT: ALLOCATED hvy_neighbor MEM= 0.0015GB SHAPE= 2592 74
INIT: ALLOCATED lgt_block MEM= 0.0001GB SHAPE= 2592 7
INIT: ALLOCATED lgt_sortednumlist MEM= 0.0000GB SHAPE= 2592 2
INIT: ALLOCATED lgt_active MEM= 0.0000GB SHAPE= 2592
INIT: ALLOCATED hvy_active MEM= 0.0000GB SHAPE= 2592

READING: initializing grid from file...
Filename: phi_000000000000.h5
Expected Nblocks= 288 (on all cpus)
In the file we just read, Jmin= 2 Jmax= 3

READING: Reading datafield 1 from file phi_000000000000.h5
STARTING wabbit-post
MPI: using 1 processes

INIT: running 3D case
Starting postprocessing in --compare-keys mode
ERROR! file: phi_000000000000.key not found
File not found....phi_000000000000.key
:[ Sad, this is

failed! phi_000000000000.key ./TESTING/conv/blob_3D_adaptive/phi_000000000000.ref

saving data in ACM (and possibly others)

The function of the PREPARE_SAVE_DATA_ACM routine is to store everything we want to write to disk in the hvy_tmp variable.
For example, if the name is ux, then the first component of the state vector is copied to this place in the hvy_tmp array.
This all works fine except for the vorticity, which has in 3D 3 components. The current PREPARE_SAVE_DATA_ACM will not work for that and should be revised at some point. Note the number params%N_fields_saved must in fact include 3 slots for the vorticity in that case.

It may be better to not even provide the possibility to save arbitrary data in wabbit at this point and simply compute derived qty in postprocessing. At least for the ACM, this might be an option.

Ghost node sync on domain boarders does creepy things

I have created a video of my sod shock tube. It seems that with the new jmax the sync of the domain boundaries fails.
The initial condition is already wrong. Here I initialize my field with a simple if statement:
if (x < Lx/2)
p=1
else
p=0.1
end
but in the plot there are wrong values on the boundaries, which seem to be not synced right. This error increases for further time iterations until my program fails. Note that this is without any filtering.

Please check out the gif I exported to visualize my problem:

wabbit-post dry-run does bug with 2nd order MR

I found out that the postprocessing dry-run mode fails to create the correct mask function if the second order MR predictor is used (for both g=1 and g=2).
specifically, the time independent mask is wrongly created.

No turbulence models?

Looking at your research and code this is DNS with no turbulence modeling correct? I'm asking because I am wondering if its possible to use Wabbit to simulate the flapping flight of a larger animal such as a bird. Insects are probably small enough that you can use DNS, but a larger model like a bird would probably require a turbulence model like LES right? Perhaps it would be a relatively simple task to implement LES into Wabbit.

unit test fails for mpisize=12 - bug in wabbit-post

if I run
export mpi_command="nice mpirun -n 12 "
then make test
./TESTING/navier_stokes/filter/explicit_7pt/explicit_7pt.sh
fails: some bug in the wabbit post-processing

Unit testing balance the load

It turns out I forgot to compare the keyvalues from the load balancing unit tests. This unit test apparently does not always give the same results which is the reason I skipped it for the moment. I will try to think about it and find a better test within the next days!

Error in get_inicond_from_file.f90?

I think there is a mistake in LIB/INI/get_inicond_from_file.f90

do i = 1, params%dim
    if (periodic_BC(i).eqv.params%periodic_BC(i)) then
        if (params%rank==0) write(*,*) "WARNING WARNING WARNING periodic_BC in inifile and HDF5 file are different!"
    endif

    if (symmetry_BC(i).eqv.params%symmetry_BC(i)) then
        if (params%rank==0) write(*,*) "WARNING WARNING WARNING symmetry_BC in inifile and HDF5 file are different!"
    endif
enddo

Maybe '.eqv.' should be replaced with '.ne.' ?

ensure_gradedness uses a full MPI_allreduce

The current version of ensure_gradedness uses a full MPI_ALLREDUCE:

call MPI_Allreduce(my_refine_change, refine_change, lgt_n, MPI_INTEGER1, MPI_MAX, WABBIT_COMM, ierr)

this can be extremely expensive if params%number_blocks becomes large and should be replaced b the more recent code using

subroutine synchronize_lgt_data( params, lgt_block, refinement_status_only )

Postprocessing - Decreasing #Blocks but increasing #Blocknodes to keep resolution

loading .h5 data into paraview can be very slow for large numbers of blocks. It would be nice to have a tool which reduces the number of blocks without loosing the resolution.

Maybe on can decrease the number of blocks iteratively by merging two into one, i.e. increasing the number of block nodes?

newGhostNodes: integration bounds

while differential operators are fine, the change in the grid definition to remove the redundant point affects the integration bounds:
a loop must now run 0 <= i <= Bs while in previous versions, 0 <= i <= Bs-1 had to be used.

POD unit tests fail

The POD unit tests currrently fail, because they generate data with an ACM test case, and I removed some old functionality of the ACM module (remove mean pressure).

Fixing is a simple matter of updating the unit test.

non-blocking send/recv in balance_load

the current version of balance load uses a blocking send/recv and communication lists, but this can be a bottleneck on large scale computations. we'll need to change that to non-blocking communications at some point.
it is also tedious that the routine requires additonal memory for the send/recv operations: if called on a full grid with say 1024 blocks, balance_load may require 1536 allocated blocks.

funnel unit test appears to fail using IFORT

on my local machine, using gfortran and MPICH, all unit tests pass. When using IFORT on (any) cluster, the funnel test fails.

homogenize sponge parameters for NS and ACM?

The ACM module reads the section

[Sponge]
; sponge term, used in ACM module to mimick outflow conditions for pressure waves
use_sponge=0;
L_sponge=0.1;
C_sponge=1.0e-2;

from the ini file, while NS reads

[Physics]
...
; sponge (yes=1, no=0) NOTE: this setting here applies only to the compressible Navier-Stokes
; equation, not to other physics modules. For ACM, use the "sponge" section.
sponge_layer=1;
; sponge parameter
C_sp=0.001;

if not too much trouble, I suggest to remove the NS-specific part, as it can easily be merged with the [Sponge] section.
This is optional and not urgent, though.

bumblebee unit test appears to fail using IFORT

Unit test fails with 3 sad tests, when using following compiler options on cluster 12:

source /opt/intel/composer_xe_2011_sp1.11.339/bin/ifortvars.sh intel64
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/mvapich2/1.8a2/intel/lib64
export FPATH=/opt/mvapich2/1.8a2/intel/include
export PATH=/opt/mvapich2/1.8a2/intel/bin:$PATH

CDF42 unit test fail - link with reset_ghost_nodes

On my machine and using the latest version, one of the unit tests does fail in a suspicous manner:

Running test:  TESTING/conv/blob_2D_adaptive_CDF42/blob-convection-adaptive.sh
Writing output to:  TESTING/conv/blob_2D_adaptive_CDF42/blob-convection-adaptive.log
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 3 in communicator MPI COMMUNICATOR 3 DUP FROM 0
with errorcode 280420232.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
 fail  	Time used in test: 2 seconds

when looking at the logfile, it reads

IO: Saving data triggered, time= 0.15000000    
IO: saving HDF5 file t=     0.15000000 file = phi_000000150000.h5 J=( 1 |  2) Nblocks=      7 sparsity=( 43.8% /   0.0%)
RUN: it=     19 time=     0.150000000 t_wall=9.234E-03 Nb=(    40/     7) J=( 2: 3/ 1: 2) dt= 6.3E-03 mem=  0% adapt iter Nb:(    16)
   932341.99218895088     
large values in phi. that cannot be good.

upon further inspection, I realized that the test is indeed running fine (the data look okay, even if stored for every time step), and that the failure may have to do with the reset_ghost_nodes routine that you have activated if DEV is using at compile-time.

Are you aware of this issue ?

input files type .stl ?

dear professor
if possible any suggestion for input geometry .STL file in our WABBIT code? I am learning now how to use the WABBIT code.

thank you so much.

sparse-to-dense memory allocation: commit 8eed2528f97ac2a97e1280a5c1d855446449a298

Hello,
why do I get an error every second time, when running the following command:

./wabbit-post 2D --sparse-to-dense Ux_000000150000.h5 sparse_Ux.h5 4 4

This is the error which occures sometimes:

Operating system error: Nicht genügend Hauptspeicher verfügbar
Reservierung würde Speichergrenze überschreiten

Error termination. Backtrace:
#0 0x7f03e7a5335e in ???
#1 0x7f03e7a53f09 in ???
#2 0x7f03e7a54119 in ???
#3 0x4280c9 in _module_initialization_MOD_allocate_grid
at /home/philipp/master_PI/code/WABBIT_PIG/WABBIT/allocate_grid.f90:112
#4 0x651262 in sparse_to_dense
at LIB/POSTPROCESSING/sparse_to_dense.f90:136
#5 0x403c76 in main_post
at LIB/POSTPROCESSING/main_post.f90:84
#6 0x403538 in main
at LIB/POSTPROCESSING/main_post.f90:19

N_thresholding_components warning message

Noticed this warning:
LIB/EQUATION/ACMnew/module_ACM.f90(490): warning #6843: A dummy argument with an explicit INTENT(OUT) declaration is not given an explicit value. [N_THRESHOLDING_COMPONENTS]
subroutine PREPARE_THRESHOLDFIELD_ACM( u, g, x0, dx, threshold_field, N_thresholding_components )
Is there anything wrong with N_thresholding_components ?

Possible improvements of simulation code

This thread is a non-exhaustive list of ideas I have to improve the code performance.

Skip diagonal ghost node synching
The diagonal ghost nodes are a significant overhead and I am unsure if they are actually required at all. for the differential operators no (only x,y,z = faces), and for CDF44 wavelets neither. Also, CDF40 does no longer check details on the ghost node layer, which was the case in old versions
Improve load balancing to take different computing time per block into account (from the mask generation, possibly chemistry as well)
Allow different N_ghostNodes for RHS and wavelet (because CDF44 requires g=6 but 4th order differences only g=3). This feature was enabled in the newGhostNodes branch.
openMP hybridization (take adavantage of odern, heterogeneous machines)

bug in ensure_gradedness

the newest version of ensure_gradedness has a bug
version d428581
is fine

crashes if vector size is unexpected

If I run with
params%number_data_fields=4;
but set
threshold_state_vector_component=1 1 1;
then the code crashes with no useful error message

resulting grid depends on number ghost nodes

In the current implementation, the threshold_block routine checks also detail coefficients in the ghost node layer of a block, which is the layer overlapping with neighboring blocks. hence, if a flow feature is near the block interface, but still on the other block, a block can be refined.

this sounded like a secure idea back in the design phase, as it precludes flow features to "arrive" on a block which is not ready for it, i.e. not refined.

an ugly consequence is that the grid is then dependent on ng! Larger overlap possibly refines more blocks. As this is counter-intuitive, it should maybe be removed. In any case, the restriction/prediction cycle is performed on the whole block, including the ghost node layer. It would just be the detail coefficient magnitude check which is altered. This strategy avoids using one-sided interpolation stencils in the block thresholding process.

flusi_to_wabbit not working as before with different Bs implemented (but not used in my case)

I cannot find a way to opt out if I have the same Bs for both directions in 2D.

failing for 2D runs with larger number of blocks?

The code failed to perform a detailed 2D simulation with J_max = 12 with the obscure message:

RUN: it=  14967  time=     1.481635000 t_cpu=  9.5492E-01 Nb=( 13768/  3454) Jmin= 3 Jmax=11
 Lord Vader, restrict_predict_data is called with leveldiff /= -+1
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 2 in communicator MPI COMMUNICATOR 3 DUP FROM 0
with errorcode 123005.

the reason this message is obscure is that it was not supposed to be seen. the level diff indicator can be -1 0 or +1. What happens here, probably, is that the packing of multiple numbers into one integer can fail

packing:

    ! pack multipe information into one number
    level_diff_indicator = 256*bounds_type + 16*(level_diff+1) + entrySortInRound

unpacking:

 ! unpack & evaluate level_diff_indicator (contains multiple information, unpack it)
       level_diff_indicator = int_send_buffer(l+2, istage_buffer)
       entrySortInRound = modulo( level_diff_indicator, 16 )
       level_diff       = modulo( level_diff_indicator/16  , 16 ) - 1
       bounds_type      = modulo( level_diff_indicator/256 , 16 )
       sender_hvy_id    =       ( level_diff_indicator/4096 )

all this takes place in synchronize_ghosts_generic.f90

integration does count redundant nodes twice?

the integration of, e.g. mean flow, for instance

params_acm%mean_flow(1) = params_acm%mean_flow(1) + sum(u(g+1:Bs+g, g+1:Bs+g, 1, 1))*dx(1)*dx(2)

appears to be counting the redundant nodes twice; shouldn't the array boundary be

params_acm%mean_flow(1) = params_acm%mean_flow(1) + sum(u(g+1:Bs+g-1, g+1:Bs+g-1, 1, 1))*dx(1)*dx(2)

instead??

Mean value computation in wabbit-post --keyvalues

./wabbit-post --keyvalues doesn't seem to compute the mean values correctly. The sums are divided by the number of active blocks lgt_n there's no division by Bs.

With great power comes great responsibility: Allocating large arrays

Allocating and handling of large data arrays 🥇

For best performance try to reduce the time for allocating and deallocating arrays (allocate/deallocate). Especially in functions which are called in every iteration allocating and deallocating arrays adds up and become expensive.

We suggest to use:

the hvy_work(ix,iy,iz,dF) array, which can be used as block data (i.e. data on every local grid point) and is passed to most of the functions inside WABBIT. It is only allocated once in the beginning of the program and deallocated on the end to save our time.

in every routine which uses large temporary arrays, which are not block data arrays, use:

      subroutine my_fun(Np)
            ....
            real, allocatable, save :: my_array(:)
            if (.not. allocated(my_array) allocate(my_array(Np))
            ...
      end subroutine my_fun

This bounds the allocated memory to my_fun and thus my_array does not need to be allocated in every function call

if possible use a global array in your module as a private variable and allocate it once, to reuse it in every function of the module.

Some reasons, why we don`t want to allocate and deallocate arrays in every function call:

a process can be killed in the middle of your simulation without plausible reasons! This can happen when you are allocating a lot of memory and the operating system decides to kill you
even worse: it is not killed and starts to swap
allocating and deallocating all the time makes WABBIT slow!!

For the advanced people: block data in RHS call which is not time evolved 🥈

It would be nice to have a module which provides you with block data, that is not time evolved and can be saved during the time iterations. This can be used for the generation and preservation of masks or other temporary fields.
A possible scenario is that you compute the mask function (or a part of it) only once on the finest level, at the beginning of the program, and coarsen it to the actual mesh level during the time iterations.
Another scenario: only compute the mask function when the mask or block has changed (i.e. rank or mesh lvl).

Use ABORT instead of STOP!

ERROR Messaging in WABBIT

Please avoid using the stop command! Using stop instead of MPI_abortcan leave some processes on the computation nodes, without aborting them.
If you see it in any of the modules replace it by one of the following routines (use module_globals):

call abort( <some ID>, <your error message>)
call abort( <your error message>)
call abort( <some ID>)

Example: call abort(10092018, "Error: this is my error message")

This routine is a wrapper for MPI_ABORT, which is awesome because your module does not need to know the MPI Communicator or any other extra variable needed for error messaging.

loadbalancing dreadfully slow

the loadbalancing routine is dreadfully slow. if the code runs with maxlevel dealiasing, i.e. blocks on jmax are forced to coarsen, then loadbalancing is done only after the grid adaptation step. Hence, in such a case, most of the time the routine does not do much or nothing.

if the maxlevel dealiasing is off, the loadbalancing is also executed in refinement. now, many blocks are shuffeled around, and loadbalancing becomes by far the most expensive part of the code (~70%).

we should hence improve loadbalancing.

Compilation Error: Type mismatch between actual argument at (1) and actual argument at (2) (INTEGER(4)/REAL(8)).

Seems that we have the same issue with the new version of gfortran
https://www.mail-archive.com/[email protected]/msg34361.html

This is what I got:
GNU Fortran (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0

LIB/PARAMS/module_ini_files_parser_mpi.f90:599:25:

599 | call MPI_BCAST( matrixlines, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, mpicode )
| 1
......
651 | call MPI_BCAST( matrix, matrixlines*matrixcols, MPI_DOUBLE_PRECISION, 0, MPI_COMM_WORLD, mpicode )
| 2
Error: Type mismatch between actual argument at (1) and actual argument at (2) (INTEGER(4)/REAL(8)).
LIB/PARAMS/module_ini_files_parser_mpi.f90:228:24:

228 | call MPI_BCAST( params_real, 1, MPI_DOUBLE_PRECISION, 0, WABBIT_COMM, mpicode )
| 1
......
651 | call MPI_BCAST( matrix, matrixlines*matrixcols, MPI_DOUBLE_PRECISION, 0, MPI_COMM_WORLD, mpicode )
| 2
Error: Rank mismatch between actual argument at (1) and actual argument at (2) (rank-1 and scalar)
make: *** [LIB/fortran.mk:250: OBJ/module_ini_files_parser_mpi.o] Error 1

dry-run mode for mask generation

Wabbit needs a dry-run mode, much like flusi has it. In such a mode, we do not solve any PDE, but simply construct the mask function at the times specified in the INI file (i.e. the saving interval) and save it to disk.

this can of course only the done if the mask does not depend on the solution of the fluid (e.g. if it moves according to the drag / lift force (freeflight) or any form of active penalization). For a pre-defined geometry and velocity field, it should be possible.

To this end, the create_mask routines of penalization-ready physics modules (Navier--Stokes-compressible and artificial-compressibility) should provide a PUBLIC create_mask routine.

the dry run would be called like
./wabbit-post --dry-run PARAMS.ini --memory=5.0GB
and create a grid based on the mask function only. the time step is set to the saving interval. no work arrays are allocated (hvy_work)

one should also include the possibility to use an existing grid:
./wabbit-post --create-mask PARAMS.ini input_000103030.h5 mask_000103030.h5
it can then be used to complete existing simulation data, in post-processing, with the mask function. This way, we can save on HDD space or, if the data is on a server, the transfer time to the machine used for visualization.

Note this post/pre processing tool requires to know the INI file and properly initializes the physics modules, much like wabbit itself does.

homogenization of saving intervals

I suggest to create a small module that takes care of the saving conditions for each task.
There is currently only saving the fields, but in the future, saving statistics and the like will be added.

We have the following conditions for saving: (some of which will be in the future)

every N time steps
every T physical time units
but not before the first N0 time steps
and not before the first T0 physical time units
at the beginning and end of the simulation
(?) maybe in the future, if an external command is received

This should be connected to an "intelligent" dt, which ensures we do not jump past the saving times and if necessary adjusts dt to precisely land on this instant.

The BASIC CODE STRUCTURE of the Right hand side

The major structure in a nutshell:

a physics module goes in its own folder (EQUATION/MYPHYSMODEL/).
All its internal routines are PRIVATE and cannot be
seen from wabbit. Only a handful of interface routines are made PUBLIC
with an explicit statement. The parameter struct is a
module global (and of course not visible in wabbit).

Every module has at least the following PUBLIC routines:

READ_PARAMETERS_MYPHYSMODEL -- main level wrapper routine to read parameters in
the physics module. It reads from the same ini-file as wabbit what it
needs to know. caled only once. note in physics modules the parameter
struct for wabbit is not available.

PREPARE_SAVE_DATA_MYPHYSMODEL -- save data. Since you might want to save
derived data, such as the vorticity, the divergence etc., which are not
in your state vector, this routine has to copy and compute what you want
to save to the work array. In the main code, save_fields than saves the
first N_fields_saved components of the work array to file.

FIELD_NAMES_MYPHYSMODEL -- the main routine save_fields has to know how you
label the stuff you want to store from the work array, and this routine
returns those strings

RHS_MYPHYSMODEL -- main RHS. it must be valid for all dimensions (2d/3d). The
idea is that the physics modules are not aware of the GRID. that's for
wabbit. physics modules see only blocks. that's their deal. Now, it is
possible for the RHS to depend on global stuff, such as integrals, for
example for forcing terms. therefore, I suggest a staging concept for
the RHS calls, see RHS_wrapper.f90 and the new module. RHS's that do not
require such integrals can just skip the steps internally.

GET_DT_BLOCK_MYPHYSMODEL -- returns the dt for a block. wabbit selects the
smallest dt of all blocks.

INICOND_MYPHYSMODEL -- sets on one block the initial condition.

All additional routines accessible from WABBIT can be found in module_physics_metamodule

wabbit should check if it saves/reads crap

the code should at least verify, directly after reading and immediately before saving, that the data does not contain NaNs, and if so, abort the simulation informing the user

adaptive-cfd / wabbit Goto Github PK

wabbit's Introduction

WABBIT v2.0beta5 (newBiorthogonal)

(W)avelet (A)daptive (B)lock-(B)ased solver for (I)nteractions with (T)urbulence

Installation of WABBIT

Dependencies

Running WABBIT

Additional Information

Publications

wabbit's People

Contributors

Stargazers

Watchers

Forkers

wabbit's Issues

sum: 158.621

FREE: Beginning freeying of memory. All memory is cleared!

============================ run done, analyzing data now

For the advanced people: block data in RHS call which is not time evolved 🥈

ERROR Messaging in WABBIT

Recommend Projects

Recommend Topics

Recommend Org

FREE: Beginning freeying of memory.
All memory is cleared!

============================
run done, analyzing data now