llnl / amg
Algebraic multigrid benchmark
License: GNU Lesser General Public License v2.1
#BHEADER**********************************************************************
# Copyright (c) 2017, Lawrence Livermore National Security, LLC.
# Produced at the Lawrence Livermore National Laboratory.
# Written by Ulrike Yang ([email protected]) et al. CODE-LLNL-738-322.
# This file is part of AMG. See files COPYRIGHT and README for details.
#
# AMG is free software; you can redistribute it and/or modify it under the
# terms of the GNU Lesser General Public License (as published by the Free
# Software Foundation) version 2.1 dated February 1999.
#
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the IMPLIED WARRANTY OF MERCHANTABILITY or
# FITNESS FOR A PARTICULAR PURPOSE. See the terms and conditions of the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
#
#EHEADER**********************************************************************

General description:

AMG is a parallel algebraic multigrid solver for linear systems arising from
problems on unstructured grids. The driver provided with AMG builds linear
systems for various 3-dimensional problems.

AMG is written in ISO-C. It is an SPMD code which uses MPI and OpenMP
threading within MPI tasks. Parallelism is achieved by data decomposition. The
driver provided with AMG achieves this decomposition by simply subdividing
the grid into logical P x Q x R (in 3D) chunks of equal size.

For more information, see the amg.readme file in the docs directory of the
distribution.
%==========================================================================
%==========================================================================

Building the Code

AMG uses a simple Makefile system for building the code. All compiler and
link options are set by modifying the file 'AMG/Makefile.include'
appropriately.

To build the code, first modify the 'Makefile.include' file appropriately
(it is recommended to use the options -DHYPRE_BIGINT), then type (in the AMG
directory)

  make

Other available targets are

  make clean        (deletes .o files)
  make veryclean    (deletes .o files, libraries, and executables)

To configure the code to run with:

1 - MPI only, add '-DTIMER_USE_MPI' to the 'INCLUDE_CFLAGS' line in the
    'Makefile.include' file and use a valid MPI.
2 - OpenMP with MPI, add vendor dependent compilation flag for OMP
3 - to be able to solve problems that are larger than 2^31-1,
    add '-DHYPRE_BIGINT'
4 - For additional optimizations in MPI add '-DHYPRE_USING_PERSISTENT_COMM'
5 - For additional optimizations in OpenMP add '-DHYPRE_HOPSCOTCH'

%==========================================================================
%==========================================================================

Figure of Merit (FOM)

For problem 1, there are 2 FOMs printed out at the end of each run:

  nnz_AP / setup_time
  nnz_AP * #iterations / solve time

Both need to be considered.

For problem 2, one FOM needs to be considered:

  nnz_AP * (#iterations + time_steps) / time
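
The FOM formulas above can be recomputed by hand from the values the driver
prints. The following is only a minimal sketch: the function names are
hypothetical and are not part of AMG, and the inputs are assumed to be the
nnz_AP, iteration count, time step count, and timings reported by a run.

```c
#include <assert.h>

/* Hypothetical helpers, not part of AMG: they only restate the FOM
 * formulas from the README so the printed values can be cross-checked. */

/* Problem 1, first FOM: nnz_AP / setup_time */
double fom_setup(double nnz_AP, double setup_time)
{
    assert(setup_time > 0.0);
    return nnz_AP / setup_time;
}

/* Problem 1, second FOM: nnz_AP * #iterations / solve time */
double fom_solve(double nnz_AP, int iterations, double solve_time)
{
    assert(solve_time > 0.0);
    return nnz_AP * (double)iterations / solve_time;
}

/* Problem 2: nnz_AP * (#iterations + time_steps) / time */
double fom_problem2(double nnz_AP, int iterations, int time_steps,
                    double time)
{
    assert(time > 0.0);
    return nnz_AP * (double)(iterations + time_steps) / time;
}
```

For example, with nnz_AP = 1000, 10 iterations, and a solve time of 4
seconds, fom_solve gives 2500.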
We are developing a static race detection tool and we found a few potential data races in this project. We were unable to determine if these races are possible under some input, so we thought it best to report just in case.
temp
We found a potential race on temp inside of the parallel loop at seq_mv/csr_matvec.c:747.
#ifdef HYPRE_USING_OPENMP
#pragma omp parallel for private(i,jj) HYPRE_SMP_SCHEDULE
#endif
for (i = 0; i < num_rows; i++)
{
   if (CF_marker_x[i] == fpt)
   {
      temp = y_data[i];
      for (jj = A_i[i]; jj < A_i[i+1]; jj++)
         if (CF_marker_y[A_j[jj]] == fpt) temp += A_data[jj] * x_data[A_j[jj]];
      y_data[i] = temp;
   }
}
There are two writes to temp:
temp = y_data[i];
temp += A_data[jj] ...
We were unable to confirm if the branch conditions guarding these writes prevent multiple threads from executing these writes in parallel.
Was temp intended to be marked private?
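
If the race is real, one possible fix is to add temp to the private clause. The following is only a sketch under assumptions: matvec_ff_sketch is a hypothetical, simplified stand-in for the loop in csr_matvec.c, it uses plain int and double instead of the HYPRE types, and it omits HYPRE_SMP_SCHEDULE.

```c
#include <assert.h>

/* Hypothetical stand-in for the reported loop (not the AMG function):
 * y[i] += sum over jj of A_data[jj] * x[A_j[jj]], restricted to rows and
 * columns whose marker equals fpt. */
void matvec_ff_sketch(int num_rows, const int *A_i, const int *A_j,
                      const double *A_data, const double *x_data,
                      double *y_data, const int *CF_marker_x,
                      const int *CF_marker_y, int fpt)
{
    int i, jj;
    double temp;   /* declared outside the loop, as in the report */

    assert(num_rows >= 0);

    /* temp is listed as private, so each thread accumulates into its
     * own copy instead of a single shared variable. */
#ifdef _OPENMP
#pragma omp parallel for private(i, jj, temp)
#endif
    for (i = 0; i < num_rows; i++)
    {
        if (CF_marker_x[i] == fpt)
        {
            temp = y_data[i];
            for (jj = A_i[i]; jj < A_i[i + 1]; jj++)
                if (CF_marker_y[A_j[jj]] == fpt)
                    temp += A_data[jj] * x_data[A_j[jj]];
            y_data[i] = temp;
        }
    }
}
```

Each row i is written by exactly one iteration, so with temp private the only remaining shared writes are to distinct elements of y_data.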
I have pasted the full report from our tool below for reference.
==== Found a race between:
line 754, column 10 in csr_matvec.c AND line 756, column 51 in csr_matvec.c
Shared variable:
temp at line 675 of csr_matvec.c
675| HYPRE_Complex temp;
Thread 1:
752| if (CF_marker_x[i] == fpt)
753| {
>754| temp = y_data[i];
755| for (jj = A_i[i]; jj < A_i[i+1]; jj++)
756| if (CF_marker_y[A_j[jj]] == fpt) temp += A_data[jj] * x_data[A_j[jj]];
>>>Stack Trace:
>>>.omp_outlined._debug__.36.414 [csr_matvec.c:750]
Thread 2:
754| temp = y_data[i];
755| for (jj = A_i[i]; jj < A_i[i+1]; jj++)
>756| if (CF_marker_y[A_j[jj]] == fpt) temp += A_data[jj] * x_data[A_j[jj]];
757| y_data[i] = temp;
758| }
>>>Stack Trace:
>>>.omp_outlined._debug__.36.414 [csr_matvec.c:750]
The OpenMP region this bug occurs:
/home/brad/tmp/AMG/seq_mv/csr_matvec.c
>747|#pragma omp parallel for private(i,jj) HYPRE_SMP_SCHEDULE
748|#endif
749|
750| for (i = 0; i < num_rows; i++)
751| {
752| if (CF_marker_x[i] == fpt)
res0
The second potential race we identified occurs on res0 inside of the parallel loop at parcsr_ls/par_relax.c:1444.
There are writes to res0 at lines 1470 and 1477:
res0 = 0.0;
res0 -= A_diag_data[jj] ...
#ifdef HYPRE_USING_OPENMP
#pragma omp parallel for private(i,ii,j,jj,ns,ne,res,rest,size) HYPRE_SMP_SCHEDULE
#endif
for (j = 0; j < num_threads; j++)
{
   ....
   for (i = ne-1; i > ns-1; i--) /* interior points first */
   {
      /*-----------------------------------------------------------
       * If diagonal is nonzero, relax point i; otherwise, skip it.
       *-----------------------------------------------------------*/
      if ( A_diag_data[A_diag_i[i]] != zero)
      {
         res = f_data[i];
         res0 = 0.0; // <================================= Racing Write
         res2 = 0.0;
         for (jj = A_diag_i[i]+1; jj < A_diag_i[i+1]; jj++)
         {
            ii = A_diag_j[jj];
            if (ii >= ns && ii < ne)
            {
               res0 -= A_diag_data[jj] * u_data[ii]; // <=== Racing Write
We were unable to determine if the branches guarding these writes would prevent multiple threads from accessing these lines.
If this is a real race, there is a nearly identical case in the parallel loop at parcsr_ls/ams.c:3783 on the variable res2.
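
An alternative to extending the private clause is to declare the accumulators inside the loop body, which makes them private by construction. The sketch below illustrates this on a much simplified kernel. Everything in it is an assumption for illustration: in_chunk_sum_sketch is a hypothetical name, the chunk bounds ns and ne are computed with a simple even split that may differ from the elided code, and only the res0 accumulation from the excerpt is reproduced.

```c
#include <assert.h>

/* Hypothetical, simplified kernel (not the AMG relaxation routine).
 * For every row i it accumulates res0 over the off-diagonal entries whose
 * column lies in the same chunk [ns, ne), and stores the result in out[i].
 * The diagonal is assumed to be the first entry of each row. */
void in_chunk_sum_sketch(int n, int num_chunks,
                         const int *A_diag_i, const int *A_diag_j,
                         const double *A_diag_data,
                         const double *u_data, double *out)
{
    int j;

    assert(num_chunks > 0);

#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (j = 0; j < num_chunks; j++)
    {
        /* Variables declared in this block are private to the thread
         * that executes the iteration; no private clause is needed. */
        int size = n / num_chunks;
        int rest = n - size * num_chunks;
        int ns, ne, i;

        if (j < rest) { ns = j * size + j;    ne = (j + 1) * size + j + 1; }
        else          { ns = j * size + rest; ne = (j + 1) * size + rest; }

        for (i = ne - 1; i > ns - 1; i--)
        {
            double res0 = 0.0;   /* per-row accumulator, never shared */
            int jj;

            for (jj = A_diag_i[i] + 1; jj < A_diag_i[i + 1]; jj++)
            {
                int ii = A_diag_j[jj];
                if (ii >= ns && ii < ne)
                    res0 -= A_diag_data[jj] * u_data[ii];
            }
            out[i] = res0;
        }
    }
}
```

The same pattern would apply to res2 if the case in ams.c turns out to be a real race.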
I ran AMG on Xeon, KNL, and KNM platforms. On KNM/KNL, the number of iterations is much higher than on Xeon with the same configuration. What's more, when I run with 1 OMP thread per process, the number of iterations becomes 0 on KNM/KNL. Is there any way (such as a flag) to control the number of iterations, or is this a problem in the code?
Perhaps this is a strange question, and I am not sure if the project author is still actively involved in it, but I thought I would give it a try and inquire. Here's the situation: I have successfully built the project using the make command. However, when I attempted to compile and run amg.c separately using different approaches (I am using the Clang compiler), I encountered the following issues:
amg.c can be successfully compiled into an executable file directly using Clang, and it runs smoothly. The command is as follows:
clang -fopenmp -o test4 amg.c -I.. -I../utilities -I../IJ_mv -I../seq_mv -I../parcsr_mv -I../parcsr_ls -I../krylov -DTIMER_USE_MPI -DHYPRE_USING_OPENMP -DHYPRE_HOPSCOTCH -DHYPRE_USING_PERSISTENT_COMM -DHYPRE_BIGINT -DHYPRE_TIMING -lmpi -L. -L../parcsr_ls -L../parcsr_mv -L../IJ_mv -L../seq_mv -L../krylov -L../utilities -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm
However, when I first compile amg.c into intermediate code and then into an executable file, I encounter different errors during execution:
clang -S -emit-llvm -o amgtest.ll amgtest.c
clang amgtest.ll -fopenmp -DTIMER_USE_MPI -DHYPRE_USING_OPENMP -DHYPRE_HOPSCOTCH -DHYPRE_USING_PERSISTENT_COMM -DHYPRE_BIGINT -DHYPRE_TIMING -lmpi -L. -L../parcsr_ls -L../parcsr_mv -L../IJ_mv -L../seq_mv -L../krylov -L../utilities -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -o test4 -g
mpirun -np 4 test4

clang -S -emit-llvm -o amgtest.ll amgtest.c
llvm-as amgtest.ll -o amgtest.bc
llc amgtest.bc -o amgtest.s
clang amgtest.s -no-pie -fopenmp -DTIMER_USE_MPI -DHYPRE_USING_OPENMP -DHYPRE_HOPSCOTCH -DHYPRE_USING_PERSISTENT_COMM -DHYPRE_BIGINT -DHYPRE_TIMING -lmpi -L. -L../parcsr_ls -L../parcsr_mv -L../IJ_mv -L../seq_mv -L../krylov -L../utilities -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -o test4 -g
mpirun -np 4 test4
[...] IJ_A structure. [...] amg.c code into intermediate code due to the complexity of the IJ_A structure. This may have resulted in some unexpected errors, leading to the final error. I have ruled out the issue of compiler versions (I have tried Clang 3.8, Clang 4.0, and Clang 3.0). Could it be a problem with the code itself? Have the authors encountered similar errors before? (Since my current research is in code optimization/profiling, I need to manipulate the intermediate code of amg.c.) Do you have any suggestions? Thank you very much!

I just wanted to give huge props and say thank you for how easy this was to build, and to find examples for running here! I literally typed make in an environment with the dependencies (all installed easily with apt) and it worked, and then the example problems did too.
Please close this after reading, just wanted to say thank you!