
amgx's Introduction

Algebraic Multigrid Solver (AmgX) Library

AmgX is a GPU-accelerated core solver library that speeds up the computationally intense linear solver portion of simulations. The library includes a flexible solver composition system that allows a user to easily construct complex nested solvers and preconditioners, and it is well suited for implicit unstructured methods. AmgX offers methods optimized for massive parallelism, the flexibility to choose how solvers are constructed, and a simple C API that abstracts the parallelism and scales across one or more GPUs using user-provided MPI.

This is the source of the AMGX library on the NVIDIA Registered Developer Program portal.

Key features of the library include:

  • fp32, fp64 and mixed precision solve
  • Complex datatype support (currently limited)
  • Scalar or coupled block systems
  • Distributed solvers using provided MPI
  • Flexible configuration allows for nested solvers, smoothers and preconditioners
  • Classical (Ruge-Stüben) and Unsmoothed Aggregation algebraic multigrid
  • Krylov methods: CG, BiCGSTAB, GMRES, etc. with optional preconditioning
  • Various smoothers: Jacobi, Gauss-Seidel, Incomplete LU, Chebyshev Polynomial, etc.
  • Many exposed algorithm parameters via solver configuration in JSON format (see the example below)
  • Modular structure for easy implementation of your own methods
  • Linux and Windows support
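
As an illustration of the JSON configuration style, here is a small config that nests a multicolor DILU preconditioner inside PCG; it mirrors the bundled PCG_DILU.json quoted in the issues further down this page (config_version 2 field names):

{
    "config_version": 2,
    "solver": {
        "solver": "PCG",
        "scope": "main",
        "max_iters": 20,
        "tolerance": 1e-06,
        "norm": "L2",
        "monitor_residual": 1,
        "preconditioner": {
            "scope": "precond",
            "solver": "MULTICOLOR_DILU"
        }
    }
}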

Check out these case studies and white papers.


Quickstart

Here are instructions for building the library and running an example solver on a matrix stored in a Matrix Market file. By default, the provided examples use a vector of ones as the RHS of the linear system and a vector of zeros as the initial solution. To provide your own values for the RHS and initial solution, edit the examples.
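
For instance, a minimal sketch of such an edit, assuming dDDI mode and vector handles b and x as created in the bundled examples (the size and values here are purely illustrative):

/* Upload a custom RHS and initial guess instead of the defaults.
   n must match the number of matrix rows; block size is 1 for scalar systems. */
int n = 12;
double rhs[12], x0[12];
for (int i = 0; i < n; ++i) { rhs[i] = 1.0; x0[i] = 0.0; }
AMGX_vector_upload(b, n, 1, rhs);
AMGX_vector_upload(x, n, 1, x0);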

Dependencies and requirements

To build the project you will need CMake and the CUDA Toolkit. If you want to try the distributed version of the AMGX library you will also need an MPI implementation, such as OpenMPI for Linux or MPICH for Windows. You will need a compiler with C++11 support (for example GCC 4.8 or MSVC 14.0) and an NVIDIA GPU with Compute Capability >= 3.0; check whether your GPU supports this here.

Cloning / Pulling

In order to pull all necessary dependencies, AmgX must be cloned using the --recursive option, i.e.:

git clone --recursive git@github.com:nvidia/amgx.git

If you want to update a copy of the repository which was cloned without --recursive, you can use:

git submodule update --init --recursive

Building

Typical build commands from the project root:

mkdir build
cd build
cmake ../
make -j16 all

There are a few custom CMake flags that you can use:

  • CUDA_ARCH: List of virtual architecture values that the CMakeLists file translates into the corresponding nvcc flags. For example:
cmake ....  -DCUDA_ARCH="60 70" ....
  • CMAKE_NO_MPI: Boolean value. If True, a non-MPI (single GPU) build is forced. This results in a smaller library that can run on systems without MPI installed. If not specified, an MPI build is enabled whenever the FindMPI script finds an MPI installation.
  • AMGX_NO_RPATH: Boolean value. By default CMake adds -rpath flags to binaries. Setting this flag to True tells CMake not to do that, which is useful for controlling the execution environment.
  • MKL_ROOT_DIR and MAGMA_ROOT_DIR: String values. MKL/MAGMA functionality is used to accelerate some of the AMGX eigensolvers. Those solvers will return a 'not supported' error if AMGX was not built with MKL/MAGMA support.

The build system now enables CUDA as a language, and employs FindCUDAToolkit and FindMPI, so refer to those scripts from your CMake installation for module-specific flags.

When building with the NVIDIA HPC SDK, please use CMake >= 3.22, and GCC for C/CXX compilation, e.g.

cmake \
    -DCMAKE_C_COMPILER=gcc \
    -DCMAKE_CXX_COMPILER=g++ \
    -DCMAKE_BUILD_TYPE=Release \
    -DCUDA_ARCH="80" ..

The build produces shared and static libraries (libamgxsh.so or amgxsh.dll, and libamgx.a or amgx.lib) and a few binaries from the 'examples' directory that demonstrate various parts of the AMGX C API. MPI examples are built only if the MPI build is enabled.
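
For orientation, here is a condensed sketch of the C API flow that the example binaries follow (modeled on examples/amgx_capi.c, with all error checking omitted; treat it as an illustration rather than a complete program):

#include <amgx_c.h>

int main(void)
{
    AMGX_initialize();
    AMGX_initialize_plugins();

    /* Build a solver from a JSON configuration file. */
    AMGX_config_handle cfg;
    AMGX_config_create_from_file(&cfg, "FGMRES_AGGREGATION.json");
    AMGX_resources_handle rsrc;
    AMGX_resources_create_simple(&rsrc, cfg);

    /* dDDI = device memory, double matrix, double vector, int indices. */
    AMGX_matrix_handle A;
    AMGX_vector_handle x, b;
    AMGX_solver_handle solver;
    AMGX_matrix_create(&A, rsrc, AMGX_mode_dDDI);
    AMGX_vector_create(&x, rsrc, AMGX_mode_dDDI);
    AMGX_vector_create(&b, rsrc, AMGX_mode_dDDI);
    AMGX_solver_create(&solver, rsrc, AMGX_mode_dDDI, cfg);

    /* Read A (and b, x if present) from a Matrix Market file, then solve. */
    AMGX_read_system(A, b, x, "matrix.mtx");
    AMGX_solver_setup(solver, A);
    AMGX_solver_solve(solver, b, x);

    /* Tear down in reverse order of creation. */
    AMGX_solver_destroy(solver);
    AMGX_vector_destroy(b);
    AMGX_vector_destroy(x);
    AMGX_matrix_destroy(A);
    AMGX_resources_destroy(rsrc);
    AMGX_config_destroy(cfg);
    AMGX_finalize_plugins();
    AMGX_finalize();
    return 0;
}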

Running examples

The sample input matrix matrix.mtx is in the examples directory. Sample AMGX solver configurations are located in the src/configs directory in the root folder. Make sure the examples are able to find the AMGX shared library: by default the -rpath flag is used for the binaries, but you may also specify the path manually in an environment variable, LD_LIBRARY_PATH on Linux or PATH on Windows.

Running the single GPU example from the build directory:

> examples/amgx_capi -m ../examples/matrix.mtx -c ../src/configs/FGMRES_AGGREGATION.json
AMGX version 2.0.0-public-build125
Built on Oct  7 2017, 04:51:11
Compiled with CUDA Runtime 9.0, using CUDA driver 9.0
Warning: No mode specified, using dDDI by default.
Reading data...
RHS vector was not found. Using RHS b=[1,…,1]^T
Solution vector was not found. Setting initial solution to x=[0,…,0]^T
Finished reading
AMG Grid:
         Number of Levels: 1
            LVL         ROWS               NNZ    SPRSTY       Mem (GB)
         --------------------------------------------------------------
           0(D)           12                61     0.424       8.75e-07
         --------------------------------------------------------------
         Grid Complexity: 1
         Operator Complexity: 1
         Total Memory Usage: 8.75443e-07 GB
         --------------------------------------------------------------
           iter      Mem Usage (GB)       residual           rate
         --------------------------------------------------------------
            Ini            0.403564   3.464102e+00
              0            0.403564   1.619840e-14         0.0000
         --------------------------------------------------------------
         Total Iterations: 1
         Avg Convergence Rate:               0.0000
         Final Residual:           1.619840e-14
         Total Reduction in Residual:      4.676075e-15
         Maximum Memory Usage:                0.404 GB
         --------------------------------------------------------------
Total Time: 0.00169123
    setup: 0.00100198 s
    solve: 0.000689248 s
    solve(per iteration): 0.000689248 s

Running the multi GPU example from the build directory:

> mpirun -n 2 examples/amgx_mpi_capi.exe -m ../examples/matrix.mtx -c ../src/configs/FGMRES_AGGREGATION.json
Process 0 selecting device 0
Process 1 selecting device 0
AMGX version 2.0.0-public-build125
Built on Oct  7 2017, 04:51:11
Compiled with CUDA Runtime 9.0, using CUDA driver 9.0
Warning: No mode specified, using dDDI by default.
Warning: No mode specified, using dDDI by default.
Cannot read file as JSON object, trying as AMGX config
Converting config string to current config version
Parsing configuration string: exception_handling=1 ;
Using Normal MPI (Hostbuffer) communicator...
Reading matrix dimensions in file: ../examples/matrix.mtx
Reading data...
RHS vector was not found. Using RHS b=[1,…,1]^T
Solution vector was not found. Setting initial solution to x=[0,…,0]^T
Finished reading
Using Normal MPI (Hostbuffer) communicator...
Using Normal MPI (Hostbuffer) communicator...
Using Normal MPI (Hostbuffer) communicator...
AMG Grid:
         Number of Levels: 1
            LVL         ROWS               NNZ    SPRSTY       Mem (GB)
         --------------------------------------------------------------
           0(D)           12                61     0.424        1.1e-06
         --------------------------------------------------------------
         Grid Complexity: 1
         Operator Complexity: 1
         Total Memory Usage: 1.09896e-06 GB
         --------------------------------------------------------------
           iter      Mem Usage (GB)       residual           rate
         --------------------------------------------------------------
            Ini             0.79834   3.464102e+00
              0             0.79834   3.166381e+00         0.9141
              1              0.7983   3.046277e+00         0.9621
              2              0.7983   2.804132e+00         0.9205
              3              0.7983   2.596292e+00         0.9259
              4              0.7983   2.593806e+00         0.9990
              5              0.7983   3.124839e-01         0.1205
              6              0.7983   5.373423e-02         0.1720
              7              0.7983   9.795357e-04         0.0182
              8              0.7983   1.651436e-13         0.0000
         --------------------------------------------------------------
         Total Iterations: 9
         Avg Convergence Rate:               0.0331
         Final Residual:           1.651436e-13
         Total Reduction in Residual:      4.767284e-14
         Maximum Memory Usage:                0.798 GB
         --------------------------------------------------------------
Total Time: 0.0170917
    setup: 0.00145344 s
    solve: 0.0156382 s
    solve(per iteration): 0.00173758 s

Testing the library

AmgX is automatically tested using the infrastructure in the ci/ directory; see the README.md there for more information.

Further reading

Plugins and bindings to other software

User @shwina built Python bindings to AMGX; check out the following repository: https://github.com/shwina/pyamgx.

User @piyueh provided a link to their work on a PETSc wrapper plugin for AMGX: https://github.com/barbagroup/AmgXWrapper.

Julia bindings to AMGX are available at: https://github.com/JuliaGPU/AMGX.jl.

See the API reference doc for a detailed description of the interface. In the next few weeks we will be providing more information and details on the project, such as:

  • Plans on the project development and priorities
  • Issues
  • Information on contributing
  • Information on solver configurations
  • Information on the code and algorithms


amgx's Issues

Cannot read file as JSON object

When I use amgx_capi to solve some sparse linear systems, it turns out that most of the default configs cannot be loaded by the API. A default config is used instead, and the number of AMG levels is set to 1.

Cannot read file as JSON object, trying as AMGX config
Converting config string to current config version

For some large enough sparse systems, these default settings won't converge.

For instance, ../core/configs/FGMRES_AGGREGATION_JACOBI.json will be loaded, however, ../core/configs/FGMRES_CLASSICAL_AGGRESSIVE_HMIS.json and ../core/configs/AMG_CLASSICAL_L1_AGGRESSIVE_HMIS.json will not.

Error building for Ubuntu 18.04.02 + CUDA 10.1.105-1

-- The C compiler identification is GNU 7.3.0
-- The CXX compiler identification is GNU 7.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Found MPI_C: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so (found version "3.1")
-- Found MPI_CXX: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so (found version "3.1")
-- Found MPI: TRUE (found version "3.1")
This is a MPI build:TRUE
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA: /usr/local/cuda (found version "10.1")
Cuda libraries: /usr/local/cuda/lib64/libcudart_static.a-lpthreaddl/usr/lib/x86_64-linux-gnu/librt.so
-- Configuring done
-- Generating done
-- Build files have been written to: /home/victor/AMGX/build

I also tried gcc and g++ version 4.8.5 without success.

During compilation, multiple warnings appear and, as a result, the build fails:
/home/victor/AMGX/base/include/matrix.h:247:3646: note: in C++11 destructors default to noexcept
/home/victor/AMGX/base/include/matrix.h:247:4053: warning: throw will always call terminate() [-Wterminate]
cusparseCheckError(cusparseDestroyMatDescr(cuMatDescr));
/home/victor/AMGX/base/include/matrix.h:247:4053: note: in C++11 destructors default to noexcept
CMakeFiles/Makefile2:165: recipe for target 'base/CMakeFiles/amgx_base.dir/all' failed
make[1]: *** [base/CMakeFiles/amgx_base.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2

I would greatly appreciate help dealing with this problem and building the library.

Compiling lib from scratch fails

git clone https://github.com/NVIDIA/AMGX.git
cd amgx; mkdir build; cd build; cmake ..

This fails because I don't pass any intended CUDA architectures; after removing CMakeLists.txt:247, cmake runs successfully.

Running 'make' then results in an error around 7%:
amgx/core/src/classical/interpolators/distance2.cu(1343): error: identifier "sign" is undefined
detected during:
instantiation of "void amgx::distance2::compute_inner_sum_kernel<Value_type,CTA_SIZE,SMEM_SIZE,WARP_SIZE>(int, const int *, const int *, const Value_type *, const int *, const __nv_bool *, const int *, const int *, const int *, const int *, const Value_type *, const int *, Value_type *, int, int *, int *) [with Value_type=double, CTA_SIZE=256, SMEM_SIZE=128, WARP_SIZE=32]"
(2418): here
instantiation of "void amgx::Distance2_Interpolator<amgx::TemplateConfig<(AMGX_MemorySpace)1, t_vecPrec, t_matPrec, t_indPrec>>::generateInterpolationMatrix_1x1(amgx::Distance2_Interpolator<amgx::TemplateConfig<(AMGX_MemorySpace)1, t_vecPrec, t_matPrec, t_indPrec>>::Matrix_d &, amgx::Distance2_Interpolator<amgx::TemplateConfig<(AMGX_MemorySpace)1, t_vecPrec, t_matPrec, t_indPrec>>::IntVector &, amgx::Distance2_Interpolator<amgx::TemplateConfig<(AMGX_MemorySpace)1, t_vecPrec, t_matPrec, t_indPrec>>::BVector &, amgx::Distance2_Interpolator<amgx::TemplateConfig<(AMGX_MemorySpace)1, t_vecPrec, t_matPrec, t_indPrec>>::IntVector &, amgx::Distance2_Interpolator<amgx::TemplateConfig<(AMGX_MemorySpace)1, t_vecPrec, t_matPrec, t_indPrec>>::Matrix_d &, void *) [with t_vecPrec=(AMGX_VecPrecision)0, t_matPrec=(AMGX_MatPrecision)0, t_indPrec=(AMGX_IndPrecision)2]"
(2521): here

Line 1343 indeed calls the function sign(), which is defined, but only as a device function.
I do have cuda-10 installed, but cuda-9.2 precedes cuda-10 in $PATH.

Openfoam Matrix

Hi guys, how do I convert an OpenFOAM mesh matrix to the Matrix Market format for AMGX?

Cannot compile with sm30 on Quadro K1100

Hi,

So I am looking at compiling AmgX on my laptop's Quadro K1100. After making the changes discussed in #11 the project compiles, but running the examples doesn't work: no kernel image is available for execution on the device. I checked the CMakeLists.txt and of course it's because CUDA_ARCH is fixed to sm35 and greater. So I went back to my configuration and, based on this website, put sm30 in my config and compile procedure:

install_dir=${HOME}/apps/amgx
build_dir=build-gcc-ompi

# clean
rm -rf $build_dir
mkdir $build_dir
cd $build_dir

export OMPI_CC=/opt/cuda/bin/gcc
export OMPI_CXX=/opt/cuda/bin/g++

cmake \
    -DCMAKE_INSTALL_PREFIX=$install_dir \
    -DCMAKE_C_COMPILER=/opt/cuda/bin/gcc \
    -DCUDA_ARCH="30" \
    -DCMAKE_CXX_COMPILER=/opt/cuda/bin/g++ \
    ../ && make -j2 && make install

Alas, this breaks pretty early on, producing lots of errors. See the file attached. I am investigating it independently, but if you have any advice, please let me know.

Non-mpi build fails with GCC-5

Hi,
I tried to build AMGX on Ubuntu 16.04, but was not able to get very far.
Basic information about my system:
Ubuntu 16.04,
gcc, g++, 5.4.0
cuda 7.5

Compilation fails at 4%:
[ 4%] Building NVCC (Device) object base/CMakeFiles/amgx_base.dir/src/energymin/interpolators/amgx_base_generated_em_interpolator.cu.o
/usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(36): error: identifier "__builtin_ia32_monitorx" is undefined

Any suggestions on how to move forward? Thanks a lot!

Solving a system with multiple right-hand sides

Is there currently a way to solve a system A*X=B, where X and B are not single vectors but multiple vectors, at once? The only part of the documentation I could find that approaches this topic talks about computing multiple right-hand sides in succession (see the sketch below).
There certainly are applications where prior knowledge of multiple right-hand-side vectors b^i is given, and computing the solutions to them simultaneously could lead to efficiency gains.
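
For reference, the "in succession" approach from the documentation amounts to something like the following sketch, where num_rhs, B and X are hypothetical names for the RHS count and the column-major arrays of RHS and solution vectors; the solver setup is done once and reused:

/* One setup, then num_rhs successive solves against the same matrix. */
AMGX_solver_setup(solver, A);
for (int k = 0; k < num_rhs; ++k)
{
    AMGX_vector_upload(b, n, 1, B + (size_t)k * n);  /* k-th RHS column */
    AMGX_solver_solve(solver, b, x);
    AMGX_vector_download(x, X + (size_t)k * n);      /* k-th solution column */
}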

AMGx adapter in Trilinos/MueLu

Hello,
In 2015, an AMGX adapter was created within Trilinos's MueLu package [E. Furst, A. Prokopenko, J. Hu, 'Creating an AMGX adapter within the MueLu package', CCR Summer Proceedings 2015]. It is, however, restricted to a single GPU. Do you know whether work is currently being done (or planned) to make it work with more than a single GPU?
Thank you very much.
With best regards,
Serge

Build by vs2017, CUDA10.0 ERRORS

I can't solve it:
error LNK1104: cannot open file "D:\Users\HIT\Desktop\AMGX-master\build6\base\CMakeFiles\amgx_base.dir\src\Debug\amgx_base_generated_csr_multiply_sm20.cu.obj" amgxsh D:\Users\HIT\Desktop\AMGX-master\build6\LINK 1

error calling a host function("std::_Iterator_base12::_Iterator_base12") from a host device function("std::_Iterator_base12::_Iterator_base12 [subobject]") is not allowed amgx_base C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\include\thrust\system\cuda\detail\assign_value.h 78

Mode switch missing in the docs?

Independently, I got AmgX to compile on K80s on our cluster and I can now run the examples. It seems to me that the -mode switch is missing in README.md, e.g. this

examples/amgx_capi -m ../examples/matrix.mtx -c ../core/configs/FGMRES_AGGREGATION.json

Should be probably something like:

examples/amgx_capi -mode dDDI -m ../examples/matrix.mtx -c ../core/configs/FGMRES_AGGREGATION.json

Happy to provide an independent pull request with a README.md update and a tabular description of the modes if you think it's useful.

GTC 2018 Anyone? (this is not an issue)

Hi,
is anyone from the developer team or any other user at GTC 2018 in San Jose?

I would be interested in meeting developers / other users to understand better the road-map of AMGX and share experiences in the use of AMGX.

Best Regards,

Andrea Borsic

An (indirect) issue from Fraunhofer-Chalmers Centre: CUDA 9.0/9.1

@marsaev @niklaskarla and I have been having severe performance issues with cusparse<type>csrsv_analysis in CUDA 9.0 and CUDA 9.1. It is 20 times slower than in CUDA 8.0, just running the same sample code conjugateGradientPrecond on the same GPU for a sufficiently large matrix.

We know it is not AMGX related (from what I can see, that specific function is not called anywhere in AMGX), but since AMGX on Windows requires CUDA 9.0+, this forces us to use those versions. Is it a known bug? Could we direct it to the cuSPARSE API group?

No host implementation of the dense LU solver

I get the following error message:

"Caught amgx exception: No host implementation of the dense LU solver"

How can I fix this?

I use the following configuration: (mode=hDDI)
config_version=2
solver(main)=FGMRES
main:max_iters=300
main:convergence=RELATIVE_MAX
main:tolerance=0.00000001
main:monitor_residual=1
main:preconditioner(amg)=AMG
main:print_solve_stats=1
amg:algorithm=CLASSICAL
amg:cycle=V
amg:max_iters=1
amg:max_levels=10
amg:smoother(amg_smoother)=BLOCK_JACOBI
amg:relaxation_factor=0.75
amg:presweeps=1
amg:postsweeps=2
amg:coarsest_sweeps=4
determinism_flag=1

Exact output:
AMGX version 2.0.0.130-opensource
Built on May 21 2019, 15:47:45
Compiled with CUDA Runtime 10.0, using CUDA driver 10.2
Cannot read file as JSON object, trying as AMGX config
Caught amgx exception: No host implementation of the dense LU solver
at: /home/lucas/Repositories/AMGX-master/core/include/solvers/dense_lu_solver.h:65

More information:
Ubuntu 18.04 (gcc 7.4)
CUDA 10.0
AMGX source from git master at: 21st of may
Build using MKL 2019.3-062 and MAGMA 2.3.0

Problem compiling my program

Hi guys, I am trying to compile a simple program using AMGX. But when I try to initialize AMGX, e.g.:

#include <amgx_c.h>

int main(void)
{
    AMGX_initialize();
    return 0;
}

I get one error: "undefined reference to AMGX_initialize".

What is happening?
Best regards.

Extract multiple eigenvalues

Hello guys.
I am trying to figure out how to get multiple eigenvalues using your example 'eigen_examples/eigensolver.c'. Can it be done with a specific config file, or do I have to modify the code?
I suppose the algorithm works like this: starting from a random vector and modifying it through iterations, the vector gets closer to an eigenvector.
Then I would have to start from different random vectors to obtain multiple eigenvalues, but this will not reliably give me all of the eigenvalues every time.
Is there any info I can read about these eigensolver algorithms, or do I have to get this info by reading all the code?
Cheers.

Why is the result divergent?

iter Mem Usage (GB) residual rate
--------------------------------------------------------------
Ini 0.743436 1.242538e+02
0 0.743436 1.895453e+03 15.2547
1 0.7434 2.219963e+03 1.1712
2 0.7434 6.992442e+03 3.1498
3 0.7434 4.034371e+04 5.7696
4 0.7434 2.512463e+05 6.2276
5 0.7434 1.574859e+06 6.2682
6 0.7434 9.933583e+06 6.3076
7 0.7434 6.550438e+07 6.5942
8 0.7434 4.426426e+08 6.7575
9 0.7434 2.992345e+09 6.7602
10 0.7434 2.048499e+10 6.8458
11 0.7434 1.371790e+11 6.6966
12 0.7434 8.775767e+11 6.3973
13 0.7434 5.814756e+12 6.6259
14 0.7434 3.906154e+13 6.7177
15 0.7434 2.647010e+14 6.7765
16 0.7434 1.822857e+15 6.8865
17 0.7434 1.258167e+16 6.9022
18 0.7434 8.768401e+16 6.9692
19 0.7434 6.146868e+17 7.0102
20 0.7434 4.336527e+18 7.0549
21 0.7434 3.075281e+19 7.0916
22 0.7434 2.193094e+20 7.1314
23 0.7434 1.572653e+21 7.1709
24 0.7434 1.133802e+22 7.2095
25 0.7434 8.224542e+22 7.2539
26 0.7434 5.996229e+23 7.2907
27 0.7434 4.401880e+24 7.3411
28 0.7434 3.243756e+25 7.3690
29 0.7434 2.408748e+26 7.4258
30 0.7434 1.788772e+27 7.4261
31 0.7434 1.339881e+28 7.4905
32 0.7434 9.923855e+28 7.4065
33 0.7434 7.419991e+29 7.4769
34 0.7434 5.320212e+30 7.1701
35 0.7434 3.810510e+31 7.1623
36 0.7434 2.739372e+32 7.1890
37 0.7434 1.967098e+33 7.1808
38 0.7434 1.414827e+34 7.1925
39 0.7434 1.017845e+35 7.1941
40 0.7434 7.338657e+35 7.2100
41 0.7434 5.301943e+36 7.2247
42 0.7434 3.837148e+37 7.2372
43 0.7434 2.786919e+38 7.2630
44 0.7434 2.020863e+39 7.2512
45 0.7434 1.470666e+40 7.2774
46 0.7434 1.058089e+41 7.1946
47 0.7434 7.589583e+41 7.1729
48 0.7434 5.452218e+42 7.1838
49 0.7434 3.913964e+43 7.1787
50 0.7434 2.815332e+44 7.1930
51 0.7434 2.026757e+45 7.1990
52 0.7434 1.461508e+46 7.2111
53 0.7434 1.056404e+47 7.2282
54 0.7434 7.628804e+47 7.2215
55 0.7434 5.523249e+48 7.2400
56 0.7434 3.972145e+49 7.1917
57 0.7434 2.852312e+50 7.1808
58 0.7434 2.046261e+51 7.1740
59 0.7434 1.470226e+52 7.1849
60 0.7434 1.055143e+53 7.1767
61 0.7434 7.573274e+53 7.1775
62 0.7434 5.432389e+54 7.1731
63 0.7434 3.901670e+55 7.1822
64 0.7434 2.800355e+56 7.1773
65 0.7434 2.015150e+57 7.1961
66 0.7434 1.449547e+58 7.1932
67 0.7434 1.045759e+59 7.2144
68 0.7434 7.526693e+59 7.1973
69 0.7434 5.426749e+60 7.2100
70 0.7434 3.898031e+61 7.1830
71 0.7434 2.798482e+62 7.1792
72 0.7434 2.007781e+63 7.1745
73 0.7434 1.443555e+64 7.1898
74 0.7434 1.036727e+65 7.1818
75 0.7434 7.455786e+65 7.1917
76 0.7434 5.355224e+66 7.1826
77 0.7434 3.851526e+67 7.1921
78 0.7434 2.766181e+68 7.1820
79 0.7434 1.989039e+69 7.1906
80 0.7434 1.428224e+70 7.1805
81 0.7434 1.026730e+71 7.1889
82 0.7434 7.371549e+71 7.1796
83 0.7434 5.298678e+72 7.1880
84 0.7434 3.804295e+73 7.1797
85 0.7434 2.734669e+74 7.1884
86 0.7434 1.963625e+75 7.1805
87 0.7434 1.411857e+76 7.1901
88 0.7434 1.013979e+77 7.1819
89 0.7434 7.292470e+77 7.1919
90 0.7434 5.238013e+78 7.1828
91 0.7434 3.767246e+79 7.1921
92 0.7434 2.705717e+80 7.1822
93 0.7434 1.945582e+81 7.1906
94 0.7434 1.397137e+82 7.1811
95 0.7434 1.004480e+83 7.1896
96 0.7434 7.212811e+83 7.1806
97 0.7434 5.185486e+84 7.1893
98 0.7434 3.723612e+85 7.1808
99 0.7434 2.677180e+86 7.1897
--------------------------------------------------------------
Total Iterations: 100
Avg Convergence Rate: 6.9716
Final Residual: 2.677180e+86
Total Reduction in Residual: 2.154607e+84
Maximum Memory Usage: 0.743 GB

Specifying RHS vector in amgx_capi

I started playing a little bit with matrices from the SuiteSparse Matrix Collection, and some matrices there come together with separate files for RHS vectors. At the moment I can't get them to work with the amgx_capi and amgx_mpi_capi applications.

The apps use the AMGX_read_system function to read the matrices. I found the following passage in the AmgX documentation:

%%MatrixMarket matrix coordinate real general
%%AMGX block_dimx(int) block_dimy(int) diagonal sorted rhs solution
%% mxn matrix with nnz non-zero elements
%% m=block_dimx*n_block_rows, n=block_dimy*n_block_cols
%% nnz=block_dimx*block_dimy*n_block_entrees
m(int) n(int) nnz(int)
1 1 a_11
1 2 a_12
...
i j a_ij
...
%% these two comment lines present only for the description (to be removed)
%% optional diagonal mx1
...
a_ii
...
%% these two comment lines present only for the description (to be removed)
%% optional rhs mx1
...
b_i
...
%% these two comment lines present only for the description (to be removed)
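
As a concrete (hypothetical) instance of that layout, a tiny 2x2 diagonal system with an appended RHS might look as follows, assuming the rhs keyword on the %%AMGX line is what announces the extra data:

%%MatrixMarket matrix coordinate real general
%%AMGX 1 1 sorted rhs
2 2 2
1 1 4.0
2 2 4.0
1.0
1.0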

So as an example I cat atmosmodl.mtx atmosmodl_b.mtx > atmosmodl_Ab.mtx and remove the comment lines in between. That doesn't seem to do anything, and I still get the message:

Reading data...
RHS vector was not found. Using RHS b=[1,…,1]^T

Also, I am concerned that the diagonal entries are spread independently within the first block of data.

Please advise. I am happy to delve deeper into the code, but I thought I'd check first.

I can't compile this project with CMake at all, after many days and nights

I tried many times, and can't find useful information on the Internet.
system information:
CUDA 10.0
Cmake cmake-3.13.0-rc3-win64-x64
IDE VS 2017
TOOOOOOO many errors, shown below! This made me MAD.

Found OpenMP_C: -openmp
Found OpenMP_CXX: -openmp
Found OpenMP: TRUE
Could NOT find MPI_C (missing: MPI_C_LIB_NAMES MPI_C_HEADER_DIR MPI_C_WORKS)
Could NOT find MPI_CXX (missing: MPI_CXX_LIB_NAMES MPI_CXX_HEADER_DIR MPI_CXX_WORKS)
Could NOT find MPI (missing: MPI_C_FOUND MPI_CXX_FOUND)
This is a MPI build:FALSE
CUDA_TOOLKIT_ROOT_DIR not found or specified
Could NOT find CUDA (missing: CUDA_TOOLKIT_ROOT_DIR CUDA_NVCC_EXECUTABLE CUDA_INCLUDE_DIRS CUDA_CUDART_LIBRARY)
Cuda libraries: CUDA_CUDART_LIBRARY-NOTFOUND
CMake Error at CMakeLists.txt:247 (STRING):
STRING sub-command REGEX, mode REPLACE needs at least 6 arguments total to
command.

CUDA_TOOLKIT_ROOT_DIR not found or specified
Could NOT find CUDA (missing: CUDA_TOOLKIT_ROOT_DIR CUDA_NVCC_EXECUTABLE CUDA_INCLUDE_DIRS CUDA_CUDART_LIBRARY)
CUDA_TOOLKIT_ROOT_DIR not found or specified
Could NOT find CUDA (missing: CUDA_TOOLKIT_ROOT_DIR CUDA_NVCC_EXECUTABLE CUDA_INCLUDE_DIRS CUDA_CUDART_LIBRARY)
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_CUDART_LIBRARY (ADVANCED)
linked by target "amgx" in directory D:/Users/HIT/Desktop/AMGX-master
linked by target "amgxsh" in directory D:/Users/HIT/Desktop/AMGX-master
linked by target "amgx_base" in directory D:/Users/HIT/Desktop/AMGX-master/base
linked by target "amgx_core" in directory D:/Users/HIT/Desktop/AMGX-master/core
linked by target "amgx_template_plugin" in directory D:/Users/HIT/Desktop/AMGX-master/template_plugin
linked by target "amgx_eigensolvers" in directory D:/Users/HIT/Desktop/AMGX-master/eigensolvers
linked by target "generate_poisson" in directory D:/Users/HIT/Desktop/AMGX-master/examples
linked by target "generate_poisson7_dist_renum" in directory D:/Users/HIT/Desktop/AMGX-master/examples
linked by target "amgx_tests_library" in directory D:/Users/HIT/Desktop/AMGX-master/tests
linked by target "amgx_tests_launcher" in directory D:/Users/HIT/Desktop/AMGX-master/tests
CUDA_TOOLKIT_INCLUDE (ADVANCED)
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/base
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/base
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/core
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/core
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/core
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/template_plugin
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/template_plugin
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/eigensolvers
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/eigensolvers
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/eigen_examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/eigen_examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/eigen_examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/eigen_examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
cublas_library
linked by target "amgx" in directory D:/Users/HIT/Desktop/AMGX-master
linked by target "amgxsh" in directory D:/Users/HIT/Desktop/AMGX-master
cusolver_library
linked by target "amgx" in directory D:/Users/HIT/Desktop/AMGX-master
linked by target "amgxsh" in directory D:/Users/HIT/Desktop/AMGX-master
linked by target "generate_poisson" in directory D:/Users/HIT/Desktop/AMGX-master/examples
linked by target "generate_poisson7_dist_renum" in directory D:/Users/HIT/Desktop/AMGX-master/examples
cusparse_library
linked by target "amgx" in directory D:/Users/HIT/Desktop/AMGX-master
linked by target "amgxsh" in directory D:/Users/HIT/Desktop/AMGX-master
linked by target "generate_poisson" in directory D:/Users/HIT/Desktop/AMGX-master/examples
linked by target "generate_poisson7_dist_renum" in directory D:/Users/HIT/Desktop/AMGX-master/examples
linked by target "amgx_tests_launcher" in directory D:/Users/HIT/Desktop/AMGX-master/tests

Configuring incomplete, errors occurred!
See also "D:/Users/HIT/Desktop/AMGX-master/build/CMakeFiles/CMakeOutput.log".

Yours,
Ning

Question on AMGX configuration of published results.

Hi,
I have built AMGX and it works fine. For verification, I am trying to reproduce the results in Table 2 on page S618 of the article on AMGX by M. Naumov et al. in SIAM J. Sci. Comput., vol. 37, no. 5, pp. S602-S626 (2015). I have tried different settings, but cannot make the number of iterations agree (the hardware is different, so execution times will not agree) for any of the cases. It is not clear from the paper exactly which settings AMGX is using: is it only AMG, or AMG as a preconditioner to another iterative method? Is there by any chance a configuration file available for those runs?
Best Regards
Bjorn

Can't understand the indexing scheme of coefficient matrix-A in Poisson's eqn solver example

Hi! I am modifying the Poisson's equation solver example to incorporate a variable-coefficient Poisson's equation; however, I am unable to interpret the indexing scheme.

The precise issue is that, according to my understanding, every grid point in my original matrix (for which I am solving the 5-point discretized 2D Poisson's equation) should have 5 coefficients (-1, -1, 4, -1 and -1); however, I find that the number of coefficients varies from 2 to 5 for different grid points.

To understand the example code before modifying it for the variable-coefficient case, I printed the grid-point index and its coefficient value for a simple case of a 4x4 grid with a single CPU and GPU process.
I ran the code as follows:
mpirun -np 1 examples/amgx_mpi_poisson5 -mode dDDI -p 4 4 -c ../core/configs/FGMRES_AGGREGATION_JACOBI.json

The output is attached herewith.
output.log

The example code is also attached.
amgx_mpi_poisson5pt.zip

CMake requires CUDA9.5 flags

Hi,

Thanks for open-sourcing AMGX! Looking forward to using it.

Meanwhile, I am trying to build it on my up-to-date Arch Linux laptop with a Quadro GPU and CUDA 9.1. I am aiming for a distributed version, so I do the following:

install_dir=${HOME}/apps/amgx                                                                                                                                                                                                                                                   

export OMPI_CC=/opt/cuda/bin/gcc
export OMPI_CXX=/opt/cuda/bin/g++

cmake \
    -DCMAKE_INSTALL_PREFIX=$install_dir \
    ../

I am getting a slightly confusing error about setting CUDA 9.5 flags inside CMakeLists.txt, even though 9.1 was correctly detected.

-- Found CUDA: /opt/cuda (found version "9.1")                                                                                                                                                                                                                                  
Cuda libraries: /opt/cuda/lib64/libcudart_static.a-lpthreaddl/usr/lib/librt.so                                                                                                                                                                                                  
CMake Error at CMakeLists.txt:247 (message):                                                                                                                                                                                                                                    
  Default flags for CUDA 9.5 are not set.  Edit CMakeLists.txt to set them

Could you please comment on this and point me in the right direction?

Compute capability 2.0?

Would AMGX support GPUs with compute capability 2.0? I see that this PR by @niklaskarla (edit: apologies for tagging the wrong user ID) adds support for devices with 3.0. I am wondering whether a similar fix could accommodate a lower CC?

GPU memory usage estimation

AMGX is currently able to provide an estimate of GPU memory usage. Currently, this is achieved by computing the memory used by all processes, via the cudaMemGetInfo function.

It would be more useful to report only the memory used by the AMGX process.
To get the amount of memory used by the AMGX process, you could store the current memory-usage estimate at launch and subtract it from each "allocated = total - free" computed in the updateMaxMemoryUsage function, as sketched below.
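
A minimal sketch of that suggestion (the helper names and the baseline variable are illustrative, not AMGX's actual code):

#include <cuda_runtime.h>

static size_t baseline_used = 0;  /* memory already in use before AMGX starts */

void record_baseline(void)
{
    size_t free_mem, total_mem;
    cudaMemGetInfo(&free_mem, &total_mem);
    baseline_used = total_mem - free_mem;
}

size_t amgx_process_usage(void)
{
    size_t free_mem, total_mem;
    cudaMemGetInfo(&free_mem, &total_mem);
    /* allocated = total - free, minus what was already used at launch */
    return (total_mem - free_mem) - baseline_used;
}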

DILU and ILU preconditioning errors

I'm trying to test CG with DILU and ILU preconditioning to make a solver comparison, and I'm having trouble getting them to work. I'm using the PCG_DILU.json provided by AMGX as the baseline config file, solving the provided example matrix (examples/matrix.mtx).

Environment:

CUDA 9.2
Ubuntu 18.04 LTS

PCG_DILU.json

{
    "config_version": 2, 
    "solver": {
        "preconditioner": {
            "scope": "precond", 
            "solver": "MULTICOLOR_DILU"
        }, 
        "solver": "PCG", 
        "print_solve_stats": 1, 
        "obtain_timings": 1, 
        "max_iters": 20, 
        "monitor_residual": 1, 
        "scope": "main", 
        "tolerance": 1e-06, 
        "norm": "L2"
    }
}

And I get the following results:

AMGX version 2.0.0.130-opensource
Built on Jun 27 2018, 09:03:34
Compiled with CUDA Runtime 9.2, using CUDA driver 9.2
Warning: No mode specified, using dDDI by default.
Reading data...
RHS vector was not found. Using RHS b=[1,…,1]^T
Solution vector was not found. Setting initial solution to x=[0,…,0]^T
Finished reading
           iter      Mem Usage (GB)       residual           rate
         --------------------------------------------------------------
            Ini                   0   3.464102e+00
              0                   0            nan            nan
              1              0.0000           -nan           -nan
              2              0.0000            nan            nan
              3              0.0000           -nan           -nan
              4              0.0000            nan            nan
              5              0.0000           -nan           -nan
              6              0.0000            nan            nan
              7              0.0000           -nan           -nan
              8              0.0000            nan            nan
              9              0.0000           -nan           -nan
             10              0.0000            nan            nan
             11              0.0000           -nan           -nan
             12              0.0000            nan            nan
             13              0.0000           -nan           -nan
             14              0.0000            nan            nan
             15              0.0000           -nan           -nan
             16              0.0000            nan            nan
             17              0.0000           -nan           -nan
             18              0.0000            nan            nan
             19              0.0000           -nan           -nan
         --------------------------------------------------------------
         Total Iterations: 20
         Avg Convergence Rate: 		           -nan
         Final Residual: 		           -nan
         Total Reduction in Residual: 	           -nan
         Maximum Memory Usage: 		          0.000 GB
         --------------------------------------------------------------
Total Time: 0.125037
    setup: 0.000589824 s
    solve: 0.124448 s
    solve(per iteration): 0.00622238 s

For ILU Preconditioning I made a minor modification to the original PCG_DILU.json

{
    "config_version": 2, 
    "solver": {
        "preconditioner": {
            "scope": "precond", 
			"ilu_sparsity_level": 0,
			"coloring_level": 1,
            "solver": "MULTICOLOR_ILU"
        }, 
        "solver": "PCG", 
        "print_solve_stats": 1, 
        "obtain_timings": 1, 
        "max_iters": 20, 
        "monitor_residual": 1, 
        "scope": "main", 
        "tolerance": 1e-06, 
        "norm": "L2"
    }
}

This gives me the following output

AMGX version 2.0.0.130-opensource
Built on Jun 27 2018, 09:03:34
Compiled with CUDA Runtime 9.2, using CUDA driver 9.2
Warning: No mode specified, using dDDI by default.
Reading data...
RHS vector was not found. Using RHS b=[1,…,1]^T
Solution vector was not found. Setting initial solution to x=[0,…,0]^T
Finished reading
Caught amgx exception: Multicolor ILU smoother requires matrix to be reordered by color with ILU0 solver. Try setting reorder_cols_by_color=1 and insert_diag_while_reordering=1 in the multicolor_ilu solver scope in configuration file
 at: /opt/software/repos/AMGX/core/src/solvers/multicolor_ilu_solver.cu:1902
Stack trace:
 libamgxsh.so : amgx::multicolor_ilu_solver::MulticolorILUSolver_Base<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::pre_setup()+0x4ee
 libamgxsh.so : amgx::multicolor_ilu_solver::MulticolorILUSolver_Base<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solver_setup(bool)+0x8a
 libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::setup(amgx::Operator<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x1d4
 libamgxsh.so : amgx::PCG_Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solver_setup(bool)+0x46
 libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::setup(amgx::Operator<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x1d4
 libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::setup_no_throw(amgx::Operator<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x7c
 libamgxsh.so : amgx::AMG_Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::setup(amgx::Matrix<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&)+0x5c
 libamgxsh.so : amgx::AMGX_ERROR amgx::(anonymous namespace)::set_solver_with_shared<(AMGX_Mode)8193, amgx::AMG_Solver, amgx::Matrix>(AMGX_solver_handle_struct*, AMGX_matrix_handle_struct*, amgx::Resources*, amgx::AMGX_ERROR (amgx::AMG_Solver<amgx::TemplateMode<(AMGX_Mode)8193>::Type>::*)(std::shared_ptr<amgx::Matrix<amgx::TemplateMode<(AMGX_Mode)8193>::Type> >))+0xc9
 libamgxsh.so : AMGX_solver_setup()+0x183
 examples/amgx_capi : main()+0x4d7
 /lib/x86_64-linux-gnu/libc.so.6 : __libc_start_main()+0xe7
 examples/amgx_capi : _start()+0x2a

Caught amgx exception: Error, setup must be called before calling solve
 at: /opt/software/repos/AMGX/base/src/solvers/solver.cu:598
Stack trace:
 libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x1fe1
 libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve_no_throw(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::AMGX_STATUS&, bool)+0x82
 libamgxsh.so : amgx::AMG_Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::AMGX_STATUS&, bool)+0x3d
 libamgxsh.so : amgx::AMGX_ERROR amgx::(anonymous namespace)::solve_with<(AMGX_Mode)8193, amgx::AMG_Solver, amgx::Vector>(AMGX_solver_handle_struct*, AMGX_vector_handle_struct*, AMGX_vector_handle_struct*, amgx::Resources*, bool)+0xdf
 libamgxsh.so : AMGX_solver_solve()+0x17f
 examples/amgx_capi : main()+0x4eb
 /lib/x86_64-linux-gnu/libc.so.6 : __libc_start_main()+0xe7
 examples/amgx_capi : _start()+0x2a

As the output message states, I added reorder_cols_by_color=1 and insert_diag_while_reordering=1:

{
    "config_version": 2, 
    "solver": {
        "preconditioner": {
            "scope": "precond", 
			"ilu_sparsity_level": 0,
			"coloring_level": 1,
			"reorder_cols_by_color": 1,
			"insert_diag_while_reordering": 1,
            "solver": "MULTICOLOR_ILU"
        }, 
        "solver": "PCG", 
        "print_solve_stats": 1, 
        "obtain_timings": 1, 
        "max_iters": 20, 
        "monitor_residual": 1, 
        "scope": "main", 
        "tolerance": 1e-06, 
        "norm": "L2"
    }
}

Then I get the following output

AMGX version 2.0.0.130-opensource
Built on Jun 27 2018, 09:03:34
Compiled with CUDA Runtime 9.2, using CUDA driver 9.2
Warning: No mode specified, using dDDI by default.
Reading data...
RHS vector was not found. Using RHS b=[1,…,1]^T
Solution vector was not found. Setting initial solution to x=[0,…,0]^T
Finished reading
Caught amgx exception: Unsupported block size for Multicolor ILU solver, computeLUFactors
 at: /opt/software/repos/AMGX/core/src/solvers/multicolor_ilu_solver.cu:1818
Stack trace:
 libamgxsh.so : amgx::multicolor_ilu_solver::MulticolorILUSolver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::computeLUFactors()+0x88b
 libamgxsh.so : amgx::multicolor_ilu_solver::MulticolorILUSolver_Base<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solver_setup(bool)+0xa2
 libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::setup(amgx::Operator<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x1d4
 libamgxsh.so : amgx::PCG_Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solver_setup(bool)+0x46
 libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::setup(amgx::Operator<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x1d4
 libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::setup_no_throw(amgx::Operator<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x7c
 libamgxsh.so : amgx::AMG_Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::setup(amgx::Matrix<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&)+0x5c
 libamgxsh.so : amgx::AMGX_ERROR amgx::(anonymous namespace)::set_solver_with_shared<(AMGX_Mode)8193, amgx::AMG_Solver, amgx::Matrix>(AMGX_solver_handle_struct*, AMGX_matrix_handle_struct*, amgx::Resources*, amgx::AMGX_ERROR (amgx::AMG_Solver<amgx::TemplateMode<(AMGX_Mode)8193>::Type>::*)(std::shared_ptr<amgx::Matrix<amgx::TemplateMode<(AMGX_Mode)8193>::Type> >))+0xc9
 libamgxsh.so : AMGX_solver_setup()+0x183
 examples/amgx_capi : main()+0x4d7
 /lib/x86_64-linux-gnu/libc.so.6 : __libc_start_main()+0xe7
 examples/amgx_capi : _start()+0x2a

Caught amgx exception: Error, setup must be called before calling solve
 at: /opt/software/repos/AMGX/base/src/solvers/solver.cu:598
Stack trace:
 libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x1fe1
 libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve_no_throw(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::AMGX_STATUS&, bool)+0x82
 libamgxsh.so : amgx::AMG_Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::AMGX_STATUS&, bool)+0x3d
 libamgxsh.so : amgx::AMGX_ERROR amgx::(anonymous namespace)::solve_with<(AMGX_Mode)8193, amgx::AMG_Solver, amgx::Vector>(AMGX_solver_handle_struct*, AMGX_vector_handle_struct*, AMGX_vector_handle_struct*, amgx::Resources*, bool)+0xdf
 libamgxsh.so : AMGX_solver_solve()+0x17f
 examples/amgx_capi : main()+0x4eb
 /lib/x86_64-linux-gnu/libc.so.6 : __libc_start_main()+0xe7
 examples/amgx_capi : _start()+0x2a

Perhaps I'm doing something wrong when modifying the config file for the ILU preconditioner. But for DILU I'm using the original config provided with AMGX (I get the same results with other solvers that use the DILU preconditioner).

Any help or feedback would be very much appreciated

An issue from Fraunhofer-Chalmers Centre: Unit tests

Which unit tests are relevant and up to date?
We have tested extensively on different GPUs, CUDA 8 vs 9.0, and quite a few tests keep failing under Linux. Besides the missing matrices, which we have filled in from either Matrix Market or proper use of generate_poisson, there are some tests that seem basic and keep failing. Find attached the logs of unit tests on an array of GPUs and configurations.

gtx970_9.0.log
gtx970.log
gtx1080ti.log
gtx660ti.log

All unit tests were compiled with CUDA 8.0, except gtx_970_9.0.log. The tests performed on the 660ti were done after we changed some (possibly too strict? check #8) hardcoded CUDA_ARCH>=3.5 to CUDA_ARCH>=3.0.

We will be more than happy to fix them, as long as we have a list of the tests that are supposed to work, because some seem very outdated or poorly maintained.

The DEBUG version couldn't be built due to "calling a __host__ function from a __host__ __device__ function"

The compiler I use is VS 2017, and the CUDA version is CUDA 10. I have tried it on two computers, and I got the same errors every time.

The errors I get are:
calling a host function("std::_Iterator_base12::_Iterator_base12") from a host device function("std::_Iterator_base12::_Iterator_base12 [subobject]") is not allowed amgx_core C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\include\thrust\system\cuda\detail\assign_value.h

calling a host function("std::_Iterator_base12::_Iterator_base12") from a host device function("std::_Iterator_base12::_Iterator_base12 [subobject]") is not allowed amgx_core C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\include\thrust\system\cuda\detail\assign_value.h

By the way, it takes me 4 to 6 hours to build the amgx_core project with MPI. Could I have set something up wrong?

AMGX_solver_get_status() requires monitor_residual.

If monitor_residual is not set, or is set to 0 in the config, then AMGX_solver_get_status() silently returns 0 (success) regardless of convergence. Perhaps this should be made explicit in the reference for AMGX_solver_get_status.
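
For illustration, the affected calling pattern looks roughly like this (the solver handle name is illustrative):

/* Only meaningful when monitor_residual=1 is set in the config;
   otherwise the status is currently reported as success regardless. */
AMGX_SOLVE_STATUS status;
AMGX_solver_get_status(solver, &status);
if (status != AMGX_SOLVE_SUCCESS)
{
    /* handle non-convergence */
}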

Built AMG.sln successfully, but can't debug. And how can I choose a different matrix and algorithm?

I debug the project amgx_capi, and it shows:

Usage: ./amgx_capi [-mode [hDDI | hDFI | hFFI | dDDI | dDFI | dFFI]] [-m file] [-c config_file] [-amg "variable1=value1 ... variable3=value3"]
-mode: select the solver mode
-m file: read matrix stored in the file
-c: set the amg solver options from the config file
-amg: set the amg solver options from the command line
Press any key to continue . . .

Reproduce results on machine with CUDA 10

My examples/amgx_capi prints

AMGX version 2.0.0.130-opensource
Built on May 17 2019, 14:26:12
Compiled with CUDA Runtime 9.2, using CUDA driver 10.1

The NVIDIA docs say that "CUDA driver" means the highest supported version, and "CUDA Runtime" the version actually used.

The problem is now that I cannot reproduce the results I got before CUDA 10.1 was installed. For example, using CG on bodyy6.mtx with a custom B vector now requires 843 iterations, versus 568 earlier. cfd2.mtx does not converge at all, where it previously took only 22 iterations. Does anyone else have these problems?
If you'd like to run the experiments as well, unpack the zip in examples/ (replacing amgx_capi.c), make, and run with

gcc -O2 -std=c99 amgx_capi.c -c -I/usr/local/cuda-9.2/include -I../base/include
g++ -O2 amgx_capi.o -o amgx_capi -L/usr/local/cuda-9.2/lib64 -L../build -ldl -L../lib -lamgxsh -Wl,-rpath=../build
./amgx_capi -m bodyy6.mtx -b B.ltx -x X.ltx -c CG.json
./amgx_capi -m cfd2.mtx -c CG.json

debug.zip

Compilation Issue CUDA 9.2 and VS2017

The following errors occur:

polynomial_solver.cu(291): error C2668: "amgx::polynomial_solver::poly_postsmooth":

polynomial_solver.cu(351): error C2668: "amgx::polynomial_solver::poly_presmooth":

[Question] Does it support devices of compute capability 70?

I have a Titan V card and I've successfully compiled the code. But when I try to run the example I get the following errors. Does 'invalid device function' mean that the library currently does not support the Titan V? Thanks.

➜  build git:(master) examples/amgx_capi -m ../examples/matrix.mtx -c ../core/configs/FGMRES_AGGREGATION.json
AMGX version 2.0.0.130-opensource
Built on Mar 16 2018, 21:21:21
Compiled with CUDA Runtime 9.1, using CUDA driver 9.1
Warning: No mode specified, using dDDI by default.
Thrust failure: parallel_for failed: invalid device function
File and line number are not available for this exception.
Caught amgx exception: Cuda failure: 'invalid device function'
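'invalid device function' usually means the binary contains no kernels compiled for the running GPU's compute capability, rather than the card being unsupported. One thing worth trying (a sketch based on the CUDA_ARCH CMake flag, not a confirmed fix for this report) is rebuilding with Volta's virtual architecture included:

cmake ../ -DCUDA_ARCH="70"
make -j16 all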

Error in amgx_mpi_poisson5pt.c example

There are two errors in the 5-point 2D discretization of Poisson's equation. The A matrix is generated in the example code that ships with the AmgX library as follows:


1.         for (int i = 0; i < n; i ++)
2.         {
3.             row_ptrs[i] = nnz;
4. 
5.             if (rank > 0 || i > ny)
6.             {
7.                 col_indices[nnz] = (i + start_idx - ny);
8. 
9.                 if (sizeof_m_val == 4)
10.                 {
11.                     ((float *)values)[nnz] = -1.f;
12.                 }
13.                 else if (sizeof_m_val == 8)
14.                 {
15.                     ((double *)values)[nnz] = -1.;
16.                 }
17. 
18.                 nnz++;
19.             }
20. 
21.             if (i % ny != 0)
22.             {
23.                 col_indices[nnz] = (i + start_idx - 1);
24. 
25.                 if (sizeof_m_val == 4)
26.                 {
27.                     ((float *)values)[nnz] = -1.f;
28.                 }
29.                 else if (sizeof_m_val == 8)
30.                 {
31.                     ((double *)values)[nnz] = -1.;
32.                 }
33. 
34.                 nnz++;
35.             }
36. 
37.             {
38.                 col_indices[nnz] = (i + start_idx);
39. 
40.                 if (sizeof_m_val == 4)
41.                 {
42.                     ((float *)values)[nnz] = 4.f;
43.                 }
44.                 else if (sizeof_m_val == 8)
45.                 {
46.                     ((double *)values)[nnz] = 4.;
47.                 }
48. 
49.                 nnz++;
50.             }
51. 
52.             if ((i + 1) % ny == 0)
53.             {
54.                 col_indices[nnz] = (i + start_idx + 1);
55. 
56.                 if (sizeof_m_val == 4)
57.                 {
58.                     ((float *)values)[nnz] = -1.f;
59.                 }
60.                 else if (sizeof_m_val == 8)
61.                 {
62.                     ((double *)values)[nnz] = -1.;
63.                 }
64. 
65.                 nnz++;
66.             }
67. 
68.             if ( (rank != nranks - 1) || (i / ny != (nx - 1)) )
69.             {
70.                 col_indices[nnz] = (i + start_idx + ny);
71. 
72.                 if (sizeof_m_val == 4)
73.                 {
74.                     ((float *)values)[nnz] = -1.f;
75.                 }
76.                 else if (sizeof_m_val == 8)
77.                 {
78.                     ((double *)values)[nnz] = -1.;
79.                 }
80. 
81.                 nnz++;
82.             }
83.         }

The two errors are in the following conditions:
Line 5. if (rank > 0 || i > ny)
Line 52. if ((i + 1) % ny == 0)

The correct statement should be:
Line 5. if (rank > 0 || i >= ny)
Line 52. if ((i + 1) % ny != 0)
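Spelled out with the reasoning as comments (keeping the listing's numbering; the local rows are laid out in grid lines of ny points each):

5.             if (rank > 0 || i >= ny)   /* a -ny neighbour (previous grid line)
                                             exists unless this is the first grid
                                             line of rank 0 */

52.            if ((i + 1) % ny != 0)     /* a +1 neighbour exists unless i is the
                                             last point of its grid line */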

After the above-mentioned corrections, the output for a 4x4 matrix is attached. The solution has been verified against a MATLAB solution.

output.log

Building AMGX using Nix

I'm working on a Nix recipe for AMGX. It currently looks like this, and it seems to build without any issues. However, when I try to run an example I get the following:

$ examples/amgx_capi -m ../examples/matrix.mtx -c ../core/configs/FGMRES_AGGREGATION.json
AMGX version 2.0.0.130-opensource
Built on May 14 2018, 23:06:38
AMGX ERROR: file /tmp/nix-build-AmgX.drv-0/lafn8qxabfn95rh3bh3y0bi113kzwl8w-source/examples/amgx_capi.c line    245
AMGX ERROR: Error initializing amgx core.
Failed while initializing CUDA runtime in cudaRuntimeGetVersion

Any ideas?

Inserting debug prints

Is there a way to easily insert debug prints in the library?
For example, I want to create a file with the X and B vectors for every iteration. base/src/solver.cu:solve() contains the main iteration loop and calls solve_iteration(b, x, xIsZero).
The parameters are declared as 'Vector &'; the solver I use (PBICGSTAB) receives them as 'VVector &'.
I tried printf("%.5e\n", b[i]); in pbicgstab_solver.cu, but it warns 'non-POD class type passed through ellipsis' and prints zeros for b, where it should be ones. The value of x starts at zeros (the initial guess), but is 6.9e-310 (a subnormal value) in subsequent iterations.
I also tried printf("%.5e\n", b.pod()[i]);, but it results in a SIGSEGV.
Printing the types of 'x.pod()' and 'x.pod()[0]' results in 'N4amgx9PODVectorIdiEE' and 'd' respectively.
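For what it's worth: in dDDI mode b and x live in device memory, so b[i] evaluated on the host hands a device reference to printf's ellipsis, which would explain both the warning and the garbage values. A minimal sketch that copies to the host first, assuming the vector exposes Thrust-style begin()/end() device iterators (an assumption about the Vector class, not verified against every build):

// inside solve_iteration(); needs <thrust/copy.h>, <vector>, <cstdio>
std::vector<double> hb(b.size());
thrust::copy(b.begin(), b.end(), hb.begin());   // device -> host copy
for (size_t i = 0; i < hb.size(); ++i)
{
    printf("%.5e\n", hb[i]);                    // plain double: safe to pass through printf
}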

calling a __host__ function from a __host__ __device__ function is not allowed

I was using VS2015 Update 3 and CMake 3.13.1 to compile. After running into the problem in https://github.com/NVIDIA/AMGX/issues/36 and solving it, I hit another problem while building:

1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1\include\thrust/detail/allocator/allocator_traits.inl(230): error : calling a __host__ function("std::_Iterator_base12::_Iterator_base12") from a __host__ __device__ function("std::_Iterator_base12::_Iterator_base12 [subobject]") is not allowed
1>
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1\include\thrust/detail/allocator/allocator_traits.inl(230): error : calling a __host__ function("std::_Iterator_base12::~_Iterator_base12") from a __host__ __device__ function("std::_Iterator_base12::~_Iterator_base12 [subobject]") is not allowed
1>
1> 2 errors detected in the compilation of "C:/Users/i/AppData/Local/Temp/tmpxft_00005b44_00000000-10_comms_mpi_hostbuffer_stream.cpp1.ii".
1> comms_mpi_hostbuffer_stream.cu
1> CMake Error at amgx_base_generated_comms_mpi_hostbuffer_stream.cu.obj.Debug.cmake:283 (message):
1> Error generating file
1> C:/AMGX-master/build/base/CMakeFiles/amgx_base.dir/src/distributed/Debug/amgx_base_generated_comms_mpi_hostbuffer_stream.cu.obj

and

2>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1\include\thrust/system/cuda/detail/assign_value.h(78): error : calling a __host__ function("std::_Iterator_base12::_Iterator_base12") from a __host__ __device__ function("std::_Iterator_base12::_Iterator_base12 [subobject]") is not allowed
2>
2>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1\include\thrust/system/cuda/detail/assign_value.h(78): error : calling a __host__ function("std::_Iterator_base12::~_Iterator_base12") from a __host__ __device__ function("std::_Iterator_base12::~_Iterator_base12 [subobject]") is not allowed
2>
2> 2 errors detected in the compilation of "C:/Users/i/AppData/Local/Temp/tmpxft_00002bd0_00000001-10_matrix_analysis.cpp1.ii".
2> matrix_analysis.cu
2> CMake Error at amgx_core_generated_matrix_analysis.cu.obj.Debug.cmake:283 (message):
2> Error generating file
2> C:/AMGX-master/build/core/CMakeFiles/amgx_core.dir/src/Debug/amgx_core_generated_matrix_analysis.cu.obj

I am really confused that these errors occur in the CUDA Thrust header files, and I can't figure it out.
Has anyone else run into the same problem?

Right-hand-side of Poisson examples should be changed

AMGX ships with many nice sample programs, including simple implementations of Poisson-like equations. Currently, the source term (the array b in A x = b) is set to 1.d0 everywhere.

This is a poor choice, since integrating a constant source over larger and larger volumes gives a rapidly growing solution. As a consequence, the sample cases have terrible convergence rates and do not scale as the problem size is increased.

I suggest a better choice: set b = sin( 2 * pi * x / Lx ) * sin( 2 * pi * y / Ly ) * sin( 2 * pi * z / Lz ), which satisfies either periodic or Dirichlet boundary conditions. This is more representative of the use of a Poisson solver.
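A sketch of that right-hand side for a 3D grid (hypothetical names: nx, ny, nz are the grid sizes and Lx, Ly, Lz the domain lengths; the row-major indexing here may differ from the actual sample code):

#include <math.h>

for (int k = 0; k < nz; k++)
    for (int j = 0; j < ny; j++)
        for (int i = 0; i < nx; i++)
        {
            double x = (i + 0.5) * Lx / nx;   /* cell-centred coordinates */
            double y = (j + 0.5) * Ly / ny;
            double z = (k + 0.5) * Lz / nz;
            b[(k * ny + j) * nx + i] = sin(2.0 * M_PI * x / Lx)
                                     * sin(2.0 * M_PI * y / Ly)
                                     * sin(2.0 * M_PI * z / Lz);
        }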

GMRES segfault when preconditioner="NOSOLVER"

The GMRES solver produces a segfault when preconditioner="NOSOLVER":

[atrikut@node1144 examples]$ cat ../configs/core/GMRES.json
{
    "config_version": 2,
    "solver": {
        "preconditioner": {
            "scope": "amg",
            "solver": "NOSOLVER"
        },
        "use_scalar_norm": 1,
        "solver": "GMRES",
        "print_solve_stats": 1,
        "obtain_timings": 1,
        "monitor_residual": 1,
        "convergence": "RELATIVE_INI_CORE",
        "scope": "main",
        "tolerance": 1e-6,
        "norm": "L2"
    }
}
[atrikut@node1144 examples]$ ./amgx_capi -m matrix.mtx -c ../configs/core/GMRES.json

AMGX version 2.0.0.130-opensource
Built on May 17 2018, 05:29:30
Compiled with CUDA Runtime 8.0, using CUDA driver 9.0
Warning: No mode specified, using dDDI by default.
Reading data...
RHS vector was not found. Using RHS b=[1,…,1]^T
Solution vector was not found. Setting initial solution to x=[0,…,0]^T
Finished reading
           iter      Mem Usage (GB)       residual           rate
         --------------------------------------------------------------
            Ini                   0   3.464102e+00
              0                   0   1.845471e+00         0.5327
              1              0.0000   1.541877e+00         0.8355
              2              0.0000   1.374225e+00         0.8913
              3              0.0000   1.366903e+00         0.9947
              4              0.0000   1.040855e+00         0.7615
              5              0.0000   1.026638e+00         0.9863
              6              0.0000   8.614123e-01         0.8391
              7              0.0000   6.599583e-01         0.7661
              8              0.0000   6.596676e-01         0.9996
              9              0.0000   6.593714e-01         0.9996
             10              0.0000   6.331763e-01         0.9603
Caught signal 11 - SIGSEGV (segmentation violation)
 /home/atrikut/local/AMGX/build/libamgxsh.so : amgx::handle_signals(int)+0xbb
 /lib64/libpthread.so.0 : ()+0xf680
 /home/atrikut/local/AMGX/build/libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x11
 /home/atrikut/local/AMGX/build/libamgxsh.so : amgx::GMRES_Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve_iteration(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x5d2
 /home/atrikut/local/AMGX/build/libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x554
 /home/atrikut/local/AMGX/build/libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve_no_throw(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::AMGX_STATUS&, bool)+0x77
 /home/atrikut/local/AMGX/build/libamgxsh.so : amgx::AMG_Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::AMGX_STATUS&, bool)+0x3d
 /home/atrikut/local/AMGX/build/libamgxsh.so : amgx::AMGX_ERROR amgx::(anonymous namespace)::solve_with<(AMGX_Mode)8193, amgx::AMG_Solver, amgx::Vector>(AMGX_solver_handle_struct*, AMGX_vector_handle_struct*, AMGX_vector_handle_struct*, amgx::Resources*, bool)+0x268
 /home/atrikut/local/AMGX/build/libamgxsh.so : AMGX_solver_solve()+0x147
 ./amgx_capi : main()+0x34d
 /lib64/libc.so.6 : __libc_start_main()+0xf5
 ./amgx_capi() [0x402233]
Segmentation fault (core dumped)

But when preconditioner is not NOSOLVER:

[atrikut@node1144 examples]$ cat ../configs/core/GMRES.json
{
    "config_version": 2,
    "solver": {
        "preconditioner": {
            "scope": "amg",
            "solver": "BLOCK_JACOBI"
        },
        "use_scalar_norm": 1,
        "solver": "GMRES",
        "print_solve_stats": 1,
        "obtain_timings": 1,
        "monitor_residual": 1,
        "convergence": "RELATIVE_INI_CORE",
        "scope": "main",
        "tolerance": 1e-6,
        "norm": "L2"
    }
}
[atrikut@node1144 examples]$ ./amgx_capi -m matrix.mtx -c ../configs/core/GMRES.json
AMGX version 2.0.0.130-opensource
Built on May 17 2018, 05:29:30
Compiled with CUDA Runtime 8.0, using CUDA driver 9.0
Warning: No mode specified, using dDDI by default.
Reading data...
RHS vector was not found. Using RHS b=[1,…,1]^T
Solution vector was not found. Setting initial solution to x=[0,…,0]^T
Finished reading
           iter      Mem Usage (GB)       residual           rate
         --------------------------------------------------------------
            Ini                   0   3.464102e+00
              0                   0   1.934535e+00         0.5585
              1              0.0000   1.934535e+00         1.0000
              2              0.0000   1.934535e+00         1.0000
              3              0.0000   1.644110e+00         0.8499
              4              0.0000   1.507196e+00         0.9167
              5              0.0000   1.213641e+00         0.8052
              6              0.0000   1.160986e+00         0.9566
              7              0.0000   1.092385e+00         0.9409
              8              0.0000   1.088166e+00         0.9961
              9              0.0000   8.810365e-01         0.8097
             10              0.0000   4.990786e-01         0.5665
             11              0.0000   4.764949e-01         0.9547
             12              0.0000   1.059929e-01         0.2224
             13              0.0000   1.677930e-15         0.0000
         --------------------------------------------------------------
         Total Iterations: 14
         Avg Convergence Rate: 		         0.0806
         Final Residual: 		   1.677930e-15
         Total Reduction in Residual: 	   4.843766e-16
         Maximum Memory Usage: 		          0.000 GB
         --------------------------------------------------------------
Total Time: 0.0461597
    setup: 0.000248704 s
    solve: 0.045911 s
    solve(per iteration): 0.00327936 s
[atrikut@node1144 examples]$
