nvidia / amgx
Distributed multigrid linear solver library on GPU
Running the amgx_capi_multi example with n=2, on a machine with two identical GPUs, produces (after the 1st solve completes):
Caught amgx exception: Could not create the CUDENSE handle
The following error occurs:
polynomial_solver.cu(291): error C2668: "amgx::polynomial_solver::poly_postsmooth":
polynomial_solver.cu(351): error C2668: "amgx::polynomial_solver::poly_presmooth":
Hi,
I am trying to compile AmgX on my laptop (Quadro K1100). After making the changes discussed in #11 the project compiles, but running the examples fails with no kernel image is available for execution on the device. I checked the CMakeLists.txt, and of course it's because we are fixing CUDA_ARCH to sm_35 and greater. So I went back to my configuration and, based on this website, I put sm_30 in my config and compile procedure:
install_dir=${HOME}/apps/amgx
build_dir=build-gcc-ompi
# clean
rm -rf $build_dir
mkdir $build_dir
cd $build_dir
export OMPI_CC=/opt/cuda/bin/gcc
export OMPI_CXX=/opt/cuda/bin/g++
cmake \
-DCMAKE_INSTALL_PREFIX=$install_dir \
-DCMAKE_C_COMPILER=/opt/cuda/bin/gcc \
-DCUDA_ARCH="30" \
-DCMAKE_CXX_COMPILER=/opt/cuda/bin/g++ \
../ && make -j2 && make install
Alas, this breaks pretty early on, producing lots of errors (see the attached file). I am investigating it independently, but if you have any advice, please let me know.
I get the following error message:
"Caught amgx exception: No host implementation of the dense LU solver"
How can I fix this?
I use the following configuration (mode=hDDI):
config_version=2
solver(main)=FGMRES
main:max_iters=300
main:convergence=RELATIVE_MAX
main:tolerance=0.00000001
main:monitor_residual=1
main:preconditioner(amg)=AMG
main:print_solve_stats=1
amg:algorithm=CLASSICAL
amg:cycle=V
amg:max_iters=1
amg:max_levels=10
amg:smoother(amg_smoother)=BLOCK_JACOBI
amg:relaxation_factor=0.75
amg:presweeps=1
amg:postsweeps=2
amg:coarsest_sweeps=4
determinism_flag=1
Exact output:
AMGX version 2.0.0.130-opensource
Built on May 21 2019, 15:47:45
Compiled with CUDA Runtime 10.0, using CUDA driver 10.2
Cannot read file as JSON object, trying as AMGX config
Caught amgx exception: No host implementation of the dense LU solver
at: /home/lucas/Repositories/AMGX-master/core/include/solvers/dense_lu_solver.h:65
More information:
Ubuntu 18.04 (gcc 7.4)
CUDA 10.0
AMGX source: git master as of May 21st
Build using MKL 2019.3-062 and MAGMA 2.3.0
Is there any CAD software that can export to the Matrix Market format?
AMGX ships with many nice sample programs, including a simple implementation of Poisson-like equations. Currently, the source term (e.g. in A x = b, the array b) is set to 1.d0 everywhere.
This is a poor choice: integrating a constant source over larger and larger volumes gives a solution that grows without bound. As a consequence, the sample cases have terrible convergence rates and do not scale as the problem size is increased.
I suggest a better choice: set b = sin( 2 * pi * x / Lx ) * sin( 2 * pi * y / Ly ) * sin( 2 * pi * z / Lz ), which satisfies both periodic and homogeneous Dirichlet boundary conditions. This is more representative of how a Poisson solver is actually used.
Would AMGX support GPUs with compute capability 2.0? I see this PR by @niklaskarla (edit: apologies for tagging the wrong user ID) adds support for devices with CC 3.0. I'm wondering if a similar fix could accommodate a lower CC?
When I debug the amgx_capi project, it shows:
Usage: ./amgx_capi [-mode [hDDI | hDFI | hFFI | dDDI | dDFI | dFFI]] [-m file] [-c config_file] [-amg "variable1=value1 ... variable3=value3"]
-mode: select the solver mode
-m file: read matrix stored in the file
-c: set the amg solver options from the config file
-amg: set the amg solver options from the command line
Press any key to continue . . .
Independently, I got AmgX to compile on K80s on our cluster and I can now run the examples. It seems to me that the -mode switch is missing from README.md,
e.g. this
examples/amgx_capi -m ../examples/matrix.mtx -c ../core/configs/FGMRES_AGGREGATION.json
should probably be something like:
examples/amgx_capi -mode dDDI -m ../examples/matrix.mtx -c ../core/configs/FGMRES_AGGREGATION.json
Happy to provide an independent pull request with a README.md update and a tabular description of the modes, if you think it's useful.
-- The C compiler identification is GNU 7.3.0
-- The CXX compiler identification is GNU 7.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Found MPI_C: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so (found version "3.1")
-- Found MPI_CXX: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so (found version "3.1")
-- Found MPI: TRUE (found version "3.1")
This is a MPI build:TRUE
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA: /usr/local/cuda (found version "10.1")
Cuda libraries: /usr/local/cuda/lib64/libcudart_static.a;-lpthread;dl;/usr/lib/x86_64-linux-gnu/librt.so
-- Configuring done
-- Generating done
-- Build files have been written to: /home/victor/AMGX/build
I also tried gcc and g++ 4.8.5, without success.
During the build, multiple warnings appear and the build eventually fails:
/home/victor/AMGX/base/include/matrix.h:247:3646: note: in C++11 destructors default to noexcept
/home/victor/AMGX/base/include/matrix.h:247:4053: warning: throw will always call terminate() [-Wterminate]
cusparseCheckError(cusparseDestroyMatDescr(cuMatDescr));
/home/victor/AMGX/base/include/matrix.h:247:4053: note: in C++11 destructors default to noexcept
CMakeFiles/Makefile2:165: recipe for target 'base/CMakeFiles/amgx_base.dir/all' failed
make[1]: *** [base/CMakeFiles/amgx_base.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2
I would greatly appreciate any help in diagnosing this problem and building the library.
My examples/amgx_capi prints
AMGX version 2.0.0.130-opensource
Built on May 17 2019, 14:26:12
Compiled with CUDA Runtime 9.2, using CUDA driver 10.1
The NVIDIA docs say that "CUDA driver" means the highest supported version, and "CUDA Runtime" the version actually used.
The problem is now that I cannot reproduce my previous results, when CUDA 10.1 was not yet installed. For example, using CG on bodyy6.mtx with a custom B vector now requires 843 iterations, versus 568 earlier. cfd2.mtx does not converge at all, where it only took 22 iterations before. Does anyone else have these problems?
If you'd like to run the experiments as well, unpack the zip in examples/ (replacing amgx_capi.c), make and run with
gcc -O2 -std=c99 amgx_capi.c -c -I/usr/local/cuda-9.2/include -I../base/include
g++ -O2 amgx_capi.o -o amgx_capi -L/usr/local/cuda-9.2/lib64 -L../build -ldl -L../lib -lamgxsh -Wl,-rpath=../build
./amgx_capi -m bodyy6.mtx -b B.ltx -x X.ltx -c CG.json
./amgx_capi -m cfd2.mtx -c CG.json
AMGX is currently able to provide an estimate of GPU memory usage. At the moment this is computed as the memory used by all processes, via the cudaMemGetInfo function.
It would be more useful to report only the memory used by the AMGX process.
To get the amount of memory used by the AMGX process, you could store the memory-in-use estimate at launch and subtract it from each "allocated = total - free" computed in the updateMaxMemoryUsage function.
iter Mem Usage (GB) residual rate
--------------------------------------------------------------
Ini 0.743436 1.242538e+02
0 0.743436 1.895453e+03 15.2547
1 0.7434 2.219963e+03 1.1712
2 0.7434 6.992442e+03 3.1498
3 0.7434 4.034371e+04 5.7696
4 0.7434 2.512463e+05 6.2276
5 0.7434 1.574859e+06 6.2682
6 0.7434 9.933583e+06 6.3076
7 0.7434 6.550438e+07 6.5942
8 0.7434 4.426426e+08 6.7575
9 0.7434 2.992345e+09 6.7602
10 0.7434 2.048499e+10 6.8458
11 0.7434 1.371790e+11 6.6966
12 0.7434 8.775767e+11 6.3973
13 0.7434 5.814756e+12 6.6259
14 0.7434 3.906154e+13 6.7177
15 0.7434 2.647010e+14 6.7765
16 0.7434 1.822857e+15 6.8865
17 0.7434 1.258167e+16 6.9022
18 0.7434 8.768401e+16 6.9692
19 0.7434 6.146868e+17 7.0102
20 0.7434 4.336527e+18 7.0549
21 0.7434 3.075281e+19 7.0916
22 0.7434 2.193094e+20 7.1314
23 0.7434 1.572653e+21 7.1709
24 0.7434 1.133802e+22 7.2095
25 0.7434 8.224542e+22 7.2539
26 0.7434 5.996229e+23 7.2907
27 0.7434 4.401880e+24 7.3411
28 0.7434 3.243756e+25 7.3690
29 0.7434 2.408748e+26 7.4258
30 0.7434 1.788772e+27 7.4261
31 0.7434 1.339881e+28 7.4905
32 0.7434 9.923855e+28 7.4065
33 0.7434 7.419991e+29 7.4769
34 0.7434 5.320212e+30 7.1701
35 0.7434 3.810510e+31 7.1623
36 0.7434 2.739372e+32 7.1890
37 0.7434 1.967098e+33 7.1808
38 0.7434 1.414827e+34 7.1925
39 0.7434 1.017845e+35 7.1941
40 0.7434 7.338657e+35 7.2100
41 0.7434 5.301943e+36 7.2247
42 0.7434 3.837148e+37 7.2372
43 0.7434 2.786919e+38 7.2630
44 0.7434 2.020863e+39 7.2512
45 0.7434 1.470666e+40 7.2774
46 0.7434 1.058089e+41 7.1946
47 0.7434 7.589583e+41 7.1729
48 0.7434 5.452218e+42 7.1838
49 0.7434 3.913964e+43 7.1787
50 0.7434 2.815332e+44 7.1930
51 0.7434 2.026757e+45 7.1990
52 0.7434 1.461508e+46 7.2111
53 0.7434 1.056404e+47 7.2282
54 0.7434 7.628804e+47 7.2215
55 0.7434 5.523249e+48 7.2400
56 0.7434 3.972145e+49 7.1917
57 0.7434 2.852312e+50 7.1808
58 0.7434 2.046261e+51 7.1740
59 0.7434 1.470226e+52 7.1849
60 0.7434 1.055143e+53 7.1767
61 0.7434 7.573274e+53 7.1775
62 0.7434 5.432389e+54 7.1731
63 0.7434 3.901670e+55 7.1822
64 0.7434 2.800355e+56 7.1773
65 0.7434 2.015150e+57 7.1961
66 0.7434 1.449547e+58 7.1932
67 0.7434 1.045759e+59 7.2144
68 0.7434 7.526693e+59 7.1973
69 0.7434 5.426749e+60 7.2100
70 0.7434 3.898031e+61 7.1830
71 0.7434 2.798482e+62 7.1792
72 0.7434 2.007781e+63 7.1745
73 0.7434 1.443555e+64 7.1898
74 0.7434 1.036727e+65 7.1818
75 0.7434 7.455786e+65 7.1917
76 0.7434 5.355224e+66 7.1826
77 0.7434 3.851526e+67 7.1921
78 0.7434 2.766181e+68 7.1820
79 0.7434 1.989039e+69 7.1906
80 0.7434 1.428224e+70 7.1805
81 0.7434 1.026730e+71 7.1889
82 0.7434 7.371549e+71 7.1796
83 0.7434 5.298678e+72 7.1880
84 0.7434 3.804295e+73 7.1797
85 0.7434 2.734669e+74 7.1884
86 0.7434 1.963625e+75 7.1805
87 0.7434 1.411857e+76 7.1901
88 0.7434 1.013979e+77 7.1819
89 0.7434 7.292470e+77 7.1919
90 0.7434 5.238013e+78 7.1828
91 0.7434 3.767246e+79 7.1921
92 0.7434 2.705717e+80 7.1822
93 0.7434 1.945582e+81 7.1906
94 0.7434 1.397137e+82 7.1811
95 0.7434 1.004480e+83 7.1896
96 0.7434 7.212811e+83 7.1806
97 0.7434 5.185486e+84 7.1893
98 0.7434 3.723612e+85 7.1808
99 0.7434 2.677180e+86 7.1897
--------------------------------------------------------------
Total Iterations: 100
Avg Convergence Rate: 6.9716
Final Residual: 2.677180e+86
Total Reduction in Residual: 2.154607e+84
Maximum Memory Usage: 0.743 GB
I started playing a little bit with matrices from the SuiteSparse Matrix Collection, and some matrices there come with separate files for the RHS vectors. At the moment I can't get them to work with the amgx_capi and amgx_mpi_capi applications.
The apps use the AMGX_read_system function to read the matrices. I found the following passage in the AmgX documentation:
%%MatrixMarket matrix coordinate real general
%%AMGX block_dimx(int) block_dimy(int) diagonal sorted rhs solution
%% mxn matrix with nnz non-zero elements
%% m=block_dimx*n_block_rows, n=block_dimy*n_block_cols
%% nnz=block_dimx*block_dimy*n_block_entrees
m(int) n(int) nnz(int)
1 1 a_11
1 2 a_12
...
i j a_ij
...
%% these two comment lines present only for the description (to be removed)
%% optional diagonal mx1
...
a_ii
...
%% these two comment lines present only for the description (to be removed)
%% optional rhs mx1
...
b_i
...
%% these two comment lines present only for the description (to be removed)
So, as an example, I cat atmosmodl.mtx atmosmodl_b.mtx > atmosmodl_Ab.mtx
and remove the comment lines in between. That doesn't seem to do anything, and I still get the message:
Reading data...
RHS vector was not found. Using RHS b=[1,…,1]^T
Also, I am concerned that the diagonal entries are spread independently within the first block of data.
Please advise. I am happy to delve deeper into the code, but I thought I'd check first.
The compiler I use is VS 2017, with CUDA 10. I have tried it on two computers and got the same errors every time.
The errors I get are:
calling a host function("std::_Iterator_base12::_Iterator_base12") from a host device function("std::_Iterator_base12::_Iterator_base12 [subobject]") is not allowed amgx_core C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\include\thrust\system\cuda\detail\assign_value.h
By the way, it takes me 4 to 6 hours to build the amgx_core project with MPI. Could I have set something up wrong?
Hi,
I tried to build AMGX on Ubuntu 16.04, but was not able to get very far.
Basic information about my system:
Ubuntu 16.04
gcc/g++ 5.4.0
CUDA 7.5
Compilation fails at 4%:
[ 4%] Building NVCC (Device) object base/CMakeFiles/amgx_base.dir/src/energymin/interpolators/amgx_base_generated_em_interpolator.cu.o
/usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(36): error: identifier "__builtin_ia32_monitorx" is undefined
Any suggestions on how to move forward? Thanks a lot!
@marsaev @niklaskarla and I have been seeing severe performance issues with cusparse<type>csrsv_analysis in CUDA 9.0 and CUDA 9.1. It is 20 times slower than in CUDA 8.0, just running the same conjugateGradientPrecond sample code on the same GPU with a sufficiently large matrix.
We know it is not AMGX-related (from what I can see, that specific function is not called anywhere in AMGX), but since a Windows AMGX build requires CUDA 9.0+, we are forced to use those versions. Is it a known bug? Could we direct it to the cuSPARSE API group?
Hi guys, how do I convert an OpenFOAM mesh matrix to the Matrix Market format for AMGX?
There are two errors in the 5-point 2D discretization of Poisson's equation. The A matrix is generated in the example code that ships with the AmgX library as follows:
1. for (int i = 0; i < n; i ++)
2. {
3. row_ptrs[i] = nnz;
4.
5. if (rank > 0 || i > ny)
6. {
7. col_indices[nnz] = (i + start_idx - ny);
8.
9. if (sizeof_m_val == 4)
10. {
11. ((float *)values)[nnz] = -1.f;
12. }
13. else if (sizeof_m_val == 8)
14. {
15. ((double *)values)[nnz] = -1.;
16. }
17.
18. nnz++;
19. }
20.
21. if (i % ny != 0)
22. {
23. col_indices[nnz] = (i + start_idx - 1);
24.
25. if (sizeof_m_val == 4)
26. {
27. ((float *)values)[nnz] = -1.f;
28. }
29. else if (sizeof_m_val == 8)
30. {
31. ((double *)values)[nnz] = -1.;
32. }
33.
34. nnz++;
35. }
36.
37. {
38. col_indices[nnz] = (i + start_idx);
39.
40. if (sizeof_m_val == 4)
41. {
42. ((float *)values)[nnz] = 4.f;
43. }
44. else if (sizeof_m_val == 8)
45. {
46. ((double *)values)[nnz] = 4.;
47. }
48.
49. nnz++;
50. }
51.
52. if ((i + 1) % ny == 0)
53. {
54. col_indices[nnz] = (i + start_idx + 1);
55.
56. if (sizeof_m_val == 4)
57. {
58. ((float *)values)[nnz] = -1.f;
59. }
60. else if (sizeof_m_val == 8)
61. {
62. ((double *)values)[nnz] = -1.;
63. }
64.
65. nnz++;
66. }
67.
68. if ( (rank != nranks - 1) || (i / ny != (nx - 1)) )
69. {
70. col_indices[nnz] = (i + start_idx + ny);
71.
72. if (sizeof_m_val == 4)
73. {
74. ((float *)values)[nnz] = -1.f;
75. }
76. else if (sizeof_m_val == 8)
77. {
78. ((double *)values)[nnz] = -1.;
79. }
80.
81. nnz++;
82. }
83. }
The two errors are in the following conditions:
Line 5. if (rank > 0 || i > ny)
Line 52. if ((i + 1) % ny == 0)
The correct statement should be:
Line 5. if (rank > 0 || i >= ny)
Line 52. if ((i + 1) % ny != 0)
After the above-mentioned corrections, the output for a 4x4 matrix is attached. The solution has been verified against the MATLAB solution.
Is there a way to easily insert debug prints in the library?
For example, I want to create a file with the X and B vector for every iteration. base/src/solver.cu:solve() contains the main iteration loop, and calls solve_iteration(b, x, xIsZero).
They are declared as 'Vector &' parameters; the solver I use (PBICGSTAB) receives 'VVector &' parameters.
I tried printf("%.5e\n", b[i]); in pbicgstab_solver.cu, but it warns 'non-POD class type passed through ellipsis' and prints zeros for b, where it should print ones. The value of x starts at zeros (the initial guess), but is 6.9e-310 (a subnormal value) in subsequent iterations.
I also tried printf("%.5e\n", b.pod()[i]);, but it results in a SIGSEGV.
Printing the types of 'x.pod()' and 'x.pod()[0]' results in 'N4amgx9PODVectorIdiEE' and 'd' respectively.
If monitor_residual is not set, or is set to 0 in the config, then AMGX_solver_get_status() silently returns 0 (success) regardless of convergence. Perhaps this should be made explicit in the reference documentation for AMGX_solver_get_status.
I'm working on a Nix recipe for AMGX; it currently looks like this. It seems to build without any issues. However, when I try to run an example I get the following:
$ examples/amgx_capi -m ../examples/matrix.mtx -c ../core/configs/FGMRES_AGGREGATION.json
AMGX version 2.0.0.130-opensource
Built on May 14 2018, 23:06:38
AMGX ERROR: file /tmp/nix-build-AmgX.drv-0/lafn8qxabfn95rh3bh3y0bi113kzwl8w-source/examples/amgx_capi.c line 245
AMGX ERROR: Error initializing amgx core.
Failed while initializing CUDA runtime in cudaRuntimeGetVersion
Any ideas?
I'm trying to test CG with DILU and ILU preconditioning for a solver comparison, and I'm having trouble getting them to work. I'm using the PCG_DILU.json provided by AMGX as the baseline config file, solving the provided example matrix (examples/matrix.mtx).
Environment:
CUDA 9.2
Ubuntu 18.04 LTS
PCG_DILU.json
{
"config_version": 2,
"solver": {
"preconditioner": {
"scope": "precond",
"solver": "MULTICOLOR_DILU"
},
"solver": "PCG",
"print_solve_stats": 1,
"obtain_timings": 1,
"max_iters": 20,
"monitor_residual": 1,
"scope": "main",
"tolerance": 1e-06,
"norm": "L2"
}
}
And I get the following results:
AMGX version 2.0.0.130-opensource
Built on Jun 27 2018, 09:03:34
Compiled with CUDA Runtime 9.2, using CUDA driver 9.2
Warning: No mode specified, using dDDI by default.
Reading data...
RHS vector was not found. Using RHS b=[1,…,1]^T
Solution vector was not found. Setting initial solution to x=[0,…,0]^T
Finished reading
iter Mem Usage (GB) residual rate
--------------------------------------------------------------
Ini 0 3.464102e+00
0 0 nan nan
1 0.0000 -nan -nan
2 0.0000 nan nan
3 0.0000 -nan -nan
4 0.0000 nan nan
5 0.0000 -nan -nan
6 0.0000 nan nan
7 0.0000 -nan -nan
8 0.0000 nan nan
9 0.0000 -nan -nan
10 0.0000 nan nan
11 0.0000 -nan -nan
12 0.0000 nan nan
13 0.0000 -nan -nan
14 0.0000 nan nan
15 0.0000 -nan -nan
16 0.0000 nan nan
17 0.0000 -nan -nan
18 0.0000 nan nan
19 0.0000 -nan -nan
--------------------------------------------------------------
Total Iterations: 20
Avg Convergence Rate: -nan
Final Residual: -nan
Total Reduction in Residual: -nan
Maximum Memory Usage: 0.000 GB
--------------------------------------------------------------
Total Time: 0.125037
setup: 0.000589824 s
solve: 0.124448 s
solve(per iteration): 0.00622238 s
For ILU preconditioning I made a minor modification to the original PCG_DILU.json:
{
"config_version": 2,
"solver": {
"preconditioner": {
"scope": "precond",
"ilu_sparsity_level": 0,
"coloring_level": 1,
"solver": "MULTICOLOR_ILU"
},
"solver": "PCG",
"print_solve_stats": 1,
"obtain_timings": 1,
"max_iters": 20,
"monitor_residual": 1,
"scope": "main",
"tolerance": 1e-06,
"norm": "L2"
}
}
This gives me the following output
AMGX version 2.0.0.130-opensource
Built on Jun 27 2018, 09:03:34
Compiled with CUDA Runtime 9.2, using CUDA driver 9.2
Warning: No mode specified, using dDDI by default.
Reading data...
RHS vector was not found. Using RHS b=[1,…,1]^T
Solution vector was not found. Setting initial solution to x=[0,…,0]^T
Finished reading
Caught amgx exception: Multicolor ILU smoother requires matrix to be reordered by color with ILU0 solver. Try setting reorder_cols_by_color=1 and insert_diag_while_reordering=1 in the multicolor_ilu solver scope in configuration file
at: /opt/software/repos/AMGX/core/src/solvers/multicolor_ilu_solver.cu:1902
Stack trace:
libamgxsh.so : amgx::multicolor_ilu_solver::MulticolorILUSolver_Base<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::pre_setup()+0x4ee
libamgxsh.so : amgx::multicolor_ilu_solver::MulticolorILUSolver_Base<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solver_setup(bool)+0x8a
libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::setup(amgx::Operator<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x1d4
libamgxsh.so : amgx::PCG_Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solver_setup(bool)+0x46
libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::setup(amgx::Operator<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x1d4
libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::setup_no_throw(amgx::Operator<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x7c
libamgxsh.so : amgx::AMG_Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::setup(amgx::Matrix<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&)+0x5c
libamgxsh.so : amgx::AMGX_ERROR amgx::(anonymous namespace)::set_solver_with_shared<(AMGX_Mode)8193, amgx::AMG_Solver, amgx::Matrix>(AMGX_solver_handle_struct*, AMGX_matrix_handle_struct*, amgx::Resources*, amgx::AMGX_ERROR (amgx::AMG_Solver<amgx::TemplateMode<(AMGX_Mode)8193>::Type>::*)(std::shared_ptr<amgx::Matrix<amgx::TemplateMode<(AMGX_Mode)8193>::Type> >))+0xc9
libamgxsh.so : AMGX_solver_setup()+0x183
examples/amgx_capi : main()+0x4d7
/lib/x86_64-linux-gnu/libc.so.6 : __libc_start_main()+0xe7
examples/amgx_capi : _start()+0x2a
Caught amgx exception: Error, setup must be called before calling solve
at: /opt/software/repos/AMGX/base/src/solvers/solver.cu:598
Stack trace:
libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x1fe1
libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve_no_throw(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::AMGX_STATUS&, bool)+0x82
libamgxsh.so : amgx::AMG_Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::AMGX_STATUS&, bool)+0x3d
libamgxsh.so : amgx::AMGX_ERROR amgx::(anonymous namespace)::solve_with<(AMGX_Mode)8193, amgx::AMG_Solver, amgx::Vector>(AMGX_solver_handle_struct*, AMGX_vector_handle_struct*, AMGX_vector_handle_struct*, amgx::Resources*, bool)+0xdf
libamgxsh.so : AMGX_solver_solve()+0x17f
examples/amgx_capi : main()+0x4eb
/lib/x86_64-linux-gnu/libc.so.6 : __libc_start_main()+0xe7
examples/amgx_capi : _start()+0x2a
As the output message states, I added reorder_cols_by_color=1 and insert_diag_while_reordering=1:
{
"config_version": 2,
"solver": {
"preconditioner": {
"scope": "precond",
"ilu_sparsity_level": 0,
"coloring_level": 1,
"reorder_cols_by_color": 1,
"insert_diag_while_reordering": 1,
"solver": "MULTICOLOR_ILU"
},
"solver": "PCG",
"print_solve_stats": 1,
"obtain_timings": 1,
"max_iters": 20,
"monitor_residual": 1,
"scope": "main",
"tolerance": 1e-06,
"norm": "L2"
}
}
Then I get the following output
AMGX version 2.0.0.130-opensource
Built on Jun 27 2018, 09:03:34
Compiled with CUDA Runtime 9.2, using CUDA driver 9.2
Warning: No mode specified, using dDDI by default.
Reading data...
RHS vector was not found. Using RHS b=[1,…,1]^T
Solution vector was not found. Setting initial solution to x=[0,…,0]^T
Finished reading
Caught amgx exception: Unsupported block size for Multicolor ILU solver, computeLUFactors
at: /opt/software/repos/AMGX/core/src/solvers/multicolor_ilu_solver.cu:1818
Stack trace:
libamgxsh.so : amgx::multicolor_ilu_solver::MulticolorILUSolver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::computeLUFactors()+0x88b
libamgxsh.so : amgx::multicolor_ilu_solver::MulticolorILUSolver_Base<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solver_setup(bool)+0xa2
libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::setup(amgx::Operator<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x1d4
libamgxsh.so : amgx::PCG_Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solver_setup(bool)+0x46
libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::setup(amgx::Operator<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x1d4
libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::setup_no_throw(amgx::Operator<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x7c
libamgxsh.so : amgx::AMG_Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::setup(amgx::Matrix<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&)+0x5c
libamgxsh.so : amgx::AMGX_ERROR amgx::(anonymous namespace)::set_solver_with_shared<(AMGX_Mode)8193, amgx::AMG_Solver, amgx::Matrix>(AMGX_solver_handle_struct*, AMGX_matrix_handle_struct*, amgx::Resources*, amgx::AMGX_ERROR (amgx::AMG_Solver<amgx::TemplateMode<(AMGX_Mode)8193>::Type>::*)(std::shared_ptr<amgx::Matrix<amgx::TemplateMode<(AMGX_Mode)8193>::Type> >))+0xc9
libamgxsh.so : AMGX_solver_setup()+0x183
examples/amgx_capi : main()+0x4d7
/lib/x86_64-linux-gnu/libc.so.6 : __libc_start_main()+0xe7
examples/amgx_capi : _start()+0x2a
Caught amgx exception: Error, setup must be called before calling solve
at: /opt/software/repos/AMGX/base/src/solvers/solver.cu:598
Stack trace:
libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x1fe1
libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve_no_throw(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::AMGX_STATUS&, bool)+0x82
libamgxsh.so : amgx::AMG_Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::AMGX_STATUS&, bool)+0x3d
libamgxsh.so : amgx::AMGX_ERROR amgx::(anonymous namespace)::solve_with<(AMGX_Mode)8193, amgx::AMG_Solver, amgx::Vector>(AMGX_solver_handle_struct*, AMGX_vector_handle_struct*, AMGX_vector_handle_struct*, amgx::Resources*, bool)+0xdf
libamgxsh.so : AMGX_solver_solve()+0x17f
examples/amgx_capi : main()+0x4eb
/lib/x86_64-linux-gnu/libc.so.6 : __libc_start_main()+0xe7
examples/amgx_capi : _start()+0x2a
Perhaps I'm doing something wrong when modifying the config file for the ILU preconditioner, but for DILU I'm using the original config provided with AMGX (and I get the same results with other solvers that use the DILU preconditioner).
Any help or feedback would be very much appreciated.
From the paper "AmgX: A Library for GPU Accelerated Algebraic Multigrid and Preconditioned Iterative Methods", I know that the AMGX library uses the CSR matrix format internally.
Can I provide data in the CSR format to AMGX directly?
Is CC 3.0 really not compatible? We have tested it and everything seems fine on a 660 Ti. Nonetheless, there are many occurrences of a hardcoded CC >= 3.5 requirement. Is it safe to change them to >= 3.0?
Hi,
I have built AMGX and it works fine. For verification, I am trying to reproduce the results in Table 2 on page S618 of the article on AMGX by M. Naumov et al. in SIAM J. Sci. Comput., vol. 37, no. 5, pp. S602-S626 (2015). I have tried different settings, but I cannot make the number of iterations agree for any of the cases (the hardware is different, so execution times will not agree). It is not clear from the paper exactly which settings AMGX is using: is it AMG alone, or AMG as a preconditioner for another iterative method? Is there by any chance a configuration file available for those runs?
Best Regards
Bjorn
I have a Titan V card and I've successfully compiled the code. But when I try to run the example I get the following errors. Does 'invalid device function' mean that Titan V is currently not supported? Thanks.
➜ build git:(master) examples/amgx_capi -m ../examples/matrix.mtx -c ../core/configs/FGMRES_AGGREGATION.json
AMGX version 2.0.0.130-opensource
Built on Mar 16 2018, 21:21:21
Compiled with CUDA Runtime 9.1, using CUDA driver 9.1
Warning: No mode specified, using dDDI by default.
Thrust failure: parallel_for failed: invalid device function
File and line number are not available for this exception.
Caught amgx exception: Cuda failure: 'invalid device function'
@marsaev
What is the current status regarding Windows builds? We have been trying to build the latest AMGX with both VS 2015/2017 and CUDA Toolkit 8/9, with no success. In https://github.com/bnase/AMGX you can find a fork with the latest fixes we have employed in order to build AMGX under Windows. It seems we have solved some, but not all, of the issues.
Could you compile a non-MPI build with MS VS 2017 + CMake 3.14.1 + CUDA 10.0? I can't solve it.
error LNK1104: cannot open file "D:\Users\HIT\Desktop\AMGX-master\build6\base\CMakeFiles\amgx_base.dir\src\Debug\amgx_base_generated_csr_multiply_sm20.cu.obj" amgxsh D:\Users\HIT\Desktop\AMGX-master\build6\LINK 1
error calling a host function("std::_Iterator_base12::_Iterator_base12") from a host device function("std::_Iterator_base12::_Iterator_base12 [subobject]") is not allowed amgx_base C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\include\thrust\system\cuda\detail\assign_value.h 78
Which unit tests are relevant and up to date?
We have tested extensively on different GPUs, with CUDA 8 vs 9.0, and quite a few tests keep failing under Linux. Besides the missing matrices, which we have filled in from either Matrix Market or proper use of generate_poisson, there are some tests that seem basic and keep failing. Find attached the logs of unit tests on an array of GPUs and configurations.
gtx970_9.0.log
gtx970.log
gtx1080ti.log
gtx660ti.log
All unit tests were compiled with CUDA 8.0, except gtx_970_9.0.log. The tests on the 660 Ti were performed after we changed some (possibly too strict? check #8) hardcoded CUDA_ARCH>=3.5 requirements to CUDA_ARCH>=3.0.
We will be more than happy to fix them, as long as we have a list of the tests that are supposed to work, because some seem very outdated or poorly maintained.
It takes hours. I don't know whether this is normal or a bug.
Hi! I am modifying the Poisson's equation solver example to incorporate a variable-coefficient Poisson's equation; however, I am unable to interpret the indexing scheme.
The precise issue is that, according to my understanding, every grid point in my original matrix (for which I am solving the 5-point discretized 2D Poisson's equation) should have 5 coefficients (which should be -1, -1, 4, -1 and -1); however, I find that the number of coefficients varies from 2 to 5 for different grid points.
To understand the example code before modifying it for the variable-coefficient case, I printed the grid-point index and its coefficient value for a simple 4x4 case with a single CPU and GPU process.
I ran the code as follows:
mpirun -np 1 examples/amgx_mpi_poisson5 -mode dDDI -p 4 4 -c ../core/configs/FGMRES_AGGREGATION_JACOBI.json
The output is attached herewith.
output.log
The example code is also attached.
amgx_mpi_poisson5pt.zip
Hello guys.
I am trying to figure out how to get multiple eigenvalues using your example 'eigen_examples/eigensolver.c'. Can it be done with a specific config file, or do I have to modify the code?
I suppose the algorithm works like this: starting from a random vector and modifying it through iterations, it gets closer and closer to an eigenvector.
Then I would have to start from different random vectors to obtain multiple eigenvalues, but that will certainly not give me all of the eigenvalues every time.
Is there any info I can read about these eigensolver algorithms, or do I have to get this information by reading all the code?
Cheers.
When I use amgx_capi to solve some sparse linear systems, it turns out that most of the default configs cannot be loaded through the API. A fallback default config is used instead, and the number of AMG levels is set to 1.
Cannot read file as JSON object, trying as AMGX config
Converting config string to current config version
For some large enough sparse systems, these default settings won't converge.
For instance, ../core/configs/FGMRES_AGGREGATION_JACOBI.json
will be loaded, however, ../core/configs/FGMRES_CLASSICAL_AGGRESSIVE_HMIS.json
and ../core/configs/AMG_CLASSICAL_L1_AGGRESSIVE_HMIS.json
will not.
Hi guys, I am trying to compile a simple program using AMGX. But when I try to initialize AMGX, e.g.:
#include <amgx_c.h>
int main()
{
    AMGX_initialize();
    return 0;
}
I get the error "undefined reference to AMGX_initialize".
What is happening?
Best regards.
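An "undefined reference" at this stage is a linker problem rather than a compiler one: the AMGX shared library (libamgxsh.so, the name that appears in the stack traces above) has to be passed to the linker. A hedged sketch, assuming an install prefix of $HOME/apps/amgx as used elsewhere in these reports (adjust paths to your setup):

```shell
# Hypothetical paths -- set AMGX_DIR to wherever AMGX was installed.
AMGX_DIR=$HOME/apps/amgx
gcc main.c -I"$AMGX_DIR/include" -L"$AMGX_DIR/lib" -lamgxsh -o main
# At run time the shared library must also be locatable:
export LD_LIBRARY_PATH="$AMGX_DIR/lib:$LD_LIBRARY_PATH"
```

The -lamgxsh flag is what resolves AMGX_initialize and the other AMGX_* symbols at link time.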
I tried many times, and can't find useful information on the Internet.
system information:
CUDA 10.0
Cmake cmake-3.13.0-rc3-win64-x64
IDE VS 2017
TOO many errors, shown below! This is driving me mad.
Found OpenMP_C: -openmp
Found OpenMP_CXX: -openmp
Found OpenMP: TRUE
Could NOT find MPI_C (missing: MPI_C_LIB_NAMES MPI_C_HEADER_DIR MPI_C_WORKS)
Could NOT find MPI_CXX (missing: MPI_CXX_LIB_NAMES MPI_CXX_HEADER_DIR MPI_CXX_WORKS)
Could NOT find MPI (missing: MPI_C_FOUND MPI_CXX_FOUND)
This is a MPI build:FALSE
CUDA_TOOLKIT_ROOT_DIR not found or specified
Could NOT find CUDA (missing: CUDA_TOOLKIT_ROOT_DIR CUDA_NVCC_EXECUTABLE CUDA_INCLUDE_DIRS CUDA_CUDART_LIBRARY)
Cuda libraries: CUDA_CUDART_LIBRARY-NOTFOUND
CMake Error at CMakeLists.txt:247 (STRING):
STRING sub-command REGEX, mode REPLACE needs at least 6 arguments total to
command.
CUDA_TOOLKIT_ROOT_DIR not found or specified
Could NOT find CUDA (missing: CUDA_TOOLKIT_ROOT_DIR CUDA_NVCC_EXECUTABLE CUDA_INCLUDE_DIRS CUDA_CUDART_LIBRARY)
CUDA_TOOLKIT_ROOT_DIR not found or specified
Could NOT find CUDA (missing: CUDA_TOOLKIT_ROOT_DIR CUDA_NVCC_EXECUTABLE CUDA_INCLUDE_DIRS CUDA_CUDART_LIBRARY)
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_CUDART_LIBRARY (ADVANCED)
linked by target "amgx" in directory D:/Users/HIT/Desktop/AMGX-master
linked by target "amgxsh" in directory D:/Users/HIT/Desktop/AMGX-master
linked by target "amgx_base" in directory D:/Users/HIT/Desktop/AMGX-master/base
linked by target "amgx_core" in directory D:/Users/HIT/Desktop/AMGX-master/core
linked by target "amgx_template_plugin" in directory D:/Users/HIT/Desktop/AMGX-master/template_plugin
linked by target "amgx_eigensolvers" in directory D:/Users/HIT/Desktop/AMGX-master/eigensolvers
linked by target "generate_poisson" in directory D:/Users/HIT/Desktop/AMGX-master/examples
linked by target "generate_poisson7_dist_renum" in directory D:/Users/HIT/Desktop/AMGX-master/examples
linked by target "amgx_tests_library" in directory D:/Users/HIT/Desktop/AMGX-master/tests
linked by target "amgx_tests_launcher" in directory D:/Users/HIT/Desktop/AMGX-master/tests
CUDA_TOOLKIT_INCLUDE (ADVANCED)
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/base
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/base
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/core
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/core
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/core
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/template_plugin
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/template_plugin
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/eigensolvers
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/eigensolvers
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/eigen_examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/eigen_examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/eigen_examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/eigen_examples
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
used as include directory in directory D:/Users/HIT/Desktop/AMGX-master/tests
cublas_library
linked by target "amgx" in directory D:/Users/HIT/Desktop/AMGX-master
linked by target "amgxsh" in directory D:/Users/HIT/Desktop/AMGX-master
cusolver_library
linked by target "amgx" in directory D:/Users/HIT/Desktop/AMGX-master
linked by target "amgxsh" in directory D:/Users/HIT/Desktop/AMGX-master
linked by target "generate_poisson" in directory D:/Users/HIT/Desktop/AMGX-master/examples
linked by target "generate_poisson7_dist_renum" in directory D:/Users/HIT/Desktop/AMGX-master/examples
cusparse_library
linked by target "amgx" in directory D:/Users/HIT/Desktop/AMGX-master
linked by target "amgxsh" in directory D:/Users/HIT/Desktop/AMGX-master
linked by target "generate_poisson" in directory D:/Users/HIT/Desktop/AMGX-master/examples
linked by target "generate_poisson7_dist_renum" in directory D:/Users/HIT/Desktop/AMGX-master/examples
linked by target "amgx_tests_launcher" in directory D:/Users/HIT/Desktop/AMGX-master/tests
Configuring incomplete, errors occurred!
See also "D:/Users/HIT/Desktop/AMGX-master/build/CMakeFiles/CMakeOutput.log".
Yours,
Ning
Hi,
Thanks for open-sourcing AMGX! Looking forward to using it.
Meanwhile, I am trying to build it on my up-to-date Arch Linux laptop with a Quadro and CUDA 9.1. I am aiming for a distributed version, so I do the following:
install_dir=${HOME}/apps/amgx
export OMPI_CC=/opt/cuda/bin/gcc
export OMPI_CXX=/opt/cuda/bin/g++
cmake \
-DCMAKE_INSTALL_PREFIX=$install_dir \
../
I am getting a slightly confusing error about setting CUDA 9.5 flags inside CMakeLists.txt, even though 9.1 was correctly detected.
-- Found CUDA: /opt/cuda (found version "9.1")
Cuda libraries: /opt/cuda/lib64/libcudart_static.a;-lpthread;dl;/usr/lib/librt.so
CMake Error at CMakeLists.txt:247 (message):
Default flags for CUDA 9.5 are not set. Edit CMakeLists.txt to set them
Could you please comment on this and point me in the right direction?
git clone https://github.com/NVIDIA/AMGX.git
cd amgx; mkdir build; cd build; cmake ..
This fails because I don't pass any intended CUDA architectures; when removing CMakeLists.txt:247, cmake runs successfully.
Running 'make' then results in an error at around 7%:
amgx/core/src/classical/interpolators/distance2.cu(1343): error: identifier "sign" is undefined
detected during:
instantiation of "void amgx::distance2::compute_inner_sum_kernel<Value_type,CTA_SIZE,SMEM_SIZE,WARP_SIZE>(int, const int *, const int *, const Value_type *, const int *, const __nv_bool *, const int *, const int *, const int *, const int *, const Value_type *, const int *, Value_type *, int, int *, int *) [with Value_type=double, CTA_SIZE=256, SMEM_SIZE=128, WARP_SIZE=32]"
(2418): here
instantiation of "void amgx::Distance2_Interpolator<amgx::TemplateConfig<(AMGX_MemorySpace)1, t_vecPrec, t_matPrec, t_indPrec>>::generateInterpolationMatrix_1x1(amgx::Distance2_Interpolator<amgx::TemplateConfig<(AMGX_MemorySpace)1, t_vecPrec, t_matPrec, t_indPrec>>::Matrix_d &, amgx::Distance2_Interpolator<amgx::TemplateConfig<(AMGX_MemorySpace)1, t_vecPrec, t_matPrec, t_indPrec>>::IntVector &, amgx::Distance2_Interpolator<amgx::TemplateConfig<(AMGX_MemorySpace)1, t_vecPrec, t_matPrec, t_indPrec>>::BVector &, amgx::Distance2_Interpolator<amgx::TemplateConfig<(AMGX_MemorySpace)1, t_vecPrec, t_matPrec, t_indPrec>>::IntVector &, amgx::Distance2_Interpolator<amgx::TemplateConfig<(AMGX_MemorySpace)1, t_vecPrec, t_matPrec, t_indPrec>>::Matrix_d &, void *) [with t_vecPrec=(AMGX_VecPrecision)0, t_matPrec=(AMGX_MatPrecision)0, t_indPrec=(AMGX_IndPrecision)2]"
(2521): here
Line 1343 indeed calls the function sign(), which is defined, but only as a device function.
I do have CUDA 10 installed, but cuda-9.2 precedes cuda-10 in $PATH.
base/include/amgx_timer.h uses undefined variables (t1 and t2) in code that is activated during a Debug build.
AMGX/base/include/amgx_timer.h
Line 233 in 732338c
Where possible. Probably better to wait for CUDA 9.2 release for updated cuSolver functionality.
The GMRES solver produces a segfault when preconditioner="NOSOLVER":
[atrikut@node1144 examples]$ cat ../configs/core/GMRES.json
{
"config_version": 2,
"solver": {
"preconditioner": {
"scope": "amg",
"solver": "NOSOLVER"
},
"use_scalar_norm": 1,
"solver": "GMRES",
"print_solve_stats": 1,
"obtain_timings": 1,
"monitor_residual": 1,
"convergence": "RELATIVE_INI_CORE",
"scope": "main",
"tolerance": 1e-6,
"norm": "L2"
}
}
[atrikut@node1144 examples]$ ./amgx_capi -m matrix.mtx -c ../configs/core/GMRES.json
AMGX version 2.0.0.130-opensource
Built on May 17 2018, 05:29:30
Compiled with CUDA Runtime 8.0, using CUDA driver 9.0
Warning: No mode specified, using dDDI by default.
Reading data...
RHS vector was not found. Using RHS b=[1,…,1]^T
Solution vector was not found. Setting initial solution to x=[0,…,0]^T
Finished reading
iter Mem Usage (GB) residual rate
--------------------------------------------------------------
Ini 0 3.464102e+00
0 0 1.845471e+00 0.5327
1 0.0000 1.541877e+00 0.8355
2 0.0000 1.374225e+00 0.8913
3 0.0000 1.366903e+00 0.9947
4 0.0000 1.040855e+00 0.7615
5 0.0000 1.026638e+00 0.9863
6 0.0000 8.614123e-01 0.8391
7 0.0000 6.599583e-01 0.7661
8 0.0000 6.596676e-01 0.9996
9 0.0000 6.593714e-01 0.9996
10 0.0000 6.331763e-01 0.9603
Caught signal 11 - SIGSEGV (segmentation violation)
/home/atrikut/local/AMGX/build/libamgxsh.so : amgx::handle_signals(int)+0xbb
/lib64/libpthread.so.0 : ()+0xf680
/home/atrikut/local/AMGX/build/libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x11
/home/atrikut/local/AMGX/build/libamgxsh.so : amgx::GMRES_Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve_iteration(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x5d2
/home/atrikut/local/AMGX/build/libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, bool)+0x554
/home/atrikut/local/AMGX/build/libamgxsh.so : amgx::Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve_no_throw(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::AMGX_STATUS&, bool)+0x77
/home/atrikut/local/AMGX/build/libamgxsh.so : amgx::AMG_Solver<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >::solve(amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::Vector<amgx::TemplateConfig<(AMGX_MemorySpace)1, (AMGX_VecPrecision)0, (AMGX_MatPrecision)0, (AMGX_IndPrecision)2> >&, amgx::AMGX_STATUS&, bool)+0x3d
/home/atrikut/local/AMGX/build/libamgxsh.so : amgx::AMGX_ERROR amgx::(anonymous namespace)::solve_with<(AMGX_Mode)8193, amgx::AMG_Solver, amgx::Vector>(AMGX_solver_handle_struct*, AMGX_vector_handle_struct*, AMGX_vector_handle_struct*, amgx::Resources*, bool)+0x268
/home/atrikut/local/AMGX/build/libamgxsh.so : AMGX_solver_solve()+0x147
./amgx_capi : main()+0x34d
/lib64/libc.so.6 : __libc_start_main()+0xf5
./amgx_capi() [0x402233]
Segmentation fault (core dumped)
But when the preconditioner is not NOSOLVER:
[atrikut@node1144 examples]$ cat ../configs/core/GMRES.json
{
"config_version": 2,
"solver": {
"preconditioner": {
"scope": "amg",
"solver": "BLOCK_JACOBI"
},
"use_scalar_norm": 1,
"solver": "GMRES",
"print_solve_stats": 1,
"obtain_timings": 1,
"monitor_residual": 1,
"convergence": "RELATIVE_INI_CORE",
"scope": "main",
"tolerance": 1e-6,
"norm": "L2"
}
}
[atrikut@node1144 examples]$ ./amgx_capi -m matrix.mtx -c ../configs/core/GMRES.json
AMGX version 2.0.0.130-opensource
Built on May 17 2018, 05:29:30
Compiled with CUDA Runtime 8.0, using CUDA driver 9.0
Warning: No mode specified, using dDDI by default.
Reading data...
RHS vector was not found. Using RHS b=[1,…,1]^T
Solution vector was not found. Setting initial solution to x=[0,…,0]^T
Finished reading
iter Mem Usage (GB) residual rate
--------------------------------------------------------------
Ini 0 3.464102e+00
0 0 1.934535e+00 0.5585
1 0.0000 1.934535e+00 1.0000
2 0.0000 1.934535e+00 1.0000
3 0.0000 1.644110e+00 0.8499
4 0.0000 1.507196e+00 0.9167
5 0.0000 1.213641e+00 0.8052
6 0.0000 1.160986e+00 0.9566
7 0.0000 1.092385e+00 0.9409
8 0.0000 1.088166e+00 0.9961
9 0.0000 8.810365e-01 0.8097
10 0.0000 4.990786e-01 0.5665
11 0.0000 4.764949e-01 0.9547
12 0.0000 1.059929e-01 0.2224
13 0.0000 1.677930e-15 0.0000
--------------------------------------------------------------
Total Iterations: 14
Avg Convergence Rate: 0.0806
Final Residual: 1.677930e-15
Total Reduction in Residual: 4.843766e-16
Maximum Memory Usage: 0.000 GB
--------------------------------------------------------------
Total Time: 0.0461597
setup: 0.000248704 s
solve: 0.045911 s
solve(per iteration): 0.00327936 s
[atrikut@node1144 examples]$
Hi all,
How can I generate a bigger problem size instead of the sample input "matrix.mtx"?
Thanks
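One option, sketched from what already appears in these reports (the build produces a generate_poisson tool, and the MPI Poisson examples accept a grid size through -p), is to generate a larger Poisson problem instead of reading matrix.mtx. Treat the exact paths as assumptions:

```shell
# Larger 2D Poisson problem: a 256 x 256 grid instead of the 4 x 4
# used in the earlier report. Paths assume a run from the build tree.
mpirun -np 1 examples/amgx_mpi_poisson5 -mode dDDI -p 256 256 \
    -c ../core/configs/FGMRES_AGGREGATION_JACOBI.json
```

Increasing the two -p arguments scales the number of unknowns quadratically, so the problem size grows quickly.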
Is there currently a way to solve a system A*X=B, where X and B are not single vectors but multiple vectors, at once? The only part of the documentation I could find that approaches this topic talks about computing multiple right-hand sides in succession.
There are certainly applications where prior knowledge of multiple right-hand-side vectors b^i is given, and computing the solutions to them simultaneously could lead to efficiency gains.
Hello, I'm working on a Python interface to AMGX:
https://github.com/shwina/pyamgx
The project is new, but there is enough to set up and solve a system (single GPU). I'm looking for any advice/comments on the API and desired features. Thanks!
I was using VS 2015 Update 3 and CMake 3.13.1 to compile. After running into the problem in https://github.com/NVIDIA/AMGX/issues/36 and solving it, I got another problem while building:
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1\include\thrust/detail/allocator/allocator_traits.inl(230): error : calling a __host__ function("std::_Iterator_base12::_Iterator_base12") from a __host__ __device__ function("std::_Iterator_base12::_Iterator_base12 [subobject]") is not allowed
1>
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1\include\thrust/detail/allocator/allocator_traits.inl(230): error : calling a __host__ function("std::_Iterator_base12::~_Iterator_base12") from a __host__ __device__ function("std::_Iterator_base12::~_Iterator_base12 [subobject]") is not allowed
1>
1> 2 errors detected in the compilation of "C:/Users/i/AppData/Local/Temp/tmpxft_00005b44_00000000-10_comms_mpi_hostbuffer_stream.cpp1.ii".
1> comms_mpi_hostbuffer_stream.cu
1> CMake Error at amgx_base_generated_comms_mpi_hostbuffer_stream.cu.obj.Debug.cmake:283 (message):
1> Error generating file
1> C:/AMGX-master/build/base/CMakeFiles/amgx_base.dir/src/distributed/Debug/amgx_base_generated_comms_mpi_hostbuffer_stream.cu.obj
and
2>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1\include\thrust/system/cuda/detail/assign_value.h(78): error : calling a __host__ function("std::_Iterator_base12::_Iterator_base12") from a __host__ __device__ function("std::_Iterator_base12::_Iterator_base12 [subobject]") is not allowed
2>
2>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1\include\thrust/system/cuda/detail/assign_value.h(78): error : calling a __host__ function("std::_Iterator_base12::~_Iterator_base12") from a __host__ __device__ function("std::_Iterator_base12::~_Iterator_base12 [subobject]") is not allowed
2>
2> 2 errors detected in the compilation of "C:/Users/i/AppData/Local/Temp/tmpxft_00002bd0_00000001-10_matrix_analysis.cpp1.ii".
2> matrix_analysis.cu
2> CMake Error at amgx_core_generated_matrix_analysis.cu.obj.Debug.cmake:283 (message):
2> Error generating file
2> C:/AMGX-master/build/core/CMakeFiles/amgx_core.dir/src/Debug/amgx_core_generated_matrix_analysis.cu.obj
I was really confused that these errors occurred in the CUDA Thrust header files, and I couldn't figure it out.
I wonder if anyone else has met the same problem.
Hi,
Is anyone from the developers' team, or any user, at GTC 2018 in San Jose?
I would be interested in meeting developers / other users to better understand the road map of AMGX and to share experiences in using AMGX.
Best Regards,
Andrea Borsic
Hello,
In 2015, an AMGX adapter was created within Trilinos's MueLu package [E. Furst, A. Prokopenko, J. Hu, 'Creating an AMGX adapter within the MueLu package', CCR Summer Proceedings 2015]. It is, however, restricted to a single GPU. Do you know whether work is currently being done (or planned) to make it work with more than a single GPU?
Thank you very much.
With best regards,
Serge
Hello, I can't sort out how to extract the eigenvalues and eigenvectors from the AMGX_eigensolver_handle. I tried following the eigen_example; however, it is not clear how to get this information from the example.
Thank you,
Miguel