ddemidov / amgcl_benchmarks


Code accompanying AMGCL benchmarks

Home Page: http://amgcl.readthedocs.io/en/latest/benchmarks.html

CMake 10.50% C++ 76.49% Cuda 9.25% Python 2.80% Shell 0.96%

amgcl_benchmarks's Introduction

Source code accompanying AMGCL benchmarks.

The system matrix and the RHS used for the Navier-Stokes benchmarks may be retrieved here:

Description	DOI
Shared memory benchmark
Distributed memory benchmark


amgcl_benchmarks's Issues

Any command line arguments needed?

Dear AMGCL developers, I am trying to get to grips with the shared memory version of the benchmarks, with a particular interest in the Navier-Stokes test. Since I did not find any further documentation, I launch the executables without any command line options (the matrix and the RHS are in the same directory as the executables).
smem_ns_amgcl returns after 198 iterations with an error < 1e-4.
smem_ns_amgcl_scalar needs 201 iterations and likewise finishes with an error < 1e-4.
However, smem_ns_schur (which I expected to perform better than the other two) stops after 100 iterations without having converged (error = 3e-3).
Am I missing anything? Are any command line options needed?

cuda error

Hi,

Thank you very much for implementing this wonderful library. I really want to use it. My problem is with the benchmark file amgcl.cu in the shared memory benchmark for the Poisson equation. I have a K20m GPU installed and can compile this file, but when I run it, I get:

terminate called after throwing an instance of 'std::runtime_error'
what(): CUDA error 2 at "./amgcl/amgcl/backend/cuda.hpp:247
Abort (core dumped)

I did not change anything in the source code; I only compiled it with make.

I really need your help to find out what I did wrong! Thank you so much!

Here is my makefile:

CUDA_VERSION = 6.5.14
LINKFLAGS = -Xcompiler "-Wl,-rpath,/usr/local/apps/cuda/cuda-$(CUDA_VERSION)/lib64"
LIBS = -L/usr/local/apps/cuda/cuda-$(CUDA_VERSION)/lib64 -lcudart -lcusparse

poisson_cu.o: poisson.cu
	$(NVCC) -std=c++11 -O2 $(INCLUDES) $(CUDA_INC) poisson.cu -c -o poisson_cu.o

poisson_cu: poisson_cu.o
	$(NVCC) $(LINKFLAGS) $(LIBS) poisson_cu.o -o poisson_cu

matrix files for navier-stokes benchmark

Are there copies somewhere of the data files used in this benchmark?

Compilation issues

Hi,
I am currently involved in an open-source project for particle simulations (https://www.sciencedirect.com/science/article/pii/S0010465519300852?via%3Dihub), and I am working on comparing backends for solving sparse linear systems of equations (Poisson problem). I have found your repository particularly handy, thank you. Unfortunately, the examples that use amgcl (I have started with shared memory) do not seem to compile with the latest version of amgcl. The errors show up when instantiating and applying the solver, e.g.

type/value mismatch at argument 2 in template parameter list for ‘template<class Precond, class IterativeSolver> class amgcl::mpi::make_solver’

Could you please clarify whether the examples compile correctly on your side? If so, I will look for other possible causes of the errors. If not, and some interfaces of the framework have changed, could you please update the examples? I am working on it myself, but it would save me some time.
Thanks in advance!

Parallel Efficiency Benchmarks

Hi Denis,

I've been trying out the benchmarks for the 3D Poisson case (3375000 unknowns, 23490000 nonzeros) using OpenMP, basically the C++ code in shared_mem/poisson/amgcl.cpp, with AMGCL 1.2.0.

The parallel efficiency for the solve time doesn't seem to be that good, about a factor of 2 speedup maximum (to ~6 seconds fastest), when I go from 1 to 6 OpenMP threads, whereas the results on the AMGCL website look like the speedup should be >4x with 6 MPI processes.

Would you have any suggestions on this? Alternatively, could you share more details on the architecture and compiler used for the 3D Poisson benchmarks? Perhaps some of these differences could explain the discrepancy.

Other setup details:
- i7-8700 CPU, 6 cores, 64 GB memory
- Windows, Visual Studio C++ compiler 2013/2019 (similar performance), with built-in OpenMP
- Boost 1.62/1.72 (similar performance)
- solver as in the code (smoothed aggregation, spai0, bicgstabl(2))

We're using Python bindings in the final application but we wanted to improve the parallel performance on the C++ side first. Appreciate the help. Regards,

Gary
