I am trying to run the NNP example in input/in.nnp, but after the symmetry function setup is completed I get the following error in the SETUP: SYMMETRY FUNCTION GROUPS section:
terminate called after throwing an instance of 'std::runtime_error'
what(): View bounds error of view AngularCounter ( 1 < 1 )
Traceback functionality not available
I am starting CabanaMD with the following command:
~/local/src/openmpi/4.0.4/build/bin/mpiexec -n 1 build/bin/cbnMD -il input/in.nnp --device-type SERIAL
The error occurs with any of the three device targets: SERIAL, OPENMP and CUDA.
When I run with gdb and look at the backtrace I find:
#6 0x0000555557159719 in Kokkos::Impl::throw_runtime_exception(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#7 0x0000555556d48c19 in Kokkos::Impl::view_verify_operator_bounds<Kokkos::HostSpace, Kokkos::Impl::ViewMapping<Kokkos::ViewTraits<int*, Kokkos::LayoutRight, Kokkos::HostSpace>, void>, int> (tracker=..., map=...)
at /home/andi/local/src/kokkos/3.1.01/build/install/include/impl/Kokkos_ViewMapping.hpp:3813
#8 0x0000555556bd7b53 in Kokkos::View<int*, Kokkos::LayoutRight, Kokkos::HostSpace>::operator()<int> (i0=<optimized out>, this=<optimized out>)
at /home/andi/local/src/kokkos/3.1.01/build/install/include/Kokkos_View.hpp:1241
#9 nnpCbn::Element::setupSymmetryFunctionGroups<Kokkos::View<double** [15], Kokkos::LayoutRight, Kokkos::HostSpace>, Kokkos::View<int***, Kokkos::LayoutRight, Kokkos::HostSpace>, Kokkos::View<int*, Kokkos::LayoutRight, Kokkos::HostSpace> > (this=0x55555a3a4cf0, SF=..., SFGmemberlist=..., attype=0,
h_numSFperElem=..., h_numSFGperElem=..., maxSFperElem=27)
at /home/andi/local/src/CabanaMD/master/src/force_types/nnp_element_impl.h:375
#10 0x00005555563ec769 in nnpCbn::Mode<Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace> >::setupSymmetryFunctionGroups (this=0x55555b920de0)
at /home/andi/local/src/CabanaMD/master/src/force_types/nnp_mode_impl.h:615
#11 0x000055555633314b in ForceNNP<System<Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, AoSoA6>, System_NNP<Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, AoSoA3>, NeighborVerlet<System<Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, AoSoA6>, Cabana::FullNeighborTag, Cabana::VerletLayout2D>, Cabana::SerialOpTag, Cabana::SerialOpTag>::init_coeff (this=0x55555c21f950,
args=std::vector of length 1, capacity 1 = {...})
at /home/andi/local/src/CabanaMD/master/src/force_types/force_nnp_cabana_neigh_impl.h:59
#12 0x0000555555f0c0aa in CbnMD<System<Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, AoSoA6>, NeighborVerlet<System<Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, AoSoA6>, Cabana::FullNeighborTag, Cabana::VerletLayout2D> >::init (
this=0x55555b9bb680, commandline=...)
at /home/andi/local/src/CabanaMD/master/src/cabanamd_impl.h:178
which brings me to src/force_types/nnp_element_impl.h:375 (frame #9), after which the call descends into Kokkos. Do you have any idea why this error happens and how I can resolve it?
I used the following setup to compile Kokkos, Cabana and CabanaMD:
My system:
- Linux Mint 19.3
- gcc version 7.5.0
- CUDA version 11.0
- OpenMPI version 4.0.4 (compiled with CUDA support)
- NVIDIA GTX 1060 6GB GPU (Pascal61 architecture)
Kokkos (version 3.1.01) build flags:
In the nvcc_wrapper script I set default_arch="sm_61".
-DCMAKE_CXX_COMPILER=${KOKKOS_SRC_DIR}/bin/nvcc_wrapper \
-DCMAKE_INSTALL_PREFIX=${KOKKOS_SRC_DIR}/build/install \
-DKokkos_CUDA_DIR=/usr/local/cuda-11.0/ \
-DKokkos_ENABLE_SERIAL=On \
-DKokkos_ENABLE_OPENMP=On \
-DKokkos_ENABLE_CUDA=On \
-DKokkos_ENABLE_CUDA_LAMBDA=On \
-DKokkos_ENABLE_CUDA_UVM=On \
-DKokkos_ARCH_PASCAL61=On \
-DKokkos_ENABLE_HWLOC=On \
-DKokkos_ENABLE_TESTS=On \
-DKokkos_ENABLE_DEBUG=On \
-DKokkos_ENABLE_DEBUG_BOUNDS_CHECK=On \
Cabana (66c94f6) build flags:
-DCMAKE_BUILD_TYPE="Debug" \
-DCMAKE_PREFIX_PATH="${KOKKOS_INSTALL_DIR};${HOME}/local/src/openmpi/4.0.4/build/" \
-DCMAKE_INSTALL_PREFIX=${CABANA_INSTALL_DIR} \
-DCMAKE_CXX_COMPILER=${KOKKOS_SRC_DIR}/bin/nvcc_wrapper \
-DMPI_CXX_COMPILER=${HOME}/local/src/openmpi/4.0.4/build/bin/mpic++ \
-DCabana_REQUIRE_CUDA=On \
-DCabana_ENABLE_MPI=On \
-DCabana_ENABLE_EXAMPLES=On \
-DCabana_ENABLE_TESTING=On \
CabanaMD (562600e) build flags:
-DCMAKE_BUILD_TYPE="Debug" \
-DCMAKE_CXX_COMPILER=${KOKKOS_DIR}/bin/nvcc_wrapper \
-DCMAKE_PREFIX_PATH="${CABANA_DIR};${HOME}/local/src/openmpi/4.0.4/build/" \
-DCMAKE_INSTALL_PREFIX=${CABANAMD_INSTALL_DIR} \
-DMPI_CXX_COMPILER=${HOME}/local/src/openmpi/4.0.4/build/bin/mpic++ \
-DCabana_ENABLE_MPI=On \
-DCabanaMD_VECTORLENGTH=32 \
-DN2P2_DIR=${HOME}/local/src/n2p2-singraber/ \
-DCabanaMD_ENABLE_NNP=On \
-DCabanaMD_MAXSYMMFUNC_NNP=30 \
-DCabanaMD_VECTORLENGTH_NNP=1 \
-DCabanaMD_ENABLE_TESTING=ON \
There is also an additional issue with the CabanaMD tests, which may or may not be related:
The tests of Kokkos and Cabana pass without any errors, but when I run ctest -VV in the CabanaMD build directory I get the same error for both CUDA-related tests (Integrator_test_CUDA and Neighbor_test_CUDA):
[ RUN ] cuda.reversibility_test
Kokkos::View ERROR: attempt to access inaccessible memory space
Thread 1 "Integrator_test" received signal SIGABRT, Aborted.
Running the tests manually and backtracing with gdb shows:
#3 0x000055555556caa0 in Kokkos::abort (
message=0x55555567fe30 "Kokkos::View ERROR: attempt to access inaccessible memory space")
at /home/andi/local/src/kokkos/3.1.01/build/install/include/impl/Kokkos_Error.hpp:175
#4 0x0000555555576ee7 in Kokkos::View<double*, Kokkos::Device<Kokkos::Cuda, Kokkos::CudaSpace> >::verify_space<Kokkos::HostSpace, false>::check ()
at /home/andi/local/src/kokkos/3.1.01/build/install/include/Kokkos_View.hpp:882
#5 Kokkos::View<double*, Kokkos::Device<Kokkos::Cuda, Kokkos::CudaSpace> >::operator()<int> (i0=<optimized out>, this=0x7fffffffc730)
at /home/andi/local/src/kokkos/3.1.01/build/install/include/Kokkos_View.hpp:1241
#6 Test::createParticles<System<Kokkos::Device<Kokkos::Cuda, Kokkos::CudaSpace>, AoSoA6> > (num_particle=1000, num_ghost=200, box_min=-12.295999999999999,
box_max=10.904)
at /home/andi/local/src/CabanaMD/master/unit_test/tstIntegrator.hpp:38
#7 0x000055555556e6d2 in Test::testIntegratorReversibility<System<Kokkos::Device<Kokkos::Cuda, Kokkos::CudaSpace>, AoSoA6> > (steps=100)
at /home/andi/local/src/CabanaMD/master/unit_test/tstIntegrator.hpp:91
and
#3 0x000055555557190b in Kokkos::abort (
message=0x5555556c2ca8 "Kokkos::View ERROR: attempt to access inaccessible memory space")
at /home/andi/local/src/kokkos/3.1.01/build/install/include/impl/Kokkos_Error.hpp:175
#4 0x000055555557bc05 in Kokkos::View<double*, Kokkos::Device<Kokkos::Cuda, Kokkos::CudaSpace> >::verify_space<Kokkos::HostSpace, false>::check ()
at /home/andi/local/src/kokkos/3.1.01/build/install/include/Kokkos_View.hpp:882
#5 Kokkos::View<double*, Kokkos::Device<Kokkos::Cuda, Kokkos::CudaSpace> >::operator()<int> (i0=<optimized out>, this=0x7fffffffc730)
at /home/andi/local/src/kokkos/3.1.01/build/install/include/Kokkos_View.hpp:1241
#6 Test::createAtoms<System<Kokkos::Device<Kokkos::Cuda, Kokkos::CudaSpace>, AoSoA6> > (num_atom=1000, num_ghost=200, box_min=-12.295999999999999,
box_max=10.904)
at /home/andi/local/src/CabanaMD/master/unit_test/tstNeighbor.hpp:255
#7 0x0000555555573790 in Test::testNeighborListPartialRange<System<Kokkos::Device<Kokkos::Cuda, Kokkos::CudaSpace>, AoSoA6>, NeighborVerlet<System<Kokkos::Device<Kokkos::Cuda, Kokkos::CudaSpace>, AoSoA6>, Cabana::FullNeighborTag, Cabana::VerletLayout2D> > (half_neigh=false)
at /home/andi/local/src/CabanaMD/master/unit_test/tstNeighbor.hpp:303
for Integrator_test_CUDA and Neighbor_test_CUDA, respectively.
Sorry for this overly long post. I am out of ideas for now, so any help is greatly appreciated!
Thank you!!