Comments (19)
What worked was to run bootstrap.sh and then load the following modules:
`[tkurth@cori08 ~]$ module list
Currently Loaded Modulefiles:
1) modules/3.2.6.7 9) pmi/5.0.10-1.0000.11050.0.0.ari 17) atp/2.0.2
2) nsg/1.2.0 10) dmapp/7.1.0-12.37 18) PrgEnv-intel/6.0.3
3) modules/3.2.10.4 11) gni-headers/5.0.7-3.1 19) craype-mic-knl
4) craype-network-aries 12) xpmem/0.1-4.5 20) cray-shmem/7.4.0
5) craype/2.5.5 13) job/1.5.5-3.58 21) cray-mpich/7.4.0
6) cray-libsci/16.06.1 14) dvs/2.5_0.9.0-2.155 22) intel/17.0.0.098
7) udreg/2.3.2-4.6 15) alps/6.1.3-17.12 23) altd/2.0
8) ugni/6.0.12-2.1 16) rca/1.0.0-6.21 24) cray-memkind`
Then run the following configure command:
../src/configure --prefix=${installpath} \
--enable-simd=AVX512MIC \
--enable-precision=double \
--enable-comms=mpi \
--host=x86_64-unknown-linux \
CXX="CC" \
CXXFLAGS="-mkl -xMIC-AVX512 -std=c++11 -I/project/projectdirs/mpccc/tkurth/NESAP/GRID/src/include" \
CC="cc" \
CFLAGS="-mkl -xMIC-AVX512 -std=c99 -I/project/projectdirs/mpccc/tkurth/NESAP/GRID/src/include" \
LDFLAGS="-mkl -lmemkind"
Linking with -lmemkind can be required because, in Intel 2016 and 2017, MKL makes use of hbw_alloc calls but does not link against libmemkind by default. So if you don't link it and happen to run the "wrong routine", you will see segfaults.
In MKL 2017, this problem was solved for most of the standard BLAS/LAPACK routines, but not for the newly introduced deep-learning-optimized routines (such as convolution routines, pooling routines, etc.). I think GRID does not use any of those, but it is good to make sure and link properly from the beginning.
from grid.
Hello Thorsten,
1. Did you run the bootstrap.sh script to generate the configure script?
2. What is your configure command line? The configure command line should be
   CXX=<your compiler> ./configure ....
3. We acknowledge this; a fix will be released soon.
4. I think it is related to 1).
from grid.
- Not surprised. Antonin reworked the build system, and I was worried that the complexities of modules and CC wrappers on the Crays might make something go wrong.
- Did you specify CXX=CC? I'm able to override with CXX=mpicxx, for example, on other machines, or with CXX=clang++-3.9, and am surprised you weren't able to override.
  That said, Shoji seems to be able to compile on an XC40 just fine (except for the missing typecast I committed last night). Please try develop again on that.
- Please specify the full configure command line, the configure output, and the output from "make V=1".
- Personally, I don't much like hiding the compile flow details by default, e.g.
  CXX Benchmark_comms.o
  CXXLD Benchmark_comms
  and would prefer not to hide them by default, since there really is complexity, and pretending it is all magic just makes things harder to debug. It will be the death of open source.
  But others disagree with me, so feedback from many people is welcome to get a feel for the average opinion.
from grid.
p.s. I committed a patch to Travis for the typecast in Stencil.h
Important: are any of the NERSC Cray systems available for remote login and compilation just now?
It would be good if we could try it ourselves, especially since Travis provides neither the Intel compiler nor the Cray wrappers, so this is very hard to catch in our continuous integration framework.
from grid.
Hi Thorsten,
We cannot really help you if we don't have the specifics of the build. Please:
- confirm that you are using the HEAD of develop
- give us the configure command line
- give us the configure summary (at the end of the configure output)
- give us the config.log file
- give us the output of make V=1
from grid.
Hello, using mpicxx or something related is not a good option, as that would basically disable priority access to the Aries interconnect. To my knowledge, there is no good way of circumventing the Cray wrappers and static linking when one wants good performance at scale on an XC40.
from grid.
Here are my build details:
commit:
commit 7af9b8731847667eaf3b2e33a2457b977a7254ae
Author: paboyle <[email protected]>
Date: Tue Oct 18 09:51:37 2016 +0100
build script:
#!/bin/bash
installpath=$(pwd)/install/grid_dp
mkdir -p build
cd build
../src/configure --prefix=${installpath} \
--enable-simd=AVX512MIC \
--enable-precision=double \
--enable-comms=mpi \
--host=x86_64-unknown-linux \
CXX="CC" \
CXXFLAGS="-mkl -xMIC-AVX512 -std=c++11 -I/project/projectdirs/mpccc/tkurth/NESAP/GRID/src/include" \
CC="cc" \
CFLAGS="-mkl -xMIC=AVX512 -std=c99 -I/project/projectdirs/mpccc/tkurth/NESAP/GRID/src/include" \
LDFLAGS="-mkl -lmemkind"
make -j12
cd ..
configure output:
[tkurth@cori08 src (develop)]$ cat config.log
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.
It was created by Grid configure 1.0, which was
generated by GNU Autoconf 2.63. Invocation command line was
$ ./configure
## --------- ##
## Platform. ##
## --------- ##
hostname = cori12
uname -m = x86_64
uname -r = 3.12.51-52.39-default
uname -s = Linux
uname -v = #1 SMP Fri Jan 15 20:03:12 UTC 2016 (16f5bac)
/usr/bin/uname -p = x86_64
/bin/uname -X = unknown
/bin/arch = x86_64
/usr/bin/arch -k = unknown
/usr/convex/getsysinfo = unknown
/usr/bin/hostinfo = unknown
/bin/machine = unknown
/usr/bin/oslevel = unknown
/bin/universe = unknown
PATH: /usr/common/software/darshan/3.0.1/bin
PATH: /usr/common/software/bin
PATH: /usr/common/mss/bin
PATH: /usr/common/nsg/bin
PATH: /global/homes/t/tkurth/MODULES/spack/bin
PATH: /usr/common/software/intel/compilers_and_libraries_2017.0.064/linux/bin/intel64
PATH: /opt/cray/pe/mpt/7.4.0/gni/bin
PATH: /opt/cray/rca/1.0.0-6.21/bin
PATH: /opt/cray/alps/6.1.3-17.12/sbin
PATH: /opt/cray/job/1.5.5-3.58/bin
PATH: /opt/cray/pe/pmi/5.0.10-1.0000.11050.0.0.ari/bin
PATH: /opt/cray/pe/craype/2.5.5/bin
PATH: /opt/cray/pe/modules/3.2.10.4/bin
PATH: /usr/syscom/nsg/sbin
PATH: /usr/syscom/nsg/bin
PATH: /opt/modules/3.2.6.7/bin
PATH: /global/homes/t/tkurth/bin
PATH: /usr/local/bin
PATH: /usr/bin
PATH: /bin
PATH: /usr/bin/X11
PATH: /usr/games
PATH: /usr/lib/mit/bin
PATH: /usr/lib/mit/sbin
PATH: /opt/cray/pe/bin
PATH: /global/homes/t/tkurth/src/xmldiff/bin
## ----------- ##
## Core tests. ##
## ----------- ##
## ---------------- ##
## Cache variables. ##
## ---------------- ##
ac_cv_env_CCC_set=
ac_cv_env_CCC_value=
ac_cv_env_CC_set=
ac_cv_env_CC_value=
ac_cv_env_CFLAGS_set=
ac_cv_env_CFLAGS_value=
ac_cv_env_CPPFLAGS_set=
ac_cv_env_CPPFLAGS_value=
ac_cv_env_CXXCPP_set=
ac_cv_env_CXXCPP_value=
ac_cv_env_CXXFLAGS_set=
ac_cv_env_CXXFLAGS_value=
ac_cv_env_CXX_set=
ac_cv_env_CXX_value=
ac_cv_env_LDFLAGS_set=
ac_cv_env_LDFLAGS_value=
ac_cv_env_LIBS_set=
ac_cv_env_LIBS_value=
ac_cv_env_build_alias_set=
ac_cv_env_build_alias_value=
ac_cv_env_host_alias_set=
ac_cv_env_host_alias_value=
ac_cv_env_target_alias_set=
ac_cv_env_target_alias_value=
## ----------------- ##
## Output variables. ##
## ----------------- ##
ACLOCAL=''
AMDEPBACKSLASH=''
AMDEP_FALSE=''
AMDEP_TRUE=''
AMTAR=''
AUTOCONF=''
AUTOHEADER=''
AUTOMAKE=''
AWK=''
BUILD_CHROMA_REGRESSION_FALSE=''
BUILD_CHROMA_REGRESSION_TRUE=''
BUILD_COMMS_MPI_FALSE=''
BUILD_COMMS_MPI_TRUE=''
BUILD_COMMS_NONE_FALSE=''
BUILD_COMMS_NONE_TRUE=''
BUILD_COMMS_SHMEM_FALSE=''
BUILD_COMMS_SHMEM_TRUE=''
BUILD_ZMM_FALSE=''
BUILD_ZMM_TRUE=''
CC=''
CCDEPMODE=''
CFLAGS=''
CPPFLAGS=''
CXX=''
CXXCPP=''
CXXDEPMODE=''
CXXFLAGS=''
CYGPATH_W=''
DEFS=''
DEPDIR=''
ECHO_C=''
ECHO_N='-n'
ECHO_T=''
EGREP=''
EXEEXT=''
GREP=''
INSTALL_DATA=''
INSTALL_PROGRAM=''
INSTALL_SCRIPT=''
INSTALL_STRIP_PROGRAM=''
LDFLAGS=''
LIBOBJS=''
LIBS=''
LTLIBOBJS=''
MAKEINFO=''
MKDIR_P=''
OBJEXT=''
OPENMP_CXXFLAGS=''
PACKAGE=''
PACKAGE_BUGREPORT='[email protected]'
PACKAGE_NAME='Grid'
PACKAGE_STRING='Grid 1.0'
PACKAGE_TARNAME='grid'
PACKAGE_VERSION='1.0'
PATH_SEPARATOR=':'
RANLIB=''
SET_MAKE=''
SHELL='/bin/sh'
SIMD_FLAGS=''
STRIP=''
USE_LAPACK_FALSE=''
USE_LAPACK_LIB_FALSE=''
USE_LAPACK_LIB_TRUE=''
USE_LAPACK_TRUE=''
VERSION=''
ac_ct_CC=''
ac_ct_CXX=''
am__fastdepCC_FALSE=''
am__fastdepCC_TRUE=''
am__fastdepCXX_FALSE=''
am__fastdepCXX_TRUE=''
am__include=''
am__isrc=''
am__leading_dot=''
am__quote=''
am__tar=''
am__untar=''
bindir='${exec_prefix}/bin'
build=''
build_alias=''
build_cpu=''
build_os=''
build_vendor=''
datadir='${datarootdir}'
datarootdir='${prefix}/share'
docdir='${datarootdir}/doc/${PACKAGE_TARNAME}'
dvidir='${docdir}'
exec_prefix='NONE'
host=''
host_alias=''
host_cpu=''
host_os=''
host_vendor=''
htmldir='${docdir}'
includedir='${prefix}/include'
infodir='${datarootdir}/info'
install_sh=''
libdir='${exec_prefix}/lib'
libexecdir='${exec_prefix}/libexec'
localedir='${datarootdir}/locale'
localstatedir='${prefix}/var'
mandir='${datarootdir}/man'
mkdir_p=''
oldincludedir='/usr/include'
pdfdir='${docdir}'
prefix='NONE'
program_transform_name='s,x,x,'
psdir='${docdir}'
sbindir='${exec_prefix}/sbin'
sharedstatedir='${prefix}/com'
sysconfdir='${prefix}/etc'
target=''
target_alias=''
target_cpu=''
target_os=''
target_vendor=''
## ----------- ##
## confdefs.h. ##
## ----------- ##
#define PACKAGE_NAME "Grid"
configure: caught signal 2
configure: exit 1
environment:
`[tkurth@cori08 src (develop)]$ module list
Currently Loaded Modulefiles:
1) modules/3.2.6.7 7) udreg/2.3.2-4.6 13) job/1.5.5-3.58 19) craype-mic-knl
2) nsg/1.2.0 8) ugni/6.0.12-2.1 14) dvs/2.5_0.9.0-2.155 20) cray-shmem/7.4.0
3) modules/3.2.10.4 9) pmi/5.0.10-1.0000.11050.0.0.ari 15) alps/6.1.3-17.12 21) cray-mpich/7.4.0
4) craype-network-aries 10) dmapp/7.1.0-12.37 16) rca/1.0.0-6.21 22) intel/17.0.0.098
5) craype/2.5.5 11) gni-headers/5.0.7-3.1 17) atp/2.0.2 23) altd/2.0
6) cray-libsci/16.06.1 12) xpmem/0.1-4.5 18) PrgEnv-intel/6.0.3 24) cray-memkind`
from grid.
That is the makefile output (relevant part):
make[1]: Entering directory '/global/project/projectdirs/mpccc/tkurth/NESAP/GRID/build/lib'
make all-am
make[2]: Entering directory '/global/project/projectdirs/mpccc/tkurth/NESAP/GRID/build/lib'
depbase=`echo Init.o | sed 's|[^/]*$|.deps/&|;s|.o$||'`;\
g++ -DHAVE_CONFIG_H -I. -I../../src/lib -I/global/project/projectdirs/mpccc/tkurth/NESAP/GRID/src/include -mavx512f -mavx512pf -mavx512er -mavx512cd -fopenmp -O3 -std=c++11 -MT Init.o -MD -MP -MF $depbase.Tpo -c -o Init.o ../../src/lib/Init.cc &&\
mv -f $depbase.Tpo $depbase.Po
g++: error: unrecognized command line option '-mavx512f'
g++: error: unrecognized command line option '-mavx512pf'
g++: error: unrecognized command line option '-mavx512er'
g++: error: unrecognized command line option '-mavx512cd'
Makefile:1059: recipe for target 'Init.o' failed
make[2]: *** [Init.o] Error 1
make[2]: Leaving directory '/global/project/projectdirs/mpccc/tkurth/NESAP/GRID/build/lib'
Makefile:784: recipe for target 'all' failed
make[1]: *** [all] Error 2
make[1]: Leaving directory '/global/project/projectdirs/mpccc/tkurth/NESAP/GRID/build/lib'
Makefile:369: recipe for target 'all-recursive' failed
make: *** [all-recursive] Error 1
It tries to use g++/gcc, not CC/cc.
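One quick way to confirm which compiler configure actually selected, before starting a long build, is to read it back from the generated Makefile. The sketch below assumes the usual Automake layout, in which the Makefile contains a line of the form `CXX = <compiler>`; the helper name `makefile_cxx` is made up for this example and is not part of Grid.

```shell
# Print the C++ compiler recorded in an Automake-generated Makefile.
# Assumption: the Makefile has a line of the form "CXX = <compiler>".
# makefile_cxx is a hypothetical helper name.
makefile_cxx() {
    # The trailing space in the pattern keeps CXXFLAGS and CXXCPP from matching.
    sed -n 's/^CXX = //p' "$1" | head -n 1
}

# Usage (from the build directory used above):
#   makefile_cxx lib/Makefile
# If the CXX override took effect this should print CC rather than g++.
```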
from grid.
- The config.log would be useful
- also the final summary of the output of the configure step.
- What version is your gcc, and why are you not using the Intel compiler?
from grid.
I have added the config log.
I want to use the Cray compiler wrappers CC/cc with PrgEnv-intel loaded, so that CC points to icpc and cc points to icc, but the lib build picks the GNU compilers instead. I think this is a bug. It should use the compiler selected by the user.
from grid.
We need the config.log written by your configure run; the one posted says that the command that was run was just
./configure
I do not think this is the one you ran.
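A way to check this before posting a config.log is to print the invocation it recorded. This is a minimal sketch, assuming the standard Autoconf header in which the command appears on a line starting with `$ ` (as in the log posted above); `configure_invocation` is a hypothetical helper name.

```shell
# Print the command line recorded at the top of an Autoconf config.log.
# Assumption: the invocation is the first line of the form "  $ <command>".
# configure_invocation is a hypothetical helper name.
configure_invocation() {
    sed -n 's/^[[:space:]]*\$ //p' "$1" | head -n 1
}

# Usage:
#   configure_invocation build/config.log
# A log from the build script should show ../src/configure with its options,
# not a bare ./configure.
```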
from grid.
Oh, maybe my build script is buggy
On 19.10.2016 at 13:09, Guido Cossu [email protected] wrote:
We need the config log after your configure, the one posted says that the run command was just
./configure
I do not think this is the one you ran.
from grid.
Here it is (updated, now memkind loaded)
from grid.
This still looks like a problem in the environment:
/usr/bin/ld: cannot find -lmemkind
configure correctly recognized icpc, but some libraries are missing, or perhaps are not in the library path.
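To narrow down a "cannot find -l..." error, it can help to list which directories actually contain the library. The sketch below scans a colon-separated directory list for `lib<name>.so` or `lib<name>.a`; `find_lib` is a hypothetical helper, and which directories the Cray wrappers really search at link time is not something this check knows, so pass in whatever list applies on your system.

```shell
# Print each directory in a colon-separated list that contains
# lib<name>.so or lib<name>.a. find_lib is a hypothetical helper name.
find_lib() {
    local name="$1" paths="$2" dir
    local IFS=:
    for dir in $paths; do
        # Skip empty entries produced by leading or doubled colons.
        [ -n "$dir" ] || continue
        if [ -e "$dir/lib$name.so" ] || [ -e "$dir/lib$name.a" ]; then
            printf '%s\n' "$dir"
        fi
    done
}

# Usage (the variable chosen here is only an example):
#   find_lib memkind "$LIBRARY_PATH"
# No output means none of the listed directories holds libmemkind.
```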
from grid.
I did that before but now I switched to intel 2016 and it seems to work. Previously I used a 2017 beta.
from grid.
Ok, we can close the issue, seems to work.
Before we do that: the bin folder only contains the Benchmarks, is that right?
Additionally, shall I run some benchmarks on Cori, Peter? If yes, which ones are you most interested in?
from grid.
One last thing: I ran the comms benchmark test with:
srun -n 64 -c 68 --cpu_bind=cores numactl -p 1 ./Benchmark_comms --threads 64 --mpi 2.2.4.4 --grid 128.128.128.128
and it ran the test, but it crashed when trying to compute the summary:
Grid : Message : 24906 ms : 30 4 10368000 1198.7 2397.41
Grid : Message : 26629 ms : 30 8 20736000 1199.82 2399.64
Grid : Message : 30097 ms : 30 16 41472000 1208.64 2417.28
Grid : Message : 30599 ms : 32 1 3145728 710.73 1421.46
Grid : Message : 31088 ms : 32 2 6291456 1173.55 2347.1
Grid : Message : 32166 ms : 32 4 12582912 1159.37 2318.73
Grid : Message : 34211 ms : 32 8 25165824 1231.09 2462.19
Grid : Message : 38414 ms : 32 16 50331648 1210.31 2420.63
Grid : Message : 38500 ms : ====================================================================================================
Grid : Message : 38500 ms : = Benchmarking sequential halo exchange in 4 dimensions
Grid : Message : 38500 ms : ====================================================================================================
Grid : Message : 38500 ms : L Ls bytes MB/s uni MB/s bidi
srun: error: nid12126: task 23: Floating point exception
Are the parameters chosen wrongly?
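A quick sanity check on the command above, sketched under two assumptions: that the product of the --mpi layout should equal the number of ranks passed to srun -n, and that each --grid extent should be divisible by the matching --mpi extent. For the values shown, 2.2.4.4 multiplies to 64, matching -n 64, and 128 is divisible by both 2 and 4, so this kind of mismatch would not explain the floating point exception. The helper names are hypothetical.

```shell
# Multiply the entries of a dot-separated layout such as 2.2.4.4.
layout_product() {
    local IFS=. n prod=1
    for n in $1; do
        prod=$((prod * n))
    done
    printf '%s\n' "$prod"
}

# Succeed only if both layouts have the same number of dimensions and every
# --grid extent is divisible by the corresponding --mpi extent.
grid_divisible() {
    local grid="$1" mpi="$2" g m
    while [ -n "$grid" ] && [ -n "$mpi" ]; do
        g=${grid%%.*}
        m=${mpi%%.*}
        [ "$m" -gt 0 ] || return 1
        [ $((g % m)) -eq 0 ] || return 1
        # Drop the leading entry; become empty once no dot remains.
        case $grid in *.*) grid=${grid#*.} ;; *) grid= ;; esac
        case $mpi in *.*) mpi=${mpi#*.} ;; *) mpi= ;; esac
    done
    [ -z "$grid" ] && [ -z "$mpi" ]
}

# Usage:
#   [ "$(layout_product 2.2.4.4)" -eq 64 ] && grid_divisible 128.128.128.128 2.2.4.4
```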
On 19.10.2016 at 13:35, Guido Cossu [email protected] wrote:
This looks still a problem in the environment
/usr/bin/ld: cannot find -lmemkind
configure correctly recognized icpc but some libs are missing, maybe not in the libraries path.
from grid.
Can I ask you a couple of things?
Since the compilation issue seems to be solved, can you summarize the solution in a few lines, including your environment and the configure command?
It would be a good reference for other people in the same situation.
Could you open a new thread for the last request?
from grid.
g++ is not yet known good on AVX512 intrinsics for us. Can you try current develop with ICPC?
Peter
On 19 Oct 2016, at 16:47, Thorsten Kurth [email protected] wrote:
That is the makefile output (relevant part)
make[1]: Entering directory '/global/project/projectdirs/mpccc/tkurth/NESAP/GRID/build/lib'
make all-am
make[2]: Entering directory '/global/project/projectdirs/mpccc/tkurth/NESAP/GRID/build/lib'
depbase=`echo Init.o | sed 's|[^/]*$|.deps/&|;s|.o$||'`;\
g++ -DHAVE_CONFIG_H -I. -I../../src/lib -I/global/project/projectdirs/mpccc/tkurth/NESAP/GRID/src/include -mavx512f -mavx512pf -mavx512er -mavx512cd -fopenmp -O3 -std=c++11 -MT Init.o -MD -MP -MF $depbase.Tpo -c -o Init.o ../../src/lib/Init.cc &&\
mv -f $depbase.Tpo $depbase.Po
g++: error: unrecognized command line option '-mavx512f'
g++: error: unrecognized command line option '-mavx512pf'
g++: error: unrecognized command line option '-mavx512er'
g++: error: unrecognized command line option '-mavx512cd'
Makefile:1059: recipe for target 'Init.o' failed
make[2]: *** [Init.o] Error 1
make[2]: Leaving directory '/global/project/projectdirs/mpccc/tkurth/NESAP/GRID/build/lib'
Makefile:784: recipe for target 'all' failed
make[1]: *** [all] Error 2
make[1]: Leaving directory '/global/project/projectdirs/mpccc/tkurth/NESAP/GRID/build/lib'
Makefile:369: recipe for target 'all-recursive' failed
make: *** [all-recursive] Error 1
it tries using g++/gcc, not CC/cc
from grid.