Comments (14)
Hi Oliver, can you please build with other instruction sets and see which ones are succeeding? I would try SSE and the generic code.
from grid.
Hi Antonin,
I've followed your suggestion and tried configuring / compiling Grid with different simd options on this machine. While GEN works, SSE4 does not; the binary only reports a message like above. Going back to the AVX option, I get around the problem by only adding CXXFLAGS="-mavx" i.e.
CXX=icpc ../Grid/configure --enable-simd=AVX CXXFLAGS="-mavx" --enable-comms=mpi-auto
Except for explicit '-mavx', both config.log files are equal. I don't see what the explicit '-mavx' is forcing but it seems to be necessary for icpc v16 on an AMD chip. -- It's not on an intel chip.
Thanks,
Oliver
from grid.
Hi Oliver, can I have the config log and output for the working case, please?
from grid.
Sorry, didn't notice that github refused my files because of the unknown suffix.
Oliver
config.log-AVX.txt
config.out-AVX.txt
from grid.
Ok I got it: according to the Intel compiler documentation, the -x options are only supported on Intel CPUs. The order of flags also matters, so when you put -mavx at the end it overrides -xavx. I'll find a way to fine-tune the configure scripts and have a target for AMD. Could you please send me the exact model reference of this AMD CPU? What machine is that?
EDIT: Ok the model is in your original post so don't worry about it.
from grid.
The CPU you are using supports FMA instructions, so you should definitely use them. I have committed a little fix. Please update develop on your machine, configure using --with-simd=AVXFMA4, and confirm to me that it worked.
On a different note, I am not entirely sure how good the Intel compiler is at generating the FMA4 instructions specific to AMD platforms. You might want to compare performance with clang and GCC (still using --with-simd=AVXFMA4).
from grid.
Thank you, Antonin. Intel is not happy when using --enable-simd=AVXFMA4; the code does not compile, first printing
icpc: command line warning #10353: option '-mfma' ignored, suggest using '-march=core-avx2'
and then errors
In file included from /project/fourpluseight/GRID/Grid/include/Grid/simd/Grid_vector_types.h(48),
from /project/fourpluseight/GRID/Grid/include/Grid/Simd.h(175),
from ../../Grid/lib/Grid.h(68),
from ../../Grid/lib/PerfCount.cc(29):
/project/fourpluseight/GRID/Grid/include/Grid/simd/Grid_avx.h(234): error: identifier "_mm256_maddsub_ps" is undefined
return _mm256_maddsub_ps( a_real, b, a_imag ); // Ar Br , Ar Bi +- Ai Bi = ArBr-AiBi , ArBi+AiBr
Full output is attached.
config.log.AVFFMA4.txt
config.out.AVFFMA4.txt
make.out.AVFFMA4.txt
I also tried compiling --enable-simd=AVXFMA4 using gcc 4.9.1 but likewise couldn't build Grid.
config.log.gcc.AVFFMA4.txt
config.out.gcc.AVFFMA4.txt
make.out.gcc.AVFFMA4.txt
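For readers following the error on Grid_avx.h(234): the comment on that line describes a complex multiply finished by a multiply-add/subtract over interleaved (real, imaginary) lanes. The block below is only a scalar sketch of that lane pattern, not Grid's implementation. The names maddsub and complex_mul are hypothetical, and it assumes the third operand already holds (Ai*Bi, Ai*Br) for each complex pair. Unlike a hardware FMA, the scalar version rounds after the multiply.

```cpp
#include <array>
#include <cstddef>

// Scalar model of a multiply-add/subtract over interleaved lanes
// (hypothetical helper, not a Grid or compiler function):
// even lanes compute a*b - c, odd lanes compute a*b + c.
template <std::size_t N>
std::array<float, N> maddsub(const std::array<float, N>& a,
                             const std::array<float, N>& b,
                             const std::array<float, N>& c) {
    std::array<float, N> r{};
    for (std::size_t i = 0; i < N; ++i) {
        r[i] = (i % 2 == 0) ? a[i] * b[i] - c[i]
                            : a[i] * b[i] + c[i];
    }
    return r;
}

// Complex multiply of interleaved (re, im) pairs built on maddsub.
// Assumed operand layout: a_real = (Ar, Ar), b = (Br, Bi),
// t = (Ai*Bi, Ai*Br), giving (ArBr - AiBi, ArBi + AiBr).
template <std::size_t N>
std::array<float, N> complex_mul(const std::array<float, N>& a,
                                 const std::array<float, N>& b) {
    static_assert(N % 2 == 0, "lanes must hold whole complex numbers");
    std::array<float, N> a_real{};
    std::array<float, N> t{};
    for (std::size_t i = 0; i < N; i += 2) {
        a_real[i] = a[i];             // Ar in the real lane
        a_real[i + 1] = a[i];         // Ar in the imaginary lane
        t[i] = a[i + 1] * b[i + 1];   // Ai * Bi
        t[i + 1] = a[i + 1] * b[i];   // Ai * Br
    }
    return maddsub(a_real, b, t);
}
```

Calling complex_mul on std::array<float, 2> inputs multiplies one complex number; wider arrays model more SIMD lanes.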
from grid.
I'll add here that the CPU supports FMA but not FMA4, just FMA3.
-mfma should be enough using GCC but it seems that it is still looking for fma4 functions even with the fix.
The configure.ac is using fma4 for the generic clang/GNU compilation.
GCC supports FMA4 with -mfma4 since version 4.5.0 and FMA3 with -mfma since version 4.7.0.
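One way to see which instruction sets a given set of flags actually enables is to inspect the compiler's predefined macros. The sketch below assumes GCC or Clang, which define __AVX__, __AVX2__, __FMA__ (FMA3) and __FMA4__ when the matching -m flags are active. The function name simd_feature_report is hypothetical and not part of Grid.

```cpp
#include <string>

// Hypothetical helper: reports which SIMD feature macros the compiler
// predefined for the flags this translation unit was built with.
std::string simd_feature_report() {
    std::string r;
#if defined(__AVX__)
    r += "AVX=1 ";
#else
    r += "AVX=0 ";
#endif
#if defined(__AVX2__)
    r += "AVX2=1 ";
#else
    r += "AVX2=0 ";
#endif
#if defined(__FMA__)    // FMA3, enabled by -mfma
    r += "FMA=1 ";
#else
    r += "FMA=0 ";
#endif
#if defined(__FMA4__)   // FMA4, enabled by -mfma4
    r += "FMA4=1";
#else
    r += "FMA4=0";
#endif
    return r;
}
```

Printing the returned string from a small test program built with the same CXXFLAGS as Grid would show whether -mfma or -mfma4 took effect for that compiler.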
from grid.
Oliver, would you try the last commit in develop?
I added another SIMD type: AVXFMA that should work for gnu compilation.
G
from grid.
Hi Guido, where is that coming from? The Opteron 6320 does support FMA4 (Piledriver arch).
from grid.
You are right.
There is a suspicious line in the Grid_avx.h
#if defined (AVX2) || defined (AVXFMA4)
#define _mm256_alignr_epi32(ret,a,b,n) ret=(__m256) _mm256_alignr_epi8((__m256i)a,(__m256i)b,(n*4)%16)
#define _mm256_alignr_epi64(ret,a,b,n) ret=(__m256d) _mm256_alignr_epi8((__m256i)a,(__m256i)b,(n*8)%16)
#endif
The _mm256_alignr_epi8 intrinsic is not an FMA4 function, only AVX2. The compiler then complains about a target mismatch: the code asks for an AVX2 function but is built with only the -mavx flag.
Why is the || defined (AVXFMA4) there?
I did not delete it in the commit with the fix because I would first like to know whether there was a compelling reason.
For full support of FMA4 but not AVX2 we should carefully separate the calls inside the Grid_avx.h file.
G
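A minimal sketch of what separating the two cases could look like, assuming the configure-time macros AVX2 and AVXFMA4 quoted above. The constants kUseAvx2Alignr and kUseFma4 and the shift helpers are hypothetical names for illustration, not code from Grid_avx.h. The helpers reproduce only the byte-shift arithmetic from the two macros, so the sketch compiles on any target.

```cpp
// Byte shifts that the quoted macros pass to _mm256_alignr_epi8
// (hypothetical helper names; the arithmetic is taken from the macros).
constexpr int alignr_shift_epi32(int n) { return (n * 4) % 16; }  // 32-bit elements
constexpr int alignr_shift_epi64(int n) { return (n * 8) % 16; }  // 64-bit elements

// AVX2-only code paths are gated on AVX2 alone ...
#if defined(AVX2)
constexpr bool kUseAvx2Alignr = true;
#else
constexpr bool kUseAvx2Alignr = false;
#endif

// ... while FMA4 code paths get their own independent guard, so an
// AVX + FMA4 target never pulls in AVX2 intrinsics.
#if defined(AVXFMA4)
constexpr bool kUseFma4 = true;
#else
constexpr bool kUseFma4 = false;
#endif
```

With guards split this way, a build that defines only AVXFMA4 would leave kUseAvx2Alignr false and would need a non-AVX2 fallback for the rotate.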
from grid.
Thank you, Guido. gcc compiles fine using --enable-simd=AVXFMA and also runs without issues.
Knowing icpc is not favoured on AMD chips, I gave it a try. At compile time I get the warnings
icpc: command line warning #10353: option '-mfma' ignored, suggest using '-march=core-avx2'
and found on 'man icc' that intel is by default using '-fma'. Anyhow, I get a binary which starts but it crashes after reporting nan:
Grid : Message : 3 ms : Grid is setup to use 16 threads
Grid : Message : 239 ms : MobiusFermion (b=1.5,c=0.5) with Ls= 12 Tanh approx
Grid : Message : 303 ms : Filling the smeared set
Grid : Message : 303 ms : [HMC parameter] Trajectories : 1
Grid : Message : 303 ms : [HMC parameter] Start trajectory : 0
Grid : Message : 304 ms : [HMC parameter] Metropolis test (on/off): 1
Grid : Message : 304 ms : [HMC parameter] Thermalization trajs : 10
Grid : Message : 304 ms : -- # Trajectory = 0
Grid : Message : 1150 ms : Momentum action H_p = -nan
Test_BSM10f_sMDWF_SymanzikGauge: /project/fourpluseight/GRID/Grid/include/Grid/algorithms/iterative/ConjugateGradient.h:66: void Grid::ConjugateGradient::operator()(Grid::LinearOperatorBase &, const Field &, Field &) [with Field = Grid::LatticeGrid::iScalar<Grid::iVector<Grid::iVector<Grid::Grid_simd<std::complex<double, __m256d>, 3>, 4>>>]: Assertion `std::isnan(guess) == 0' failed.
Aborted
I guess __m256d again hints at AVX2? Maybe AVXFMA should not be permitted for Intel compilers, or should be downgraded to AVX by the configure script?
from grid.
The __m256 types are used by both AVX and AVX2. I should investigate this with an Intel compiler. By the way, the problem seems to be that wrong instructions are executed, because the NaN results from a computation (a simple sum...).
from grid.
@coppolachan: that's a good spot, let's remember to ask Peter tomorrow. We should be able to make AVXFMA4 work on that platform, and it looks like this is the problem.
@witzel: __m256d is not AVX2-specific; it is just an AVX 256-bit vector type. From the discussion today, it sounds clear to me that we are not going to provide support for the Intel compiler on AMD platforms. So as far as this issue is concerned, let us focus only on GCC.
from grid.