
Comments (14)

aportelli commented on August 18, 2024

Hi Oliver, can you please build with other instruction sets and see which ones are succeeding? I would try SSE and the generic code.

from grid.

witzel commented on August 18, 2024

Hi Antonin,

I've followed your suggestion and tried configuring / compiling Grid with different simd options on this machine. While GEN works, SSE4 does not; the binary only reports a message like above. Going back to the AVX option, I get around the problem by only adding CXXFLAGS="-mavx" i.e.
CXX=icpc ../Grid/configure --enable-simd=AVX CXXFLAGS="-mavx" --enable-comms=mpi-auto

Except for the explicit '-mavx', both config.log files are identical. I don't see what the explicit '-mavx' is forcing, but it seems to be necessary for icpc v16 on an AMD chip; it is not needed on an Intel chip.

Thanks,
Oliver

aportelli commented on August 18, 2024

Hi Oliver, can I have the config.log and output for the working case, please?

witzel commented on August 18, 2024

Sorry, didn't notice that github refused my files because of the unknown suffix.
Oliver
config.log-AVX.txt
config.out-AVX.txt

aportelli commented on August 18, 2024

OK, I got it: according to the Intel compiler documentation, the -x options are only supported on Intel CPUs. The order of the flags also matters, so when you put -mavx at the end it overrides -xavx. I'll find a way to fine-tune the configure scripts and add a target for AMD. Could you please send me the exact model reference of this AMD CPU? What machine is that?

EDIT: Ok the model is in your original post so don't worry about it.

aportelli commented on August 18, 2024

The CPU you are using supports FMA instructions, so you should definitely use them. I have committed a small fix. Please update develop on your machine, configure using --with-simd=AVXFMA4, and confirm to me that it worked.

On a different note, I am not entirely sure how good the Intel compiler is at generating the FMA4 instructions specific to AMD platforms. You might want to compare performance with clang and GCC (still using --with-simd=AVXFMA4).

witzel commented on August 18, 2024

Thank you, Antonin. Intel is not happy when using --enable-simd=AVXFMA4; the code does not compile, first printing the warning
icpc: command line warning #10353: option '-mfma' ignored, suggest using '-march=core-avx2'
and then errors
In file included from /project/fourpluseight/GRID/Grid/include/Grid/simd/Grid_vector_types.h(48),
from /project/fourpluseight/GRID/Grid/include/Grid/Simd.h(175),
from ../../Grid/lib/Grid.h(68),
from ../../Grid/lib/PerfCount.cc(29):
/project/fourpluseight/GRID/Grid/include/Grid/simd/Grid_avx.h(234): error: identifier "_mm256_maddsub_ps" is undefined
return _mm256_maddsub_ps( a_real, b, a_imag ); // Ar Br , Ar Bi +- Ai Bi = ArBr-AiBi , ArBi+AiBr

Full output is attached.
config.log.AVFFMA4.txt
config.out.AVFFMA4.txt
make.out.AVFFMA4.txt

I also tried compiling --enable-simd=AVXFMA4 using gcc 4.9.1 but likewise couldn't build Grid.
config.log.gcc.AVFFMA4.txt
config.out.gcc.AVFFMA4.txt
make.out.gcc.AVFFMA4.txt

coppolachan commented on August 18, 2024

I'll add here that the CPU supports FMA but not FMA4, just FMA3.
-mfma should be enough with GCC, but it seems that the build is still looking for FMA4 functions even with the fix.
The configure.ac uses FMA4 for the generic Clang/GNU compilation.

GCC supports FMA4 with -mfma4 since version 4.5.0 and FMA3 with -mfma since version 4.7.0.

coppolachan commented on August 18, 2024

Oliver, would you try the last commit in develop?
I added another SIMD type: AVXFMA that should work for gnu compilation.
G

aportelli commented on August 18, 2024

Hi Guido, where is that coming from? The Opteron 6320 does support FMA4 (Piledriver arch).

coppolachan commented on August 18, 2024

You are right.

There is a suspicious line in the Grid_avx.h
#if defined (AVX2) || defined (AVXFMA4)
#define _mm256_alignr_epi32(ret,a,b,n) ret=(__m256) _mm256_alignr_epi8((__m256i)a,(__m256i)b,(n*4)%16)
#define _mm256_alignr_epi64(ret,a,b,n) ret=(__m256d) _mm256_alignr_epi8((__m256i)a,(__m256i)b,(n*8)%16)
#endif

_mm256_alignr_epi8 is not an FMA4 function; it is AVX2 only. The compiler is therefore complaining about a target mismatch: the code asks for an AVX2 function but is compiled with the -mavx flag.

Why is the || defined (AVXFMA4) there?
I did not delete it in the commit with the fix because I would first like to know whether there was a compelling reason for it.

For full support of FMA4 but not AVX2 we should carefully separate the calls inside the Grid_avx.h file.

G

witzel commented on August 18, 2024

Thank you, Guido. gcc compiles fine using --enable-simd=AVXFMA and also runs without issues.

Knowing icpc is not favoured on AMD chips, I gave it a try. At compile time I get the warnings
icpc: command line warning #10353: option '-mfma' ignored, suggest using '-march=core-avx2'
and found on 'man icc' that intel is by default using '-fma'. Anyhow, I get a binary which starts but it crashes after reporting nan:

Grid : Message : 3 ms : Grid is setup to use 16 threads
Grid : Message : 239 ms : MobiusFermion (b=1.5,c=0.5) with Ls= 12 Tanh approx
Grid : Message : 303 ms : Filling the smeared set
Grid : Message : 303 ms : [HMC parameter] Trajectories : 1
Grid : Message : 303 ms : [HMC parameter] Start trajectory : 0
Grid : Message : 304 ms : [HMC parameter] Metropolis test (on/off): 1
Grid : Message : 304 ms : [HMC parameter] Thermalization trajs : 10
Grid : Message : 304 ms : -- # Trajectory = 0
Grid : Message : 1150 ms : Momentum action H_p = -nan
Test_BSM10f_sMDWF_SymanzikGauge: /project/fourpluseight/GRID/Grid/include/Grid/algorithms/iterative/ConjugateGradient.h:66: void Grid::ConjugateGradient::operator()(Grid::LinearOperatorBase &, const Field &, Field &) [with Field = Grid::LatticeGrid::iScalar<Grid::iVector<Grid::iVector<Grid::Grid_simd<std::complex<double, __m256d>, 3>, 4>>>]: Assertion `std::isnan(guess) == 0' failed.
Aborted

I guess __m256d again hints at AVX2? Maybe AVXFMA should not be permitted for Intel compilers, or should be downgraded to AVX by the configure script?

coppolachan commented on August 18, 2024

The __m256 types are used for both AVX and AVX2. I should investigate this with an Intel compiler. By the way, the problem seems to be that wrong instructions are executed, because the NaN results from a computation (a simple sum...).

aportelli commented on August 18, 2024

@coppolachan: that's a good spot, let's remember to ask Peter tomorrow. We should be able to make AVXFMA4 work on that platform, and it looks like this is the problem.
@witzel: __m256d is not AVX2; it is just an AVX 256-bit vector type. From the discussion today, it seems clear to me that we are not going to support the Intel compiler on AMD platforms. So as far as this issue is concerned, let us focus only on GCC.
