Comments (5)
be careful -- AVXFMA4 is the AMD opteron version for BlueWaters which has one set of multiply add instructions (four operand).
These are sadly incompatible with AVX2 FMA instructions (three operand).
from grid.
Ok thanks for the explanation I certainly did not understand that.
from grid.
There's some interesting history.
AMD and Intel published new instructions to try to converge on common ground.
AMD decided to implement Intel's
Intel decided to implement AMD's
The chance of converging was lost.
from grid.
https://en.wikipedia.org/wiki/FMA_instruction_set
The incompatibility between Intel's FMA3 and AMD's FMA4 is due to both companies changing plans without coordinating coding details with each other. AMD changed their plans from FMA3 to FMA4 while Intel changed their plans from FMA4 to FMA3 almost at the same time. The history can be summarized as follows:
August 2007: AMD announces the SSE5 instruction set, which includes 3-operand FMA instructions. A new coding scheme (DREX) is introduced for allowing instructions to have three operands.[6]
April 2008: Intel announces their AVX and FMA instruction sets, including 4-operand FMA instructions. The coding of these instructions uses the new VEX coding scheme which is more flexible than AMD's DREX scheme. (Section requires an actual source, Intel sources are not acceptable for debatable specifics.)[7]
December 2008: Intel changes the specification for their FMA instructions from 4-operand to 3-operand instructions. The VEX coding scheme is still used.[8]
May 2009: AMD changes the specification of their FMA instructions from the 3-operand DREX form to the 4-operand VEX form, compatible with the April 2008 Intel specification rather than the December 2008 Intel specification.[9]
October 2011: AMD Bulldozer processor supports FMA4.[10]
January 2012: AMD announces FMA3 support in future processors codenamed Trinity and Vishera; they are based on the Piledriver architecture.[11]
May 2012: AMD Piledriver processor supports both FMA3 and FMA4.[10]
June 2013: Intel Haswell processor supports FMA3.[12]
AMD explicitly revealed that Zen, its 3rd-generation x86-64 architecture in its first iteration (znver1 – Zen, version 1); would drop support for FMA4 in a patch to the GNU Binutils package.[13] There has been initial confusion regarding whether FMA4 was implemented or not due to errata in the initial patch that has since then been rectified.[14]
from grid.
Ok that's interesting!
from grid.
Related Issues (20)
- MPI2 romio321 library fails when reading >= 2GB per rank HOT 2
- Cannot compile the gparity and adjoint versions of the CompactWilsonCloverAction
- Compilation errors and warnings build targeting Nvidia GPUs HOT 2
- GPU Benchmark_ITT segfaults with MPI and ranks > 1 HOT 9
- Create a version of Benchmark_ITT including Clover instead of Wilson
- Grid fails to build for Nc != 3
- hipcc on Crusher: function bcopy undefined (compiler does not have openmp enabled?) HOT 1
- Certain operations involving SitePropagator::scalar_object won't compile with CUDA for Nc > 3
- make install doesn't install all headers due to duplicate Config.h and Version.h HOT 3
- Using ILDG checkpointer causes a crash during write HOT 2
- Develop is broken HOT 1
- ARM NEON is broken HOT 2
- Feature request: provenance tracking
- Add hint to shm error message
- Cuda error invalid device ordinal
- Recent commit causing Grid build to fail
- The configure options --enable-setdevice and --diable-setdevice have no effect
- Grid does not compile on Arm with CUDA HOT 9
- invalid configuration argument when running with 1 GPU
- FlightRecorder.cc breaks compilation for --enable-comms=none HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from grid.