Comments (12)
Hi Filippo, thanks for your feedback.
Before running cmake, can you try setting the compilers, e.g. with:
export CC=`which cc`
export CXX=`which CC`
or explicitly e.g. with:
export CC=gcc-9
export CXX=g++-9
or whichever compiler you use. Then, CMake should be able to find the proper MPI as well.
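A minimal sketch of the suggestion above. CMake stores the compiler it picked in CMakeCache.txt on the first configure run, so changing CC/CXX afterwards has no effect on an existing build directory. The helper name configure_fresh and the directory name build-fresh are hypothetical, chosen only for illustration.

```shell
# Use the compilers already chosen in the environment, otherwise fall back
# to whatever `cc` and `c++` resolve to in PATH.
export CC="${CC:-$(command -v cc || true)}"
export CXX="${CXX:-$(command -v c++ || true)}"
echo "CC=${CC}"
echo "CXX=${CXX}"

# Hypothetical helper: configure into a directory that has no cache yet, so a
# compiler cached by an earlier run (in CMakeCache.txt) cannot override CC/CXX.
configure_fresh() {
    build_dir="${1:-build-fresh}"
    if [ -e "${build_dir}/CMakeCache.txt" ]; then
        echo "error: ${build_dir} already has a CMakeCache.txt" >&2
        return 1
    fi
    cmake -S . -B "${build_dir}"
}
```

Calling configure_fresh from the COSMA source directory would then run the configure step with the exported compilers.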
from cosma.
Thank you for your quick answer.
Unfortunately, this does not work. I'm working with Intel compilers, so I exported
export CC=`which icc`
export CXX=`which icpc`
export FC=`which ifort`
I also tried exporting the Intel MPI compilers (export MPI_<lang>_COMPILER=...) as well. Finally, I tried using the GCC compilers with OpenMPI. With every combination, I get the following error:
-- Setting build type to 'Release' as none was specified.
-- Selected BLAS backend for COSMA: MKL
-- Selected ScaLAPACK backend for COSMA: MKL
-- The CXX compiler identification is GNU 10.1.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /swstat/gcc/10.1.0/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Could NOT find MPI_CXX (missing: MPI_CXX_WORKS)
CMake Error at /home/sw/cmake/3.19.1/share/cmake-3.19/Modules/FindPackageHandleStandardArgs.cmake:218 (message):
Could NOT find MPI (missing: MPI_CXX_FOUND)
Reason given by package: MPI component 'C' was requested, but language C is not enabled. MPI component 'Fortran' was requested, but language Fortran is not enabled.
Call Stack (most recent call first):
/home/sw/cmake/3.19.1/share/cmake-3.19/Modules/FindPackageHandleStandardArgs.cmake:577 (_FPHSA_FAILURE_MESSAGE)
/home/sw/cmake/3.19.1/share/cmake-3.19/Modules/FindMPI.cmake:1721 (find_package_handle_standard_args)
CMakeLists.txt:76 (find_package)
-- Configuring incomplete, errors occurred!
from cosma.
This is a good insight. It seems to be related to a recent change in CMake's FindMPI module. Which version of CMake are you using?
I just made some changes to the CMakeLists.txt in a new branch, cmake. Can you please try building COSMA from this new branch?
from cosma.
Ok, thank you very much. This got one step further, but there are still some issues, and I am not able to compile. I now get the following error:
-- Setting build type to 'Release' as none was specified.
-- Selected BLAS backend for COSMA: MKL
-- Selected ScaLAPACK backend for COSMA: MKL
-- The CXX compiler identification is GNU 10.1.0
-- The C compiler identification is GNU 10.1.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /swstat/gcc/10.1.0/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /swstat/gcc/10.1.0/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Found MPI_C: /home/varrigoni/intel/oneapi/mpi/2021.1.1/lib/release/libmpi.so (found version "3.1")
-- Could NOT find MPI_CXX (missing: MPI_CXX_WORKS)
CMake Error at /home/sw/cmake/3.19.1/share/cmake-3.19/Modules/FindPackageHandleStandardArgs.cmake:218 (message):
Could NOT find MPI (missing: MPI_CXX_FOUND) (found version "3.1")
Reason given by package: MPI component 'Fortran' was requested, but language Fortran is not enabled.
Call Stack (most recent call first):
/home/sw/cmake/3.19.1/share/cmake-3.19/Modules/FindPackageHandleStandardArgs.cmake:577 (_FPHSA_FAILURE_MESSAGE)
/home/sw/cmake/3.19.1/share/cmake-3.19/Modules/FindMPI.cmake:1721 (find_package_handle_standard_args)
CMakeLists.txt:76 (find_package)
Again, the error arises whether I set FC to gfortran or to ifort (the GCC and the Intel compiler, respectively), and it does not depend on the MPI distribution I load (this happens with both Intel MPI and OpenMPI).
I'm using CMake version 3.19.1, which is the only one I have available on this machine.
Thank you very much once again for your support!
from cosma.
Thanks a lot for the feedback. I just pushed another commit to the cmake branch.
If this doesn't work, I would have to ask @rasolca or @teonnik to help out. If it doesn't get resolved by then, I will have a better look at it next week once I am back from holidays.
from cosma.
Good evening Marko, and thank you for your kind support.
Unfortunately, I have to say that this still does not work. It went back to the original error (hence, both the C and Fortran languages are now disabled).
I don't know if this could help you, but I also tried a small modification to your CMakeLists.txt. Just below the first line, I added the following code:
enable_language(C)
enable_language(Fortran)
and this seemed to solve part of the problem. However, if I do this, CMake is still not able to find MPI_CXX. In fact, it now gives the following error:
-- The C compiler identification is unknown
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /home/varrigoni/intel/oneapi/compiler/2021.1.1/linux/bin/intel64/icc - skipped
-- The Fortran compiler identification is Intel 20.2.1.20201112
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - done
-- Check for working Fortran compiler: /home/varrigoni/intel/oneapi/compiler/2021.1.1/linux/bin/intel64/ifort - skipped
-- Checking whether /home/varrigoni/intel/oneapi/compiler/2021.1.1/linux/bin/intel64/ifort supports Fortran 90
-- Checking whether /home/varrigoni/intel/oneapi/compiler/2021.1.1/linux/bin/intel64/ifort supports Fortran 90 - yes
-- Setting build type to 'Release' as none was specified.
-- Selected BLAS backend for COSMA: MKL
-- Selected ScaLAPACK backend for COSMA: MKL
-- The CXX compiler identification is unknown
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /home/varrigoni/intel/oneapi/compiler/2021.1.1/linux/bin/intel64/icpc - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Could NOT find MPI_CXX (missing: MPI_CXX_WORKS)
CMake Error at /home/sw/cmake/3.19.1/share/cmake-3.19/Modules/FindPackageHandleStandardArgs.cmake:218 (message):
Could NOT find MPI (missing: MPI_CXX_FOUND CXX)
Call Stack (most recent call first):
/home/sw/cmake/3.19.1/share/cmake-3.19/Modules/FindPackageHandleStandardArgs.cmake:577 (_FPHSA_FAILURE_MESSAGE)
/home/sw/cmake/3.19.1/share/cmake-3.19/Modules/FindMPI.cmake:1721 (find_package_handle_standard_args)
CMakeLists.txt:81 (find_package)
Once again, thank you for your help, I'll wait for other updates.
from cosma.
I can reproduce this problem on my machine with CMake 3.19.1.
However, CMake 3.19.2 works as expected, so I suspect a bug in the FindMPI module shipped with 3.19.1.
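A small sketch for checking which CMake version is on the PATH against 3.19.2, the version reported to work here. The helper name version_ge is hypothetical, and the comparison relies on the version sort (`sort -V`) of GNU coreutils. Whether 3.19.1 is really the culprit is not settled in this thread, so the message is only a hint.

```shell
# Return success if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# The first line of `cmake --version` reads "cmake version X.Y.Z".
if command -v cmake >/dev/null 2>&1; then
    cmake_version="$(cmake --version | head -n 1 | awk '{print $3}')"
    if version_ge "${cmake_version}" "3.19.2"; then
        echo "CMake ${cmake_version}: at least 3.19.2"
    else
        echo "CMake ${cmake_version}: older than 3.19.2, FindMPI may fail as reported above"
    fi
fi
```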
from cosma.
Hi Raffaele, and thank you for the suggestion.
I installed CMake 3.19.2, and now it works: I can properly build COSMA! However, I still have some issues:
- The build works only with the GCC compilers + OpenMPI. When using the Intel compilers and Intel MPI, it still generates the same error. This is not a problem for me at the moment, but someone else could run into this issue too. Also, the lines
enable_language(C)
enable_language(Fortran)
turned out to be necessary to build the codebase properly. Finally, CMake was not able to link MKL until I moved the library files one level up in the directory tree (the standard MKL installation places them in mkl/latest/libs/intel64, but CMake searches for them in mkl/latest/libs).
- Even though the compilation succeeds, COSMA does not seem to work properly. If I try to run your example
mpirun -np ./build/miniapp/cosma_miniapp
it raises the following error:
terminate called after throwing an instance of 'std::domain_error'
what(): No value
terminate called after throwing an instance of 'std::domain_error'
what(): No value
terminate called after throwing an instance of 'std::domain_error'
what(): No value
terminate called after throwing an instance of 'std::domain_error'
what(): No value
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 0 on node n2 exited on signal 6 (Aborted).
--------------------------------------------------------------------------
The same happens if I unset OMP/MKL_NUM_THREADS or set them to any value, and also when using mpiexec in place of mpirun. Do you have any idea what is going on? Am I missing something?
from cosma.
Hi Filippo,
You are right: FindMPI with gcc requires both C and Fortran to be enabled (with clang it doesn't, and I don't have the Intel compiler to test), but I still think these problems are caused by the CMake find modules. We will investigate it.
Regarding the miniapp runs, there is a bug: one of the options has no default (I opened #70 for this).
Please add the option -s "" as a workaround:
mpirun -n 4 miniapp/cosma_miniapp -s ""
Please also note that the default value of m, n and k is 1000, which is a very small multiplication, so you may want to specify them as well.
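A sketch of how the workaround and larger matrix sizes could be combined. The helper name miniapp_cmd is hypothetical, and the flag names -m, -n and -k for the matrix dimensions are an assumption based on the parameter names mentioned above; only -s "" and the binary path come from this thread. The helper prints the command instead of running it, so it can be inspected first.

```shell
# Hypothetical helper: print a miniapp command line with the -s "" workaround.
#   $1 = number of MPI ranks (default 4)
#   $2 = value used for m, n and k (default 10000)
# The -m/-n/-k flag names are an assumption, not confirmed in this thread.
miniapp_cmd() {
    ranks="${1:-4}"
    size="${2:-10000}"
    printf 'mpirun -n %s miniapp/cosma_miniapp -m %s -n %s -k %s -s ""\n' \
        "${ranks}" "${size}" "${size}" "${size}"
}

miniapp_cmd 4 10000
```

Once the printed command looks right, it can be run from the build directory with eval "$(miniapp_cmd 4 10000)".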
from cosma.
@filthynobleman would you mind providing more details about your setup: OS, MPI version, gcc version and Intel MKL version? Did you build COSMA's master or a released version (e.g. 2.3)?
> Finally, CMake was not able to link MKL until I moved the library files one level up in the directory tree (standard MKL installation places them in mkl/latest/libs/intel64, but CMake searches for them in mkl/latest/libs).
This PR should make it so that you don't have to move files. We will try to make a release next week including it. If you want to test it earlier, you are welcome to.
> Also, the lines
> enable_language(C)
> enable_language(Fortran)
> turned out to be necessary to build the codebase properly.
Do you need the lines when you use gcc+OpenMPI or only when you use icc+IntelMPI? gcc+OpenMPI works for me without the suggested lines. I don't have icc and IntelMPI so I can't test this configuration unfortunately.
A word of advice:
- You are unlikely to get any performance gain from compiling the code with icc. If you have a choice of compiler, I would suggest using gcc or clang. (Note: you can still use IntelMPI with gcc.)
- COSMA is mostly used with MPICH (or its derivatives: Cray-MPICH, IntelMPI, etc.). OpenMPI hasn't been tested as thoroughly yet and there may be issues (e.g. #75).
We will try to include a COSMA + OpenMPI setup in our CI before the next release to improve support.
from cosma.
Hello Raffaele and Teodor.
As suggested, I set the -s option and now everything works.
About your question, I'm working on a Linux environment, and I have the following
- GCC v7.5.0
- OpenMPI v4.0.3
- Intel MKL v2021.1.1
- COSMA built from your cmake branch
I was using the Intel compilers just because it was easier for me at the moment, but I was able to switch to GCC+OpenMPI without noticeable problems. I can confirm that the two lines above enabling C and Fortran turned out to be necessary even with GCC, but I cannot understand why.
I also didn't notice any issues using COSMA with the OpenMPI backend, and now everything works properly.
To me, this issue can be closed. If I may, I would only suggest adding a note about CMake 3.19.1, since I suppose it was the main cause of the problems.
Thank you very much once again for your incredible and fast support. You have been very kind and helpful.
from cosma.
> To me, this issue can be closed. If I may, I would only suggest adding a note about CMake 3.19.1, since I suppose it was the main cause of the problems.
The thing is that both 3.19.1 and 3.19.2 work for me. I am not entirely sure that's where the problem is. Looking at the release notes for the two versions and the change history of the FindMPI module:
- https://cmake.org/cmake/help/latest/release/3.19.html#id1
- https://gitlab.kitware.com/cmake/cmake/-/commits/master/Modules/FindMPI.cmake
there is nothing that would imply that something broke with MPI between the two versions.
I can close the issue, but it would be nice to understand what was causing the problem. If you want to proceed, it would be helpful to post the exact command line / script you used to build the library.
Also, the FindMPI module has a section listing various ways to make CMake aware of the location of the library; it is worth checking out in case you haven't seen it already (link). It is usually sufficient to do one of these:
- have mpicxx, mpicc and mpirun/mpiexec in your PATH
- export/set at least one of the environment variables MPICC=<some_path>/mpicc, MPICXX=<some_path>/mpic++ (maybe MPIF77 and MPIF90?)
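The first option can be checked quickly before running CMake. This is only a sketch: the helper name check_tools is hypothetical, and the tool names are the ones listed above.

```shell
# Hypothetical helper: report where each given tool is found in PATH.
# Returns non-zero if at least one of them is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if path="$(command -v "${tool}" 2>/dev/null)"; then
            echo "${tool}: ${path}"
        else
            echo "${tool}: not found in PATH"
            missing=$((missing + 1))
        fi
    done
    [ "${missing}" -eq 0 ]
}

# Check the MPI compiler wrappers and launchers mentioned above.
check_tools mpicc mpicxx mpirun mpiexec \
    || echo "Some MPI tools are missing from PATH; consider setting MPICC/MPICXX instead."
```

If the wrappers are found but belong to a different MPI installation than the one intended, the printed paths make that visible as well.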
from cosma.