Hi, I'm trying to build GPUE with the instructions provided, but I ge


Difficulties with the build process about gpue HOT 16 CLOSED

gpue-group commented on September 11, 2024

Difficulties with the build process


Comments (16)

leios commented on September 11, 2024

I was able to replicate the error with the cmake build; however, the makefile build should still work in this case.

Run git checkout Makefile to restore the Makefile, then modify the CUDA_HOME and GPU_ARCH lines to point to your CUDA install and to use the correct architecture. Does this still produce an error?

We'll be working on the cmake stuff today. We do explicitly specify C++11, so it is weird that an error is coming up.


markbasham commented on September 11, 2024

@leios thanks for the help. If I just use the Makefile as specified, I get the output below. I have K80 GPUs, so I think I have GPU_ARCH correct.

[ssg37927@cs04r-sc-com14-05 test]$ git clone https://github.com/GPUE-group/GPUE.git
Initialized empty Git repository in /dls/tmp/ssg37927/test/GPUE/.git/
remote: Enumerating objects: 66, done.
remote: Counting objects: 100% (66/66), done.
remote: Compressing objects: 100% (46/46), done.
remote: Total 2705 (delta 33), reused 45 (delta 20), pack-reused 2639
Receiving objects: 100% (2705/2705), 3.83 MiB | 3.28 MiB/s, done.
Resolving deltas: 100% (2057/2057), done.

[ssg37927@cs04r-sc-com14-05 test]$ cd GPUE/

[ssg37927@cs04r-sc-com14-05 GPUE]$ module load cuda/8.0

[ssg37927@cs04r-sc-com14-05 GPUE]$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61

[ssg37927@cs04r-sc-com14-05 GPUE]$ gcc --version
gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[ssg37927@cs04r-sc-com14-05 GPUE]$ which nvcc
/dls_sw/apps/cuda/8.0/bin/nvcc

[ssg37927@cs04r-sc-com14-05 GPUE]$ vim Makefile

[ssg37927@cs04r-sc-com14-05 GPUE]$ head Makefile
CUDA_HOME = /dls_sw/apps/cuda/8.0
#CUDA_HOME = /apps/free/cuda/7.5.18/
#CUTT_DIR = cutt/lib
GPU_ARCH = sm_37
OS:= $(shell uname)
ifeq ($(OS),Darwin)
CUDA_LIB = $(CUDA_HOME)/lib
CUDA_HEADER = $(CUDA_HOME)/include
CC = $(CUDA_HOME)/bin/nvcc -ccbin /usr/bin/clang --ptxas-options=-v#-save-temps
CFLAGS = -g -std=c++11 -Wno-deprecated-gpu-targets

[ssg37927@cs04r-sc-com14-05 GPUE]$ make
/dls_sw/apps/cuda/8.0/bin/nvcc --ptxas-options=-v --compiler-options -Wall -c -o fileIO.o -I/dls_sw/apps/cuda/8.0/include -g -O0 -std=c++11 -Xcompiler '-std=c++11' -Xcompiler '-fopenmp' -L/dls_sw/apps/cuda/8.0/lib64 -lcufft -Xcompiler "-fopenmp" -arch=sm_37 src/fileIO.cu -dc
nvcc warning : The -std=c++11 flag is not supported with the configured host compiler. Flag will be ignored.
cc1plus: error: unrecognized command line option "-std=c++11"
make: *** [fileIO.o] Error 1


mlxd commented on September 11, 2024

GCC 4.4.x is quite an old compiler, so I expect problems with the C++11 requirements in the build process. Even if some of it compiles, the string ABI changes introduced around GCC 5.2 may cause problems (though again, I may be wrong about this). It may be that we have to specify a minimum compiler version in this case, such as 4.8/4.9. Is there any way for you to use a newer GCC version? If not, we may need to consider recommending Spack as an optional means of getting a newer GCC version, though this is just an option.

Conda can also provide GCC 7.2 as an option, or we can opt for Clang as the host compiler too, though build-from-scratch can be quite a long process.


mlxd commented on September 11, 2024

Just as a note, for most development I have used the most recent version of GCC or Clang as the host compiler. I am curious about the errors seen with higher GCC versions. We have mostly used CentOS 7, Ubuntu 16.04, or Arch Linux, which tend to have GCC 4.8, 6+, and 8+ respectively.


markbasham commented on September 11, 2024

tell me about it :) RHEL6 is very old but very stable apparently.....

However, it works fine with GCC 4.9.3. Could that be added to the build requirements? Then this can be closed :)

Cheers

Mark


mlxd commented on September 11, 2024

I can certainly appreciate stability. Arch Linux on the other hand has sometimes caused me problems for looking at it the wrong way.

Awesome, thanks for the detailed logs, they were really helpful to pin things down. I'll make the changes to the documentation in the morning, and recommend GCC 4.9.3 as the minimum version with RH releases.

I'll do some testing of this tomorrow too with GCC versions on Arch just to confirm everything, and maybe @leios would be fine with doing likewise on CentOS? I'd be happy to detail the process and results with what works and what does not.

Thanks again for the help @markbasham
We will get the final approval from you after this testing before closing this.

P.S. I recall @mgalloy also stating build issues on a Mac, which we may also be able to take care of in this instance (hopefully). If not, we can open a separate issue for that.


leios commented on September 11, 2024

This error came about due to a recent change to the CMakeLists.txt file for building out of tree. Basically, building the unit tests requires us to test the fileIO, which means we need to specify the directory of an input file. This meant that we needed special build instructions for that file, and we forgot to specify C++11 for it.

This has been fixed for GCC 4.8 as of the latest commit. I will see if I can find a GCC 4.4 build around to try with too. I think we also have Macs with NVIDIA GPUs somewhere in the lab, so I will try building on them too.


mgalloy commented on September 11, 2024

On my Mojave (10.14) Mac, I was getting the following error on configure:

nvcc fatal : The version ('10.0') of the host compiler ('Apple clang') is not supported

I downgraded to Command Line Tools for Xcode 7.3 because of comments in some Apple forums. This let me configure and compile, but linking causes the following error:

[ 95%] Linking CUDA device code CMakeFiles/gpue.dir/cmake_device_link.o
nvcc fatal : Unknown option 'Wl,-rpath,/usr/local/cuda/lib'

Attaching CMakeCache.txt file and the log of the build process.


leios commented on September 11, 2024

Ah, thanks! I just got this same error on an old mac I found in the lab. I'll see if we can figure this out.


leios commented on September 11, 2024

@mgalloy I am updating my mac right now (it looks to be a problem with some command line tools, so I am getting the latest versions of everything).

I think this is a general CUDA error and not a problem with GPUE building. Can you compile any CUDA code? For example:

nvcc file.cu

#include <cstdlib>   // malloc, free
#include <iostream>
#include <math.h>    // ceil

__global__ void vecAdd(double *a, double *b, double *c, int n){

    // First we need to find our global threadID
    int id = blockIdx.x*blockDim.x + threadIdx.x;

    // Make sure we are not out of range
    if (id < n){
        c[id] = a[id] + b[id];
    }
}

int main(){

    // size of vectors
    int n = 1000;

    // Host vectors
    double *h_a, *h_b, *h_c;

    // Device vectors
    double *d_a, *d_b, *d_c;

    // allocating space on host and device
    h_a = (double*)malloc(sizeof(double)*n);
    h_b = (double*)malloc(sizeof(double)*n);
    h_c = (double*)malloc(sizeof(double)*n);

    // Allocating space on GPU
    cudaMalloc(&d_a, sizeof(double)*n);
    cudaMalloc(&d_b, sizeof(double)*n);
    cudaMalloc(&d_c, sizeof(double)*n);

    //initializing host vectors
    for (int i = 0; i < n; ++i){
        h_a[i] = 1;
        h_b[i] = 1;
    }

    // copying these components to the GPU
    cudaMemcpy(d_a, h_a, sizeof(double)*n, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, sizeof(double)*n, cudaMemcpyHostToDevice);

    // Creating blocks and grid ints
    int threads, grid;

    threads = 64;
    grid = (int)ceil((float)n/threads);

    vecAdd<<<grid, threads>>>(d_a, d_b, d_c, n);

    // Now to copy c back
    cudaMemcpy(h_c, d_c, sizeof(double)*n, cudaMemcpyDeviceToHost);

    double sum = 0;
    for (int i = 0; i < n; ++i){
        sum += h_c[i];
    }

    std::cout << "Sum is: " << sum << '\n';

    // Release memory
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);

    free(h_a);
    free(h_b);
    free(h_c);
}


mlxd commented on September 11, 2024

On the Linux side, GCC 4.9 and above seem to work fine with CUDA 8, so I will pin 4.9 as the minimum compiler version required (4.8 also works for me, but I will pin it to @markbasham's tested version). These changes have been added here: https://gpue-group.github.io/build/


leios commented on September 11, 2024

@mgalloy I was working with the Mac we have in the office, which had a similar error. In our case, it was a problem with conflicting Xcode and CUDA requirements (last comment here: pytorch/pytorch#3047).

With CUDA 10, you will need Xcode version 9.4. After this, try compiling the test code I provided. It should give an answer of 2000 (it's just a vector sum to test a simple kernel). After verifying that, I was able to build GPUE and run it on my machine; however, sm_20 has been deprecated in CUDA 10, and that was the architecture of the card on the Mac. I could not install Xcode < 9.4 on Mojave, so I don't think I can use an older CUDA version to test out older architectures on this Mac.


mlxd commented on September 11, 2024

I am pretty sure those 2012 iMacs had GTX675M (or similar) cards, which would be Kepler generation (i.e. sm_3x). I think it should still be fine to build and run on that (please feel free to correct me if I am wrong).


leios commented on September 11, 2024

@mgalloy Further updates on CUDA and macOS 10.14: https://devtalk.nvidia.com/default/topic/1042279/cuda-setup-and-installation/cuda-10-and-macos-10-14/

It seems that Mojave does not support CUDA at this time. The 410.130 CUDA driver seems to support only macOS 10.13 (High Sierra). On my end, the compilation worked and GPUE should have run after the Xcode reversion; however, it doesn't seem that any CUDA code runs correctly with CUDA 10 (and CUDA 9 is not possible because of the Xcode conflicts mentioned before).

I am not sure where to go from here, but I'll see if there's anything else to do to get this working on the latest mac release.


mgalloy commented on September 11, 2024

I'm looking into using our supercomputer; there are nodes with Teslas which I believe I can use.


leios commented on September 11, 2024

I believe the current difficulties have been resolved. Please reopen if this is not the case.
