Comments (12)

standfest commented on July 18, 2024

PS: I found that in CUDA 5.0 and CUDA 5.5, the CUBLAS routine SGEMM() for operations NN and NT can give wrong results on Kepler architecture SM35 when the following conditions are met:
4 * ldc * n >= 2^32 and m >= 256
where m, n, and ldc are respectively the number of rows, the number of columns, and the leading dimension of the resulting matrix C (see http://docs.nvidia.com/cuda/cuda-toolkit-release-notes/ and http://nvlabs.github.io/moderngpu/performance.html). By the way, I run an old FX 4800. I do not know whether this input helps.
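
The condition quoted above can be checked with a few lines of shell arithmetic. This is only a sketch: the function name `sgemm_at_risk` is made up for illustration, and it assumes a shell whose arithmetic is 64-bit, as is usual on 64-bit Linux.

```shell
# Hypothetical helper, not part of Somoclu or CUDA: succeeds (exit status 0)
# when the dimensions of C satisfy 4 * ldc * n >= 2^32 and m >= 256.
# Usage: sgemm_at_risk M N LDC
sgemm_at_risk() {
    [ $((4 * $3 * $2)) -ge 4294967296 ] && [ "$1" -ge 256 ]
}

# Example: m = 256, n = 32768, ldc = 32768 gives 4 * ldc * n = 2^32 exactly.
if sgemm_at_risk 256 32768 32768; then
    echo "dimensions fall in the affected range"
fi
```

The factor 4 presumably corresponds to the 4-byte size of a single-precision float, in which case the condition describes a result matrix of 4 GiB or more.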

from somoclu.

peterwittek commented on July 18, 2024

The cstdlib dependency was added, thank you for spotting the problem.

Both CUDA 5.0 and 5.5 work fine on Fermi architecture. Unfortunately I
do not have access to Kepler hardware, so I am unable to do any kind of
testing. The configure.in file has the following lines:

if test `nvcc --version|grep release|awk '{print $5}'|cut -d. -f1` -ge 5; then
    GENCODE_SM30="-gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35"
fi

If you remove these lines, only Compute Capability 2.0 code will be
generated by NVCC, which might just work on Kepler. It is certainly not
an optimal solution, but let me know if it works.
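
A more defensive form of that version check might look like the sketch below. It is not the code in configure.in: the helper name `nvcc_major` is made up, and the `${major:-0}` default is an assumption about how one might handle the case where nvcc is not on the PATH, which is the situation behind the `test: -ge: unary operator expected` message reported later in this thread.

```shell
# Hypothetical helper: read `nvcc --version` output on stdin and print the
# major release number (for example 5 for "release 5.5").
nvcc_major() {
    sed -n 's/.*release \([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# If nvcc is missing or prints nothing, major is empty; defaulting it to 0
# keeps `test` from receiving an empty operand.
major=$(nvcc --version 2>/dev/null | nvcc_major)
if test "${major:-0}" -ge 5; then
    GENCODE_SM30="-gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35"
fi
```

With this form, a missing nvcc simply leaves GENCODE_SM30 unset instead of producing a shell error.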

Thanks again.

standfest commented on July 18, 2024

thanks for your response. digging further into the problem (and changing my hardware back to a tesla c2070) i still struggle with this linking problem while compiling:

make -C src all
make[1]: Entering directory `/home/standfem/Downloads/peterwittek-somoclu-f9336f2/src'
/usr/local/cuda//bin/nvcc -DHAVE_CONFIG_H -I/usr/local/cuda//include -use_fast_math -gencode arch=compute_10,code=sm_10 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -Xcompiler "-O3 -fPIC -fopenmp" -I/usr/lib64/openmpi//include -I. -I.. -o denseGpuKernels.cu.co -c ./denseGpuKernels.cu
/usr/lib64/openmpi/bin//mpic++ -DHAVE_CONFIG_H -O3 -fPIC -fopenmp -L/usr/local/cuda//lib64 -L/usr/lib64/openmpi//lib -o somoclu sparseCpuKernels.o io.o denseCpuKernels.o mapDistanceFunctions.o training.o denseGpuKernels.cu.co somoclu.o  -lcudart -lcublas -lmpi
somoclu.o: In function `main':
somoclu.cpp:(.text.startup+0x56c): undefined reference to `setDevice'
collect2: error: ld returned 1 exit status
make[1]: *** [somoclu] Error 1
make[1]: Leaving directory `/home/standfem/Downloads/peterwittek-somoclu-f9336f2/src'
make: *** [all] Error 2

do you have any ideas what to do? maybe my configure output helps:

./configure --with-mpi-compilers=/usr/lib64/openmpi/bin/ --with-mpi=/usr/lib64/openmpi/ --with-cuda=/usr/local/cuda/
checking for g++... g++
checking whether the C++ compiler works... yes
checking for C++ compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking for a BSD-compatible install... /usr/bin/install -c
checking for gcc... gcc
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking for gcc option to support OpenMP... -fopenmp
checking for nvcc... yes
checking MPI C++ compiler in /usr/lib64/openmpi/bin/... /usr/lib64/openmpi/bin//mpic++
checking MPI directory... /usr/lib64/openmpi/
checking how to run the C++ preprocessor... g++ -E
checking whether special compile flag for MPICH is required... no
configure: creating ./config.status
config.status: creating Makefile
config.status: creating src/Makefile
config.status: creating config.h
config.status: config.h is unchanged
-------------------------------------------------

 Somoclu Version 1.2

 Prefix: /usr/local.
 Compiler: /usr/lib64/openmpi/bin//mpic++ -O3 -fPIC -fopenmp -I/usr/lib64/openmpi//include -L/usr/lib64/openmpi//lib -lmpi

 Package features:
   OpenMP enabled: yes
   MPI enabled: yes
   CUDA enabled: yes

 Now type 'make [<target>]'
   where the optional <target> is:
     all                - build all binaries
     install            - install everything

--------------------------------------------------

peterwittek commented on July 18, 2024

There was a logical flaw with the preprocessor statements in the
setDevice function when CUDA was enabled but MPI was not. It was
corrected. This function is necessary even in a single-GPU configuration,
as it contains a cudaSetDevice call, without which the GPU context may
or may not be initialized. Could you try the update?

standfest commented on July 18, 2024

Thanks, now it is compiling without complaining - but with a persistent linking flaw:

somoclu -x 400 -y 300 file folder -e 20 -k 1
somoclu: error while loading shared libraries: libcudart.so.5.5: cannot open shared object file: No such file or directory

if i set

export LD_LIBRARY_PATH=/usr/local/cuda/lib64
export PATH=$PATH:/usr/local/cuda/bin

then libmpi.so.1 is not found, and vice versa. Maybe including the path in the '-rpath' linker option could help - sadly i am a c++ noob and so far all my approaches to modifying the makefile have failed. Any hints?
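
One possible explanation, offered as an assumption, is that `export LD_LIBRARY_PATH=/usr/local/cuda/lib64` replaces any value that already pointed at the MPI libraries. A minimal sketch that appends both directories instead of overwriting the variable (`append_path` is a made-up helper; the two directories are the ones that appear in the build log above):

```shell
# Hypothetical helper: print the colon-separated list $1 with directory $2
# appended, unless $2 is already in the list.
append_path() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;      # already present, keep as is
        "::")     printf '%s\n' "$2" ;;      # list was empty
        *)        printf '%s\n' "$1:$2" ;;   # append to existing list
    esac
}

LD_LIBRARY_PATH=$(append_path "${LD_LIBRARY_PATH:-}" /usr/local/cuda/lib64)
LD_LIBRARY_PATH=$(append_path "$LD_LIBRARY_PATH" /usr/lib64/openmpi/lib)
export LD_LIBRARY_PATH
```

Alternatively, passing `-Wl,-rpath,/usr/local/cuda/lib64` to the compiler driver at link time embeds the directory in the binary. Whether Somoclu's Makefile picks up such a flag from the environment is not confirmed anywhere in this thread.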

peterwittek commented on July 18, 2024

As for the MPI dependency, do not worry about it, unless you have more
than one GPU or more than one node. Just disable MPI with the configure
script.

Not finding the CUDA libraries is more troubling. You said earlier that
you tried both CUDA 5.0 and 5.5. Just to double check a trivial error,
is 5.5 the version sitting in /usr/local/cuda?

If the error persists, please post again the parameters for the configure
script.

Thanks and apologies for the delay.

standfest commented on July 18, 2024

originally i had a symbolic link called CUDA pointing to CUDA-5.5, but now i deleted it and renamed CUDA-5.5 to CUDA. additionally i set the --without-mpi flag and was able to compile and run without the linking flaw - as long as i set

export LD_LIBRARY_PATH=/usr/local/cuda/lib64
export PATH=$PATH:/usr/local/cuda/bin

unfortunately i get this when testing it with -k 1 (0 and 2 are working, but there is a long waiting period after the final training iteration - i cannot imagine saving the data is taking so long, or is it?)

somoclu -x 400 -y 300 file folder -e 20 -k 1
nVectors: 417 nVectorsPerRank: 417 nDimensions: 0 
Epoch: 0 Radius: 200
 ** On entry to SGEMM  parameter number 8 had an illegal value
!!!! kernel execution error.
Aborted
terminate called after throwing an instance of 'thrust::system::system_error'
  what():  unload of CUDA runtime failed
Aborted (core dumped)
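
One detail in the log above may be worth checking. It reports `nDimensions: 0`, and in the reference BLAS argument order parameter 8 of SGEMM is `lda`, the leading dimension of A, which must be at least 1. That the two are related is an assumption, not something confirmed in this thread; it could happen, for example, if the rows of the input file are not parsed as expected. A minimal sketch for counting the whitespace-separated columns of the first data row (the function name and the treatment of lines starting with `#` as comments are assumptions):

```shell
# Hypothetical check: read a dense input file on stdin and print the number
# of whitespace-separated fields on the first line that is neither empty
# nor starts with '#'. Prints nothing if there is no such line.
first_row_width() {
    awk 'NF > 0 && $1 !~ /^#/ { print NF; exit }'
}

# Example with inline data standing in for a real input file.
printf '0.1 0.2 0.3\n0.4 0.5 0.6\n' | first_row_width
```

If the count for the real input file is 0 or the command prints nothing, the file format rather than CUDA would be the first thing to look at.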

so back to square one. at least here my log:

$ ./configure --without-mpi
checking for g++... g++
checking whether the C++ compiler works... yes
checking for C++ compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking for a BSD-compatible install... /usr/bin/install -c
checking for gcc... gcc
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking for gcc option to support OpenMP... -fopenmp
checking for nvcc... yes
./configure: line 3263: nvcc: command not found
./configure: line 3263: test: -ge: unary operator expected
configure: creating ./config.status
config.status: creating Makefile
config.status: creating src/Makefile
config.status: creating config.h
config.status: config.h is unchanged
-------------------------------------------------

 Somoclu Version 1.2

 Prefix: /usr/local.
 Compiler: g++ -O3 -fPIC -fopenmp   

 Package features:
   OpenMP enabled: yes
   MPI enabled: no
   CUDA enabled: yes

 Now type 'make [<target>]'
   where the optional <target> is:
     all                - build all binaries
     install            - install everything

--------------------------------------------------
$ make
make -C src all
make[1]: Entering directory `/home/standfem/Downloads/peterwittek-somoclu-f9336f2/src'
g++ -DHAVE_CONFIG_H -O3 -fPIC -fopenmp  -I. -I.. -o sparseCpuKernels.o -c ./sparseCpuKernels.cpp
g++ -DHAVE_CONFIG_H -O3 -fPIC -fopenmp  -I. -I.. -o io.o -c ./io.cpp
g++ -DHAVE_CONFIG_H -O3 -fPIC -fopenmp  -I. -I.. -o denseCpuKernels.o -c ./denseCpuKernels.cpp
g++ -DHAVE_CONFIG_H -O3 -fPIC -fopenmp  -I. -I.. -o mapDistanceFunctions.o -c ./mapDistanceFunctions.cpp
g++ -DHAVE_CONFIG_H -O3 -fPIC -fopenmp  -I. -I.. -o training.o -c ./training.cpp
/usr/local/cuda/bin/nvcc -DHAVE_CONFIG_H -I/usr/local/cuda/include -use_fast_math -gencode arch=compute_10,code=sm_10 -gencode arch=compute_20,code=sm_20  -Xcompiler "-O3 -fPIC -fopenmp"  -I. -I.. -o denseGpuKernels.cu.co -c ./denseGpuKernels.cu
g++ -DHAVE_CONFIG_H -O3 -fPIC -fopenmp  -I. -I.. -o somoclu.o -c ./somoclu.cpp
g++ -DHAVE_CONFIG_H -O3 -fPIC -fopenmp -L/usr/local/cuda/lib64  -o somoclu sparseCpuKernels.o io.o denseCpuKernels.o mapDistanceFunctions.o training.o denseGpuKernels.cu.co somoclu.o  -lcudart -lcublas 
make[1]: Leaving directory `/home/standfem/Downloads/peterwittek-somoclu-f9336f2/src'
$ sudo make install
make -C src install
make[1]: Entering directory `/home/standfem/Downloads/peterwittek-somoclu-f9336f2/src'
/usr/bin/install -c -d /usr/local/bin
/usr/bin/install -c -m 0755 somoclu \
 /usr/local/bin
make[1]: Leaving directory `/home/standfem/Downloads/peterwittek-somoclu-f9336f2/src'

thank you for thinking about it!

peterwittek commented on July 18, 2024

I want to find out what goes wrong here. I will install a Fedora on my
laptop over the weekend to reproduce your error. Playing with a live
Fedora 20 distribution today, I noticed that it is an incredible pain to
get the proprietary driver and CUDA working. Since deviceQuery works for
you, I assume that CUDA is otherwise operational.

peterwittek commented on July 18, 2024

I cannot reproduce the problem. I started with a plain vanilla Fedora 20 install. Then I followed these instructions to get the proprietary driver working:

http://www.if-not-true-then-false.com/2014/fedora-20-nvidia-guide/

yum update kernel* selinux-policy*
reboot
yum localinstall --nogpgcheck http://download1.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm http://download1.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm
yum install akmod-nvidia xorg-x11-drv-nvidia-libs kernel-devel acpid

Genuinely weird stuff was going on with the initramfs, but eventually os-detect on Arch Linux figured out the correct boot configuration, blacklisted nouveau, and I had the Nvidia driver working:

lsmod|grep nvidia
nvidia              10686781  44 
drm                   283937  4 nvidia
i2c_core               38476  4 drm,i2c_i801,nvidia,videodev

Then I followed the instructions here:

http://fedoraproject.org/wiki/Cuda

I installed the prerequisites, also adding git, automake, and perl-Env:

yum install wget make gcc-c++ freeglut-devel libXi-devel libXmu-devel mesa-libGLU-devel git perl-Env automake

Then I switched over to these instructions for CUDA 5.5:

http://hobiger.org/blog/2013/12/19/fedora-20-and-cuda/

issuing the command

sh cuda_5.5.22_linux_64.run -override

I accepted the EULA, said yes to attempting the install on an unsupported configuration, did not install the drivers, and said yes to installing. The path was /opt/cuda, and the CUDA samples were also installed to the default location ($HOME/NVIDIA_CUDA-5.5_Samples).

After compiling deviceQuery, it complained that the driver did not support this CUDA version. I downloaded the latest driver and installed it:

systemctl stop gdm
sh NVIDIA-Linux-x86_64-331.49.run
reboot

After this, deviceQuery reported my GPU, an old 330M with Compute Capability 1.2.

I cloned and compiled the git version of Somoclu:

git clone https://github.com/peterwittek/somoclu
cd somoclu
./autogen.sh
./configure --without-mpi
make -s
export  LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/cuda/lib64
src/somoclu -k 1 data/rgbs.txt data/gpu_test

A memory deallocation glitch crept in yesterday; I fixed it. Otherwise, it runs without problems. So I do not know what could be the issue on your machine.

standfest commented on July 18, 2024

thanks for trying. i will look into my machine the day after tomorrow, maybe i'm going to reset it. i will update you on any findings.

standfest commented on July 18, 2024

i finally found the time to redo the whole installation, and now it worked quite well. the only thing not found right away was libcudart.so.6 (apparently others have this problem too: http://stackoverflow.com/questions/10808958/why-cant-libcudart-so-4-be-found-when-compiling-the-cuda-samples-under-ubuntu ), but the following line helped:

sudo ldconfig /usr/local/cuda/lib64
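
To see whether the dynamic loader already knows about the CUDA runtime, the cache that ldconfig maintains can be listed with `ldconfig -p`. A minimal sketch, with `in_ld_cache` as a made-up helper name:

```shell
# Hypothetical helper: succeeds if the library name given as $1 appears in
# an `ldconfig -p` listing supplied on stdin.
in_ld_cache() {
    grep -q -F "$1"
}

# Report when libcudart is absent from the loader cache (or when ldconfig
# itself is not on the PATH, in which case the listing is empty).
ldconfig -p 2>/dev/null | in_ld_cache libcudart.so \
    || echo "libcudart is not in the loader cache"
```

On many distributions, listing the directory in a file under /etc/ld.so.conf.d/ and then running ldconfig is the usual way to keep the entry across later cache rebuilds.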

thank you again for all your help and of course for your library,
cheers
matthias

peterwittek commented on July 18, 2024

I am glad it finally works.

Peter
