Comments (12)
ps: i found that in CUDA 5.0 and CUDA 5.5, the CUBLAS routine SGEMM() for operations NN and NT can give wrong results on Kepler architecture SM35 when the following conditions are met:
4 * ldc * n >= 2^32 and m >= 256
where m, n, and ldc are respectively the number of rows, the number of columns, and the leading dimension of the resulting matrix C. Sources: http://docs.nvidia.com/cuda/cuda-toolkit-release-notes/, http://nvlabs.github.io/moderngpu/performance.html (btw i run an old fx4800) i do not know, maybe this input is helping.
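To make the quoted condition concrete, here is a minimal shell sketch that checks whether a given result-matrix shape falls into the affected range. The function name `sgemm_affected` and the example shape are hypothetical and not part of CUBLAS or Somoclu; only the inequality itself comes from the release note quoted above.

```shell
# Hypothetical helper (not part of CUBLAS or Somoclu): succeeds when the
# shape of the result matrix C meets the condition quoted above.
# Arguments: m (rows), n (columns), ldc (leading dimension of C).
sgemm_affected() {
    sa_m=$1
    sa_n=$2
    sa_ldc=$3
    # 4294967296 is 2^32 written out; shell arithmetic is assumed to be 64-bit.
    [ $((4 * $sa_ldc * $sa_n)) -ge 4294967296 ] && [ "$sa_m" -ge 256 ]
}

# Illustrative shape only: 4 * 40000 * 30000 is larger than 2^32.
if sgemm_affected 40000 30000 40000; then
    echo "shape meets the condition"
fi
```

A shape such as m = n = ldc = 100 fails both parts of the test, so the helper returns a non-zero status for it.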
from somoclu.
The cstdlib dependency was added, thank you for spotting the problem.
Both CUDA 5.0 and 5.5 work fine on Fermi architecture. Unfortunately I
do not have access to Kepler hardware, so I am unable to do any kind of
testing. The configure.in file has the following lines:
if test `nvcc --version|grep release|awk '{print $5}'|cut -d. -f1` -ge 5; then
    GENCODE_SM30="-gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35"
fi
If you remove these lines, only Compute Capability 2.0 code will be
generated by NVCC, which might just work on Kepler. It is certainly not
an optimal solution, but let me know if it works.
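The version test in those configure.in lines relies on `nvcc` being on the PATH and printing a release line. As a hedged sketch of how the same pipeline behaves, the helper below reads `nvcc --version`-style text from standard input and falls back to 0 when nothing is found, so the numeric comparison always receives an operand. The name `cuda_major` and the sample line are assumptions made for illustration; this is not the project's actual configure logic.

```shell
# Hypothetical helper: reads `nvcc --version`-style text on stdin and prints
# the major release number, or 0 when no release line is present.
cuda_major() {
    cm_major=$(grep release | awk '{print $5}' | cut -d. -f1)
    echo "${cm_major:-0}"
}

# Sample line in the format nvcc prints for CUDA 5.5 (illustrative input).
sample="Cuda compilation tools, release 5.5, V5.5.0"

if [ "$(echo "$sample" | cuda_major)" -ge 5 ]; then
    echo "would add the compute_30 and compute_35 targets"
fi

# With empty input the helper prints 0, so the comparison stays well-formed.
[ "$(printf '' | cuda_major)" -ge 5 ] || echo "would keep the default targets"
```

In a real script the input would come from something like `nvcc --version 2>/dev/null | cuda_major`, with the result quoted inside the test.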
Thanks again.
On 2014-03-18 01:41, standfest wrote:
ps: i found that In CUDA 5.0 and CUDA 5.5, the CUBLAS routine SGEMM()
for operations NN and NT can give wrong results on Kepler Architecture
SM35 when the following conditions are met :
4 * ldc * n >= 2^32 and m >= 256
where m, n, and ldc are respectively the number of rows, the number
of columns, and the leading dimension of the resulting matrix C.
http://docs.nvidia.com/cuda/cuda-toolkit-release-notes/,
http://nvlabs.github.io/moderngpu/performance.html (btw i run an old fx4800) i do not know,
maybe this input is helping.
from somoclu.
thanks for your response. digging further into the problem (and changing my hardware back to a tesla c2070) i still struggle with this linking problem while compiling:
make -C src all
make[1]: Entering directory `/home/standfem/Downloads/peterwittek-somoclu-f9336f2/src'
/usr/local/cuda//bin/nvcc -DHAVE_CONFIG_H -I/usr/local/cuda//include -use_fast_math -gencode arch=compute_10,code=sm_10 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -Xcompiler "-O3 -fPIC -fopenmp" -I/usr/lib64/openmpi//include -I. -I.. -o denseGpuKernels.cu.co -c ./denseGpuKernels.cu
/usr/lib64/openmpi/bin//mpic++ -DHAVE_CONFIG_H -O3 -fPIC -fopenmp -L/usr/local/cuda//lib64 -L/usr/lib64/openmpi//lib -o somoclu sparseCpuKernels.o io.o denseCpuKernels.o mapDistanceFunctions.o training.o denseGpuKernels.cu.co somoclu.o -lcudart -lcublas -lmpi
somoclu.o: In function `main':
somoclu.cpp:(.text.startup+0x56c): undefined reference to `setDevice'
collect2: error: ld returned 1 exit status
make[1]: *** [somoclu] Error 1
make[1]: Leaving directory `/home/standfem/Downloads/peterwittek-somoclu-f9336f2/src'
make: *** [all] Error 2
do you have any ideas what to do? maybe my configure output helps:
./configure --with-mpi-compilers=/usr/lib64/openmpi/bin/ --with-mpi=/usr/lib64/openmpi/ --with-cuda=/usr/local/cuda/
checking for g++... g++
checking whether the C++ compiler works... yes
checking for C++ compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking for a BSD-compatible install... /usr/bin/install -c
checking for gcc... gcc
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking for gcc option to support OpenMP... -fopenmp
checking for nvcc... yes
checking MPI C++ compiler in /usr/lib64/openmpi/bin/... /usr/lib64/openmpi/bin//mpic++
checking MPI directory... /usr/lib64/openmpi/
checking how to run the C++ preprocessor... g++ -E
checking whether special compile flag for MPICH is required... no
configure: creating ./config.status
config.status: creating Makefile
config.status: creating src/Makefile
config.status: creating config.h
config.status: config.h is unchanged
-------------------------------------------------
Somoclu Version 1.2
Prefix: /usr/local.
Compiler: /usr/lib64/openmpi/bin//mpic++ -O3 -fPIC -fopenmp -I/usr/lib64/openmpi//include -L/usr/lib64/openmpi//lib -lmpi
Package features:
OpenMP enabled: yes
MPI enabled: yes
CUDA enabled: yes
Now type 'make [<target>]'
where the optional <target> is:
all - build all binaries
install - install everything
--------------------------------------------------
from somoclu.
There was a logical flaw in the preprocessor statements of the
setDevice function when CUDA was enabled but MPI was not. It has been
corrected. This function is necessary even in a single-GPU configuration,
as it contains a cudaSetDevice call, without which the GPU context may
or may not be initialized. Could you try the update?
from somoclu.
Thanks, now it is compiling without complaining - but with a persistent linking flaw:
somoclu -x 400 -y 300 file folder -e 20 -k 1
somoclu: error while loading shared libraries: libcudart.so.5.5: cannot open shared object file: No such file or directory
if i set
export LD_LIBRARY_PATH=/usr/local/cuda/lib64
export PATH=$PATH:/usr/local/cuda/bin
i cannot find libmpi.so.1 and vice versa. Maybe including the path in the "-rpath" linker option could help - sadly i am a c++ noob and so far all my approaches in modifying the makefile fail. Any hints?
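One possible reading of the "vice versa" symptom is that each `export LD_LIBRARY_PATH=...` replaces the previous value instead of extending it, so only one of the two library directories is searched at a time. The sketch below appends directories rather than overwriting them. The helper name `append_dir` is hypothetical; the two directories are the ones that appear in the build output earlier in this thread, and whether this was the actual cause on this machine is an assumption.

```shell
# Hypothetical helper: prints the colon-separated list $1 with directory $2
# appended, unless $2 is already one of its entries.
append_dir() {
    case ":$1:" in
        *":$2:"*) echo "$1" ;;      # already present: leave the list unchanged
        "::")     echo "$2" ;;      # empty list: the result is just the directory
        *)        echo "$1:$2" ;;   # otherwise append after a colon
    esac
}

# Keep whatever is already set and add both library directories.
LD_LIBRARY_PATH=$(append_dir "${LD_LIBRARY_PATH:-}" /usr/local/cuda/lib64)
LD_LIBRARY_PATH=$(append_dir "$LD_LIBRARY_PATH" /usr/lib64/openmpi/lib)
export LD_LIBRARY_PATH
echo "$LD_LIBRARY_PATH"
```

The rpath route mentioned above would instead bake the directories into the binary at link time, for example by passing `-Wl,-rpath,/usr/local/cuda/lib64` to g++ when linking. That is a general GCC linker option, not something the Somoclu makefile is known to set.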
from somoclu.
As for the MPI dependency, do not worry about it, unless you have more
than one GPU or more than one node. Just disable MPI with the configure
script.
Not finding the CUDA libraries is more troubling. You said earlier that
you tried both CUDA 5.0 and 5.5. Just to double check a trivial error,
is 5.5 the version sitting in /usr/local/cuda?
If the error persists, please post the parameters for the configure
script again.
Thanks and apologies for the delay.
from somoclu.
originally i had a symbolic link called CUDA pointing to CUDA-5.5, but now i deleted it and renamed CUDA-5.5 to CUDA. additionally i set the --without-mpi flag and was able to compile and run without the linking flaw - as long as i set
export LD_LIBRARY_PATH=/usr/local/cuda/lib64
export PATH=$PATH:/usr/local/cuda/bin
unfortunately i get this when testing it with -k 1 (0 and 2 are working, but there is a long waiting period after the final training iteration - i cannot imagine saving the data is taking so long, or is it?)
somoclu -x 400 -y 300 file folder -e 20 -k 1
nVectors: 417 nVectorsPerRank: 417 nDimensions: 0
Epoch: 0 Radius: 200
** On entry to SGEMM parameter number 8 had an illegal value
!!!! kernel execution error.
Aborted
terminate called after throwing an instance of 'thrust::system::system_error'
what(): unload of CUDA runtime failed
Aborted (core dumped)
so back to square one. at least here my log:
$ ./configure --without-mpi
checking for g++... g++
checking whether the C++ compiler works... yes
checking for C++ compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking for a BSD-compatible install... /usr/bin/install -c
checking for gcc... gcc
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking for gcc option to support OpenMP... -fopenmp
checking for nvcc... yes
./configure: line 3263: nvcc: command not found
./configure: line 3263: test: -ge: unary operator expected
configure: creating ./config.status
config.status: creating Makefile
config.status: creating src/Makefile
config.status: creating config.h
config.status: config.h is unchanged
-------------------------------------------------
Somoclu Version 1.2
Prefix: /usr/local.
Compiler: g++ -O3 -fPIC -fopenmp
Package features:
OpenMP enabled: yes
MPI enabled: no
CUDA enabled: yes
Now type 'make [<target>]'
where the optional <target> is:
all - build all binaries
install - install everything
--------------------------------------------------
$ make
make -C src all
make[1]: Entering directory `/home/standfem/Downloads/peterwittek-somoclu-f9336f2/src'
g++ -DHAVE_CONFIG_H -O3 -fPIC -fopenmp -I. -I.. -o sparseCpuKernels.o -c ./sparseCpuKernels.cpp
g++ -DHAVE_CONFIG_H -O3 -fPIC -fopenmp -I. -I.. -o io.o -c ./io.cpp
g++ -DHAVE_CONFIG_H -O3 -fPIC -fopenmp -I. -I.. -o denseCpuKernels.o -c ./denseCpuKernels.cpp
g++ -DHAVE_CONFIG_H -O3 -fPIC -fopenmp -I. -I.. -o mapDistanceFunctions.o -c ./mapDistanceFunctions.cpp
g++ -DHAVE_CONFIG_H -O3 -fPIC -fopenmp -I. -I.. -o training.o -c ./training.cpp
/usr/local/cuda/bin/nvcc -DHAVE_CONFIG_H -I/usr/local/cuda/include -use_fast_math -gencode arch=compute_10,code=sm_10 -gencode arch=compute_20,code=sm_20 -Xcompiler "-O3 -fPIC -fopenmp" -I. -I.. -o denseGpuKernels.cu.co -c ./denseGpuKernels.cu
g++ -DHAVE_CONFIG_H -O3 -fPIC -fopenmp -I. -I.. -o somoclu.o -c ./somoclu.cpp
g++ -DHAVE_CONFIG_H -O3 -fPIC -fopenmp -L/usr/local/cuda/lib64 -o somoclu sparseCpuKernels.o io.o denseCpuKernels.o mapDistanceFunctions.o training.o denseGpuKernels.cu.co somoclu.o -lcudart -lcublas
make[1]: Leaving directory `/home/standfem/Downloads/peterwittek-somoclu-f9336f2/src'
$ sudo make install
make -C src install
make[1]: Entering directory `/home/standfem/Downloads/peterwittek-somoclu-f9336f2/src'
/usr/bin/install -c -d /usr/local/bin
/usr/bin/install -c -m 0755 somoclu \
/usr/local/bin
make[1]: Leaving directory `/home/standfem/Downloads/peterwittek-somoclu-f9336f2/src'
thank you for thinking about it!
from somoclu.
I want to find out what goes wrong here. I will install a Fedora on my
laptop over the weekend to reproduce your error. Playing with a live
Fedora 20 distribution today, I noticed that it is an incredible pain to
get the proprietary driver and CUDA working. Since deviceQuery works for
you, I assume that CUDA is otherwise operational.
from somoclu.
I cannot reproduce the problem. I started with a plain vanilla Fedora 20 install. Then I followed these instructions to get the proprietary driver working:
http://www.if-not-true-then-false.com/2014/fedora-20-nvidia-guide/
yum update kernel* selinux-policy*
reboot
yum localinstall --nogpgcheck http://download1.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm http://download1.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm
yum install akmod-nvidia xorg-x11-drv-nvidia-libs kernel-devel acpid
Genuinely weird stuff was going on with the initramfs, but eventually os-detect on Arch Linux figured out the correct boot configuration, blacklisted nouveau, and I had the Nvidia driver working:
lsmod|grep nvidia
nvidia 10686781 44
drm 283937 4 nvidia
i2c_core 38476 4 drm,i2c_i801,nvidia,videodev
Then I followed the instructions here:
http://fedoraproject.org/wiki/Cuda
I installed the prerequisites, also adding git, automake, and perl-Env:
yum install wget make gcc-c++ freeglut-devel libXi-devel libXmu-devel mesa-libGLU-devel git perl-Env automake
Then I switched over to these instructions for CUDA 5.5:
http://hobiger.org/blog/2013/12/19/fedora-20-and-cuda/
issuing the command
sh cuda_5.5.22_linux_64.run -override
I accepted the EULA, said yes to attempting the install on an unsupported configuration, did not install the drivers, said yes to installing, set the path to /opt/cuda, and installed the CUDA samples to the default location ($HOME/NVIDIA_CUDA-5.5_Samples).
After compiling deviceQuery, it complained that the driver did not support this CUDA version. I downloaded the latest driver and installed it:
systemctl stop gdm
sh NVIDIA-Linux-x86_64-331.49.run
reboot
After this, deviceQuery reported my GPU, an old 330M with Compute Capability 1.2.
I cloned and compiled the git version of Somoclu:
git clone https://github.com/peterwittek/somoclu
cd somoclu
./autogen.sh
./configure --without-mpi
make -s
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/cuda/lib64
src/somoclu -k 1 data/rgbs.txt data/gpu_test
A memory deallocation glitch crept in yesterday; I fixed it. Otherwise, it runs without problems. So I do not know what could be the issue on your machine.
from somoclu.
thanks for trying. i will look into my machine the day after tomorrow, maybe i'm going to reset it. i will update you on any findings.
from somoclu.
after all i finally found the time to redo the whole installation again, and now it worked quite well. the only thing not found instantly was libcudart.so.6 (apparently others have this problem too: http://stackoverflow.com/questions/10808958/why-cant-libcudart-so-4-be-found-when-compiling-the-cuda-samples-under-ubuntu ) but the following line helped:
sudo ldconfig /usr/local/cuda/lib64
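To confirm that the dynamic linker cache now lists the CUDA runtime, one could filter the output of `ldconfig -p`. The sketch below is a minimal illustration: the helper name `cache_has_lib` is hypothetical, and the sample line only imitates the format `ldconfig -p` prints, so the snippet runs without a CUDA installation.

```shell
# Hypothetical helper: reads `ldconfig -p`-style text on stdin and succeeds
# when a line containing the literal library name $1 is present.
cache_has_lib() {
    grep -q -F "$1"
}

# Illustrative cache entry in the format printed by `ldconfig -p`.
sample_entry='	libcudart.so.5.5 (libc6,x86-64) => /usr/local/cuda/lib64/libcudart.so.5.5'

if echo "$sample_entry" | cache_has_lib libcudart.so; then
    echo "libcudart is listed in the linker cache"
fi
```

On a real system the input would come from `ldconfig -p | cache_has_lib libcudart.so`.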
thank you again for all your help and of course for your library,
cheers
matthias
from somoclu.
I am glad it finally works.
Peter
from somoclu.