cumf / cumf_als Goto Github PK

CUDA Matrix Factorization Library with Alternating Least Square (ALS)

License: Apache License 2.0

Makefile 4.79% Cuda 71.05% C 14.03% Shell 0.96% C++ 5.38% Jupyter Notebook 1.02% Python 2.76%

gpu matrix-factorization als machine machine-learning cuda

cumf_als's Issues

Issue: ./als_tf.so: undefined symbol: _ZTIN10tensorflow8OpKernelE

Hi, I am trying to run the TensorFlow example.
I first ran the build script, which completes without errors and creates als_tf.so and libALS.so in the current directory, but when I execute "cumf_as_tensorflow_ops_test.py" it fails with:

tensorflow.python.framework.errors.NotFoundError: libALS.so: cannot open shared object file: No such file or directory

I noticed that somebody mentioned this problem in a previously closed issue, "shall we provide a python interface? #1", so I followed those instructions and copied the libraries to /usr/lib:

cp *.so /usr/lib

and then running it generates the following error:

File "cumf_as_tensorflow_ops_test.py", line 25, in <module>
als_module = tf.load_op_library(lib_path)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: ./als_tf.so: undefined symbol: _ZTIN10tensorflow8OpKernelE

Can you please give me a clue what went wrong?
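
For readers hitting the same message: the missing symbol is an Itanium C++ ABI mangled name, and decoding it shows which C++ entity the loader could not find. The sketch below is a minimal decoder for this simple case only (a special prefix followed by a nested name); `demangle_simple` is a hypothetical helper written for illustration, not part of cuMF or TensorFlow.

```python
import re


def demangle_simple(symbol):
    """Decode a small subset of Itanium C++ ABI mangled names.

    Supports an optional special prefix (TI, TV, TS) followed by a nested
    name made of length-prefixed identifiers, e.g. N10tensorflow8OpKernelE.
    """
    prefixes = {
        "TI": "typeinfo for ",
        "TV": "vtable for ",
        "TS": "typeinfo name for ",
    }
    if not symbol.startswith("_Z"):
        raise ValueError("not an Itanium-mangled name")
    rest = symbol[2:]
    label = ""
    if rest[:2] in prefixes:
        label = prefixes[rest[:2]]
        rest = rest[2:]
    if not (rest.startswith("N") and rest.endswith("E")):
        raise ValueError("only simple nested names are handled in this sketch")
    rest = rest[1:-1]
    parts = []
    while rest:
        match = re.match(r"\d+", rest)
        if match is None:
            raise ValueError("unsupported construct: " + rest)
        length = int(match.group())
        start = match.end()
        parts.append(rest[start:start + length])
        rest = rest[start + length:]
    return label + "::".join(parts)


# prints: typeinfo for tensorflow::OpKernel
print(demangle_simple("_ZTIN10tensorflow8OpKernelE"))
```

The unresolved symbol is therefore the type information of the `tensorflow::OpKernel` class. A plausible reading, offered as an assumption rather than a confirmed diagnosis, is that als_tf.so was built without linking against the TensorFlow library that defines this class, or against a different TensorFlow version than the one installed for Python.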

Yahoo Music dataset

Hi,
I downloaded the Yahoo Music dataset and am trying to run CuMF on it. I wrote my own script to sort and load the dataset, but the sorting takes too long and the process gets killed. Do you happen to have the binaries, or an efficient script to create them?

Thanks for your time.

sgd?

It runs very fast, great work! I just wonder whether you have tried SGD, and how much it could be optimized.

shall we provide a python interface?

A Python interface may be slower, but it would be easier to use.

MemoryError

Hi,
We are trying to run your code on our machine as is.
Our machine has 16 GB of RAM, of which around 10 GB is free.
Even so, we get a MemoryError on the following line:
train_j,train_i,train_rating = np.loadtxt(train_data_file,dtype=np.int32,skiprows=3,unpack=True)

Can you give us pointers on why this happens and how we could resolve it?
Thanks
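
One possible workaround, sketched under assumptions: `np.loadtxt` may hold a large intermediate representation of the parsed text before it builds the final arrays, so streaming the file line by line into compact typed arrays can lower the peak memory. The helper name `load_triples`, the whitespace-separated "column row rating" layout, and integer ratings are assumptions taken from the `loadtxt` call above, not a confirmed description of the data file.

```python
from array import array


def load_triples(path, skiprows=3):
    """Stream 'col row rating' lines into three compact integer arrays.

    Typecode "i" is a C int, which is 4 bytes on common platforms, so each
    value costs far less than a Python int object held in a list.
    """
    cols, rows, ratings = array("i"), array("i"), array("i")
    with open(path) as handle:
        for line_number, line in enumerate(handle):
            # Skip the header lines and any blank lines.
            if line_number < skiprows or not line.strip():
                continue
            j, i, rating = line.split()[:3]
            cols.append(int(j))
            rows.append(int(i))
            ratings.append(int(rating))
    return cols, rows, ratings
```

If NumPy arrays are needed afterwards, `np.frombuffer(cols, dtype=np.int32)` should be able to wrap such an array without copying, provided a C int is 4 bytes on the machine in question.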

Extracting Outputs

How and where do I extract X and Theta after convergence? Are they written to a file on the host?

als.cu(205): error: more than one instance of overloaded function "isnan" matches the argument list

Hi.
I am using Linux.
I uncommented the line #define SURPASS_NAN in als.cu
and built with
make clean build

But I encountered the following error messages:

rm -f host_utilities.o device_utilities.o als.o main main.o cg.o
/usr/local/cuda/bin/nvcc -ccbin g++ -m64 -std=c++11 -Xcompiler -DADD_ -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35 -lineinfo -o host_utilities.o -c host_utilities.cpp
/usr/local/cuda/bin/nvcc -ccbin g++ -m64 -std=c++11 -Xcompiler -DADD_ -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35 -lineinfo -o device_utilities.o -c device_utilities.cu
/usr/local/cuda/bin/nvcc -ccbin g++ -m64 -std=c++11 -Xcompiler -DADD_ -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35 -lineinfo -o cg.o -c cg.cu
/usr/local/cuda/bin/nvcc -ccbin g++ -m64 -std=c++11 -Xcompiler -DADD_ -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35 -lineinfo -o als.o -c als.cu
als.cu(205): error: more than one instance of overloaded function "isnan" matches the argument list:
function "isnan(float)"
function "std::isnan(float)"
argument types are: (float)

als.cu(205): error: more than one instance of overloaded function "isnan" matches the argument list:
function "isnan(float)"
function "std::isnan(float)"
argument types are: (float)

2 errors detected in the compilation of "/tmp/tmpxft_00006c73_00000000-7_als.cpp1.ii".

Could you please help me?
Thanks

identifier "cusparseScsrmm2" is undefined

When I use the make command to build the project, an 'identifier "cusparseScsrmm2" is undefined' problem occurs. How do I solve it?

Provide a direct link to prepared netflix dataset

The readme provides this link:

https://ibm.box.com/s/5vmh77up8reodvihiq0ri66jltg9h4uh

Unfortunately, it is difficult to fetch this data without invoking a web browser. Can you provide a direct link to the file that would be suitable to pass to wget or curl, for example?

Thanks!

Illegal memory access when k=100 for Netflix dataset

Hi,
Thanks for sharing the code.
I am using a K80 machine. The code works fine for k=10,20,..40, but not for anything above 60. Did you ever encounter that problem?
[ss@gpu04 CuMF]$ ./main 70 .058 1
F = 70, lambda = .058, THETA_BATCH = 1
*starting loading training and testing sets to host.
***start allocating memory on GPU...
***_start copying memory to GPU...
CUDA Error:
File = als.cu
Line = 844
_Reason = out of memory
////////////////////////////////////////
[ss@gpu04 CuMF]$ ./main 100 .058 1
F = 100, lambda = .058, THETA_BATCH = 1
*starting loading training and testing sets to host.
***start allocating memory on GPU...
***_start copying memory to GPU...
_CUDA failure als.cu:869: 'an illegal memory access was encountered'

Thanks,
Israt
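
A back-of-the-envelope estimate may help explain why larger ranks fail. The sketch below assumes, without confirmation from this thread, that the solver keeps one f x f single-precision matrix per row of the batch being solved. The function `normal_matrix_bytes`, its parameters, and the figure of 480189 users for the Netflix Prize data are introduced here for illustration only.

```python
def normal_matrix_bytes(rows, f, batches=1, bytes_per_value=4):
    """Bytes needed to hold one f x f matrix per row of a single batch.

    Assumption: the rows are split evenly into `batches` and only one
    batch of matrices is resident on the GPU at a time.
    """
    rows_per_batch = -(-rows // batches)  # ceiling division
    return rows_per_batch * f * f * bytes_per_value


NETFLIX_USERS = 480189  # assumption: user count of the Netflix Prize data

for f in (40, 70, 100):
    gib = normal_matrix_bytes(NETFLIX_USERS, f) / 2**30
    print(f"f={f}: {gib:.1f} GiB for the per-row matrices in one batch")
```

Under that assumption the buffer grows quadratically with f, so a configuration that fits at f=40 can exceed the memory of a single GPU at f=70 or f=100. Splitting the rows into more batches would reduce the peak proportionally.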

Dataset for hugewiki

Thanks for the prompt resolution of the previous query. Could you also point out the URL for the hugewiki dataset?

Does CuMF support implicit feedback data?

Missing datasets

Hi, I tried creating the dataset as described in prepare_netflix_data.py.
However, the URL http://www.select.cs.cmu.edu/code/graphlab/datasets/ seems to be invalid.
Could you please let me know a workaround?

Input data format?

It's not very clear what the input data format is.

It seems to me that there are at least several input files used by main.cu:

"./netflix/R_test_coo.data.bin"
"./netflix/R_test_coo.row.bin"
"./netflix/R_test_coo.col.bin"
"./yahoo/yahoo_R_test_coo.data.bin"
"./yahoo/yahoo_R_test_coo.row.bin"
"./yahoo/yahoo_R_test_coo.col.bin"
"./netflix/R_train_csr.data.bin"
"./netflix/R_train_csr.indptr.bin"
"./netflix/R_train_csr.indices.bin"
"./yahoo/yahoo_R_train_csr.data.bin"
"./yahoo/yahoo_R_train_csr.indptr.bin"
"./yahoo/yahoo_R_train_csr.indices.bin"
"./netflix/R_train_csc.data.bin"
"./netflix/R_train_csc.indices.bin"
"./netflix/R_train_csc.indptr.bin"
"./yahoo/yahoo_R_train_csc.data.bin"
"./yahoo/yahoo_R_train_csc.indices.bin"
"./yahoo/yahoo_R_train_csc.indptr.bin"
"./netflix/R_train_coo.row.bin"
"./yahoo/yahoo_R_train_coo.row.bin"

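The file names above follow the usual sparse-matrix vocabulary: COO stores (row, col, data) triplets, while CSR and CSC store data, indices and indptr arrays. As a minimal sketch of how those arrays relate, the code below converts COO triplets to CSR and dumps an array as raw binary. The helper names, the raw header-less layout, and the element types (float32 for data, int32 for indices and indptr) are assumptions inferred from the names, not confirmed by the thread.

```python
from array import array


def coo_to_csr(rows, cols, vals, n_rows):
    """Convert COO triplets to CSR arrays (data, indices, indptr)."""
    # Count the entries of each row, then turn counts into offsets.
    indptr = [0] * (n_rows + 1)
    for r in rows:
        indptr[r + 1] += 1
    for r in range(n_rows):
        indptr[r + 1] += indptr[r]

    data = [0.0] * len(vals)
    indices = [0] * len(vals)
    next_slot = indptr[:-1]  # copy: next free position within each row
    for r, c, v in zip(rows, cols, vals):
        slot = next_slot[r]
        data[slot], indices[slot] = v, c
        next_slot[r] += 1
    return data, indices, indptr


def write_raw(path, typecode, values):
    """Dump values as a raw binary array ("f" = float, "i" = C int)."""
    with open(path, "wb") as handle:
        array(typecode, values).tofile(handle)
```

The CSC arrays can be obtained with the same function by swapping the roles of rows and columns, for example `coo_to_csr(cols, rows, vals, n_cols)`.
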
Unable to run MovieLens10m Dataset

Hi Mr Tan,
We downloaded the MovieLens dataset from http://grouplens.org/datasets/movielens/10m/
We divided the dataset into training and test sets.

Following is the metadata for that.

m,n = 71567, 65133
nnz in train = 9301274

nnz in test = 698780

(For your information: we were able to run your code on the Netflix dataset successfully.)
We tried running your code on the above dataset, but we get the following output:
./main 10 0.1 1
F = 10, lambda = 0.1, THETA_BATCH = 1
**starting loading training and testing sets to host.
***parameters: m: 71567, n: 65133, f: 10, nnz: 9301274
***start allocating memory on GPU...
******start copying memory to GPU...
--------- Train RMSE in iter 0: nan
--------- Test RMSE in iter 0: nan
--------- Train RMSE in iter 1: nan
--------- Test RMSE in iter 1: nan
--------- Train RMSE in iter 2: nan
--------- Test RMSE in iter 2: nan
--------- Train RMSE in iter 3: nan
--------- Test RMSE in iter 3: nan
--------- Train RMSE in iter 4: nan
--------- Test RMSE in iter 4: nan
--------- Train RMSE in iter 5: nan
--------- Test RMSE in iter 5: nan
--------- Train RMSE in iter 6: nan
--------- Test RMSE in iter 6: nan
--------- Train RMSE in iter 7: nan
--------- Test RMSE in iter 7: nan
--------- Train RMSE in iter 8: nan
--------- Test RMSE in iter 8: nan
--------- Train RMSE in iter 9: nan
--------- Test RMSE in iter 9: nan

doALS takes seconds: 3.000 for F= 10

ALS Done.
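
One thing worth checking, offered as a hypothesis rather than a confirmed cause: n = 65133 appears to be the largest movie ID in MovieLens 10M, not the number of distinct movies, which the dataset's README gives as 10681. If raw IDs are used as column indices, most columns have no ratings at all, and a solver that does not guard against empty rows or columns can end up with singular systems and NaN values. The sketch below remaps IDs to contiguous indices; `compact_ids` and `empty_slots` are hypothetical helpers.

```python
def compact_ids(ids):
    """Map arbitrary integer IDs to contiguous indices 0..k-1.

    Returns the remapped list and the mapping, so that the same mapping
    can be applied to both the training and the test set.
    """
    mapping = {original: new for new, original in enumerate(sorted(set(ids)))}
    return [mapping[i] for i in ids], mapping


def empty_slots(indices, size):
    """Count positions in range(size) that never occur in indices."""
    return size - len(set(indices))
```

The mapping should be built from the union of the training and test IDs, so that both sets agree on the column numbering.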

all elements in XTHost and ThetaTHost are all Nan

I ran it on the ml10M dataset, with #define SURPASS_NAN enabled to avoid the NaN test error.

But all elements in XTHost and ThetaTHost are all NaN. Could you help me figure it out?

Thanks very much.
Best.

When rank is 70, it does not converge ······

Hi:
CuMF is very efficient; the results are amazing. But I have three questions. First, the RMSE does not converge when lambda is 0.05 and the rank is 70, which is very strange. Second, should I partition the matrix R into blocks and store them in CSR and CSC format before running SU-ALS? Third, could you send me the code of SU-ALS? I am very interested in this algorithm.

It causes a 'core dump' on GTX950m(4G) with Netflix data

envy@ub1404:~/os_pri/github/CuMF$ tree
.
├── als.cu
├── als.h
├── als.o
├── host_utilities.cpp
├── host_utilities.h
├── host_utilities.o
├── images
│   ├── mf.png
│   └── spark-gpu.png
├── LICENSE
├── main
├── main.cu
├── main.o
├── Makefile
├── netflix_mme
├── netflix_mm.txt
├── print-test-result.sh
├── README.md
├── scripts
│   ├── prepare_input.ipynb
│   ├── R_test_coo.col.bin
│   ├── R_test_coo.data.bin
│   └── R_test_coo.row.bin
├── tensorflow
│   ├── als_tf.cc
│   ├── build_tf_op.sh
│   ├── cumf_as_tensorflow_ops_test.ipynb
│   └── cumf_as_tensorflow_ops_test.py
├── test_als.sh
└── yknote_build_debug_log

3 directories, 27 files
envy@ub1404:~/os_pri/github/CuMF$
envy@ub1404:~/os_pri/github/CuMF$
envy@ub1404:~/os_pri/github/CuMF$ ./main 100 0.058 3
F = 100, lambda = 0.058, THETA_BATCH = 3
*******starting loading training and testing sets to host.

loading COO...
Unable to open file!
loading CSR...
Unable to open file!
loading CSC...
Unable to open file!
loading COO Row...
Segmentation fault (core dumped)
envy@ub1404:~/os_pri/github/CuMF$
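
The "Unable to open file!" messages suggest that the input files were not found. The tree listing above shows the R_test_coo files under scripts/ and no netflix/ directory, whereas the "Input data format?" issue on this page lists paths such as ./netflix/R_test_coo.data.bin. A minimal pre-flight check, assuming the program reads those relative paths from its working directory, could look like this (`missing_inputs` is a hypothetical helper, and the path list is copied from that issue):

```python
import os

# Netflix input paths as listed in the "Input data format?" issue.
NETFLIX_INPUTS = [
    "netflix/R_test_coo.data.bin",
    "netflix/R_test_coo.row.bin",
    "netflix/R_test_coo.col.bin",
    "netflix/R_train_csr.data.bin",
    "netflix/R_train_csr.indptr.bin",
    "netflix/R_train_csr.indices.bin",
    "netflix/R_train_csc.data.bin",
    "netflix/R_train_csc.indices.bin",
    "netflix/R_train_csc.indptr.bin",
    "netflix/R_train_coo.row.bin",
]


def missing_inputs(base_dir, relative_paths=NETFLIX_INPUTS):
    """Return the expected input files that do not exist under base_dir."""
    return [
        path for path in relative_paths
        if not os.path.isfile(os.path.join(base_dir, path))
    ]


if __name__ == "__main__":
    for path in missing_inputs("."):
        print("missing:", path)
```

Running the check from the directory that contains ./main before starting the solver would show whether the segmentation fault follows from absent inputs rather than from the GPU.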
