
neural_collaborative_filtering's Introduction

Neural Collaborative Filtering

This is our implementation for the paper:

Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu and Tat-Seng Chua (2017). Neural Collaborative Filtering. In Proceedings of WWW '17, Perth, Australia, April 03-07, 2017.

This repository implements three collaborative filtering models: Generalized Matrix Factorization (GMF), Multi-Layer Perceptron (MLP), and Neural Matrix Factorization (NeuMF). To target the models at implicit feedback and the ranking task, we optimize them using log loss with negative sampling.
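Concretely, with $y_{ui} = 1$ for an observed interaction and $y_{ui} = 0$ for a sampled negative, the training objective is the binary cross-entropy (log loss) over the observed interactions $\mathcal{Y}$ and the sampled negative instances $\mathcal{Y}^-$:

$$L = -\sum_{(u,i) \in \mathcal{Y} \cup \mathcal{Y}^-} \Big[ y_{ui} \log \hat{y}_{ui} + (1 - y_{ui}) \log (1 - \hat{y}_{ui}) \Big]$$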

Please cite our WWW'17 paper if you use our code. Thanks!

Author: Dr. Xiangnan He (http://www.comp.nus.edu.sg/~xiangnan/)

Environment Settings

We use Keras with Theano as the backend.

  • Keras version: '1.0.7'
  • Theano version: '0.8.0'

Examples of how to run the code.

The command-line options are documented in the code (see the parse_args function in each script).

Run GMF:

python GMF.py --dataset ml-1m --epochs 20 --batch_size 256 --num_factors 8 --regs [0,0] --num_neg 4 --lr 0.001 --learner adam --verbose 1 --out 1

Run MLP:

python MLP.py --dataset ml-1m --epochs 20 --batch_size 256 --layers [64,32,16,8] --reg_layers [0,0,0,0] --num_neg 4 --lr 0.001 --learner adam --verbose 1 --out 1

Run NeuMF (without pre-training):

python NeuMF.py --dataset ml-1m --epochs 20 --batch_size 256 --num_factors 8 --layers [64,32,16,8] --reg_mf 0 --reg_layers [0,0,0,0] --num_neg 4 --lr 0.001 --learner adam --verbose 1 --out 1

Run NeuMF (with pre-training):

python NeuMF.py --dataset ml-1m --epochs 20 --batch_size 256 --num_factors 8 --layers [64,32,16,8] --num_neg 4 --lr 0.001 --learner adam --verbose 1 --out 1 --mf_pretrain Pretrain/ml-1m_GMF_8_1501651698.h5 --mlp_pretrain Pretrain/ml-1m_MLP_[64,32,16,8]_1501652038.h5

Note on tuning NeuMF: in our experience, with a small number of predictive factors, running NeuMF without pre-training can outperform GMF and MLP. With a large number of predictive factors, pre-training NeuMF can yield better performance (you may need to tune the regularization of GMF and MLP).

Docker Quickstart

The Docker quickstart guide can be used to evaluate the models quickly.

Install Docker Engine

Build a keras-theano Docker image:

docker build --no-cache=true -t ncf-keras-theano .

Examples of how to run the code with Docker.

Run the docker image with a volume (Run GMF):

docker run --volume=$(pwd):/home ncf-keras-theano python GMF.py --dataset ml-1m --epochs 20 --batch_size 256 --num_factors 8 --regs [0,0] --num_neg 4 --lr 0.001 --learner adam --verbose 1 --out 1

Run the docker image with a volume (Run MLP):

docker run --volume=$(pwd):/home ncf-keras-theano python MLP.py --dataset ml-1m --epochs 20 --batch_size 256 --layers [64,32,16,8] --reg_layers [0,0,0,0] --num_neg 4 --lr 0.001 --learner adam --verbose 1 --out 1

Run the docker image with a volume (Run NeuMF without pre-training):

docker run --volume=$(pwd):/home ncf-keras-theano python NeuMF.py --dataset ml-1m --epochs 20 --batch_size 256 --num_factors 8 --layers [64,32,16,8] --reg_mf 0 --reg_layers [0,0,0,0] --num_neg 4 --lr 0.001 --learner adam --verbose 1 --out 1

Run the docker image with a volume (Run NeuMF with pre-training):

docker run --volume=$(pwd):/home ncf-keras-theano python NeuMF.py --dataset ml-1m --epochs 20 --batch_size 256 --num_factors 8 --layers [64,32,16,8] --num_neg 4 --lr 0.001 --learner adam --verbose 1 --out 1 --mf_pretrain Pretrain/ml-1m_GMF_8_1501651698.h5 --mlp_pretrain Pretrain/ml-1m_MLP_[64,32,16,8]_1501652038.h5
  • Note: If you are using zsh and get an error like zsh: no matches found: [64,32,16,8], quote the array parameters with single quotation marks, e.g. --layers '[64,32,16,8]'.

Dataset

We provide two processed datasets: MovieLens 1 Million (ml-1m) and Pinterest (pinterest-20).

train.rating:

  • Training file.
  • Each line is a training instance: userID\t itemID\t rating\t timestamp (if available)

test.rating:

  • Test file (positive instances).
  • Each line is a test instance: userID\t itemID\t rating\t timestamp (if available)

test.negative:

  • Test file (negative instances).
  • Each line corresponds to a line in test.rating and contains 99 negative samples.
  • Each line has the format: (userID,itemID)\t negativeItemID1\t negativeItemID2 ... (a parsing sketch follows this list)
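A minimal Python sketch for reading these files, assuming exactly the tab-separated formats described above (the function names and behavior here are illustrative, not part of the repo):

import numpy as np

def load_ratings(path):
    # Each line: userID \t itemID \t rating \t timestamp (timestamp optional)
    ratings = []
    with open(path) as f:
        for line in f:
            fields = line.strip().split("\t")
            ratings.append((int(fields[0]), int(fields[1])))  # (user, item)
    return ratings

def load_negatives(path):
    # Each line: (userID,itemID) \t neg1 \t neg2 ... (99 negatives per test line)
    negatives = []
    with open(path) as f:
        for line in f:
            fields = line.strip().split("\t")
            negatives.append([int(x) for x in fields[1:]])  # skip the (u,i) pair
    return negatives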

Last Update Date: December 23, 2018

neural_collaborative_filtering's People

Contributors

cakirmuha, hexiangnan


neural_collaborative_filtering's Issues

Negative samples purpose

Hi, I want to train your network on a new dataset, but I don't understand the purpose of the negative samples, so I have no idea how to generate them. Could you please explain?

Thanks

negative instances

Hi,

I was wondering how you generated the negative instances for the Pinterest dataset.

Thanks.

What is a nonlinear interaction?

I understand why a linear function alone does not work. However, what kind of correlation can be captured by a non-linear function (e.g., ReLU)? Could you give a theoretical explanation?

What ARE negative samples?

I couldn't understand the meaning of negative samples. I thought we were trying to predict the rating; if so, why choose a binary classification method? Also, how many negative samples should we use: 99 per user, or 99 for the whole dataset? Thank you.

Neighbor-based code

Could you please also provide neighbor-based (Item KNN) code in this repo ?

What criteria did you use for test.negative?

In ml-1m.test.negative, did you randomly select 99 items per user, or did you select 99 popular items?

What criteria did you use to choose the 99 negative feedbacks?

Parameters

Thank you for publishing this awesome code!

Please let me know the parameters (lr, epochs, etc.) you used when pre-training this model.

Thank you

one-hot

Where can I see the input data represented as one-hot vectors?

raise AttributeError(attr + " not found")

How do I deal with this problem? Thank you very much.

Traceback (most recent call last):
in <module>
    user_input, item_input, labels = get_train_instances(train, num_negatives)
in get_train_instances
    while train.has_key((u, j)):
in __getattr__
    raise AttributeError(attr + " not found")
AttributeError: has_key not found
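For anyone hitting this on Python 3: dict.has_key was removed in Python 3, and the scipy-style error above suggests train is a sparse matrix (dok_matrix) whose __getattr__ raises when has_key is looked up. A minimal fix, assuming the surrounding sampling loop from the repo, is to use the in operator instead:

# Python 2 (original):  while train.has_key((u, j)):
# Python 3 compatible:
while (u, j) in train:
    j = np.random.randint(num_items)  # resample; num_items assumed from context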

The program hangs when num_thread is set to more than 1

When I set num_thread to more than 1, e.g. 2, the program hangs forever after printing the Init: HR and NDCG line. Is there anything wrong with the multiprocessing version of the evaluate_model() function? The code works perfectly when num_thread = 1.

How do I actually make predictions?

Sorry for the naive question; I'm just getting started with machine learning and was wondering how to actually make predictions. When I run the code, only training takes place, even in the pre-training case.

So how do I actually make a prediction?
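Not an official answer, but a minimal sketch of how one could score and rank items for a single user with the trained Keras model (model and num_items are assumed to come from the training script; everything else is illustrative):

import numpy as np

user_id = 0
candidates = np.arange(num_items)              # item IDs to score for this user
users = np.full_like(candidates, user_id)      # repeat the user ID per candidate
scores = model.predict([users, candidates], batch_size=256).flatten()
top_10 = candidates[np.argsort(-scores)[:10]]  # the 10 highest-scored items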

val_loss

Hello, Mr. He:
I split off one-tenth of your training set as a cross-validation set. After training starts, the validation error grows from the beginning. Shouldn't it gradually decline?

NDCG

Hello!
I have a question regarding line 85 of the file evaluate.py:

return math.log(2) / math.log(i+2)

It seems to me that it should be:

return math.log(2) / math.log(i+1)
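For reference, both forms are consistent with the standard definition, depending on indexing: for a hit at 1-based rank $p$, the DCG term is $1/\log_2(p+1)$. If $i$ is a 0-based index (so $p = i + 1$), then

$$\frac{1}{\log_2(p+1)} = \frac{\log 2}{\log(i+2)},$$

which matches the code; $\log(i+1)$ would be correct only if $i$ were a 1-based rank.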

Input data syntax and size

Thanks for publishing your code! I have two questions regarding the input data:

  1. Must the userID and itemID be contiguous indices, or can they be arbitrary IDs (as described in the Readme)? Must they be numeric only, or can they be alphanumeric, such as GUIDs (https://pt.wikipedia.org/wiki/Identificador_%C3%BAnico_universal)?

  2. What is the maximum number of interactions that your implementation handles well? Can it deal with 10^10 interactions and > 95% sparsity of the interaction matrix on a system with 1 TB of memory and 72 CPU cores?

Item id

The item IDs differ from the original item IDs in the MovieLens dataset.

The modification is not simply subtracting one, as is done for the user IDs.

Your evaluation method is unreasonable

It's unreasonable to blend the test items with negative samples. It contradicts the rule that your evaluation input shouldn't have knowledge of the test data. I think your method is a cheat that sharply narrows down the scope of the ground truth.
And it's UNFAIR to compare your results with those of eALS and BPR.

Detailed evaluation table

Your paper, Neural Collaborative Filtering, only shows line charts for the different top-K evaluations (the HR@K and NDCG@K figures), so I can't read off the exact results.

Would you please provide the exact number for each HR@K and NDCG@K?

Thank you

Mean Average Precision

The code reports NDCG and hit rate (HR). Can you please advise on how to calculate MAP@k, or suggest a modification to the code that computes MAP@k, where k can be varied?
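Not from the authors, but note that in this evaluation protocol each user has exactly one ground-truth test item, so AP@k reduces to the reciprocal of the item's rank if it appears in the top k. A sketch under that assumption (all names illustrative):

import numpy as np

def average_precision_at_k(ranked_items, gt_item, k):
    # With a single relevant item, AP@k = 1/rank if it is in the top k, else 0.
    topk = list(ranked_items[:k])
    return 1.0 / (topk.index(gt_item) + 1) if gt_item in topk else 0.0

def map_at_k(all_ranked, all_gt, k):
    return np.mean([average_precision_at_k(r, g, k)
                    for r, g in zip(all_ranked, all_gt)])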

Update code with new keras version

I am using Keras 2.2.1 and cannot run the code.
I saw that you used Keras 1; how can I update the code so that it runs on Keras 2?
Thanks for the help.
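There is no official port here, but the Keras 1 idioms used in this repo map to Keras 2 roughly as follows (a sketch: the Keras 2 argument names below exist in that API; the surrounding variables are illustrative):

from keras.layers import Embedding, multiply, concatenate
from keras.initializers import RandomNormal
from keras.regularizers import l2

num_users, latent_dim, reg = 6040, 8, 0  # example values

# Keras 1: Embedding(..., init=init_normal, W_regularizer=l2(reg), input_length=1)
embedding = Embedding(input_dim=num_users, output_dim=latent_dim,
                      embeddings_initializer=RandomNormal(stddev=0.01),
                      embeddings_regularizer=l2(reg), input_length=1)

# Keras 1: merge([a, b], mode='mul') / merge([a, b], mode='concat')
# Keras 2: multiply([a, b])          / concatenate([a, b])

# Keras 1: model.fit(..., nb_epoch=1)
# Keras 2: model.fit(..., epochs=1)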

I can't get a good result in predicting ratings

I use MSE loss instead of cross-entropy loss to predict ratings; the inputs are userId and itemId, and the target is an integer between 0 and 10. I rescale it to 0, 0.1, 0.2, ..., 1.0 and use a sigmoid as the last layer, but the outputs are all very similar. Why? Please help me.

How to train model with larger dataset?

Hi, Xiangnan:
I've read your paper with great interest. In your implementation, the input dimension of the user embedding is the number of users. This works when that number is not very large, but when it grows larger (to a million, or a billion) it becomes hard to realize. Moreover, constructing (user, item) pairs makes training easy, but when user × item >> 1 billion, how do you predict with NeuMF?

Why nb_epoch=1?

Hey guys, I was looking into your code and couldn't figure out why you use nb_epoch=1 in the model.fit function.
Is that so it will receive new negative examples on the next iteration of the for loop? (see the sketch below)
Thanks for your attention.
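For context, a sketch of the training loop structure (paraphrased; variable names are assumptions based on the fragments quoted elsewhere in these issues): negatives are resampled at each outer iteration, and fit runs for a single pass over that freshly sampled set.

for epoch in range(num_epochs):
    # draw a fresh set of negative instances for this pass
    user_input, item_input, labels = get_train_instances(train, num_negatives)
    model.fit([np.array(user_input), np.array(item_input)], np.array(labels),
              batch_size=batch_size, nb_epoch=1, shuffle=True)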

`evaluation_threads` does not take effect

I tried evaluation_threads = 56, but the time spent on each epoch is much the same:

evaluation_threads = 1

Iteration 0 [112.8 s]: HR = 0.4967, NDCG = 0.2792, loss = 0.3568 [2.6 s]
Iteration 1 [25.0 s]: HR = 0.5631, NDCG = 0.3147, loss = 0.3129 [2.5 s]
Iteration 2 [24.1 s]: HR = 0.5887, NDCG = 0.3314, loss = 0.2943 [2.5 s]
Iteration 3 [23.8 s]: HR = 0.6166, NDCG = 0.3512, loss = 0.2832 [2.5 s]
Iteration 4 [23.6 s]: HR = 0.6215, NDCG = 0.3562, loss = 0.2764 [2.5 s]

vs.

evaluation_threads = 56

Iteration 0 [26.1 s]: HR = 0.4965, NDCG = 0.2790, loss = 0.3659 [1.2 s]
Iteration 1 [23.5 s]: HR = 0.5618, NDCG = 0.3141, loss = 0.3094 [1.2 s]
Iteration 2 [23.1 s]: HR = 0.5952, NDCG = 0.3352, loss = 0.2923 [4.5 s]
Iteration 3 [24.0 s]: HR = 0.6075, NDCG = 0.3474, loss = 0.2807 [3.3 s]
Iteration 4 [24.2 s]: HR = 0.6174, NDCG = 0.3528, loss = 0.2756 [4.3 s]

Question - negative sampling

Am I correct that the negative examples can change each epoch, and that there is no safeguard preventing a negative example from actually being a positive example for that user? (see the sketch below)
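On the safeguard question: the get_train_instances fragment quoted in the has_key issue above (while train.has_key((u, j)):) suggests a drawn negative is rejected and redrawn whenever (u, j) is a training positive. A sketch of that logic (everything beyond the quoted line is an assumption):

def get_train_instances(train, num_negatives):
    user_input, item_input, labels = [], [], []
    for (u, i) in train.keys():
        user_input.append(u); item_input.append(i); labels.append(1)  # positive
        for _ in range(num_negatives):
            j = np.random.randint(num_items)  # num_items assumed global
            while (u, j) in train:            # reject known training positives
                j = np.random.randint(num_items)
            user_input.append(u); item_input.append(j); labels.append(0)
    return user_input, item_input, labels

Items unseen in training can still be held-out test positives, so those may occasionally be sampled as negatives.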

Input shape (1, ) but ...

Hi, I have more questions about your code.

user_input, item_input = Input(shape=(1,))

But in hist = model.fit([np.array(user_input), np.array(item_input)], ...), np.array(user_input) has shape (4970845,), and the mini-batch input data has shape (batch_size,).

I think the mini-batch shape should be (batch_size, 1); why does your code work with (batch_size,)?

  • Example
    Your code:
    np.array(user_input) = np.array([0,0,0,1,1,1,1,2,2,2,3,3,4,4,5,5....]) --> (4970845,)
    mini-batch input shape --> (batch_size,)

    What I expect:
    np.array(user_input) = np.array([[0],[0],[0],[1],[1],[1],[1],[2],[2]....]) --> (4970845, 1)
    mini-batch input shape --> (batch_size, 1)

Embedding Layer

Please explain how this works:
MF_Embedding_User = Embedding(input_dim=num_users, output_dim=latent_dim, name='user_embedding', init=init_normal, W_regularizer=l2(regs[0]), input_length=1)
user_latent = Flatten()(MF_Embedding_User(user_input))
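A shape walk-through may help. This minimal sketch (Keras 1 syntax, as in the repo; the example dimensions are illustrative) shows the tensor shape at each step for batch size b:

from keras.layers import Input, Embedding, Flatten

num_users, latent_dim = 6040, 8                      # example values

user_input = Input(shape=(1,), dtype='int32')        # (b, 1): integer user IDs
MF_Embedding_User = Embedding(input_dim=num_users,   # lookup table of shape
                              output_dim=latent_dim, # (num_users, latent_dim)
                              input_length=1)
embedded = MF_Embedding_User(user_input)             # (b, 1, latent_dim): one row per ID
user_latent = Flatten()(embedded)                    # (b, latent_dim): drop length-1 axis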

the number of negative instances?

Hi, can anybody tell me why you select just 4 negative instances? I think that is too small a proportion. Does it influence training? Thanks!
'--num_neg', type=int, default=4

while executing GMF.py

Hello,

I am getting the following error while running GMF.py; kindly guide me:

ValueError: Only call sigmoid_cross_entropy_with_logits with named arguments (labels=..., logits=..., ...)

Why do the predicted ratings contain negative numbers?

I think the predicted rating should be a positive number, but the result of

logits = tf.keras.layers.Dense( # sigmoid
      1, activation=None, kernel_initializer="lecun_uniform",
      name=movielens.RATING_COLUMN)(predict_vector)

has some negative numbers.
Please help me, thanks!
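One observation (not the authors' reply): with activation=None, that Dense layer outputs unbounded logits, which is why negative values appear; applying a sigmoid, as the inline comment hints, would map them into (0, 1):

probability = tf.sigmoid(logits)  # squashes raw logits into the (0, 1) range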

Pinterest dataset

Hi,

By any chance, are you able to provide the train-test splits for the Pinterest dataset used in the paper?

Would really appreciate it!

Thanks!

One-hot encoding vs. embedding

I noticed that in the paper the input is one-hot encoded user/item vectors connected to a fully connected layer to obtain the user/item latent vectors, but in your Keras code you apply an embedding directly to the user/item IDs. Can you please explain the difference? Thanks in advance!
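For anyone with the same question: a fully connected layer applied to a one-hot vector simply selects one row of its weight matrix, which is exactly what an embedding lookup does, just without materializing the one-hot vector. A tiny NumPy sketch (all names illustrative):

import numpy as np

num_users, latent_dim = 5, 3
W = np.random.randn(num_users, latent_dim)  # weights of the fully connected layer

one_hot = np.eye(num_users)[2]              # one-hot vector for user 2
assert np.allclose(one_hot @ W, W[2])       # dense(one_hot) == embedding row 2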

What is num_neg?

Hi Henry,
I couldn't understand the purpose of num_neg.
Could you please enlighten me?
Thank you.

version

What versions of Keras and Python are required?

Problems when running your code

Connected to pydev debugger (build 182.4323.49)
Using Theano backend.
MLP arguments: Namespace(batch_size=256, dataset='ml-1m', epochs=100, layers='[64,32,16,8]', learner='adam', lr=0.001, num_neg=4, out=1, path='Data/', reg_layers='[0,0,0,0]', verbose=1)
Load data done [46.5 s]. #user=6040, #item=3706, #train=994169, #test=6040
Traceback (most recent call last):
  File "/home/zxj/software/pycharm-2018.2.3/helpers/pydev/pydevd.py", line 1664, in <module>
    main()
  File "/home/zxj/software/pycharm-2018.2.3/helpers/pydev/pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/zxj/software/pycharm-2018.2.3/helpers/pydev/pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/zxj/software/pycharm-2018.2.3/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/zxj/PycharmProjects/neural_collaborative_filtering/MLP.py", line 136, in <module>
    model = get_model(num_users, num_items, layers, reg_layers)
  File "/home/zxj/PycharmProjects/neural_collaborative_filtering/MLP.py", line 72, in get_model
    user_latent = Flatten()(MLP_Embedding_User(user_input))
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/topology.py", line 484, in __call__
    self.build(input_shapes[0])
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/embeddings.py", line 95, in build
    name='{}_W'.format(self.name))
  File "/home/zxj/PycharmProjects/neural_collaborative_filtering/MLP.py", line 57, in init_normal
    return initializations.normal(shape, scale=0.01, name=name)
  File "/usr/local/lib/python3.5/dist-packages/keras/initializations.py", line 36, in normal
    return K.random_normal_variable(shape, 0.0, scale, name=name)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/theano_backend.py", line 115, in random_normal_variable
    return variable(np.random.normal(loc=0.0, scale=scale, size=shape),
  File "mtrand.pyx", line 1652, in mtrand.RandomState.normal
  File "mtrand.pyx", line 242, in mtrand.cont2_array_sc
TypeError: 'float' object cannot be interpreted as an integer

These problems occur when I run MLP.py, even though the Keras and Theano versions are right. I hope you can help me solve them.

what is "gtItem"?

@hexiangnan Thanks for publishing this code!
In evaluate.py, there is code like the following:

def eval_one_rating(idx):
    rating = _testRatings[idx]
    items = _testNegatives[idx]
    u = rating[0]
    gtItem = rating[1]
    items.append(gtItem)

In this code, what is gtItem?

Personal Issue about Paper

Greetings, Dr. He and other experts. I have read your paper. Why is the vector p_i in Figure 1 limited to two dimensions? It seems that the conclusion "incurring a large ranking loss" was drawn because the space was limited to two dimensions.
