
chenchongthu / enmf


This is our implementation of ENMF: Efficient Neural Matrix Factorization (TOIS. 38, 2020). This also provides a fair evaluation of existing state-of-the-art recommendation models.

License: MIT License

Python 100.00%
deep-learning recommender-system efficient-algorithm recommendation evaluation state-of-the-art collaborative-filtering reproducibility reproducible-research sigir

enmf's People

Contributors

chenchongthu


enmf's Issues

After training with the default args, the results aren't good

Excuse me: after training with the default arguments I get recall and NDCG scores, but the results are not as good as those reported in the paper.
Here are my results after 500 epochs:

recall@50, ndcg@50
0.0912251655629139 0.04411650302936786
recall@100, ndcg@100
0.28807947019867547 0.08591003366756622
recall@200, ndcg@200
0.43559602649006623 0.10980291110041669

Why are these results so much lower than those reported in the paper?

Validation data for the ml-1m dataset

Hello, I have two questions after reading the ENMF paper:
1. For the ml-1m dataset, is the validation data the last item in each user's interaction sequence in ml.train.txt?
2. I also noticed that the dataset differs from the official release. What preprocessing was applied to it?

The nDCG formula may not be consistent between your paper and implementation?

Hello, I read your paper (TOIS) and your implementation.
I have a question about your implementation of the nDCG formula.

In your paper you write the nDCG formula using log2 (base-2 logarithm), but in the implementation I believe you use np.log (natural logarithm) (e.g., L209 or L303 in code/ENMF.py).

Should you use np.log2 in your implementation for consistency?

Anyway, thank you for putting together a good paper and implementation!
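As a side note, since nDCG is the ratio DCG/IDCG and switching the logarithm base only rescales numerator and denominator by the same constant (log_b x = ln x / ln b), the final nDCG value is base-invariant; only a raw, unnormalized DCG changes with the base. A minimal sketch with made-up relevance scores:

```python
import numpy as np

def dcg(relevances, log_fn):
    # DCG with a configurable logarithm: sum of rel_i / log(i + 1) over ranks i
    ranks = np.arange(2, len(relevances) + 2)  # rank i contributes log(i + 1)
    return np.sum(relevances / log_fn(ranks))

def ndcg(relevances, log_fn):
    # Normalize by the DCG of the ideal (descending-relevance) ordering
    ideal = np.sort(relevances)[::-1]
    return dcg(relevances, log_fn) / dcg(ideal, log_fn)

rel = np.array([0.0, 1.0, 0.0, 1.0, 1.0])
print(ndcg(rel, np.log))   # natural log
print(ndcg(rel, np.log2))  # base 2: identical value, the base cancels
```

So using np.log does not change the reported nDCG as long as the IDCG normalizer uses the same base, though np.log2 would match the paper's formula more transparently.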

Aligning the Recall metric

Hello, I have looked at ENMF alongside methods such as LightGCN and NBPO, and I found that the recall computation in the ENMF code is not aligned with those methods. Could you confirm whether the reported ENMF results use the first or the second definition below?

First: ENMF uses len(hit_items) / min(topk, len(ground_truth))

Second: the other methods use len(hit_items) / len(ground_truth), as in:
NBPO: https://github.com/Wenhui-Yu/NBPO/blob/master/Library.py#L14
LightGCN: https://github.com/kuandeng/LightGCN/blob/master/evaluator/python/evaluate_foldout.py#L20
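For concreteness, the two definitions quoted above can be sketched as follows (the function names and toy numbers are mine, not from either codebase):

```python
def recall_capped(hit_items, ground_truth, topk):
    # Variant 1 (the ENMF evaluator, per this issue): denominator capped at topk
    return len(hit_items) / min(topk, len(ground_truth))

def recall_standard(hit_items, ground_truth, topk):
    # Variant 2 (NBPO / LightGCN style): plain recall over all held-out items
    return len(hit_items) / len(ground_truth)

ground_truth = set(range(30))  # a user with 30 held-out test items
hit_items = set(range(8))      # 8 of them recovered in the top-20 list
print(recall_capped(hit_items, ground_truth, topk=20))    # 8/20 = 0.4
print(recall_standard(hit_items, ground_truth, topk=20))  # 8/30 ≈ 0.267
```

Whenever a user holds more than topk test items, Variant 1 reports a strictly larger number, so results computed under the two definitions are not directly comparable.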

Cannot reproduce the results of ENMF on the ml-lcfn dataset as claimed in the README

Hi! You have done good work! I have been trying your code these days and got some of the expected results, but I found it hard to reproduce the results on the ml-lcfn dataset claimed in the README.

Here are my trials with the provided code:

  1. When using the default hyperparameters of the provided code, i.e., dropout keep_prob=0.7 and negative weight=0.1, the best results I got were: NDCG@5=0.22135453305703484, NDCG@10=0.22871178869000672, NDCG@20=0.2525169010557999.
  2. When using the suggested hyperparameters in README, i.e., dropout keep_prob=0.5 and negative weight=0.5, the best results I got were: NDCG@5=0.24160408294952565, NDCG@10=0.24239649929731227, NDCG@20=0.25935423043524214.
  3. When setting the hyperparameters as dropout keep_prob=0.7 and negative weight=0.5 (which is the best pair I have tried), the best results I got were: NDCG@5=0.24156951242563474, NDCG@10=0.24269257187356102, NDCG@20=0.26141558703625023.

Note that none of the above matches the results reported in the README, i.e., NDCG@5=0.2457, NDCG@10=0.2475, NDCG@20=0.2656. Could you help me figure out how to reproduce them?

Inconsistent results on the ml-1m dataset

Hi, thanks for sharing the code.
With your source code unmodified (dropout: 0.5, neg-weight: 0.5), I tried ml-1m and got the following results:
First col: Recall
Second col: NDCG

loss,loss_no_reg,loss_reg -20357.158033288044 -20357.158033288044 0.0
TopK: [10, 20, 50]
Recall@10: 0.1 NDCG@10: 0.04893161658667473
Recall@20: 0.16258278145695365 NDCG@20: 0.06466930639306495
Recall@50: 0.29817880794701984 NDCG@50: 0.09132394256584671

This is much lower than in the README:
NDCG@5, 10, 20
0.2457 0.2475 0.2656

Could you help me reproduce your results on ml-1m? Thanks.

Discussion of DHCF implementation details

I saw that you compare against DHCF, the method proposed in "Dual Channel Hypergraph Collaborative Filtering", and I would like to ask about it.
That paper seems to have some problems. First, Equation 6 and Equation 16 do not match, and it is unclear where \Theta should be multiplied; from the description, Equation 16 appears to be the one consistent with Figure 2. The min function in Equation 8 also seems to compare the wrong objects: a scalar 1 is compared against a matrix, which looks ill-formed to me. My reading is that it should return the matrix power when k > 1 and the identity matrix otherwise. Second, the incidence matrix built from each item's 2nd-order reachable users (i.e., HH^TH) is already extremely dense; for the MovieLens dataset used in the paper, the 2nd-order incidence matrix has very few zero entries left. With a moderately large number of items you would have to store a dense matrix of size M^2, which has no scalability to speak of and is also expensive to compute. Constructing an incidence matrix per batch is feasible, but inefficient compared with methods like LightGCN.
The paper's way of constructing hyperedges is also very heuristic: "divide-and-conquer" appears only in the abstract, introduction, and conclusion, and the paper never explains how it is reflected in the method. I reimplemented the paper and ran it on the LastFM dataset with the authors' experimental setup, and its performance was far worse than NGCF. I would like to hear your opinion on this paper and on the DHCF method.
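To illustrate the density concern about the 2nd-order incidence matrix HH^TH raised above, here is a synthetic sketch (the sizes and interaction rate are made up, not taken from the paper's data):

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, p = 2000, 1500, 0.01  # made-up sizes and interaction rate

# Random sparse user-item incidence matrix H (density ~1%)
H = (rng.random((n_users, n_items)) < p).astype(np.float64)

# 2nd-order incidence H H^T H: connects a user to every item reachable
# through any co-interacting user, so sparsity collapses quickly
H2 = H @ H.T @ H

print(np.count_nonzero(H) / H.size)    # around 0.01
print(np.count_nonzero(H2) / H2.size)  # close to 1: nearly dense
```

Even at 1% interaction density, the 2nd-order matrix comes out almost fully dense, which matches the storage and compute concern described in the issue.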
