reczoo / recbox

A box of core libraries for recommendation model development

License: Apache License 2.0

Python 100.00%
collaborative-filtering sequential-recommendation gnn recommendation candidate-matching two-tower-models

recbox's Introduction

RecZoo

RecZoo: A curated model zoo for recommendation tasks

Matching

No Model Publication
1 UltraGCN Kelong Mao, Jieming Zhu, Xi Xiao, Biao Lu, Zhaowei Wang, Xiuqiang He. UltraGCN: Ultra Simplification of Graph Convolutional Networks for Recommendation, in CIKM 2021.
2 SimpleX Kelong Mao, Jieming Zhu, Jinpeng Wang, Quanyu Dai, Zhenhua Dong, Xi Xiao, Xiuqiang He. SimpleX: A Simple and Strong Baseline for Collaborative Filtering, in CIKM 2021.

Ranking

No Model Publication
1 FinalMLP Kelong Mao, Jieming Zhu, Liangcai Su, Guohao Cai, Yuru Li, Zhenhua Dong. FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction, in AAAI 2023.
2 FinalNet Jieming Zhu, Qinglin Jia, Guohao Cai, Quanyu Dai, Jingjie Li, Zhenhua Dong, Ruiming Tang, Rui Zhang. FINAL: Factorized Interaction Layer for CTR Prediction, in SIGIR 2023.
3 RAT Yushen Li, Jinpeng Wang, Tao Dai, Jieming Zhu, Jun Yuan, Rui Zhang, Shu-Tao Xia. RAT: Retrieval-Augmented Transformer for Click-Through Rate Prediction, in WWW 2024.
4 STEM Liangcai Su, Junwei Pan, Ximei Wang, Xi Xiao, Shijie Quan, Xihua Chen, Jie Jiang. STEM: Unleashing the Power of Embeddings for Multi-task Recommendation, in AAAI 2024.
5 Helen Zirui Zhu, Yong Liu, Zangwei Zheng, Huifeng Guo, Yang You. Helen: Optimizing CTR Prediction Models with Frequency-wise Hessian Eigenvalue Regularization, in WWW 2024.
6 Combined-Pair Zhutian Lin, Junwei Pan, Shangyu Zhang, Ximei Wang, Xi Xiao, Shudong Huang, Lei Xiao, Jie Jiang. Understanding the Ranking Loss for Recommendation with Sparse User Feedback, in KDD 2024.
7 AdaGIN Lei Sang, Honghao Li, Yiwen Zhang, Yi Zhang, Yun Yang. AdaGIN: Adaptive Graph Interaction Network for Click-Through Rate Prediction, in TOIS 2024.
8 SimCEN Honghao Li, Lei Sang, Yi Zhang, Yiwen Zhang. SimCEN: Simple Contrast-enhanced Network for CTR Prediction, in MM 2024.
9 RecSys Qi Zhang, Jieming Zhu, Jiansheng Sun, Guohao Cai, Ruining Yu, Bangzheng He, Liangbi Li. Enhancing News Recommendation with Real-Time Feedback and Generative Sequence Modeling, in RecSys Challenge Workshop 2024.
10 DCNv3 Honghao Li, Yiwen Zhang, Yi Zhang, Hanwei Li, Lei Sang, Jieming Zhu. DCNv3: Towards Next Generation Deep Cross Network for CTR Prediction, in arXiv 2024.

Reranking

Pretraining

No Model Publication
1 UNBERT Qi Zhang, Jingjie Li, Qinglin Jia, Chuyuan Wang, Jieming Zhu, Zhaowei Wang, Xiuqiang He. UNBERT: User-News Matching BERT for News Recommendation, in IJCAI 2021.

Personalization

No Model Publication
1 PMG Xiaoteng Shen, Rui Zhang, Xiaoyan Zhao, Jieming Zhu, Xi Xiao. PMG: Personalized Multimodal Generation with Large Language Models, in WWW 2024.

recbox's People

Contributors

acnowa, xpai, zhujiem

recbox's Issues

Datasets for SimpleX: Format

Hello,

Running

python run_param_tuner.py --config Yelp18/SimpleX_yelp18_x0/SimpleX_yelp18_x0_tuner_config.yaml --gpu 0

fails with errors that demand a CSV file with specific columns, whereas the provided Yelp dataset contains only train.txt and test.txt.

I first tried converting train.txt and test.txt to edge-list files (train.csv and test.csv), but now I get errors such as KeyError: 'corpus_index', and it is not clear how to resolve them.

Could you please provide the training and test files formatted as the run_param_tuner.py script expects?
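For readers attempting the same conversion, here is a minimal sketch of turning an adjacency-list file into an edge-list CSV. It assumes each line of train.txt has the form "user item1 item2 ...", which is not confirmed in this thread. The function names and the header names user_id and item_id are placeholders, and the sketch does not claim to produce the exact columns (such as corpus_index) that run_param_tuner.py expects.

```python
import csv
import io


def adjacency_to_edges(lines):
    """Yield (user, item) pairs from lines like 'user item1 item2 ...'."""
    for line in lines:
        tokens = line.split()
        if len(tokens) < 2:
            # Skip blank lines and users that have no items.
            continue
        user, items = tokens[0], tokens[1:]
        for item in items:
            yield user, item


def write_edge_csv(lines, out_file, header=("user_id", "item_id")):
    """Write one CSV row per interaction; header names are placeholders."""
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(header)
    count = 0
    for user, item in adjacency_to_edges(lines):
        writer.writerow([user, item])
        count += 1
    return count


# Tiny in-memory example instead of a real train.txt.
sample = ["0 10 11 12", "1 11", "", "2"]
buffer = io.StringIO()
n_edges = write_edge_csv(sample, buffer)
csv_text = buffer.getvalue()
```

For a real file, the same function could be called with an open file handle in place of the sample list and with an output file opened with newline="".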

Dataset Issue

At the end of the SimpleX paper, you tested the model on more datasets, such as Amazon-Beauty, Amazon-Movies, Movielens-20M, and MillionSongData. Could you please also provide these datasets so that we can reproduce your model for fair comparison? Thank you.

Configuration is missing

Only the Yelp18 configuration exists.
What configuration is required to run the AmazonBooks dataset?

Error: "generator raised StopIteration"

When I run the following command: "cd benchmarks; python run_param_tuner.py --config Yelp18/MF_CCL_yelp18_x0/MF_CCL_yelp18_x0_tuner_config.yaml --gpu 0", I get an error in the second epoch.
[screenshot: error_log]

Also, when parallel is enabled in evaluate_metrics, the code gets stuck, so I had to set parallel to False for evaluation.
[screenshot: error_log2]

Another problem is that the code is quite resource-intensive: I ran it on a PC with 32 GB of memory, and memory usage reached 100% during evaluation (with parallel set to False; otherwise it gets stuck, as mentioned above).

Any solutions?
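As background, "generator raised StopIteration" is the RuntimeError message that Python 3.7 and later produce when a StopIteration escapes from inside a generator body (PEP 479). Whether that is the cause here cannot be confirmed without the original logs. The sketch below only reproduces the general pattern with hypothetical function names and shows one way to avoid it.

```python
def first_of_each(iterators):
    """Buggy pattern: next() may raise StopIteration inside the generator."""
    for it in iterators:
        yield next(it)  # an exhausted iterator raises StopIteration here


def first_of_each_safe(iterators):
    """Safer pattern: pass a default so no StopIteration can escape."""
    sentinel = object()
    for it in iterators:
        value = next(it, sentinel)
        if value is sentinel:
            continue  # skip exhausted iterators instead of crashing
        yield value


def run_buggy():
    """Trigger the error and return its message for inspection."""
    try:
        return list(first_of_each([iter([1]), iter([])]))
    except RuntimeError as exc:
        return str(exc)


message = run_buggy()
safe_result = list(first_of_each_safe([iter([1]), iter([]), iter([3])]))
```

If the traceback points at a bare next() call or a manual iterator loop inside a data generator, wrapping that call with a default value or a try/except is the usual remedy.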

Role of query_index, user_id, corpus_index, and item_id

@xpai
Hello. Thank you for your contribution to standardizing RecSys benchmarks. I have a question regarding data preprocessing.

I wonder why the query_index / user_id and corpus_index / item_id are separated. At first, I assumed that the relationship between query-user and corpus-item was a mapping of idx-id, but it doesn't seem to be the case. Could you kindly explain the meaning of query_index, corpus_index, user_id, item_id, and the role of each?

Thank you in advance for your help.

Correspondence between user_id in the model and user_id in the dataset

It seems that the "user_id" in the model (SimpleX) is not the same as the "user_id" in the dataset (Gowalla).
For example, I printed user-item pairs during model training, and one of the positive (user_id, item_id) pairs is (1220, 10807).
However, this pair is not found in train.txt of the Gowalla dataset (I got the Gowalla dataset from LightGCN: https://github.com/gusye1234/LightGCN-PyTorch/tree/master/data/gowalla).

How can I get the correspondence between the user_id in the model and the user_id in the dataset (and likewise for item_id)? Thanks a lot!

1. Printing the user-item pairs
[screenshot: SimpleX_py]

2. Got a positive (user_id, item_id) pair (1220, 10807)
[screenshot: printed_content]

3. The positive (user_id, item_id) pair (1220, 10807) is not found in the Gowalla train.txt
[screenshot: gowalla_train]
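One possible explanation, not confirmed anywhere in this thread, is that preprocessing re-encodes raw IDs into internal indices, so a pair printed during training would not match the raw file directly. The sketch below illustrates that idea only. The first-appearance encoding and all function names are assumptions, not the library's actual behavior. Keeping the mapping makes it possible to translate model-side pairs back to raw IDs.

```python
def load_pairs(lines):
    """Parse 'user item1 item2 ...' lines into (user, item) pairs in file order."""
    pairs = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # ignore blank lines
        pairs.extend((tokens[0], item) for item in tokens[1:])
    return pairs


def build_index(raw_ids):
    """Assign consecutive indices by first appearance (an assumed encoding)."""
    index = {}
    for raw in raw_ids:
        index.setdefault(raw, len(index))
    return index


train_lines = ["7 42 13", "3 13"]
pairs = load_pairs(train_lines)

# Hypothetical re-encoding step: raw ID -> internal index.
user_index = build_index(u for u, _ in pairs)
item_index = build_index(i for _, i in pairs)
encoded = [(user_index[u], item_index[i]) for u, i in pairs]

# Inverse maps recover the raw IDs from what the model sees.
user_raw = {idx: raw for raw, idx in user_index.items()}
item_raw = {idx: raw for raw, idx in item_index.items()}
decoded = [(user_raw[u], item_raw[i]) for u, i in encoded]
```

In this toy example the encoded pair (0, 0) corresponds to raw user "7" and raw item "42", so searching the raw file for "0 0" would fail even though the interaction exists.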

What is the config or yaml file for AmazonBooks dataset?

I was able to reproduce the MF-CCL results on the Yelp18 and Gowalla datasets in Table 1 using the provided command and YAML file.

When I ran a similar command on the AmazonBooks files, sampling took much longer than in your logs:

https://github.com/openbenchmark/BARS/tree/master/candidate_matching/benchmarks/MF_CCL/MF_CCL_amazonbooks_x1

The results are also not reasonable.

Could you please help me reproduce the MF-CCL results on the AmazonBooks dataset?

Thanks.

About the required package

Thanks for the great work.
Could you please tell me the Python version and the required packages (with their versions)?

Hyper parameter on movielens

Hi, nice work!
May I ask which hyperparameters you used for SimpleX on Movielens-1M, as shown in Table 6? Also, why do you report different evaluation metrics on different datasets?
Thanks a lot. I look forward to your reply.
