hkuds / hgcl

[WSDM'2023] "HGCL: Heterogeneous Graph Contrastive Learning for Recommendation"

Home Page: https://arxiv.org/abs/2303.00995

graph-contrastive-learning recommendation self-supervised-learning graph-neural-networks heterogeneous-graph-learning collaborative-filtering

hgcl's Introduction

HGCL

A PyTorch version is now available. This repository contains the PyTorch code and datasets for the paper "Heterogeneous Graph Contrastive Learning for Recommendation" (available on arXiv and from ACM), published at WSDM'23, Singapore, 2023.

Introduction

Heterogeneous Graph Contrastive Learning for Recommendation (HGCL) applies heterogeneous graph contrastive learning to recommender systems. It integrates a meta network with contrastive learning to perform adaptive augmentation, which enables user-specific and item-specific knowledge transfer, and it extends graph contrastive learning with customized cross-view augmentation.

Environment

The code for HGCL was implemented and tested in the following development environment: Python 3.7.10, PyTorch 1.8.1, NumPy 1.20.3, SciPy 1.6.2.

Datasets

We evaluate HGCL on three datasets: Yelp, Epinions, and CiaoDVD. Following the common implicit-feedback setting, the element (u_i, v_j) is set to 1 if user u_i has rated item v_j, and to 0 otherwise. Users and items with too few interactions are filtered out. Each dataset is divided into a training set and a testing set by 1:(n-1).
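The preprocessing described above can be sketched roughly as follows. This is an illustration only, not the repository's preprocessing script: the function names `binarize` and `leave_one_out` are hypothetical, the toy ratings are made up, and the split assumes that "1:(n-1)" means holding out one interaction per user for testing, which is one possible reading.

```python
import random
from collections import defaultdict


def binarize(ratings):
    """Turn (user, item, rating) triples into implicit (user, item) pairs.

    Any rated pair counts as an interaction (value 1); unrated pairs are
    simply absent (value 0).
    """
    return {(u, v) for u, v, _rating in ratings}


def leave_one_out(interactions, seed=0):
    """Hold out one interaction per user for testing (assumed split)."""
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for u, v in sorted(interactions):
        by_user[u].append(v)

    train, test = set(), {}
    for u, items in by_user.items():
        if len(items) < 2:
            # Nothing left for training if we held one out; keep it all.
            train.update((u, v) for v in items)
            continue
        held_out = rng.choice(items)
        test[u] = held_out
        train.update((u, v) for v in items if v != held_out)
    return train, test


# Toy data: (user, item, rating). The rating value itself is discarded.
ratings = [(0, 10, 5.0), (0, 11, 2.0), (0, 12, 4.0),
           (1, 10, 3.0), (1, 13, 1.0), (2, 11, 4.5)]
interactions = binarize(ratings)
train, test = leave_one_out(interactions)
```

The filtering threshold for "too few interactions" is not stated here, so the sketch leaves that step out rather than guess a value.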

You can download all three datasets from Google Drive. Feel free to open an issue if the link doesn't work.

How to Run the Code

Please unzip the datasets first, and create the History/ and Models/ directories. The commands to train HGCL on the Yelp, Epinions, and CiaoDVD datasets are given below; they specify the hyperparameter settings that produced the results reported in the paper.
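As a convenience, the two directories can be created from Python before training. This is a minimal sketch: the helper name `ensure_output_dirs` is hypothetical, and it assumes the directories are expected in the directory from which main.py is run.

```python
from pathlib import Path


def ensure_output_dirs(root="."):
    """Create History/ and Models/ under root if they do not exist yet."""
    created = []
    for name in ("History", "Models"):
        path = Path(root) / name
        # exist_ok=True makes the call safe to repeat.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `ensure_output_dirs()` once from the repository root should be enough; repeated calls leave existing directories untouched.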

Yelp

python main.py --dataset Yelp --ssl_temp 0.5 --ssl_ureg 0.06 --ssl_ireg 0.07 --lr 0.058 --reg 0.05 --ssl_beta 0.45 --rank 3

Epinions

python main.py --dataset Epinions --ssl_temp 0.5 --ssl_ureg 0.04 --ssl_ireg 0.05 --lr 0.055 --reg 0.043 --ssl_beta 0.32 --rank 3

CiaoDVD

python main.py --dataset CiaoDVD --ssl_temp 0.6 --ssl_ureg 0.04 --ssl_ireg 0.05 --lr 0.055 --reg 0.065 --ssl_beta 0.3 --rank 3

Important arguments

--ssl_temp The temperature factor of the InfoNCE loss used in contrastive learning. The value is selected from {0.1, 0.3, 0.45, 0.5, 0.55, 0.6, 0.65}.
--ssl_ureg, --ssl_ireg The weights of the user-side and item-side contrastive learning losses, respectively. The pair is tuned from {(3e-2, 4e-2), (4e-2, 5e-2), (5e-2, 6e-2), (6e-2, 7e-2), (7e-2, 8e-2)}.
--lr The learning rate of the model. We tuned it from {1e-2, 3e-2, 4e-2, 4.5e-2, 5e-2, 5.5e-2, 6e-2}.
--reg The weight of the weight-decay regularization. We tuned it from {1e-2, 3e-2, 4.3e-2, 5e-2, 6e-2, 6.5e-2, 6.8e-2}.
--ssl_beta The balance coefficient of the total contrastive loss, tuned from {0.2, 0.27, 0.3, 0.32, 0.4, 0.45, 0.48, 0.5}.
--rank The dimension of the low-rank matrix decomposition. We recommend tuning it from {1, 2, 3, 4, 5}.
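To give a feel for what the temperature set by --ssl_temp does, here is a generic InfoNCE loss in plain Python. It is a textbook-style sketch, not the repository's implementation: the names `cosine` and `info_nce` and the toy embeddings are illustrative, and the choice of cosine similarity with in-batch negatives is an assumption.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss; row i of view_b is the positive for row i of view_a.

    All other rows of view_b act as negatives for that anchor.
    """
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Two toy "views" of three nodes; matching rows are similar by construction.
view_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
view_b = [[0.9, 0.1], [0.1, 0.9], [0.8, 1.1]]
loss_sharp = info_nce(view_a, view_b, temperature=0.1)
loss_smooth = info_nce(view_a, view_b, temperature=0.6)
```

A lower temperature sharpens the softmax over similarities, so well-aligned positives yield a smaller loss, as in `loss_sharp` versus `loss_smooth` here. How the repository combines this term with --ssl_ureg, --ssl_ireg, and --ssl_beta is not described above and is not modeled in the sketch.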

Experimental Results

Performance comparison of all methods on different datasets in terms of NDCG and HR (figure not included).
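For readers unfamiliar with the two metrics, the sketch below shows how HR@K and NDCG@K are commonly computed when each user has a single held-out test item. The function names, the toy rankings, and the single-target setup are assumptions for illustration; the exact evaluation protocol (the value of K, negative sampling) is not specified in this README.

```python
import math


def hit_ratio_at_k(ranked_items, target, k):
    """1.0 if the held-out item appears in the top-k list, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, target, k):
    """NDCG@k for a single relevant item: 1 / log2(position + 2)."""
    top_k = ranked_items[:k]
    if target not in top_k:
        return 0.0
    position = top_k.index(target)  # 0-based rank of the held-out item
    return 1.0 / math.log2(position + 2)


def evaluate(rankings, targets, k=10):
    """Average HR@k and NDCG@k over all users that have a test item."""
    users = list(targets)
    hr = sum(hit_ratio_at_k(rankings[u], targets[u], k) for u in users)
    ndcg = sum(ndcg_at_k(rankings[u], targets[u], k) for u in users)
    return hr / len(users), ndcg / len(users)


# Toy example: each user's items sorted by predicted score, best first.
rankings = {"u1": ["a", "b", "c"], "u2": ["c", "a", "b"], "u3": ["b", "c", "a"]}
targets = {"u1": "a", "u2": "b", "u3": "z"}
hr, ndcg = evaluate(rankings, targets, k=2)
```

With a single relevant item the ideal DCG is 1, so NDCG reduces to the discounted gain of the position at which the held-out item is ranked.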

Citation

If you find this work helpful to your research, please kindly consider citing our paper.

@inproceedings{chen2023heterogeneous,
  title={Heterogeneous graph contrastive learning for recommendation},
  author={Chen, Mengru and Huang, Chao and Xia, Lianghao and Wei, Wei and Xu, Yong and Luo, Ronghua},
  booktitle={Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining},
  pages={544--552},
  year={2023}
}

hgcl's Issues

new bug

I have created the directory HGCL/History and the directory HGCL/Models

HGT-evaluation

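Hi, your work on incorporating heterogeneous information into recommender systems is great!

Could you provide the code for training and testing HGT on this dataset?

If so, please send it to [email protected].

Thanks again for your work!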

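Question about understanding the code

What does the loss computed by the function metaregular mean? This formula does not seem to appear in the paper.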

some question about Meta transform net Eq.5 and Eq.6

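Hello, my understanding is that Eq.5 and Eq.6 correspond to lines 113 and 114 of the code (because Eq.5 only concatenates the embeddings and does not involve self.meta_net). Why, then, do self.mlp1 and self.mlp2 appear again later in the personalized transformation parameter matrix? And which part of the paper do lines 122~129 of the code correspond to? This way of computing the bias and low weight seems different from a usual MLP.
Looking forward to your reply.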

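Yelp dataset

Hello, I read your paper carefully, but I have some questions about the datasets. After downloading them, I found that the Yelp dataset differs from the officially provided Yelp dataset. Could you explain the data in the Yelp dataset?

For example, in this screenshot, what role does each of the five matrices play in the paper? What does the data in each matrix represent? Looking forward to your reply, thank you.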

bug

Hello, I would like to know where dataset/Epinions/data.pkl is; the same goes for dataset/CiaoDVD/data.pkl.

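Question about the code

Hello! In the if statement here, all_user_embeddings concatenates userEmbeddings0; my understanding is that all_user_embeddings refers to the user embeddings from all layers in the social domain. In the else statement, however, all_user_embeddings concatenates userEmbeddings, and userEmbeddings is built from userEmbeddings0 + ui_userEmbedding0, which seems somewhat contradictory. In addition, all_item_embeddings and all_ui_embeddings in the else statement both concatenate norm_embeddings, which I also do not quite understand.
Looking forward to your answer, thank you!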