jokieleung / ntrd

The PyTorch implementation of our EMNLP 2021 paper "Learning Neural Templates for Recommender Dialogue System"

License: Apache License 2.0

Python 99.89% Shell 0.11%

dialogue-system conversational-recommendation recommender-system pytorch

ntrd's Introduction

NTRD

This repository is the PyTorch implementation of our EMNLP 2021 paper "Learning Neural Templates for Recommender Dialogue System".

In this paper, we introduce NTRD, a novel recommender dialogue system (i.e., conversational recommendation system) framework that decouples the dialogue generation from the item recommendation via a two-stage strategy. Our approach makes the recommender dialogue system more flexible and controllable. Extensive experiments show our approach significantly outperforms the previous state-of-the-art methods.

The code is still being organized; feel free to contact me if you encounter any problems.

Dependencies

pytorch==1.6.0
gensim==3.8.3
torch_geometric==1.6.3
torch-cluster==1.5.8
torch-scatter==2.0.5
torch-sparse==0.6.8
torch-spline-conv==1.2.0

The required data file word2vec_redial.npy can be produced by the function dataset.prepare_word2vec().

Run

Run the script below to pre-train the recommender module. It should converge after 3 epochs of pre-training and 3 epochs of fine-tuning.

python run.py

Then, run the following script to train the seq2seq dialogue task. The Transformer model is difficult to converge, so it needs many epochs to reach convergence. Please be patient when training this model.

python run.py --is_finetune True

The model will report the results on the test data automatically after convergence.

To run the novel experiments, you need to generate the data/full_data.jsonl first by combining the data/train_data.jsonl and data/test_data.jsonl into one file.
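One way to build the combined file is sketched below. It simply concatenates the two files record by record; whether the novel experiments expect a particular ordering or de-duplication is not stated here, so that part is an assumption. The function name combine_jsonl is made up for illustration and is not part of the repository.

```python
def combine_jsonl(sources, target):
    """Concatenate JSON Lines files into one file, skipping blank lines.

    Returns the number of records written.
    """
    written = 0
    with open(target, "w", encoding="utf-8") as out:
        for source in sources:
            with open(source, encoding="utf-8") as src:
                for line in src:
                    line = line.strip()
                    if line:  # keep exactly one record per output line
                        out.write(line + "\n")
                        written += 1
    return written
```

From the repository root, calling combine_jsonl(["data/train_data.jsonl", "data/test_data.jsonl"], "data/full_data.jsonl") would write the combined file.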

You also need to uncomment the code in dataset.py at L117 and L317-L322.

Then, run the following script to pre-train the recommender module.

python run_novel.py

The next step is the same as in the conventional setting: run the command below.

python run_novel.py --is_finetune True

Citation

If you find this codebase helpful for your research, please consider citing our paper in your publications.

@inproceedings{liang2021learning,
  title={Learning Neural Templates for Recommender Dialogue System},
  author={Liang, Zujie and 
          Hu, Huang and 
          Xu, Can and 
          Miao, Jian and 
          He, Yingying and 
          Chen, Yining and 
          Geng, Xiubo and 
          Liang, Fan and 
          Jiang, Daxin},
  booktitle={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year={2021}
}

Acknowledgment

This codebase is implemented based on KGSF. Many thanks to the authors for their open-source project.

ntrd's Issues

What's the meaning of res_movie_recall

In my view, res_movie_recall should mean that the generated response contains exactly the same items as the ground truth. However, the function response_movie_recall_cal does not check this correctly: it returns as soon as it sees one ground-truth item in the predicted response and does not check the other items.
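The difference between the two behaviours can be sketched as follows. This is not the repository's response_movie_recall_cal; both functions and their names are hypothetical, and items are assumed to be plain IDs.

```python
def any_item_hit(predicted_items, truth_items):
    """Early-return check: succeeds as soon as one ground-truth item is found."""
    predicted = set(predicted_items)
    for item in truth_items:
        if item in predicted:
            return True
    return False


def exact_item_match(predicted_items, truth_items):
    """Strict check: the response must contain exactly the ground-truth items."""
    return set(predicted_items) == set(truth_items)


# One of the two ground-truth items is missing from the prediction.
predicted = [101, 205]
truth = [101, 307]
print(any_item_hit(predicted, truth))      # True
print(exact_item_match(predicted, truth))  # False
```

Which of the two definitions the metric is meant to follow is a question for the author; the sketch only shows that they can disagree on the same response.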

How can ReR be implemented in KGSF?

About evaluation

Hi jokie,
first of all, thanks for your work!

The Dist@N and R@N results for KGSF reported in the NTRD paper differ from those reported in the KGSF paper itself. Is this because you and KGSF use different evaluation scripts? I read some recent CRS papers (NTRD, KGSF, Revcore, C2CRS), and their results are inconsistent with each other. However, the evaluation formulas are all fixed, so I'm very confused at this point. Are these inconsistencies due to the different evaluation scripts used by these baselines?
I would also like to know how you calculated the "item ratio and item diversity" in your evaluation script.
Thanks a lot! Lucy
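One possible source of such differences (an assumption, not something confirmed in this thread) is how Dist-n is normalized. The sketch below contrasts two normalizations that can appear in evaluation scripts: distinct n-grams divided by the total number of n-grams, and distinct n-grams divided by the number of responses. The function names are illustrative only and do not come from any of the cited codebases.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def distinct_per_ngram(responses, n):
    """Distinct n-grams divided by the total number of n-grams."""
    all_ngrams = [g for r in responses for g in ngrams(r, n)]
    return len(set(all_ngrams)) / len(all_ngrams) if all_ngrams else 0.0


def distinct_per_response(responses, n):
    """Distinct n-grams divided by the number of responses."""
    unique = {g for r in responses for g in ngrams(r, n)}
    return len(unique) / len(responses) if responses else 0.0


responses = [["i", "like", "it"], ["i", "like", "movies"]]
print(distinct_per_ngram(responses, 1))     # 4 distinct / 6 total
print(distinct_per_response(responses, 1))  # 4 distinct / 2 responses = 2.0
```

The second variant can exceed 1, so numbers produced by the two definitions are not directly comparable.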

An error occurs when running python run.py --is_finetune True

File "run.py", line 914, in
loop.model.load_model(args.load_model_pth)
No such file or directory: 'saved_model/net_parameter1.pkl'

Should the filename be 'saved_model/best_recom_model.pkl'?

An error occurred when calculating CrossEntropyLoss

Hello!

When I ran python run.py --is_finetune True, the following error occurred.

C:\Users\94323\anaconda3\lib\site-packages\torch\nn\functional.py:1805: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [0,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [1,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [2,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [4,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [5,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [6,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [7,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [8,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [9,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [10,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [11,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [12,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [13,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [14,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [15,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [16,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [17,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [18,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [19,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [20,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [22,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [24,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [25,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [26,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
  0%|                                                                                                                                                                             | 0/1431 [00:01<?, ?it/s] 
Traceback (most recent call last):
  File "run.py", line 916, in <module>
    loop.train()
  File "run.py", line 460, in train
    self.backward(joint_loss)
  File "run.py", line 854, in backward
    loss.backward()
  File "C:\Users\94323\anaconda3\lib\site-packages\torch\_tensor.py", line 255, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "C:\Users\94323\anaconda3\lib\site-packages\torch\autograd\__init__.py", line 147, in backward
    Variable._execution_engine.run_backward(
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

I traced the error and found that it occurs at line 636 of model.py.

selection_loss = torch.mean(self.compute_loss(matching_logits, movies_gth.type(torch.long))) # movies_gth.squeeze(0):[bsz * dynamic_movie_nums]

I searched the web for a fix. Some people say that the target values (movies_gth) must satisfy 0 <= target[i] <= C-1, but when I was debugging, movies_gth was

"tensor([28207, 22727, 64362, 4646, 64362, 64362, 8404, 64362, 51711, 40569,....])"

and C was 6924.

I don’t know if I understand this question correctly, but I hope somebody can solve this problem.
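The CUDA assertion in the log requires cur_target >= 0 && cur_target < n_classes. A check of that condition on plain Python values, run before the loss is computed, gives a clearer picture than the device-side assert. The helper below is a hypothetical sketch, not code from the repository; it uses the ten target values and the class count quoted in this issue.

```python
def out_of_range_targets(targets, n_classes):
    """Return (position, value) pairs that violate 0 <= target < n_classes."""
    return [(i, t) for i, t in enumerate(targets) if not 0 <= t < n_classes]


# Values quoted in this issue: the first ten targets and C = 6924.
targets = [28207, 22727, 64362, 4646, 64362, 64362, 8404, 64362, 51711, 40569]
bad = out_of_range_targets(targets, n_classes=6924)
print(len(bad), "of", len(targets), "targets are out of range")
```

For a tensor such as movies_gth, the values could be passed in as movies_gth.flatten().tolist().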

masked_for_selection_token doesn't match matching_logits_

At line 690 in model.py, masked_for_selection_token corresponds to the range [1:], which ignores the start token, while latent corresponds to the range [0:end), which takes the start token into account. Their positions do not match.

Error in all_response_movie_recall_cal

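In all_response_movie_recall_cal, the labels of non-movie tokens are masked to 0, but selection__label itself can also be 0. No additional check is made for this afterwards, which causes errors in the metric calculation.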
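The collision can be illustrated with a toy example. The label values and the function below are invented for illustration and are not taken from the repository; the sketch only shows why masking with 0 is ambiguous when 0 is also a valid label, and how a sentinel outside the label range (here -1, assuming labels are non-negative) avoids the problem.

```python
IGNORE = -1  # sentinel outside the valid label range (labels assumed >= 0)


def count_hits(pred_labels, true_labels, ignore_value):
    """Count positions where the prediction equals a non-ignored label."""
    hits = total = 0
    for pred, true in zip(pred_labels, true_labels):
        if true == ignore_value:
            continue  # treated as a non-movie position
        total += 1
        hits += int(pred == true)
    return hits, total


# Position 1 is a real movie slot whose label happens to be 0;
# positions 0 and 2 are non-movie tokens.
pred = [5, 0, 7]
masked_with_zero = [0, 0, 0]
masked_with_sentinel = [IGNORE, 0, IGNORE]

print(count_hits(pred, masked_with_zero, ignore_value=0))           # (0, 0)
print(count_hits(pred, masked_with_sentinel, ignore_value=IGNORE))  # (1, 1)
```

With 0 as the mask value the correctly predicted movie slot is silently dropped; with the sentinel it is counted.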

run_novel vs run vs e2e

Hi,

Thank you for your contribution. It is great work!

Could you please explain the logic behind the file names? What is the difference between run.py, run_novel.py, and e2e_run.py?
And what is the difference between the model files behind them: e2e_model, model, and model_novel?

Those three names are kind of confusing. Thanks!

Should I change the arg '-infomax_pretrain' to True?

I found that there is some code for infomax pre-training, but because the arg '-infomax_pretrain' is False, this part is never executed. If I want to reproduce the test results in the paper, should I change the arg '-infomax_pretrain' to True?

missing neginf in utils.py

When I run the code with python run.py, it says:

Traceback (most recent call last):
  File "e2e_run.py", line 36, in <module>
    from e2e_model import E2ECrossModel
  File "/home/NTRD/e2e_model.py", line 1, in <module>
    from models.transformer import TorchGeneratorModel,_build_encoder,_build_decoder,_build_encoder_mask, _build_encoder4kg, _build_decoder4kg, _build_decoder_selection, _build_decoder_e2e_selection
  File "/home/NTRD/models/transformer.py", line 15, in <module>
    from models.utils import neginf
ImportError: cannot import name 'neginf'

I also can't find neginf in utils.py. Is the code available now?

AssertionError on assert torch.sum(movies_gth!=0, dim=(0,1)) == torch.sum((mask_ys == 6), dim=(0,1)) when running python run.py --is_finetune True

hello

assert error

Traceback (most recent call last):
  File "run.py", line 918, in <module>
    loop.train()
  File "run.py", line 470, in train
    output_metrics_gen = self.val(True)
  File "run.py", line 525, in val
    _, _, _, _, gen_loss, mask_loss, info_db_loss, info_con_loss, selection_loss, _, _ = self.model(context.cuda(), response.cuda(), mask_response.cuda(), concept_mask, dbpedia_mask, seed_sets, movie, concept_vec, db_vec, entity_vector.cuda(), rec,movies_gth.cuda(),movie_nums, test=False)
  File "/home/bigdata10/anaconda3/envs/h10yyb/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/bigdata10/yyb/Paper/NTRD-main/model.py", line 600, in forward
    assert torch.sum(movies_gth!=0, dim=(0,1)) == torch.sum((mask_ys == 6), dim=(0,1))
AssertionError

I haven't found a way to fix this bug.

Problem in TOTAL_NOVEL_MOVIES

The paper claims that the movies in TOTAL_NOVEL_MOVIES do not appear in the training corpus, but I found that many of the IDs in this list are also mentioned in the training corpus, such as @77306.
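A check of this kind can be sketched as below. It assumes that TOTAL_NOVEL_MOVIES holds integer IDs and that movie mentions appear in the raw training text in the form @77306; the function names are hypothetical and the sample lines are toy data, not real records.

```python
import re

MENTION = re.compile(r"@(\d+)")  # movie mentions of the form @77306


def mentioned_ids(lines):
    """Collect every @<digits> movie ID that appears in the given text lines."""
    ids = set()
    for line in lines:
        ids.update(int(m) for m in MENTION.findall(line))
    return ids


def overlap_with_training(novel_ids, training_lines):
    """Return the supposedly novel IDs that are mentioned in the training text."""
    return sorted(set(novel_ids) & mentioned_ids(training_lines))


# Toy data; in practice training_lines would be the lines of the training file.
training_lines = ['{"text": "Have you seen @77306?"}', '{"text": "I liked @111"}']
print(overlap_with_training([77306, 99999], training_lines))  # [77306]
```

An open file handle on data/train_data.jsonl can be passed directly as training_lines, since it iterates line by line.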

Error in dbpedia_mask

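When movies_gth is None and the context contains movies, direct indexing here raises an exception. In addition, dbpedia_mask can only append self.entity_max, so it cannot reflect whether a movie entity appeared in the preceding context.

In the code below, when movies_gth is None and a movie appears in the dialogue, direct indexing raises an exception, and there is no corresponding exception handling.

It should be changed to: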

Numbers of entities mismatched

Hi,

I found that the total number of DBpedia entities in "/data/entity2entityId.pkl" is 64362. However, in run.py the total number of entities is set to 64368, and in subkg.pkl the maximum node index is 64367 (the minimum is 0).

This causes a problem: if there is a relation (10, 64367, relation type=5) in the DBpedia graph (subkg.pkl), I have no way of knowing which entity node 64367 is, since it has no match in entity2entityId.pkl.

I am wondering why your setting does not match the data?

Thank you!
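The mismatch can be located with a check like the one below. It is a sketch on toy data: the graph is assumed to be a list of (head, tail, relation) triples following the notation used in this issue, the mapping is assumed to go from entity name to integer index, and the function name is hypothetical. The actual layout of subkg.pkl and entity2entityId.pkl may differ.

```python
def unmapped_nodes(triples, entity2id):
    """Return node indices used in the graph that have no entry in entity2id."""
    known = set(entity2id.values())
    used = set()
    for head, tail, _relation in triples:
        used.add(head)
        used.add(tail)
    return sorted(used - known)


# Toy stand-ins for entity2entityId.pkl and subkg.pkl.
entity2id = {"<entity_a>": 0, "<entity_b>": 10}
triples = [(10, 64367, 5), (0, 10, 2)]
print(unmapped_nodes(triples, entity2id))  # [64367]
```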

how to reach the performance NTRD should have?

Hi Jokie,

After I fixed the bug, I can now run your code smoothly. However, I am unable to replicate the performance reported in the paper. Could you please help me?

The parameter setting follows the paper:

infomax pretrain = True
for rec task, epoch = 30
for sentence generation task, epoch = 30*3

I did not change anything else, and I got the following results for the rec task:
{'recall@1': 0.01972770213948319,
'recall@10': 0.2856348985829397,
'recall@50': 0.7265907196443456}

This is much lower than the paper claims! I do not know which step I got wrong.

After fine-tuning on the sentence generation task, I got the following results:
{'ppl': 4.646927499761735,

'dist1': 0.07012840412022012,
'dist2': 0.31649499082827715,
'dist3': 0.4667701425144631,
'dist4': 0.5724566106956399,

'bleu1': 0.093541870702522,
'bleu2': 0.026171342178387604,
'bleu3': 0.016242155418590167,
'bleu4': 0.011014523340149043, }

=================================================
I ran the code 3 times and the results were similar. Here is the comparison of results after taking the average:

What I can observe is that the rec performance is a little lower than KGSF, and the sentence task is better than KGSF (but does not reach the claimed performance).

I think the result does not make sense, since I think the method proposed in the paper is certainly better than KGSF, which take the actual movie name into account.

training manner? two-stage or end2end

Hi there,
thanks for your fantastic work. However, I am a little confused about the differences between the readme.md and your paper.
In your paper,
you mentioned in Section 4.3 that

Though the entire framework is typically two-stage, the two modules can be trained simultaneously in an end-to-end manner.

but readme.md uses python run.py, which is obviously a two-stage training procedure.

So should we use python e2e_run.py for end2end training or python run.py for two-stage training in order to reproduce your results?
Thx.

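About the purpose of pre-training the recommendation module

The code does not seem to use the candidate item set produced by the recommendation module: selection_cross_decoder uses db_encoding, which is the encoding of the entities that appear in the context, so the recommendation module is not used. I can't tell where the recommendation module contributes to response generation. Shouldn't db_encoding be replaced with the representations of the top-50 entities ranked by the recommendation module? The implementation seems inconsistent with the paper, which says that the candidate item set from the recommendation module is used.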