ictnlp / dialoflow

Code for ACL 2021 main conference paper "Conversations Are Not Flat: Modeling the Dynamic Information Flow across Dialogue Utterances".

License: MIT License

Python 99.87% Shell 0.13%
dialogue-systems dialogue-generation dialogue-evaluation dialogue-pretraining flow-score

dialoflow's People

Contributors

lizekang

dialoflow's Issues

train

Hello, I would like to ask some questions about running generation.py. Could you please provide the environment requirements for running the model?

No module named '_regex'

Model evaluation

First of all, kudos for this nice work. I really liked your work.
I am trying to reproduce the results of your paper. It would be very helpful if you could share the evaluation script for the automated metrics. The paper says, "We employ the evaluation scripts used by DialoGPT." Could you please point out which DialoGPT file you used for your evaluation?

Question about constructing test set

First of all, thanks for your great work!
I'm trying to run the code and model on the DailyDialog dataset to get a better understanding of your work, but I can't figure out how to construct the model input from /data/test.json. Also, the file 'test.refs.txt', which appears in generate.py, is not provided in this repository.
I tried to construct the model input from /data/test.json myself, but I was confused because I couldn't find multiple references for every example.
Could the code that preprocesses /data/test.json be released?

How to reproduce

I reproduced DialoFlow base on DailyDialog.
The evaluation results are:
NIST: [2.9148, 3.3919, 3.5077, 3.5375]
BLEU: [0.4535, 0.2323, 0.1367, 0.086]
METEOR: 0.1479778034868275
Entropy: [6.250107671407306, 8.663223223839859, 9.603956363262926, 9.959120587252972]
Distinct: [0.08599954617653732, 0.32188216456202917]
avg_len: 9.154005934718102

The results are lower than those reported in the paper.

Could you share the details of fine-tuning on DailyDialog?

My settings are:
Training:
batch size 16 (4 GPUs, per_gpu_batch_size=4)
gradient_accumulation_steps 1
epochs 50

The best validation loss is 7.5168, at epoch 34.

Generation:
The config parameters are the defaults, and I set beam_size=5.

I did not use Apex.
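
When comparing numbers like these, it helps to pin down how each metric is computed. Below is a minimal sketch of Distinct-n, assuming the common definition (unique n-grams divided by total n-grams over all generated responses) and whitespace tokenization. The DialoGPT evaluation script that the paper refers to may differ in tokenization and other details, so this is not the authors' implementation.

```python
from collections import Counter

def distinct_n(responses, n):
    """Ratio of unique n-grams to total n-grams over a corpus of responses."""
    counts = Counter()
    for response in responses:
        tokens = response.split()  # assumption: whitespace tokenization
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    total = sum(counts.values())
    # An empty corpus has no n-grams; report 0.0 instead of dividing by zero.
    return len(counts) / total if total else 0.0

hyps = ["i am fine", "i am here"]
d1 = distinct_n(hyps, 1)  # 4 unique unigrams out of 6
d2 = distinct_n(hyps, 2)  # 3 unique bigrams out of 4
```

Small differences in tokenization or casing between hypotheses and references can shift all of the reported metrics, so it is worth checking that both sides are preprocessed the same way.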

Pre-trained model release date?

We are really interested in using DialoFlow for our research on chatbots and their influence on psychological well-being. Our experiments should presumably start in the span of 2-3 weeks. Will the pre-trained DialoFlow model be available by then?

logger/log.out Issue

Hello,
I followed the README instruction "bash fine-tune.sh" but got the error "fine-tune.sh: line 1: logger/log.out: No such file or directory".

How can I solve it? Thanks for your help!
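
This message is what a shell prints when output is redirected to a file inside a directory that does not exist yet. A minimal sketch of creating the directory first, assuming the script is launched from the directory that contains fine-tune.sh (running `mkdir -p logger` in the shell has the same effect):

```python
import os

# Create the log directory that the redirect in fine-tune.sh expects.
# exist_ok=True makes this a no-op if the directory is already there.
os.makedirs("logger", exist_ok=True)
```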

Cannot access Google Drive

Hi, Dr. Li,
Thank you for your code.
However, I cannot access Google Drive for some reason.
Could you provide another location to download your pre-trained models, such as DialoFlow-base?
Thanks in advance for your reply.

No module named '_regex'

Hello,

I followed the README but got some errors.
I want to use Flow score for the evaluation metric analysis.
When I run this code:

from flow_score import *
MODEL_PATH = "models/DialoFlow_large.bin"
FLOW_SCORE = FlowScore(MODEL_PATH)
dialogues = ["hello", "Hi there. tell me about yourself.", "Well I'm a college student who loves learning about the world around me!", "verry good !"]
flow_score = FLOW_SCORE.score(dialogues)

I get the error: ModuleNotFoundError: No module named '_regex'
Environment: python==3.7 torch==1.7.0 transformers==3.0.2 pickle==4.0
I don't know what is happening. How should I fix the version or environment problem?

Getting 'nan' flow_score

I am trying to run the code:

from flow_score import *
MODEL_PATH = "models/DialoFlow_large.bin"
FLOW_SCORE = FlowScore(MODEL_PATH)
dialogues = ["hello", "Hi there. tell me about yourself.", "Well I'm a college student who loves learning about the world around me!"]
flow_score = FLOW_SCORE.score(dialogues)

I am using torch==1.7.1 and transformers==3.0.2.

The value of flow_score I get is 'nan' and I am getting a lot of warnings when loading the model:

/DialoFlow/FlowScore/dialoflow_venv/lib/python3.9/site-packages/torch/serialization.py:658: SourceChangeWarning: source code of class 'model.DPKSModel' has changed. Saved a reverse patch to DPKSModel.patch. Run `patch -p0 < DPKSModel.patch` to revert your changes.
  warnings.warn(msg, SourceChangeWarning)
/DialoFlow/FlowScore/dialoflow_venv/lib/python3.9/site-packages/torch/serialization.py:658: SourceChangeWarning: source code of class 'transformers.modeling_gpt2.GPT2Model' has changed. Saved a reverse patch to GPT2Model.patch. Run `patch -p0 < GPT2Model.patch` to revert your changes.
  warnings.warn(msg, SourceChangeWarning)
/DialoFlow/FlowScore/dialoflow_venv/lib/python3.9/site-packages/torch/serialization.py:658: SourceChangeWarning: source code of class 'torch.nn.modules.sparse.Embedding' has changed. Saved a reverse patch to Embedding.patch. Run `patch -p0 < Embedding.patch` to revert your changes.
  warnings.warn(msg, SourceChangeWarning)
/DialoFlow/FlowScore/dialoflow_venv/lib/python3.9/site-packages/torch/serialization.py:658: SourceChangeWarning: source code of class 'torch.nn.modules.dropout.Dropout' has changed. Saved a reverse patch to Dropout.patch. Run `patch -p0 < Dropout.patch` to revert your changes.
  warnings.warn(msg, SourceChangeWarning)
/DialoFlow/FlowScore/dialoflow_venv/lib/python3.9/site-packages/torch/serialization.py:658: SourceChangeWarning: source code of class 'torch.nn.modules.container.ModuleList' has changed. Saved a reverse patch to ModuleList.patch. Run `patch -p0 < ModuleList.patch` to revert your changes.
  warnings.warn(msg, SourceChangeWarning)
/DialoFlow/FlowScore/dialoflow_venv/lib/python3.9/site-packages/torch/serialization.py:658: SourceChangeWarning: source code of class 'torch.nn.modules.normalization.LayerNorm' has changed. Saved a reverse patch to LayerNorm.patch. Run `patch -p0 < LayerNorm.patch` to revert your changes.
  warnings.warn(msg, SourceChangeWarning)
/DialoFlow/FlowScore/dialoflow_venv/lib/python3.9/site-packages/torch/serialization.py:658: SourceChangeWarning: source code of class 'model.PlanModel' has changed. Saved a reverse patch to PlanModel.patch. Run `patch -p0 < PlanModel.patch` to revert your changes.
  warnings.warn(msg, SourceChangeWarning)
/DialoFlow/FlowScore/dialoflow_venv/lib/python3.9/site-packages/torch/serialization.py:658: SourceChangeWarning: source code of class 'torch.nn.modules.loss.MSELoss' has changed. Saved a reverse patch to MSELoss.patch. Run `patch -p0 < MSELoss.patch` to revert your changes.
  warnings.warn(msg, SourceChangeWarning)
/DialoFlow/FlowScore/dialoflow_venv/lib/python3.9/site-packages/torch/serialization.py:658: SourceChangeWarning: source code of class 'torch.nn.modules.linear.Linear' has changed. Saved a reverse patch to Linear.patch. Run `patch -p0 < Linear.patch` to revert your changes.
  warnings.warn(msg, SourceChangeWarning)
/DialoFlow/FlowScore/dialoflow_venv/lib/python3.9/site-packages/torch/serialization.py:658: SourceChangeWarning: source code of class 'torch.nn.modules.activation.Sigmoid' has changed. Saved a reverse patch to Sigmoid.patch. Run `patch -p0 < Sigmoid.patch` to revert your changes.
  warnings.warn(msg, SourceChangeWarning)
/DialoFlow/FlowScore/dialoflow_venv/lib/python3.9/site-packages/torch/serialization.py:658: SourceChangeWarning: source code of class 'torch.nn.modules.loss.CrossEntropyLoss' has changed. Saved a reverse patch to CrossEntropyLoss.patch. Run `patch -p0 < CrossEntropyLoss.patch` to revert your changes.
  warnings.warn(msg, SourceChangeWarning)

How could I fix this and get numerical scores? Could you share the requirements.txt? Maybe other packages are causing the issue.

Turn-level DSTC9 annotation data

I noticed there is just dialog-level DSTC9 data used in your work. May I ask if turn-level DSTC9 annotation data is made public? How can I get this data? Thank you very much!

about data tokenizer

Hello, I see that the token sequence in the paper is:
[u1] [C] [u2] [C] [res] [C]

but the token sequence in the dataset code is:
[speaker1] [u1] [eos] [speaker2] [u2] [eos] [bos] [res] [eos]

Is there any difference between the two? Which works best?
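
To make the comparison concrete, here is a small sketch that builds both layouts as lists of strings. The token names are copied from the question above. The function names are hypothetical, the alternation of speakers beyond two turns is an assumption, and nothing here claims to match how the repository's dataset code actually assembles its inputs.

```python
def paper_layout(history, response):
    """[u1] [C] [u2] [C] ... [res] [C]: every utterance is followed by [C]."""
    seq = []
    for utterance in history + [response]:
        seq += [utterance, "[C]"]
    return seq

def code_layout(history, response):
    """[speaker1] [u1] [eos] [speaker2] [u2] [eos] [bos] [res] [eos]."""
    seq = []
    for turn, utterance in enumerate(history):
        # Assumption: speakers simply alternate from turn to turn.
        speaker = "[speaker1]" if turn % 2 == 0 else "[speaker2]"
        seq += [speaker, utterance, "[eos]"]
    return seq + ["[bos]", response, "[eos]"]

history = ["u1", "u2"]
paper_seq = paper_layout(history, "res")
code_seq = code_layout(history, "res")
```

Written this way, the visible differences are the speaker tags and the [bos] marker before the response; in both layouts each utterance is closed by an end marker ([C] or [eos]).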

config.json not found

I got the following error when running the fine-tune.sh:

Traceback (most recent call last):
  File "D:\python3.7\lib\site-packages\transformers\configuration_utils.py", line 238, in get_config_dict
    local_files_only=local_files_only,
  File "D:\python3.7\lib\site-packages\transformers\file_utils.py", line 578, in cached_path
    raise EnvironmentError("file {} not found".format(url_or_filename))
OSError: file ../models/DialoFlow_base/config.json not found

Could you tell me where your model's config file is? I couldn't find it. Thanks.

Clarification on post-processing generated result

Hi. Kudos for this nice work. I am trying to reproduce the results on DailyDialog dataset. It will be very helpful if you can clarify the following details.
In Issue #13, you mentioned using "nltk.word_tokenize() to tokenize the sentence and then concatenate the tokens" to make the format of the generated dialogue same as the reference response. I have two questions here,

  1. Did you use any post-processing on the reference files?
  2. Did you try only nltk.word_tokenize() or some other tokenizer as well?

It will be very useful if you can briefly mention your post-processing steps.
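
The step quoted from Issue #13 amounts to "tokenize, then join the tokens with spaces". A minimal sketch of that normalization follows. The regex tokenizer is a stand-in (an assumption, used so the example runs without NLTK data); passing `nltk.word_tokenize` as the `tokenize` argument would follow the quoted description. Whether the reference files were processed the same way is exactly what this issue asks, so the sketch does not settle it.

```python
import re

def simple_tokenize(text):
    # Stand-in tokenizer (assumption): runs of word characters,
    # plus each punctuation mark as its own token.
    return re.findall(r"\w+|[^\w\s]", text)

def normalize(text, tokenize=simple_tokenize):
    """Tokenize a sentence and re-join the tokens with single spaces."""
    return " ".join(tokenize(text))

normalized = normalize("Hello, how are you?")
# normalized == "Hello , how are you ?"
```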

Unable to reproduce?

I hope the code and README can be updated, because there are many errors when I try to run it again. Thank you very much.

Why doesn't PlanModel use a mask?

    for i, block in enumerate(self.h):
        outputs = block(hidden_states)
        hidden_states, present = outputs[:2]

In this code, PlanModel does not use a mask.
As a result, each position can attend to future context.
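
For reference, here is a minimal pure-Python sketch of what a causal (left-to-right) attention mask does: scores for future positions are set to negative infinity before the softmax, so their attention weights become zero. It illustrates the general technique only and is not the repository's PlanModel code; whether PlanModel is meant to be masked is a question for the authors.

```python
import math

def causal_softmax(scores):
    """Row-wise softmax where position i may only attend to positions j <= i."""
    weights = []
    for i, row in enumerate(scores):
        # Mask future positions (j > i) with -inf so they get zero weight.
        masked = [s if j <= i else float("-inf") for j, s in enumerate(row)]
        peak = max(masked)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in masked]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# With uniform scores, each row spreads its weight evenly over the past only.
attn = causal_softmax([[0.0, 0.0, 0.0]] * 3)
```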

ModuleNotFoundError: No module named 'flow_score'

Hello,
I tried to follow the README, but I receive a ModuleNotFoundError. I downloaded the model from the Google Drive and put it into a custom modelpath. I don't see why this error would appear. Do you perhaps know where the issue stems from?
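
Python raises ModuleNotFoundError when the directory that contains the module is not on the import path, which commonly happens when a script is run from a different working directory. A minimal diagnostic sketch, where `can_import` is a hypothetical helper and `module_dir` stands for wherever flow_score.py lives in your checkout:

```python
import importlib.util
import sys

def can_import(module_name, module_dir=None):
    """Optionally add module_dir to sys.path, then report whether the module is findable."""
    if module_dir and module_dir not in sys.path:
        sys.path.insert(0, module_dir)
    return importlib.util.find_spec(module_name) is not None

found_json = can_import("json")  # a standard-library module, as a sanity check
```

Calling `can_import("flow_score", module_dir)` with the directory that holds flow_score.py should return True before `from flow_score import *` can succeed.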

code

Does `empty` in the code correspond to C in the paper?
What does the special token `info` mean?

During encoding, only the history is added. Is the response that follows it not used?

Can't load the model!

Greetings,

I'm surprised that such an error came up. My problem lies with this line:

model = torch.load("models/DialoFlow_large/model.bin")

model.bin is in the right place, and the EC2 instance runs CUDA 11.2 and PyTorch 1.9.

Where could the problem come from?

Thanks in advance

about the Automatic evaluation

First of all, thank you very much for your help. I have run into a problem and hope you can help with it. I implemented the automatic evaluation metrics myself, but the results are quite different from those in the paper. Could you please release the implementation of your evaluation metrics?

Nan in README instruction

Hi,

I followed "How to use?" in the README but got strange results.
flow_score is nan.

The only change I made concerns moving these tensors to CUDA, since my environment has no GPU:

conv_seq = conv_seq.unsqueeze(0).cuda()
sentence_index = sentence_index.unsqueeze(0).cuda()
token_type_seq = token_type_seq.unsqueeze(0).cuda()

Environment: python==3.7 torch==1.7.0 transformers==3.0.2 regex==2017.4.5

Could you please provide some guidance on how to solve this?

model

When will the Chinese version be released?
