
ecm's Introduction

Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory

Perception and expression of emotion are key factors in the success of dialogue systems and conversational agents. However, this problem has so far not been studied in large-scale conversation generation. In this paper, we propose the Emotional Chatting Machine (ECM), which can generate responses that are appropriate not only in content (relevant and grammatical) but also in emotion (emotionally consistent). An overview of ECM is shown in Figure 1.

[Figure 1: Overview of ECM]

This project is a TensorFlow implementation of our work, ECM.

Dependencies

  • Python 2.7
  • Numpy
  • TensorFlow 0.12

Quick Start

  • Dataset

Due to the copyright of the STC dataset, please ask Lifeng Shang ([email protected]) for the STC dataset (Neural Responding Machine for Short-Text Conversation), and build the ESTC dataset following the instructions in the Data Preparation section of our paper, ECM.

The basic format of the sample data is:

[[[post, emotion tag1, emotion tag2], [[response1, emotion tag1, emotion tag2], [response2, emotion tag1, emotion tag2], ...]], ...]

where emotion tag1 is generated by the neural-network classifier used in our model, and emotion tag2 is generated by a rule-based classifier and is not used.

The basic format of the ememory file used in our model is:

[[word1_of_emotion1, word2_of_emotion1,…], [word1_of_emotion2, word2_of_emotion2, …], …]

which is built according to the vocabulary and the emotion dictionary.
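
For reference, here is a minimal loading sketch, assuming the files are stored as Python-literal lists exactly as shown above (the filename train.txt and the unpacking below are assumptions; adapt them to your actual serialization):

    import ast

    def load_pairs(path):
        # The whole file is one literal:
        # [[[post, tag1, tag2], [[resp, tag1, tag2], ...]], ...]
        with open(path) as f:
            return ast.literal_eval(f.read())

    for (post, nn_tag, rule_tag), responses in load_pairs('train.txt'):
        for response, resp_nn_tag, resp_rule_tag in responses:
            pass  # train on (post, response, resp_nn_tag); the rule-based tag is unused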

For your convenience, we also recommend training your model on the NLPCC2017 dataset, which has more than 1 million Weibo post-response pairs with emotion labels.

  • Train

    python baseline.py --use_emb --use_imemory --use_ememory

You can remove --use_emb, --use_imemory, or --use_ememory to disable the embedding, internal memory, or external memory module, respectively. The model will achieve the expected performance after 20 epochs.

  • Test

    python baseline.py --use_emb --use_imemory --use_ememory --decode

You can test and interact with the ECM model using this command. Note: the input words should be separated by spaces, for example '我 很 喜欢 你 !'; alternatively, you can add a Chinese word segmentation module to the split() function.
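
For example, a minimal segmentation sketch using the third-party jieba segmenter (jieba is not bundled with this repo, and hooking it into split() is an assumption about where your preprocessing lives):

    # pip install jieba
    import jieba

    def segment(sentence):
        # '我很喜欢你!' -> '我 很 喜欢 你 !'
        return ' '.join(jieba.cut(sentence))

    print(segment(u'我很喜欢你!'))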

Details

Training

You can change the model parameters using:

--size xxx 				the hidden size of each layer
--num_layers xxx 			the number of RNN layers
--batch_size xxx 			batch size to use during training 
--steps_per_checkpoint xxx 		steps to save and evaluate the model
--train_dir xxx				training directory
--use_emb xxx				whether to use the embedding module
--use_imemory xxx 			whether to use the internal memory module
--use_ememory xxx 			whether to use the external memory module
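
For example, a run that overrides several of these flags (the values are illustrative, not recommended settings):

    python baseline.py --use_emb --use_imemory --use_ememory --size 256 --num_layers 2 --batch_size 128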

Evaluation

The automatic evaluation results are shown below:

[Image: automatic evaluation results]

Sample responses generated by Seq2Seq and ECM are shown below:

[Image: sample responses of Seq2Seq and ECM]

Paper

Hao Zhou, Minlie Huang, Tianyang Zhang, Xiaoyan Zhu, Bing Liu.
Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory.
AAAI 2018, New Orleans, Louisiana, USA.

Please kindly cite our paper if the paper and the code are helpful.

Acknowledgments

Thanks to Prof. Minlie Huang and Prof. Xiaoyan Zhu for their kind help, and thanks to my teammates for their support.

License

Apache License 2.0

ecm's People

Contributors: tuxchow

ecm's Issues

About ememory_vocab_file and vector.txt

Hello, after reading your paper I found it very interesting and wanted to reproduce it. While doing so, I found two files missing: ememory_vocab_file and vector.txt. vector.txt can be randomly generated in the code, but ememory_vocab_file has to be provided. Could you provide download links for these two files?

Got a problem with your paper

Dear author:
I couldn't find the definition of the vector (Vu) anywhere in your AAAI paper. Could you please explain it for me?

How can I directly convert a Chinese word into type 'int'?

The basic format is as follows:
[[[post, emotion tag1, emotion tag2], [[response1, emotion tag1, emotion tag2], [response2, emotion tag1, emotion tag2], ...]], ...]
How can I directly convert a Chinese word into type 'int'? An error occurred and raised a ValueError.
Thanks for replying.
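
A generic sketch of the usual fix (the _PAD/_GO/_EOS/_UNK ids mirror the data_utils convention this repo follows, but the exact repo API is an assumption): map each word to an integer id through a vocabulary, falling back to _UNK for unseen words.

    vocab = {'_PAD': 0, '_GO': 1, '_EOS': 2, '_UNK': 3}  # data_utils-style special tokens
    for word in u'我 很 喜欢 你 !'.split(' '):
        vocab.setdefault(word, len(vocab))

    def to_ids(sentence):
        # Unknown words map to _UNK instead of raising ValueError.
        return [vocab.get(w, vocab['_UNK']) for w in sentence.split(' ')]

    print(to_ids(u'我 很 喜欢 你 !'))  # [4, 5, 6, 7, 8]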

Is ememory_vocab used during decoding?

    else:
        # This is a greedy decoder - outputs are just argmaxes of output_logits.
        outputs = [int(np.argmax(np.split(logit, [2, FLAGS.response_vocab_size], axis=1)[1], axis=1) + 2) for logit in output_logits]
        # If there is an EOS symbol in outputs, cut them at that point.
        if data_utils.EOS_ID in outputs:
            outputs = outputs[:outputs.index(data_utils.EOS_ID)]
        # Print out response sentence corresponding to outputs.
        print(int2emotion[decoder_emotion] + ': ' + "".join([tf.compat.as_str(rev_response_vocab[output]) for output in outputs]))
    print("> ", end="")

The decoding code in baseline.py does not seem to use the external vocabulary. @tuxchow, could you explain why?

Can't encoder_cell = my_rnn_cell.EmbeddingWrapper be used this way?

    encoder_cell = rnn_cell.EmbeddingWrapper(
        en_cell, embedding_classes=num_encoder_symbols,
        embedding_size=embedding_size)

Can't this be replaced with

    encoder_cell = my_rnn_cell.EmbeddingWrapper(
        en_cell, embedding_classes=num_encoder_symbols,
        embedding_size=embedding_size)

It raises TypeError: The parameter cell is not RNNCell.
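
A likely cause, offered as an assumption rather than a confirmed answer: my_rnn_cell type-checks the cell against its own RNNCell base class, so a cell constructed from TensorFlow's rnn_cell module fails the isinstance check. Schematically:

    # Hypothetical illustration of where the TypeError comes from.
    class RNNCell(object):
        """my_rnn_cell's own base class, distinct from tf's rnn_cell.RNNCell."""

    class EmbeddingWrapper(object):
        def __init__(self, cell, embedding_classes, embedding_size):
            if not isinstance(cell, RNNCell):  # a cell built from tf's rnn_cell fails here
                raise TypeError("The parameter cell is not RNNCell.")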

Troubles with the train - vocabulary data

We're trying to replicate your model using the default dataset included in /ecm/data/train, but we get two errors (shown in the screenshot) indicating "Reshape cannot infer the missing input size for an empty tensor unless all specified input sizes are non zero".
We assume the file formats are the following:

  • train.txt: [[[post, emotion tag1, emotion tag2], [[response1, emotion tag1, emotion tag2], [response2, emotion tag1, emotion tag2], ...]], ...]
  • vector.txt: can be an empty file
  • vocab.ememory: [[word1_emot1, word2_emot1, ...], [word1_emot2, word2_emot2, ...], [word1_emot3, word2_emot3, ...], [word1_emot4, word2_emot4, ...], [word1_emot5, word2_emot5, ...], [word1_emot6, word2_emot6, ...], ...]

Questions:

  • Is there any other file that we need in order to train?
  • What's the purpose of the dev.txt file?
  • Might we be doing something wrong or missing something for the execution?

Emotion Category Input vs. Emotion Detection

From what I can make out in the code, you are using the input emotion category (which the user can set: joy, anger, sadness, etc.) to condition the response. So, do I understand correctly that you are not actually detecting any emotion from the user's text input, but rather hardwiring the emotion of the response to the input emotion category, no matter what the user's emotion in the input text is?

Thanks in advance for the clarifications 👍

A question about ememory

    if use_ememory:
        external_memory = variable_scope.get_variable("external_memory", [emotion_category, num_symbols], trainable=False)
        ememory = embedding_ops.embedding_lookup(external_memory, decoder_emotions[0])

Hello, I have a question. As the code above shows:

use_ememory is used during decoding, but given a whole batch of decoder_emotions, why is only the emotion of the first case used as the emotion of the entire batch during training? That is, why decoder_emotions[0] rather than decoder_emotions?

Is this a bug?

Thanks.
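
For reference, the per-example variant the issue asks about would look like the line below; this is an assumption about intent, not the author's confirmed fix:

    # One external-memory row per batch element (shape [batch_size, num_symbols])
    # instead of sharing the first example's emotion across the whole batch.
    ememory = embedding_ops.embedding_lookup(external_memory, decoder_emotions)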

InvalidArgumentError

When training with the small-scale data under data/, I hit the error below. Because the dataset is small, I also set both the post and response vocabulary sizes to 100; I'm not sure whether that is related.

Caused by op u'model_with_buckets/embedding_attention_seq2seq_1/embedding_attention_decoder/attention_decoder/Reshape_3', defined at:
  File "baseline.py", line 392, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "baseline.py", line 388, in main
    train()
  File "baseline.py", line 195, in train
    model = create_model(sess, False, False)
  File "baseline.py", line 139, in create_model
    dtype=dtype)
  File "/home/emotion_generation/mcy/ecm/seq2seq_model.py", line 157, in __init__
    softmax_loss_function=softmax_loss_function, use_imemory=use_imemory, use_ememory=use_ememory)
  File "/home/emotion_generation/mcy/ecm/seq2seq.py", line 1322, in model_with_buckets
    decoder_inputs[:bucket[1]], decoder_emotions)
  File "/home/emotion_generation/mcy/ecm/seq2seq_model.py", line 156, in <lambda>
    lambda x, y, z: seq2seq_f(x, y, z, False),
  File "/home/emotion_generation/mcy/ecm/seq2seq_model.py", line 119, in seq2seq_f
    beam_size=beam_size)
  File "/home/emotion_generation/mcy/ecm/seq2seq.py", line 1171, in embedding_attention_seq2seq
    beam_size=beam_size)
  File "/home/emotion_generation/mcy/ecm/seq2seq.py", line 1055, in embedding_attention_decoder
    initial_state_attention=initial_state_attention)
  File "/home/emotion_generation/mcy/ecm/seq2seq.py", line 701, in attention_decoder
    s1 = tf.nn.softmax(output1, dim=0) * g
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 1667, in softmax
    return _softmax(logits, gen_nn_ops._softmax, dim, name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 1630, in _softmax
    logits = _flatten_outer_dims(logits)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 1551, in _flatten_outer_dims
    output = array_ops.reshape(logits, array_ops.concat([[-1], last_dim_size], 0))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 3938, in reshape
    "Reshape", tensor=tensor, shape=shape, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Reshape cannot infer the missing input size for an empty tensor unless all specified input sizes are non-zero
	 [[Node: model_with_buckets/embedding_attention_seq2seq_1/embedding_attention_decoder/attention_decoder/Reshape_3 = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](model_with_buckets/embedding_attention_seq2seq_1/embedding_attention_decoder/attention_decoder/transpose_3, model_with_buckets/embedding_attention_seq2seq_1/embedding_attention_decoder/attention_decoder/concat_6)]]
	 [[Node: clip_by_global_norm_1/mul_8/_1657 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_21552_clip_by_global_norm_1/mul_8", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

NLPCC2017 emotional conversation dataset not available

The guidance recommends the NLPCC2017 dataset as a substitute for the ESTC dataset, but it is no longer available on the linked website. Specifically, when I click the Download link in the guidance document on the NLPCC website, the target page is gone. If possible, could anyone please offer a copy?

Make the code more user friendly

Hello,
Thanks a lot for your work.
To make the code more user friendly, you might want to use the argparse library (see the sketch after this list) to have fewer hardcoded strings:

  • config is hardcoded in both utils and baseline
  • the GPU number is hardcoded
  • those arguments could be combined with the TensorFlow flags

Great work! Thanks for publishing it!
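
A minimal sketch of that suggestion; the option names and defaults below are illustrative, not the repo's actual ones:

    import argparse

    parser = argparse.ArgumentParser(description='ECM options')
    parser.add_argument('--config', default='config.json',
                        help='path to the config file, instead of a hardcoded string')
    parser.add_argument('--gpu', type=int, default=0,
                        help='GPU index to use, instead of a hardcoded number')
    args = parser.parse_args()
    print(args.config, args.gpu)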

[Errno 2] No such file or directory: 'vector.txt'

Hello! I'm very interested in your work. While reproducing it I ran into this problem:
[Errno 2] No such file or directory: 'vector.txt'
How is vector.txt, which as I understand provides the initial word-embedding configuration, generated? If convenient, could you release it or provide a download link? Thanks!
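
If you only need a placeholder, a randomly initialized file can be written along these lines; note that the word2vec-style text format (a word followed by its vector values) is an assumption about what the repo expects:

    import io
    import numpy as np

    vocab = [u'我', u'很', u'喜欢', u'你']  # illustrative vocabulary
    dim = 100  # embedding size, assumed
    with io.open('vector.txt', 'w', encoding='utf-8') as f:
        for word in vocab:
            vec = np.random.uniform(-0.1, 0.1, dim)
            f.write(word + u' ' + u' '.join(u'%.6f' % x for x in vec) + u'\n')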

Code hangs during training

After running for a while, the program gets stuck: no error is reported and GPU memory is sufficient, it just stops making progress.
