
ecm's Introduction

Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory

Perception and expression of emotion are key factors in the success of dialogue systems and conversational agents. However, this problem has so far not been studied in large-scale conversation generation. In this paper, we propose the Emotional Chatting Machine (ECM), which can generate responses that are appropriate not only in content (relevant and grammatical) but also in emotion (emotionally consistent). An overview of ECM is shown in Figure 1.

[Figure 1: Overview of ECM]

This project is a TensorFlow implementation of our work, ECM.

Dependencies

  • Python 2.7
  • Numpy
  • TensorFlow 0.12

Quick Start

  • Dataset

Due to the copyright of the STC dataset, please ask Lifeng Shang ([email protected]) for the STC dataset (Neural Responding Machine for Short-Text Conversation), and build the ESTC dataset following the instructions in the Data Preparation section of our paper, ECM.

The basic format of the sample data is:

[[[post, emotion tag1, emotion tag2], [[response1, emotion tag1, emotion tag2], [response2, emotion tag1, emotion tag2], ...]], ...]

where emotion tag1 is generated by the neural-network classifier used in our model, and emotion tag2 is generated by a rule-based classifier and is not used.

The basic format of the ememory file used in our model is:

[[word1_of_emotion1, word2_of_emotion1,…], [word1_of_emotion2, word2_of_emotion2, …], …]

which is built according to the vocabulary and the emotion dictionary.
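
For reference, here is a minimal loading sketch, assuming the files are stored as Python-literal lists exactly as shown above (the filename train.txt and the unpacking below are assumptions; adapt them to your actual serialization):

    import ast

    def load_pairs(path):
        # The whole file is one literal:
        # [[[post, tag1, tag2], [[resp, tag1, tag2], ...]], ...]
        with open(path) as f:
            return ast.literal_eval(f.read())

    for (post, nn_tag, rule_tag), responses in load_pairs('train.txt'):
        for response, resp_nn_tag, resp_rule_tag in responses:
            pass  # train on (post, response, resp_nn_tag); the rule-based tag is unused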

For your convenience, we also recommend training your model on the NLPCC2017 dataset, which has more than 1 million Weibo post-response pairs with emotion labels.

  • Train

    python baseline.py --use_emb --use_imemory --use_ememory

You can remove --use_emb, --use_imemory, or --use_ememory to disable the embedding, internal memory, or external memory module, respectively. The model will achieve the expected performance after 20 epochs.

  • Test

    python baseline.py --use_emb --use_imemory --use_ememory --decode

You can test and interact with the ECM model using this command. Note: the input words should be separated by spaces, for example '我 很 喜欢 你 !'; alternatively, you can add a Chinese word segmentation module to the split() function.
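
For example, a minimal segmentation sketch using the third-party jieba segmenter (jieba is not bundled with this repo, and hooking it into split() is an assumption about where your preprocessing lives):

    # pip install jieba
    import jieba

    def segment(sentence):
        # '我很喜欢你!' -> '我 很 喜欢 你 !'
        return ' '.join(jieba.cut(sentence))

    print(segment(u'我很喜欢你!'))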

Details

Training

You can change the model parameters using:

--size xxx 				the hidden size of each layer
--num_layers xxx 			the number of RNN layers
--batch_size xxx 			batch size to use during training 
--steps_per_checkpoint xxx 		steps to save and evaluate the model
--train_dir xxx				training directory
--use_emb xxx				whether to use the embedding module
--use_imemory xxx 			whether to use the internal memory module
--use_ememory xxx 			whether to use the external memory module
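
For example, a run that overrides several of these flags (the values are illustrative, not recommended settings):

    python baseline.py --use_emb --use_imemory --use_ememory --size 256 --num_layers 2 --batch_size 128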

Evaluation

The automatic evaluation results are shown below:

[Image: automatic evaluation results]

Sample responses generated by Seq2Seq and ECM are shown below:

[Image: sample responses of Seq2Seq and ECM]

Paper

Hao Zhou, Minlie Huang, Tianyang Zhang, Xiaoyan Zhu, Bing Liu.
Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory.
AAAI 2018, New Orleans, Louisiana, USA.

Please kindly cite our paper if the paper and the code are helpful.

Acknowledgments

Thanks to Prof. Minlie Huang and Prof. Xiaoyan Zhu for their kind help, and thanks to my teammates for their support.

License

Apache License 2.0

ecm's People

Contributors: tuxchow

ecm's Issues

About ememory_vocab_file and vector.txt

Hello, after reading your paper I found it very interesting and wanted to reproduce it. While doing so, I found two files missing: ememory_vocab_file and vector.txt. vector.txt can be randomly generated in the code, but ememory_vocab_file has to be provided. Could you provide download links for these two files?

Got a problem with your paper

Dear author:
I couldn't find the definition of the vector (Vu) anywhere in your AAAI paper. Could you please explain it for me?

How can I directly convert a Chinese word into type 'int'?

The basic format is as follows:
[[[post, emotion tag1, emotion tag2], [[response1, emotion tag1, emotion tag2], [response2, emotion tag1, emotion tag2], ...]], ...]
How can I directly convert a Chinese word into type 'int'? An error occurred and raised a ValueError.
Thanks for replying.
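
A generic sketch of the usual fix (the _PAD/_GO/_EOS/_UNK ids mirror the data_utils convention this repo follows, but the exact repo API is an assumption): map each word to an integer id through a vocabulary, falling back to _UNK for unseen words.

    vocab = {'_PAD': 0, '_GO': 1, '_EOS': 2, '_UNK': 3}  # data_utils-style special tokens
    for word in u'我 很 喜欢 你 !'.split(' '):
        vocab.setdefault(word, len(vocab))

    def to_ids(sentence):
        # Unknown words map to _UNK instead of raising ValueError.
        return [vocab.get(w, vocab['_UNK']) for w in sentence.split(' ')]

    print(to_ids(u'我 很 喜欢 你 !'))  # [4, 5, 6, 7, 8]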

Is ememory_vocab used during decoding?

    else:
        # This is a greedy decoder - outputs are just argmaxes of output_logits.
        outputs = [int(np.argmax(np.split(logit, [2, FLAGS.response_vocab_size], axis=1)[1], axis=1) + 2) for logit in output_logits]
        # If there is an EOS symbol in outputs, cut them at that point.
        if data_utils.EOS_ID in outputs:
            outputs = outputs[:outputs.index(data_utils.EOS_ID)]
        # Print out response sentence corresponding to outputs.
        print(int2emotion[decoder_emotion] + ': ' + "".join([tf.compat.as_str(rev_response_vocab[output]) for output in outputs]))
    print("> ", end="")

The decoding code in baseline.py does not seem to use the external vocabulary. @tuxchow, could you explain why?

Can't encoder_cell = my_rnn_cell.EmbeddingWrapper be used this way?

    encoder_cell = rnn_cell.EmbeddingWrapper(
        en_cell, embedding_classes=num_encoder_symbols,
        embedding_size=embedding_size)

Can't this be replaced with

    encoder_cell = my_rnn_cell.EmbeddingWrapper(
        en_cell, embedding_classes=num_encoder_symbols,
        embedding_size=embedding_size)

It raises TypeError: The parameter cell is not RNNCell.
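
A likely cause, offered as an assumption rather than a confirmed answer: my_rnn_cell type-checks the cell against its own RNNCell base class, so a cell constructed from TensorFlow's rnn_cell module fails the isinstance check. Schematically:

    # Hypothetical illustration of where the TypeError comes from.
    class RNNCell(object):
        """my_rnn_cell's own base class, distinct from tf's rnn_cell.RNNCell."""

    class EmbeddingWrapper(object):
        def __init__(self, cell, embedding_classes, embedding_size):
            if not isinstance(cell, RNNCell):  # a cell built from tf's rnn_cell fails here
                raise TypeError("The parameter cell is not RNNCell.")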

Troubles with the train - vocabulary data

We're trying to replicate your model using the default dataset included in /ecm/data/train, but we get two errors (shown in the screenshot) indicating "Reshape cannot infer the missing input size for an empty tensor unless all specified input sizes are non zero".
We assume the file formats are the following:

  • train.txt: [[[post, emotion tag1, emotion tag2], [[response1, emotion tag1, emotion tag2], [response2, emotion tag1, emotion tag2], ...]], ...]
  • vector.txt: can be an empty file
  • vocab.ememory: [[word1_emot1, word2_emot1, ...], [word1_emot2, word2_emot2, ...], [word1_emot3, word2_emot3, ...], [word1_emot4, word2_emot4, ...], [word1_emot5, word2_emot5, ...], [word1_emot6, word2_emot6, ...], ...]

Questions:

  • Is there any other file that we need in order to train?
  • What's the purpose of the dev.txt file?
  • Might we be doing something wrong or missing something for the execution?

Emotion Category Input vs. Emotion Detection

From what I can make out in the code, you are using the input emotion category (which the user can set: joy, anger, sadness, etc.) to condition the response. So, do I understand correctly that you are not actually detecting any emotion from the user's text input, but rather hardwiring the emotion of the response to the input emotion category, no matter what the user's emotion in the input text is?

Thanks in advance for the clarifications 👍

A question about ememory

    if use_ememory:
        external_memory = variable_scope.get_variable("external_memory", [emotion_category, num_symbols], trainable=False)
        ememory = embedding_ops.embedding_lookup(external_memory, decoder_emotions[0])

Hello, I have a question. As the code above shows:

use_ememory is used during decoding, but given a whole batch of decoder_emotions, why is only the emotion of the first case used as the emotion of the entire batch during training? That is, why decoder_emotions[0] rather than decoder_emotions?

Is this a bug?

Thanks.
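
For reference, the per-example variant the issue asks about would look like the line below; this is an assumption about intent, not the author's confirmed fix:

    # One external-memory row per batch element (shape [batch_size, num_symbols])
    # instead of sharing the first example's emotion across the whole batch.
    ememory = embedding_ops.embedding_lookup(external_memory, decoder_emotions)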

InvalidArgumentError

When training with the small-scale data under data/, I hit the error below. Because the dataset is small, I also set both the post and response vocabulary sizes to 100; I'm not sure whether that is related.

Caused by op u'model_with_buckets/embedding_attention_seq2seq_1/embedding_attention_decoder/attention_decoder/Reshape_3', defined at:
  File "baseline.py", line 392, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "baseline.py", line 388, in main
    train()
  File "baseline.py", line 195, in train
    model = create_model(sess, False, False)
  File "baseline.py", line 139, in create_model
    dtype=dtype)
  File "/home/emotion_generation/mcy/ecm/seq2seq_model.py", line 157, in __init__
    softmax_loss_function=softmax_loss_function, use_imemory=use_imemory, use_ememory=use_ememory)
  File "/home/emotion_generation/mcy/ecm/seq2seq.py", line 1322, in model_with_buckets
    decoder_inputs[:bucket[1]], decoder_emotions)
  File "/home/emotion_generation/mcy/ecm/seq2seq_model.py", line 156, in <lambda>
    lambda x, y, z: seq2seq_f(x, y, z, False),
  File "/home/emotion_generation/mcy/ecm/seq2seq_model.py", line 119, in seq2seq_f
    beam_size=beam_size)
  File "/home/emotion_generation/mcy/ecm/seq2seq.py", line 1171, in embedding_attention_seq2seq
    beam_size=beam_size)
  File "/home/emotion_generation/mcy/ecm/seq2seq.py", line 1055, in embedding_attention_decoder
    initial_state_attention=initial_state_attention)
  File "/home/emotion_generation/mcy/ecm/seq2seq.py", line 701, in attention_decoder
    s1 = tf.nn.softmax(output1, dim=0) * g
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 1667, in softmax
    return _softmax(logits, gen_nn_ops._softmax, dim, name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 1630, in _softmax
    logits = _flatten_outer_dims(logits)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 1551, in _flatten_outer_dims
    output = array_ops.reshape(logits, array_ops.concat([[-1], last_dim_size], 0))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 3938, in reshape
    "Reshape", tensor=tensor, shape=shape, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Reshape cannot infer the missing input size for an empty tensor unless all specified input sizes are non-zero
	 [[Node: model_with_buckets/embedding_attention_seq2seq_1/embedding_attention_decoder/attention_decoder/Reshape_3 = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](model_with_buckets/embedding_attention_seq2seq_1/embedding_attention_decoder/attention_decoder/transpose_3, model_with_buckets/embedding_attention_seq2seq_1/embedding_attention_decoder/attention_decoder/concat_6)]]
	 [[Node: clip_by_global_norm_1/mul_8/_1657 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_21552_clip_by_global_norm_1/mul_8", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

NLPCC2017 emotional conversation dataset not available

The guidance recommends the NLPCC2017 dataset as a substitute for the ESTC dataset, but it is no longer available on the linked website. Specifically, when I click the Download link in the guidance document on the NLPCC website, the target page is gone. If possible, could anyone please offer a copy?

Make the code more user friendly

Hello,
Thanks a lot for your work.
To make the code more user friendly, you might want to use the argparse library (see the sketch after this list) to have fewer hardcoded strings:

  • config is hardcoded in both utils and baseline
  • the GPU number is hardcoded
  • those arguments could be combined with the TensorFlow flags

Great work! Thanks for publishing it!
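
A minimal sketch of that suggestion; the option names and defaults below are illustrative, not the repo's actual ones:

    import argparse

    parser = argparse.ArgumentParser(description='ECM options')
    parser.add_argument('--config', default='config.json',
                        help='path to the config file, instead of a hardcoded string')
    parser.add_argument('--gpu', type=int, default=0,
                        help='GPU index to use, instead of a hardcoded number')
    args = parser.parse_args()
    print(args.config, args.gpu)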

[Errno 2] No such file or directory: 'vector.txt'

Hello! I'm very interested in your work. While reproducing it I ran into this problem:
[Errno 2] No such file or directory: 'vector.txt'
How is vector.txt, which as I understand provides the initial word-embedding configuration, generated? If convenient, could you release it or provide a download link? Thanks!
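
If you only need a placeholder, a randomly initialized file can be written along these lines; note that the word2vec-style text format (a word followed by its vector values) is an assumption about what the repo expects:

    import io
    import numpy as np

    vocab = [u'我', u'很', u'喜欢', u'你']  # illustrative vocabulary
    dim = 100  # embedding size, assumed
    with io.open('vector.txt', 'w', encoding='utf-8') as f:
        for word in vocab:
            vec = np.random.uniform(-0.1, 0.1, dim)
            f.write(word + u' ' + u' '.join(u'%.6f' % x for x in vec) + u'\n')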

Code hangs during training

After running for a while, the program gets stuck: no error is reported and GPU memory is sufficient, it just stops making progress.
