
Commonsense Knowledge Aware Conversation Generation with Graph Attention

Introduction

Commonsense knowledge is vital to many natural language processing tasks. In this paper, we present a novel open-domain conversation generation model to demonstrate how large-scale commonsense knowledge can facilitate language understanding and generation. Given a user post, the model retrieves relevant knowledge graphs from a knowledge base and then encodes the graphs with a static graph attention mechanism, which augments the semantic information of the post and thus supports better understanding of the post. Then, during word generation, the model attentively reads the retrieved knowledge graphs and the knowledge triples within each graph to facilitate better generation through a dynamic graph attention mechanism, as shown in Figure 1.
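To make the static graph attention concrete: each retrieved graph is a set of triples (head, relation, tail), each triple is scored with its relation against a combination of its head and tail embeddings, and the graph is summarized as an attention-weighted sum of the concatenated head/tail vectors. Below is a minimal NumPy sketch of that idea; the dimensions, weight matrices, and function names are illustrative and are not the variables used in this repository.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def static_graph_attention(heads, relations, tails, Wh, Wr, Wt):
    """Attend over the triples (h_n, r_n, t_n) of one retrieved graph and
    return a single graph vector: a weighted sum of [h_n; t_n]."""
    # score each triple: beta_n = (Wr r_n)^T tanh(Wh h_n + Wt t_n)
    scores = np.einsum('nd,nd->n',
                       relations @ Wr.T,
                       np.tanh(heads @ Wh.T + tails @ Wt.T))
    alpha = softmax(scores)                     # attention weights over triples
    return alpha @ np.concatenate([heads, tails], axis=1)

# toy example: one graph with 4 triples and 8-dim embeddings
rng = np.random.default_rng(0)
n, d = 4, 8
g = static_graph_attention(rng.normal(size=(n, d)), rng.normal(size=(n, d)),
                           rng.normal(size=(n, d)),
                           rng.normal(size=(d, d)), rng.normal(size=(d, d)),
                           rng.normal(size=(d, d)))
print(g.shape)  # (16,): concatenated head/tail dimension
```

The dynamic graph attention used at decoding time works analogously, but the attention is re-computed at every decoding step, first over graphs and then over the triples within each graph.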

[Figure 1: model overview with static and dynamic graph attention]

This project is a TensorFlow implementation of our work, CCM.

Dependencies

  • Python 2.7
  • Numpy
  • Tensorflow 1.3.0

Quick Start

  • Dataset

    Commonsense Conversation Dataset contains one-turn post-response pairs with the corresponding commonsense knowledge graphs. Each pair is associated with some knowledge graphs retrieved from ConceptNet. We have applied some filtering rules to retain high-quality and useful knowledge graphs.

    Please download the Commonsense Conversation Dataset to the data directory.

  • Train

    python main.py

    The model will achieve the expected performance after 20 epochs.

  • Test

    python main.py --is_train False

    You can test the model with this command. The statistical results and the generated text will be written to the 'test.res' file and the 'test.log' file, respectively.

Details

Training

You can change the model parameters using:

--units xxx 				number of hidden units
--layers xxx 				number of RNN layers
--batch_size xxx 			batch size to use during training
--per_checkpoint xxx 			number of steps between saving and evaluating the model
--train_dir xxx				training directory
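For example, a hypothetical invocation overriding a few of these defaults (the values shown here are made up, not the recommended settings):

```shell
python main.py --units 512 --layers 2 --batch_size 100 \
    --per_checkpoint 1000 --train_dir ./train
```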

Evaluation

[Evaluation results figure from the paper]

Paper

Hao Zhou, Tom Young, Minlie Huang, Haizhou Zhao, Jingfang Xu, Xiaoyan Zhu.
Commonsense Knowledge Aware Conversation Generation with Graph Attention.
IJCAI-ECAI 2018, Stockholm, Sweden.

Please kindly cite our paper if the paper and the code are helpful to you.

Acknowledgments

Thanks to Prof. Minlie Huang and Prof. Xiaoyan Zhu for their kind help, and to my teammates for their support.

License

Apache License 2.0

ccm's People

Contributors

tuxchow

ccm's Issues

ConceptNet and input data

Hi, I am very interested in your work, but I have some questions: where is the input data, and how do you use ConceptNet? Thanks for sharing!

Can't get the dataset

Hi tuxchow, I can't get the dataset from the download link. Could you provide a new download address, or a sample dataset? Thank you anyway~

Not the author, but I'm working on extending this paper for my master's thesis, so I've done some work to decode the input data and recreate it; I think I can provide some insight.

match_triples is the list of triples in which an entity from the post and an entity from the response appear together in the same commonsense knowledge triple.

match_index is a list of the response entities matched with post entities, with one entry per response word. Each entry is a pair: the first index is the number (from post_triples) of the post entity that matches the current word, and the second index is the position of the matching entity in all_entities. [-1, -1] is appended if the response word is not an entity or the entity doesn't match anything in the post.

post_triples is a list, one entry per post word, marking the entities that appear in the post: 0 means the word is not a known entity, and values greater than 0 index the entities, starting at 1 and incrementing each time a new entity is found.

all_entities is a list of, for each entity found in the post, all the entities on the other end of a csk triple containing it.

response_triples has one entry per response word: -1 if the word is not an entity or the entity doesn't match a triple in csk, otherwise the index of the matched triple, i.e. the triple linking that response entity to a post entity.

all_triples is a list of, for each entity found in the post, the indices of all the csk triples containing it.
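Pulling the field descriptions above together, here is a toy training pair that illustrates the formats. All entity and triple indices here are invented purely for illustration; they are not real indices from the dataset.

```python
# A toy training pair with invented entity/triple indices, illustrating
# the field formats described above.
example = {
    "post": ["i", "love", "my", "dog"],
    "response": ["cats", "are", "nice", "too"],
    # 0 = word is not a known entity; >0 = running entity counter (1-based)
    "post_triples": [0, 0, 0, 1],            # only "dog" is an entity
    # for each post entity, the indices of the csk triples containing it
    "all_triples": [[37, 52]],
    # for each post entity, the entities on the other end of those triples
    "all_entities": [[204, 119]],            # e.g. indices of "cat", "pet"
    # per response word: matched triple index, or -1
    "response_triples": [37, -1, -1, -1],    # "cats" matches triple 37
    # per response word: [post entity number (1-based),
    #                     position in all_entities], or [-1, -1]
    "match_index": [[1, 0], [-1, -1], [-1, -1], [-1, -1]],
    # all triples matched between post and response entities
    "match_triples": [37],
}

# sanity checks on the invariants implied by the descriptions above
assert len(example["post_triples"]) == len(example["post"])
assert len(example["match_index"]) == len(example["response"])
assert len(example["all_triples"]) == len(example["all_entities"])
```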

Below is the script I've written to recreate the training data. It seems to output extra entities in all_entities, at least more than what the authors found, but that shouldn't break anything. If you find a bug in the script, please let me know so I can update it on my end.

import json

test = {"post": ["you", "mean", "the", "occupation", "that", "did", "happen", "?"], "response": ["no", "i", "mean", "the", "fighting", "invasion", "that", "the", "military", "made", "so", "many", "purple", "hearts", "for", "in", "anticipation", "for", "that", "we", "have", "n't", "used", "up", "to", "this", "day", "."]}
# resource.txt ships with the Commonsense Conversation Dataset
with open('resource.txt') as f:
    data = json.load(f)

# map each entity to (a) the indices of the triples it appears in and
# (b) the entity on the other end of each of those triples
data['postEntityToCSKTripleIndex'] = {}
data['postEntityToOtherCSKTripleEntities'] = {}
for index, triple in enumerate(data['csk_triples']):
    firstEntity = triple.split(',')[0]
    secondEntity = triple.split(',')[2].strip()
    data['postEntityToCSKTripleIndex'].setdefault(firstEntity, []).append(index)
    data['postEntityToCSKTripleIndex'].setdefault(secondEntity, []).append(index)
    data['postEntityToOtherCSKTripleEntities'].setdefault(firstEntity, []).append(data['dict_csk_entities'][secondEntity])
    data['postEntityToOtherCSKTripleEntities'].setdefault(secondEntity, []).append(data['dict_csk_entities'][firstEntity])

data['indexToCSKTriple'] = {v: k for k,v in data['dict_csk_triples'].items()}

post_triples = []
all_triples = []
all_entities = []

post = test['post']
index = 0
for word in post:
    # a post word counts as an entity only if it appears in some csk triple;
    # checking membership directly avoids the partial-append bug a bare
    # try/except would hide when a later lookup fails
    if word in data['postEntityToCSKTripleIndex']:
        index += 1
        post_triples.append(index)
        all_triples.append(data['postEntityToCSKTripleIndex'][word])
        all_entities.append(data['postEntityToOtherCSKTripleEntities'][word])
    else:
        post_triples.append(0)
test['post_triples'] = post_triples
test['all_triples'] = all_triples
test['all_entities'] = all_entities

response_triples = []
match_index = []
match_triples = []
for word in test['response']:
    # a response word matches if it is an entity that appears on the other
    # end of a triple rooted at some post entity
    found = False
    if word in data['dict_csk_entities']:
        entityIndex = data['dict_csk_entities'][word]
        for index, entitiesList in enumerate(test['all_entities']):
            for subindex, entity in enumerate(entitiesList):
                if entity == entityIndex:
                    match_index.append([index + 1, subindex])
                    response_triples.append(test['all_triples'][index][subindex])
                    match_triples.append(test['all_triples'][index][subindex])
                    found = True
                    break
            if found:
                break  # stop after the first match to avoid duplicate appends
    if not found:
        response_triples.append(-1)
        match_index.append([-1, -1])

test['response_triples'] = response_triples
test['match_index'] = match_index
test['match_triples'] = match_triples
print(str(test))

Originally posted by @andrewtackett in https://github.com/tuxchow/ccm/issues/3#issuecomment-461907771
