
Introduction

Code for the paper "Dialogue State Induction Using Neural Latent Variable Models", IJCAI 2020.

Dialogue State Induction (DSI)

Dialogue state induction aims to automatically induce dialogue states from raw dialogues. We assume a large set of dialogue records across many different domains, but without manual labeling of dialogue states. Such data are relatively easy to obtain, for example from the customer service call records of different businesses. We therefore propose the task of dialogue state induction (DSI): automatically inducing slot-value pairs from raw dialogue data. The difference between DSI and dialogue state tracking (DST) is illustrated in Figure 1.
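Concretely, a dialogue state is a set of slot-value pairs that accumulates as the dialogue proceeds. The sketch below (illustrative names only, not the paper's code) shows how a state grows turn by turn:

```python
def accumulate_states(turns):
    """Given per-turn slot-value pairs, return the accumulated
    dialogue state after each turn. A later value overwrites an
    earlier one for the same slot (the user changed their mind)."""
    state, states = {}, []
    for turn_pairs in turns:
        state.update(turn_pairs)
        states.append(dict(state))
    return states

# Hypothetical three-turn dialogue about finding a restaurant.
turns = [
    {"restaurant-food": "italian"},
    {"restaurant-area": "centre"},
    {"restaurant-food": "chinese"},  # user revises the food constraint
]
```

In a DST setting these per-turn pairs would come from annotations and a fixed ontology; DSI must induce them from the raw dialogue text alone.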

Figure 1: Comparison between DSI and traditional DST. Strikethrough font represents the resources not needed by DSI. The dialogue state is accumulated as the dialogue proceeds. Turns are separated by dashed lines; dialogues and the external ontology are separated by black lines.

In particular, we use two neural latent variable models (one base model together with its multi-domain variant) to induce dialogue states. Two datasets, MultiWOZ 2.1 and SGD, are used in our experiments.

Neural Latent Variable Models

In this work, we introduce two incrementally complex neural latent variable models for DSI, treating the whole state and each slot as latent variables.
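The models themselves are defined in the code and the paper; as a hedged illustration of the underlying machinery, the sketch below shows the standard Gaussian reparameterization trick and KL term that neural latent variable models of this kind train with (all names are illustrative, not the repository's actual code):

```python
import math
import random

def reparameterize(mu, log_var, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1).

    The randomness is isolated in eps, so the latent vector z
    stays differentiable with respect to mu and log_var."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over dimensions;
    the regularizer in the usual evidence lower bound."""
    return sum(0.5 * (math.exp(lv) + m * m - 1.0 - lv)
               for m, lv in zip(mu, log_var))
```

In the actual models, `mu` and `log_var` would be produced by an encoder network from the dialogue turn, and `z` would represent the latent dialogue state or a latent slot.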

Experiments

We evaluate our proposed DSI task on two datasets: MultiWOZ 2.1 and SGD. We consider two evaluation metrics:

  1. State Matching (Precision, Recall and F1-score in Table 1): evaluates the overlap between the induced states and the ground truth.
  2. Goal Accuracy (Accuracy in Table 1): the predicted dialogue state for a turn is considered correct only when all the user's search goal constraints are correctly and exactly identified.

We evaluate both metrics at both the turn level and the joint level (Table 1). The joint-level dialogue state is the standard notion of dialogue state, reflecting the user goal from the beginning of the dialogue up to the current turn, whereas the turn-level state reflects the local user goal at each dialogue turn. The joint-level metrics are stricter because they jointly consider the output of all turns.
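As a sketch of the distinction (illustrative code, not the repository's evaluation script), turn-level goal accuracy scores each turn's local slot-value pairs, while joint-level goal accuracy scores the state accumulated up to each turn:

```python
def turn_goal_accuracy(pred_turns, gold_turns):
    """Fraction of turns whose predicted slot-value set exactly
    matches the gold set for that single turn."""
    correct = sum(set(p) == set(g) for p, g in zip(pred_turns, gold_turns))
    return correct / len(gold_turns)

def joint_goal_accuracy(pred_turns, gold_turns):
    """Like turn-level accuracy, but each turn is scored on the
    state accumulated from the first turn up to that turn, so one
    early error can make every later turn wrong."""
    pred_state, gold_state, correct = set(), set(), 0
    for p, g in zip(pred_turns, gold_turns):
        pred_state |= set(p)
        gold_state |= set(g)
        correct += pred_state == gold_state
    return correct / len(gold_turns)

# Hypothetical two-turn example: the first turn is wrong, the
# second is locally right, so turn-level accuracy is 0.5 while
# joint-level accuracy is 0.0 (the early error persists).
gold = [[("hotel-area", "east")], [("hotel-stars", "4")]]
pred = [[("hotel-area", "west")], [("hotel-stars", "4")]]
```

This is why the joint-level numbers in Table 1 are so much lower than the turn-level ones.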

Note: A fuzzy matching mechanism is used to compare induced values with the ground truth, similar to the evaluation of the SGD dataset.
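The exact matching rule lives in the evaluation code (the repository lists fuzzywuzzy as a dependency); as a hedged stand-in using only the standard library, a fuzzy value comparison can be sketched with difflib. The 0.8 threshold below is an assumption for illustration, not the value used in the paper:

```python
from difflib import SequenceMatcher

def fuzzy_match(induced, gold, threshold=0.8):
    """Treat an induced value as matching the gold value when the
    character-level similarity ratio reaches the threshold, so
    near-identical surface forms still count as a match."""
    ratio = SequenceMatcher(None, induced.lower(), gold.lower()).ratio()
    return ratio >= threshold
```

For example, `fuzzy_match("centre", "center")` accepts a spelling variant that exact string equality would reject, while clearly different values such as `"east"` and `"west"` are still rejected.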

Table 1: Results on MultiWOZ 2.1 and SGD.

| Model    | Dataset      | Turn F1 | Turn Accuracy | Joint F1 | Joint Accuracy |
|----------|--------------|---------|---------------|----------|----------------|
| DSI-base | MultiWOZ 2.1 | 37.3    | 25.7          | 32.1     | 2.3            |
| DSI-base | SGD          | 26.0    | 21.1          | 14.5     | 2.3            |
| DSI-GM   | MultiWOZ 2.1 | 49.6    | 36.1          | 44.8     | 5.0            |
| DSI-GM   | SGD          | 33.5    | 27.5          | 19.5     | 3.1            |

Pre-processing

For pre-processing, you can either directly download the pre-processed files or build the training data yourself.

Download

Build data

Required packages

  1. python 3.7
  2. pytorch>=1.0
  3. allennlp
  4. stanfordcorenlp

Required files

  1. stanford-corenlp-full-2018-10-05: pre-processing (candidate extraction) toolkit. Download it from stanfordnlp and unzip it into the utils folder.
  2. ELMo pretrained model: download the weights file and the options file from allennlp and put them into the utils/elmo_pretrained_model folder.

Dataset

For MultiWOZ 2.1 dataset:

  1. Download MultiWOZ 2.1.
  2. Data pre-processing: process MultiWOZ 2.1 to follow the data format of the Schema-Guided Dialogue dataset (to run the script, you need to install tensorflow).

python multiwoz/create_data_from_multiwoz.py \
    --input_data_dir=<downloaded_multiwoz2.1_dir> \
    --output_dir data/multiwoz21

For SGD dataset:

  1. Download SGD.
  2. Put train, dev and test folders into data/dstc8 folder.

Run

Extract value candidates, extract features and build vocabulary.

python build_data.py -t multiwoz21|dstc8

Model

Required packages

  1. python 3.7
  2. pytorch>=1.0
  3. numpy
  4. fuzzywuzzy
  5. tqdm
  6. sklearn

Training

python train.py -t multiwoz21|dstc8 -r train -m dsi-base|dsi-gm

Configurations, including data paths and model hyper-parameters, are stored in config.py.

Prediction

python train.py -t multiwoz21|dstc8 -r predict -m dsi-base|dsi-gm

Reference

Please cite the following paper if you use any source codes in your work:

@inproceedings{min2020dsi,
  title={Dialogue State Induction Using Neural Latent Variable Models},
  author={Min, Qingkai and Qin, Libo and Teng, Zhiyang and Liu, Xiao and Zhang, Yue},
  booktitle={IJCAI},
  pages={3845--3852},
  year={2020}
}

