This project forked from looperxx/df-net

Open source code for ACL 2020 Paper "Dynamic Fusion Network for Multi-Domain End-to-end Task-Oriented Dialog"

Dynamic Fusion Network for Multi-Domain End-to-end Task-Oriented Dialog

This repository contains the PyTorch implementation of the paper:

Dynamic Fusion Network for Multi-Domain End-to-end Task-Oriented Dialog. Libo Qin, Xiao Xu, Wanxiang Che, Yue Zhang, Ting Liu. ACL 2020. [PDF]

If you use any of the source code or datasets included in this toolkit in your work, please cite the following paper. The BibTeX entry is listed below:

@inproceedings{qin-etal-2020-dynamic,
    title = "Dynamic Fusion Network for Multi-Domain End-to-end Task-Oriented Dialog",
    author = "Qin, Libo  and
      Xu, Xiao  and
      Che, Wanxiang  and
      Zhang, Yue  and
      Liu, Ting",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.565",
    pages = "6344--6354",
    abstract = "Recent studies have shown remarkable success in end-to-end task-oriented dialog system. However, most neural models rely on large training data, which are only available for a certain number of task domains, such as navigation and scheduling. This makes it difficult to scalable for a new domain with limited labeled data. However, there has been relatively little research on how to effectively use data from all domains to improve the performance of each domain and also unseen domains. To this end, we investigate methods that can make explicit use of domain knowledge and introduce a shared-private network to learn shared and specific knowledge. In addition, we propose a novel Dynamic Fusion Network (DF-Net) which automatically exploit the relevance between the target domain and each domain. Results show that our models outperforms existing methods on multi-domain dialogue, giving the state-of-the-art in the literature. Besides, with little training data, we show its transferability by outperforming prior best model by 13.9{\%} on average.",
}

[Figure: contrast]

The following sections explain how to use this repository step by step.

Architecture

[Figure: framework]

Results

[Figure: result]

We cleaned up our code and reran the experiments using the environment and the suggested hyper-parameter settings described below.

Dataset   BLEU  F1    Navigate F1  Weather F1  Calendar F1
SMD       15.2  62.5  55.7         57.3        73.8

Dataset   BLEU  F1    Restaurant F1  Attraction F1  Hotel F1
MultiWOZ  9.5   34.8  37.5           31.2           32.8

Preparation

Our code is based on PyTorch 1.2. Required Python packages:

  • numpy==1.14.2
  • tqdm==4.44.1
  • pytorch==1.2.0
  • python==3.6.3
  • cudatoolkit==9.2
  • cudnn==7.6.5

We highly recommend using Anaconda to manage your Python environment.
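Before training, it can help to confirm that the active environment matches the pinned versions. The following is a minimal sketch, not part of this repository: the names PINNED, installed_version and find_mismatches are hypothetical, and it assumes each package exposes a `__version__` attribute (the conda package pytorch is imported as torch). It checks only the three importable Python packages, not python, cudatoolkit or cudnn.

```python
import importlib

# Pins copied from the list above (importable Python packages only).
# Assumption: the conda package "pytorch" is imported as "torch".
PINNED = {"numpy": "1.14.2", "tqdm": "4.44.1", "torch": "1.2.0"}


def installed_version(module_name):
    """Return module.__version__, or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def find_mismatches(pinned, lookup=installed_version):
    """Return {name: (wanted, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        # Some builds report a local suffix such as "1.2.0+cu92"; ignore it.
        base = found.split("+")[0] if found else None
        if base != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(PINNED)
    for name, (wanted, found) in problems.items():
        print(f"{name}: expected {wanted}, found {found}")
    if not problems:
        print("All pinned Python packages match.")
```

A mismatch does not necessarily mean the code will fail, but it is one possible cause if the reported numbers cannot be reproduced.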

How to Run it

The script myTrain.py is the entry point of the project. You can run the experiments with the following commands:

# SMD dataset
python myTrain.py -gpu=True -ds=kvr -dr=0.2 -bsz=32 -tfr=0.8 -an=SMD -op=SMD.log
# MultiWOZ 2.1 dataset
python myTrain.py -gpu=True -ds=woz -dr=0.2 -bsz=32 -tfr=0.9 -an=WOZ -op=WOZ.log

We also provide the parameters of our reported models in the save/best directory. You can evaluate them with the following commands:

python myTrain.py -gpu=True -e=0 -ds=kvr -bsz=32 -path=save/best/SMD -op=SMD.log
python myTrain.py -gpu=True -e=0 -ds=woz -bsz=32 -path=save/best/MultiWOZ -op=WOZ.log

Due to stochastic factors (e.g., GPU and environment), you may need to tune the hyper-parameters slightly using grid search to reproduce the results reported in our paper.

All hyper-parameters are defined in utils/config.py. The suggested hyper-parameter settings for grid search are:

  • Dropout ratio [0.1, 0.15, 0.2, 0.25, 0.3]
  • Batch size [8, 16, 32]
  • Teacher forcing ratio [0.7, 0.8, 0.9, 1.0]
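The grid above contains 5 x 3 x 4 = 60 combinations. The sketch below is not part of this repository. It prints one training command per combination and assumes that the -dr, -bsz and -tfr flags in the commands shown earlier correspond to dropout ratio, batch size and teacher forcing ratio. The function name grid_commands and the run-name scheme passed to -an and -op are hypothetical.

```python
from itertools import product

# Suggested grid, copied from the list above.
DROPOUT = [0.1, 0.15, 0.2, 0.25, 0.3]
BATCH_SIZE = [8, 16, 32]
TEACHER_FORCING = [0.7, 0.8, 0.9, 1.0]


def grid_commands(dataset="kvr"):
    """Build one myTrain.py command line per hyper-parameter combination."""
    commands = []
    for dr, bsz, tfr in product(DROPOUT, BATCH_SIZE, TEACHER_FORCING):
        # Hypothetical naming scheme so each run writes to its own log file.
        tag = f"{dataset}_dr{dr}_bsz{bsz}_tfr{tfr}"
        commands.append(
            f"python myTrain.py -gpu=True -ds={dataset} -dr={dr} "
            f"-bsz={bsz} -tfr={tfr} -an={tag} -op={tag}.log"
        )
    return commands


if __name__ == "__main__":
    # Print the commands rather than running them, so they can be
    # reviewed or handed to a job scheduler.
    for command in grid_commands("kvr"):
        print(command)
```

Passing "woz" instead of "kvr" would generate the corresponding commands for MultiWOZ 2.1.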

If you have any questions, please open an issue on the project or email me, and we will reply soon.

Acknowledgement

Global-to-local Memory Pointer Networks for Task-Oriented Dialogue. Chien-Sheng Wu, Richard Socher, Caiming Xiong. ICLR 2019. [PDF] [Open Review] [Code]

We are highly grateful for the public code of GLMP!
