Open Intent Discovery through Unsupervised Semantic Clustering and Dependency Parsing

Introduction

Intent understanding plays an important role in dialog systems and is typically formulated as a supervised learning problem. However, designing the intents for a new domain from scratch is challenging and time-consuming, and usually requires substantial manual effort from domain experts. This project presents an unsupervised two-stage approach that discovers intents and automatically generates meaningful intent labels from a collection of unlabeled utterances in a domain, as illustrated in the following figure.

Figure: Unsupervised two-stage approach for intent discovery

In the first stage, we aim to generate a set of semantically coherent clusters where the utterances within each cluster convey the same intent. We obtain the utterance representation from various pre-trained sentence embeddings and apply clustering methods. In the second stage, the objective is to generate an intent label automatically for each cluster. We extract the ACTION-OBJECT pair from each utterance using a dependency parser and take the most frequent pair within each cluster, e.g., book-restaurant, as the generated intent label. We empirically show that the proposed unsupervised approach can generate meaningful intent labels automatically and achieve high precision and recall in utterance clustering and intent discovery.
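
The second stage can be illustrated with a minimal Python sketch. It is not the repository's code. It assumes that cluster assignments from the first stage are already available and that each utterance has already been parsed; the parses below are written by hand in place of real dependency-parser output. The `Token` type, the helper names, and the choice of the `dobj` relation to identify the OBJECT are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, List, NamedTuple, Optional, Tuple


class Token(NamedTuple):
    text: str  # token text (assumed already lower-cased or lemmatized)
    dep: str   # dependency relation to the head token
    head: int  # index of the head token within the utterance


def action_object(tokens: List[Token]) -> Optional[Tuple[str, str]]:
    """Return (ACTION, OBJECT) for the first direct-object arc, or None."""
    for tok in tokens:
        if tok.dep == "dobj":  # assumption: OBJECT is the direct object
            return (tokens[tok.head].text, tok.text)
    return None


def label_clusters(parsed: List[List[Token]],
                   cluster_ids: List[int]) -> Dict[int, str]:
    """Label each cluster with its most frequent ACTION-OBJECT pair."""
    counts: Dict[int, Counter] = defaultdict(Counter)
    for tokens, cid in zip(parsed, cluster_ids):
        pair = action_object(tokens)
        if pair is not None:
            counts[cid][pair] += 1
    return {cid: "-".join(c.most_common(1)[0][0]) for cid, c in counts.items()}


# Hand-written parses standing in for dependency-parser output.
parsed = [
    # "book a restaurant"
    [Token("book", "ROOT", 0), Token("a", "det", 2), Token("restaurant", "dobj", 0)],
    # "book the restaurant"
    [Token("book", "ROOT", 0), Token("the", "det", 2), Token("restaurant", "dobj", 0)],
    # "book a table"
    [Token("book", "ROOT", 0), Token("a", "det", 2), Token("table", "dobj", 0)],
    # "play some music"
    [Token("play", "ROOT", 0), Token("some", "det", 2), Token("music", "dobj", 0)],
    # "play music"
    [Token("play", "ROOT", 0), Token("music", "dobj", 0)],
    # "play a song"
    [Token("play", "ROOT", 0), Token("a", "det", 2), Token("song", "dobj", 0)],
]
cluster_ids = [0, 0, 0, 1, 1, 1]  # assumed output of the first stage

labels = label_clusters(parsed, cluster_ids)
print(labels)
```

Utterances without a direct-object arc contribute no pair, so a cluster made up only of such utterances receives no label in this sketch.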

Source Code

This repository contains the core code for running the experiments. The SNIPS dataset used here was preprocessed from the data at https://github.com/sonos/nlu-benchmark. Please cite their paper if you use the dataset.

How to run the experiments?

The batch script can be run as follows:

bash batch.sh

Citation

If you use the released source code in your work, please cite the following paper:

@article{liu2021open,
  title={Open Intent Discovery through Unsupervised Semantic Clustering and Dependency Parsing},
  author={Liu, Pengfei and Ning, Youzhang and Wu, King Keung and Li, Kun and Meng, Helen},
  journal={arXiv preprint arXiv:2104.12114},
  year={2021}
}

Report

Please feel free to create an issue or email the first author at [email protected].
