
dariowho / due


An episodic, multi-model, servable framework for Dialog Systems

License: GNU General Public License v3.0

Python 56.66% Jupyter Notebook 43.34%
ai conversational-agents dialogue-systems nlp

due's Issues

Include Toy corpus for easier testing

AS A Due user
IN ORDER TO try out the software as easily as possible
I WANT TO import a toy corpus that needs no extra resources

Rationale
Currently there are no pre-trained agents to be imported, and the size of the available corpora is a barrier to a quick setup of the agent. We want a user to be able to try out an agent with as few external dependencies as possible.

TODO

  • Change serialization format to JSON/YAML
  • Add a built-in toy corpus
  • Update README to use the toy corpus

Make Events, Episodes, Brains and Agents serializable

As of now most of the entities in Due can be "saved", meaning that they implement a save() method returning the entity itself as a Python object.

We want saved objects to be serializable (pickle? JSON?) and we need to implement the deserialization and loading counterpart of the process.
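A minimal sketch of the round trip, assuming JSON is the chosen format. The `Event` fields, the `load()` counterpart and the `serialize()`/`deserialize()` helpers are hypothetical names used for illustration; the only thing taken from the issue is that `save()` returns a plain Python object.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Event:
    """Hypothetical stand-in for a Due entity; the field names are assumptions."""
    kind: str
    agent: str
    payload: str

    def save(self) -> dict:
        # Return the entity as a plain Python object, as save() does today.
        return asdict(self)

    @classmethod
    def load(cls, saved: dict) -> "Event":
        # Hypothetical loading counterpart: rebuild the entity from save() output.
        return cls(**saved)


def serialize(event: Event) -> str:
    # Turn the saved object into a JSON string.
    return json.dumps(event.save())


def deserialize(text: str) -> Event:
    # Parse the JSON string and load the entity back.
    return Event.load(json.loads(text))
```

The same pattern would apply to Episodes, Brains and Agents, as long as `save()` only returns types that the chosen format can represent.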

Allow for tokenized input in `due.nlp` module

Current NLP functions are meant to receive strings where tokens are delimited by spaces. These can be either raw inputs or normalized strings where tokens are properly split (e.g. "It's raining" could be normalized into "it 's raining").

This approach is not sufficient to handle tokens that contain multiple words.

As a solution, NLP methods should accept both a string (split on spaces, as happens now) and a list of strings (already split).
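One way to sketch this is a small normalizing helper that every NLP function calls first. The name `as_tokens` is hypothetical and not part of `due.nlp`.

```python
from typing import List, Union


def as_tokens(text: Union[str, List[str]]) -> List[str]:
    """Return a list of tokens for either kind of input (hypothetical helper)."""
    if isinstance(text, str):
        # Current behavior: a string is split on whitespace.
        return text.split()
    # Already tokenized: keep tokens as they are, including multi-word ones.
    return list(text)
```

With this, `as_tokens("it 's raining")` and `as_tokens(["New York", "is", "big"])` both yield lists, and the multi-word token "New York" survives intact.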

Test Python2 compatibility

So far Due has only been tested in a Python 3 environment. We want it to be compatible with 2.7 as well, through compatibility libraries such as python-future and six.

Acceptance criteria: run the test suite with Python 2.7

Finalize basic readme

A first version of README.md must have:

  • Project's mission
  • How to build
  • How to run
  • How to run the test suite
  • How to build the documentation

Add support for Webhook Actions

We want an Action to wrap a call around an API resource. This allows Due to be used as an interface for existing applications exposing a REST API.
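A rough sketch of such an Action, using only the standard library. The class name `WebhookAction`, its constructor arguments and the JSON payload are assumptions made for illustration; the issue does not specify an interface.

```python
import json
import urllib.request


class WebhookAction:
    """Hypothetical Action that wraps a call to a REST resource."""

    def __init__(self, url: str, method: str = "POST"):
        self.url = url
        self.method = method

    def build_request(self, payload: dict) -> urllib.request.Request:
        # Encode the payload as JSON and prepare the HTTP request.
        body = json.dumps(payload).encode("utf-8")
        return urllib.request.Request(
            self.url,
            data=body,
            method=self.method,
            headers={"Content-Type": "application/json"},
        )

    def run(self, payload: dict, timeout: float = 5.0) -> str:
        # Perform the HTTP call and return the response body as text.
        request = self.build_request(payload)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8")
```

Keeping `build_request()` separate from `run()` makes the Action testable without reaching the external application.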

Introduce interactive CLI for easier testing

AS A Due user
IN ORDER TO try out the software as easily as possible
I WANT TO have a console based agent that needs no extra resources

Rationale
Currently, the only way to deploy Due is through its XMPP interface, which requires access to an external chat server. We want a user to be able to try out an agent with as few external dependencies as possible.

This is also an opportunity to refactor the role of the Agent class, which currently does little more than passing information to/from a Brain class.

TODO

  • Refactor the role of the Agent interface
  • Introduce serve package porting current XMPP agent
  • Add a serve.cli for dialog over CLI
  • Update README to use ConsoleAgent (move XMPP example to docs)
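The console dialog loop could look roughly like the sketch below. The function name `run_console` and the `reply_fn` callable are hypothetical; in Due the reply would come from an Agent rather than a plain function.

```python
def run_console(reply_fn, read=input, write=print):
    """Read lines from the console and print the agent's replies.

    reply_fn is a hypothetical callable mapping a user sentence to a reply.
    read and write are injectable so that the loop can be tested.
    """
    while True:
        try:
            line = read("> ")
        except EOFError:
            # End of input (e.g. Ctrl-D) closes the dialog.
            break
        if line.strip().lower() in {"quit", "exit"}:
            break
        write(reply_fn(line))
```

For example, `run_console(lambda sentence: sentence.upper())` starts a session that echoes the input in upper case until the user types "quit".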

Implement asynchronous notifications for events

Agents in an episode are notified every time an Event happens; each notification triggers a callback mechanism that is currently synchronous, and this allows an agent to process only one event at a time.

We want notifications to be asynchronous, to make it possible for agents to suspend or modify their reasoning activities after events are triggered. It may be useful to include "typing" events in the flow.
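As an illustration, asynchronous notifications could be built on `asyncio`. This is only a sketch under that assumption: the `Agent` class, the `event_callback` coroutine and `notify_all` are hypothetical names, and events are plain strings here.

```python
import asyncio


class Agent:
    """Hypothetical agent with an asynchronous event callback."""

    def __init__(self, name: str):
        self.name = name
        self.seen = []

    async def event_callback(self, event: str) -> None:
        # Simulate some reasoning time; other callbacks keep running meanwhile.
        await asyncio.sleep(0.01)
        self.seen.append(event)


async def notify_all(agents, event: str) -> None:
    # Schedule every callback concurrently instead of calling them one by one.
    await asyncio.gather(*(agent.event_callback(event) for agent in agents))
```

A caller would run `asyncio.run(notify_all(agents, "hello"))`; because each callback is a task, it could also be cancelled when a newer event makes its reasoning obsolete.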

Implement a baseline CosineBrain model

We want a baseline Brain model that decides on Events to issue in Episodes based on a trivial vector similarity measure between sentences, and/or whole Event sequences.
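A minimal sketch of the idea, assuming bag-of-words vectors and a memory of (utterance, reply) pairs. The `learn()` and `predict()` method names are hypothetical, and a real implementation would work on Events and Episodes rather than on raw strings.

```python
import math
from collections import Counter


def cosine(a: str, b: str) -> float:
    """Cosine similarity between the bag-of-words vectors of two sentences."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[token] * vb[token] for token in va)
    norm_a = math.sqrt(sum(count * count for count in va.values()))
    norm_b = math.sqrt(sum(count * count for count in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class CosineBrain:
    """Hypothetical baseline: reply with the answer to the most similar memory."""

    def __init__(self):
        self.memory = []  # list of (utterance, reply) pairs

    def learn(self, utterance: str, reply: str) -> None:
        self.memory.append((utterance, reply))

    def predict(self, utterance: str):
        if not self.memory:
            return None
        # Pick the remembered utterance closest to the input.
        best = max(self.memory, key=lambda pair: cosine(utterance, pair[0]))
        return best[1]
```

After learning the pairs ("hello there", "hi!") and ("how are you", "fine, thanks"), `predict("hello")` returns "hi!" because only the first remembered utterance shares a token with the input.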

Implement a Brain interface

The Brain is responsible for predicting the most appropriate Events to issue in the current Episode, based on the memory of the previous ones.

We want to define an interface to allow different implementations to be integrated in the Agents.
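Such an interface could be expressed as an abstract base class. The sketch below is an assumption about its shape: the method names `learn` and `predict` and the representation of Episodes as lists of Events are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import List


class Brain(ABC):
    """Hypothetical Brain interface; method names are assumptions."""

    @abstractmethod
    def learn(self, episodes: List[list]) -> None:
        """Update the model from past Episodes (each a list of Events)."""

    @abstractmethod
    def predict(self, episode: list) -> list:
        """Return the Events to issue next in the current Episode."""


class EchoBrain(Brain):
    """Trivial implementation: repeat the last Event of the Episode."""

    def learn(self, episodes: List[list]) -> None:
        pass  # nothing to learn

    def predict(self, episode: list) -> list:
        return episode[-1:]
```

An Agent would then depend only on `Brain`, so that implementations can be swapped without touching the Agent code.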

Introduce Rewards and/or other feedback option

It's realistic to think that Agents will implement some form of reinforcement learning to achieve good results; explicit feedback from the user would improve this. Possibly, the reward mechanisms should be intertwined with language understanding, so that regular sentences can be associated with "hardcoded" rewards.

Implement HTTP serving module

AS a user
I WANT TO have an agent exposed on HTTP as a REST API
SO THAT I can easily integrate it with my software

Write tests

Code written so far is untested. We want to catch up with tests before moving to the next steps.

  • due.agent
  • due.event
  • due.brain
  • due.episode
  • due.util

Switch from Pipenv to Poetry

AS a developer
IN ORDER TO improve package lock speed and drop maintenance of setup.py
I WANT TO use Poetry for dependency management
INSTEAD of Pipenv

Rationale
Poetry (https://poetry.eustace.io/) is an alternative to Pipenv that is supposed to speed up package locking, and it integrates a framework to build packages without maintaining a separate setup.py. We want that.

Refine support for multi-agent episodes

Even though many parts of the framework are written with multi-agent support in mind, the "2 agents" assumption was made here and there to ease development.

Someday, even though not in the foreseeable future, this assumption needs to be relaxed.

Remove 'python-magic' dependency

AS a developer
I WANT TO have due running without python-magic
SO THAT I can install due easily, and on many different platforms

Technical details
We use python-magic to detect file types during serialization. This is inconvenient, because the package requires libmagic to be installed at OS level. We want to find a replacement or, failing that, change the deserialization flow.

Add support for basic actions

An Event in an Episode can be an Utterance. It should also be possible to issue Action Events, possibly supporting dynamic loading from a user-supplied library.

In their basic implementation, Actions take no parameters.
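Dynamic loading from a user-supplied library can be sketched with `importlib`. The `Action` base class, the `Wave` example and the `load_class` helper are hypothetical names; only the "no parameters" constraint comes from the issue.

```python
import importlib


class Action:
    """Hypothetical base class: a basic Action takes no parameters."""

    def run(self):
        raise NotImplementedError


class Wave(Action):
    """Example Action, as a user-supplied library might define it."""

    def run(self):
        return "waves"


def load_class(dotted_path: str):
    """Import a class from a dotted path such as 'my_library.actions.Wave'."""
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)
```

An Action Event could then carry just the dotted path of the class, and the receiving side would resolve it with `load_class(path)().run()`.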

Pass last Event along with Episode in Agent callbacks

The Event that triggers a callback is currently inferred by the Agent as the last Event of the Episode. To make it explicit, and to prepare for asynchronous notifications, we want to pass the Event as an argument of callback functions.
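The change can be sketched as follows. The classes are stripped-down stand-ins rather than Due's actual ones, and the callback name `event_callback` and its argument order are assumptions.

```python
class Episode:
    """Hypothetical, stripped-down Episode."""

    def __init__(self, agents):
        self.agents = agents
        self.events = []

    def add_event(self, event):
        self.events.append(event)
        for agent in self.agents:
            # Pass the triggering Event explicitly, along with the Episode.
            agent.event_callback(event, self)


class LoggingAgent:
    """Hypothetical agent that records the Events it is notified about."""

    def __init__(self):
        self.received = []

    def event_callback(self, event, episode):
        # No need to infer the trigger from episode.events[-1].
        self.received.append(event)
```

With asynchronous notifications the last Event of the Episode may no longer be the one that triggered the callback, which is why the explicit argument matters.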

Create a Resource Loading framework

Due should have its own library of resources.

A Resource Manager should define the folder where Resources are located, and provide easy access to the other components of the application.
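A minimal sketch of such a manager, assuming resources are registered by name and stored as files under a single folder. The class layout and method names are hypothetical.

```python
from pathlib import Path


class ResourceManager:
    """Hypothetical manager mapping resource names to files in one folder."""

    def __init__(self, resource_folder):
        self.resource_folder = Path(resource_folder)
        self._paths = {}

    def register(self, name: str, relative_path: str) -> None:
        # Associate a short name with a path relative to the resource folder.
        self._paths[name] = relative_path

    def path(self, name: str) -> Path:
        # Resolve a registered name to its full path.
        return self.resource_folder / self._paths[name]

    def open(self, name: str, mode: str = "r"):
        # Open a registered resource; the caller is responsible for closing it.
        return self.path(name).open(mode)
```

Other components would then ask for a resource by name, for instance `manager.open("toy_corpus")`, without knowing where the folder is located.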

Create interface for creating/reviewing episodes

An Agent (more precisely, its Brain module) should record all the episodes it was involved in. Some of these episodes may be successful and valuable for learning, while others may contain non-ideal answers on the machine side (and, occasionally, on the human side as well).

It will be useful to have a user-friendly interface to filter, amend or just visualize episodes in an Agent's memory. Such an interface should also support the creation of new episodes, and cover the basic I/O operations on Episode files.

Create Dockerfile

AS a user who wants to try Due
IN ORDER TO get Due running as quickly as possible
I WANT TO run Due as a Docker container

Rationale
There is currently some ambiguity on whether Due should be imported as a library or run as an application. We want Due to expose its packages and classes to external applications, but we also want to provide a stand-alone application that loads an agent and serves it on a given channel (e.g. XMPP). Docker seems to be the most user-friendly option to do this.

Technical details
There are some issues to solve when it comes to making a batteries-included Docker image for Due. Ideally, Due's container should be able to:

  • Be configurable with respect to the channel on which the agent is exposed (XMPP, REST, ...)
    The start script could read an env variable to configure the channel
  • Load an arbitrary agent
    Docker Compose could mount a folder by default, in which to put optional agent files
  • Download resources into the container
    See above
  • Load Action classes that are provided by external packages
    We could initially support only default actions. Possibly this is a long-term solution, if we decide that a single RESTAction type is the only interface between Due and the world
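The first point of the list above, reading the channel from an environment variable in the start script, could be sketched as follows. The variable name `DUE_CHANNEL`, the set of channel names and the `cli` default are assumptions for illustration.

```python
import os

# Hypothetical channel names; XMPP and REST are mentioned in the list above.
SUPPORTED_CHANNELS = ("cli", "xmpp", "rest")


def select_channel(environ=None, default="cli"):
    """Return the channel configured through the DUE_CHANNEL variable."""
    environ = os.environ if environ is None else environ
    name = environ.get("DUE_CHANNEL", default).strip().lower()
    if name not in SUPPORTED_CHANNELS:
        raise ValueError("Unsupported channel: %r" % name)
    return name
```

The container would then be started with something like `docker run -e DUE_CHANNEL=xmpp ...`, and the start script would pick the serving module based on the returned name.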

Separate core framework from NLU/NLG modules

AS a developer
IN ORDER TO avoid installing dependencies that are not necessary
I WANT TO install the Brain modules I need separately from Due's core framework

Rationale
Due is made to integrate a collection of ready-made NLU/NLG modules that we call "Brains"; a Brain can learn from Episodes, and can predict the agent's answer in a conversation. Brains may be implemented with different technologies (PyTorch, TensorFlow, pure Python, ...), and including a model library in the core Due package would mean carrying the burden of many heavy dependencies in a single core package. This would penalize users who only want to try one of them out, as well as developers who want to develop new ones. Because of this, we want to include only a couple of example Brains in the core package, and move the more sophisticated ones to external packages.

Implement Episode.add_event() with an asynchronous queue

AS a user
IN ORDER TO avoid recursion and receive Events fairly
I WANT TO use an asynchronous queue to handle Events from Agents

Rationale
Currently, each time an event is received by an Episode with Episode.add_event(), Agent callbacks are triggered to produce responses. The Agents receiving the callbacks produce new Events and add them to the Episode by calling Episode.add_event() in turn. As this is a synchronous method, the process becomes recursive. This has two effects:

  1. Two bots talking together produce a stack of recursive calls when generating replies, and there's no protection against stack overflow
  2. When more than two bots are talking together, only the first two will be engaged in the conversation

As a solution, we propose to implement Episode.add_event() as a simple method that enqueues the event. The queue is consumed in parallel, so that event handling is more controlled and fair.
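The proposal can be sketched with a `queue.Queue` drained by a worker thread. This is an illustration under several assumptions: events are plain dictionaries, agents are identified by name, and the `max_events` cap exists only to make the example terminate. None of these are Due's actual design.

```python
import queue
import threading


class Episode:
    """Hypothetical Episode whose add_event() only enqueues."""

    def __init__(self, agents, max_events=10):
        self.agents = agents
        self.events = []
        self.max_events = max_events
        self._queue = queue.Queue()
        threading.Thread(target=self._consume, daemon=True).start()

    def add_event(self, event):
        # No callback runs on the caller's stack, so there is no recursion.
        self._queue.put(event)

    def _consume(self):
        while True:
            event = self._queue.get()
            if len(self.events) < self.max_events:
                self.events.append(event)
                # Every other agent is notified, not just the first two.
                for agent in self.agents:
                    if agent.name != event["agent"]:
                        agent.event_callback(event, self)
            self._queue.task_done()

    def wait(self):
        # Block until every enqueued event has been handled.
        self._queue.join()


class EchoAgent:
    """Hypothetical bot that replies to every Event it receives."""

    def __init__(self, name):
        self.name = name

    def event_callback(self, event, episode):
        episode.add_event({"agent": self.name, "text": "re: " + event["text"]})
```

With three `EchoAgent` instances, a single initial Event leads to replies from all of them, and the call stack stays flat because each reply is handled in a later iteration of the consumer loop.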
