nlg-gan

This is a project I started for fun after reading about GANs and wondering whether they could be applied to natural language processing. It was primarily a learning experience that helped me get familiar with TensorFlow and deep learning in general. It did not produce practical results, and although I documented some parts of it extensively, it is preserved here mainly for my own reference.

The primary challenge of applying GANs to NLP is that language is a discrete space (each word is a distinct point), while GANs need a continuous output space so that gradients can propagate from the discriminator back into the generator. My solution is essentially to use word vectors as a continuous input/output space. The outputs of the generator do not necessarily fall exactly on existing words; they are better interpreted as "meanings" in the word vector space. To get actual text back from the generator for humans to read, I perform a nearest-neighbor search over the dictionary of word vectors.
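The nearest-neighbor decoding step can be sketched in a few lines of NumPy. This is a minimal illustration, not the project's actual code: the function name `nearest_words`, the toy vocabulary, and the choice of cosine similarity (rather than Euclidean distance) are all assumptions.

```python
import numpy as np

def nearest_words(outputs, vocab, embeddings):
    """Map each generator output vector to the closest vocabulary word.

    outputs:    (T, d) array, one continuous vector per time step
    vocab:      list of V words
    embeddings: (V, d) array, row i is the vector for vocab[i]
    """
    # Assumption: cosine similarity; Euclidean distance would also be reasonable.
    # Assumes no all-zero vectors, which would cause a division by zero here.
    emb_unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    out_unit = outputs / np.linalg.norm(outputs, axis=1, keepdims=True)
    sims = out_unit @ emb_unit.T  # (T, V) similarity matrix
    return [vocab[i] for i in sims.argmax(axis=1)]

# Toy 2-dimensional "word vectors" (hypothetical values for illustration only).
vocab = ["cat", "dog", "car"]
embeddings = np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]])

# Two generator outputs that do not fall exactly on any word.
outputs = np.array([[0.9, 0.1], [0.1, 0.9]])
decoded = nearest_words(outputs, vocab, embeddings)
print(decoded)  # ['cat', 'car']
```

Note that the `argmax` is not differentiable, which is why this lookup is only used to render text for human readers; the discriminator would see the continuous vectors directly.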

For the purposes of this project, I used pre-trained word vectors from GloVe, which are located here (not included in repo).
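For reference, GloVe distributes its vectors as plain text, one token per line followed by its space-separated components. A loader along the following lines would turn such a file into a vocabulary list and an embedding matrix. The function name `load_glove` and the in-memory sample are assumptions for illustration, and the sketch assumes tokens contain no spaces.

```python
import io
import numpy as np

def load_glove(lines):
    """Parse GloVe-style text lines into (vocab, embeddings).

    Each line is expected to look like: "<token> <v1> <v2> ... <vd>".
    """
    vocab, vectors = [], []
    for line in lines:
        parts = line.rstrip().split(" ")
        vocab.append(parts[0])
        vectors.append([float(x) for x in parts[1:]])
    return vocab, np.array(vectors, dtype=np.float32)

# Tiny in-memory stand-in for a real GloVe file (made-up values).
sample = io.StringIO("the 0.1 0.2 0.3\ncat -0.5 0.4 0.0\n")
vocab, embeddings = load_glove(sample)
print(vocab, embeddings.shape)
```

With a downloaded file, the same function could be called on an open file handle, for example `load_glove(open(path, encoding="utf-8"))`, where `path` is wherever the vectors were saved.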

For both the generator and the discriminator, I used a basic LSTM architecture with no peephole connections.
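To make "no peephole connections" concrete: in a basic LSTM the gates are computed only from the current input and the previous hidden state, and never look at the cell state directly. The NumPy sketch below shows one such step. It is an illustration of the standard cell, not the project's TensorFlow code; the weight layout, gate ordering, and names are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step without peephole connections.

    x: (d_in,)   h_prev, c_prev: (d_h,)
    W: (4 * d_h, d_in + d_h)   b: (4 * d_h,)
    """
    # Gates depend on x and h_prev only; with peepholes they would also see c_prev.
    z = W @ np.concatenate([x, h_prev]) + b
    i, f, o, g = np.split(z, 4)        # input, forget, output gates, candidate
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    g = np.tanh(g)
    c = f * c_prev + i * g             # new cell state
    h = o * np.tanh(c)                 # new hidden state
    return h, c

rng = np.random.default_rng(0)
d_in, d_h = 3, 4
W = rng.normal(scale=0.1, size=(4 * d_h, d_in + d_h))
b = np.zeros(4 * d_h)
h, c = lstm_step(rng.normal(size=d_in), np.zeros(d_h), np.zeros(d_h), W, b)
print(h.shape, c.shape)
```

In the generator, a hidden state like `h` would typically be projected to the word-vector dimension at each step; in the discriminator, the final hidden state would feed a real/fake score. Both of those wiring details are assumptions here.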

Training the network proved difficult, and the generator often suffered from mode collapse. Adding dropout and instance noise helped somewhat, but most of my improvement in avoiding collapse came from adjusting the training schedule.
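Instance noise generally means adding random noise to both the real and the generated samples before the discriminator sees them, often with a magnitude that shrinks over training. A minimal sketch under those assumptions follows; the linear annealing schedule, the starting standard deviation, and the tensor shapes are illustrative choices, not values taken from this project.

```python
import numpy as np

def noise_std(step, total_steps, start_std=0.1):
    """Linearly anneal the noise standard deviation from start_std down to 0."""
    return start_std * max(0.0, 1.0 - step / total_steps)

def add_instance_noise(batch, std, rng):
    """Add zero-mean Gaussian noise to a batch of word-vector sequences."""
    return batch + rng.normal(scale=std, size=batch.shape)

rng = np.random.default_rng(0)
real = np.zeros((2, 5, 50))  # (batch, time steps, embedding dim), placeholder data
fake = np.ones((2, 5, 50))   # stand-in for generator outputs

std = noise_std(step=250, total_steps=1000)
noisy_real = add_instance_noise(real, std, rng)  # fed to the discriminator as "real"
noisy_fake = add_instance_noise(fake, std, rng)  # fed to the discriminator as "fake"
print(round(std, 3))
```

The idea is that blurring both distributions makes them overlap, so the discriminator cannot separate them trivially and the generator keeps receiving useful gradients.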

Towards the end, the network began to produce sentences that each seemed to stay on one subject and were sometimes somewhat grammatical. However, they were nowhere near convincing, and they did not compare to other, simpler NLG techniques.
