This project is forked from christian-doucette/tolkein_text


Neural network language model that generates text based on The Lord of the Rings. Built with PyTorch.

Home Page: https://share.streamlit.io/christian-doucette/tolkein_text



Tolkein Text

Tolkein Text is live here: https://share.streamlit.io/christian-doucette/tolkein_text

I trained an LSTM neural network language model on The Lord of the Rings, and used it for text generation.  
 

Motivation

The motivation of this project was to gain experience with:

  • Breaking down and completing an NLP task
  • Implementing machine learning in PyTorch
  • Modern architectures for neural network language models
  • Preprocessing and tokenizing raw text data
  • Applications of language models
     
     

Some Examples

"Arrows fell from the sky like lightning hurrying down."
"At that moment Faramir came in and gazed suddenly into the sweet darkness."
"Ever the great vale ran down into the darkness. Darker the leaves rolled suddenly through the summer mist."
"And the moon was in the dark, and they lay all, grey and fading, terrible and fair."

These sentences definitely capture the feel of Tolkien's writing, yet all of them are original. I especially love that the model creates its own simile, "Arrows fell from the sky like lightning hurrying down," which does not appear anywhere in the source text.
 
 

Preprocessing

The preprocessing stage can be broken down into these steps:

  1. Load the full Lord of the Rings text
  2. Remove capitalization and some punctuation with regex
  3. Build a vocabulary that assigns each word a unique id, stored as word_to_id (a dictionary) and id_to_word (a direct-access table)
  4. Map each word in the text to its id
  5. For each word_id in the text, add it to the dataset as a label, with the n word_ids before it as its feature (currently n=9); see the sketch below
  6. The resulting list of (feature, label) pairs is the training data
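Here is a minimal sketch of steps 3-6, assuming the cleaned text is already a list of word strings; the function and variable names are illustrative, not the repo's actual code:

    from collections import Counter

    def build_vocab(words):
        # Most frequent words get the lowest ids.
        counts = Counter(words)
        id_to_word = [w for w, _ in counts.most_common()]       # direct-access table
        word_to_id = {w: i for i, w in enumerate(id_to_word)}   # dictionary
        return word_to_id, id_to_word

    def make_pairs(words, word_to_id, n=9):
        ids = [word_to_id[w] for w in words]
        # Each label is a word id; its feature is the n ids immediately before it.
        return [(ids[i - n:i], ids[i]) for i in range(n, len(ids))]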
     
     

Network Architecture

The network has the following architecture: Input -> Embedding layer -> LSTM layers -> Dropout layer -> Fully Connected Linear Layer -> Output

Input is the n preceding word_ids mentioned above.
Output is a vector of vocab_size scores, where a higher score means the corresponding word is more likely to occur next.
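A minimal PyTorch sketch of this architecture; the layer dimensions, number of LSTM layers, and dropout rate are placeholder values, not necessarily the ones used in this repo:

    import torch.nn as nn

    class TolkeinModel(nn.Module):
        def __init__(self, vocab_size, embed_dim=128, hidden_dim=256,
                     num_layers=2, dropout=0.3):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden_dim,
                                num_layers=num_layers, batch_first=True)
            self.dropout = nn.Dropout(dropout)
            self.fc = nn.Linear(hidden_dim, vocab_size)

        def forward(self, x):                    # x: (batch, n) word ids
            emb = self.embedding(x)              # (batch, n, embed_dim)
            out, _ = self.lstm(emb)              # (batch, n, hidden_dim)
            last = self.dropout(out[:, -1, :])   # hidden state after the last word
            return self.fc(last)                 # (batch, vocab_size) scores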
 
 

Network Training

Using the network architecture described above and cross entropy loss, I find the weights that minimize loss on the training data. These weights are computed with gradient descent, using gradients from PyTorch's Autograd package.
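A sketch of such a training loop, consuming the (feature, label) pairs from preprocessing; the optimizer choice, learning rate, batch size, and epoch count are assumptions, not the repo's actual settings:

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset

    def train(model, pairs, epochs=10, lr=1e-3, batch_size=64):
        features = torch.tensor([f for f, _ in pairs])   # (N, n) context ids
        labels = torch.tensor([l for _, l in pairs])     # (N,) next-word ids
        loader = DataLoader(TensorDataset(features, labels),
                            batch_size=batch_size, shuffle=True)

        criterion = nn.CrossEntropyLoss()
        optimizer = torch.optim.Adam(model.parameters(), lr=lr)

        model.train()
        for _ in range(epochs):
            for x, y in loader:
                optimizer.zero_grad()
                loss = criterion(model(x), y)   # cross entropy over vocab scores
                loss.backward()                 # gradients via Autograd
                optimizer.step()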
 
 

Text Generation

By applying the softmax function to the output of the network, I get a probability distribution over all words in the vocabulary. Then, I take a weighted random choice of these to decide the next word.

With the softmax function, I use a temperature parameter of 0.8. A temperature below the default of 1 penalizes lower probabilities slightly more, which increases the coherence of the generated text and improves its adherence to grammatical rules.
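A sketch of this sampling step; the function name is illustrative:

    import torch
    import torch.nn.functional as F

    def next_word_id(model, context_ids, temperature=0.8):
        # context_ids: the n preceding word ids
        model.eval()
        with torch.no_grad():
            logits = model(torch.tensor([context_ids]))        # (1, vocab_size)
        probs = F.softmax(logits / temperature, dim=-1)        # temperature-scaled softmax
        return torch.multinomial(probs, num_samples=1).item()  # weighted random choice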
 

Streamlit

After training the model, I set it up with a simple web interface using Streamlit. I wanted to allow other people to try generating text with the model and running custom input through it. I chose Streamlit since it allowed me to create and host the model's web interface very easily.
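A minimal sketch of such a Streamlit front end; the widget labels and the generate() helper (which would wrap the sampling loop above) are placeholders, not the app's actual code:

    import streamlit as st

    st.title("Tolkein Text")
    seed = st.text_input("Seed text", "the ring")
    if st.button("Generate"):
        st.write(generate(seed))   # generate(): hypothetical wrapper around the sampling loop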
