This is a hands-on NLTK tutorial that I completed in Jupyter notebooks. GitHub does not always render a Jupyter notebook (.ipynb) file in full when the file or its outputs are large, so please use the links below to view my code (rendered by nbviewer).
The code examples cover the following topics:
- Text Analysis Using nltk.text — Extracting interesting data from a given text. [View code]
- Deriving N-Grams from Text — Creating n-grams (for language classification). [View code]
- Detecting Text Language by Counting Stop Words — A simple way to find out what language a text is written in. [View code]
- Language Identifier Using Word Bigrams — State-of-the-art language classifier. [View code]
- Bigrams, Stemming and Lemmatizing — NLTK makes bigrams, stemming and lemmatization super easy. [View code]
- Finding Unusual Words in Given Language — Which words do not belong with the rest of the text? [View code]
- Creating a Part-of-Speech Tagger [View code]
- Part-of-Speech and Meaning — Exploring awesome features offered by WordNet. [View code]
- Eminem, Akon & NLP — I use NLTK to perform stemming, lemmatization, tokenization, and stop word removal on a dataset in which Akon explains how Eminem treats recording music like a nine-to-five job. [View code]
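
To give a feel for the stop word counting idea listed above, here is a minimal sketch. It is not the notebook's code: the three tiny stop word sets and the names `STOP_WORDS`, `stop_word_counts`, and `guess_language` are assumptions made up for this illustration, and it uses only the standard library so it runs without downloading any NLTK data. NLTK's stopwords corpus offers much fuller lists that could be swapped in.

```python
import re

# Hand-picked, deliberately tiny stop word sets (an assumption for this sketch).
# NLTK's stopwords corpus provides far more complete lists per language.
STOP_WORDS = {
    "english": {"the", "and", "is", "of", "to", "in", "that", "it", "a", "this"},
    "french": {"le", "la", "les", "et", "est", "de", "un", "une", "que", "dans"},
    "german": {"der", "die", "das", "und", "ist", "ein", "eine", "nicht", "zu", "mit"},
}


def stop_word_counts(text):
    """Count how many tokens of `text` appear in each language's stop word set."""
    # Lowercase the text and keep runs of letters only (no digits or underscores).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return {
        language: sum(1 for token in tokens if token in words)
        for language, words in STOP_WORDS.items()
    }


def guess_language(text):
    """Return the language with the most stop word hits, or None if there are none."""
    counts = stop_word_counts(text)
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


if __name__ == "__main__":
    print(guess_language("The cat is in the garden and it likes this place"))
    print(guess_language("Le chat est dans le jardin et il mange une pomme"))
```

The approach only needs a word list per language, which is what makes it simple, but short texts or texts with few function words can easily produce ties or no hits at all.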