
Stanford CS224n: NLP with Deep Learning

Lectures

YouTube Playlist

Schedule

See the schedule below for ordering, assignments, and additional links. For context on pacing, the class met every Tuesday and Thursday.

The table below is copied from here.

Each entry below lists, in order: Description, Course Materials, Events, and Deadlines.
Word Vectors
[slides] [notes]

Gensim word vectors example:
[code] [preview]
Suggested Readings:
  1. Efficient Estimation of Word Representations in Vector Space (original word2vec paper)
  2. Distributed Representations of Words and Phrases and their Compositionality (negative sampling paper)
Assignment 1 out
[code]
[preview]
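
To complement the gensim example linked above, here is a minimal sketch of loading pretrained word vectors and querying neighbors and analogies. It assumes gensim is installed; `glove-wiki-gigaword-100` is one of gensim's standard downloadable vector sets, and everything else is illustrative.

```python
# Minimal sketch: explore pretrained GloVe vectors with gensim.
import gensim.downloader as api

# Download/load 100-dimensional GloVe vectors (Wikipedia + Gigaword).
wv = api.load("glove-wiki-gigaword-100")

# Nearest neighbors by cosine similarity.
print(wv.most_similar("banana", topn=5))

# The classic analogy: king - man + woman ~ queen.
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
```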
Word Vectors 2 and Word Window Classification
[slides] [notes]
Suggested Readings:
  1. GloVe: Global Vectors for Word Representation (original GloVe paper)
  2. Improving Distributional Similarity with Lessons Learned from Word Embeddings
  3. Evaluation methods for unsupervised word embeddings
Additional Readings:
  1. A Latent Variable Model Approach to PMI-based Word Embeddings
  2. Linear Algebraic Structure of Word Senses, with Applications to Polysemy
  3. On the Dimensionality of Word Embedding
Python Review Session
[code] [preview]
10:00am - 11:20am
Backprop and Neural Networks
[slides] [notes]
Suggested Readings:
  1. matrix calculus notes
  2. Review of differential calculus
  3. CS231n notes on network architectures
  4. CS231n notes on backprop
  5. Derivatives, Backpropagation, and Vectorization
  6. Learning Representations by Backpropagating Errors (seminal Rumelhart et al. backpropagation paper)
Additional Readings:
  1. Yes you should understand backprop
  2. Natural Language Processing (Almost) from Scratch
Assignment 2 out
[code] [handout]
Assignment 1 due
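
As a companion to the matrix-calculus and backprop readings above, a small numpy sketch of backprop through a one-hidden-layer network, checked against a finite difference. All names, shapes, and values are arbitrary placeholders.

```python
# Sketch: forward and backward pass for h = tanh(W1 x), yhat = W2 h,
# loss = 1/2 ||yhat - y||^2, with a finite-difference sanity check.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4,))        # input
y = rng.normal(size=(3,))        # target
W1 = rng.normal(size=(5, 4))     # hidden-layer weights
W2 = rng.normal(size=(3, 5))     # output-layer weights

# Forward pass.
z = W1 @ x
h = np.tanh(z)
yhat = W2 @ h
loss = 0.5 * np.sum((yhat - y) ** 2)

# Backward pass via the chain rule.
d_yhat = yhat - y                  # dL/dyhat
dW2 = np.outer(d_yhat, h)          # dL/dW2
d_h = W2.T @ d_yhat                # dL/dh
d_z = d_h * (1 - h ** 2)           # dL/dz, using tanh'(z) = 1 - tanh(z)^2
dW1 = np.outer(d_z, x)             # dL/dW1

# Finite-difference check on one entry of W1; the two numbers should agree.
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
loss_p = 0.5 * np.sum((W2 @ np.tanh(W1p @ x) - y) ** 2)
print(dW1[0, 0], (loss_p - loss) / eps)
```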
Dependency Parsing
[slides] [notes]
[slides (annotated)]
Suggested Readings:
  1. Incrementality in Deterministic Dependency Parsing
  2. A Fast and Accurate Dependency Parser using Neural Networks
  3. Dependency Parsing
  4. Globally Normalized Transition-Based Neural Networks
  5. Universal Stanford Dependencies: A cross-linguistic typology
  6. Universal Dependencies website
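
To make the transition-based approach in the Chen & Manning reading concrete, here is a toy sketch of the arc-standard transition system. The gold action sequence is hard-coded for illustration; a real parser predicts each action with a neural classifier over stack/buffer features.

```python
# Toy arc-standard parse of "He ate fish" (token 0 is ROOT).
sentence = ["He", "ate", "fish"]      # word indices 1..3
stack, buffer, arcs = [0], [1, 2, 3], []

def shift():      # move the next buffer word onto the stack
    stack.append(buffer.pop(0))

def left_arc():   # second-from-top becomes a dependent of the top
    dep = stack.pop(-2)
    arcs.append((stack[-1], dep))

def right_arc():  # top becomes a dependent of second-from-top
    dep = stack.pop()
    arcs.append((stack[-1], dep))

# Gold sequence yielding ate -> He, ate -> fish, ROOT -> ate.
for action in [shift, shift, left_arc, shift, right_arc, right_arc]:
    action()

print(arcs)  # [(2, 1), (2, 3), (0, 2)]
```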
PyTorch Tutorial Session
[colab notebook] [preview]
[jupyter notebook]
10:00am - 11:20am
Recurrent Neural Networks and Language Models
[slides] [notes (lectures 5 and 6)]
Suggested Readings:
  1. N-gram Language Models (textbook chapter)
  2. The Unreasonable Effectiveness of Recurrent Neural Networks (blog post overview)
  3. Sequence Modeling: Recurrent and Recursive Neural Nets (Sections 10.1 and 10.2)
  4. On Chomsky and the Two Cultures of Statistical Learning
Assignment 3 out
[code] [handout]
Assignment 2 due
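
A minimal PyTorch sketch of the recurrent language model described in the lecture above: embed tokens, run an RNN, and predict the next token at each position. Vocabulary size, dimensions, and the random batch are placeholders.

```python
import torch
import torch.nn as nn

class RNNLM(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):               # tokens: (batch, seq_len)
        h, _ = self.rnn(self.embed(tokens))  # (batch, seq_len, hidden)
        return self.out(h)                   # next-token logits

model = RNNLM()
tokens = torch.randint(0, 1000, (2, 10))     # fake batch of token ids
logits = model(tokens[:, :-1])               # predict token t+1 from prefix
loss = nn.functional.cross_entropy(
    logits.reshape(-1, 1000), tokens[:, 1:].reshape(-1))
print(loss.item())
```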
Vanishing Gradients, Fancy RNNs, Seq2Seq
[slides] [notes (lectures 5 and 6)]
Suggested Readings:
  1. Sequence Modeling: Recurrent and Recursive Neural Nets (Sections 10.3, 10.5, 10.7-10.12)
  2. Learning long-term dependencies with gradient descent is difficult (one of the original vanishing gradient papers)
  3. On the difficulty of training Recurrent Neural Networks (proof of vanishing gradient problem)
  4. Vanishing Gradients Jupyter Notebook (demo for feedforward networks)
  5. Understanding LSTM Networks (blog post overview)
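
A quick numerical illustration of the effect the vanishing-gradient readings above analyze: backpropagating through many repeated tanh updates shrinks the gradient roughly geometrically. Sizes and the 0.5 weight scale are arbitrary choices that make the effect visible.

```python
import torch

x = torch.randn(10, requires_grad=True)
W = torch.randn(10, 10) * 0.5    # smallish weights exaggerate the decay

h = x
for _ in range(50):              # 50 "time steps" of h = tanh(W h)
    h = torch.tanh(W @ h)

h.sum().backward()
print(x.grad.norm())             # typically vanishingly small
```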
Machine Translation, Attention, Subword Models
[slides] [notes]
Suggested Readings:
  1. Statistical Machine Translation slides, CS224n 2015 (lectures 2/3/4)
  2. Statistical Machine Translation (book by Philipp Koehn)
  3. BLEU (original paper)
  4. Sequence to Sequence Learning with Neural Networks (original seq2seq NMT paper)
  5. Sequence Transduction with Recurrent Neural Networks (early seq2seq speech recognition paper)
  6. Neural Machine Translation by Jointly Learning to Align and Translate (original seq2seq+attention paper)
  7. Attention and Augmented Recurrent Neural Networks (blog post overview)
  8. Massive Exploration of Neural Machine Translation Architectures (practical advice for hyperparameter choices)
  9. Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models
  10. Revisiting Character-Based Neural Machine Translation with Capacity and Compression
Assignment 4 out
[code] [handout] [Azure Guide] [Practical Guide to VMs]
Assignment 3 due
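
To accompany the attention readings above, a numpy sketch of dot-product attention with scaling: each query takes a softmax-weighted average of the values. (Bahdanau et al.'s original attention is additive; this multiplicative form is the variant later popularized by Transformers.)

```python
import numpy as np

def attention(Q, K, V):
    # Q: (n_q, d), K: (n_k, d), V: (n_k, d_v)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])       # query-key similarities
    scores -= scores.max(-1, keepdims=True)       # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(-1, keepdims=True)     # softmax over keys
    return weights @ V                            # weighted sum of values

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(3, 8)), rng.normal(size=(5, 8)),
           rng.normal(size=(5, 8)))
print(attention(Q, K, V).shape)  # (3, 8): one output per query
```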
Final Projects: Custom and Default; Practical Tips
[slides] [notes]
Suggested Readings:
  1. Practical Methodology (Deep Learning book chapter)
Project Proposal out
[instructions]

Default Final Project out
[handout (IID SQuAD track)]
[handout (Robust QA track)]
Transformers (lecture by John Hewitt)
[slides] [notes]
Suggested Readings:
  1. Project Handout (IID SQuAD track)
  2. Project Handout (Robust QA track)
  3. Attention Is All You Need
  4. The Illustrated Transformer
  5. Transformer (Google AI blog post)
  6. Layer Normalization
  7. Image Transformer
  8. Music Transformer: Generating music with long-term structure
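
As a small companion to "Attention Is All You Need", a sketch of its sinusoidal positional encodings: even dimensions get sines, odd dimensions cosines, at geometrically spaced frequencies. Dimensions here are arbitrary.

```python
import numpy as np

def positional_encoding(max_len, d_model):
    pos = np.arange(max_len)[:, None]            # (max_len, 1)
    i = np.arange(d_model // 2)[None, :]         # (1, d_model/2)
    angles = pos / np.power(10000, 2 * i / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)                 # even dims
    pe[:, 1::2] = np.cos(angles)                 # odd dims
    return pe

print(positional_encoding(50, 16).shape)  # (50, 16)
```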
More about Transformers and Pretraining (lecture by John Hewitt)
[slides] [notes]
Suggested Readings:
  1. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  2. Contextual Word Representations: A Contextual Introduction
  3. The Illustrated BERT, ELMo, and co.
Assignment 5 out
[code] [handout]
Assignment 4 due
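
To make the BERT pretraining objective from the readings above concrete, a sketch of masked-token selection using the paper's 80/10/10 scheme: of the ~15% of positions chosen as targets, 80% become [MASK], 10% a random token, and 10% stay unchanged. The mask id and vocabulary size are placeholders.

```python
import torch

MASK_ID, VOCAB_SIZE = 103, 30000   # placeholder ids

def mask_tokens(tokens, mask_prob=0.15):
    tokens, targets = tokens.clone(), tokens.clone()
    chosen = torch.rand(tokens.shape) < mask_prob
    targets[~chosen] = -100                       # ignored by cross-entropy
    roll = torch.rand(tokens.shape)
    tokens[chosen & (roll < 0.8)] = MASK_ID       # 80%: [MASK]
    rand = chosen & (roll >= 0.8) & (roll < 0.9)  # 10%: random token
    tokens[rand] = torch.randint(0, VOCAB_SIZE, tokens.shape)[rand]
    return tokens, targets                        # final 10%: unchanged

tokens = torch.randint(0, VOCAB_SIZE, (2, 12))
masked, targets = mask_tokens(tokens)
```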
Question Answering (guest lecture by Danqi Chen)
[slides]
Suggested Readings:
  1. SQuAD: 100,000+ Questions for Machine Comprehension of Text
  2. Bidirectional Attention Flow for Machine Comprehension
  3. Reading Wikipedia to Answer Open-Domain Questions
  4. Latent Retrieval for Weakly Supervised Open Domain Question Answering
  5. Dense Passage Retrieval for Open-Domain Question Answering
  6. Learning Dense Representations of Phrases at Scale
Project Proposal due
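
A sketch of the extractive span-prediction setup behind the SQuAD readings above: a linear head scores every token position as an answer start or end, and the prediction is the highest-scoring pair. The encoder output is faked with random numbers, and the greedy argmax ignores the start <= end constraint real systems enforce.

```python
import torch
import torch.nn as nn

hidden = torch.randn(1, 20, 256)       # (batch, seq_len, d) from some encoder
span_head = nn.Linear(256, 2)          # one score each for start and end

start_logits, end_logits = span_head(hidden).split(1, dim=-1)
start = start_logits.squeeze(-1).argmax(-1)
end = end_logits.squeeze(-1).argmax(-1)
print(start.item(), end.item())        # predicted answer span indices
```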
Natural Language Generation (lecture by Antoine Bosselut)
[slides]
Suggested Readings:
  1. The Curious Case of Neural Text Degeneration
  2. Get To The Point: Summarization with Pointer-Generator Networks
  3. Hierarchical Neural Story Generation
  4. How NOT To Evaluate Your Dialogue System
Project Milestone out
[instructions]
Assignment 5 due
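
To illustrate the decoding idea in "The Curious Case of Neural Text Degeneration" above, a minimal sketch of nucleus (top-p) sampling: sample only from the smallest set of tokens whose cumulative probability exceeds p. Stock PyTorch only; the logits are fake.

```python
import torch

def nucleus_sample(logits, p=0.9):
    probs = torch.softmax(logits, dim=-1)
    sorted_probs, sorted_ids = probs.sort(descending=True)
    cumulative = sorted_probs.cumsum(-1)
    keep = cumulative - sorted_probs < p   # tokens before mass passes p
    kept = sorted_probs * keep
    kept /= kept.sum()                     # renormalize over the nucleus
    return sorted_ids[torch.multinomial(kept, 1)]

logits = torch.randn(1000)                 # fake next-token logits
print(nucleus_sample(logits).item())
```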
Reference in Language and Coreference Resolution
[slides]
Suggested Readings:
  1. Coreference Resolution chapter of Jurafsky and Martin
  2. End-to-end Neural Coreference Resolution
T5 and large language models: The good, the bad, and the ugly (guest lecture by Colin Raffel)
[slides]
Suggested Readings:
  1. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Integrating knowledge in language models (lecture by Megan Leszczynski)
[slides]
Suggested Readings:
  1. ERNIE: Enhanced Language Representation with Informative Entities
  2. Barack’s Wife Hillary: Using Knowledge Graphs for Fact-Aware Language Modeling
  3. Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model
  4. Language Models as Knowledge Bases?
Project Milestone due
Social & Ethical Considerations in NLP Systems (guest lecture by Yulia Tsvetkov)
[slides]
Model Analysis and Explanation (lecture by John Hewitt)
[slides]
Future of NLP + Deep Learning (lecture by Shikhar Murty)
[slides]
Project Summary Image and Paragraph out [instructions]
Ask Me Anything / Final Project Assistance
Project due [instructions]
Final Project Emergency Assistance
Project Summary Image and Paragraph due
