Course Code: AI 829
Course Name: Natural Language Processing
Course Instructor: Professor Srinath Srinivasa
Course Pre-Requisites: Mathematics for Machine Learning, Discrete Mathematics, Data Structures and Algorithms
This repository contains all the materials, resources, tutorials, etc. delivered during the NLP course at the International Institute of Information Technology (IIIT) Bangalore in 2024.
- History of language and linguistics
- Language paradigms
- Language and thought
- Mould and cloak hypotheses
- Linguistic determinism
- Linguistic relativism
- NLP fundamentals
- History of NLP
- NLP and Symbolic Logic
- Statistical NLP
- Neural NLP
- Architecture of LLMs
- Foundation Models and Transfer Learning
- Fine-tuning LLMs
- Retrieval Augmented Generation
- Distributional Semantics
- Relevance models
- Regular expressions
- Stems, lemmas and morphological forms
- Keyphrase extraction
- Phrase identification models (CAP, PMI, N-grams)
- Spelling variants and spelling mistake corrections
- Phonetic hashing
- Semantic hashing and word embeddings
- Tutorials on lexical processing
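As a taste of the lexical-processing topics above, the sketch below scores a candidate phrase with pointwise mutual information (PMI), one of the phrase-identification measures listed. The toy corpus and the resulting numbers are illustrative only.

```python
import math
from collections import Counter

# Tiny whitespace-tokenized toy corpus (illustrative, not a real dataset)
corpus = ("natural language processing enables machines to process "
          "natural language . statistical natural language models "
          "learn from text .").split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
N = len(corpus)

def pmi(w1, w2):
    # PMI(w1, w2) = log2( P(w1, w2) / (P(w1) * P(w2)) )
    p_joint = bigrams[(w1, w2)] / (N - 1)
    p1, p2 = unigrams[w1] / N, unigrams[w2] / N
    return math.log2(p_joint / (p1 * p2))

print(round(pmi("natural", "language"), 2))  # high PMI: a likely phrase
```

A high PMI indicates that the two words co-occur far more often than chance, which is why "natural language" surfaces as a phrase while arbitrary adjacent pairs do not.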
- Shallow parsing and POS tagging
- HMMs and Viterbi heuristic
- Introduction to CFGs and Parsing
- Ambiguity, left recursion and probabilistic parsing
- Long range dependencies and coreference resolution
- Free word-order languages
- Dependency parsing
- Tutorials on syntactic processing
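To illustrate the HMM and Viterbi material above, here is a minimal Viterbi decoder for POS tagging. All probabilities are hand-set toy values for a three-tag universe, not estimated from a corpus.

```python
# Toy HMM POS tagger: Viterbi decoding over hand-set probabilities.
states = ["DET", "NOUN", "VERB"]
start = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
trans = {
    "DET":  {"DET": 0.05, "NOUN": 0.9,  "VERB": 0.05},
    "NOUN": {"DET": 0.1,  "NOUN": 0.3,  "VERB": 0.6},
    "VERB": {"DET": 0.5,  "NOUN": 0.4,  "VERB": 0.1},
}
emit = {
    "DET":  {"the": 0.9, "dog": 0.0, "barks": 0.0},
    "NOUN": {"the": 0.0, "dog": 0.8, "barks": 0.2},
    "VERB": {"the": 0.0, "dog": 0.1, "barks": 0.9},
}

def viterbi(words):
    # v[t][s] = probability of the best tag path ending in state s at time t
    v = [{s: start[s] * emit[s][words[0]] for s in states}]
    back = []
    for w in words[1:]:
        col, ptr = {}, {}
        for s in states:
            prev, p = max(((r, v[-1][r] * trans[r][s]) for r in states),
                          key=lambda x: x[1])
            col[s], ptr[s] = p * emit[s][w], prev
        v.append(col)
        back.append(ptr)
    # Follow back-pointers from the best final state
    best = max(states, key=lambda s: v[-1][s])
    path = [best]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

print(viterbi(["the", "dog", "barks"]))  # ['DET', 'NOUN', 'VERB']
```

Dynamic programming over the trellis keeps decoding linear in sentence length, instead of enumerating all tag sequences.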
- Conceptual modeling fundamentals
- Word sense disambiguation
- Named entity recognition
- Spectral models for latent semantics (LSA, PLSA, PCA, word and document embeddings)
- Topic modeling
- Masked Language Model (MLM)
- Discourse and conversation modeling
- Tutorials on semantic processing
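The spectral models listed above (LSA and friends) can be sketched with a truncated SVD of a term-document count matrix. The five terms and four documents below are a made-up toy example; the point is only that related terms end up close in the latent space.

```python
import numpy as np

# Toy term-document count matrix (rows = terms, cols = documents);
# counts are illustrative.
terms = ["ship", "boat", "ocean", "tree", "wood"]
X = np.array([
    [2, 0, 1, 0],   # ship
    [1, 0, 1, 0],   # boat
    [1, 1, 1, 0],   # ocean
    [0, 1, 0, 2],   # tree
    [0, 1, 0, 1],   # wood
], dtype=float)

# LSA: keep only the top-k singular directions of the SVD
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
term_vecs = U[:, :k] * s[:k]   # term vectors in the k-dim latent space

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Related terms sit closer in the latent space than unrelated ones
print(cos(term_vecs[0], term_vecs[1]))  # ship vs boat (similar)
print(cos(term_vecs[0], term_vecs[3]))  # ship vs tree (dissimilar)
```

The same decomposition underlies document embeddings: columns of `Vt` give document coordinates in the latent space, and cosine similarity there captures topical relatedness even when documents share no surface terms.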
- Christopher Manning and Hinrich Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999.
- Dan Jurafsky and James H. Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition, Pearson 2014.
- Steven Bird, Ewan Klein, and Edward Loper, Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit, O’Reilly.
- PyTorch resources
- NLTK
- Stanford CoreNLP
- Apache OpenNLP
- SpaCy
- AllenNLP
- Gensim
- TextBlob
- NLP Architect
- NLLB from Meta
- AI4Bharat IndicNLP
- IndicNLP python library
- iNLTK (Indic NLTK library)
- Indic NLP Library
- BhashaIndia
- Bhashini
- IndicNLP Resources from School of Sanskrit and Indic Studies at JNU
- Linguistic Data Consortium for Indic Languages, CIIL, Mysore
The set of rubrics for grading a mandate contribution includes the following:
- Relevance: The contribution should be relevant to the current mandate and should contribute to the overall collective knowledge of the class pertaining to this mandate.
- Originality: Mandate contributions should be original knowledge-creation exercises. Plagiarism is strictly forbidden; contributions with plagiarised content automatically receive an F grade.
- Specificity: Mandate contributions that address a specific problem, or make specific points with the required rigour, are graded higher than contributions that make very general “newspaper-style” statements.
- Synthesis: Mandate contributions that synthesize knowledge from multiple sources and bring out the contributor’s own constructed knowledge are rated higher than contributions that simply report on an existing paper or result.
- Impact: Mandate contributions are also rated for their impact on the rest of the class, based on the quality of responses they generate from other members of the class.