📚 Weekly Study Notes
Week 1 · Jan 12–16
Intro & Text Preprocessing
Modules 1 & 2 — Foundational NLP
- Course overview & NLP problem types
- Tokenization, stemming, lemmatization
- Stop words, N-grams, noise removal
- One-Hot Encoding, Bag of Words (BoW)
- TF-IDF importance weighting
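A minimal pure-Python sketch of the TF-IDF weighting above, assuming a made-up three-document corpus, plain whitespace tokenization, and the raw-count TF with log(N/df) IDF variant (the course may use a smoothed or length-normalized variant):

```python
import math
from collections import Counter

# Toy 3-document corpus; whitespace tokenization is a simplification.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats are pets",
]
tokenized = [d.split() for d in docs]
N = len(tokenized)

# Document frequency: in how many documents does each term appear?
df = Counter(term for doc in tokenized for term in set(doc))

def tf_idf(doc):
    """Raw-count TF times log(N / df) IDF for every term in one document."""
    tf = Counter(doc)
    return {t: round(tf[t] * math.log(N / df[t]), 3) for t in tf}

for doc in tokenized:
    print(tf_idf(doc))
```

Rare terms like "sat" get the largest per-occurrence weight, while "the", which appears in most documents, is discounted.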
Week 2 · Jan 19–23
Naïve Bayes & Classification Evaluation
Module 3 — First Classifier
- Bayes’ Rule: Prior, Likelihood, Posterior
- Generative vs. Discriminative models
- Multinomial Naïve Bayes for NLP
- Confusion Matrix, Precision, Recall, F1 (sketch below)
- ROC Curve & AUC
Note: MLK Day holiday (Jan 20)
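A quick end-to-end sketch of the pipeline above, assuming scikit-learn is available (the course may instead derive Naïve Bayes by hand); the four-message spam/ham corpus and its labels are invented for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score

train_texts = ["free prize now", "meeting at noon", "win cash prize", "lunch tomorrow"]
train_labels = [1, 0, 1, 0]        # 1 = spam, 0 = ham (toy labels)

vec = CountVectorizer()            # Bag-of-Words counts (Week 1)
X = vec.fit_transform(train_texts)

clf = MultinomialNB()              # Multinomial Naive Bayes
clf.fit(X, train_labels)

test_texts = ["free cash now", "noon meeting"]
test_labels = [1, 0]
pred = clf.predict(vec.transform(test_texts))

print(confusion_matrix(test_labels, pred))
print(precision_score(test_labels, pred),
      recall_score(test_labels, pred),
      f1_score(test_labels, pred))
```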
Week 3 · Jan 26–30
Logistic Regression, SVM & Perceptron
Module 4 — Linear Classifiers
- Sigmoid function & soft classification
- Gradient descent & MLE (sketch below)
- Support Vector Machines & margin
- Kernel trick (non-linear SVM)
- Perceptron algorithm & convergence
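A short NumPy sketch of logistic regression trained by gradient descent on the negative log-likelihood, tying the sigmoid, MLE, and update-rule bullets together; the two Gaussian blobs, learning rate, and iteration count are arbitrary choices:

```python
import numpy as np

# Hypothetical 2-feature toy data: two Gaussian blobs with labels 0/1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(+1, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b, lr = np.zeros(2), 0.0, 0.1
for _ in range(500):
    p = sigmoid(X @ w + b)           # soft classification: P(y=1|x)
    grad_w = X.T @ (p - y) / len(y)  # gradient of the negative log-likelihood
    grad_b = np.mean(p - y)
    w -= lr * grad_w                 # gradient descent step
    b -= lr * grad_b

print("accuracy:", np.mean((sigmoid(X @ w + b) > 0.5) == y))
```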
Week 4 · Feb 2–6
SVD, Co-occurrence & GloVe
Module 5 — Word Embeddings I
- Why dimensionality reduction?
- Co-occurrence matrix construction
- Singular Value Decomposition (SVD)
- Dense word embeddings from SVD (sketch below)
- GloVe: global co-occurrence + semantics
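The sketch below walks the count-then-factorize pipeline from the first four bullets: build a co-occurrence matrix (window size 1 and the tiny corpus are arbitrary choices), run SVD, and keep the top-k directions as dense vectors. GloVe itself is different: it fits a weighted least-squares objective to log co-occurrence counts rather than factorizing with SVD.

```python
import numpy as np

corpus = [["i", "like", "nlp"], ["i", "like", "deep", "learning"], ["i", "enjoy", "flying"]]
vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Symmetric co-occurrence counts with a window of 1.
C = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in (i - 1, i + 1):
            if 0 <= j < len(sent):
                C[idx[w], idx[sent[j]]] += 1

# SVD: C = U Σ Vᵀ; keep the top-k singular directions as dense embeddings.
U, S, Vt = np.linalg.svd(C)
k = 2
embeddings = U[:, :k] * S[:k]      # k-dimensional word vectors
for w in vocab:
    print(w, embeddings[idx[w]].round(2))
```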
Week 5 · Feb 9–13
Neural Networks & Word2Vec
Module 6 — Deep Learning Intro
- Activation functions: ReLU, Sigmoid, Tanh
- Forward pass & Backpropagation
- Stochastic & Batch Gradient Descent
- Word2Vec: CBoW model
- Word2Vec: Skip-Gram model
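A sketch of Skip-Gram's data preparation only; the model itself (embedding matrices plus negative sampling or hierarchical softmax) is omitted, and the window size of 2 is arbitrary:

```python
# Skip-Gram predicts each context word from the center word.
def skipgram_pairs(tokens, window=2):
    """Yield (center, context) training pairs within ±window of each position."""
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                yield center, tokens[j]

sentence = "the quick brown fox jumps".split()
print(list(skipgram_pairs(sentence)))
# CBoW flips the direction: predict the center word from the averaged context.
```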
Week 6 · Feb 16–20
Convolutional & Recurrent Neural Networks
Module 7 — Sequence & Structure
- CNN: filters, convolution, feature maps
- Max pooling & parameter sharing
- CNN for text classification
- RNN: sequential memory & hidden state (sketch below)
- Vanishing gradient & exploding gradient
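A minimal NumPy sketch of the RNN recurrence; the toy dimensions and random weights are placeholders, but the loop shows how a single hidden-state vector carries memory across time steps:

```python
import numpy as np

# Hypothetical toy dimensions: 4-dim inputs, 3-dim hidden state.
rng = np.random.default_rng(1)
d_in, d_h = 4, 3
W_xh = rng.normal(0, 0.1, (d_h, d_in))
W_hh = rng.normal(0, 0.1, (d_h, d_h))
b_h = np.zeros(d_h)

def rnn_forward(xs):
    """h_t = tanh(W_xh x_t + W_hh h_{t-1} + b): the hidden state is the memory."""
    h = np.zeros(d_h)
    for x in xs:
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
    return h

sequence = rng.normal(size=(5, d_in))   # 5 time steps
print(rnn_forward(sequence))
# Repeated multiplication through W_hh is why gradients vanish or explode
# over long sequences.
```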
Week 7 · Feb 23–27
LSTM, GRU & Attention
Module 8 — Long-Term Memory & Focus
- LSTM: cell state, forget / input / output gates
- Cell state update: C_t = f_t⊙C_{t-1} + i_t⊙C̃_t (implemented below)
- GRU: 2-gate simplified LSTM
- Encoder-Decoder architecture
- Additive Attention (Bahdanau): α weights and the context vector C_t
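A NumPy sketch of a single LSTM step that implements the cell-state update above; stacking the f/i/o/candidate blocks into one weight matrix is just one common layout, and all weights here are random placeholders:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step; W, U, b hold the f/i/o/candidate blocks stacked."""
    d = h_prev.size
    z = W @ x + U @ h_prev + b        # all four pre-activations at once
    f = sigmoid(z[0*d:1*d])           # forget gate
    i = sigmoid(z[1*d:2*d])           # input gate
    o = sigmoid(z[2*d:3*d])           # output gate
    c_tilde = np.tanh(z[3*d:4*d])     # candidate cell state C̃_t
    c = f * c_prev + i * c_tilde      # C_t = f_t⊙C_{t-1} + i_t⊙C̃_t
    h = o * np.tanh(c)                # hidden state output
    return h, c

rng = np.random.default_rng(2)
d_in, d_h = 4, 3
W = rng.normal(0, 0.1, (4 * d_h, d_in))
U = rng.normal(0, 0.1, (4 * d_h, d_h))
b = np.zeros(4 * d_h)
h, c = np.zeros(d_h), np.zeros(d_h)
for x in rng.normal(size=(5, d_in)):
    h, c = lstm_step(x, h, c, W, U, b)
print(h, c)
```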
Week 9 · Mar 9–13
Transformers, BERT & GPT
Module 9 — The Architecture That Changed NLP
- Why Transformers? LSTM limitations & pre-training benefits
- Encoder vs. Decoder: BERT, GPT, T5/BART families
- Self-Attention with Q, K, V: Attention(Q,K,V) = softmax(QKᵀ/√d_k)·V (sketch below)
- Positional Encoding: sine (even) / cosine (odd) indices
- BERT: MLM + NSP pre-training; GPT: autoregressive generation
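A single-head NumPy sketch of the self-attention formula above, with no masking and no multi-head projections; the sequence length and d_k are arbitrary:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q,K,V) = softmax(QKᵀ/√d_k)·V, exactly the formula above."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                # weighted sum of value vectors

rng = np.random.default_rng(3)
seq_len, d_k = 4, 8
Q = rng.normal(size=(seq_len, d_k))
K = rng.normal(size=(seq_len, d_k))
V = rng.normal(size=(seq_len, d_k))
print(scaled_dot_product_attention(Q, K, V).shape)   # (4, 8)
```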
Week 10 · Mar 16–20
Sequence Labeling: POS Tagging & NER
Module 10 — Token-Level Classification
- Sequence labeling: one label per token (input = output length)
- POS Tagging: Penn Treebank tagset, lexical ambiguity, perceptron & BERT
- NER: 9 entity types, IOB/BIO tagging (B-PER, I-ORG, O); decoding sketch below
- Tooling: NLTK ne_chunk (binary=True vs. False for typed entities), a Keras Transformer tagger, and the dslim/bert-base-NER model
- Why encoder-only (BERT) is preferred for NER/POS over encoder-decoder
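A sketch of decoding BIO tags back into entity spans; bio_to_spans is a hypothetical helper, and silently dropping an I- tag with no matching open entity is just one possible convention:

```python
def bio_to_spans(tokens, tags):
    """Collapse BIO tags into (entity_type, text) spans."""
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):              # a new entity begins
            if current:
                spans.append(current)
            current = (tag[2:], [token])
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(token)          # continue the open entity
        else:                                 # "O" or an inconsistent I- tag
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(etype, " ".join(words)) for etype, words in spans]

tokens = ["Barack", "Obama", "visited", "New", "York"]
tags   = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC"]
print(bio_to_spans(tokens, tags))   # [('PER', 'Barack Obama'), ('LOC', 'New York')]
```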
Week 11 · Mar 23–27
🌴 Spring Break
No lectures · No quiz · Recharge!
No new content this week. Week 12 resumes with Unsupervised Models & Topic Modeling.
Week 12 · Mar 30–Apr 3 ← YOU ARE HERE
Topic Modeling: LSI & LDA
Module 11 — Unsupervised Learning
- Topic modeling: unsupervised extraction of hidden topics from documents
- LSI: SVD decomposition A = U × Σ × Vᵀ, yielding term-concept (U) and concept-document (Vᵀ) matrices
- LSI querying: map a query into concept space (standard fold-in: q̂ = Σ⁻¹Uᵀq; sketch below)
- LDA: Dirichlet distributions for doc-topic (α) and topic-word (β)
- LDA joint distribution: 4 probability terms, p(θ|α)·p(φ|β)·p(z|θ)·p(w|z,φ): two Dirichlet priors plus two Multinomials
- Gibbs sampling for approximate inference of LDA's topic assignments and parameters
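A NumPy sketch of LSI, assuming a made-up 5-term × 5-document count matrix and k = 2 concepts; it shows both the truncated SVD and the query fold-in from the bullets above:

```python
import numpy as np

# Toy term-document matrix A (terms x documents); the counts are invented.
terms = ["ship", "boat", "ocean", "vote", "election"]
A = np.array([
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

# A = U Σ Vᵀ; truncate to k = 2 latent concepts.
U, S, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
U_k, S_k, Vt_k = U[:, :k], S[:k], Vt[:k, :]

doc_vecs = Vt_k.T                            # document coordinates in concept space
q = np.array([1, 0, 1, 0, 0], dtype=float)   # query: "ship ocean"
q_hat = np.diag(1 / S_k) @ U_k.T @ q         # fold-in: q_hat = inv(Σ_k) U_kᵀ q

# Rank documents by cosine similarity to the folded-in query.
sims = doc_vecs @ q_hat / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_hat))
print(dict(zip(["d1", "d2", "d3", "d4", "d5"], sims.round(3))))
```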
📅 Quiz Schedule & Coverage
All 12 Quizzes — Week Coverage & Dates
| Quiz | Covers Week | Topic | Due | Status |
|---|---|---|---|---|
| Quiz 1 | Week 1 | Text Preprocessing & Representations | Jan 23 | ✓ Done |
| Quiz 2 | Week 2 | Naïve Bayes & Classification | Jan 30 | ✓ Done |
| Quiz 3 | Week 3 | Logistic Regression, SVM & Perceptron | Feb 6 | ✓ Done |
| Quiz 4 | Week 4 | SVD, Co-occurrence & GloVe | Feb 13 | ✓ Done |
| Quiz 5 | Week 5 | Neural Networks & Word2Vec | Feb 20 | ✓ Done |
| Quiz 6 | Week 6 | CNN & RNN | Feb 27 | ✓ Done |
| Quiz 7 | Week 7 | LSTM, GRU & Attention | Mar 6 | ✓ Done |
| Quiz 8 | Week 9 | Transformers, BERT & GPT | Mar 20 | ✓ Done |
| Quiz 9 | Week 10 | POS Tagging & NER | Apr 3 | ⚠ Due Apr 3! |
| Quiz 10 | Week 12 | Topic Modeling: LSI & LDA | Apr 10 | ▶ Upcoming |
| Quiz 11 | Week 13 | LLMs & Prompt Engineering | Apr 24 | — Future |
| Quiz 12 | Week 14 | AI Agents & Applications | May 1 | — Future |
🗺️ Course Learning Roadmap
From raw text → intelligent models
Raw Text → Preprocessing (W1) → Representations (W1) → Classifiers (W2–W3) → Embeddings (W4–W5) → Deep Models (W5–W6) → Transformers (W9) → Seq. Labeling (W10) → LLMs / Agents (W13–W14)
💡 How to Use These Notes
What → Why → How
Every concept is explained in three layers: what it is, why we need it, and how it works mechanically. Read all three before moving on.
Use the Interactive Demos
Each week has live visualizers and clickable demos. Change the inputs and watch outputs update — this is the fastest way to build intuition.
Quiz Yourself
Each week ends with multiple-choice questions modeled after actual Quizzes 1–7. Cover the answer, try it yourself, then reveal. Repeat weak spots.
Follow the Thread
Each week builds on the last. If something in Week 9 is confusing, check Week 7 (LSTMs). The roadmap above shows exactly how concepts connect.