📚 Weekly Study Notes
Week 1 · Jan 12–16
Intro & Text Preprocessing
Modules 1 & 2 — Foundational NLP
🔬 Beyond the Slides — Graduate Depth Content Inside
- Course overview & NLP problem types
- Tokenization, stemming, lemmatization
- Stop words, N-grams, noise removal
- One-Hot Encoding, Bag of Words (BoW)
- TF-IDF importance weighting
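To make the weighting concrete, here is a minimal pure-Python sketch of TF-IDF over an invented toy corpus (this uses the plain tf × log(N/df) variant; the exact smoothing in the slides may differ):

```python
import math
from collections import Counter

docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs".split(),
]

vocab = sorted({w for d in docs for w in d})
N = len(docs)

# document frequency: number of docs containing each term
df = {w: sum(w in d for d in docs) for w in vocab}

def tfidf(doc):
    counts = Counter(doc)
    # raw term frequency times inverse document frequency
    return {w: counts[w] * math.log(N / df[w]) for w in counts}

for d in docs:
    print(tfidf(d))
```

Note how words appearing in every document (like "the") get weight 0, while rare, discriminative words get boosted.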
Week 2 · Jan 19–23
Naïve Bayes & Classification Evaluation
Module 3 — First Classifier
📚 Course Content
- Bayes’ Rule: Prior, Likelihood, Posterior
- Generative vs. Discriminative models
- Multinomial Naïve Bayes for NLP
- Confusion Matrix, Precision, Recall, F1
- ROC Curve & AUC
Note: MLK Day holiday on Jan 20.
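A minimal sketch of the multinomial Naïve Bayes classifier covered above, with Laplace (add-one) smoothing, on an invented toy spam/ham corpus (precision, recall, and F1 would then be computed against held-out labels):

```python
import math
from collections import Counter, defaultdict

# toy labeled corpus
train = [
    ("buy cheap pills now", "spam"),
    ("meeting schedule attached", "ham"),
    ("cheap offer buy now", "spam"),
    ("project meeting notes", "ham"),
]

# per-class word counts and class priors
word_counts = defaultdict(Counter)
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {w for c in word_counts for w in word_counts[c]}
V = len(vocab)

def predict(text):
    scores = {}
    for c in class_counts:
        # log prior
        score = math.log(class_counts[c] / sum(class_counts.values()))
        total = sum(word_counts[c].values())
        for w in text.split():
            # add-one smoothed log likelihood
            score += math.log((word_counts[c][w] + 1) / (total + V))
        scores[c] = score
    return max(scores, key=scores.get)

print(predict("cheap meeting now"))
```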
Week 3 · Jan 26–30
Logistic Regression, SVM & Perceptron
Module 4 — Linear Classifiers
🔬 Beyond the Slides — Graduate Depth Content Inside
- Sigmoid function & soft classification
- Gradient descent & MLE
- Support Vector Machines & margin
- Kernel trick (non-linear SVM)
- Perceptron algorithm & convergence
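The perceptron update rule in a few lines of numpy: it updates only on mistakes, and converges in finitely many updates when the data are linearly separable (the dataset is invented for illustration):

```python
import numpy as np

# tiny linearly separable dataset, labels in {-1, +1}
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])

w = np.zeros(X.shape[1])
b = 0.0

for epoch in range(10):
    mistakes = 0
    for xi, yi in zip(X, y):
        if yi * (w @ xi + b) <= 0:   # misclassified (or on the boundary)
            w += yi * xi             # nudge the hyperplane toward xi
            b += yi
            mistakes += 1
    if mistakes == 0:                # converged: all points correct
        break

print(w, b)
```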
Week 4 · Feb 2–6
SVD, Co-occurrence & GloVe
Module 5 — Word Embeddings I
📚 Course Content
- Why dimensionality reduction?
- Co-occurrence matrix construction
- Singular Value Decomposition (SVD)
- Dense word embeddings from SVD
- GloVe: global co-occurrence + semantics
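A numpy sketch of the SVD route to dense embeddings: build a (toy, invented) co-occurrence matrix, truncate the SVD, and read off k-dimensional word vectors. GloVe instead fits log co-occurrence counts with a weighted least-squares objective, but the "global counts → dense vectors" idea is the same:

```python
import numpy as np

# toy symmetric term-term co-occurrence counts
vocab = ["king", "queen", "apple", "orange"]
C = np.array([
    [0, 8, 1, 0],
    [8, 0, 0, 1],
    [1, 0, 0, 9],
    [0, 1, 9, 0],
], dtype=float)

# truncated SVD: keep the top-k singular directions as embeddings
U, S, Vt = np.linalg.svd(C)
k = 2
embeddings = U[:, :k] * S[:k]   # each row is a k-dim word vector

for w, v in zip(vocab, embeddings):
    print(w, np.round(v, 2))
```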
Week 5 · Feb 9–13
Neural Networks & Word2Vec
Module 6 — Deep Learning Intro
🔬 Beyond the Slides — Graduate Depth Content Inside
- Activation functions: ReLU, Sigmoid, Tanh
- Forward pass & Backpropagation
- Stochastic & Batch Gradient Descent
- Word2Vec: CBoW model
- Word2Vec: Skip-Gram model
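A compact numpy sketch of one skip-gram with negative sampling (SGNS) pass. The window size, dimensions, and single uniform negative per pair are simplifications; real Word2Vec draws several negatives from a smoothed unigram distribution:

```python
import numpy as np

rng = np.random.default_rng(0)
tokens = "the quick brown fox jumps over the lazy dog".split()
vocab = {w: i for i, w in enumerate(dict.fromkeys(tokens))}
V, d, window, lr = len(vocab), 8, 2, 0.05

W_in = rng.normal(0, 0.1, (V, d))   # center-word embeddings
W_out = rng.normal(0, 0.1, (V, d))  # context-word embeddings

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for i, center in enumerate(tokens):
    c = vocab[center]
    for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
        if j == i:
            continue
        pos, neg = vocab[tokens[j]], rng.integers(0, V)  # one negative sample
        for target, label in ((pos, 1.0), (neg, 0.0)):
            # logistic loss gradient: sigma(u . v) - label
            g = sigmoid(W_in[c] @ W_out[target]) - label
            dW_out, dW_in = g * W_in[c], g * W_out[target]
            W_out[target] -= lr * dW_out
            W_in[c] -= lr * dW_in

print(W_in[vocab["fox"]][:4])  # a learned (toy) embedding
```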
Week 6 · Feb 16–20
Convolutional & Recurrent Neural Networks
Module 7 — Sequence & Structure
📚 Course Content
- CNN: filters, convolution, feature maps
- Max pooling & parameter sharing
- CNN for text classification
- RNN: sequential memory & hidden state
- Vanishing gradient & exploding gradient
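A numpy sketch of the CNN-for-text pipeline above: slide width-3 filters over token embeddings, apply ReLU, then max-pool over time so each filter contributes one feature (random toy embeddings stand in for real ones):

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, emb_dim, n_filters, width = 7, 4, 3, 3

X = rng.normal(size=(seq_len, emb_dim))            # one sentence: token embeddings
filters = rng.normal(size=(n_filters, width, emb_dim))

# convolution over time: each filter sees a window of `width` tokens
feature_maps = np.empty((n_filters, seq_len - width + 1))
for f in range(n_filters):
    for t in range(seq_len - width + 1):
        feature_maps[f, t] = np.maximum(0.0, np.sum(X[t:t+width] * filters[f]))

sentence_vector = feature_maps.max(axis=1)  # max pooling: one value per filter
print(sentence_vector)
```

The same filter weights are reused at every position, which is the parameter sharing that keeps CNNs small.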
Week 7 · Feb 23–Mar 1
LSTM, GRU & Attention
Module 8 — Long-Term Memory & Focus
🔬 Beyond the Slides — Graduate Depth Content Inside
- LSTM: cell state, forget / input / output gates
- Cell state update: C_t = f_t⊙C_{t-1} + i_t⊙C̃_t
- GRU: 2-gate simplified LSTM
- Encoder-Decoder architecture
- Additive Attention (Bahdanau): alignment weights α over encoder states, combined into a context vector c_t
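The gate equations as a single numpy step; the line computing C matches the cell-state update above. Weights are random toy values, and the [h; x] concatenation is one common parameterization:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid = 4, 5

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# one weight matrix per gate, acting on the concatenation [h_{t-1}; x_t]
W = {g: rng.normal(0, 0.1, (d_hid, d_hid + d_in)) for g in "fioc"}
b = {g: np.zeros(d_hid) for g in "fioc"}

def lstm_step(x_t, h_prev, C_prev):
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W["f"] @ z + b["f"])         # forget gate
    i = sigmoid(W["i"] @ z + b["i"])         # input gate
    o = sigmoid(W["o"] @ z + b["o"])         # output gate
    C_tilde = np.tanh(W["c"] @ z + b["c"])   # candidate cell state
    C = f * C_prev + i * C_tilde             # C_t = f_t ⊙ C_{t-1} + i_t ⊙ C̃_t
    h = o * np.tanh(C)
    return h, C

h, C = np.zeros(d_hid), np.zeros(d_hid)
for _ in range(3):
    h, C = lstm_step(rng.normal(size=d_in), h, C)
print(h)
```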
Week 8 · Bonus Module
Multimodal NLP & Multilingual Models
CLIP · LLaVA · Flamingo · mBERT · XLM-R
⭐ New Bonus Module — Additional Graduate Content
- CLIP: contrastive image-text pretraining, InfoNCE loss, zero-shot
- LLaVA: visual instruction tuning, projection layer
- Flamingo: gated cross-attention, interleaved sequences
- mBERT: 104 languages, cross-lingual transfer
- XLM-R: CC-100, SentencePiece 250K, XNLI benchmarks
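A numpy sketch of the symmetric InfoNCE objective CLIP trains with: cosine-similarity logits over a batch of matched pairs, with cross-entropy in both the image→text and text→image directions (batch size, dimensions, and embeddings below are toy values):

```python
import numpy as np

def info_nce(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of matched (image, text) pairs."""
    # L2-normalize, then cosine-similarity logits scaled by temperature
    I = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    T = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = (I @ T.T) / temperature
    n = len(logits)
    labels = np.arange(n)  # the i-th image matches the i-th text
    # log-softmax rows, cross-entropy in both directions
    lp_i2t = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    lp_t2i = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    return -(lp_i2t[labels, labels].mean() + lp_t2i[labels, labels].mean()) / 2

rng = np.random.default_rng(0)
print(info_nce(rng.normal(size=(4, 8)), rng.normal(size=(4, 8))))
```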
Week 9 · Mar 9–13
Transformers, BERT & GPT
Module 9 — The Architecture That Changed NLP
🔬 Beyond the Slides — Graduate Depth Content Inside
- Why Transformers? LSTM limitations & pre-training benefits
- Encoder vs. Decoder: BERT, GPT, T5/BART families
- Self-Attention: Q, K, V — Attention(Q,K,V) = softmax(QKᵀ/√d_k)·V
- Positional Encoding: sine (even) / cosine (odd) indices
- BERT: MLM + NSP pre-training; GPT: autoregressive generation
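The attention formula above, directly in numpy, with Q, K, V projected from the same token matrix to make it self-attention (single head; real Transformers run several heads in parallel and add residual connections):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # row-wise softmax (numerically stable: subtract the row max)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))                         # 5 tokens, d_model = 16
Wq, Wk, Wv = (rng.normal(size=(16, 16)) for _ in range(3))
out = attention(x @ Wq, x @ Wk, x @ Wv)              # self-attention over x
print(out.shape)  # (5, 16)
```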
Week 10 · Mar 16–20
Sequence Labeling: POS Tagging & NER
Module 10 — Token-Level Classification
🔬 Beyond the Slides — Graduate Depth Content Inside
- Sequence labeling: one label per token (input = output length)
- POS Tagging: Penn Treebank tagset, lexical ambiguity, perceptron & BERT
- NER: 9 entity types, IOB/BIO tagging — B-PER, I-ORG, O
- Tooling: NLTK ne_chunk (binary=True/False), a Keras transformer tagger, and a fine-tuned BERT NER model (dslim/bert-base-NER)
- Why encoder-only (BERT) is preferred for NER/POS over encoder-decoder
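A small helper showing how BIO tags decode back into entity spans; getting this flush logic right is most of the work when scoring NER output (the example sentence is invented):

```python
def bio_to_spans(tokens, tags):
    """Collect (entity_type, token_span) chunks from BIO tags."""
    spans, start, ent = [], None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel flushes the last entity
        if tag.startswith("B-") or tag == "O" or (ent and tag[2:] != ent):
            if ent is not None:
                spans.append((ent, tokens[start:i]))
                ent = None
        if tag.startswith("B-"):
            start, ent = i, tag[2:]
        elif tag.startswith("I-") and ent is None:
            start, ent = i, tag[2:]  # tolerate an I- without a preceding B-
    return spans

tokens = ["Barack", "Obama", "visited", "Microsoft", "HQ"]
tags   = ["B-PER", "I-PER", "O", "B-ORG", "O"]
print(bio_to_spans(tokens, tags))
# [('PER', ['Barack', 'Obama']), ('ORG', ['Microsoft'])]
```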
Week 11 · Bonus Module
Post-Training Alignment Pipeline
SFT → RLHF → Constitutional AI → DPO
⭐ New Bonus Module — Additional Graduate Content
- Supervised Fine-Tuning (SFT): data, format, loss
- RLHF with PPO: reward model, KL penalty, 4-model setup
- Constitutional AI (RLAIF): Anthropic's approach
- DPO: closed-form alignment without RL
- KTO: preference optimization from binary labels
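The DPO objective in one function: push the policy's log-probability margin between chosen and rejected responses above the frozen reference model's margin, through a logistic loss (the sequence log-probabilities below are invented toy numbers):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO: -log sigmoid of the beta-scaled margin over the reference."""
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)) written as log(1 + exp(-margin))
    return math.log(1.0 + math.exp(-margin))

# toy log-probs under the policy and the frozen reference
print(dpo_loss(logp_chosen=-12.0, logp_rejected=-15.0,
               ref_logp_chosen=-13.0, ref_logp_rejected=-14.0))
```

No reward model and no sampling loop are needed, which is why DPO is called "closed-form" alignment.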
Week 12 · Mar 30–Apr 3
Topic Modeling: LSI & LDA
Module 11 — Unsupervised Learning
📚 Course Content
- Topic modeling: unsupervised extraction of hidden topics
- LSI: SVD decomposition A = U × Σ × Vᵀ (term-concept and document-concept matrices)
- LSI querying: map a term or query into the concept space via Vᵀ
- LDA: Dirichlet distributions for doc-topic (α) and topic-word (β)
- Gibbs sampling for approximate inference of LDA's hidden topic assignments
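A numpy sketch of LSI, assuming the terms × documents orientation of A (toy counts): truncate the SVD to k = 2 concepts, then fold a query into the concept space with q̂ = Σₖ⁻¹ Uₖᵀ q:

```python
import numpy as np

# term-document count matrix A (terms x docs), factored as A = U Σ V^T
terms = ["ship", "boat", "ocean", "tree", "wood"]
A = np.array([
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

U, S, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                                      # keep 2 latent "concepts"

doc_space = (np.diag(S[:k]) @ Vt[:k]).T    # docs as k-dim concept vectors

# fold a query into concept space: q_hat = Sigma_k^{-1} U_k^T q
q = np.array([1, 0, 1, 0, 0], dtype=float)  # query: "ship ocean"
q_hat = np.linalg.inv(np.diag(S[:k])) @ U[:, :k].T @ q
print(np.round(q_hat, 2))
print(np.round(doc_space, 2))
```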
Week 13 · Apr 6–10
Generative AI, Prompt Engineering & RAG
Module 12 — Modern LLM Applications
🔬 Beyond the Slides — Graduate Depth Content Inside
- Generative AI: autoregressive decoding, LLM training pipeline
- Temperature, Top-p sampling — controlling generation diversity
- Prompt Engineering: Zero-shot, Few-shot, CoT, Self-Consistency, ReAct
- Advanced prompts: Tree-of-Thought, prompt chaining, structured output
- RAG: chunking, vector DB, semantic search, augmented generation
- RAG vs. Fine-Tuning vs. Prompting: decision framework
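Temperature and nucleus (top-p) sampling in numpy: temperature rescales the logits, then sampling is restricted to the smallest set of tokens whose probability mass reaches top_p (the logits are invented):

```python
import numpy as np

def sample_next(logits, temperature=0.8, top_p=0.9, seed=None):
    """Temperature-scaled, nucleus (top-p) sampling over a vocabulary."""
    rng = np.random.default_rng(seed)
    probs = np.exp(logits / temperature)       # lower temp => sharper
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]            # most likely first
    cum = np.cumsum(probs[order])
    keep = order[: np.searchsorted(cum, top_p) + 1]  # smallest set >= top_p mass
    p = probs[keep] / probs[keep].sum()        # renormalize over the nucleus
    return rng.choice(keep, p=p)

logits = np.array([2.0, 1.5, 0.2, -1.0, -3.0])  # toy next-token scores
print(sample_next(logits))
```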
Week 14 · Apr 13–17
Foundations of Agentic AI & Reasoning
Module 13 L1&L2 — Intelligent Agents
🔬 Beyond the Slides — Graduate Depth Content Inside
- Agent vs. chatbot: perceive → reason → act → observe loop
- Memory systems: in-context, episodic, semantic, procedural
- Tool use & function calling: extending agents into the real world
- ReAct framework: Thought → Action → Observation cycles
- Reflexion, Tree-of-Thought, MCTS reasoning frameworks
- Agent failure modes: goal drift, prompt injection, context overflow
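A skeletal ReAct loop. `call_llm` here is a hypothetical stand-in (canned strings, not a real API) so the Thought → Action → Observation plumbing is runnable end to end; the hard step limit guards against the runaway-loop failure mode noted above:

```python
import re

def call_llm(prompt: str) -> str:
    # placeholder: a real agent would query an LLM that emits either
    # "Thought: ... Action: calc[...]" or "Final Answer: ..."
    if "Observation" in prompt:
        return "Final Answer: 42"
    return "Thought: I should compute it.\nAction: calc[6 * 7]"

def calc(expr: str) -> str:
    # toy tool; never eval untrusted input in a real agent
    return str(eval(expr, {"__builtins__": {}}))

prompt = "Question: what is 6 * 7?"
for _ in range(5):                       # step limit: avoid runaway loops
    reply = call_llm(prompt)
    if reply.startswith("Final Answer:"):
        print(reply)
        break
    action = re.search(r"Action: calc\[(.+?)\]", reply)
    observation = calc(action.group(1)) if action else "no action found"
    prompt += f"\n{reply}\nObservation: {observation}"  # Thought→Action→Observation
```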
Week 15 · Apr 20–24 ← FINAL WEEK
Multi-Agent Systems & Future of NLP
Module 13 L3 — Collaboration & Frontiers
🔬 Beyond the Slides — Graduate Depth Content Inside
- Multi-agent topologies: hub-spoke, chain, mesh, hierarchical
- Frameworks: LangGraph (graph), AutoGen (conversational), CrewAI (roles)
- Agent communication protocols & termination conditions
- Real-world applications: healthcare, legal, finance, software dev
- NLP evolution timeline: 2012 → 2026 → beyond
- Open research frontiers: hallucination, interpretability, continual learning
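A toy hub-and-spoke sketch with plain functions standing in for agents: a coordinator routes work through researcher → writer → reviewer roles and terminates on approval or a round limit (all names and logic here are hypothetical, not any framework's API):

```python
def researcher(task):
    return f"notes on {task}"

def writer(notes):
    return f"draft based on {notes}"

def reviewer(draft):
    # trivial acceptance test standing in for a critique step
    return ("approve", draft) if "draft" in draft else ("revise", draft)

def coordinator(task, max_rounds=3):
    # hub: owns the workflow; spokes never talk to each other directly
    notes = researcher(task)
    draft = writer(notes)
    for _ in range(max_rounds):          # termination: approval or round limit
        verdict, draft = reviewer(draft)
        if verdict == "approve":
            return draft
        draft = writer(notes)
    return draft

print(coordinator("multi-agent NLP survey"))
```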
📅 Quiz Schedule & Coverage
All 12 Quizzes — Week Coverage & Dates
| Quiz | Covers Week | Topic | Due | Status |
|---|---|---|---|---|
| Quiz 1 | Week 1 | Text Preprocessing & Representations | Jan 23 | ✓ Done |
| Quiz 2 | Week 2 | Naïve Bayes & Classification | Jan 30 | ✓ Done |
| Quiz 3 | Week 3 | Logistic Regression, SVM & Perceptron | Feb 6 | ✓ Done |
| Quiz 4 | Week 4 | SVD, Co-occurrence & GloVe | Feb 13 | ✓ Done |
| Quiz 5 | Week 5 | Neural Networks & Word2Vec | Feb 20 | ✓ Done |
| Quiz 6 | Week 6 | CNN & RNN | Feb 27 | ✓ Done |
| Quiz 7 | Week 7 | LSTM, GRU & Attention | Mar 6 | ✓ Done |
| Quiz 8 | Week 9 | Transformers, BERT & GPT | Mar 20 | ✓ Done |
| Quiz 9 | Week 10 | POS Tagging & NER | Apr 3 | ✓ Done |
| Quiz 10 | Week 12 | Topic Modeling: LSI & LDA | Apr 10 | ✓ Done |
| Quiz 11 | Week 13 | GenAI, Prompt Engineering & RAG | Apr 17 | ✓ Done |
| Quiz 12 | Weeks 14&15 | Agentic AI & Multi-Agent Systems | Apr 24 | ⚠ Due Apr 24! |
🗺️ Course Learning Roadmap
From raw text → intelligent models
Raw Text → Preprocessing (W1) → Representations (W1) → Classifiers (W2–W3) → Embeddings (W4–W5) → Deep Models (W5–W6) → Transformers (W9) → Seq. Labeling (W10) → Topic Models (W12) → LLMs & RAG (W13) → AI Agents (W14–W15)
💡 How to Use These Notes
What → Why → How
Every concept is explained in three layers: what it is, why we need it, and how it works mechanically. Read all three before moving on.
Use the Interactive Demos
Each week has live visualizers and clickable demos. Change the inputs and watch outputs update — this is the fastest way to build intuition.
Quiz Yourself
Each week ends with multiple-choice questions modeled after actual Quizzes 1–7. Cover the answer, try it yourself, then reveal. Repeat weak spots.
Follow the Thread
Each week builds on the last. If something in Week 9 is confusing, check Week 7 (LSTMs). The roadmap above shows exactly how concepts connect.