CSE 8803 · Spring 2026

Applied NLP Study Guide

Your interactive learning companion for all 15 weeks. Built from slides, transcripts, and real quiz patterns — with visuals, demos, and quiz prep.

📅 Final: Week 15 (Apr 20–24) 📚 15 Weeks + 2 Bonus Modules 🎯 Quiz 12 due Apr 24 — HW5 due Apr 24
Week 15 of 15 — Course Complete! 🎓
📚 Weekly Study Notes
Week 1 · Jan 12–16

Intro & Text Preprocessing

Modules 1 & 2 — Foundational NLP
🔬 Beyond the Slides — Graduate Depth Content Inside
  • Course overview & NLP problem types
  • Tokenization, stemming, lemmatization
  • Stop words, N-grams, noise removal
  • One-Hot Encoding, Bag of Words (BoW)
  • TF-IDF importance weighting
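
To make the weighting concrete, here is a minimal TF-IDF sketch in plain Python, using the classic tf × log(N/df) variant (other weightings exist); the toy corpus is invented, not course data:

```python
import math
from collections import Counter

# Toy corpus (illustrative only, not from the course materials)
docs = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
N = len(docs)

# Document frequency: number of docs containing each term
df = Counter(term for doc in docs for term in set(doc))

def tfidf(doc):
    tf = Counter(doc)
    # Classic variant: raw term frequency x log(N / df).
    # "the" appears in every doc, so its idf (and weight) is 0.
    return {t: tf[t] * math.log(N / df[t]) for t in tf}

print(tfidf(docs[0]))  # {'the': 0.0, 'cat': 0.405..., 'sat': 0.405...}
```

Note how the stop-word-like "the" zeroes itself out: that is exactly the "importance weighting" idea from the bullet above.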
Week 2 · Jan 19–23

Naïve Bayes & Classification Evaluation

Module 3 — First Classifier
📚 Course Content
  • Bayes’ Rule: Prior, Likelihood, Posterior
  • Generative vs. Discriminative models
  • Multinomial Naïve Bayes for NLP
  • Confusion Matrix, Precision, Recall, F1 (worked example after this list)
  • ROC Curve & AUC
  Note: MLK Day holiday Monday, Jan 19
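
The evaluation metrics reduce to a few lines; the confusion-matrix counts below are hypothetical:

```python
# Counts from a hypothetical binary confusion matrix
tp, fp, fn, tn = 40, 10, 5, 45

precision = tp / (tp + fp)          # of predicted positives, how many were right
recall    = tp / (tp + fn)          # of actual positives, how many we found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of P and R

print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
# P=0.80 R=0.89 F1=0.84
```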
Week 3 · Jan 26–30

Logistic Regression, SVM & Perceptron

Module 4 — Linear Classifiers
🔬 Beyond the Slides — Graduate Depth Content Inside
  • Sigmoid function & soft classification
  • Gradient descent & MLE
  • Support Vector Machines & margin
  • Kernel trick (non-linear SVM)
  • Perceptron algorithm & convergence
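
The perceptron's mistake-driven update fits in a few lines. A sketch on an invented, linearly separable toy set (labels in {-1, +1}):

```python
import numpy as np

def perceptron(X, y, epochs=10):
    """Classic perceptron: update only on mistakes. Converges in a finite
    number of updates if the data are linearly separable."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) <= 0:   # misclassified (or on the boundary)
                w += yi * xi             # nudge w toward the correct side
                b += yi
    return w, b

# Tiny linearly separable toy set (illustrative)
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = perceptron(X, y)
print(np.sign(X @ w + b))  # [ 1.  1. -1. -1.]
```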
Week 4 · Feb 2–6

SVD, Co-occurrence & GloVe

Module 5 — Word Embeddings I
📚 Course Content
  • Why dimensionality reduction?
  • Co-occurrence matrix construction
  • Singular Value Decomposition (SVD)
  • Dense word embeddings from SVD (sketched after this list)
  • GloVe: global co-occurrence + semantics
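
Dense embeddings from co-occurrence counts, sketched with NumPy's SVD. The counts are toy numbers, and scaling U by the singular values is one common convention among several (U alone and U√Σ also appear in practice):

```python
import numpy as np

# Hypothetical 4x4 word-word co-occurrence counts (symmetric, toy numbers)
vocab = ["cat", "dog", "runs", "sleeps"]
C = np.array([[0, 4, 2, 3],
              [4, 0, 3, 2],
              [2, 3, 0, 1],
              [3, 2, 1, 0]], dtype=float)

# Full SVD: C = U @ diag(S) @ Vt
U, S, Vt = np.linalg.svd(C)

# Keep the top-k singular directions -> dense k-dimensional word embeddings
k = 2
embeddings = U[:, :k] * S[:k]   # scale each column by its singular value

for word, vec in zip(vocab, embeddings):
    print(word, np.round(vec, 2))
```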
Week 5 · Feb 9–13

Neural Networks & Word2Vec

Module 6 — Deep Learning Intro
🔬 Beyond the Slides — Graduate Depth Content Inside
  • Activation functions: ReLU, Sigmoid, Tanh
  • Forward pass & Backpropagation
  • Stochastic & Batch Gradient Descent
  • Word2Vec: CBoW model
  • Word2Vec: Skip-Gram model
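
How Skip-Gram turns raw text into (center, context) training pairs; CBoW simply reverses the direction, predicting the center from its context. The sentence is illustrative:

```python
# Skip-Gram training pairs: each word predicts the words in its window.
sentence = ["the", "quick", "fox", "jumps"]  # illustrative toy sentence
window = 1

pairs = []
for i, center in enumerate(sentence):
    for j in range(max(0, i - window), min(len(sentence), i + window + 1)):
        if j != i:
            pairs.append((center, sentence[j]))  # (center -> context)

print(pairs)
# [('the','quick'), ('quick','the'), ('quick','fox'),
#  ('fox','quick'), ('fox','jumps'), ('jumps','fox')]
```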
Week 6 · Feb 16–20

Convolutional & Recurrent Neural Networks

Module 7 — Sequence & Structure
📚 Course Content
  • CNN: filters, convolution, feature maps
  • Max pooling & parameter sharing
  • CNN for text classification (see the sketch after this list)
  • RNN: sequential memory & hidden state
  • Vanishing gradient & exploding gradient
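
One filter of width 3 sliding over token embeddings, then max-over-time pooling. Real text CNNs use many filters, several widths, and a nonlinearity; this sketch keeps only the core mechanic:

```python
import numpy as np

# Toy "sentence": 6 tokens with 4-dim embeddings (random, illustrative)
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))        # (seq_len, embed_dim)
W = rng.normal(size=(3, 4))        # one filter spanning 3 tokens

# Slide the filter over every 3-token window -> one feature map
feature_map = np.array([np.sum(W * X[i:i + 3]) for i in range(6 - 3 + 1)])

# Max-over-time pooling: keep the strongest activation for this filter
feature = feature_map.max()
print(feature_map.shape, feature)   # (4,) and a scalar
```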
Week 7 · Feb 23–27

LSTM, GRU & Attention

Module 8 — Long-Term Memory & Focus
🔬 Beyond the Slides — Graduate Depth Content Inside
  • LSTM: cell state, forget / input / output gates
  • Cell state update: C_t = f_t⊙C_{t-1} + i_t⊙C̃_t (implemented after this list)
  • GRU: 2-gate simplified LSTM
  • Encoder-Decoder architecture
  • Additive Attention (Bahdanau): α weights and the context vector c_t
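
One LSTM step in NumPy, implementing the cell-state update from the bullets above. Dropping the bias terms and the toy sizes are simplifications of mine, not course code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, C_prev, Wf, Wi, Wo, Wc):
    """One LSTM step. Each W maps the concatenated [h_prev, x] to the
    hidden size; biases are omitted for brevity."""
    z = np.concatenate([h_prev, x])
    f = sigmoid(Wf @ z)              # forget gate
    i = sigmoid(Wi @ z)              # input gate
    o = sigmoid(Wo @ z)              # output gate
    C_tilde = np.tanh(Wc @ z)        # candidate cell state
    C = f * C_prev + i * C_tilde     # C_t = f_t⊙C_{t-1} + i_t⊙C̃_t
    h = o * np.tanh(C)               # new hidden state
    return h, C

# Toy sizes: hidden = 3, input = 2
rng = np.random.default_rng(1)
Wf, Wi, Wo, Wc = (rng.normal(size=(3, 5)) for _ in range(4))
h, C = lstm_step(rng.normal(size=2), np.zeros(3), np.zeros(3), Wf, Wi, Wo, Wc)
print(h.shape, C.shape)  # (3,) (3,)
```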
Week 8 · Bonus Module

Multimodal NLP & Multilingual Models

CLIP · LLaVA · Flamingo · mBERT · XLM-R
⭐ New Bonus Module — Additional Graduate Content
  • CLIP: contrastive image-text pretraining, InfoNCE loss, zero-shot (loss sketched after this list)
  • LLaVA: visual instruction tuning, projection layer
  • Flamingo: gated cross-attention, interleaved sequences
  • mBERT: 104 languages, cross-lingual transfer
  • XLM-R: CC-100, SentencePiece 250K, XNLI benchmarks
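
A sketch of CLIP's symmetric InfoNCE objective, assuming a batch of N matched image-text embedding pairs; the real model learns the temperature and trains with very large batches:

```python
import numpy as np

def clip_loss(img, txt, temperature=0.07):
    """Symmetric InfoNCE over paired image/text embeddings.
    A sketch of the idea, not OpenAI's implementation."""
    img = img / np.linalg.norm(img, axis=1, keepdims=True)   # unit-normalize
    txt = txt / np.linalg.norm(txt, axis=1, keepdims=True)
    logits = img @ txt.T / temperature        # (N, N) similarity matrix
    labels = np.arange(len(img))              # matching pair = the diagonal

    def xent(l):  # row-wise cross-entropy against the diagonal entries
        l = l - l.max(axis=1, keepdims=True)  # numeric stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    return (xent(logits) + xent(logits.T)) / 2  # image->text and text->image

rng = np.random.default_rng(2)
print(clip_loss(rng.normal(size=(4, 8)), rng.normal(size=(4, 8))))
```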
Week 9 · Mar 9–13

Transformers, BERT & GPT

Module 9 — The Architecture That Changed NLP
🔬 Beyond the Slides — Graduate Depth Content Inside
  • Why Transformers? LSTM limitations & pre-training benefits
  • Encoder vs. Decoder: BERT, GPT, T5/BART families
  • Self-Attention: Q, K, V — Attention(Q,K,V) = softmax(QKᵀ/√d_k)·V (NumPy sketch after this list)
  • Positional Encoding: sine (even) / cosine (odd) indices
  • BERT: MLM + NSP pre-training; GPT: autoregressive generation
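
The attention formula above, computed directly in NumPy (single head, no masking, toy shapes):

```python
import numpy as np

def attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q Kᵀ / √d_k) V"""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (n_q, n_k) similarities
    scores -= scores.max(axis=-1, keepdims=True)  # numeric stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V                            # weighted sum of values

rng = np.random.default_rng(3)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))  # 4 tokens, d_k = 8
print(attention(Q, K, V).shape)  # (4, 8): one output vector per query
```

Stacking several of these with learned Q/K/V projections gives multi-head attention.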
Week 10 · Mar 16–20

Sequence Labeling: POS Tagging & NER

Module 10 — Token-Level Classification
🔬 Beyond the Slides — Graduate Depth Content Inside
  • Sequence labeling: one label per token (output sequence matches input length)
  • POS Tagging: Penn Treebank tagset, lexical ambiguity, perceptron & BERT
  • NER: 9 BIO labels (4 entity types + O) — B-PER, I-ORG, O (decoder sketched after this list)
  • Tooling: NLTK ne_chunk (binary=True/False), Keras transformer tagger, dslim/bert-base-NER
  • Why encoder-only (BERT) is preferred for NER/POS over encoder-decoder
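
A small decoder that collapses BIO tags into entity spans; tokens and tags are invented examples:

```python
def bio_to_spans(tokens, tags):
    """Collapse BIO tags into (entity_type, text) spans."""
    spans, current, ctype = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):                    # a new entity begins
            if current:
                spans.append((ctype, " ".join(current)))
            current, ctype = [tok], tag[2:]
        elif tag.startswith("I-") and ctype == tag[2:]:
            current.append(tok)                     # continue the entity
        else:                                       # "O" or inconsistent I-
            if current:
                spans.append((ctype, " ".join(current)))
            current, ctype = [], None
    if current:
        spans.append((ctype, " ".join(current)))
    return spans

tokens = ["Barack", "Obama", "visited", "Georgia", "Tech"]
tags   = ["B-PER", "I-PER", "O", "B-ORG", "I-ORG"]
print(bio_to_spans(tokens, tags))
# [('PER', 'Barack Obama'), ('ORG', 'Georgia Tech')]
```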
Week 11 · Bonus Module

Post-Training Alignment Pipeline

SFT → RLHF → Constitutional AI → DPO
⭐ New Bonus Module — Additional Graduate Content
  • Supervised Fine-Tuning (SFT): data, format, loss
  • RLHF with PPO: reward model, KL penalty, 4-model setup
  • Constitutional AI (RLAIF): Anthropic's approach
  • DPO: closed-form alignment without RL (loss sketched after this list)
  • KTO: preference optimization from binary labels
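
The DPO loss on made-up log-probabilities: it rewards the policy for preferring the chosen response more strongly than the frozen reference does, with β playing the role of the KL strength:

```python
import numpy as np

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO: -log sigmoid(beta * (policy log-ratio for the chosen response
    minus the log-ratio for the rejected one)). Log-probs are sums over
    response tokens; the toy values below are invented."""
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    return -np.log(1.0 / (1.0 + np.exp(-beta * margin)))

# Policy already prefers the chosen answer a bit more than the reference does
print(dpo_loss(logp_w=-12.0, logp_l=-15.0, ref_logp_w=-13.0, ref_logp_l=-14.0))
# ~0.598; the loss falls as the preference margin grows
```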
Week 12 · Mar 30–Apr 3

Topic Modeling: LSI & LDA

Module 11 — Unsupervised Learning
📚 Course Content
  • Topic modeling: unsupervised extraction of hidden topics
  • LSI: SVD decomposition A = U × Σ × Vᵀ (term-concept, concept-strength, concept-document matrices)
  • LSI querying: map terms into concept space via Vᵀ
  • LDA: Dirichlet distributions for doc-topic (α) and topic-word (β)
  • Gibbs sampling for approximate LDA inference (runnable demo after this list)
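
A runnable topic-modeling demo with scikit-learn. Two caveats: the corpus is made up, and scikit-learn fits LDA with online variational Bayes rather than the Gibbs sampler discussed in lecture:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Toy corpus (illustrative): two animal docs, two finance docs
docs = ["the cat chased the mouse", "dogs and cats are playful pets",
        "stocks fell as markets slid", "investors sold stocks today"]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(docs)                       # doc-term count matrix
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

terms = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):       # unnormalized topic-word weights
    top = terms[topic.argsort()[::-1][:3]]        # 3 highest-weight terms
    print(f"Topic {k}: {', '.join(top)}")
```

With only four tiny documents the topics will be noisy; the point is the pipeline shape, not the output quality.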
Week 13 · Apr 6–10

Generative AI, Prompt Engineering & RAG

Module 12 — Modern LLM Applications
🔬 Beyond the Slides — Graduate Depth Content Inside
  • Generative AI: autoregressive decoding, LLM training pipeline
  • Temperature, Top-p sampling — controlling generation diversity
  • Prompt Engineering: Zero-shot, Few-shot, CoT, Self-Consistency, ReAct
  • Advanced prompts: Tree-of-Thought, prompt chaining, structured output
  • RAG: chunking, vector DB, semantic search, augmented generation (toy retriever after this list)
  • RAG vs. Fine-Tuning vs. Prompting: decision framework
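
The retrieve-then-augment pattern in miniature. The bag-of-words "embedder" is a stand-in of mine; a real pipeline swaps in a neural sentence encoder and a proper vector database:

```python
import re
import numpy as np

chunks = ["LDA models topics with Dirichlet priors.",
          "LSTMs use gates to carry long-range state.",
          "RAG retrieves relevant chunks to ground generation."]

def bow(text, vocab):
    """Toy bag-of-words 'embedding' (stand-in for a neural encoder)."""
    words = re.findall(r"[a-z]+", text.lower())
    v = np.array([words.count(w) for w in vocab], float)
    n = np.linalg.norm(v)
    return v / n if n else v

vocab = sorted({w for c in chunks for w in re.findall(r"[a-z]+", c.lower())})
index = np.stack([bow(c, vocab) for c in chunks])   # tiny in-memory "vector DB"

def retrieve(query, k=1):
    sims = index @ bow(query, vocab)                # cosine similarity
    return [chunks[i] for i in np.argsort(sims)[::-1][:k]]

context = retrieve("how does retrieval-augmented generation ground an answer?")[0]
print(f"Use this context to answer:\n{context}\n\nQuestion: ...")
```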
Week 14 · Apr 13–17

Foundations of Agentic AI & Reasoning

Module 13, Lectures 1 & 2 — Intelligent Agents
🔬 Beyond the Slides — Graduate Depth Content Inside
  • Agent vs. chatbot: perceive → reason → act → observe loop
  • Memory systems: in-context, episodic, semantic, procedural
  • Tool use & function calling: extending agents into the real world
  • ReAct framework: Thought → Action → Observation cycles (skeleton after this list)
  • Reflexion, Tree-of-Thought, MCTS reasoning frameworks
  • Agent failure modes: goal drift, prompt injection, context overflow
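
A skeletal ReAct loop. The llm function is a hard-coded stand-in for a real model call, and the Action-parsing and stopping rules are illustrative assumptions, not any framework's actual API:

```python
def calculator(expr):
    return str(eval(expr, {"__builtins__": {}}))   # demo only; unsafe in general

TOOLS = {"calculator": calculator}

def llm(transcript):
    # Stand-in policy: ask for one calculation, then answer.
    if "Observation:" not in transcript:
        return "Thought: I need math.\nAction: calculator[2 * (3 + 4)]"
    return "Final Answer: 14"

def react(question, max_steps=3):
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = llm(transcript)
        transcript += "\n" + step
        if step.startswith("Final Answer:"):
            return step
        tool, arg = step.split("Action: ")[1].split("[", 1)
        obs = TOOLS[tool](arg.rstrip("]"))
        transcript += f"\nObservation: {obs}"      # feed the result back in
    return "Stopped: step limit reached"           # guard against runaway loops

print(react("What is 2 * (3 + 4)?"))  # Final Answer: 14
```

The step limit is one concrete defense against the goal-drift and context-overflow failure modes listed above.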
Week 15 · Apr 20–24 ← FINAL WEEK

Multi-Agent Systems & Future of NLP

Module 13, Lecture 3 — Collaboration & Frontiers
🔬 Beyond the Slides — Graduate Depth Content Inside
  • Multi-agent topologies: hub-spoke, chain, mesh, hierarchical (hub-spoke sketched after this list)
  • Frameworks: LangGraph (graph), AutoGen (conversational), CrewAI (roles)
  • Agent communication protocols & termination conditions
  • Real-world applications: healthcare, legal, finance, software dev
  • NLP evolution timeline: 2012 → 2026 → beyond
  • Open research frontiers: hallucination, interpretability, continual learning
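
Hub-and-spoke in miniature: stub agents (hypothetical lambdas, not LLM calls) and an explicit termination condition. Frameworks like LangGraph, AutoGen, and CrewAI add state, memory, and real model-backed workers:

```python
# Hub-and-spoke topology: a hub routes the task through specialist agents
# and decides when to stop. Agent behaviors are stubs, standing in for
# LLM-backed workers.
AGENTS = {
    "researcher": lambda task: f"notes on '{task}'",
    "writer":     lambda notes: f"draft based on {notes}",
    "critic":     lambda draft: "APPROVE",   # would normally return feedback
}

def hub(task, max_rounds=3):
    notes = AGENTS["researcher"](task)
    draft = AGENTS["writer"](notes)
    for _ in range(max_rounds):
        verdict = AGENTS["critic"](draft)
        if verdict == "APPROVE":              # critic signals termination
            return draft
        draft = AGENTS["writer"](verdict)     # otherwise revise and loop
    return draft                              # round limit as a safety stop

print(hub("summarize Week 15"))
```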
📅 Quiz Schedule & Coverage

All 12 Quizzes — Week Coverage & Dates

Quiz     | Covers      | Topic                                 | Due    | Status
Quiz 1   | Week 1      | Text Preprocessing & Representations  | Jan 23 | ✓ Done
Quiz 2   | Week 2      | Naïve Bayes & Classification          | Jan 30 | ✓ Done
Quiz 3   | Week 3      | Logistic Regression, SVM & Perceptron | Feb 6  | ✓ Done
Quiz 4   | Week 4      | SVD, Co-occurrence & GloVe            | Feb 13 | ✓ Done
Quiz 5   | Week 5      | Neural Networks & Word2Vec            | Feb 20 | ✓ Done
Quiz 6   | Week 6      | CNN & RNN                             | Feb 27 | ✓ Done
Quiz 7   | Week 7      | LSTM, GRU & Attention                 | Mar 6  | ✓ Done
Quiz 8   | Week 9      | Transformers, BERT & GPT              | Mar 20 | ✓ Done
Quiz 9   | Week 10     | POS Tagging & NER                     | Apr 3  | ✓ Done
Quiz 10  | Week 12     | Topic Modeling: LSI & LDA             | Apr 10 | ✓ Done
Quiz 11  | Week 13     | GenAI, Prompt Engineering & RAG       | Apr 17 | ✓ Done
Quiz 12  | Weeks 14–15 | Agentic AI & Multi-Agent Systems      | Apr 24 | ⚠ Due Apr 24!
🗺️ Course Learning Roadmap
From raw text → intelligent models
Raw Text → Preprocessing (W1) → Representations (W1) → Classifiers (W2–W3) → Embeddings (W4–W5) → Deep Models (W5–W6)
→ Transformers (W9) → Seq. Labeling (W10) → Topic Models (W12) → LLMs & RAG (W13) → AI Agents (W14–W15)
💡 How to Use These Notes
🧠

What → Why → How

Every concept is explained in three layers: what it is, why we need it, and how it works mechanically. Read all three before moving on.

🎮

Use the Interactive Demos

Each week has live visualizers and clickable demos. Change the inputs and watch outputs update — this is the fastest way to build intuition.

📝

Quiz Yourself

Each week ends with multiple-choice questions modeled after the actual course quizzes. Cover the answer, try it yourself, then reveal. Repeat weak spots.

🔗

Follow the Thread

Each week builds on the last. If something in Week 9 is confusing, check Week 7 (LSTMs). The roadmap above shows exactly how concepts connect.