CSCI-SHU 376: Natural Language Processing
Hua Shen
Course Website: 2026 Spring-NLP-[CSCI-SHU-376]-Class Schedule
2026-01-29
Spring 2026
Lecture 3: Text Classification
Today’s Plan
Why Text Classification
Text Classification
Rule-based Text Classification
Supervised Learning
Types of Supervised Learning
Naïve Bayes Classifier
How to represent P(d | c)
Bag of Words
Predicting with Naïve Bayes
Estimate probabilities
Smoothing
Naïve Bayes: Overall Process
A Worked Example
Naïve Bayes: Pros and Cons
Logistic Regression
Generative vs Discriminative models
Generative Classifier
Discriminative Classifier
Overall Process
Feature Representation
Classification function
Loss function
Bernoulli distribution
Optimization
Gradients for binary logistic regression
Multinomial Logistic Regression