Welcome to STATS 305B! Officially, this course is called Applied Statistics II. Unofficially, I call it Models and Algorithms for Discrete Data. We will cover models ranging from generalized linear models to sequential latent variable models, autoregressive models, and transformers. On the algorithm side, we will cover a few techniques for convex optimization, as well as approximate Bayesian inference algorithms like MCMC and variational inference. I think the best way to learn these concepts is to implement them from scratch, so coding will be a big focus of this course. By the end of the course, you'll have a strong grasp of classical techniques as well as modern methods for modeling discrete data.
Instructor: Scott Linderman
TAs: Amber Hu and Michael Salerno
Term: Winter 2024-25
Time: Monday and Wednesday, 1:30-2:50pm
Location: Sequoia Hall, Room 200, Stanford University
Office Hours
- Scott: Wed 10-11am, Wu Tsai Neurosciences Institute, 2nd Floor in the Theory Center
- Michael: Thu, 5-7pm, Sequoia library (Rm 105)
- Amber: Fri 1:30-3:30pm, Sequoia library (Rm 105) [except Feb 7 and 14]
- [Feb 3 and 10 only] Mon 10am-12pm, Wu Tsai Neurosciences Institute, 2nd Floor in the Theory Center
Students should be comfortable with undergraduate probability and statistics as well as multivariate calculus and linear algebra. This course will emphasize implementing models and algorithms, so coding proficiency with Python is required. (HW0: Python Primer will help you get up to speed.)
This course will draw from a few textbooks:
- Agresti, Alan. Categorical Data Analysis, 2nd edition. John Wiley & Sons, 2002. link
- Gelman, Andrew, et al. Bayesian Data Analysis, 3rd edition. Chapman and Hall/CRC, 2013. link
- Bishop, Christopher. Pattern Recognition and Machine Learning. Springer, 2006. link
We will also cover material from research papers.
Please note that this is a tentative schedule. It may change slightly depending on our pace.
Date | Topic | Slides | Additional Reading |
---|---|---|---|
Mon, Jan 6, 2025 | Basics of Probability and Statistics and Contingency Tables (HW0 Released) | download | {cite:p}agresti2002categorical Ch. 1-3 |
Wed, Jan 8, 2025 | Logistic Regression | download | {cite:p}agresti2002categorical Ch. 4-5 |
Fri, Jan 10, 2025 | HW0 Due | ||
Mon, Jan 13, 2025 | Exponential Families (HW1 Released) | download | {cite:p}agresti2002categorical Ch. 4-5 |
Wed, Jan 15, 2025 | Generalized Linear Models | download | {cite:p}agresti2002categorical Ch. 6 |
Mon, Jan 20, 2025 | MLK Day. No class | ||
Wed, Jan 22, 2025 | Sparse GLMs | download | {cite:p}friedman2010regularization and {cite:p}lee2014proximal |
Fri, Jan 24, 2025 | HW1 Due | ||
Mon, Jan 27, 2025 | Bayesian Inference (HW2 Released) | download | {cite:p}gelman1995bayesian Ch. 1 |
Wed, Jan 29, 2025 | Markov Chain Monte Carlo and Bayesian GLM Demo | download | |
Mon, Feb 3, 2025 | Variational Inference | download | {cite:p}blei2017variational |
Wed, Feb 5, 2025 | Midterm Exam from 1:30-2:50pm in MCCULL 115 | download download | |
Mon, Feb 10, 2025 | Mixture Models and EM | download | {cite:p}bishop2006pattern Ch. 9 |
Wed, Feb 12, 2025 | Hidden Markov Models (HW2 Due; HW3 Released) | download | {cite:p}bishop2006pattern Ch. 13 |
Mon, Feb 17, 2025 | Presidents' Day. No class | ||
Wed, Feb 19, 2025 | Linear Gaussian Latent Variable Models | download | |
Mon, Feb 24, 2025 | Variational Autoencoders (HW3 Due; HW4 Released) | download | {cite:p}kingma2019introduction Ch. 1-2 |
Wed, Feb 26, 2025 | Transformers | download | {cite:p}turner2023introduction |
Mon, Mar 3, 2025 | Recurrent Neural Networks | download | {cite:p}goodfellow2016deep Ch. 10, {cite:p}smith2023simplified, and {cite:p}gu2023mamba |
Wed, Mar 5, 2025 | Denoising Diffusion Models | download | {cite:p}turner2024denoising |
Mon, Mar 10, 2025 | Poisson Processes | download | |
Wed, Mar 12, 2025 | Discrete Denoising Diffusion Models | ||
Fri, Mar 14, 2025 | HW4 Due | ||
There will be 5 assignments due roughly every other Friday. They will not be equally weighted. The first one is just a primer to get you up to speed; the last one will be a bit more substantial than the rest.
- Homework 0: Python Primer
  - Released Mon, Jan 6, 2025
  - Due Fri, Jan 10, 2025 at 11:59pm
- Homework 1: Logistic Regression
  - Released Mon, Jan 13, 2025
  - Due Fri, Jan 24, 2025 at 11:59pm
- Homework 2
  - Released Wed, Jan 29, 2025
  - Due Wed, Feb 12, 2025 at 11:59pm
- Homework 3: Hidden Markov Models
  - Released Wed, Feb 12, 2025
  - Due Mon, Feb 24, 2025 at 11:59pm
- Homework 4: Large Language Models
  - Released Mon, Feb 24, 2025
  - Due Fri, Mar 14, 2025 at 11:59pm
We will allow 5 late days to be used as needed throughout the quarter.
- Midterm Exam: Wed, Feb 5 from 1:30-2:50pm in MCCULL 115
- Final Exam: Wed, Mar 19 from 3:30-6:30pm in Building 370, Room 370
Tentatively:
Assignment | Percentage |
---|---|
HW 0 | 5% |
HW 1-3 | 15% each |
HW 4 | 20% |
Midterm | 10% |
Final | 15% |
Participation | 5% |
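To see how these tentative weights combine, here is a minimal Python sketch of a weighted course grade. It assumes each component is scored on a 0-100 scale and that the final grade is a plain weighted average. The syllabus does not say how late days, rounding, or letter-grade cutoffs are handled, so none of that is modeled. The function name `course_grade` and the example scores are hypothetical.

```python
# Tentative weights from the grading table, in percent.
WEIGHTS = {
    "HW 0": 5,
    "HW 1": 15,
    "HW 2": 15,
    "HW 3": 15,
    "HW 4": 20,
    "Midterm": 10,
    "Final": 15,
    "Participation": 5,
}


def course_grade(scores):
    """Return the weighted average of component scores (each on a 0-100 scale).

    Assumption: a plain weighted average; the actual grading policy may differ.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / total_weight


# Hypothetical scores, for illustration only: 90 on everything except the final.
example_scores = {name: 90 for name in WEIGHTS}
example_scores["Final"] = 80
print(course_grade(example_scores))
```

Because the homework assignments together account for 70% of the total, a weak score on any single exam moves the overall grade less than a weak run of homework would.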