Mathematics of Big Data
Complete the readings before class. All resources (lecture slides, homework, starter files, homework solutions, and articles) can be found under the Resources tab.
Topics and assigned readings are subject to change.
Date | Topics | Readings and Homework |
---|---|---|
Supervised Learning: Jan 27 | Introduction to Big Data; Linear Regression; Normal Equations and Optimization Techniques; Linear Algebra Review; Covariance Matrix | Read: Murphy 1.{all}; Murphy 7.{1,...,5} |
Feb 3 | Gaussian Distribution; Linear Regression (Probabilistic Approach); Gradient Descent; Newton's Method; Logistic Regression; Exponential Family; Generalized Linear Models | Read: Murphy 8.{1,2,3,5} \ 8.{3.4,3.5}, 9.{1,2.2,2.4,3}. Due: Homework 1; brainstorm for midterm project |
Feb 10 | Probability Review; Generalized Linear Models (continued); Poisson Regression; Softmax Regression; Covariance Matrix; Multivariate Gaussian Distribution; Marginalized Gaussian and the Schur Complement | Read: Murphy 9.7, 4.{1,2,3,4,5,6} (important background). Due: Homework 2; Project Proposal (<1 page) |
Feb 17 | Dimensionality Reduction; Spectral Decomposition; Singular Value Decomposition; Principal Component Analysis; Generative Learning Algorithms; Gaussian Discriminant Analysis; Cholesky Decomposition | Due: Final Project Proposal; Homework 3 |
Feb 24 | Naive Bayes; L1 Regularization and Sparsity; Lasso; Support Vector Machines; Kernels | Read: Murphy 14.{1,2,3,4} \ 14.{4.4}; "MapReduce: Simplified Data Processing on Large Clusters". Due: Homework 4 |
Unsupervised Learning: Mar 2 | Introduction to Unsupervised Learning; Clustering; K-Means; Mixture of Gaussians; Jensen's Inequality; Expectation-Maximization (EM) Algorithm | Read: Murphy 11.{1,2,3,4} \ 11.{4.6,4.9}; "Pegasos: Primal Estimated sub-GrAdient SOlver for SVM"; "Random Features for Large-Scale Kernel Machines". Due: Homework 5 |
Mar 9 | Summary of EM Algorithm; EM for MAP Estimation; Kernel PCA; One Class Support Vector Machines; Learning Theory | Read: Murphy 12.2.{0,1,2,3}, 14.4.4; "Support Vector Method for Novelty Detection". Due: Homework 6 |
Midterm Project Work: Mar 23 | Work on your midterm projects. | Read: None. Due: None |
Midterm Project Presentation: Mar 30 | Be ready to present your midterm projects in class. | Read: None. Due: Midterm presentation and slides |
Midterm Project Due (11:59 pm): Mar 31 | Your midterm projects must be sent to Prof. Gu via email by 11:59 pm. Your submission should include all relevant code and the .tex files for your essay. | Read: None. Due: Midterm project write-up |
Learning Theory: Apr 6 | Bayesian Learning; Bayesian Logistic and Linear Regressions (review); Bayesian Inference; Intractable Integrals and Motivation for Approximate Methods; Learning Theory | Read: "Large-Scale Sparse Principal Component Analysis with Application to Text Data"; "On the Convergence Properties of the EM Algorithm". Due: Homework 7 |
Recommender Systems: Apr 13 | Introduction to Recommender Systems; Collaborative Filtering; Non-Negative Matrix Factorization; Using Non-Negative Matrix Factorization for Topic Modelling | Read: Murphy 27.6.2; "Netflix Update: Try This at Home". Due: Homework 8 |
Graph Methods: Apr 20 | Additional topics will be covered in a workshop from 7:00 to 9:45 pm. | Read: Murphy 10.{1,2,3,4,5,6}. Due: |
Work on final: Apr 27 | | Read: Due: |
May 4 or 11 (TBD) | Final Project Presentation (Mon. 7-9:50 pm) | Due: Final Project Presentation Slides |
May 11 | Final Project Due for non (Tue. 11:59 pm) | Due: Finish writing up final project |