Course description
Instructor
Fabio A. González
Full Professor
Department of Systems and Industrial Engineering
Universidad Nacional de Colombia
Course goal
The main goal of Machine Learning (ML) is the development of systems that are able to autonomously change their behavior based on experience. ML offers some of the most effective techniques for knowledge discovery in large data sets. ML has played a fundamental role in areas such as bioinformatics, information retrieval, business intelligence, and autonomous vehicle development.
The main goal of this course is to study the computational, mathematical, and statistical foundations of ML, which are essential for the theoretical analysis of existing learning algorithms, the development of new algorithms, and the well-founded application of ML to solve real-world problems.
Course topics
1 Introduction to Machine Learning
1.1 History of ML
1.2 The learning problem
1.3 Design and analysis of ML experiments
2 Probabilistic models
2.1 Bayesian decision theory
2.2 Parametric estimation
3 Instance-based learning
3.1 K-nearest neighbors
3.2 Kernel methods
3.3 Support vector machines
4 Function approximation and neural networks
4.1 Neural networks
4.2 Differentiable programming
4.3 Deep learning
5 Non-parametric estimation
5.1 Kernel density estimation and classification
5.2 Gaussian processes
6 Metric learning
6.1 Deep metric learning
6.2 Self-supervised contrastive learning
7 Probabilistic deep learning
7.1 Variational inference
7.2 Generative models
8 Kernel density matrices
8.1 Density matrices
8.2 KDM for density estimation
8.3 KDM for classification and regression
8.4 KDM for generative modeling
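As a small taste of topic 3.1, a 1-nearest-neighbor classifier can be written in a few lines of plain Python. This is a minimal sketch on hypothetical toy data, not course material; in practice a library implementation such as scikit-learn's would be used.

```python
# Toy 1-nearest-neighbor classifier (hypothetical example data).
from math import dist  # Euclidean distance, available since Python 3.8

def nearest_neighbor(train, query):
    """Return the label of the training point closest to `query`.
    `train` is a list of (point, label) pairs."""
    point, label = min(train, key=lambda pl: dist(pl[0], query))
    return label

train = [((0.0, 0.0), "a"), ((1.0, 1.0), "b"), ((0.2, 0.1), "a")]
print(nearest_neighbor(train, (0.9, 0.8)))  # -> b
```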
Evaluation and grading policy
- Participation 10%
- Assignments 20%
- Quizzes 20%
- Exam 20%
- Final project 30%
Course resources
References
- [Alp14] Alpaydin, E. Introduction to Machine Learning, 3rd Ed. The MIT Press, 2014.
- [Mur12] Murphy, K. P. Machine Learning: A Probabilistic Perspective. The MIT Press, 2012.
- [Sha14] Shalev-Shwartz, S., and Ben-David, S. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014.
- [Dei20] Deisenroth, M. P., Faisal, A. A., and Ong, C. S. Mathematics for Machine Learning. Cambridge University Press, 2020.
- [Bar13] Barber, D. Bayesian Reasoning and Machine Learning. Cambridge University Press, 2013.
- [Bis06] Bishop, C. Pattern Recognition and Machine Learning. Springer-Verlag, 2006.
- [HTF09] Hastie, T., Tibshirani, R., and Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2009.
- [GBC16] Goodfellow, I., Bengio, Y., and Courville, A. Deep Learning. MIT Press, 2016.
- [Mit97] Mitchell, T. M. Machine Learning, 1st Ed. McGraw-Hill Higher Education, 1997.
- [DHS00] Duda, R. O., Hart, P. E., and Stork, D. G. Pattern Classification, 2nd Ed. Wiley-Interscience, 2000.
- [SC04] Shawe-Taylor, J., and Cristianini, N. Kernel Methods for Pattern Analysis. Cambridge University Press, 2004.
- [SS02] Schölkopf, B., and Smola, A. J. Learning with Kernels. MIT Press, 2002.
- [OCW-ML] 6.867 Machine Learning, Fall 2006, MIT OpenCourseWare.
- [STANFD-ML] Andrew Ng, CS229 Machine Learning, Stanford University
- [FundDL] Raúl Ramos, Fundamentos de Deep Learning, Universidad de Antioquia, 2021
Additional resources
- SciPy: scientific, mathematical, and engineering package for Python
- scikit-learn: machine learning add-on for SciPy
- Kaggle: data science competition platform, with many interesting data sets and competitions with prizes
- Coursera Machine Learning Course: one of the first (and still one of the best) machine learning MOOCs, taught by Andrew Ng
- Stanford Statistical Learning Course: an introductory course with a focus on supervised learning, taught by Trevor Hastie and Rob Tibshirani
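The scikit-learn package listed above covers several course topics directly; as a minimal sketch (assuming scikit-learn is installed), the snippet below cross-validates a k-nearest-neighbors classifier on the built-in iris data set, combining topics 1.3 and 3.1:

```python
# 5-fold cross-validation of a k-nearest-neighbors classifier
# on scikit-learn's built-in iris data set.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
clf = KNeighborsClassifier(n_neighbors=5)
scores = cross_val_score(clf, X, y, cv=5)  # one accuracy value per fold
print(scores.mean())  # typically around 0.97 on iris
```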
Course schedule
| Week | Topic | Material | Assignments |
|---|---|---|---|
| Feb 5 | Course presentation 1.1 History of ML | Reading material and resources Linear Algebra and Probability Review (part 1 Linear Algebra, part 2 Probability) [GBC16] Chap 2, Chap 3 | |
| Feb 12 | 1.2 The learning problem | Synchronous Class: Different approaches to solve a classification problem (video, notebook) Reading material and resources [Alp14] Chap 1 (slides) | Practice problems 1 |
| Feb 19 | 1.2 The learning problem | Synchronous Class: The learning problem (video, slides) Reading material and resources [Dei20] Chap 8 (book) [Alp14] Chap 1 (slides) | Practice problems 2 |
| Feb 26 | 1.3 Design and analysis of ML experiments | Asynchronous class - 1. Introduction (video 2m) - 2. Algorithm preference (video 8m) - 3. Factors and response (video 6.5m) - 4. Strategies of experimentation (video 12m) - 5. ML experiments (video 2m) - 6. Cross-validation (video 18.5m) - 7. Bootstrapping (video 15m) - 8. Performance measures (video 9m) - 9. ROC curve (video 24m) - 10. Precision and recall (video 10m) - 11. Interval estimation (video 14m) - 12. Hypothesis testing (video 23.5m) - 13. Error types (video 8.5m) - 14. Statistics (video 2m) - 15. Binomial tests (video 2m) - 16. Normal approximation (video 1m) - 17. Paired t-test (video 3m) - 18. McNemar's test (video 3.5m) - 19. K-fold CV paired t-test (video 3m) - 20. ANOVA (video 13.5m) - 21. Post-hoc tests (video 9m) Reading material and resources [Alp14] Chap 19 (slides) | Practice problems 2.5 |
| Mar 4 | 2.1 Bayesian decision theory | Asynchronous class - 1. Probability and inference (video 8.5m) - 2. Example MLE derivation (video 18m) - 3. Classification (video 4m) - 4. Bayes' rule (video 13m) - 5. Bayes' rule K > 2 (video 1m) - 6. Losses and risks (video 7.5m) - 7. 0/1 loss (video 10m) - 8. Rejection (video 18.5m) - 9. Different losses (video 8.5m) - 10. Discriminant functions (video 9m) - 11. K=2 classes (video 4m) Reading material and resources [Alp14] Chap 3 (slides) | Practice problems 3 |
| Mar 11 | 2.2 Parametric estimation | Asynchronous class - 1. Parametric estimation (video 1m) - 2. Maximum likelihood estimation (video 2m) - 3. Bernoulli/multinomial (video 3m) - 4. Gaussian distribution (video 1.5m) - 5. Bias and variance (video 5.5m) - 6. Bias and variance example (video 31m) Reading material and resources [Alp14] Chap 4, 5 (slides) Bias and variance (Jupyter notebook) | Practice problems 4 |
| Mar 18 | 3.2 Kernel methods | Asynchronous class - 1. Introduction (video 7.5m) - 2. Input and feature space (video 9m) - 3. The kernel trick (video 12m) - 4. Kernel induced feature space (video 8m) - 5. Kernel approach to ML (video 6.5m) - 6. Kernel functions (video 12m) - 7. Embeddings and kernels (video 9.5m) - 8. Visualizing kernel induced spaces (video 16.5m) - 9. Kernel algorithms (video 1m) - 10. Example: linear regression (video 9.5m) - 11. Solution using optimization (video 4.5m) - 12. Linear regression as a kernel method (video 7m) - 13. Dual linear regression (video 8m) - 14. Solution of dual linear regression (video 10m) - 15. Coding primal linear regression (video 8m) - 16. Regularization (video 9m) - 17. Ridge regression (video 5m) - 18. Ridge regression as a kernel method (video 8m) - 19. Different kernel functions (video 19.5m) Reading material and resources Introduction to kernel methods (slides) [Alp14] Chap 13 (slides) [SC04] Chap 1 and 2 | Practice problems 5 Assignment 1 |
| Apr 8 | 3.3 Support vector machines | Asynchronous class (video 1) (video 2) Reading material and resources [Alp14] Chap 13 (slides) An introduction to ML (Lecture 4, pp 146), Smola Support Vector Machine Tutorial, Weston Support vector machines and model selection (Jupyter notebook) | Practice problems 6 |
| Apr 15 | 4.1 Neural networks | Asynchronous class - 1. History: the perceptron (video 3m) - 2. History: artificial neuron (video 4m) - 3. History: NN criticism (video 3.5m) - 4. History: backpropagation (video 6.5m) - 5. History: SVMs and kernels (video 11.5m) - 6. History: beginning of deep learning (video 3m) - 7. History: UNNeuro (video 3m) - 8. Practical example (video 28.5m) - 9. Backpropagation derivation (video 45m) Reading material and resources Neural networks, Representation Learning and Deep Learning (slides) [Alp14] Chap 11 (slides) Quick and dirty introduction to neural networks (Colab notebook) Backpropagation derivation handout | Practice problems 7 Assignment 2 |
| Apr 22 | 4.2 Differentiable programming | Asynchronous class - 1. Deep learning frameworks (video 22.5m) - 2. Tensorflow example (video 35m) - 3. Keras example (video 34.5m) Reading material and resources Deep learning frameworks (slides) Introduction to TensorFlow (Jupyter notebook) Neural Networks in Keras (Jupyter notebook) Neural Networks in PyTorch (Jupyter notebook) | Practice problems 8 |
| May 6 | 4.3 Deep learning | Asynchronous class - 1. Introduction to deep learning (video) - 2. Convolutional neural networks (video) - 3. CNN in Keras example (video) Reading material and resources Representation Learning and Deep Learning (slides) [GBC16] (Chap 9) CNN for image classification in Keras (Jupyter notebook) ConvNetJS demos Feature visualization | Practice problems 9 Project proposal: groups of 3; PDF of at most 2 pages describing the problem, objectives, and method (due 15/05/24) |
| May 20 | 6.1 Deep metric learning 6.2 Self-supervised contrastive learning | Synchronous class Reading material and resources - Class imbalance and Metric Learning slides, Master Year 2 Data Science IP-Paris - Self-supervised Learning slides, EECS498.008/598.008 Deep Learning for Computer Vision, University of Michigan, Winter 2022 | Assignment 3 |
| Jun 3 | 7.1 Variational inference 7.2 Generative models | Asynchronous class Alexander Amini, Deep generative models (slides, video) (from MIT 6.S191) Deep generative models (Jupyter notebook) | |
| Jun 17-24 | 5.1 Kernel density estimation and classification 8 Kernel density matrices | Reading material and resources - González, F. A., Ramos-Pollán, R., & Gallego-Mejia, J. A. Kernel Density Matrices for Probabilistic Deep Learning, arXiv:2305.18204. - An Introduction to Kernel Density Matrices (notebook) - Kernel Density Matrices Library https://github.com/fagonzalezo/kdm | |
| Jul 1 | Final project | | Final project |