## Course description

### Instructor

Fabio A. González

*Maestría en Ingeniería de Sistemas y Computación*

*Universidad Nacional de Colombia*

### Course goal

The main goal of Machine Learning (ML) is the development of systems that are able to autonomously change their behavior based on experience. ML offers some of the most effective techniques for knowledge discovery in large data sets, and it has played a fundamental role in areas such as bioinformatics, information retrieval, business intelligence and autonomous vehicle development.

The main goal of this course is to study the computational, mathematical and statistical foundations of ML, which are essential for the theoretical analysis of existing learning algorithms, the development of new algorithms and the well-founded application of ML to solve real-world problems.

## Course topics

1 **Introduction**

2 **Generalization**

2.1 Bayesian decision theory

2.2 Estimation

2.3 Linear models

2.4 Performance evaluation

3 **Perception and representation**

3.1 Feature extraction and selection

3.2 Kernel methods

3.3 Representation learning

4 **Learning**

4.1 Support vector learning

4.2 Random forest learning

4.3 Neural network learning

5 **Discovering**

5.1 Mixture densities

5.2 Latent topic models

5.3 Matrix factorization

6 **Implementing**

6.1 Experimental design

6.2 Large scale machine learning

## Evaluation and grading policy

- Assignments 40%
- Exams 30%
- Presentation 15%
- Final project 15%

## Course resources

### References

- [Alp10] Alpaydin, E. Introduction to Machine Learning, 2nd Ed. The MIT Press, 2010.
- [Mur12] Murphy, K. P. Machine Learning: A Probabilistic Perspective. The MIT Press, 2012.
- [Bar13] Barber, D. Bayesian Reasoning and Machine Learning. Cambridge University Press, 2013.
- [Bis06] Bishop, C. Pattern Recognition and Machine Learning. Springer-Verlag, 2006.
- [HTF09] Hastie, T., Tibshirani, R., and Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2009.
- [Mit97] Mitchell, T. M. Machine Learning, 1st Ed. McGraw-Hill Higher Education, 1997.
- [DHS00] Duda, R. O., Hart, P. E., and Stork, D. G. Pattern Classification, 2nd Ed. Wiley-Interscience, 2000.
- [SC04] Shawe-Taylor, J. and Cristianini, N. Kernel Methods for Pattern Analysis. Cambridge University Press, 2004.
- [TSK05] Tan, P.-N., Steinbach, M., and Kumar, V. Introduction to Data Mining. Addison-Wesley, 2005.
- [CST00] Cristianini, N. and Shawe-Taylor, J. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, 2000.
- [SS02] Schölkopf, B. and Smola, A. J. Learning with Kernels. MIT Press, 2002.
- [Bak07] Bakir, G. (Ed.). Predicting Structured Data. MIT Press, 2007.
- [OCW-ML] 6.867 Machine Learning, Fall 2006, MIT OpenCourseWare.
- [STANFD-ML] Andrew Ng, CS229 Machine Learning, Stanford University.

### Additional resources

- SciPy: scientific, mathematical, and engineering package for Python
- scikit-learn: machine learning add-on for SciPy
- Kaggle: data science competition platform, with many interesting data sets and competitions with prizes
- Coursera Machine Learning Course: one of the first (and still one of the best) machine learning MOOCs, taught by Andrew Ng
- Stanford Statistical Learning Course: an introductory course with a focus on supervised learning, taught by Trevor Hastie and Rob Tibshirani
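As a first taste of the tools listed above, the following is a minimal scikit-learn sketch (assuming scikit-learn is installed; the data set and model choice are illustrative, not part of the course material). It trains a linear support vector classifier — in the spirit of topic 4.1 — on the classic iris data set and estimates accuracy on a held-out split:

```python
# Minimal scikit-learn example: train a linear SVM on the iris data set
# and estimate its accuracy on a held-out test split.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load features X and class labels y
X, y = load_iris(return_X_y=True)

# Reserve 30% of the data for testing (fixed seed for reproducibility)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Fit a support vector classifier with a linear kernel (topic 4.1)
clf = SVC(kernel="linear")
clf.fit(X_train, y_train)

# Evaluate on the held-out split (topic 2.4, performance evaluation)
acc = accuracy_score(y_test, clf.predict(X_test))
print(f"test accuracy: {acc:.2f}")
```

The same fit/predict/score pattern carries over to the other estimators in scikit-learn (random forests, mixture models, matrix factorization), which is what makes it convenient for the assignments and notebooks.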

## Course schedule

Week | Topic | Material | Assignments |
---|---|---|---|
Aug 1 | 1. Introduction | Brief Introduction to ML (slides); Jeremy Howard: The wonderful and terrifying implications of computers that can learn; Hastie and Tibshirani: Statistical Learning Introduction | Assignment 1 |
Aug 8 | 2.1 Bayesian decision theory | [Alp10] Chap 3 (slides) | |
Aug 15 | 2.2 Estimation | [Alp10] Chap 4 (slides); Bias and variance (IPython notebook) | Assignment 2 |
Aug 22 | 2.3 Linear models | [Alp10] Chap 10 (slides) | |
Aug 29 | 3.2 Kernel methods | Introduction to kernel methods (slides); [Alp10] Chap 13 (slides) | |
Sep 5 | 4.1 Support vector learning | [Alp10] Chap 13 (slides); An introduction to ML, Smola; Support Vector Machine Tutorial, Weston | Assignment 3 |
Sep 12 | 3.1 Feature extraction and selection | Feature Engineering, Léon Bottou (slides); [Alp10] Chap 6 (slides) | |
Sep 19 | 4.3 Neural network learning | [Alp10] Chap 11 (slides); Quick and dirty introduction to neural networks (IPython notebook) | |
Sep 26 – Oct 10 | 3.3 Representation learning | Deep Learning, Andrew Ng (slides); Representation learning for histopathology image analysis, Arévalo and González (slides); Deep Learning Tutorial, Yann LeCun (slides); How we're teaching computers to understand pictures, Li Fei-Fei (slides); Representation Learning and Deep Learning Tutorial | Assignment 4; additional samples for question 1 |
Oct 17 | 4.2 Random forest learning | [HTF09] Chap 15 (book); Random Forest and Boosting, Trevor Hastie (slides); Trees and Random Forest, Markus Kalisch (slides1, slides2) | |
Oct 24 | 5.1 Mixture densities | [Alp10] Chap 7 (slides) | |
Oct 31 | 5.2 Latent topic models; 5.3 Matrix factorization | Latent Semantic Analysis, CS158 Pomona College (slides); Latent Semantic Variable Models, Thomas Hofmann (videolecture); Non-negative Matrix Factorization for Multimodal Image Retrieval, Fabio González (slides) | |
Nov 7 | 5.3 Matrix factorization; 6.2 Large scale machine learning | Two-way Multimodal Online Matrix Factorization, Jorge Vanegas (slides); Online Kernel Matrix Factorization, Esteban Paez (slides) | |
Nov 14 | 6.1 Experimental design | [Alp10] Chap 19 (slides) | |