Optimization for Machine Learning

01SQKMV, 01SQKNG

Academic year 2018/19

Course Language

English

Course degree

Master of science-level of the Bologna process in Biomedical Engineering - Torino
Master of science-level of the Bologna process in Mathematical Engineering - Torino

Course structure
Teaching Hours
Lectures 40
Classroom exercises 20
Teachers
Teacher | Status | SSD | h.Les | h.Ex | h.Lab | h.Tut | Years teaching
Calafiore Giuseppe Carlo | Full Professor | ING-INF/04 | 40 | 10 | 0 | 0 | 1

Context
SSD | CFU | Activities | Area context
ING-INF/04 | 6 | D - Student's choice | Student's choice
Machine Learning (ML) encompasses a variety of methodologies and computational algorithms, mainly grounded in Bayesian statistics, for extracting information, clustering, detecting patterns, making decisions and predictions or, more generally, understanding phenomena from available data. Classical learning models dating back decades, such as Neural Networks, as well as later techniques such as Support Vector Machines (SVM), are experiencing a resurgence in both theory and applications in the present era of Big Data, where the deluge of unstructured information calls for automated and highly efficient methods of data analysis. Contemporary Machine Learning, in turn, is an essential part of Data Science, an interdisciplinary field in which industry demand for experts exceeds supply worldwide. In this course, we present the main tools for supervised learning (regression, regularization, classification) and unsupervised learning (clustering, dimensionality reduction), with a focus on the structure and features of the optimization algorithms needed to solve the learning problems of interest numerically. The course is structured into classroom lectures, in which the context and methodologies are explained, and computer lab sessions, in which the students apply the methodologies to real-world data sets and problems from various fields, such as finance, business analytics, news, biology, and medical diagnosis.
The student will acquire knowledge of the basic tools used in machine learning, and an introductory insight into the workings of the optimization algorithms that form the inner computational “engine” of these tools. The student will gain some experience in visualizing and analyzing labeled and unlabeled high-dimensional data sets and in extracting useful information from them. Complementing a student’s background in statistics, optimization or data mining, this course will help form the skills of a Junior Data Scientist.
A good knowledge of linear algebra, geometry and analysis, and some exposure to probability and statistics, are required. A previous course on numerical computing, optimization, or operations research is recommended but not strictly required.
Introduction to Machine Learning. Supervised and unsupervised learning. Parametric and nonparametric models. Classical examples in pattern analysis (e.g., handwriting recognition). A brief historical perspective.
Review of probability theory and statistics. Marginal and conditional distributions. Bayes' theorem. Prior, likelihood, posterior. Bayesian inference.
Regression problems. Overfitting. Bias-variance tradeoff. Regularized regression. Linear regression with sparsity-inducing penalties. Ridge regression. The Lasso. The Elastic Net. Applications (e.g., in image analysis and in computational finance).
Logistic regression. Sparse logistic regression and applications (e.g., to text categorization).
Algorithms for large-scale regularized regression: first-order methods, proximal methods, the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA), coordinate descent and block-coordinate descent methods.
Classifiers. Neural Networks. Training and the back-propagation algorithm. Maximum-margin classifiers. Dual representation. Kernel methods and the Support Vector Machine (SVM).
Clustering. K-means clustering. Gaussian mixtures and the Expectation-Maximization (EM) algorithm.
Singular value decomposition and Principal Component Analysis (PCA). Interpretability and Sparse PCA. Fast algorithms for Sparse PCA.
Time-series modeling and forecasting.
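As an illustration of the proximal methods listed among the topics above, the following is a minimal sketch of FISTA applied to the Lasso problem, minimizing 0.5 * ||A x - b||^2 + lam * ||x||_1. It is not taken from the course material. The function names, the toy data, and the use of the squared Frobenius norm of A as a conservative bound on the Lipschitz constant are all assumptions made for this example, and A is assumed to be a nonzero matrix stored as a list of rows.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fista_lasso(A, b, lam, n_iter=500):
    """Approximately minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 with FISTA."""
    m, n = len(A), len(A[0])
    # The squared Frobenius norm of A upper-bounds the Lipschitz constant of
    # the gradient of the smooth term, so 1 / L is a safe (conservative) step.
    L = sum(a * a for row in A for a in row)
    x = [0.0] * n   # current iterate
    y = x[:]        # extrapolated point
    t = 1.0         # momentum parameter
    for _ in range(n_iter):
        # Gradient of the smooth term at y: A^T (A y - b).
        residual = [sum(A[i][j] * y[j] for j in range(n)) - b[i] for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Proximal (shrinkage) step on the gradient-descent point.
        x_new = [soft_threshold(y[j] - grad[j] / L, lam / L) for j in range(n)]
        # Momentum update that distinguishes FISTA from plain ISTA.
        t_new = (1.0 + (1.0 + 4.0 * t * t) ** 0.5) / 2.0
        y = [x_new[j] + (t - 1.0) / t_new * (x_new[j] - x[j]) for j in range(n)]
        x, t = x_new, t_new
    return x


# Toy data (hypothetical): with A equal to the identity, the Lasso solution is
# the soft-thresholding of b, which gives a simple sanity check.
A = [[1.0, 0.0], [0.0, 1.0]]
b = [3.0, 0.5]
print(fista_lasso(A, b, lam=1.0))
```

With the identity matrix, the exact minimizer is (2, 0). The second coefficient is set exactly to zero, which is the sparsity-inducing effect of the l1 penalty.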
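K-means clustering, one of the unsupervised learning topics listed above, can be sketched with Lloyd's alternating scheme: assign each point to its nearest center, then move each center to the mean of its assigned points. The code below is an illustrative sketch only, not course material. The function names, the choice to pass the initial centers explicitly, and the toy data are assumptions made for this example.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points]


def kmeans(points, init_centers, n_iter=100):
    """Lloyd's algorithm: alternate assignment and center-update steps."""
    centers = [tuple(c) for c in init_centers]
    for _ in range(n_iter):
        labels = assign(points, centers)
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                # Move the center to the coordinate-wise mean of its members.
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                # Keep the center of an empty cluster unchanged.
                new_centers.append(old)
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers, assign(points, centers)


# Toy data (hypothetical): two well-separated groups in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers, labels = kmeans(points, init_centers=[(0.0, 0.0), (10.0, 10.0)])
print(centers, labels)
```

The result depends on the initial centers, which is why practical implementations usually run several random initializations and keep the best one.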
The course is organized in a series of lectures (about 1/3 of the course) and computer lab exercises and practice sessions (about 2/3 of the course).
Course slides, handouts, and lab practice sheets will be made available to the students via the PoliTo Web portal. Useful reference textbooks:
C.M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
T. Hastie, R. Tibshirani and J. Friedman, The Elements of Statistical Learning, Springer, 2009.
G.C. Calafiore and L. El Ghaoui, Optimization Models, Cambridge Univ. Press, 2014.
Exam: written test; group project;
The final exam consists of a written test in the form of a multiple-choice questionnaire, containing a mixture of methodological questions and numerical exercises (to be solved with pen and paper; use of a calculator is allowed). For students who fully participate in the lab sessions, an optional alternative form of final exam is offered, consisting of a written report on an applied project assigned by the instructor. In this case, the students can work on the project in groups of at most five people; a reasonable time will be allowed for completing the report (typically one week), and each group is required to present its results publicly in front of the class. The evaluation and corresponding score will be based on the results of the lab assignments and on factors including the technical correctness of the report, the organization of the presentation, and autonomy in developing the report. There is no oral exam other than the oral presentation of the report.


© Politecnico di Torino
Corso Duca degli Abruzzi, 24 - 10129 Torino, ITALY