


Politecnico di Torino  
Academic Year 2017/18  
01RLONG Data spaces/Statistical models 

Master of science-level of the Bologna process in Mathematical Engineering - Torino





Exclusions: 01RLP
Subject fundamentals
Advanced methods of Statistics and mathematical foundations of Machine Learning will be presented, with a unifying view of the relationship between the two subjects.

Expected learning outcomes
The contents of the course are based on the frequentist and Bayesian ideas for data analysis and interpretation, with the addition of a few new ideas coming from Machine Learning and its mathematical foundations.
The student will learn to use specialised software critically (R, SAS, BUGS, STAN, MATLAB, ORANGE, RapidMiner) and will be able to assess its potential and limitations.
Prerequisites / Assumed knowledge
A good knowledge of Mathematical Analysis and Linear Algebra, plus basic coursework worth around 12 credits in elementary Probability and Mathematical Statistics.

Contents
Linear models (regression, ANOVA) with quantitative and qualitative predictors, transformations, model choice.
Simultaneous inference in linear models (multiple comparison, ranking and testing, Tukey, Scheffé).
Generalities on data representation (topologies, metrics, dissimilarities, linear spaces).
Mathematical foundations of machine learning (SVM and the like).
Supervised and unsupervised statistical learning.
Machine learning methods for classification, including tree-based methods.
Generalized linear models (e.g. logistic, Poisson and Cox regression).
Linear and nonlinear mixed effects models.
Bayesian networks.
Time series and some spatial statistics (with elements of kriging).
Delivery modes
Traditional lectures will alternate with computer sessions. The student will bring his/her own computer and install the necessary software. Some support is available in case of need.

Texts, readings, handouts and other learning resources
Most supporting material and software will be published on the webpage of the course.
In addition, for the Statistics part, the following textbooks will be useful:
The BUGS Book: A Practical Introduction to Bayesian Analysis, by David Lunn, Chris Jackson, Nicky Best, Andrew Thomas, David Spiegelhalter. Chapman & Hall.
Categorical Data Analysis, by Alan Agresti. Wiley.
Statistical Analysis of Designed Experiments, by Ajit C. Tamhane. Wiley.
For the Machine Learning part:
An Introduction to Statistical Learning with Applications in R, by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani. Springer. Freely available at http://wwwbcf.usc.edu/~gareth/ISL/
Assessment and grading criteria
At the end of the course, a list of case studies will be provided, together with explanations, references and software. For each student, two case studies in Statistics and one case study in Machine Learning will be sampled and discussed during the oral exam.
The student will comment on the specific aspects of the case study and on its methodological foundations. 
Notes
This course is mainly designed for Master's students in Mathematical Engineering, but a reduced version, simply called Data Spaces, is available for students in other Engineering Master's programmes, notably Computer Science.
