Computational linear algebra for large scale problems
02TWYNG, 02TWYSM
A.A. 2024/25
Course Language
English
Degree programme(s)
Master of science-level of the Bologna process in Ingegneria Matematica - Torino Master of science-level of the Bologna process in Data Science And Engineering - Torino
This course aims at presenting the mathematical and numerical foundations of several methods applied in Data Science. The analysis of large-scale data sets requires specific algebraic tools in order to extract the most relevant information from the data. This problem is tackled by applying several mathematical tools; this course is designed to present them and to explain the pros and cons of their application to realistic data sets.
The extraction of information from large datasets requires the use of specific methods and algorithms. This course presents the mathematical and numerical foundations of several methods applied in Data Science. The analysis of large data sets requires specific algebraic tools to manipulate the data and extract the most relevant information from them. This course is designed to present these tools and to explain the pros and cons of their application to realistic data sets.
Learn and understand the linear algebra tools most commonly used to extract information from large datasets: projection operators, matrix decompositions, eigenvalues and eigenvectors, the singular value decomposition (SVD) and its relations with Principal Component Analysis.
The student will learn and understand the most commonly used linear algebra tools for extracting information from large datasets: projection operators, matrix decompositions, eigenvalues and eigenvectors, the SVD and its relations with Principal Component Analysis and clustering methods.
The student will practice with Matlab and Python packages, applying the most commonly used linear algebra tools to manipulate data, reduce their dimensionality and extract information from large datasets (a minimal SVD/PCA sketch is given below).
The student will acquire the capability to choose the most appropriate tools depending on the properties of the data and the target of the data analysis.
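As an informal illustration of these tools (a minimal sketch, not part of the official course material; the data and variable names are invented for the example), a principal component analysis of a small data matrix can be obtained from its SVD with NumPy as follows:

import numpy as np

# Illustrative sketch: PCA of a small data matrix via the SVD.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5)) @ rng.standard_normal((5, 5))  # 100 samples, 5 features

Xc = X - X.mean(axis=0)                            # center the data
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)  # thin SVD of the centered data

explained_variance = s**2 / (X.shape[0] - 1)       # variances along the principal directions
scores = Xc @ Vt[:2].T                             # projection onto the first two principal components
print(explained_variance)
print(scores.shape)                                # (100, 2)

The same computation can be reproduced in Matlab/Octave with the built-in svd function.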
Basic knowledge of linear algebra and calculus is a prerequisite, as well as basic coding ability and computer literacy.
1. Basic linear algebra tools: vector spaces, bases, linear operators, matrices, eigenvalues and eigenvectors, norms.
2. Dense and sparse matrices, matrix operations.
3. Vector rotations, orthogonalization, projections: Gram-Schmidt, Givens, Householder methods and QR factorization (a Gram-Schmidt sketch is given after this list).
4. Iterative solutions of large scale linear systems: applicability, convergence, computational cost and memory requirements, preconditioning.
5. Approximation of data and functions: global and piecewise interpolation, least-squares approximation, numerical tools.
6. Eigenvalues and eigenvectors computations: numerical methods and common tools for large scale matrices. Stability and conditioning.
7. Computation and theoretical properties of Singular Value Decomposition (SVD).
8. Generalized inverse matrix and Moore–Penrose inverse.
9. Dimensional reduction of a problem and Principal Component Analysis (PCA).
10. Common numerical libraries: LAPACK, ARPACK, Matlab/Octave, C/C++ (Armadillo, PETSc and SLEPc) and Python.
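As a minimal sketch of the orthogonalization methods of item 3 (illustrative only, not taken from the course material; the function name mgs_qr is invented), a QR factorization by modified Gram-Schmidt can be written in NumPy and checked against the library routine:

import numpy as np

def mgs_qr(A):
    # QR factorization by modified Gram-Schmidt (illustrative sketch).
    A = np.array(A, dtype=float)
    n = A.shape[1]
    Q = A.copy()
    R = np.zeros((n, n))
    for k in range(n):
        R[k, k] = np.linalg.norm(Q[:, k])
        Q[:, k] /= R[k, k]
        # Remove the component along q_k from the remaining columns.
        R[k, k+1:] = Q[:, k] @ Q[:, k+1:]
        Q[:, k+1:] -= np.outer(Q[:, k], R[k, k+1:])
    return Q, R

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))
Q, R = mgs_qr(A)
print(np.allclose(Q @ R, A))             # the factorization reproduces A
print(np.allclose(Q.T @ Q, np.eye(4)))   # the columns of Q are orthonormal
Q_ref, R_ref = np.linalg.qr(A)           # library routine for comparison (column signs may differ)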
1. Basic linear algebra tools: vector spaces, bases, linear operators, matrices, eigenvalues and eigenvectors, norms.
2. Dense and sparse matrices, matrix operations.
3. Vector rotations, orthogonalization, projections: Gram-Schmidt, Givens, Householder methods and QR factorization. Reorthogonalization.
4. Iterative solutions of large scale linear systems: applicability, convergence, computational cost and memory requirements, preconditioning.
5. Approximation of data and functions: global and piecewise interpolation, least-squares approximation, numerical tools.
6. Eigenvalues and eigenvectors computations: numerical methods and common tools for large scale matrices. Stability and conditioning.
7. Computation and theoretical properties of Singular Value Decomposition (SVD).
8. Generalized inverse matrix and Moore–Penrose inverse.
9. Dimensional reduction of a problem and Principal Component Analysis (PCA).
10. Randomized SVD and the Johnson-Lindenstrauss theorem (a sketch is given after this list).
11. Spectral Clustering.
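A possible NumPy sketch of the randomized SVD of item 10 is reported below (illustrative only, not part of the course material; the function name randomized_svd and the test matrix are invented). It follows the standard random-projection scheme: sample the range of A with a Gaussian test matrix, orthonormalize, and take the SVD of the resulting small matrix.

import numpy as np

def randomized_svd(A, k, oversample=10, seed=0):
    # Rank-k truncated SVD via Gaussian random projection (illustrative sketch).
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    Omega = rng.standard_normal((n, k + oversample))   # random test matrix
    Q, _ = np.linalg.qr(A @ Omega)                     # orthonormal basis for the sampled range of A
    B = Q.T @ A                                        # small (k + oversample) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Q @ Ub[:, :k], s[:k], Vt[:k]

rng = np.random.default_rng(2)
# Rank-5 test matrix of size 500 x 200.
A = rng.standard_normal((500, 5)) @ np.diag([10.0, 5.0, 2.0, 1.0, 0.5]) @ rng.standard_normal((5, 200))
U, s, Vt = randomized_svd(A, k=5)
s_exact = np.linalg.svd(A, compute_uv=False)[:5]
print(np.allclose(s, s_exact))   # the leading singular values are recovered (A has exact rank 5)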
The course is organized in theoretical lectures and practice classes. Theoretical lectures are devoted to the presentation of the topics, with definitions, properties, and introductory examples. The practice classes are devoted to training the students’ abilities to solve problems and exercises and to perform computations and simulations with common tools.
Slides presented during the lectures will be made available through the Portale della Didattica.
Other material will be suggested in class and, if possible, made available through the Portale della Didattica.
Suggested textbooks:
• Linear Algebra and Learning from Data, G. Strang, Wellesley-Cambridge Press, 2019, ISBN: 9780692196380
• Iterative Methods for Sparse Linear Systems, Y. Saad, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2003, ISBN: 0898715342
Lecture slides; Lecture notes; Lab exercises;
Exam: Compulsory oral exam; Group essay;
Exam: compulsory oral exam;
Exam: Oral exam with a discussion of the homework assignments given during the course.
Three homework assignments (HW1, HW2, HW3) will be given to the students during the course:
• HW1 and HW2 consist of exercises aimed at evaluating the students’ ability to use the methods presented;
• HW3 is an application of the methods learned to a problem chosen by the student.
The oral exam will then consist of two parts:
a) a discussion of the submitted HW1, HW2, and HW3 reports, aimed at testing the depth of the students’ understanding of the subjects and their ability to explain, defend, reflect, critically evaluate, and possibly improve their work, proving the real acquisition of the abilities listed in the expected learning outcomes section.
b) a presentation of a topic studied in the course, covering both theoretical aspects and possibly their implementation and applications, proving the real acquisition of the knowledge listed in the expected learning outcomes section.
Grading:
• The maximum grade for HW1, HW2, and HW3, upon the discussion detailed at point (a) above, is 14 points.
• The maximum grade for part (b) of the oral test is 18 points.
The final course grade is then obtained by summing up the final grades of parts (a) and (b) of the oral test.
In addition to the message sent by the online system, students with disabilities or Specific Learning Disorders (SLD) are invited to directly inform the professor in charge of the course about the special arrangements for the exam that have been agreed with the Special Needs Unit. The professor has to be informed at least one week before the beginning of the examination session in order to provide students with the most suitable arrangements for each specific type of exam.
Exam: Compulsory oral exam; Group essay;
Exam: compulsory oral exam;
Exam: Oral exam with a discussion of the homework assignments given during the course.
Two homework assignments (HW1, HW2) will be given to the students during the course:
• HW1 and HW2 consist of exercises aimed at evaluating the students’ ability to use the methods presented.
An optional third homework assignment (HW3) consists of a theoretical or implementation-oriented in-depth study of one of the topics of the course.
The oral exam will then consist of two parts:
a) a discussion of the submitted HW1 and HW2 reports, aimed at testing the depth of the student’s understanding of the subjects and their ability to explain, defend, reflect, critically evaluate, and possibly improve their work, proving the acquisition of the skills listed in the expected learning outcomes section.
b) a presentation of HW3 or of a topic chosen by the student among those considered in the course, on a subject different from the ones covered by HW1 and HW2, covering theoretical aspects and possibly their implementation and applications, proving the real acquisition of the knowledge listed in the expected learning outcomes section.
Grading:
• The maximum grade for HW1 and HW2, upon the discussion detailed at point (a) above, is 14 points.
• The maximum grade for part (b) of the oral test is 18 points.
The final course grade is then obtained by summing up the final grades of parts (a) and (b) of the oral test.
In addition to the message sent by the online system, students with disabilities or Specific Learning Disorders (SLD) are invited to directly inform the professor in charge of the course about the special arrangements for the exam that have been agreed with the Special Needs Unit. The professor has to be informed at least one week before the beginning of the examination session in order to provide students with the most suitable arrangements for each specific type of exam.