PORTALE DELLA DIDATTICA

Machine learning for vision and multimedia

01URPOV

A.A. 2024/25

Course Language

English

Degree programme(s)

Master of science-level of the Bologna process in Ingegneria Informatica (Computer Engineering) - Torino

Course structure
Teaching Hours
Lectures 40.5
Laboratory exercises 19.5
Lecturers
Teacher | Status | SSD | h.Les | h.Ex | h.Lab | h.Tut | Years teaching
Lamberti Fabrizio | Full Professor (Professore Ordinario) | IINF-05/A | 19.5 | 0 | 0 | 0 | 6

Context
SSD | CFU | Activities | Area context
ING-INF/05 | 6 | B - Core (Caratterizzanti) | Computer engineering (Ingegneria informatica)
The goal of the course is to provide students with both practical and theoretical content on the application of machine and deep learning techniques to computer vision, multimedia information processing, and three-dimensional graphics. At the beginning of the course, the theoretical fundamentals of deep neural networks will be introduced. Particular attention will be devoted to the analysis of state-of-the-art architectures available for computer vision applications. Through laboratory and project activities, students will gain practical experience with the main programming frameworks as well as with the training and optimization techniques required to implement solutions suited to representative case studies for the various curricula.
The course aims to give students the ability to understand how applications based on machine learning work, with a specific focus on deep neural networks, and to become familiar with the main libraries for developing them. Students will also learn how to analyze, design and evaluate solutions targeted at computer vision and, more generally, at the processing of complex data such as signals, audio, video and point clouds.
Specifically, students will acquire basic knowledge of:
- theoretical fundamentals of artificial neural networks and training algorithms (back-propagation);
- main deep network architectures;
- hardware and software libraries for training neural networks;
- applications of machine and deep learning for computer vision;
- applications of machine and deep learning for the processing and generation of multimedia content.
Students will develop skills concerning:
- the analysis and design of deep neural networks, selecting the most appropriate architecture for the problem at hand;
- the analysis, testing and integration of machine learning components within complex applications;
- the implementation of a deep neural network using state-of-the-art libraries;
- the configuration of the main hyper-parameters and the training of a neural network;
- how to cope with practical problems associated with the development of a machine learning component.
Prerequisites: fundamentals of computer programming, Mathematical analysis I, Linear algebra.
The tentative course program is organized into in-class theoretical lessons and lab exercises, as reported below.
Course introduction
Fundamentals of machine learning (in-class lessons 1 CFU)
- Basics of probability
- Shallow and deep neural networks
- Feed-forward and recurrent networks
- Statistical machine learning (supervised learning, overfitting, regularization)
- Back-propagation and stochastic gradient descent
Methods for the implementation of deep networks (lab exercises 1 CFU)
- Automatic differentiation, programming in Keras and/or TensorFlow
- Training deep networks: hyperparameter tuning, data augmentation, batch normalization, and transfer learning
Applications for computer vision (in-class lessons 1 CFU + lab exercises 0.7 CFU)
- Overview and taxonomy of the main applications
- Convolutional neural networks (definition and main architectures)
- Neural networks for object detection and tracking
- Neural networks for image segmentation (encoder-decoder)
- Human movement analysis: pose estimation and action recognition
Applications for multimedia data processing (in-class lessons 0.7 CFU + lab exercises 0.3 CFU)
- Audio/speech processing: audio identification/fingerprinting and sound classification
- Video analysis and processing: image/video recognition/labeling
Introduction to advanced deep learning techniques (in-class lessons or seminars 0.8 CFU)
- Representation and analysis of 3D data (point clouds and meshes)
- Analysis and synthesis of 3D representations: applications in the field of computer graphics and industry
- Generative models and GANs
The course will encompass 40 hours of in-class lessons and 20 hours of lab exercises. The exercises are intended to introduce the software stack used to implement neural networks and the practical aspects of training them. During lab activities, students will be given problems to solve individually or in pairs, followed by a discussion of the main approaches adopted to reach the solution. These activities are preparatory to the development of an individual or group project, which will contribute to the final grade. Seminars may be organized to tackle advanced topics.
Books (to be selected based on the software tools used):
- I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press
- Aurélien Géron, Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems, O'Reilly
- F. Chollet, Deep Learning with Python, Manning Publications, or
- Eli Stevens, Luca Antiga, and Thomas Viehmann, Deep Learning with PyTorch
Links to further books may be provided at the beginning of or during the course.
Additional material provided:
- slides for in-class lessons, notes for lab exercises, sample exams and other content on the Portale della Didattica;
- recordings of in-class lessons (and of lab exercises, when possible);
- tutorials available on the web.
Lecture slides; Lecture notes; Textbook; Exercises; Lab exercises; Video lectures (current year)
You can take this exam before attending the course
Exam: Written test; Compulsory oral exam; Individual project; Group project;
The exam consists of a written test and a laboratory project, which contribute to the final grade in the proportions of 1/3 and 2/3. The exam is passed if the total mark obtained in the two parts, expressed out of thirty, is at least 18/30. The written test, lasting approximately one and a half hours, is aimed at assessing knowledge of the topics listed in the official course program and will include open-ended questions or short exercises that assess the ability to apply the theoretical concepts. During the written test, students are not allowed to keep or consult books, notes, exercise sheets, formula sheets, or similar material. A calculator may be allowed if it is needed for an exercise. The laboratory project, carried out individually or in groups of at most 3 students, is aimed at applying the notions acquired during lectures and lab exercises to a use case representative of the curricula involved. The project topic will be proposed by the students and approved by the lecturer, and must involve the implementation and training of a deep network. For example, the project may consist of implementing a paper from the scientific literature, creating an original application, or solving a challenge. The integration of publicly available components or code (e.g., pre-trained models) must be agreed in advance with the lecturer. The submitted project must contain the developed code, at least one trained model, and a short report (4-6 pages) illustrating the techniques adopted, the hyper-parameter values, the dataset, the evaluation methodology, and the results obtained. The project must be submitted by the date indicated at the beginning of the course and will be discussed with the lecturers (this discussion constitutes the oral exam).
During the discussion, questions will be asked to ascertain each student's individual contribution to the project, and students may be asked to demonstrate that the submitted code actually works. Up to 2 additional points may be awarded both for the theory test (for rigor of exposition) and for the laboratory project/oral exam (for the use of techniques and tools not presented by the lecturer that are of significant complexity, or for the completeness of the experimental tests carried out). These points make it possible to obtain honors (lode). The results obtained in the two parts and the overall grade are communicated through the Portale della Didattica.
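As a rough illustration of the grading arithmetic described above, the sketch below combines the two marks. It assumes that the 1/3 weight applies to the written test and the 2/3 weight to the project (the order in which the two parts are listed), that both marks are out of 30, and that no rounding is applied. The syllabus does not say how the bonus points enter the total, so they are left out. The function name `final_grade` is hypothetical.

```python
from __future__ import annotations

from fractions import Fraction

# Assumed mapping of weights to parts: written test 1/3, project 2/3.
WRITTEN_WEIGHT = Fraction(1, 3)
PROJECT_WEIGHT = Fraction(2, 3)
PASS_MARK = 18  # minimum total, out of 30


def final_grade(written: int, project: int) -> tuple[float, bool]:
    """Return (weighted total out of 30, passed) for two marks out of 30.

    Bonus points are not modeled, since the text does not specify
    how they are combined with the weighted total.
    """
    total = WRITTEN_WEIGHT * Fraction(written) + PROJECT_WEIGHT * Fraction(project)
    # Compare on the exact fraction to avoid floating-point edge cases at 18.
    return float(total), total >= PASS_MARK


print(final_grade(24, 27))  # weighted total: 24/3 + 2*27/3
```

Under these assumptions the project dominates the result: a weak written test can be offset by a strong project, but not the other way around to the same extent.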
In addition to the message sent by the online system, students with disabilities or Specific Learning Disorders (SLD) are invited to directly inform the professor in charge of the course about the special arrangements for the exam that have been agreed with the Special Needs Unit. The professor has to be informed at least one week before the beginning of the examination session in order to provide students with the most suitable arrangements for each specific type of exam.