PORTALE DELLA DIDATTICA


Optimized execution of neural networks at the edge

02DNMIU

A.A. 2025/26

Course Language

English

Degree programme(s)

Doctorate Research in Ingegneria Informatica E Dei Sistemi - Torino

Course structure
Teaching              Hours
Lectures              15
Classroom exercises   5

Lecturers
Teacher            Status                SSD        h.Les  h.Ex  h.Lab  h.Tut  Years teaching
Burrello Alessio   Ricercatore L240/10   IINF-05/A  15     0     0      0      1

Context
SSD CFU Activities Area context
*** N/A *** 4    
The program of the course will be the following:
- Introduction
- Deep learning basics
- Static optimizations for deep learning: quantization, pruning, NAS, etc.
- Dynamic optimizations for deep learning: big/little, N-width networks, dynamic precision scaling, etc.
- Efficient deployment of deep learning models: compilers and design space exploration tools
- Optimizing deep learning models: practical examples
- Future trends and challenges
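As a small illustration of the static optimizations listed above (not part of the official course material), the sketch below applies post-training dynamic quantization to a toy model in PyTorch, the framework used in the practical session. Supported layers have their weights stored as int8, shrinking the model before deployment; the model and layer sizes here are arbitrary placeholders.

```python
# Minimal sketch: post-training dynamic quantization in PyTorch.
# Weights of supported layers (nn.Linear here) are stored in int8;
# activations are quantized on the fly at inference time.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(64, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)
model.eval()

# Replace nn.Linear modules with their dynamically quantized counterparts.
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 64)
with torch.no_grad():
    y = qmodel(x)
# The interface is unchanged: same input/output shapes, smaller weights.
print(y.shape)
```

Dynamic quantization needs no calibration data, which is why it is often the first static optimization tried before moving to static (calibrated) or quantization-aware schemes.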
Minimum knowledge in the following fields is required to benefit from the course:
- Computer architecture
- Embedded systems
- Embedded software/firmware
- Parallel computing
- Machine learning
- Signal processing
- Data science
The course will focus on artificial neural networks and deep learning. Deep neural networks will be analyzed from a computational standpoint, identifying critical operations in terms of time, memory, and energy. We will then survey the main techniques to optimize the execution of these models.

Specifically, we will first introduce so-called "static" optimizations, i.e., those performed before deploying the model on the target hardware, either at training time or post-training. These include data quantization, weight and activation pruning, knowledge distillation, neural architecture search, and others. Then, we will describe "dynamic" optimizations, which adapt the execution complexity at runtime, based on external conditions (e.g., the battery state of charge) or on the processed data. Lastly, we will discuss the automatic compilation of inference code from a high-level Python representation of a neural network to an optimized binary for a specific hardware target.

Students will have the opportunity to try some of the optimizations seen in class in a practical session, based on the PyTorch deep learning framework, attempting to deploy a complex deep neural network onto a real edge device. The final examination will consist of a presentation by the students showing how one or more of the techniques introduced in the course can be applied to their own research.
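As a taste of the hands-on session described above (an illustrative sketch, not the official lab exercise), the snippet below shows unstructured magnitude pruning with PyTorch's built-in `torch.nn.utils.prune` utilities, zeroing half of a layer's weights; the layer dimensions are placeholders.

```python
# Minimal sketch: L1 (magnitude-based) unstructured pruning in PyTorch.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(32, 16)

# Zero out the 50% of weight entries with the smallest absolute value.
prune.l1_unstructured(layer, name="weight", amount=0.5)

# layer.weight is now the masked tensor: half of its entries are zero.
sparsity = (layer.weight == 0).float().mean().item()
print(f"sparsity: {sparsity:.2f}")

# Make the pruning permanent (removes the mask reparameterization).
prune.remove(layer, "weight")
```

Unstructured pruning alone does not speed up dense kernels; realizing the gains on an edge device requires sparse-aware kernels or structured pruning, which is part of what the course's deployment topics address.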
On site
Oral presentation
P.D.2-2 - April