PORTALE DELLA DIDATTICA


Optimized execution of neural networks at the edge

02DNMIU

A.A. 2025/26

Course Language

English

Degree programme(s)

Doctorate Research in Ingegneria Informatica E Dei Sistemi - Torino

Course structure
Teaching              Hours
Lectures              15
Classroom exercises   5

Lecturers
Teacher            Status                SSD        h.Les  h.Ex  h.Lab  h.Tut  Years teaching
Burrello Alessio   Ricercatore L240/10   IINF-05/A  15     0     0      0      1

Context
SSD CFU Activities Area context
*** N/A *** 4    
The program of the course will be the following:
- Introduction
- Deep learning basics
- Static optimizations for deep learning: quantization, pruning, NAS, etc.
- Dynamic optimizations for deep learning: big/little, N-width networks, dynamic precision scaling, etc.
- Efficient deployment of deep learning models: compilers and design space exploration tools
- Optimizing deep learning models: practical examples
- Future trends and challenges
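As a small illustration of the static optimizations listed above (not part of the official course material), the sketch below applies post-training dynamic quantization to a toy model in PyTorch, the framework used in the practical session. Supported layers have their weights stored as int8, shrinking the model before deployment; the model and layer sizes here are arbitrary placeholders.

```python
# Minimal sketch: post-training dynamic quantization in PyTorch.
# Weights of supported layers (nn.Linear here) are stored in int8;
# activations are quantized on the fly at inference time.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(64, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)
model.eval()

# Replace nn.Linear modules with their dynamically quantized counterparts.
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 64)
with torch.no_grad():
    y = qmodel(x)
# The interface is unchanged: same input/output shapes, smaller weights.
print(y.shape)
```

Dynamic quantization needs no calibration data, which is why it is often the first static optimization tried before moving to static (calibrated) or quantization-aware schemes.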
Minimum knowledge in the following fields is required to benefit from the course:
- Computer architecture
- Embedded systems
- Embedded software/firmware
- Parallel computing
- Machine learning
- Signal processing
- Data science
The course will focus on artificial neural networks and deep learning. Deep neural networks will be analyzed from a computational standpoint, identifying critical operations in terms of time, memory, and energy. We will then survey the main techniques to optimize the execution of these models.

Specifically, we will first introduce so-called "static" optimizations, i.e., those performed before deploying the model on the target hardware, either at training time or post-training. These include data quantization, weight and activation pruning, knowledge distillation, neural architecture search, and others. Then, we will describe "dynamic" optimizations, which adapt the execution complexity at runtime, based on external conditions (e.g., the battery state of charge) or on the processed data. Lastly, we will discuss the automatic compilation of inference code from a high-level Python representation of a neural network to an optimized binary for a specific hardware target.

Students will have the opportunity to try some of the optimizations seen in class in a practical session, based on the PyTorch deep learning framework, attempting to deploy a complex deep neural network onto a real edge device. The final examination will consist of a presentation by the students showing how one or more of the techniques introduced in the course can be applied to their own research.
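As a taste of the hands-on session described above (an illustrative sketch, not the official lab exercise), the snippet below shows unstructured magnitude pruning with PyTorch's built-in `torch.nn.utils.prune` utilities, zeroing half of a layer's weights; the layer dimensions are placeholders.

```python
# Minimal sketch: L1 (magnitude-based) unstructured pruning in PyTorch.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(32, 16)

# Zero out the 50% of weight entries with the smallest absolute value.
prune.l1_unstructured(layer, name="weight", amount=0.5)

# layer.weight is now the masked tensor: half of its entries are zero.
sparsity = (layer.weight == 0).float().mean().item()
print(f"sparsity: {sparsity:.2f}")

# Make the pruning permanent (removes the mask reparameterization).
prune.remove(layer, "weight")
```

Unstructured pruning alone does not speed up dense kernels; realizing the gains on an edge device requires sparse-aware kernels or structured pruning, which is part of what the course's deployment topics address.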
On site
Oral presentation
P.D.2-2 - April