Efficient Computing for Artificial Intelligence

02VRVSM, 02VRVWS

A.A. 2025/26

Course Language

English

Degree programme(s)

Master of science-level of the Bologna process in Data Science And Engineering - Torino

Borrow

01VRVOV

Course structure

Teaching	Hours
Lectures	36
Classroom exercises	11
Laboratory exercises	33

Lecturers

Teacher	Status	SSD	h.Les	h.Ex	h.Lab	h.Tut	Years teaching
Jahier Pagliari Daniele	Associate Professor	IINF-05/A	18	5	0	0	1

Co-lecturers

Teacher	Status	SSD	h.Les	h.Ex	h.Lab
Calimera Andrea	Full Professor	IINF-05/A	9	0	0
Patti Edoardo	Associate Professor	IINF-05/A	9	0	0
Peluso Valentino	Researcher (L. 240/10)	IINF-05/A	0	6	33

Context

SSD	CFU	Activities	Area context
ING-INF/05	8	B - Characterizing	Computer engineering

Academic year of first validity

2025/26

Course description

The course introduces the challenges of deploying and efficiently executing artificial intelligence (AI) applications, especially those based on deep neural networks (DNNs). It discusses how the steps of an AI application's lifecycle, from data collection to model training and inference, can be distributed among edge and cloud devices. The focus is on how to execute the inference phase efficiently on hardware other than the high-performance servers available in the cloud, such as mobile or embedded Internet of Things (IoT) devices. This encompasses three main objectives:

1) Understanding the complex ecosystem behind AI applications deployed in the real world, and the characteristics of the compute devices available at the edge and in the cloud. Understanding the challenges arising from the use of portable, low-power devices as (i) the main gateway to collect, pre-process, and exchange raw data, and as (ii) the main hosting node for the execution of DNN models.

2) Introducing techniques (such as quantization, pruning, neural architecture search, etc.) to train and optimize efficient DNNs, carefully considering their memory occupation, execution latency, and energy consumption (or CO2 emissions). Describing how efficient inference engines optimize the execution of DNN workloads. All the techniques presented are currently applied in real-world AI deployments at any scale, from tiny models executing at the edge to LLMs in the cloud.

3) Introducing the software stacks, communication protocols, and data formats employed in distributed platforms for AI-based applications, to let multiple inference nodes communicate with each other and/or with a private server.
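As an illustration of why quantization reduces memory occupation, the following is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. The function names, the sample weights, and the choice of a symmetric scheme are assumptions made for this example only; they are not taken from the course material, and real frameworks offer several other schemes.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of float weights to int8 codes."""
    max_abs = max(abs(w) for w in weights)
    # The scale maps the largest magnitude onto 127; guard against all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from the int8 codes."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only to keep the arithmetic easy to follow.
weights = [0.50, -1.27, 0.03, 0.90]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Storage drops from 4 bytes (float32) to 1 byte (int8) per weight.
float32_bytes = 4 * len(weights)
int8_bytes = 1 * len(weights)

# The price is a rounding error of at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The sketch shows the basic trade-off: a fourfold reduction in storage in exchange for a bounded rounding error on each weight.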

Expected Learning Outcomes

The course contents, and thus the skills acquired, cover both the hardware aspects of the problem (architectures of "edge" devices) and the software aspects (programming models, protocols, and related APIs). These skills allow a correct understanding of complex AI deployments, in which data is processed not only on servers but also, partially or entirely, on local devices with limited computational resources and energy budget. At the end of the course, each student will be able to train, optimize, and deploy an AI system on a low-power commercial board, and to implement the data-exchange protocols needed to communicate raw data and distilled information among distributed end-nodes.

Specific Learning Outcomes:
- Understanding of the topics covered and specifically of the hardware and software technologies involved in real-world deployments of AI applications, with a focus on the IoT domain.
- Understanding and use of AI model design and optimization tools, with particular attention to non-functional metrics (such as latency, memory, energy, and power consumed) and to DNN models.
- Being able to recognize and use adequate programming tools and communication protocols to share and distribute data across connected devices.
- Being able to design, integrate, and evaluate the main components of a distributed AI system using the appropriate programming tools.

Prerequisites

- Theory and basic concepts of machine learning and deep learning
- Software programming theories and tools
- Basics of object-oriented programming
- Basic concepts of computer architectures and networks

Course topics

The course topics are organized in three main parts:

1. HW and SW technologies adopted in a typical AI-based application; computer architectures used for running and developing machine learning algorithms at the edge and in the cloud; basic knowledge about sensors as data sources; modeling of non-functional metrics (performance, memory occupation, energy consumption).
2. Efficient AI: resource-driven model optimization of deep neural networks (quantization, pruning, NAS, etc.); industrial frameworks and engines for efficient model training, optimization, and deployment.
3. Data exchange: distributed software platforms for edge computing; management of edge-fog-cloud interfaces (web and network programming of IoT protocols: REST request/response and publish/subscribe with MQTT); microservices design patterns; cloud/edge workload balancing.

Lab practices will touch upon the above topics, teaching students to implement their own optimization and communication applications, using as benchmarks real-life use cases where IoT data is sampled, pre-processed, distilled, and communicated to a dedicated server node.
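The publish/subscribe pattern mentioned in part 3 can be sketched with a toy in-process broker. This is only an illustration of the pattern under simplifying assumptions: the `Broker` class and the topic names are hypothetical, no network is involved, and it is not an MQTT client (a real MQTT broker runs as a separate network service and adds features such as quality-of-service levels).

```python
from collections import defaultdict


class Broker:
    """Toy in-process broker: routes each message to the subscribers of its topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback to be invoked for every message on the topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        """Deliver the payload to all subscribers of the topic; return how many got it."""
        # Publishers never address subscribers directly; the topic decouples them.
        callbacks = self._subscribers[topic]
        for callback in callbacks:
            callback(topic, payload)
        return len(callbacks)


broker = Broker()
received = []

# A hypothetical server node interested in temperature readings only.
broker.subscribe("lab/temperature", lambda topic, payload: received.append((topic, payload)))

# A hypothetical edge node publishing two kinds of readings.
delivered = broker.publish("lab/temperature", {"celsius": 21.5})
ignored = broker.publish("lab/humidity", {"percent": 40})
```

Unlike a REST request/response exchange, the publisher here does not wait for a reply and does not need to know who, if anyone, is listening.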

Sustainable development goals

Ensure sustainable consumption and production patterns

Make cities and human settlements inclusive, safe, resilient and sustainable

Promote action, at all levels, to combat climate change

Additional information

The first part of this course (parts 1 and 2) is also offered standalone as a 6 CFU course to Computer Engineering students (code: 01VRVOV).

Course structure

The structure of the course reflects the organization of the main topics, with three main teaching blocks organized as follows:
Part 1 [12h] - Deployment of AI-based Applications (definitions, architectures, challenges, and technologies);
Part 2 [24h] - Efficient AI (commercial training, optimization and inference frameworks, resource-driven optimization techniques for deep neural networks);
Part 3 [12h] - Communication protocols and their implementation (how to send/receive data from/to edge/remote servers).
Within each block, there are lab sessions [30h] during which students (groups of 2-3 people) can practice with a real software implementation of the "theoretic" concepts and strategies introduced during the regular classes.

The structure of the course reflects the organization of the main topics, with three main teaching blocks organized as follows:
Part 1 [15h] - Deployment of AI-based Applications (definitions, architectures, challenges, and technologies);
Part 2 [25h] - Efficient AI (commercial training, optimization and inference frameworks, resource-driven optimization techniques for deep neural networks);
Part 3 [10h] - Communication protocols and their implementation (how to send/receive data from/to edge/remote servers).
Within each block, there are lab sessions [30h] during which students (groups of 2-3 people) can practice with a real software implementation of the "theoretic" concepts and strategies introduced during the regular classes.

Reading materials

Class handouts and additional material will be made available on the course webpage. User guides and tutorials for the lab sessions will be made available as well, including code templates and all the required tools and libraries.

Study materials

Lecture slides; Lab exercises;

Assessment and grading criteria

Exam: Group project; Computer-based written test in class using POLITO platform;

The exam consists of two mandatory parts:
1. the evaluation of three (3) group projects assigned during the course, one assignment for each of the three main parts of the course; the maximum score for each delivered project is 6 points (total: 18 points).
2. a written test on the theoretical aspects introduced during the course, including numerical exercises and open-ended questions. The time allowed for the test is 1 hour, closed book; the maximum score for this part is 14 points.
The final score is the sum of the scores obtained in the two parts. A minimum of 6 points on the written test is required to pass. A score of 30 cum laude is awarded to students obtaining more than 31 total points (written test + labs).
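The scoring rules above can be expressed as a short calculation. The sketch below is only an illustration: the function name and the returned dictionary are assumptions, and it encodes only the rules stated in the text (it does not assume any overall pass mark beyond the written-test minimum, nor any rounding policy).

```python
def exam_result(project_scores, written_score):
    """Combine project and written-test points following the stated rules."""
    if len(project_scores) != 3:
        raise ValueError("expected exactly three project scores")
    if any(not 0 <= score <= 6 for score in project_scores):
        raise ValueError("each project is worth between 0 and 6 points")
    if not 0 <= written_score <= 14:
        raise ValueError("the written test is worth between 0 and 14 points")

    total = sum(project_scores) + written_score
    return {
        "total": total,
        # A minimum of 6 points on the written test is required.
        "written_minimum_met": written_score >= 6,
        # Cum laude requires more than 31 total points.
        "cum_laude": total > 31,
    }


typical = exam_result([6, 5, 6], 12)
top = exam_result([6, 6, 6], 14)
weak_written = exam_result([6, 6, 6], 5)
```

With a maximum of 18 + 14 = 32 points, the cum laude threshold leaves very little margin: a student must obtain nearly full marks in both parts.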

In addition to the message sent by the online system, students with disabilities or Specific Learning Disorders (SLD) are invited to directly inform the professor in charge of the course about the special arrangements for the exam that have been agreed with the Special Needs Unit. The professor has to be informed at least one week before the beginning of the examination session in order to provide students with the most suitable arrangements for each specific type of exam.