PORTALE DELLA DIDATTICA

Hardware/Software codesign of flexible computing systems for edge AI (invited course)

01TKLIU

Academic year 2024/25

Course Language

English

Degree programme(s)

Doctorate Research in Ingegneria Informatica E Dei Sistemi - Torino

Course structure
Teaching | Hours
Lectures | 15

Lecturers
Teacher | Status | SSD | h.Les | h.Ex | h.Lab | h.Tut | Years teaching
Jahier Pagliari Daniele | Associate Professor | IINF-05/A | 2 | 0 | 0 | 0 | 1
Co-lectures

Context
SSD CFU Activities Area context
*** N/A ***    
The course aims to provide methodologies for hardware/software co-design solutions that enable the execution of AI models on area- and power-constrained edge devices. In the first part, following an introduction to quantization strategies for reducing the footprint of AI models to make them deployable on resource-limited devices, the course will focus on the open-source and extensible RISC-V Instruction Set Architecture (ISA) as a method for creating energy-efficient, domain-specialized, yet software-programmable edge AI processors.

In the core section, leveraging the massive parallelism inherent in AI kernel operations, as well as their tolerance to low-precision integer arithmetic, the course will focus on parallel optimization strategies to execute low-precision integer AI kernels on multi-core, low-power edge platforms with high computational efficiency. To address the limitations of fetch-execute architectures highlighted by Amdahl’s Law, the course will introduce the concept of tightly-coupled accelerators as cooperative coprocessors, designed to enhance specific sections of AI kernels and further reduce end-to-end execution latency. Each lecture on hardware blocks will be complemented with the introduction of related computing paradigms and practical programming examples.

In the final section, the course will explore the impact of data movement on the overall execution performance of AI models and introduce tiling strategies to mitigate the "memory wall" effect. To conclude, all concepts will be integrated into an end-to-end execution of a DNN model on a multi-core edge device. The course will close with a discussion of emerging computing paradigms for AI, such as Analog-in-Memory Computing (AIMC), and the system-level challenges of integrating AIMC into heterogeneous analog/digital systems.

Tentative syllabus (subject to adjustment based on class needs):
1. Introduction to DNNs, CNNs, and Quantized CNNs for deployment on resource-constrained edge devices (2h)
2. Introduction to the Parallel Ultra Low Power (PULP) RISC-V compute platform and the RISC-V ISA (2h)
3. RISC-V ISA extensions for edge AI and parallel execution of optimized AI kernels on PULP (3h)
4. Tightly-coupled AI accelerators and how to program them (3h)
5. Efficient data movement: introduction to tiling strategies and execution of end-to-end DNN models on PULP (3h)
6. Outlook: energy efficiency promises of Analog-in-Memory Computing and introduction to system-level challenges in heterogeneous integration (2h)
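As a small illustration of why the description cites Amdahl's Law as motivation for accelerators, the sketch below computes the overall speedup 1 / ((1 - p) + p / n) when a fraction p of the runtime is spread over n parallel units. The function name `amdahl_speedup` and the 90% parallel fraction are assumptions chosen for illustration only; they are not taken from the course material.

```python
def amdahl_speedup(parallel_fraction: float, n_units: int) -> float:
    """Overall speedup predicted by Amdahl's Law.

    parallel_fraction: share of the original runtime that can be parallelized (0..1).
    n_units: number of parallel execution units (cores), at least 1.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unchanged; only the parallel part shrinks with n_units.
    return 1.0 / (serial_fraction + parallel_fraction / n_units)


if __name__ == "__main__":
    # Hypothetical kernel in which 90% of the runtime is parallelizable.
    for cores in (1, 2, 4, 8):
        print(f"{cores} cores: {amdahl_speedup(0.9, cores):.2f}x")
```

Under this assumption the speedup can never exceed 10x regardless of core count, because the remaining 10% serial part dominates. This is the kind of limit that offloading specific kernel sections to a dedicated accelerator is meant to address.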
Guest lecturer: Angelo Garofalo (Post-doctoral Researcher at ETH Zurich, Switzerland)

Angelo Garofalo currently holds a position as Junior Assistant Professor (RTD-A) at the University of Bologna, Italy, and is a post-doctoral researcher at ETH Zurich, Switzerland. His research interests include heterogeneous compute architectures for mixed-criticality edge systems, AI/ML acceleration, custom extensions to the RISC-V ISA, time-predictable hardware, safety solutions for critical processors, and hardware for security. During his career he has gained experience in ASIC SoC design (frontend, backend and sign-off), hardware/software co-design of heterogeneous systems, design of digital accelerators, the RISC-V Instruction Set Architecture, and RISC-V processor design. He has published more than 30 contributions in international conferences and journals.
On site
Oral presentation
P.D.2-2 - June