Area Engineering

Evaluation of the Effects of Hardware-Random Faults on DNN Trainings in AI Accelerators

keywords AI ACCELERATOR, FAULT TOLERANCE, NEURAL NETWORKS ACCELERATORS

Reference persons ANNACHIARA RUOSPO, EDGAR ERNESTO SANCHEZ SANCHEZ

Research Groups DAUIN - GR-05 - ELECTRONIC CAD & RELIABILITY GROUP - CAD

Thesis type EXPERIMENTAL - DEVELOPMENT, RESEARCH AND DEVELOPMENT

Description Artificial Intelligence (AI) accelerators are specialized hardware designed to speed up the computation of Deep Neural Networks (DNNs). However, as hardware scales to smaller technology nodes and operates under higher performance demands, it becomes increasingly susceptible to random hardware faults, such as bit flips and other transient errors. This thesis aims to systematically evaluate the impact of random hardware faults on DNN training in AI accelerators. The outcomes will help clarify how such faults affect training accuracy, convergence behavior, and model robustness, and will offer insights for designing fault-tolerant systems. The thesis is carried out in collaboration with Ecole Centrale de Lyon and Intel. The thesis work must be carried out at POLITO.
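
As an illustration of the kind of fault the description refers to, the following is a minimal sketch of a single bit flip in one 32-bit floating-point value, such as a weight or gradient held in accelerator memory. It is not the methodology of the thesis; the function name `flip_bit`, the float32 data type, and the bit numbering (0 = least significant mantissa bit, 31 = sign bit) are assumptions made for this example.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value`, stored as an IEEE 754 float32, with one bit inverted.

    Hypothetical helper for illustration only.
    bit 0 is the least significant mantissa bit, bit 31 is the sign bit.
    """
    if not 0 <= bit <= 31:
        raise ValueError("bit must be in the range [0, 31]")
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    # Invert the selected bit, emulating a single-event upset.
    as_int ^= 1 << bit
    # Reinterpret the corrupted pattern as a float32 again.
    (faulty,) = struct.unpack("<f", struct.pack("<I", as_int))
    return faulty


weight = 0.5
for position in (0, 30, 31):
    print(f"bit {position:2d}: {weight} -> {flip_bit(weight, position)}")
```

The effect depends strongly on the bit position: a flip in the low mantissa bits changes the value only marginally, while a flip in the most significant exponent bit turns 0.5 into a value on the order of 1e38, which is the kind of corruption that can plausibly disturb training convergence.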

Required skills
- Basics of Computer Architecture
- Basics of Deep Learning
- Python and libraries such as PyTorch
- C, C++


Deadline 21/01/2026