Optimization of Transformer Deep Neural Networks on Multi-Core MCUs
keywords ARTIFICIAL INTELLIGENCE, C, DEEP LEARNING, DEEP NEURAL NETWORKS, EMBEDDED SYSTEMS, ENERGY EFFICIENCY, FIRMWARE, LINEAR ALGEBRA, LOW POWER, MICROCONTROLLERS, SOFTWARE, SOFTWARE ACCELERATION, TRANSFORMERS
Reference persons DANIELE JAHIER PAGLIARI
External reference persons Francesco Conti (University of Bologna)
Alessio Burrello (Politecnico di Torino)
Research Groups DAUIN - GR-06 - ELECTRONIC DESIGN AUTOMATION - EDA
Thesis type EXPERIMENTAL, SOFTWARE DEVELOPMENT
Description Transformers are a recent class of deep neural networks that have become the state of the art in natural language processing and computer vision. However, their computational cost is typically too high for embedded systems with a tight power budget (< 100 mW). One of the key ways to improve the computational efficiency of these models is to optimize how data is arranged in memory so that, at any time, as much as possible of the data needed for execution resides in the lowest-latency levels of the memory hierarchy.
In this thesis, the candidate will develop an automatic tool that optimizes the execution, the data layout, and the placement of data across the memory levels for multi-head self-attention layers, which are the most computationally critical components of a deep transformer. The final product of the thesis will be a tool that, given a high-level description of a transformer network, determines the best memory arrangement of weights and intermediate outputs (activations), as well as the elementary library function (kernel) to use for executing each layer. The tool will target the GAP8 ultra-low-power hardware platform, which consists of a main RISC-V processor with 3 memory levels, coupled with a cluster of 8 compute cores, with a power budget < 100 mW.
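To give a rough idea of the kind of decision such a tool has to make, the sketch below estimates the size of the tensors involved in one multi-head self-attention layer and assigns each of them to a memory level. It is only an illustration and not the tool to be developed: the names `AttentionLayer`, `tensor_sizes` and `place_tensors` are hypothetical, the memory capacities are placeholder values assumed for the example (not taken from this proposal), 8-bit data is assumed, and the smallest-first greedy placement is a simple heuristic that does not guarantee the best arrangement. A real tool would also have to tile tensors that do not fit and select a kernel for each layer.

```python
from dataclasses import dataclass

# Placeholder capacities in bytes, ordered from fastest to slowest level.
# These numbers are assumptions for illustration; check the platform datasheet.
MEMORY_LEVELS = [("L1", 64 * 1024), ("L2", 512 * 1024), ("L3", 8 * 1024 * 1024)]


@dataclass
class AttentionLayer:
    seq_len: int             # number of input tokens (S)
    embed_dim: int           # embedding size (E)
    num_heads: int           # number of attention heads (H)
    head_dim: int            # per-head projection size (P)
    bytes_per_elem: int = 1  # 8-bit quantized data assumed


def tensor_sizes(layer):
    """Return the size in bytes of each weight and activation tensor."""
    S, E = layer.seq_len, layer.embed_dim
    H, P, b = layer.num_heads, layer.head_dim, layer.bytes_per_elem
    return {
        "input": S * E * b,                # input activations
        "qkv_weights": 3 * E * H * P * b,  # query, key and value projections
        "qkv": 3 * S * H * P * b,          # projected Q, K and V
        "scores": H * S * S * b,           # attention matrix, one S x S per head
        "out_weights": H * P * E * b,      # output projection
        "output": S * E * b,               # output activations
    }


def place_tensors(layer, levels=MEMORY_LEVELS):
    """Greedy heuristic: smallest tensors first, fastest level with free space."""
    free = {name: capacity for name, capacity in levels}
    placement = {}
    for tensor, size in sorted(tensor_sizes(layer).items(), key=lambda kv: kv[1]):
        for name, _ in levels:
            if size <= free[name]:
                free[name] -= size
                placement[tensor] = name
                break
        else:
            raise MemoryError(f"{tensor} ({size} bytes) does not fit in any level")
    return placement


if __name__ == "__main__":
    layer = AttentionLayer(seq_len=64, embed_dim=128, num_heads=4, head_dim=32)
    for tensor, level in place_tensors(layer).items():
        print(f"{tensor:12s} -> {level}")
```

Note that the attention matrix grows with the square of the sequence length, so for longer inputs it quickly stops fitting in the fastest level, which is one reason why layout and tiling choices matter for these layers.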
Interested candidates must send an email to email@example.com, attaching their CV and exam transcript with grades.
Required skills Required skills include C and Python programming. In addition, a basic knowledge of computer architectures and embedded systems is necessary. Desired (but not required) skills include some familiarity with basic machine/deep learning concepts and the corresponding models.
Notes Thesis in collaboration with Prof. Luca Benini's research group at the University of Bologna and ETH Zurich.
Deadline 14/12/2023