Optimizing batch normalization for small to medium batch sizes
Keywords DEEP NEURAL NETWORKS, MACHINE LEARNING, OPTIMIZATION
Supervisors FABRIZIO LAMBERTI, LIA MORRA
Research groups DAUIN - GR-09 - GRAphics and INtelligent Systems - GRAINS
Description Batch normalization has been a key enabler in the training of very deep neural networks. However, it suffers from a number of drawbacks, including slow updates, instability with small batch sizes, different behaviors at training and inference time, and platform-dependent implementations that hinder research reproducibility. Recently, variations on batch normalization, including batch-norm-free architectures, have been proposed. However, most recent papers run experiments on high-end, multi-GPU training environments that enable very large batch sizes; such platforms are beyond the reach of many practitioners. The goal of this thesis is to investigate and compare different training strategies in single-GPU environments with small to medium batch sizes, and to determine the best architectural and training strategies for low-resource training environments. Strong programming and analytical skills are required. Previous experience with at least one deep learning framework (TensorFlow/Keras or PyTorch) is required, as the thesis may involve developing new (or modifying existing) layers and optimizers.
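As a minimal illustration of the kind of comparison involved (not part of the proposal itself), the following PyTorch sketch builds the same convolutional block with batch normalization, with GroupNorm (whose statistics are per-sample and therefore independent of batch size), and with no normalization at all; the `conv_block` helper and the group count of 32 are assumptions for the example only.

```python
# Hypothetical sketch: swapping normalization strategies in a small conv block.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, norm="batch"):
    """Conv -> normalization -> ReLU; `norm` selects the normalization strategy."""
    if norm == "batch":
        norm_layer = nn.BatchNorm2d(out_ch)    # batch statistics: unstable for tiny batches
    elif norm == "group":
        norm_layer = nn.GroupNorm(32, out_ch)  # per-sample statistics: batch-size independent
    else:
        norm_layer = nn.Identity()             # normalization-free variant
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
        norm_layer,
        nn.ReLU(inplace=True),
    )

# A tiny batch, as in a single-GPU, memory-constrained setting.
x = torch.randn(2, 3, 32, 32)
for norm in ("batch", "group", "none"):
    block = conv_block(3, 64, norm=norm)
    block.train()
    print(norm, block(x).shape)
```

In this setting, only the BatchNorm variant behaves differently between training and inference and degrades as the batch shrinks, which is precisely the effect the thesis would study and quantify.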
Suggested reading:
High-Performance Large-Scale Image Recognition Without Normalization
https://arxiv.org/pdf/2102.06171.pdf
Proposal valid until 11/07/2023