Optimizing batch normalization for small to medium batch sizes
Keywords DEEP NEURAL NETWORKS, MACHINE LEARNING, OPTIMIZATION
Supervisors FABRIZIO LAMBERTI, LIA MORRA
Research groups DAUIN - GR-09 - GRAphics and INtelligent Systems - GRAINS
Description Batch normalization has been a key enabler in the training of very deep neural networks. However, it suffers from a number of drawbacks, including slow updates, instability with small batch sizes, different behaviors at training and inference time, and platform-dependent implementations that hinder research reproducibility. Recently, variations on batch normalization, including batch-norm-free architectures, have been proposed. However, most recent papers run experiments on high-end, multi-GPU training environments that enable very large batch sizes; such platforms are beyond the reach of many practitioners. The goal of this thesis is to investigate and compare different training strategies in single-GPU environments with small to medium batch sizes, and to determine the best architectural and training strategies for low-resource training environments. Strong programming and analytical skills are required. Previous experience with at least one deep learning framework (TensorFlow/Keras or PyTorch) is required, as the thesis may involve developing new (or modifying existing) layers and optimizers.
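As a minimal illustration of the kind of comparison involved (not part of the proposal itself), the following PyTorch sketch builds the same convolutional block with batch normalization, with GroupNorm (whose statistics are per-sample and therefore independent of batch size), and with no normalization at all; the `conv_block` helper and the group count of 32 are assumptions for the example only.

```python
# Hypothetical sketch: swapping normalization strategies in a small conv block.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, norm="batch"):
    """Conv -> normalization -> ReLU; `norm` selects the normalization strategy."""
    if norm == "batch":
        norm_layer = nn.BatchNorm2d(out_ch)    # batch statistics: unstable for tiny batches
    elif norm == "group":
        norm_layer = nn.GroupNorm(32, out_ch)  # per-sample statistics: batch-size independent
    else:
        norm_layer = nn.Identity()             # normalization-free variant
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
        norm_layer,
        nn.ReLU(inplace=True),
    )

# A tiny batch, as in a single-GPU, memory-constrained setting.
x = torch.randn(2, 3, 32, 32)
for norm in ("batch", "group", "none"):
    block = conv_block(3, 64, norm=norm)
    block.train()
    print(norm, block(x).shape)
```

In this setting, only the BatchNorm variant behaves differently between training and inference and degrades as the batch shrinks, which is precisely the effect the thesis would study and quantify.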
Suggested reading:
High-Performance Large-Scale Image Recognition Without Normalization
https://arxiv.org/pdf/2102.06171.pdf
Proposal valid until 11/07/2023