Optimizing batch normalization for small to medium batch sizes

Keywords DEEP NEURAL NETWORKS, MACHINE LEARNING, OPTIMIZATION

Reference persons FABRIZIO LAMBERTI, LIA MORRA

Research groups DAUIN - GR-09 - GRAphics and INtelligent Systems - GRAINS

Description Batch normalization has been a key enabler in the training of very deep neural networks. However, it suffers from a number of drawbacks, including slow updates, instability with small batch sizes, different behavior at training and inference time, and platform-dependent implementations that hinder research reproducibility. Recently, variations on batch normalization, including batch-norm-free architectures, have been proposed. However, most recent papers run their experiments in high-end, multi-GPU training environments that enable very large batch sizes; these platforms are beyond the reach of many practitioners. The goal of this thesis is to investigate and compare training strategies in single-GPU environments with small to medium batch sizes, and to determine the best architectural and training strategies for low-resource training environments. Strong programming and analytical skills are required. Previous experience with at least one deep learning framework (TensorFlow/Keras or PyTorch) is also required, as the thesis may involve developing new layers and optimizers or modifying existing ones.
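
Two of the drawbacks named above, instability with small batch sizes and different behavior at training and inference time, can be illustrated with a minimal standard-library Python sketch. It is not taken from the proposal or from the suggested paper: the function names, the one-dimensional Gaussian "activations", and the choice of 2000 random batches are all assumptions made only for illustration.

```python
import random
import statistics


def batch_norm_train(batch, eps=1e-5):
    """Training mode: normalize a 1-D batch with its own mean and biased variance."""
    mean = statistics.fmean(batch)
    var = statistics.pvariance(batch, mu=mean)
    normalized = [(x - mean) / (var + eps) ** 0.5 for x in batch]
    return normalized, mean, var


def batch_norm_eval(batch, running_mean, running_var, eps=1e-5):
    """Inference mode: normalize with fixed, previously accumulated statistics."""
    return [(x - running_mean) / (running_var + eps) ** 0.5 for x in batch]


def spread_of_batch_means(batch_size, n_batches=2000, seed=0):
    """Standard deviation of the per-batch mean across many random batches.

    Samples are drawn from a standard normal distribution (an assumption),
    so a larger return value means noisier batch statistics.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_batches):
        batch = [rng.gauss(0.0, 1.0) for _ in range(batch_size)]
        means.append(statistics.fmean(batch))
    return statistics.pstdev(means)


if __name__ == "__main__":
    # Smaller batches give noisier estimates of the activation mean.
    for batch_size in (2, 8, 32, 128):
        print(batch_size, round(spread_of_batch_means(batch_size), 3))
```

For independent samples, the noise in the batch mean shrinks roughly as one over the square root of the batch size, so the normalization statistics seen during training fluctuate much more at small batch sizes. The two normalization functions differ only in where the statistics come from, which is the source of the mismatch between training and inference.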

Suggested reading:
High-Performance Large-Scale Image Recognition Without Normalization
https://arxiv.org/pdf/2102.06171.pdf


Proposal valid until 11/07/2023




© Politecnico di Torino
Corso Duca degli Abruzzi, 24 - 10129 Torino, ITALY