
Statistical techniques for the training of deep neural networks

Reference persons LIA MORRA

Research Groups DAUIN - GR-09 - GRAphics and INtelligent Systems - GRAINS

Description When training deep networks, different optimizers often produce network instances with differing generalization properties. Recent studies show that statistical techniques can be used to derive, from an ensemble of networks with heterogeneous generalization characteristics, a single effective network that generalizes better than the members of the starting ensemble. A second aspect connected with ensemble learning is a new training technique, Entropic Stochastic Gradient Descent (ESGD), which optimizes an ensemble of networks constrained to remain within a certain relative distance of one another in parameter space (see for example https://arxiv.org/abs/2006.07897). Preliminary studies show that this technique helps the ensemble escape poor local minima and explore regions of the solution space that are favorable for robustness and generalization.
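As a rough illustration of the local-entropy idea behind entropic gradient descent, the sketch below runs the scheme on a one-dimensional toy loss with minima at w = ±1: an inner stochastic loop samples replica weights coupled to the current iterate, and the outer step moves the iterate toward the replica average. All function names, hyperparameter names, and values here are illustrative choices for the sketch, not taken from the cited paper.

```python
import numpy as np

# Toy non-convex loss f(w) = 0.25 * (w^2 - 1)^2, with minima at w = +/-1.
def grad_f(w):
    return w * (w ** 2 - 1.0)

def entropic_sgd(w0, gamma=1.0, outer_steps=100, inner_steps=20,
                 eta_in=0.1, eta_out=0.5, noise=0.01, seed=0):
    """Minimal 1-D sketch of an entropic (local-entropy) gradient step.

    The inner loop runs noisy gradient descent on a replica wp that the
    coupling term gamma * (wp - w) keeps close to the current weights w;
    the running mean mu of the replica estimates the local-entropy
    gradient, and the outer update pulls w toward mu.
    """
    rng = np.random.default_rng(seed)
    w = w0
    for _ in range(outer_steps):
        wp, mu = w, w
        for _ in range(inner_steps):
            g = grad_f(wp) + gamma * (wp - w)   # loss gradient + coupling
            wp = wp - eta_in * g + noise * np.sqrt(eta_in) * rng.standard_normal()
            mu = 0.75 * mu + 0.25 * wp          # running average of replicas
        w = w - eta_out * gamma * (w - mu)      # outer step toward the average
    return w

w_final = entropic_sgd(2.0)
print(w_final)  # close to the minimum at w = 1
```

Starting from w = 2, the iterate is dragged toward the basin at w = 1; in a real experiment w would be the parameter vector of a deep network and grad_f a minibatch gradient.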

The aim of this thesis is to study the integration of these techniques into complex deep neural networks applied in the medical field, in particular to the analysis of high-resolution mammographic images. The work will require implementing these new optimization techniques in PyTorch and comparing them experimentally with other optimization techniques in terms of generalization error and convergence speed.

As part of this thesis, the following activities will be carried out: i) "static" statistical analysis of network ensembles (stacking, weighted ensemble averaging); ii) integration and analysis of the ESGD optimizer; and iii) uncertainty estimation for complex (multi-stream) deep neural networks.
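To make activities i) and iii) concrete, the sketch below combines the softmax outputs of an ensemble with a weighted average and uses the disagreement across members as a simple per-sample uncertainty estimate. The weights would normally come from validation performance; here they are illustrative numbers, and the function name is hypothetical.

```python
import numpy as np

def weighted_ensemble(probs, weights):
    """probs: (n_models, n_samples, n_classes) softmax outputs.

    Returns the predicted class, the weighted mean prediction, and a
    per-sample uncertainty (std, across models, of the probability
    assigned to the predicted class).
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                         # normalize the member weights
    mean = np.tensordot(w, probs, axes=1)   # (n_samples, n_classes)
    pred = mean.argmax(axis=1)
    # how much the members disagree on the chosen class
    per_model = probs[:, np.arange(probs.shape[1]), pred]
    uncertainty = per_model.std(axis=0)
    return pred, mean, uncertainty

# Three toy "models" scoring two samples over two classes.
probs = np.array([
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.8, 0.2], [0.6, 0.4]],
    [[0.7, 0.3], [0.2, 0.8]],
])
pred, mean, unc = weighted_ensemble(probs, [0.5, 0.3, 0.2])
print(pred)  # predicted class per sample
print(unc)   # larger where the members disagree
```

On the toy data the second sample gets a higher uncertainty because the three members disagree on it; stacking would replace the fixed weights with a learned combiner trained on the members' outputs.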

Required skills deep learning; Python and PyTorch programming; statistics

Deadline 31/10/2022

© Politecnico di Torino
Corso Duca degli Abruzzi, 24 - 10129 Torino, ITALY