
Sparsification and binarization of neural networks


Reference persons ENRICO MAGLI

Research Groups Image Processing Lab (IPL)

Thesis type RESEARCH

Description Deep neural networks (DNNs) have become the standard technology for machine learning in many application fields, from image classification to segmentation and many other visual tasks. DNNs generally exhibit stunning performance, in specific cases even exceeding human performance. However, since they may be composed of millions, or even billions, of parameters, DNNs are generally expensive in terms of the computational power required for training (GPUs are necessary), training data, and storage size. In recent years, researchers have been trying to reduce the resources DNNs need to perform their tasks, in order to deploy them on computationally limited, energy-constrained devices such as embedded systems or phones. The two main techniques for tackling this problem are pruning (or sparsification), i.e. reducing the number of parameters in the network's topology to a desired level, and quantization, i.e. reducing the number of bits used to represent the network's parameter values in memory, down to even 1 bit per parameter.
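To make the two techniques concrete, the following is a minimal sketch of magnitude-based pruning and uniform b-bit quantization applied to a weight array. It uses plain NumPy rather than a deep learning framework; the function names and the specific quantization scheme (uniform over the weights' range) are illustrative choices, not part of the thesis proposal.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with smallest magnitude."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

def uniform_quantize(weights, bits):
    """Snap weights to 2**bits evenly spaced levels over their range."""
    levels = 2 ** bits
    w_min, w_max = weights.min(), weights.max()
    scale = (w_max - w_min) / (levels - 1)
    q = np.round((weights - w_min) / scale)  # integer level index
    return q * scale + w_min                 # back to the original scale
```

For example, pruning at 50% sparsity keeps only the half of the weights with the largest magnitudes, and 1-bit uniform quantization collapses all values onto the two extremes of the range.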

The thesis will employ these two techniques, individually or jointly, to reduce the computational and memory footprint of DNNs while ideally preserving the performance of the original full-size, full-precision network on the given task. The student will work on topics such as:

- investigating the influence of the parameters' initialization distribution on performance after pruning and/or quantization;
- studying the influence of the number of bits used to represent the parameters under quantization, focusing on low-bit representations such as 4 bits, 3 bits, 2 bits, and 1 bit;
- applying pruning and/or quantization to problems other than image classification, which is generally the most studied task;
- analyzing how compact DNNs perform on inputs that are not images (for example point clouds or other sparse data);
- extending the 1-bit binary quantization framework to ternarization, in which an additional quantization interval is introduced so that quantization and pruning are performed jointly, in an end-to-end training setting;
- and more.
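The ternarization direction mentioned above can be sketched as follows: weights below a magnitude threshold are set to zero (the pruning effect), while the survivors are mapped to a single shared magnitude with their original sign (the quantization effect), giving values in {-alpha, 0, +alpha}. The threshold heuristic (a fraction of the mean absolute weight) and the choice of alpha as the mean surviving magnitude are common conventions from the ternary-networks literature, used here purely for illustration.

```python
import numpy as np

def ternarize(weights, delta_factor=0.7):
    """Map weights to {-alpha, 0, +alpha}. Small weights are pruned to 0;
    the rest keep their sign and share a single learned-free magnitude alpha."""
    # Threshold proportional to the mean absolute weight (illustrative heuristic)
    delta = delta_factor * np.mean(np.abs(weights))
    mask = np.abs(weights) > delta
    # alpha: mean magnitude of the surviving weights
    alpha = np.abs(weights[mask]).mean() if mask.any() else 0.0
    return alpha * np.sign(weights) * mask
```

Note that the zero interval performs pruning and the two nonzero levels perform 1-bit-style quantization, so a single operation realizes both techniques jointly, which is exactly the coupling the thesis proposes to study end to end.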

Required skills Basic deep learning skills, including Python and experience with PyTorch or TensorFlow

Deadline 29/11/2024