Politecnico di Torino | Teaching Services


Scaling Pruning at Initialization to Large Vision Models: on the Importance of Data for Preserving Pre-trained Capabilities

External reference persons Leonardo Iurada, Marco Ciccone

Research Groups DAUIN - GR-23 - VANDAL - Visual and Multimodal Applied Learning Lab

Thesis type RESEARCH / EXPERIMENTAL

Description Pruning at initialization (PaI) has emerged as a powerful technique for neural network compression. Recent methods rely on saliency scores to estimate parameter importance, but computing these scores requires gradients with respect to all parameters, which becomes a computational bottleneck for large models. This work proposes an extension of the Path eXclusion (PX) algorithm for PaI that draws on insights from the parameter-efficient fine-tuning literature. With this extension, we aim to overcome these scalability limitations and enable the application of our method to models of any size. Moreover, starting from pre-trained models reduces the otherwise prohibitive cost of training large models from scratch, so it is of paramount importance to preserve the pre-trained model's semantic understanding during pruning. Although recent trends in PaI favor data-agnostic pruning algorithms, recent developments suggest that such an approach might not be optimal, particularly when applied to pre-trained models. Data-driven PaI approaches seemingly overcome this limitation. However, the role and necessity of data in PaI remain an open question. Our work investigates this question across various computer vision tasks, including classification, semantic segmentation, and object detection. We employ mechanistic interpretability metrics to determine whether the use of data is a key factor in preserving semantic content during PaI. We posit that data acts as a catalyst, unlocking a pre-trained model's inherent capabilities for a given task, similar to very recent observations in the model merging literature. By elucidating the relationship between data and task-specific semantics, we aim to pave the way for broader application of PaI methods across computer vision tasks at scale.
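To make the notion of a gradient-based saliency score concrete, the following is a minimal sketch of a generic data-driven criterion of the form |weight × gradient| applied to a toy linear model. It is not the PX algorithm or the extension proposed in this thesis; the function names, the toy model, and the data are all assumptions chosen for illustration only.

```python
def linear_mse_grads(weights, batch):
    """Gradient of the batch mean of 0.5 * (w.x - y)^2, computed by hand."""
    grads = [0.0] * len(weights)
    for x, y in batch:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grads[i] += err * xi / len(batch)
    return grads


def saliency_scores(weights, grads):
    """Illustrative saliency per parameter: |weight * gradient|."""
    return [abs(w * g) for w, g in zip(weights, grads)]


def prune_mask(scores, sparsity):
    """Binary mask keeping the highest-scoring (1 - sparsity) fraction."""
    n_keep = max(1, round(len(scores) * (1.0 - sparsity)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    mask = [0] * len(scores)
    for i in ranked[:n_keep]:
        mask[i] = 1
    return mask


# Hypothetical toy setup: 4 weights and a batch of 2 (input, target) pairs.
weights = [0.5, -1.0, 0.3, 2.0]
batch = [([1.0, 0.0, 2.0, 0.0], 1.0), ([0.0, 1.0, 1.0, 0.0], 0.0)]

grads = linear_mse_grads(weights, batch)
scores = saliency_scores(weights, grads)
mask = prune_mask(scores, sparsity=0.5)          # [0, 1, 1, 0]
pruned = [w * m for w, m in zip(weights, mask)]  # masked weights
```

In this toy case the largest-magnitude weight (2.0) is pruned because its input is zero in every sample, so its gradient and therefore its score are zero. A data-agnostic magnitude criterion would have kept it, which illustrates in miniature why the choice of data can change what a saliency-based method preserves. The sketch also shows the cost noted above: every parameter needs a gradient before any can be scored.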

Deadline 15/07/2024