Politecnico di Torino | Servizi per la didattica

Exploring Deep Learning Techniques to Improve Voice Disorder Diagnoses

Keywords DEEP LEARNING, VOICE DISORDERS, VOICE RECOGNITION

Research Groups DAUIN - GR-04 - DATABASE AND DATA MINING GROUP - DBDM

Thesis type EXPERIMENTAL AND MODELING

Description Voice disorders pose significant challenges for accurate diagnosis and classification, often requiring specialized expertise and subjective assessments. This thesis will explore the application of deep learning techniques to the analysis of voice samples from patients with various types of voice disorders, aiming to improve the accuracy and efficiency of diagnosis. The research methodology will consider a diverse dataset of voice samples from patients diagnosed with different voice disorders, including vocal cord nodules, polyps, vocal fold paralysis, and laryngeal cancer. These voice samples will first be pre-processed to extract relevant acoustic features, such as pitch, jitter, shimmer, and spectral characteristics, which will serve as input for the deep learning models.
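
As an illustration of the pre-processing step, the sketch below computes mean pitch, local jitter, and local shimmer from per-cycle measurements. It assumes the glottal cycle periods and peak amplitudes have already been estimated by a separate pitch tracker, which is not shown; the function names and the sample values are hypothetical and are not taken from the thesis dataset. The formulas follow the common "local" definitions: the mean absolute difference between consecutive cycles, divided by the mean value.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute difference between consecutive values, relative to their mean."""
    if len(values) < 2:
        raise ValueError("at least two cycles are required")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


def jitter_local(periods_s):
    """Cycle-to-cycle variation of the period (dimensionless ratio)."""
    return local_perturbation(periods_s)


def shimmer_local(peak_amplitudes):
    """Cycle-to-cycle variation of the peak amplitude (dimensionless ratio)."""
    return local_perturbation(peak_amplitudes)


def mean_pitch_hz(periods_s):
    """Mean fundamental frequency, as the inverse of the mean period."""
    return 1.0 / mean(periods_s)


# Hypothetical per-cycle measurements (periods in seconds, amplitudes in arbitrary units).
periods = [0.0100, 0.0102, 0.0099, 0.0101]
amplitudes = [0.80, 0.78, 0.82, 0.79]

features = {
    "pitch_hz": mean_pitch_hz(periods),
    "jitter": jitter_local(periods),
    "shimmer": shimmer_local(amplitudes),
}
print(features)
```

A feature dictionary of this kind could be concatenated with spectral descriptors before being passed to a classifier; the exact feature set is left open by the proposal.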

Multiple deep learning architectures, including convolutional neural networks (CNNs), will be trained and evaluated on the dataset to classify voice disorders accurately. Transfer learning techniques will also be employed to leverage pre-trained models and optimize performance. Finally, we will consider novel End-to-End (E2E) approaches based on the Transformer architecture, which can process the voice sample directly, without preprocessing. These approaches have recently reached state-of-the-art performance on several types of audio-related classification benchmarks, but their use in the field of Voice Disorder Identification is yet to be studied.

Furthermore, interpretability techniques will be employed to analyse the learned representations of the deep learning models, potentially providing insights into the discriminative features that contribute to an accurate classification. This knowledge could aid clinicians in understanding the underlying characteristics of voice disorders and inform treatment decisions.
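
The proposal does not name a specific interpretability technique. As one possible illustration, the sketch below applies occlusion-based attribution: each input feature is replaced in turn by a baseline value, and the resulting drop in the model score is taken as that feature's contribution. The `toy_score` function is a hypothetical stand-in for a trained classifier, and the feature names and values are assumptions made only for this example.

```python
def occlusion_attribution(score_fn, features, baseline):
    """Score drop observed when each feature is replaced by its baseline value."""
    reference = score_fn(features)
    attributions = {}
    for name in features:
        occluded = dict(features)          # copy, so the input is left untouched
        occluded[name] = baseline[name]    # mask a single feature
        attributions[name] = reference - score_fn(occluded)
    return attributions


def toy_score(f):
    """Hypothetical stand-in for a trained model's disorder score."""
    return 2.0 * f["jitter"] + 0.5 * f["shimmer"]


sample = {"jitter": 1.0, "shimmer": 2.0}
zeros = {"jitter": 0.0, "shimmer": 0.0}

attributions = occlusion_attribution(toy_score, sample, zeros)
print(attributions)
```

With a real model, `score_fn` would wrap the network's predicted probability for a given class; for end-to-end models operating on raw audio, the same idea can be applied to time segments or spectrogram patches instead of named features.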

Required skills Python, Machine Learning, basic notions of Deep Learning

Deadline 13/11/2024