DAUIN - GR-04 - DATABASE AND DATA MINING GROUP - DBDM
Exploring Deep Learning Techniques to Improve Voice Disorder Diagnoses
Keywords DEEP LEARNING, VOICE DISORDERS, VOICE RECOGNITION
Reference persons TANIA CERQUITELLI
Research Groups DAUIN - GR-04 - DATABASE AND DATA MINING GROUP - DBDM
Thesis type EXPERIMENTAL AND MODELING
Description Voice disorders pose significant challenges for accurate diagnosis and classification, often requiring specialized expertise and subjective assessments. This thesis will explore the application of deep learning techniques to the analysis of voice samples from patients with various types of voice disorders, aiming to improve the accuracy and efficiency of diagnosis. The research methodology will use a diverse dataset of voice samples from patients diagnosed with different voice disorders, including vocal cord nodules, polyps, vocal fold paralysis, and laryngeal cancer. These voice samples will first be pre-processed to extract relevant acoustic features, such as pitch, jitter, shimmer, and spectral characteristics, which will serve as input to the deep learning models.
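As an illustration of the pre-processing step, the following minimal sketch computes mean pitch, local jitter, and local shimmer using their common textbook definitions (mean absolute difference between consecutive cycles, divided by the mean). It assumes the glottal cycle durations and per-cycle peak amplitudes have already been estimated by a pitch tracker; the function names and sample values are hypothetical and not part of the thesis proposal.

```python
from statistics import mean


def relative_perturbation(values):
    """Mean absolute difference between consecutive values, divided by their mean."""
    if len(values) < 2:
        raise ValueError("at least two cycles are required")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


def local_jitter(periods):
    """Cycle-to-cycle variation of the glottal period (periods in seconds)."""
    return relative_perturbation(periods)


def local_shimmer(amplitudes):
    """Cycle-to-cycle variation of the per-cycle peak amplitude."""
    return relative_perturbation(amplitudes)


def mean_pitch(periods):
    """Mean fundamental frequency in Hz, from periods in seconds."""
    return 1.0 / mean(periods)


# Hypothetical measurements for a short sustained vowel (assumed values).
periods = [0.0100, 0.0102, 0.0098, 0.0101]
amplitudes = [0.80, 0.78, 0.83, 0.79]

features = {
    "pitch_hz": mean_pitch(periods),
    "jitter": local_jitter(periods),
    "shimmer": local_shimmer(amplitudes),
}
print(features)
```

A feature dictionary of this kind could then be combined with spectral descriptors before being passed to a classifier.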
Multiple deep learning architectures, including convolutional neural networks (CNNs), will be trained and evaluated on the dataset to classify voice disorders accurately. Transfer learning techniques will also be employed to leverage pre-trained models and optimize performance. Finally, we will consider novel End-to-End (E2E) approaches based on the Transformer architecture, which can process the voice sample directly, without preprocessing. These approaches have recently reached state-of-the-art performance on several audio classification benchmarks. Their use in voice disorder identification has yet to be studied.
Furthermore, interpretability techniques will be employed to analyse the representations learned by the deep learning models, potentially providing insight into the discriminative features that contribute to accurate classification. This knowledge could help clinicians understand the underlying characteristics of voice disorders and inform treatment decisions.
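One simple interpretability technique that could be applied here is occlusion sensitivity: each input feature is replaced in turn by a baseline value, and the resulting drop in the model's score is recorded. The sketch below is only an illustration under stated assumptions. `toy_predict` is a hypothetical stand-in for a trained classifier, and the feature names and values are invented; the proposal does not specify which interpretability method will be used.

```python
def occlusion_importance(predict, features, baseline=0.0):
    """Return, for each feature, the score drop when it is replaced by `baseline`."""
    reference = predict(features)
    importances = {}
    for name in features:
        occluded = dict(features)      # copy so the original input is untouched
        occluded[name] = baseline      # mask a single feature
        importances[name] = reference - predict(occluded)
    return importances


def toy_predict(features):
    """Hypothetical scoring function standing in for a trained model."""
    return 0.5 * features["jitter"] + 0.2 * features["shimmer"]


sample = {"jitter": 2.0, "shimmer": 1.0, "pitch": 3.0}
importances = occlusion_importance(toy_predict, sample)

# Rank features from most to least influential for this sample.
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(name, round(value, 3))
```

In this toy setting, a feature the model ignores receives an importance of zero, which is the kind of signal a clinician could use to check whether the model relies on plausible acoustic cues.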
Required skills Python, Machine Learning, basic notions of Deep Learning
Deadline 13/11/2024