Politecnico di Torino | Teaching Services


Integrating large-scale language models with knowledge graphs to create NLP-based biomedical tools

Company External thesis carried out at a company

Keywords BIOMEDICAL TOOLS, KNOWLEDGE GRAPHS, LARGE-SCALE LANGUAGE MODELS, MACHINE LEARNING, NAMED ENTITY DISAMBIGUATION, RECOMMENDER SYSTEMS, SEMANTIC TECHNOLOGIES, SEMANTIC WEB

Reference persons ANTONIO VETRO'

External reference persons Giovanni Garifo (giovanni.garifo@polito.it), Giuseppe Futia (giuseppe.futia@gmail.com)

Research groups DAUIN - GR-22 - Nexa Center for Internet & Society - NEXA

Thesis type COMPANY-BASED, APPLIED EXPERIMENTAL, EXPERIMENTAL, IN-COMPANY, SOFTWARE DEVELOPMENT

Description Knowledge Graphs (KGs) are receiving growing interest as a knowledge representation framework for modeling the interconnected nature of biomedical data, capturing heterogeneous relationships between diagnoses, related treatments, drugs, and associated effects.

The ability to extract information from textual content and map it to KGs is a critical task in Natural Language Processing (NLP) for building intelligent advisory systems. More precisely, this task is known as Named Entity Disambiguation (NED): it maps named entities mentioned in the text, e.g., diagnoses and medicines, to the corresponding KG entities, e.g., “Type 2 Diabetes” and “Insulin.”

Despite the widespread success of Large-scale Language Models (LLMs) in NLP tasks, there are many opportunities to leverage the structural knowledge in KGs for NED. The thesis will allow the student to develop new approaches in this field through hybrid strategies that combine LLMs and KGs to create useful NLP-based biomedical tools. In particular, the student will investigate which KG features may have the most significant impact in this context.
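One possible shape of such a hybrid strategy is sketched below: each candidate entity receives a text-based score, which is then combined with a score derived from KG structure. Everything here is an assumption made for illustration. The text scores are hard-coded placeholders standing in for what a language model might produce, the graph is a toy adjacency map, and neighbor overlap with the document context is just one hypothetical KG feature among the many the thesis could investigate.

```python
# Hypothetical toy KG as an adjacency map: entity ID -> neighboring entity IDs.
KG_NEIGHBORS = {
    "kg:Insulin_drug": {"kg:Type2Diabetes", "kg:Hypoglycemia"},
    "kg:Insulin_gene": {"kg:Chromosome11", "kg:Proinsulin"},
}


def rerank(candidates, context_entities, alpha=0.5):
    """Combine a text score with a KG-neighborhood score for each candidate.

    candidates: dict mapping entity ID -> text score in [0, 1]
                (placeholder for a language-model score).
    context_entities: entity IDs already linked elsewhere in the document.
    alpha: weight of the text score; (1 - alpha) goes to the graph score.
    """
    context = set(context_entities)
    scored = {}
    for entity_id, text_score in candidates.items():
        neighbors = KG_NEIGHBORS.get(entity_id, set())
        # Fraction of context entities that are direct KG neighbors.
        graph_score = len(neighbors & context) / len(context) if context else 0.0
        scored[entity_id] = alpha * text_score + (1 - alpha) * graph_score
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # The mention "insulin" is ambiguous on text alone (equal placeholder scores).
    candidates = {"kg:Insulin_drug": 0.6, "kg:Insulin_gene": 0.6}
    print(rerank(candidates, ["kg:Type2Diabetes"]))
```

In this toy setting the text scores tie, and the KG neighborhood is what separates the two candidates. The weight `alpha` is an arbitrary choice here and would need to be tuned or learned.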

The thesis proposal is offered in collaboration with Graph Aware S.r.l. Interaction with the company will take place both remotely and in person at the Nexa Center for Internet & Society.

See also https://graphaware.com/

Required skills The thesis requires excellent development skills in Python and basic knowledge of Natural Language Processing and Machine Learning techniques.
A grade point average of 26 or higher can play a relevant role in the selection of the candidate.

Notes When sending your application, please attach the following information:

- a list of the exams taken in your master's degree, with grades and grade point average
- a résumé or equivalent (e.g., LinkedIn profile), if you already have one
- your target graduation date and an estimate of the time you can devote to the thesis in a typical week

Proposal valid until 14/04/2023