Integrating large-scale language models with knowledge graphs to create natural language processing-based biomedical tools
Thesis in external company
keywords BIOMEDICAL TOOLS, KNOWLEDGE GRAPHS, LARGE-SCALE LANGUAGE MODELS, MACHINE LEARNING, NAMED ENTITY DISAMBIGUATION, RECOMMENDER SYSTEMS, SEMANTIC TECHNOLOGIES, SEMANTIC WEB
Reference persons ANTONIO VETRO'
External reference persons Giovanni Garifo (giovanni.garifo@polito.it), Giuseppe Futia (giuseppe.futia@gmail.com)
Research Groups DAUIN - GR-22 - Nexa Center for Internet & Society - NEXA
Thesis type EXPERIMENTAL / DEVELOPMENT, EXPERIMENTAL, IN COMPANY, SOFTWARE DEVELOPMENT
Description Knowledge Graphs (KGs) are receiving growing interest as a knowledge representation framework to shape the interconnected nature of biomedical data, representing heterogeneous relationships between diagnoses, related treatments, drugs, and associated effects.
The capacity to extract information from textual content and map it to KGs is a critical task in Natural Language Processing (NLP) for building intelligent advisory systems. More precisely, this task is known as Named Entity Disambiguation (NED): it maps named entities mentioned in the text, e.g., diagnoses and medicines, to the corresponding KG entities, e.g., “Type 2 Diabetes” and “Insulin”.
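As a purely illustrative sketch of what NED does (not the approach to be developed in the thesis), the mapping from a text mention to a KG entity can be approximated with plain string similarity. The toy KG, its identifiers (`kg:T2D`, `kg:INS`, ...), and the function names below are hypothetical and chosen only for this example.

```python
from difflib import SequenceMatcher

# Hypothetical toy KG: entity identifier -> known surface forms.
# Identifiers and aliases are invented for illustration only.
TOY_KG = {
    "kg:T2D": ["Type 2 Diabetes", "type 2 diabetes mellitus", "T2DM"],
    "kg:T1D": ["Type 1 Diabetes", "type 1 diabetes mellitus", "T1DM"],
    "kg:INS": ["Insulin"],
}


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def disambiguate(mention: str, kg: dict = TOY_KG):
    """Return (entity_id, score) for the KG entity whose surface
    forms best match the mention, or (None, 0.0) if the KG is empty."""
    best_id, best_score = None, 0.0
    for entity_id, names in kg.items():
        score = max(similarity(mention, name) for name in names)
        if score > best_score:
            best_id, best_score = entity_id, score
    return best_id, best_score


if __name__ == "__main__":
    for mention in ["insulin", "T2DM", "type 2 diabetes"]:
        print(mention, "->", disambiguate(mention))
```

A purely lexical matcher like this cannot separate entities with near-identical names, which is one reason real NED systems also use the surrounding context and the graph structure.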
Despite the widespread success of Large-scale Language Models (LLMs) in NLP tasks, there are still many opportunities to leverage the structural knowledge in KGs for NED. The thesis will allow the student to develop new approaches in this field, using hybrid strategies (LLMs + KGs) to create useful NLP-based biomedical tools. In particular, the student will investigate which KG features have the most significant impact in this context.
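To give a rough idea of what a hybrid strategy might look like, the sketch below re-ranks candidate entities by combining a text-based score with one simple structural KG feature: the share of entities already recognised in the same document that are neighbours of the candidate. Everything here is an assumption made for illustration: the adjacency data, the entity identifiers, the `alpha` weight, and the text scores, which merely stand in for the output of a language model.

```python
# Hypothetical toy KG as adjacency sets (entity -> directly linked entities).
NEIGHBORS = {
    "kg:T2D": {"kg:INS", "kg:METFORMIN"},
    "kg:DI": {"kg:DESMOPRESSIN"},
}


def rerank(text_scores, context_entities, neighbors=NEIGHBORS, alpha=0.5):
    """Combine a text score with a KG neighbour-overlap score.

    text_scores:      candidate entity -> score in [0, 1]
                      (assumed to come from a language model).
    context_entities: set of entities already linked in the document.
    alpha:            weight of the text score; (1 - alpha) weights
                      the structural score.
    Returns (best_entity, combined_scores).
    """
    combined = {}
    for entity_id, text_score in text_scores.items():
        linked = neighbors.get(entity_id, set())
        if context_entities:
            overlap = len(linked & context_entities) / len(context_entities)
        else:
            overlap = 0.0
        combined[entity_id] = alpha * text_score + (1 - alpha) * overlap
    best = max(combined, key=combined.get)
    return best, combined


if __name__ == "__main__":
    # The mention "diabetes" is ambiguous for the text model alone...
    scores = {"kg:T2D": 0.6, "kg:DI": 0.6}
    # ...but the document also mentions insulin, which is linked to T2D.
    print(rerank(scores, {"kg:INS"}))
```

Neighbour overlap is only one of many possible KG features; others, such as relation types or graph embeddings, could be plugged into the same combination scheme.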
The thesis proposal is in collaboration with the company Graph Aware S.r.l. Interaction with the company will take place both remotely and in person at the Nexa Center for Internet & Society.
See also https://graphaware.com/
Required skills The thesis requires excellent development skills in Python and basic knowledge of Natural Language Processing and Machine Learning techniques.
A grade point average of 26 or higher may play a relevant role in the selection of the candidate.
Notes When sending your application, we kindly ask you to attach the following information:
- list of exams taken in your master's degree, with grades and grade point average
- a résumé or equivalent (e.g., LinkedIn profile), if you already have one
- by when you aim to graduate and an estimate of the time you can devote to the thesis in a typical week
Deadline 14/04/2023