DAUIN - GR-09 - GRAphics and INtelligent Systems - GRAINS
Information extraction from semi-structured documents
Keywords NATURAL LANGUAGE PROCESSING, DEEP NEURAL NETWORKS, COMPUTER VISION
Contacts FABRIZIO LAMBERTI, LIA MORRA
Research groups DAUIN - GR-09 - GRAphics and INtelligent Systems - GRAINS
Description The main goal of this thesis is to design and implement a proof of concept for extracting information items (such as date, gross amount, and tax code) from scanned semi-structured documents (invoices), in close collaboration with a leading insurance company. The proposed system will exploit recent advances in self-supervised vision-language-layout transformer models, which analyze each document element based on its content (graphical and textual) as well as its position and its relationship to the rest of the document [1,2]. Starting from a generic model pre-trained on English, the system will be fine-tuned for information extraction on Italian documents. Performance will be evaluated with standard and ad hoc information extraction metrics reflecting (a) the percentage of documents from which sufficient information can be extracted and (b) the accuracy of the extracted information. Experience with deep learning and PyTorch is a prerequisite. Good programming and analytical skills are required.
Suggested reading:
[1] Unifying Vision, Text, and Layout for Universal Document Processing. https://arxiv.org/pdf/2212.02623.pdf
[2] OCR-free document understanding transformer. https://arxiv.org/pdf/2111.15664.pdf
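As a rough illustration of the two evaluation criteria named in the description, the sketch below computes (a) a document-level coverage rate and (b) a field-level accuracy. It is only one possible reading of those criteria, not the metrics the thesis will adopt: the function names, the exact-match rule after normalization, the `min_fraction` threshold, and the sample invoices (including the tax codes) are all assumptions made up for this example.

```python
from typing import Dict, List

Fields = Dict[str, str]  # field name -> extracted string value


def normalize(value: str) -> str:
    """Collapse whitespace and lowercase so trivial differences do not count as errors."""
    return " ".join(value.split()).lower()


def field_accuracy(preds: List[Fields], golds: List[Fields]) -> float:
    """Criterion (b): share of ground-truth fields whose predicted value matches exactly."""
    total = 0
    correct = 0
    for pred, gold in zip(preds, golds):
        for name, gold_value in gold.items():
            total += 1
            if name in pred and normalize(pred[name]) == normalize(gold_value):
                correct += 1
    return correct / total if total else 0.0


def document_coverage(
    preds: List[Fields],
    golds: List[Fields],
    required: List[str],
    min_fraction: float = 1.0,
) -> float:
    """Criterion (a): share of documents with at least `min_fraction` of the
    required fields extracted correctly (the threshold is a hypothetical choice)."""
    if not golds or not required:
        return 0.0
    sufficient = 0
    for pred, gold in zip(preds, golds):
        hits = sum(
            1
            for name in required
            if name in pred
            and name in gold
            and normalize(pred[name]) == normalize(gold[name])
        )
        if hits / len(required) >= min_fraction:
            sufficient += 1
    return sufficient / len(golds)


# Made-up ground truth and predictions for two invoices (illustrative values only).
golds = [
    {"date": "12/03/2023", "gross_amount": "120.50", "tax_code": "RSSMRA80A01H501U"},
    {"date": "01/07/2023", "gross_amount": "89.00", "tax_code": "VRDLGU75B02F205X"},
]
preds = [
    {"date": "12/03/2023", "gross_amount": "120.50", "tax_code": "rssmra80a01h501u"},
    {"date": "01/07/2023", "gross_amount": "98.00"},  # wrong amount, missing tax code
]
required = ["date", "gross_amount", "tax_code"]

print(f"field accuracy:    {field_accuracy(preds, golds):.2f}")
print(f"document coverage: {document_coverage(preds, golds, required):.2f}")
```

Keeping the two measures separate seems useful here: a system can score well on individual fields while still leaving many invoices without all the information a downstream process needs, and the reverse can also happen when the threshold is lenient.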
Required skills machine learning, deep learning, Python, PyTorch
Proposal valid until 20/02/2024