

Information extraction from semi-structured documents

Keywords: NATURAL LANGUAGE PROCESSING, DEEP NEURAL NETWORKS, COMPUTER VISION

Reference persons: FABRIZIO LAMBERTI, LIA MORRA

Research groups: DAUIN - GR-09 - GRAphics and INtelligent Systems - GRAINS

Description: The main goal of this thesis is to design and implement a proof of concept that extracts information items (such as date, gross amount, and tax code) from scanned semi-structured documents (invoices), in close collaboration with a leading insurance company. The proposed system will exploit recent advances in self-supervised vision-language-layout transformer models, which analyze each document element based on its content (graphical and textual) as well as its position and its relationship to the rest of the document [1,2]. Starting from a generic model pre-trained on English-language data, the system will be fine-tuned for information extraction on Italian documents. Performance will be evaluated with standard and ad-hoc information extraction metrics reflecting (a) the percentage of documents for which information can be sufficiently extracted and (b) the accuracy of the extracted information. Experience with deep learning and PyTorch is a prerequisite. Good programming and analytical skills are required.
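The two evaluation criteria (a) and (b) could be computed along the following lines. This is only a minimal sketch under stated assumptions: the proposal does not define the metrics, so the function names, the exact-match comparison after simple normalization, the `min_fraction` threshold used to decide when a document counts as "sufficiently extracted", and the sample values are all hypothetical.

```python
from typing import Dict, List, Optional

# One document = mapping from field name to extracted value (None if missing).
Fields = Dict[str, Optional[str]]


def normalize(value: Optional[str]) -> Optional[str]:
    """Lower-case and collapse whitespace so trivial formatting differences are ignored."""
    if value is None:
        return None
    return " ".join(value.split()).lower()


def correct_fields(pred: Fields, ref: Fields) -> int:
    """Count reference fields whose predicted value matches after normalization."""
    return sum(
        normalize(pred.get(name)) == normalize(expected)
        for name, expected in ref.items()
    )


def field_accuracy(predictions: List[Fields], references: List[Fields]) -> float:
    """Criterion (b): fraction of all reference fields extracted correctly."""
    total = sum(len(ref) for ref in references)
    correct = sum(correct_fields(p, r) for p, r in zip(predictions, references))
    return correct / total if total else 0.0


def document_coverage(
    predictions: List[Fields], references: List[Fields], min_fraction: float = 1.0
) -> float:
    """Criterion (a): fraction of documents in which at least `min_fraction`
    of the reference fields are correct (assumed definition of "sufficient")."""
    scored = [(p, r) for p, r in zip(predictions, references) if r]
    if not scored:
        return 0.0
    covered = sum(correct_fields(p, r) / len(r) >= min_fraction for p, r in scored)
    return covered / len(scored)


# Made-up ground truth and model output for two invoices (illustrative only).
references = [
    {"date": "12/01/2023", "gross_amount": "120.50", "tax_code": "RSSMRA80A01H501U"},
    {"date": "03/02/2023", "gross_amount": "89.00", "tax_code": "VRDLGI75B02F205X"},
]
predictions = [
    {"date": "12/01/2023", "gross_amount": "120.50", "tax_code": "rssmra80a01h501u"},
    {"date": "03/02/2023", "gross_amount": "98.00", "tax_code": None},
]

print(f"field accuracy:    {field_accuracy(predictions, references):.2f}")     # 0.67
print(f"document coverage: {document_coverage(predictions, references):.2f}")  # 0.50
```

In this sketch the second invoice has only one of three fields correct, so it counts as covered only if `min_fraction` is lowered to one third or less. An actual evaluation would likely need field-specific comparison rules, for example tolerant date and amount parsing.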

Suggested reading:
[1] Unifying Vision, Text, and Layout for Universal Document Processing. https://arxiv.org/pdf/2212.02623.pdf
[2] OCR-free Document Understanding Transformer. https://arxiv.org/pdf/2111.15664.pdf

Required skills: machine learning, deep learning, Python, PyTorch


Proposal valid until: 20/02/2024




© Politecnico di Torino
Corso Duca degli Abruzzi, 24 - 10129 Torino, ITALY