DAUIN - GR-09 - GRAphics and INtelligent Systems - GRAINS

Information extraction from semi-structured documents

keywords COMPUTER VISION, DEEP LEARNING, NATURAL LANGUAGE PROCESSING, TRANSFORMERS

Reference persons FABRIZIO LAMBERTI, LIA MORRA

Research Groups DAUIN - GR-09 - GRAphics and INtelligent Systems - GRAINS

Description The main goal of this thesis is to design and implement a proof of concept for extracting information items (such as date, gross amount, and tax code) from scanned semi-structured documents (invoices), in close collaboration with a leading insurance company. The proposed system will exploit recent advances in self-supervised vision-language-layout transformer models, which analyze each document element based on its content (graphical and textual) as well as its position and relationship to the rest of the document [1,2]. Starting from a generic model pre-trained on English, the system will be fine-tuned for information extraction on Italian documents. Performance will be evaluated with standard and ad-hoc information extraction metrics reflecting (a) the percentage of documents for which information can be sufficiently extracted and (b) the accuracy of the extracted information. Experience with deep learning and PyTorch is a prerequisite. Good programming and analytical skills are required.

Suggested reading:
[1] Unifying Vision, Text, and Layout for Universal Document Processing. https://arxiv.org/pdf/2212.02623.pdf
[2] OCR-free document understanding transformer. https://arxiv.org/pdf/2111.15664.pdf
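
The two evaluation criteria named in the description, (a) the share of documents with sufficiently extracted information and (b) the accuracy of the extracted items, could be sketched roughly as follows. This is only an illustrative sketch, not the metrics the project will actually adopt: the field names, the exact-match rule after normalization, the `min_correct` threshold used to decide what "sufficiently extracted" means, and the sample values are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical field names, borrowed from the examples in the description.
FIELDS = ("date", "gross_amount", "tax_code")

Record = Dict[str, str]


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(value.lower().split())


def correct_fields(pred: Record, gold: Record, fields: Sequence[str]) -> int:
    """Count the fields whose predicted value equals the gold value after normalization."""
    return sum(
        normalize(pred.get(f, "")) == normalize(gold.get(f, "")) for f in fields
    )


def field_accuracy(
    preds: List[Record], golds: List[Record], fields: Sequence[str] = FIELDS
) -> float:
    """Criterion (b): fraction of (document, field) pairs extracted correctly."""
    total = len(golds) * len(fields)
    if total == 0:
        return 0.0
    hits = sum(correct_fields(p, g, fields) for p, g in zip(preds, golds))
    return hits / total


def document_coverage(
    preds: List[Record],
    golds: List[Record],
    fields: Sequence[str] = FIELDS,
    min_correct: Optional[int] = None,
) -> float:
    """Criterion (a): fraction of documents with at least `min_correct` correct fields.

    Assumption: "sufficiently extracted" defaults to "all fields correct".
    """
    if not golds:
        return 0.0
    threshold = len(fields) if min_correct is None else min_correct
    ok = sum(correct_fields(p, g, fields) >= threshold for p, g in zip(preds, golds))
    return ok / len(golds)


# Made-up placeholder values, for illustration only.
golds = [
    {"date": "12/01/2024", "gross_amount": "150,00", "tax_code": "CODE-A"},
    {"date": "03/02/2024", "gross_amount": "89,90", "tax_code": "CODE-B"},
]
preds = [
    {"date": "12/01/2024", "gross_amount": "150,00", "tax_code": "code-a"},
    {"date": "03/02/2024", "gross_amount": "98,90"},  # wrong amount, missing tax code
]

print(f"field accuracy:    {field_accuracy(preds, golds):.2f}")
print(f"document coverage: {document_coverage(preds, golds):.2f}")
```

In practice the matching rule would likely be field-specific (for example, parsing dates and amounts before comparing them), and the coverage threshold would be agreed with the industrial partner.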

Required skills machine learning, deep learning, Python, PyTorch


Deadline 20/02/2024




© Politecnico di Torino
Corso Duca degli Abruzzi, 24 - 10129 Torino, ITALY