Politecnico di Torino | Teaching Services


Engineering Area

AI models for high-level semantic image interpretation

Keywords: IMAGE ANALYSIS, DEEP LEARNING, VISUAL BIG DATA

Research groups: DAUIN - GR-09 - GRAphics and INtelligent Systems - GRAINS

Description: Context: Social media platforms have a profound impact on the way individuals choose to (re)present themselves in the digital era. Through the analysis of visual big data, we seek to understand how face representations have changed over time. Within the context of the FACETS (Face Aesthetics in Contemporary E-Technological Societies) project, we have collected user profile images from Facebook and Instagram. In collaboration with the Department of Philosophy at Università di Torino, we are also developing visual big data analytics tools based on computational image analysis and deep learning techniques that borrow from other disciplines, such as socio-semiotics and visual semiotics. A computational pipeline, FRESCO, has already been developed to extract semantic characteristics (composition, content, etc.) from images.

Open research questions to be tackled in one or more research theses include:

- Design new tools and techniques that extend the existing computational pipeline to extract further semantic characteristics (composition, content, etc.) from images

- Develop a user-friendly analytics tool that applies the FRESCO pipeline to image collections to extract information about culturally relevant aspects (e.g., how does gender affect the kind of images we publish on social media? How does self-representation change over time?) and to support applications in specific domains (e.g., marketing)

- Adapt the proposed pipeline to the analysis of other types of images (advertisements, memes, artworks, AI-generated images, …). Particularly relevant is the use of the proposed pipeline to detect and quantify biases in AI-generated images

- Design, train and evaluate deep neural networks that mimic high-level semantic analysis (Which emotions are elicited by a given image? Which values are expressed?)

- Investigate the applicability of multimodal Large Language Models (LLMs), such as Gemini, to extract high-level interpretations from images and reproduce the type of in-depth analysis performed by experts, such as semioticians
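
As an illustration of what "quantifying bias" in the third research question could look like in practice, the sketch below compares how one semantic attribute is distributed in a reference image collection and in a set of AI-generated images. It is a minimal example under stated assumptions, not part of FRESCO: the record format, the attribute name `framing`, its values, and the sample data are all hypothetical, and total variation distance is only one of several measures that could be used.

```python
from collections import Counter


def attribute_distribution(records, attribute):
    """Return the relative frequency of each value taken by `attribute`."""
    if not records:
        raise ValueError("records must not be empty")
    counts = Counter(record[attribute] for record in records)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def total_variation_distance(p, q):
    """Half the L1 distance between two discrete distributions.

    0.0 means identical distributions, 1.0 means disjoint support.
    """
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)


# Hypothetical per-image records, standing in for the output of an
# image-analysis pipeline (made-up attribute and values, not real data).
reference = [
    {"framing": "close-up"}, {"framing": "close-up"},
    {"framing": "half-body"}, {"framing": "full-body"},
]
generated = [
    {"framing": "close-up"}, {"framing": "close-up"},
    {"framing": "close-up"}, {"framing": "half-body"},
]

p = attribute_distribution(reference, "framing")
q = attribute_distribution(generated, "framing")
tvd = total_variation_distance(p, q)
print(f"Total variation distance for 'framing': {tvd:.2f}")
```

A larger distance would suggest that the generated images over- or under-represent some attribute values relative to the reference collection. Whether that difference is meaningful depends on sample size and on how the reference collection was chosen.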

Proposal valid until: 26/02/2025