Generative AI models for enhanced text-to-image synthesis
keywords ARTIFICIAL INTELLIGENCE, DEEP LEARNING, COMPUTER V, GENERATIVE AI
Reference persons LIA MORRA
Research Groups DAUIN - GR-09 - GRAphics and INtelligent Systems - GRAINS
Description While current generative text-to-image latent diffusion models have reached unprecedented visual fidelity, exerting precise control over the generated images remains an open problem. Generative models have difficulty producing correct images when the textual prompt contains many details, and they often fail at object placement and spatial awareness. Recent text-to-image latent diffusion models have shown substantial improvements in prompt following, yet still struggle with spatial terms such as “left” or “behind”. One possible reason lies in the inherent limitations of the text embedding used to condition the generation process, which fails to learn sufficiently detailed and disentangled representations. This research project aims to overcome these limitations by exploiting more structured representations for conditioning the image generation process.
After an analysis of the state of the art in text-to-image synthesis, the candidate will: i) define a benchmark by identifying or creating a suitable dataset of challenging prompts, biased outputs and failures, based on an extensive review of open and closed source systems as well as the relevant literature; ii) investigate new architectures that exploit richer, more structured representations, such as scene graphs, as an intermediate step to disambiguate textual prompts, incorporate greater spatial awareness and increase control over image composition.
Required skills Skills possessed or to be acquired: Generative models, latent diffusion models; strong analytical skills and general interest in research-oriented projects.
Deadline 26/02/2025