Compensating dataset bias with high-level semantic prior
Reference person LIA MORRA
Research Groups DAUIN - GR-09 - GRAphics and INtelligent Systems - GRAINS
Description Deep neural networks, such as convolutional neural networks, are sensitive to dataset bias, which arises when the distribution of training samples is skewed and conflates the target class with unrelated attributes. For instance, photos of wolves may frequently have a snowy background, photos of cats are typically taken indoors, photos of bikes are typically associated with sunny weather, certain professions are typically associated with a certain gender, and so forth. As a consequence, the network may learn to “take shortcuts” and rely on these spurious attributes to make predictions, failing to generalize to unseen instances: a CNN may base its predictions on the background rather than the object, or on undesirable attributes such as gender. Typically, this is corrected by verifying the predictions (e.g., through post-hoc explanations), adjusting the dataset (e.g., by acquiring new data), and re-training the network. The goal of this thesis is to investigate neuro-symbolic techniques through which deep neural networks can be trained to “comply” with desiderata expressed as logical constraints (e.g., “a doctor can be a man or a woman with equal probability”), preventing dataset bias from affecting the training process. Strong programming and analytical skills are required. Experience with Tensorflow/Keras is preferred, or will be acquired during the thesis.
Faster-LTN: a neuro-symbolic, end-to-end object detection architecture https://arxiv.org/abs/2107.01877
Shortcut learning in deep neural networks https://arxiv.org/abs/2004.07780
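To give a flavor of the idea, a logical desideratum such as “a doctor can be a man or a woman with equal probability” can be relaxed into a differentiable penalty added to the task loss during training. The following is a minimal NumPy sketch under illustrative assumptions (the function name, the binary group attribute, and the absolute-difference penalty are all hypothetical choices, not the Faster-LTN formulation, which grounds full first-order formulas via fuzzy logic):

```python
import numpy as np

def constraint_penalty(probs, group):
    """Soft penalty for the constraint 'P(doctor) should not depend on gender'.

    probs: predicted probabilities P(doctor) for a batch of samples.
    group: binary protected attribute (0/1) for each sample.
    Returns the absolute gap between the mean predicted probability in
    the two groups; zero means the constraint is fully satisfied.
    """
    p0 = probs[group == 0].mean()
    p1 = probs[group == 1].mean()
    return abs(p0 - p1)

# Toy batch: the model predicts 'doctor' mostly for group 0.
probs = np.array([0.9, 0.8, 0.2, 0.1])
group = np.array([0, 0, 1, 1])
penalty = constraint_penalty(probs, group)
print(penalty)  # gap of ~0.7 between the two group means
```

In a Keras training loop, such a term would be computed with differentiable tensor operations and added to the classification loss with a weighting coefficient, so that gradient descent trades off task accuracy against constraint satisfaction.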
Deadline 11/07/2022 SUBMIT YOUR APPLICATION