Telecommunication Networks Group
Theses at Politecnico
Multi-view Classification Using an Autonomous Driving Simulator
External reference persons Marco Palena
Research Groups Telecommunication Networks Group
Thesis type DESIGN AND EXPERIMENTS, EXPERIMENTAL AND SIMULATION
Description In modern Intelligent Transportation Systems (ITSs), both Connected Vehicles (CVs) and Road-Side Units (RSUs) collect large amounts of data from multiple sensors, including LiDAR, cameras, GPS/IMU, radar, etc. Vehicle-to-Infrastructure (V2I) and Infrastructure-to-Infrastructure (I2I) communications allow such data to be shared over the network so that it can be processed either at the edge or in the cloud. With the advent of Multi-access Edge Computing (MEC), an ever-increasing share of this processing is shifting toward the edge of the network.
Object classification from visual data plays a key role in ITSs. A diverse range of ITS tasks, such as Vehicle Classification, Driver Identification and Pedestrian Detection, relies on some form of vision-based object recognition. State-of-the-art image classification algorithms are based on Convolutional Neural Networks (CNNs), deep learning models that require training over extensive image datasets.
For complex classification tasks, the visual information conveyed by a single image may be insufficient to make accurate predictions. Fusing data coming from different sensors can help overcome this limitation. In the context of an ITS, multiple agents (be they CVs or RSUs) may use different types of sensors to capture data with similar high-level semantics, for instance when sensing the same area along a road.
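One simple way to combine predictions from several viewpoints is score-level late fusion: each view is classified independently and the resulting class probabilities are averaged. The sketch below is illustrative only and not part of the thesis specification; the function names (`softmax`, `late_fusion`) and the choice of averaging as the fusion rule are assumptions for the example.

```python
import math

def softmax(scores):
    """Convert a list of raw per-class scores into probabilities."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def late_fusion(per_view_scores):
    """Fuse predictions from multiple views of the same object by
    averaging each view's class probabilities (score-level late fusion).

    per_view_scores: one list of raw class scores per view.
    Returns the index of the class with the highest fused probability.
    """
    probs = [softmax(scores) for scores in per_view_scores]
    n_views = len(probs)
    n_classes = len(probs[0])
    fused = [sum(p[c] for p in probs) / n_views for c in range(n_classes)]
    return fused.index(max(fused))
```

For example, if two camera views both assign their highest score to class 0, the fused prediction is class 0; when views disagree, the averaged probabilities decide. More elaborate schemes (e.g., feature-level fusion inside a CNN) are among the approaches such a dataset would enable studying.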
Most of the available datasets for computer vision tasks in an ITS provide multi-sensor data collected from the perspective of a single sensing agent. For instance, the KITTI dataset provides a real-world benchmark for computer vision tasks combining stereo camera, LiDAR and GPS/IMU data collected from a single vehicle. With some notable exceptions (like OPV2V), few available datasets aggregate data collected from multiple sensing agents (be they CVs or RSUs).
In this context, we are interested in studying schemes to efficiently fuse image data depicting the same object from multiple perspectives, i.e., collected by different agents, for the purpose of classification (a problem known as multi-view object classification). The aim of this thesis is to generate a synthetic dataset for the multi-view classification task in the context of ITSs. The student will work with CARLA, an open-source simulator for autonomous driving research, to generate multi-view samples for supervised learning. Each sample will consist of a class label and a collection of images captured by virtual cameras placed in the simulated environment, depicting the same object from the perspective of multiple static and/or dynamic actors.
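The sample structure described above (one class label plus one image per viewpoint) could be represented as follows. This is a minimal sketch, not the thesis deliverable; the class name `MultiViewSample`, the viewpoint identifiers, and the file paths are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class MultiViewSample:
    """One supervised-learning sample: a class label plus images of the
    same object as captured from several simulated viewpoints (static
    RSU cameras and/or cameras mounted on moving vehicles)."""
    label: str                                  # e.g. "car", "pedestrian"
    views: dict = field(default_factory=dict)   # viewpoint id -> image path

    def add_view(self, viewpoint_id, image_path):
        """Record the image captured from one viewpoint."""
        self.views[viewpoint_id] = image_path

    def is_complete(self, expected_viewpoints):
        """True once every expected viewpoint has captured an image,
        so the sample can be written to the dataset."""
        return set(expected_viewpoints) <= set(self.views)
```

In a CARLA-based pipeline, each viewpoint would correspond to an RGB camera sensor spawned in the simulation; frames sharing the same simulation timestamp and target object would be grouped into one such sample.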
Required skills Knowledge in Python programming
Deadline 06/12/2024