Theses at Politecnico

Programming Robotic Arms via NLP

Thesis abroad


Keywords DEEP LEARNING, MACHINE LEARNING, NATURAL LANGUAGE PROCESSING, VIDEO PROCESSING

Reference persons GUIDO MARCHETTO, ALESSIO SACCO

External reference persons Prof. Flavio Esposito, Saint Louis University, USA

Research Groups DAUIN - GR-03 - COMPUTER NETWORKS GROUP - NETGROUP

Description This activity focuses on enhancing the control capabilities of a robotic arm through a language-based vision control system. The implementation involves setting up the Robot Operating System 2 (ROS 2) to control the robot. Speech-to-text functionality is integrated by selecting a speech recognition service or library compatible with ROS, capturing audio input, and converting it to text; a recent LLM, e.g., ChatGPT, then generates the corresponding commands. Integrating the LLM involves obtaining API keys, programmatically sending user inputs to the model, and processing the model's outputs to determine subsequent actions. A software bridge is coded to parse these responses: it interprets the natural language input, extracts actionable commands, and translates them into ROS-compatible messages to control the robot.
The last step involves creating a custom General Behavior Tree (GBT) with specific data and incorporating deep learning models, such as YOLO, for real-time object detection and recognition.
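
As an illustration of the software bridge described above, the following is a minimal sketch in Python, not the project's actual implementation. It assumes the LLM has been prompted to answer with a JSON object, possibly surrounded by extra prose. The action vocabulary, the `ArmCommand` type, and all function names are hypothetical and chosen only for this example. ROS 2 itself is not imported, so the sketch runs on its own; the last helper only prepares the x, y, z values that a `geometry_msgs/msg/Point` message would carry.

```python
import json
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical command vocabulary; the real set depends on the robot and task.
ALLOWED_ACTIONS = {"move_to", "pick", "place", "open_gripper", "close_gripper"}


@dataclass(frozen=True)
class ArmCommand:
    """A validated command extracted from an LLM reply (illustrative type)."""
    action: str
    target: Optional[str] = None
    position: Optional[Tuple[float, ...]] = None


def extract_json(reply: str) -> str:
    """Return the span from the first '{' to the last '}' in the reply."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in LLM reply")
    return reply[start:end + 1]


def parse_llm_reply(reply: str) -> ArmCommand:
    """Parse and validate the reply; raise ValueError on anything unexpected."""
    data = json.loads(extract_json(reply))  # JSONDecodeError is a ValueError
    action = data.get("action")
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    position = data.get("position")
    if position is not None:
        if not isinstance(position, list) or len(position) != 3:
            raise ValueError("position must be a list of three numbers")
        position = tuple(float(v) for v in position)
    return ArmCommand(action=action, target=data.get("target"), position=position)


def to_point_fields(cmd: ArmCommand) -> Dict[str, float]:
    """Map a command's position to the x, y, z fields of a point message."""
    if cmd.position is None:
        raise ValueError(f"action {cmd.action!r} carries no position")
    x, y, z = cmd.position
    return {"x": x, "y": y, "z": z}


if __name__ == "__main__":
    reply = 'Sure, here is the command: {"action": "move_to", "position": [0.1, 0.2, 0.3]}'
    command = parse_llm_reply(reply)
    print(command)
    print(to_point_fields(command))
```

Rejecting any action outside a fixed vocabulary keeps free-form LLM output from reaching the robot unchecked. In a ROS 2 node, the validated fields would then be copied into a message and published, a step left out here.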


Deadline 05/02/2025




© Politecnico di Torino
Corso Duca degli Abruzzi, 24 - 10129 Torino, ITALY