Stress testing chatbots: evaluating factuality, reasoning, abstraction, and other safety challenges
Keywords AI AUDIT, AI ETHICS, ARTIFICIAL INTELLIGENCE, CHAT-GPT, CHATBOT, DATA SCIENCE, RESPONSIBLE AI, SOFTWARE ENGINEERING
Reference persons ANTONIO VETRO'
Research Groups DAUIN - GR-16 - SOFTWARE ENGINEERING GROUP - SOFTENG, DAUIN - GR-22 - Nexa Center for Internet & Society - NEXA
Thesis type RESEARCH / EXPERIMENTAL, RESEARCH, INNOVATIVE
Description Introduction:
Chatbots have become increasingly popular in recent years, with many industries using them to provide customer support, automate conversations, and even offer therapy services. However, chatbots are still in their early stages of development, and there is a need to assess their capabilities and limitations. In this thesis, the student will conduct a series of stress tests for chatbots that evaluate their factuality, reasoning, abstraction, and other safety challenges.
Research Questions:
How accurate and reliable are chatbots in providing factual information?
To what extent can chatbots conduct deductive or inductive reasoning?
How can chatbots simulate abstraction of concepts and provide meaningful responses?
How effectively can chatbots engage in emotional conversations?
What other risks do chatbot conversations embed (e.g., privacy, disinformation, harms of representation, harmful content, cybersecurity)?
Methodology:
To conduct this study, the student will use a mixed-methods approach that includes both quantitative and qualitative data collection and analysis. The study will involve testing existing chatbots (e.g., ChatGPT) by simulating conversations in different scenarios. Each chatbot will be asked questions drawn from different risk categories and asked to provide information on a range of topics, including science, history, and current events.
To evaluate the chatbot's performance, the student will conduct a series of stress tests that assess chatbots' factuality, reasoning, abstraction, and other safety challenges. The student might also conduct surveys and interviews to gather feedback from other people about their experience with the chatbots.
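As a rough illustration of the quantitative part of such stress tests, the sketch below scores a chatbot per risk category. It is only a minimal example under stated assumptions, not the method prescribed by this proposal: the names StressCase, run_stress_tests, and stub_chatbot are hypothetical, the stub stands in for a call to a real chatbot, and the substring check is a deliberately naive stand-in for a proper evaluation of answers.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class StressCase:
    """One hypothetical test item: a prompt and a naive pass criterion."""
    category: str  # e.g. "factuality" or "reasoning"
    prompt: str
    expected: str  # substring that a correct answer is assumed to contain


def run_stress_tests(chatbot: Callable[[str], str],
                     cases: List[StressCase]) -> Dict[str, float]:
    """Return the fraction of passed cases for each category."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for case in cases:
        answer = chatbot(case.prompt)
        total[case.category] += 1
        # Assumption: a case-insensitive substring match counts as a pass.
        if case.expected.lower() in answer.lower():
            passed[case.category] += 1
    return {category: passed[category] / total[category] for category in total}


def stub_chatbot(prompt: str) -> str:
    """Placeholder for a real chatbot; returns canned answers only."""
    canned = {
        "What is the capital of Italy?": "The capital of Italy is Rome.",
        "If all cats are mammals and Tom is a cat, is Tom a mammal?": "No.",
    }
    return canned.get(prompt, "I don't know.")


cases = [
    StressCase("factuality", "What is the capital of Italy?", "Rome"),
    StressCase("reasoning",
               "If all cats are mammals and Tom is a cat, is Tom a mammal?",
               "Yes"),
]
scores = run_stress_tests(stub_chatbot, cases)
print(scores)
```

In practice the stub would be replaced by a function that queries the chatbot under test, and the pass criterion would likely need human review or a more robust scoring scheme, especially for the qualitative risk categories.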
Expected Outcome:
This study aims to provide insights into the limitations and capabilities of chatbots, and it will contribute to the ongoing debate about their ethical implications.
Required skills Good programming skills and basic knowledge of common data analytics tools and techniques. Curiosity. A grade point average of 26 or higher may be used as a criterion for selecting candidates.
Notes When sending your application, we kindly ask you to attach the following information:
- list of exams taken in your master's degree, with grades and grade point average
- a résumé or equivalent (e.g., LinkedIn profile), if you already have one
- by when you aim to graduate and an estimate of the time you can devote to the thesis in a typical week
Deadline 17/11/2024