Smart crawler for Telegram Channels
External reference persons Nikhil Jha
Thesis type EXPERIMENTAL
Description Telegram is a chat platform that offers private means of communication. For this reason, malicious users abuse the platform to share and sell illegal content, e.g., stolen personal data and illegal material.
The thesis focuses on the design, engineering and testing of an automatic crawler that can join Telegram channels, look for messages, extract links to other channels, and then keep exploring the graph of channels and users to automate data collection. The goal is to create a smart crawler that can filter and prioritize the exploration of links and channels (based on their language, their content, their likelihood of containing malicious messages, etc.) to speed up the exploration and improve the quality of the collected information. To this end, we will use machine learning and NLP methodologies to decide which resources to explore.
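The exploration loop described above can be illustrated with a minimal sketch of a prioritized crawl frontier. Everything here is an assumption made for illustration and not the existing prototype: `fetch_messages` is a hypothetical callback standing in for real Telegram access, `score_message` is a toy keyword score standing in for the machine learning and NLP models the thesis would develop, and the regular expression only covers public `t.me/<username>` links.

```python
import heapq
import re

# Public channel links such as https://t.me/name, t.me/s/name or t.me/name/123.
TME_LINK = re.compile(
    r"(?<![\w.])(?:https?://)?t\.me/(?:s/)?([A-Za-z][A-Za-z0-9_]{4,31})\b",
    re.IGNORECASE,
)
# Path prefixes that are not channel usernames (illustrative, not exhaustive).
RESERVED = {"joinchat", "addstickers", "share"}


def extract_channels(text):
    """Return the channel usernames linked in a message, lowercased and deduplicated."""
    found = []
    for name in TME_LINK.findall(text):
        name = name.lower()
        if name not in RESERVED and name not in found:
            found.append(name)
    return found


def score_message(text, keywords):
    """Toy relevance score in [0, 1]: fraction of keywords present (stand-in for a model)."""
    lowered = text.lower()
    return sum(k in lowered for k in keywords) / len(keywords)


def crawl(seeds, fetch_messages, keywords, max_channels=100):
    """Visit channels in priority order; fetch_messages(channel) yields message texts."""
    # heapq is a min-heap, so priorities are stored negated.
    frontier = [(-1.0, name) for name in seeds]
    heapq.heapify(frontier)
    queued = set(seeds)
    visited = []
    while frontier and len(visited) < max_channels:
        _, channel = heapq.heappop(frontier)
        visited.append(channel)
        for text in fetch_messages(channel):
            priority = score_message(text, keywords)
            for linked in extract_channels(text):
                # Simplification: a channel keeps the score of the first message linking it.
                if linked not in queued:
                    queued.add(linked)
                    heapq.heappush(frontier, (-priority, linked))
    return visited


# Hypothetical in-memory data replacing real Telegram access.
FAKE_MESSAGES = {
    "seed_channel": [
        "leaked database for sale at t.me/darkmarket1",
        "cute cats: https://t.me/catpics_daily",
    ],
    "darkmarket1": ["more stolen data https://t.me/s/dumps_hub"],
}
order = crawl(
    ["seed_channel"],
    lambda channel: FAKE_MESSAGES.get(channel, []),
    keywords=["leaked", "stolen", "sale"],
)
print(order)
```

In this sketch, channels linked from higher-scoring messages are visited first, so a trained classifier could replace `score_message` without changing the crawl loop.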
The second part of the thesis will focus on the analysis of the data, developing machine learning, graph mining and AI algorithms to automatically classify the information, flag possible abuse, and support analysts in extracting valuable information from the raw data.
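As one small illustration of the graph mining step, the sketch below ranks channels by how many distinct channels link to them. It is only an assumed starting point, not a method prescribed by the thesis: the `edges` list of `(source, destination)` pairs is hypothetical data, and in-degree is just one of many signals an analyst might use.

```python
from collections import defaultdict


def rank_by_in_degree(edges):
    """Rank channels by the number of distinct channels linking to them."""
    sources = defaultdict(set)
    for src, dst in edges:
        if src != dst:  # ignore self-links
            sources[dst].add(src)
    # Highest in-degree first; ties are broken alphabetically.
    return sorted(
        ((dst, len(srcs)) for dst, srcs in sources.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical channel-to-channel links collected by a crawler.
edges = [("a", "hub"), ("b", "hub"), ("a", "hub"), ("hub", "c"), ("c", "c")]
ranking = rank_by_in_degree(edges)
print(ranking)
```

Duplicate links between the same pair of channels count once, so a single channel cannot inflate the rank of another by linking to it repeatedly.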
The thesis will build on a crawler prototype developed at SmartData@PoliTO that is currently able to crawl the Telegram platform in a scalable manner.
Required skills - Interest in cyber-security
- Interest in machine learning and AI algorithms
- Good programming skills (Python)
- Good knowledge of machine learning classifiers
Deadline 30/04/2024