Generalizable language models for network measurement and cybersecurity
Reference persons MARCO MELLIA, LUCA VASSIO
Research Groups DATABASE AND DATA MINING GROUP - DBDM, SmartData@PoliTO, Telecommunication Networks Group
Description Firewalls/IPS, EDR, and cloud security services analyze huge amounts of structured data to detect and classify threats, relying mainly on human-written rules.
The goal of the thesis is to understand whether lightweight and generalizable language models can extract insights from raw data. A key objective is to ensure that the models generalize beyond syntactic heuristics. A possible solution is to create multi-modal embeddings that conceptually constrain the learned representations towards the right task.
Thesis Goal
- Propose techniques based on language models for identifying network traffic threats
- Ensure generalization beyond simple rules by crafting proper training and validation data
- Use techniques based on multi-modal embeddings (similar to OpenAI CLIP)
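As an illustration of the multi-modal embedding idea in the last goal, the following is a minimal sketch of a CLIP-style symmetric contrastive loss, assuming each traffic record is paired with one textual description. It is not the method the thesis will necessarily adopt. The function names, the traffic/text pairing, and the temperature value of 0.07 are assumptions made for this example; a real implementation would use a tensor library and learned encoders rather than plain Python lists.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct column for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def clip_style_loss(traffic_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    traffic_emb[i] and text_emb[i] are assumed to describe the same sample
    (hypothetical pairing, e.g. a flow record and its threat description).
    """
    a = [l2_normalize(v) for v in traffic_emb]
    b = [l2_normalize(v) for v in text_emb]
    # Pairwise cosine similarities, scaled by the temperature.
    logits = [[sum(x * y for x, y in zip(u, w)) / temperature for w in b] for u in a]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the traffic-to-text and text-to-traffic directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits_t))


if __name__ == "__main__":
    traffic = [[1.0, 0.0], [0.0, 1.0]]
    matched = [[1.0, 0.0], [0.0, 1.0]]
    swapped = [[0.0, 1.0], [1.0, 0.0]]
    print("matched pairs:", clip_style_loss(traffic, matched))
    print("swapped pairs:", clip_style_loss(traffic, swapped))
```

The loss is small when each traffic embedding is closest to its own description and large when the pairs are mismatched, which is the mechanism that pulls the two modalities into a shared space.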
Required skills - Good programming skills (such as Python and Spark)
- Machine learning knowledge (such as PyTorch or TensorFlow)
- Basics of NLP
- Basics of Networking and security
Notes Possible graduation prize of 2000 euros.
A GPA of at least 27/30 is required.
Deadline 25/11/2025