Design, Implementation and Analysis of a Monitoring System for a Big Data Cluster

keywords BIG DATA ANALYSIS, MONITORING, HADOOP, SPARK

Reference persons MARTINO TREVISAN

External reference persons Idilio Drago (Prof. Università di Torino)

Research Groups SmartData@PoliTO

Description Large quantities of data are typically processed on Big Data clusters, sets of servers configured to operate in a coordinated way. These clusters run various frameworks to orchestrate their resources; the most popular are Apache Hadoop, Spark and Kubernetes.
To ensure the proper operation of a Big Data cluster, it is fundamental to collect, process and analyze the large volume of log files generated by servers, agents and applications. The goal of the thesis is to design and implement a system to collect the telemetry of the Big Data cluster of Politecnico di Torino. Once collected, the data shall be analyzed using state-of-the-art Data Science approaches to mine rules, with the goal of unveiling and understanding malfunctions and misconfigurations.

Required skills Linux systems, Data Science, Big Data, Basic Networking


Deadline 04/10/2022