Politecnico di Torino | Teaching Services

Design, Implementation and Analysis of a Monitoring System for a Big Data Cluster

Keywords BIG DATA ANALYSIS, MONITORING, HADOOP, SPARK

External references Idilio Drago (Prof., Università di Torino)

Description Large quantities of data are typically processed on Big Data clusters: sets of servers configured to operate in coordination. These clusters run various frameworks to orchestrate their resources; among the most popular are Apache Hadoop, Spark and Kubernetes.
To ensure the proper operation of a Big Data cluster, it is fundamental to collect, process and analyze the large volume of log files generated by servers, agents and applications. The goal of the thesis is to design and implement a system to collect telemetry from the Big Data cluster of Politecnico di Torino. Once the data are collected, they will be analyzed using state-of-the-art Data Science approaches to mine rules, with the goal of uncovering and understanding malfunctions and misconfigurations.

Required skills Linux systems
Data Science
Big Data
Basic Networking

Proposal valid until 04/10/2022