Energy/performance tradeoff in ML-enhanced cloud networks
Thesis in external company Thesis abroad
Reference persons PAOLO GIACCONE
External reference persons Leonardo Linguaglossa, Telecom Paris, email@example.com
Research Groups Telecommunication Networks Group
Thesis type EXPERIMENTAL - DEVELOPMENT, RESEARCH
Description Alongside novel paradigms such as Software-Defined Networking (SDN) and Network Function Virtualization (NFV), we are witnessing a process of network softwarization, which consists in replacing hardware black-box devices with software white-box components, or virtual network functions (VNFs), that implement an equivalent network function. In this context, a server owner may provide resources such as CPUs, Network Interface Cards (NICs), or storage to tenants under a Service-Level Agreement (SLA). Tenants can then use the allocated resources to deploy the VNFs that make up a high-level application such as a firewall, an intrusion detection system, or a load balancer. Off-chip devices such as FPGAs or GPUs can be connected externally to the commercial off-the-shelf (COTS) server to offload some of the functionality, although this is usually discouraged in high-speed contexts because of the cost of the cross-server interactions. This is especially relevant as machine learning (ML) techniques are starting to be applied within network applications in a wide range of scenarios, from anomaly detection to performance prediction. Specifically, a tradeoff emerges regarding the placement of the ML processes. On the one hand, placing ML alongside the data path can provide new data at a resolution that cannot be reached with legacy equipment and its built-in monitoring functionality. On the other hand, this massive data availability comes at a cost: software measurements require CPU cycles that are taken away from the network functions competing for the same underlying compute infrastructure. This can strongly affect the measured values, biasing the collected data as well as the whole system. Furthermore, deploying multiple ML and data processing components may require additional servers to be turned on to host the ML computation, which increases the energy footprint of the overall network application.
In this project, we quantitatively study the energy/performance tradeoff of ML-enhanced high-speed networks, where ML applications are deployed alongside the main data path in standard COTS servers. Computation consolidation techniques have been proposed to reduce resource usage and thus the energy footprint of computation. However, when the ML and the network application compete for the same resources, the execution flow of the network application may be severely affected by the ML, which may then require strong isolation and so impede consolidation. The objective of the internship is to study this tradeoff through measurement campaigns on real equipment, and to offload ML computation to low-power embedded devices. The goal is to verify that doing so decreases the energy consumption of computation at the cost of a small, quantifiable performance loss.
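As an illustration only, the tradeoff described above could be summarized by two ratios: the relative energy saving and the relative performance loss of an offloaded configuration with respect to a baseline. The sketch below assumes that each measurement campaign yields a total energy and a sustained throughput; the `Measurement` type, its field names, and the choice of throughput as the performance metric are hypothetical and not part of the project description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Result of one measurement campaign (hypothetical structure)."""
    energy_joules: float    # total energy consumed during the run
    throughput_mpps: float  # sustained packet rate, in million packets/s


def tradeoff(baseline: Measurement, offloaded: Measurement) -> tuple[float, float]:
    """Return (relative energy saving, relative throughput loss).

    Both values are fractions of the baseline: 0.2 means 20%.
    Negative values mean the offloaded setup used more energy
    or achieved a higher throughput than the baseline.
    """
    if baseline.energy_joules <= 0 or baseline.throughput_mpps <= 0:
        raise ValueError("baseline values must be positive")
    saving = (baseline.energy_joules - offloaded.energy_joules) / baseline.energy_joules
    loss = (baseline.throughput_mpps - offloaded.throughput_mpps) / baseline.throughput_mpps
    return saving, loss
```

With such a summary, a configuration would be considered attractive when the saving is large and the loss stays within whatever bound the SLA allows.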
The candidate will survey the related work on machine learning techniques commonly adopted in high-speed cloud systems, and will use the provided infrastructure to implement and deploy a state-of-the-art use case of an ML-enabled application. Finally, data collection will be performed using software methods (e.g., internal counters), physical methods (e.g., explicit temperature measurements), and indirect methods (e.g., models of the energy consumption).
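To make the software and indirect methods more concrete, here is a minimal sketch under stated assumptions. It assumes a linear power model, in which power grows linearly with CPU utilization between an idle and a peak value, and a generic cumulative counter that wraps around at a known modulus. The function names, the parameters, and the choice of a linear model are illustrative assumptions, not the method prescribed by the project.

```python
def linear_power_watts(utilization: float, idle_watts: float, peak_watts: float) -> float:
    """Indirect method: estimate power from CPU utilization in [0, 1].

    Assumes power scales linearly between idle and peak consumption.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    return idle_watts + (peak_watts - idle_watts) * utilization


def energy_joules(utilization_samples, interval_s: float,
                  idle_watts: float, peak_watts: float) -> float:
    """Integrate the modeled power over equally spaced utilization samples."""
    return sum(
        linear_power_watts(u, idle_watts, peak_watts) * interval_s
        for u in utilization_samples
    )


def counter_delta(before: int, after: int, modulus: int) -> int:
    """Software method: difference between two readings of a cumulative
    counter whose values lie in [0, modulus) and wrap around to zero.

    Assumes at most one wraparound happened between the two readings.
    """
    return (after - before) % modulus
```

For example, two one-second samples at 0% and 100% utilization on a server modeled with 100 W idle and 200 W peak power would give `energy_joules([0.0, 1.0], 1.0, 100.0, 200.0)`, that is, 100 J plus 200 J. Comparing such model-based estimates with counter-based and physical measurements is one way to cross-check the three data sources.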
See also stage_edf_2022_en.pdf
Required skills Students pursuing MSc2 (or equivalent) studies in Computer Science, or in Electrical or Telecommunications Engineering. The candidate must have excellent programming skills. Proficiency with ML libraries in Python is a big plus.
Notes The internship will take place at Telecom Paris, 19 place Marguerite Perey, Palaiseau (France).
The monthly pay of 500€ can be combined with other national or international scholarships (Erasmus or other individual funding).
 Zhang T., Qiu H., Linguaglossa L., Cerroni W., Giaccone P., “NFV Platforms: Taxonomy, Design Choices, and Future Challenges”, IEEE Transactions on Network and Service Management, 2020
 Putina A., Barth S. et al., “Telemetry-based stream-learning of BGP anomalies”, Big-DaMa, 2018
 Geyer F., “DeepComNet: Performance evaluation of network topologies using graph-based deep learning”, Performance Evaluation, 2019
 Linguaglossa L., Geyer F., Shao W., Brockners F., Carle G., “Demonstrating the Cost of Collecting In-Network Measurements for High-Speed VNFs”, IFIP TMA Demo, 2019
Deadline 23/11/2023