Power-Aware Acceleration of Deep Learning (DL)
Training and Inference on High-Performance Computing (HPC) Servers
Modern cloud data centers face two major challenges: high energy consumption and performance variability. This is due in large part to the many types of devices that access high-resolution media content. Allocating and managing resources for today’s complex applications on multicore servers is an enormous task.
This problem is particularly acute for video streaming services, which account for about 80% of global traffic. With growing user demand for higher-quality video, encoding has become a key application in current High-Performance Computing (HPC). While the High-Efficiency Video Coding (HEVC) standard supports the emerging large video resolutions, it imposes a high computational burden that complicates power and thermal management. Meeting this challenge requires combining application-level configuration and system-level parameters with algorithmic optimization.
Our research proposes a machine learning (ML)-based approach for power and temperature management of multistream HEVC encoding on multicore servers, where multiple video streams must be processed concurrently. Our approach learns from the observed encoding time, encoding efficiency, and the server’s power and temperature, and dynamically sets both the encoding configuration and the operating frequency at runtime for arbitrary video content. We also develop a resolution-aware video assignment strategy to reduce temperature while maintaining the desired encoding time.
On average, compared to a state-of-the-art technique, our approach improves BD-PSNR and BD-rate by 0.54 dB and 8%, respectively, and reduces the encoding time, power consumption, and average temperature by 15.3%, 13%, and 10%, respectively.
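To make the idea of resolution-aware assignment more concrete, the following is a minimal sketch and not the strategy evaluated above. It assumes that the pixel count of a stream is a rough proxy for its encoding load, and it uses a simple greedy rule (largest stream first, always onto the least-loaded core) as a stand-in for the actual assignment policy. The names `Stream` and `assign_streams` are hypothetical.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Stream:
    """A video stream to encode (hypothetical illustration type)."""
    name: str
    width: int
    height: int

    @property
    def pixels(self) -> int:
        # Assumption: pixels per frame approximates encoding load.
        return self.width * self.height


def assign_streams(streams, num_cores):
    """Greedily map streams to cores, largest resolution first.

    Each stream goes to the core with the smallest accumulated load,
    which spreads the heavy streams out instead of stacking them on
    one core. Returns {core_id: [stream names]}.
    """
    if num_cores <= 0:
        raise ValueError("num_cores must be positive")

    # Min-heap of (accumulated_pixels, core_id).
    heap = [(0, core) for core in range(num_cores)]
    heapq.heapify(heap)
    assignment = {core: [] for core in range(num_cores)}

    for stream in sorted(streams, key=lambda s: s.pixels, reverse=True):
        load, core = heapq.heappop(heap)
        assignment[core].append(stream.name)
        heapq.heappush(heap, (load + stream.pixels, core))
    return assignment


if __name__ == "__main__":
    streams = [
        Stream("uhd", 3840, 2160),
        Stream("fhd_a", 1920, 1080),
        Stream("fhd_b", 1920, 1080),
        Stream("hd", 1280, 720),
    ]
    print(assign_streams(streams, num_cores=2))
```

In this example the single 4K stream occupies one core while the three smaller streams share the other. A real policy would also have to account for measured temperature and the encoding-time target, which this sketch ignores.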
In related research, we propose a centralized-distributed, low-overhead, failure-aware dynamic virtual machine (VM) consolidation strategy to minimize energy consumption in large-scale data centers. Our approach selects the most suitable power mode and frequency for each host at runtime using a distributed multi-agent ML-based strategy, and migrates the VMs accordingly using a centralized heuristic. Our Multi-AGent machine learNing-based approach for Energy efficienT dynamIc Consolidation (MAGNETIC) reduces data center energy consumption by up to 15% compared to state-of-the-art approaches, while guaranteeing the same quality of service (QoS) and reducing the number of VM migrations and host power mode transitions by up to 86% and 90%, respectively. Moreover, it scales better than all other approaches, incurring less than 0.7% time overhead for a data center with 1500 VMs.
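As a rough illustration of what a centralized consolidation step can look like, the sketch below packs VMs onto as few hosts as possible using first-fit decreasing. This is a generic bin-packing heuristic chosen for illustration only, not the MAGNETIC heuristic: it ignores failure awareness, migration cost, and the per-host power-mode and frequency decisions made by the distributed agents. The function name `consolidate` and the use of integer CPU percentages as VM demands are assumptions.

```python
def consolidate(vm_demands, host_capacity):
    """Pack VMs onto hosts with first-fit decreasing.

    vm_demands: {vm_id: cpu_demand}, same unit as host_capacity
                (assumed here: integer CPU percent).
    Returns a list of hosts, each a list of the VM ids placed on it.
    Hosts not in the returned list stay empty and could be switched
    to a low-power mode.
    """
    hosts = []  # each entry: [remaining_capacity, [vm_ids]]

    # Place the most demanding VMs first.
    ordered = sorted(vm_demands.items(), key=lambda kv: kv[1], reverse=True)
    for vm_id, demand in ordered:
        if demand > host_capacity:
            raise ValueError(f"VM {vm_id!r} does not fit on any host")
        for host in hosts:
            if host[0] >= demand:
                # First already-open host with enough room.
                host[0] -= demand
                host[1].append(vm_id)
                break
        else:
            # No open host fits: power on a new one.
            hosts.append([host_capacity - demand, [vm_id]])

    return [vm_ids for _, vm_ids in hosts]


if __name__ == "__main__":
    demands = {"a": 50, "b": 40, "c": 30, "d": 30, "e": 20}
    print(consolidate(demands, host_capacity=100))
```

Here five VMs fit on two hosts, so any remaining hosts can stay idle. Packing tightly saves energy but leaves little headroom, which is why a practical strategy also has to weigh QoS and the number of migrations.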