A Center for Sustainable Cloud Computing

Artificial neural networks, a subset of machine learning, are having a major impact on many facets of life and industry. Rapid advances in the field have improved processes as varied as voice and image recognition, drug development, and fraud reduction. They have even helped people with hearing impairments pick out the sounds they need from background noise. But as the reach of neural networks grows, so does their size. Training an accurate model can require the processors involved to exchange several petabytes of data. To address that problem, two EPFL students have developed PowerSGD, an algorithm that compresses this communication without compromising the accuracy of the training.

Thijs Vogels and Sai Praneeth Karimireddy are pursuing their Ph.D.s under the guidance of Professor Martin Jaggi, head of the Machine Learning and Optimization Laboratory at the School of Computer and Communication Sciences (IC). In their research, Vogels and Karimireddy applied the power method, a classical technique from linear algebra, to reduce communication among graphics processing units (GPUs) by up to 99% without affecting the model’s accuracy. Besides compressing communication, PowerSGD also helps cut energy consumption, and so plays a part in the fight against climate change.
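To give a feel for how the power method can shrink what has to be sent, here is a minimal sketch in plain Python. It is not the researchers' implementation: the function names, the matrix size and the made-up rank-one "gradient" are assumptions chosen for illustration, and it leaves out refinements a practical system would need, such as dealing with whatever the approximation misses. The idea is that two short vectors are sent in place of a full gradient matrix.

```python
import math
import random


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def matvec_t(M, u):
    """Multiply the transpose of M by vector u."""
    return [sum(M[i][j] * u[i] for i in range(len(M))) for j in range(len(M[0]))]


def rank1_compress(M, q):
    """One power-iteration step: returns (p, q_new) so that M is
    approximated by the outer product of p and q_new."""
    p = matvec(M, q)
    norm = math.sqrt(sum(x * x for x in p))
    p = [x / norm for x in p]      # normalise the left vector
    q_new = matvec_t(M, p)         # right vector carries the scale
    return p, q_new


def decompress(p, q):
    """Rebuild the approximate matrix from the two vectors."""
    return [[pi * qj for qj in q] for pi in p]


random.seed(0)
rows, cols = 64, 32

# Hypothetical gradient that is exactly rank one, so a single step recovers it.
# Real gradients are only approximately low rank.
u = [random.gauss(0, 1) for _ in range(rows)]
v = [random.gauss(0, 1) for _ in range(cols)]
grad = [[ui * vj for vj in v] for ui in u]

q0 = [random.gauss(0, 1) for _ in range(cols)]  # random starting vector
p, q = rank1_compress(grad, q0)
approx = decompress(p, q)

full = rows * cols        # numbers in the full matrix: 64 * 32 = 2048
sent = len(p) + len(q)    # numbers actually sent: 64 + 32 = 96
saving = 1 - sent / full  # fraction of communication avoided
print(f"sent {sent} numbers instead of {full} ({saving:.1%} less)")
```

In this toy case the saving is about 95%, and it grows with the size of the matrix, since the full gradient scales with rows times columns while the two vectors scale only with rows plus columns.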

The algorithm has already been adopted by the software industry, most notably in PyTorch, the world’s most popular deep learning software. PyTorch is an extremely versatile framework, used in 80% of academic publications that rely on deep learning. It also powers Tesla’s Autopilot AI application and Facebook’s translation software. The newest version (PyTorch 1.8) comes with PowerSGD built in, allowing users to activate communication compression with a simple software switch. Beyond PyTorch, PowerSGD also features in OpenAI’s DALL-E, which can generate creative images from text.

Building on this work, the EPFL researchers are now applying the same principle to decentralized learning. This is a significant development because it could help mitigate the risk of data leakage and ease privacy concerns about sensitive information such as medical records or data stored on mobile phones.