Training large-scale machine-learning models is extremely challenging because the training data often far exceeds the memory capacity of the hardware it runs on. Scientists at IBM Research and EPFL have collaborated to develop a scheme that enables limited-memory accelerators such as GPUs and FPGAs to speed up the training of machine-learning models. They presented their findings at the 31st Annual Conference on Neural Information Processing Systems (NIPS 2017) in Long Beach, California.

As researchers Celestine Dünner and Thomas Parnell (IBM Research) and Martin Jaggi (EPFL) explain, the scheme is particularly relevant today, when computing systems are becoming increasingly heterogeneous: compute units differ widely in memory size, compute power, and complexity, which makes it hard to design algorithms that use all of them efficiently. The study proposes a generic, reusable component that distributes the training workload across heterogeneous compute units to accelerate large-scale learning.

GPUs and FPGAs typically have limited memory capacity, and this was the central challenge for the researchers: they had to devise a method that still lets scientists exploit the superior compute power of these accelerators. They showed that this can be done by being selective about which data to train on at any given time. By making smart choices about which examples matter most, and exploiting the fact that not all data contributes equally to learning, the training process can be accelerated. To this end, the study proposes DuHL (Duality-gap based Heterogeneous Learning), an efficient strategy that uses duality-gap information to decide which part of the data to keep in the accelerator's memory for fast processing. For their large-scale experiments, the scientists used a 30-gigabyte version of the Kaggle Dogs vs. Cats ImageNet dataset and showed that a model can be trained on 40,000 photos of cats and dogs in less than one minute, roughly 10x faster than existing methods for limited-memory training.
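
To make the idea concrete, the following is a minimal NumPy sketch of duality-gap based data selection in the spirit of DuHL, applied to an L2-regularized hinge-loss SVM: all examples are scored by their contribution to the duality gap, only the highest-scoring examples are kept within a fixed memory budget (standing in for the accelerator's memory), and a simple dual coordinate-ascent solver runs on that subset. The synthetic data, the `budget` parameter, and the SDCA-style inner solver are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Synthetic binary classification data (assumption: stand-in for the real dataset).
rng = np.random.default_rng(0)
n, d = 2000, 50
X = rng.standard_normal((n, d))
y = np.sign(X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n))

lam = 1e-3           # L2 regularization strength
budget = 200         # number of examples that "fit" in accelerator memory
alpha = np.zeros(n)  # dual variables, one per training example
w = np.zeros(d)      # primal model, kept consistent with alpha

sq_norms = np.einsum("ij,ij->i", X, X)

def per_example_gap(w, alpha):
    """Per-example duality-gap contribution for the L2-regularized hinge-loss SVM."""
    margins = y * (X @ w)
    return np.maximum(0.0, 1.0 - margins) + alpha * (margins - 1.0)

for outer in range(20):
    # 1) Score every example by its duality-gap contribution
    #    (done on the large-memory unit, e.g. the CPU, in the DuHL setting).
    gaps = per_example_gap(w, alpha)

    # 2) Keep only the examples with the largest gaps, up to the memory budget.
    selected = np.argsort(gaps)[-budget:]

    # 3) Run a fast local solver on the selected subset (stand-in for the
    #    accelerator's solver): a few epochs of dual coordinate ascent.
    for _ in range(5):
        for i in rng.permutation(selected):
            margin = y[i] * (X[i] @ w)
            delta = (1.0 - margin) * lam * n / max(sq_norms[i], 1e-12)
            delta = np.clip(delta, -alpha[i], 1.0 - alpha[i])
            alpha[i] += delta
            w += delta * y[i] * X[i] / (lam * n)

    print(f"round {outer:2d}  duality gap = {per_example_gap(w, alpha).sum() / n:.4f}")
```

In the system the researchers describe, step 3 would run on the GPU or FPGA while the CPU handles the full dataset, recomputes the gap scores, and swaps data blocks in and out of the accelerator's memory between rounds.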

The speed and efficiency of the new algorithm allow scientists to retrain models frequently and even adapt to changes in real time. It also has financial implications, because faster training translates into significant cost savings for cloud applications. The scheme therefore has immense potential for data-science practitioners in research institutions and across many industrial sectors.

Further Readings

https://arxiv.org/pdf/1708.05357.pdf
https://www.ibm.com/blogs/research/2017/12/10x-faster-using-gpu/
https://twitter.com/NvidiaAI/status/938187394699862016
https://www.nextbigfuture.com/2017/12/ibm-demonstrates-10x-faster-large-scale-machine-learning.html
http://www.greencarcongress.com/2017/12/20171206-ibm.html
http://awesummly.com/news/5229951/ibm-scientists-demonstrate-10x-faster-large-scale-machine-learning-using-gpus/
http://justlinkit.io/articles/8195881/distributing-control-of-deep-learning-training-delivers-10x-performance-improvement.html