A Center for Sustainable Cloud Computing

Sustainable Data Centers

LAMP Drives New Synergy Between Big Data And Programming Technologies

Big Data and Scala

Forming a foundation of programming languages for big data applications.

Data-intensive computing on a large scale, commonly referred to as big data, is gaining traction across industries. A crucial decision in any big data implementation is choosing the programming language that developers and data scientists will work in. The right choice lets programming and database technologies work closely together, which makes big data easier to manage and analyze.

Among the favoured languages today is Scala, the implementation language of choice for many big data frameworks. But programming languages, Scala included, still need to bridge a gap in scale before they can handle the fundamental data structures needed for database access. Toward this objective, Professor Martin Odersky at EPFL’s Programming Methods Laboratory (LAMP) is developing methods to better express and export the fundamental programming abstractions used in the interaction between databases and programming languages.

The study revolves around three key aspects:

  • Projecting data and embedding programming abstractions in Scala
  • Projecting control: improving performance by representing queries as data that can run efficiently on various backends
  • Developing distributed programming abstractions, a key component of technologies related to big data frameworks
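The idea of treating queries as data, named in the second aspect above, can be illustrated with a minimal Scala sketch. It is not the LAMP project's actual API: the names `Query`, `Table`, `Filter`, `Project`, `toSql`, and `run` are hypothetical, and modelling a row as a `Map[String, Any]` is an assumption made only to keep the example short. The point is that one query description can be handed to more than one backend.

```scala
// Illustrative sketch only: all names here are hypothetical, not a real library API.
object QueryAsData {
  // Assumption: a row is a map from field name to value.
  type Row = Map[String, Any]

  // A query is plain data: a small algebraic data type describing what to compute.
  sealed trait Query
  final case class Table(name: String) extends Query
  final case class Filter(source: Query, field: String, minValue: Int) extends Query
  final case class Project(source: Query, fields: List[String]) extends Query

  // Backend 1: translate the query description into a SQL-like string.
  def toSql(q: Query): String = q match {
    case Table(name) =>
      s"SELECT * FROM $name"
    case Filter(src, field, min) =>
      s"SELECT * FROM (${toSql(src)}) AS t WHERE $field >= $min"
    case Project(src, fields) =>
      s"SELECT ${fields.mkString(", ")} FROM (${toSql(src)}) AS t"
  }

  // Backend 2: evaluate the same description against in-memory collections.
  def run(q: Query, db: Map[String, List[Row]]): List[Row] = q match {
    case Table(name) =>
      db.getOrElse(name, Nil)
    case Filter(src, field, min) =>
      // Keep rows whose field holds an Int of at least `min`; drop all others.
      run(src, db).filter { row =>
        row.get(field) match {
          case Some(n: Int) => n >= min
          case _            => false
        }
      }
    case Project(src, fields) =>
      // Keep only the requested fields of each row.
      run(src, db).map(row => row.filter { case (k, _) => fields.contains(k) })
  }
}
```

Because a value such as `Project(Filter(Table("users"), "age", 18), List("name"))` is an ordinary data structure, it can be inspected or rewritten before execution and then passed to either `toSql` or `run`. That is the property that makes backend-specific optimization possible in this style.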

Big data presents new challenges that call for advanced computing. One shortcoming of programming languages is that their data structures allow only a limited number of fields, whereas big data frameworks need many more. This can be overcome by accommodating more flexible data structures. In addition, greater optimization is needed for big data frameworks and programming technologies to work together effectively. The ongoing project aims to achieve both objectives by expressing and exporting the fundamental programming abstractions used in the interplay between databases and programming languages.
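As a rough illustration of what a more flexible data structure could look like, the following Scala sketch defines a record whose number of fields is not fixed in advance. It is not the project's design: `Field` and `Record` are hypothetical names, and storing values in a map keyed by field name is an assumption chosen for brevity.

```scala
// Illustrative sketch only: Field and Record are hypothetical, not a real library API.
object FlexibleRecord {
  // A field key that carries the expected type of its value.
  final case class Field[A](name: String)

  // A record with an arbitrary number of fields, stored by field name.
  final class Record private (private val values: Map[String, Any]) {
    // Return a new record with the field set; the record itself is immutable.
    def set[A](field: Field[A], value: A): Record =
      new Record(values + (field.name -> value))

    // Look a field up by key. The cast is unchecked, so this is only safe
    // if each field name is always used with the same type (an assumption).
    def get[A](field: Field[A]): Option[A] =
      values.get(field.name).map(_.asInstanceOf[A])

    def size: Int = values.size
  }

  object Record {
    val empty: Record = new Record(Map.empty)
  }
}
```

A record with a hundred columns can then be built with a fold, for example `(1 to 100).foldLeft(Record.empty)((r, i) => r.set(Field[Int](s"col$i"), i))`. The trade-off in this sketch is that type safety rests on a naming convention rather than on the compiler, which is one reason language-level support for such structures is attractive.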

The project, funded by the Swiss National Science Foundation (SNSF), is expected to lay a strong programming-language foundation for building advanced big data applications and next-generation data engines.

Suggested readings: