This large, long-term project introduces a revolutionary staged design for high-performance, evolvable DBMS that are easy to tune and maintain. We break the database system into modules and encapsulate them in self-contained stages connected to each other through queues.

With the advent of highly parallel chip multiprocessors, database system designers are called to revisit their designs. We study the performance of commercial database systems on evolving computer architectures, and we believe that conventional database designs are inherently limited in how well they can perform in such environments. In contrast, the different approach taken by staged database designs makes them better suited to high performance in the new computing landscape.

We propose the use of staging for database systems. In the Staged Database System design, the previously monolithic database system is decomposed into a set of stages, each with its own queue and thread support. New queries queue up in the first stage, are encapsulated into a “packet”, and pass through the five stages shown at the top of the figure below. A packet carries the query’s “backpack”: its state and private data. Inside the execution engine, a query can issue multiple packets to increase parallelism.
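The packet flow described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the names `Packet`, `Stage`, and `run_queries` are hypothetical, the five stage names are assumed placeholders (the figure is not reproduced here), and a single-threaded loop stands in for the per-stage thread support.

```python
from collections import deque
from dataclasses import dataclass, field

# Assumed placeholder names; the text only states that there are five stages.
STAGE_NAMES = ["connect", "parse", "optimize", "execute", "disconnect"]


@dataclass
class Packet:
    """A query wrapped together with its 'backpack' (state and private data)."""
    query: str
    backpack: dict = field(default_factory=dict)


class Stage:
    """A self-contained module with its own input queue."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()

    def enqueue(self, packet):
        self.queue.append(packet)

    def run_batch(self):
        """Serve every queued packet in FIFO order and return the served packets."""
        served = []
        while self.queue:
            packet = self.queue.popleft()
            # Stand-in for real work: record the visit in the packet's backpack.
            packet.backpack.setdefault("trace", []).append(self.name)
            served.append(packet)
        return served


def run_queries(queries, names=STAGE_NAMES):
    """Push each query through all stages; each stage drains its queue as a batch."""
    stages = [Stage(name) for name in names]
    for query in queries:
        stages[0].enqueue(Packet(query))
    finished = []
    for index, stage in enumerate(stages):
        for packet in stage.run_batch():
            if index + 1 < len(stages):
                stages[index + 1].enqueue(packet)  # hand off to the next queue
            else:
                finished.append(packet)
    return finished
```

Calling `run_queries(["q1", "q2"])` returns both packets in arrival order, each with a `trace` listing the stages it visited. In a real staged engine each stage would run on its own threads and a query could spawn several packets inside the execution stage; the sketch omits both.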

There are multiple research problems associated with this new database system architecture, ranging from optimizing hardware resource usage, to job queueing and scheduling with multiple constraints, and to multi-query processing and optimization.

When running Online Transaction Processing (OLTP) workloads, instruction-related delays in the memory subsystem account for 25 to 40% of the total execution time. Unlike data misses, instruction misses cannot be overlapped with out-of-order execution, and instruction caches cannot grow because a larger cache has a slower access time, which directly affects processor speed. The challenge is to alleviate the instruction-related delays without increasing the cache size.

We propose Steps, a technique that minimizes instruction cache misses in OLTP workloads by multiplexing concurrent transactions and exploiting common code paths. One transaction paves the cache with instructions, while close followers enjoy a nearly miss-free execution. Steps yields up to 96.7% reduction in instruction cache misses for each additional concurrent transaction, and at the same time eliminates up to 64% of mispredicted branches by loading a repeating execution pattern into the CPU.
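The intuition behind multiplexing transactions over a shared code path can be sketched with a toy cache model in Python. This is an illustration under strong assumptions, not the Steps implementation: the cache is a fully associative LRU cache over abstract code-block ids, every transaction follows an identical code path, and the names `count_misses`, `run_to_completion`, and `multiplexed` are hypothetical. The numbers it produces are not comparable to the measured results quoted above.

```python
from collections import OrderedDict


def count_misses(schedule, cache_blocks):
    """Count misses of an LRU cache holding `cache_blocks` code blocks."""
    cache = OrderedDict()
    misses = 0
    for block in schedule:
        if block in cache:
            cache.move_to_end(block)       # refresh recency on a hit
        else:
            misses += 1
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return misses


def run_to_completion(code_path, n_txns):
    """Each transaction executes the whole code path before the next one starts."""
    return [block for _ in range(n_txns) for block in code_path]


def multiplexed(code_path, n_txns, chunk):
    """Switch between transactions after every `chunk` blocks, so the first
    transaction loads a cache-sized piece of code and the followers reuse it."""
    schedule = []
    for start in range(0, len(code_path), chunk):
        piece = code_path[start:start + chunk]
        for _ in range(n_txns):
            schedule.extend(piece)
    return schedule


code_path = list(range(64))  # 64 distinct code blocks, assumed identical per transaction
baseline = count_misses(run_to_completion(code_path, 4), cache_blocks=16)
shared = count_misses(multiplexed(code_path, 4, chunk=16), cache_blocks=16)
```

In this toy setup the code path is four times larger than the cache, so run-to-completion misses on every block (`baseline` is 256), while the multiplexed schedule misses only for the leading transaction (`shared` is 64). Real transactions diverge in their code paths and context switches have a cost, which this model ignores.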

Our work focuses on four directions:

  • Study the performance of database systems when running OLTP and DSS workloads on emerging hardware, such as highly parallel chip multiprocessors.
  • Build a staged relational query engine that can optimally manage available disk bandwidth, RAM, and CPU cycles across multiple concurrent queries, and provide a significant performance boost over conventional query engines.
  • Apply the Staged DB design, coupled with smart scheduling, to OLTP engines in order to optimize both instruction and data cache (processor cache) performance, as well as to improve intra-transaction parallelism in those workloads.
  • Optimize chip multiprocessors for commercial workloads, especially database applications such as online transaction processing (OLTP) and decision support applications (DSS).