Analyzing the properties and behavior of large software systems
Prof. George Candea ~ Project Website
Selective symbolic execution, embodied in the S2E system, automatically selects the minimal set of paths that need to be explored for a given analysis, thus reducing resource needs by orders of magnitude compared to classic symbolic execution. S2E weaves the execution back and forth between the symbolic and the concrete domain by automatically and transparently converting symbolic to concrete data and vice versa. These conversions are governed by execution consistency models, a unified way of reasoning about families of paths through programs. S2E directly runs x86 machine code and requires no abstract models of the code and its environment; it can run and analyze entire software stacks, such as Microsoft Windows with all its libraries and applications. We used S2E to build a tool that tests proprietary Windows drivers and a comprehensive performance profiler, which constitutes the first use of symbolic execution in performance analysis.
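The conversion between the symbolic and the concrete domain can be pictured with a small sketch. This is a toy model, not S2E code: the names `SymbolicInt`, `PathState`, `concretize`, `symbolize`, and `concrete_library_call` are hypothetical, and a simple integer range stands in for a real constraint solver. It assumes that concretizing a value means picking one feasible value and recording that choice as a constraint on the path.

```python
from dataclasses import dataclass, field


@dataclass
class SymbolicInt:
    """A symbolic integer constrained to the inclusive range [lo, hi]."""
    name: str
    lo: int
    hi: int


@dataclass
class PathState:
    """Per-path bookkeeping: the constraints accumulated so far."""
    constraints: list = field(default_factory=list)


def concretize(value: SymbolicInt, state: PathState) -> int:
    """Symbolic -> concrete: pick one feasible value and remember the choice."""
    chosen = value.lo  # any value in [lo, hi] would do; take the smallest
    state.constraints.append(f"{value.name} == {chosen}")
    return chosen


def symbolize(name: str, lo: int, hi: int) -> SymbolicInt:
    """Concrete -> symbolic: introduce a fresh symbol in place of a concrete value."""
    return SymbolicInt(name, lo, hi)


def concrete_library_call(n: int) -> int:
    """Stands in for code outside the analysis scope; it runs concretely."""
    return n * 2


state = PathState()
length = symbolize("length", 1, 64)  # enter the symbolic domain
doubled = concrete_library_call(concretize(length, state))  # cross into concrete code
print(doubled, state.constraints)
```

The recorded constraint is what makes the trade-off visible: after the call, the path is only valid for `length == 1`, so the choice of how strictly to keep such constraints determines which families of paths remain reachable.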
S2E is a platform for writing tools that analyze the properties and behavior of software systems. We have used S2E to develop a comprehensive performance profiler, a reverse-engineering tool for proprietary software, and a bug-finding tool for both kernel-mode and user-mode binaries. Building these tools on top of S2E took less than 770 LOC and 40 person-hours each.
S2E’s novelty lies in its ability to scale to large real systems, such as a full Windows stack. S2E is based on two new ideas:
- Selective symbolic execution, a way to automatically minimize the amount of code that has to be executed symbolically given a target analysis; and
- Relaxed execution consistency models, a way to make principled performance/accuracy trade-offs in complex analyses.
These techniques give S2E three key abilities:
- to simultaneously analyze entire families of execution paths, instead of just one execution at a time;
- to perform the analyses in vivo within a real software stack – user programs, libraries, kernel, drivers, etc. – instead of using abstract models of these layers; and
- to operate directly on binaries, thus being able to analyze even proprietary software.
Conceptually, S2E is an automated path explorer with modular path analyzers: the explorer drives the target system down all execution paths of interest, while analyzers check properties of each such path (e.g., to look for bugs) or simply collect information (e.g., count page faults). Desired paths can be specified in multiple ways, and S2E users can either combine existing analyzers to build a custom analysis tool, or write new analyzers using the S2E API.
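The split between a path explorer and modular analyzers can be sketched in a few lines. This is an illustration only and does not use the S2E API: `target`, `explore`, `PageFaultCounter`, `BugFinder`, and the `on_path` callback are hypothetical names, and brute-force enumeration of boolean inputs stands in for symbolic execution.

```python
from itertools import product


def target(inputs, trace):
    """Toy program under analysis; `trace` records the events of one path."""
    if inputs["a"]:
        trace.append("page_fault")
    if inputs["b"]:
        trace.append("page_fault")
        if inputs["a"]:
            trace.append("bug")


class PageFaultCounter:
    """Analyzer that collects information: page faults per path."""

    def __init__(self):
        self.counts = {}

    def on_path(self, path_id, trace):
        self.counts[path_id] = trace.count("page_fault")


class BugFinder:
    """Analyzer that checks a property: no path may reach a 'bug' event."""

    def __init__(self):
        self.buggy_paths = []

    def on_path(self, path_id, trace):
        if "bug" in trace:
            self.buggy_paths.append(path_id)


def explore(program, input_names, analyzers):
    """Drive the program down every combination of branch inputs."""
    for values in product([False, True], repeat=len(input_names)):
        trace = []
        program(dict(zip(input_names, values)), trace)
        for analyzer in analyzers:
            analyzer.on_path(values, trace)


counter, finder = PageFaultCounter(), BugFinder()
explore(target, ["a", "b"], [counter, finder])
print(counter.counts)
print(finder.buggy_paths)
```

Because the explorer only calls `on_path`, analyzers can be combined or swapped without touching the exploration logic, which mirrors the idea of composing existing analyzers into a custom tool.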
S2E helps make analyses based on symbolic execution practical for large software that runs in real environments, without requiring explicit modeling of these environments.
S2E is built upon the KLEE symbolic execution engine and the QEMU virtual machine emulator.