Skip to content
Richard L Ford edited this page Apr 21, 2021 · 18 revisions

SVF Design

The SVF framework is developed on top of the LLVM compiler. As shown in the following figure, the source code of a program is first compiled using clang into bit-code files, which are then merged together using LLVM Gold Plugin at link time stage (LTO) to produce a whole-program bc file. Then interprocedural pointer analysis is performed to produce the points-to information required for constructing a memory SSA form so that the def-use chains for both top-level and address-taken variables are identified. The generated value-flow information can be used to support a range of client applications, such as memory leak detection and null pointer detection. The value-flows can also be used to bootstrap a more precise pointer analysis so that the precision for both the value-flows and the pointer analysis can be improved in an iterative manner. framework

Pointer Analysis

Our design separates a pointer analysis implementation into three loosely coupled components i.e. Graph, Rules, and Solver as shown in the figure. Graph is a higher-level abstraction extracted from the LLVM IR of the program, indicating where pointer analysis should be performed. The Rules component defines how to derive the points-to information from each statement, i.e., constraint on the graph. Solver determines in what order to process all the constraints. SVF provides a clean and reusable interface for users to write their own pointer analysis implementations by combining the three components in a flexible way.

Value-Flow Construction

Based on the points-to information obtained, we first perform a lightweight Mod-Ref Analysis to capture inter-procedural reference and modification side-effects of program variables. Given the mod-ref results and points-to information, the indirect uses and defs at stores, loads and callsites in each procedure are annotated using alias sets, with each representing a set of abstract memory objects that may be accessed indirectly. Note that both Open64 and GCC adopt an intraprocedural memory SSA, which is computed intraprocedurally, with all non-local variables grouped into one single alias set. In contrast, SVF provides a Mem Region Partitioning module to allow users to define their own memory region partitioning strategies to affect how alias sets are formed, so that scalability and precision trade-offs can be made flexibly when analyzing large C programs.