Scaling static whole-program analysis to modern C and C++ software development : statically analyzing C and C++ software with PhASAR / Philipp Dominik Schubert ; Advisors: Prof. Dr. Eric Bodden, JProf. Dr. Ben Hermann. Paderborn, 2024
Contents
- Abstract
- Introduction
- Background
- The Idea of Static Data-Flow Analysis
- Procedure Boundaries and Context Sensitivity
- The Zoo of Sensitivities
- Distributive Data-Flow Analysis Problems
- The IFDS and IDE Frameworks
- Helper Analyses for Precise Whole-Program Data-Flow Analysis
- Control Flow and Callgraph Information
- Points-to and Alias Information
- Type Hierarchy Information
- Data-Flow Information and Client Analyses
- Soundness and Completeness
- Precision and Performance
- Static Versus Dynamic Analysis
- The LLVM Compiler Infrastructure
- PhASAR
- Introduction
- Related Work
- Architecture
- PhASAR's Implementation
- Encoding an IFDS Analysis
- Encoding an IDE Analysis
- Encoding an Analysis Within the Monotone Framework
- Use PhASAR as a Library
- Handling of Intrinsic Functions and libc
- A Note on PhASAR's Soundness
- Scalability
- The Need for Dedicated Debugging Capabilities
- Instrumenting Static Analysis
- Analysis Development Process
- Implementation
- Experience Report
- Related Work
- Conclusions
- The Burden of Correctly Handling Global Variables
- Framework Support for Global Variables
- Background and Problem Description
- Modeling the Effects of Globals
- Implementation
- Case Study: Constant Propagation
- An Analysis Writer's Perspective
- Global Variables in Real-World Programs
- Experimental Setup
- RQ1: Usages of Global Variables
- RQ2: Precision
- RQ3: Performance
- Related Work
- Conclusions
- A Few Years Later: Designing Static Analysis Implementations
- Experiences From Building a Static Analysis Framework
- Background
- Lessons Learned
- Modularity and Encapsulation
- Accessing Information
- Bugs and Debugging
- Parametrization, Configuration and Usability
- Flexible Usage Modes
- Analyzing C, C++, and LLVM IR
- Build Systems
- LLVM IR Generation
- Contributing Guidelines
- Related Work
- Conclusions
- Future Work
- Conclusions
- Variability
- Introduction
- Motivating Example
- Analysis
- Transforming Preprocessor Directives
- Phases of the Desugarer
- Desugaring C Type Specifications
- Desugaring Function Definitions
- Limitations of the Transformation
- Variational Data-flow Analysis
- Implementation
- Experiments
- Experimental Setup
- RQ4: Analysis Correctness
- RQ5: Analysis Efficiency
- RQ6: Analysis Precision
- Related Work
- Conclusions
- Modularity
- Introduction
- Motivating Example
- Strategy
- Idea of the Algorithm
- Summary Generation
- Type Hierarchies
- Intra-Procedural Points-To Information
- Callgraphs and Inter-Procedural Points-To Information
- Data-Flow Information
- Merging Analysis Summaries
- Type Hierarchies
- Callgraphs and Points-To Information
- Fixed-Point Iteration for Callgraph and Points-To Graph
- Data-Flow Information
- Analyzing the Main Application
- Removing Dependencies Ahead of Time
- Implementation
- Experiments
- Limitations
- Related Work
- Conclusions
- Incrementality
- Introduction
- Motivating Example
- Terminology and Notation
- Incremental Update Analysis
- Preparing Commit Metadata
- Change Scenarios
- Compute Whole Program Information
- Compute Incremental Updates
- Implementation
- Evaluation
- Research Questions
- Experimental Setup
- RQ10: Performance
- RQ13: Correctness
- RQ11: Change Characteristics
- RQ12: Helper Analyses
- Threats to Validity
- Related Work
- Conclusions
- Incrementality: Data
- Applications of PhASAR
- Combining Repository Mining and Static Code Analysis
- Static Configuration-Logic Identification
- White-Box Penetration Testing
- Running Example
- Overview of the Static Analysis Engine
- Design and Implementation
- Taint Configurations
- Taint Analysis
- Path Sensitivity and Performance Optimizations
- Symbolic Execution
- Results and How to Access Them
- Path Collection
- Emitting the Exploded Super-Graph
- Emitting Analysis Coverage
- Emitting Full JSON Reports
- Insights and Lessons Learned
- Conclusions
- Conclusions and Future Work
- Bibliography
