Thesis Proposal: Hybrid Resource-Bound Analysis of Programs

Abstract

Resource-bound analysis aims to infer symbolic bounds of worst-case resource usage (e.g., running time, memory, and energy) of programs as functions of program inputs. Resource analysis has numerous applications, including job scheduling in cloud computing and prevention of side-channel attacks. Various resource analysis technique have been developed, and they have unique strengths and weaknesses that complement each other. (Automatic) static resource analysis, which analyzes the source code of programs, is sound: if it successfully infers a cost bound, it is guaranteed to be a valid bound. However, due to undecidability of resource analysis in general, every static analysis technique is incomplete: there exists a program that the analysis technique cannot handle. Meanwhile, data-driven analysis, which statistically analyzes cost measurements obtained by running programs on many inputs, can infer a candidate cost bound for any program. However, it does not guarantee soundness of inference results.

To overcome limitations of individual analysis techniques, I propose hybrid resource analysis, which integrates two complementary analysis techniques to retain their strengths while mitigating their respective weaknesses. The user first specifies which analysis techniques are used to analyze which code fragments and quantities. Hybrid analysis then performs its constituent analysis techniques on their respective code fragments and quantities. Finally, their inference results are combined into an overall cost bound.

The development of hybrid resource analysis has been driven by the desire to go beyond Automatic Amortized Resource Analysis (AARA), a state-of-the-art type-based static resource analysis technique. I start by proving polynomial-time completeness of AARA. I next introduce Bayesian data-driven analysis, which conducts Bayesian inference on cost measurements to infer a posterior distribution of symbolic cost bounds. I then present the first hybrid resource analysis, Hybrid AARA, followed by a discussion of its limitations. To overcome these limitations, I introduce the second hybrid resource analysis, resource decomposition. I additionally describe Swiftlet, which instantiates the resource-decomposition framework with AARA and Bayesian resource analysis.

For proposed work, my collaborators and I plan to develop data-driven-analysis for statistically inferring not only a worst-case symbolic cost bound but also a worst-case input generator, which is a program generating worst-case program inputs of various sizes. In existing data-driven analyses, program inputs used for recording cost measurements are usually either generated randomly or assumed to be representative of real-world workload. Consequently, it is challenging to statistically infer worst-case bounds of those programs (e.g. quicksort) whose average-case complexity is significantly lower than the worst-case complexity. By testing programs with various input generators and inferring worst-case input generators, we can improve the inference quality of data-driven resource analysis.

Related