Compiling Relational Queries Over Program Traces To Instrumentation
(joint work with Simon Goldsmith and Alex Aiken)
Abstract:
Instrumenting programs with code to monitor their dynamic behaviour is
a technique as old as computing. Today, most instrumentation is either
inserted manually by programmers, which is tedious, or automatically by
specialized tools, which are nontrivial to build and each monitor only
particular properties. We introduce a general "Program Trace Query
Language" (PTQL) in which programmers can write expressive declarative
queries about program behaviour. PTQL is based on relational queries
over program traces with explicit timestamps. We argue that PTQL is
more amenable to
human and machine understanding than competing languages, such as
languages based on temporal logic, especially for object-oriented
programs. We also describe a compiler, "Partiqle", that takes a PTQL
query and a Java program and produces an instrumented program. This
instrumented program runs normally but also evaluates the PTQL query
on-line. We explain some novel optimizations required to compile
relational queries into efficient instrumentation. To help evaluate our
work, we present the results of applying a variety of PTQL queries to a
set of benchmark programs, including the Apache Tomcat Web server. The
results show that our prototype system already has usable performance,
and that our optimizations are important to obtaining this performance.
Our queries also revealed significant (and apparently unknown)
performance bugs in the 'jack' SPECjvm98 benchmark, in Tomcat, and in
the IBM Java class library, and some uncomfortably clever code in the
Xerces XML parser.
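
To give a rough feel for what a relational query over a timestamped
program trace might look like, here is a minimal Python sketch. It is
not PTQL and not how Partiqle works: the Invocation record, its field
names, and the nested_calls helper are all hypothetical, and the query
is evaluated offline over a stored trace rather than on-line by
instrumentation. It only illustrates the idea of treating method
invocations as rows with start and end timestamps and joining them.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Invocation:
    """One row of a hypothetical method-invocation trace table."""
    method: str
    receiver: int      # illustrative identity of the receiver object
    start_time: int    # logical timestamp at method entry
    end_time: int      # logical timestamp at method exit


def nested_calls(trace, outer_method, inner_method):
    """Self-join the trace: pairs (o, i) where i runs entirely within o.

    In SQL-like pseudocode (an assumption, not PTQL syntax):
        SELECT o, i FROM trace o JOIN trace i
        WHERE o.method = outer_method AND i.method = inner_method
          AND o.start_time < i.start_time AND i.end_time < o.end_time
    """
    return [
        (o, i)
        for o, i in product(trace, trace)
        if o.method == outer_method
        and i.method == inner_method
        and o.start_time < i.start_time
        and i.end_time < o.end_time
    ]


# A tiny made-up trace: one parse call with two reads inside it and
# one read after it has returned.
trace = [
    Invocation("Parser.parse", 1, 0, 9),
    Invocation("Stream.read", 2, 1, 2),
    Invocation("Stream.read", 2, 3, 4),
    Invocation("Stream.read", 2, 10, 11),
]

pairs = nested_calls(trace, "Parser.parse", "Stream.read")
print(len(pairs))  # prints 2
```

The explicit timestamps are what let an ordering property ("this call
happened during that call") be written as an ordinary join condition.
The naive nested loop here stores the whole trace and compares every
pair of rows, which is the kind of cost the abstract's optimizations
are meant to avoid.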