Model checking of the PEGASUS intermediate representation for proving correctness of optimizations in the CASH compiler

Ajay Mathew (ajaymathew@cmu.edu)

Sumit K. Jha (jha+@cs.cmu.edu)

Group Info:

Ajay Mathew ajaymathew@cmu.edu

Sumit Jha sumit.jha@cs.cmu.edu

Project Web Page:

Online Description of this ongoing project is available at www.cs.cmu.edu/~jha/VCASH/

An interim report is available here and also as a zipped file here.

Research Project Description:

We will first perform a literature survey and compare two competing model checking approaches for their suitability to the application – the verification of optimizations on the PEGASUS intermediate representation.

1. The first approach will use a SAT based bounded model checking approach to show the equivalence of the optimized PEGASUS intermediate representation against the non-optimized PEGASUS intermediate representation.

a. Similar techniques have been used by the CBMC model checker to show the equivalence of a Verilog program against golden C models.

b. We will also study the use of bit vector decision procedures for bounded model checking.

c. The result of this study will indicate whether SAT based bounded model checking is an adequate approach to prove the equivalence of PEGASUS intermediate representations.

2. The second approach will use results on temporal logic model checking which show that data flow analysis is simply the model checking of abstract interpretations. We will use an existing model checker and temporal logic statements about optimizations being performed to check whether valid optimizations have been performed.

a. We will develop an algorithmic framework for checking optimizations using temporal logics.

b. We will study the PEGASUS intermediate representation and its translation to Kripke structures for analyzing the correctness of optimizations using temporal logics.

c. The result of this stuffy will be an understanding of the feasibility of using temporal logic model checking for checking correctness of optimizations.

1. We will study the above approaches and the associated literature; and implement the most feasible approach in the CASH compiler.

Logistics:

Plan of Attack and Schedule:

March 20 – March 27

We study the PEGASUS intermediate representation, the optimizations in CASH compiler and how these optimizations affect the PEGASUS intermediate representation..

March 27 – April 3

We study the translation of PEGASUS to Kripke structures for model checking. We also study the translation of optimization rules to temporal logic. We develop an algorithm for performing model checking for checking correctness of optimizations.

April 3 – April 10

We perform manual experiments (using model checkers) to check the effectiveness of model checkers and temporal logic formulae for representing and analyzing compiler optimizations.

April 10 – April 17

We develop a theoretical framework for bounded model checking of optimizations. We perform manual experiments on small examples and translations to check the suitability of decision procedures to handle the bounded model checking of PEGASUS intermediate representations.

****************MILESTONE***************

April 17 – April 24

A. We write a translator from PEGASUS to the language of a model checker.

B. We write a symbolic execution engine to produce SAT formula / bit vector formula from PEGASUS IR.

April 24 – May 1

A. We write temporal logic properties for a subset of optimizations in PEGASUS.

B. We interface the output of the symbolic simulation engine with decision procedures

May 1 – May 8

A. We integrate the CASH compiler, the model checker and the temporal logic generator to perform model checking on the PEGASUS IR as it is being optimized.

B. We perform experiments and record the performance of the symbolic equivalence checker.

Milestone:

We will provide algorithms for the temporal logic model checking and equivalence based bounded model checking for checking optimizations of the PEGASUS intermediate representation. We will also present experimental evidence to argue the more efficient way to perform the correctness verification of optimizations.

Literature Search:

Bounded Model Checking [1] techniques have been used to prove the bounded equivalence of C programs in the CBMC tool [2]. Bit vector decision procedures [3] have matured since CBMC was developed. Several competing bit vector decision procedures have been developed like Yices, UCLID etc. Also, a parallel body of work has immersed on viewing compiler optimizations as static analysis of abstract interpretations of programs [4]. The project builds on these developments in the are of compilers and verification, and seeks to apply them to the CASH compiler framework.

Resources Needed:

We will need the PEGASUS simulator, the SMV model checker and the Yices / UCLID decision procedure. We have access to the SMV model checker and the Yices decision procedure.

Getting Started:

We have collected the relevant papers and have started developing algorithmic techniques for the two approaches. A relevant draft is available on request.

****RISKS****

The project may report that existing decision procedures are not developed enough to prove the equivalence of PEGASUS intermediate representations. The other approach may not be able to handle all kinds of complex optimizations. Is that an acceptable risk for the project?

References

[1] Bounded model checking. A. Biere, A. Cimatti, E. M. Clarke, O. Strichman, and Yunshan Zhu. In Highly Dependable Software, volume 58 of Advances in Computers. Academic Press, 2003.

[2] A Tool for Checking ANSI-C Programs. Edmund Clarke, Daniel Kroening and Flavio Lerda. In Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2004)

[3] The Yices tool webpage at SRI. http://yices.csl.sri.com/introduction.shtml

[4] Data flow analysis is model checking of abstract interpretations. David A. Schmidt. In Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages.