Research Experiences for Undergraduates and Masters Students

Plaid Research Group: Programming Languages and Software Engineering

The Plaid research group does research applying programming language techniques to software engineering problems. We work on all kinds of languages, but many projects focus in particular on object-oriented systems. Projects range from foundational development of new object models and core calculi, to pragmatic language extensions or analyses that can verify properties of programs written in industrial languages like Java. The group name comes from Plaid, a new object-oriented language currently in development based on technologies developed within the group and elsewhere.

Logistics

We expect to have 1-2 research opportunities open for Summer 2011, funded primarily from NSF's Research Experiences for Undergraduates (REU) program. REU funding is restricted to United States citizens and permanent residents; non-citizens/non-residents are welcome to apply, but a paid position will be contingent on finding another appropriate source of funding. Positions are open until filled, and compensation is negotiable (or students can get course credit). Please contact jonathan.aldrich@cs.cmu.edu to apply.

The NSF REU program emphasizes broadening participation in computing, and accordingly members of underrepresented groups are especially encouraged to apply.

Current Projects

We have multiple opportunities available in the general area of programming language and software engineering research. Descriptions of some potential projects are below, but I'm happy to discuss alternative possibilities as well; just contact me if you're interested.

1. Design and Implementation of the Plaid Language

Plaid is a new general-purpose programming language that epitomizes Typestate-Oriented Programming, a generalization of the object-oriented, functional, and imperative paradigms. In Plaid, a state is like a class, with methods and fields, except that the state of an object can change over time. This allows interesting new ways of designing programs, and creates new challenges for efficient implementation and for a linear-logic based type system that can statically track the changing state of objects.

We are looking for people to work on a compiler and typechecker for the language, as well as involvement in the language design and the type theory for students with the right background (courses in PL or compilers are a plus). For more information, see the Plaid language web page.

2. Improved Error Messages in the Fusion Protocol Checker

Our studies of industrial software frameworks has shown that the protocols to use them are extremely complex and easy for programmers to misuse. To prevent developers from misusing these protocols, we created the Fusion system. Fusion allows framework developers to specify the complex protocols to use the framework, and an associated static analysis runs at compile time to ensure that plugins meet these specifications. Fusion is unique among specification and static analysis systems in that only the framework is specified; the goal of the system is to make it usable by plugin developers that have little knowledge of the framework and no knowledge of Fusion specifications.

However, plugin developers must currently be able to view and understand the specification when the static analysis detects an error. This is a major hindrance to adoptability, and our proposed research seeks to solve this problem. The impact of this research will affect not only Fusion, but many other specification language and static analysis systems as well.

This research will seek to make the error messages of Fusion more understandable through visualizations and through task-directed error messages. Specific sub problems and goals are:

How do we describe an error under complex heap configurations? As is common for static analyses, Fusion detects potential errors that could occur under a variety of heap configurations at runtime. When an error is found only under one of these potential heaps, we believe that users will be helped by providing visualizations of the potential heaps and highlighting the error.
Can we adjust the specifications without requiring the user to learn the specification language? Specifications may in some cases be incorrect, and this may be even more likely if the specifications are automatically generated through a dynamic analysis, such as one we have developed in recent work. It may be possible to fix these specifications using semi-automated techniques that would not require the programmer to learn the specification language itself.
Can the system provide users with code-based suggestions for how to fix an error? In prior work with Trisha Quan, a past NSF REU undergraduate, we created a system that would automatically create human-readable error messages from complex logical specifications. We would like to extend this work to provide not only a human readable error message, but also provide a code-based suggestion to the user for how to fix the error.

The ideal student for this project will have enough programming maturity to work on a large project that uses several software frameworks. The student should be interested in a breadth of software engineering research, as this project spans from logic and programming language research to human-computer interaction. However, the project goals can be adapted for a student who is interested only in deep study of one of these areas.

Past Undergraduate Research Projects

We've had a lot of great REUs in the past. Here are some of the undergraduates that have done research in the Plaid group:

Owen Yamauchi and Will Cooper studied ways to infer ownership types more locally
Key Shin and Kevin McInerney adapted the Twelf theorem proving system into the SASyLF tool, in order to improve teaching undergraduates about programming language proofs
Tianyuan Wang and Le Wei built novel user interfaces for SASyLF and evaluated their usability
Trisha Quan found ways to provide better error messages when architectural constraints expressed in first order logic are violated.
Chris Martens developed a new object model and type system for an object-oriented language that supports more flexible object and function extension
Kevin Donnelly researched Selective Open Recursion, a way of specifying the architectural design intent for when a function call should be exposed as a callback to subclasses
Lee Salzman integrated a dynamic prototype-based language model with multiple dispatch
Andi Bejleri and Matthew Kehrt designed a type system for an imperative and dynamic prototype-based language
Paul Richardson developed a novel visualization tool for our Plural typestate checking system
Zachary Sparks and Mark Hahnenberg developed a prototype interpreter and compiler for Plaid, a novel language currently being designed whose type system incorporates aliasing permissions

These students have also published a number of peer-reviewed papers:

Jonathan Aldrich and Kevin Donnelly. Selective Open Recursion: Modular Reasoning about Components and Inheritance. Proc. FSE 2004 Workshop on Specification and Verification of Component-Based Systems, November 2004.
Lee Salzman and Jonathan Aldrich. Prototypes with Multiple Dispatch: An Expressive and Dynamic Object Model Proc. OOPSLA 2004 Workshop on Revival of Dynamic Languages, 2004.
Lee Salzman and Jonathan Aldrich. Prototypes with Multiple Dispatch: An Expressive and Dynamic Object Model. Proceedings of the European Conference on Object-Oriented Programming, July 2005.
Andi Bejleri, Jonathan Aldrich, and Kevin Bierhoff. Ego: Controlling the Power of Simplicity. Proceedings of the Workshop on Foundations of Object Oriented Languages, January 2005.
Matthew Kehrt and Jonathan Aldrich. A Theory of Linear Objects. In 2008 International Workshop on Foundations of Object-Oriented Languages (FOOL '08), January 2008.
Jonathan Aldrich, Robert J. Simmons, and Key Shin. SASyLF: An Educational Proof Assistant for Language Theory. Proc. Functional and Declarative Programming in Education, 2008.
Ciera Jaspan, Trisha Quan, and Jonathan Aldrich. Error Reporting Logic. In Proceedings of the International Conference on Automated Software Engineering, September 2008.
Jonathan Aldrich, Joshua Sunshine, Darpan Saini, and Zachary Sparks. Typestate-Oriented Programming. In Proceedings of Onward!, 2009.

So--it's a great group to do research in. We'd love to have you join us! Contact jonathan.aldrich@cs.cmu.edu for more information.

Jonathan Aldrich