Semantics of Sequential and Parallel Programs

Stephen Brookes
Carnegie Mellon University
Department of Computer Science

15-812
Fall 2000
MW 10:30-11:50
Starts on Wednesday, 13 September
Wean 4615A

This course can be taken as a ``starred course'' to fulfil the Programming Languages core requirement of the CMU CS Ph.D. program.

Course Outline

This course introduces foundational concepts and techniques of programming language semantics, and demonstrates their application in program analysis and synthesis and their relevance to language design and implementation. We bring out the key concepts and techniques, building in the necessary mathematics and logic as we go. We aim to demonstrate the utility of a scientific approach to programming languages in answering questions about the pros and cons of various languages, in establishing compiler correctness, and in other practical settings.

Syllabus

A tentative syllabus for the course includes:
    Introduction
      Static and dynamic semantics
      Denotational and operational semantics
      Notions of semantic equivalence
      Levels of abstraction

    Sequential imperative programming

      Stores, environments, and state transformations
      Hoare's Logic for partial and total correctness
      Jumps and continuations
      Correctness of compiler optimizations

    Functional programming

      Types and polymorphism
      Call-by-name, call-by-value, and lazy evaluation
      Direct and continuation-style semantics

    Parallel imperative programming

      Shared-variable programs
      Communicating processes and dataflow networks
      Safety and liveness properties, fairness

    Combining functional and imperative programming

    Logic programming (if time permits)

Texts

Lecture notes will be handed out on a regular basis. The following texts are suggested for background reading:

  1. Reynolds, Theories of Programming Languages, Cambridge University Press, ISBN 0-521-59414-6.

  2. Winskel, The Formal Semantics of Programming Languages: An Introduction, MIT Press, ISBN 0-262-23169-7.

  3. Paulson, ML for the Working Programmer, Cambridge University Press, ISBN 0-521-39022-2.

Pre-requisites

This course should be accessible to graduate students and master's-level students in CS, Math, or Philosophy. Some mathematical maturity will be expected. Advanced undergraduates wishing to take this course should contact the instructor for permission.

Introduction

There is a wide variety of programming languages, ranging from low-level machine code and assembler to high-level languages such as C, Java, and Standard ML. A few major paradigms have been identified (e.g. ``functional'', ``imperative'', ``object-oriented'') and a particular language may belong to one or more of these. Although this type of classification can be useful to a limited extent, it is much more important to understand the relationship between program syntax (what we write) and algorithms (what happens, in abstract terms, when a program runs). This is one of the purposes of programming language semantics. A semantics for a programming language is a mathematical model that reflects the intended computational behavior of programs. To be useful, a semantics needs to be:

    tractable: as simple as possible without losing the ability to express behavior accurately;
    abstract: uncluttered by irrelevant detail;
    computational: an accurate abstraction from runtime behavior.
To support syntax-directed, or modular, program development and analysis, a semantics needs to be compositional: the meaning of a compound phrase should be determined by the meanings of its parts, as the sketch below illustrates.
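
As a minimal sketch of what a compositional (denotational) definition looks like, consider a tiny expression language with numerals, variables, and addition, written in Standard ML (the language used for programming in this course). The names exp, state, and meaning are illustrative only, not taken from the lecture notes:

    (* A tiny expression language: numerals, variables, and addition. *)
    datatype exp = Num of int
                 | Var of string
                 | Plus of exp * exp

    (* A state maps variable names to their integer values. *)
    type state = string -> int

    (* meaning : exp -> state -> int
       The meaning of a compound phrase (Plus) is defined purely from
       the meanings of its parts: this is compositionality. *)
    fun meaning (Num n)         (s : state) = n
      | meaning (Var x)         s           = s x
      | meaning (Plus (e1, e2)) s           = meaning e1 s + meaning e2 s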

In addition to its role in validating program correctness, semantics can provide a firm foundation for many activities related to programming and programming languages.

  1. Semantics can yield answers to questions of programming language design. For example, is Java ``better'' than C++? Many such questions simply don't have an unambiguous yes/no answer. Semantics can help to clarify the picture by allowing us to make rigorous statements about properties of programs and the interactions between program constructs.

  2. Lessons learned in programming language theory can also guide the design of special-purpose ``little'' languages, such as scripting languages or mark-up languages. Despite their narrow scope of application, such languages still need careful design.

  3. Semantics can influence language implementors. For example, many code optimization tricks used in compilers (e.g. loop unrolling, code motion, goto-chasing) rely for their soundness on results from programming language theory.

  4. The theory of programming languages provides foundations for the kind of ``formal methods'' advocated in software engineering, by allowing precise specifications of intended program behavior and permitting proofs that a program does (or perhaps does not) meet its specification.

We will focus on two of the most successful styles of semantic description: denotational and operational semantics. Each style has its advantages and disadvantages, so it is important to be able to discern the pros and cons of a particular semantics, and it is worth understanding the ideas behind each style. For example, a denotational semantics is automatically compositional, while an operational semantics typically is not; nevertheless, one can usually still justify modular program analysis in the operational style, although the style of presentation tends to make this harder. It is often said that operational semantics is ``easier to understand'', but this is not a universal truth. In any case, neither style is clearly ``better'' than the other; it is best to regard the two as complementary approaches to understanding languages and programs.
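
To make the contrast concrete, here is an equally minimal sketch (again illustrative, not taken from the lecture notes) of a small-step operational semantics for the same expression language, reusing the exp datatype and state type declared above. Rather than assigning a meaning to each phrase directly, it describes how a phrase is rewritten one step at a time:

    (* step : state -> exp -> exp
       Rewrite a phrase by one computation step in a given state. *)
    fun step (s : state) (Plus (Num n1, Num n2)) = Num (n1 + n2)
      | step s (Plus (Num n1, e2)) = Plus (Num n1, step s e2)
      | step s (Plus (e1, e2))     = Plus (step s e1, e2)
      | step s (Var x)             = Num (s x)
      | step s (Num n)             = Num n   (* already a value *)

    (* eval recovers the final answer by iterating step, rather than by
       composing the meanings of subphrases. *)
    fun eval s (Num n) = n
      | eval s e       = eval s (step s e)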

As is traditional in semantics texts and research, we do not work with entire ``real'' programming languages like C and Java. Instead we focus on small ``core'' languages with an idealized (and simplified) syntax. Each core language is chosen to exhibit the features common to languages in a specific paradigm, with a minimum of syntactic distraction. In each case, we will show how a semantics can be used to analyze individual programs, to design correct programs, and to prove general laws of program equivalence that can be used to simplify programs while preserving correctness.
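
One example of such a law, for a typical sequential imperative core language, is the familiar loop-unrolling equivalence (stated informally here, ahead of its treatment in the course):

    while b do c   is equivalent to   if b then (c; while b do c) else skip

Once a law like this has been validated by the semantics, it can be applied to simplify programs without re-proving correctness from scratch each time.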

We will begin with a simple sequential imperative programming language, since it requires little semantic sophistication and allows us to introduce the major concepts, notation, and techniques painlessly. We then apply and generalize these ideas and techniques to a series of more complex languages. The approach is incremental: we study each feature (e.g. parallelism) in as simple a setting as possible, to minimize the potential for confusion and bring out the key issues caused by the feature itself, rather than having to deal with a grab-bag language containing many logically disparate features (like parallelism and modules) and trying to disentangle them.

The course will have a strong mathematical flavor, but also a strong emphasis on writing programs. Students will be expected to write programs in Standard ML. Despite its primarily functional nature, ML can be used to illustrate imperative, object-oriented, and concurrent programming issues as well as issues related to functional programming. By limiting ourselves to a single language we avoid the need to learn a new language every week. We also use ML to implement semantic definitions for other languages, thus providing a testbed for debugging semantic descriptions and for trying out various combinations of language features.
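
As a small illustration (not from the course materials) of ML's imperative side, the following function uses references, assignment, and a while loop to sum the integers from 1 to n:

    (* sumTo n computes 1 + 2 + ... + n using mutable references. *)
    fun sumTo n =
        let
            val total = ref 0
            val i     = ref 1
        in
            while !i <= n do (total := !total + !i; i := !i + 1);
            !total
        end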

Grading

Grading will be based on homeworks, a midterm examination, a final examination, and (to a lesser and more informal extent) on class participation. Homework will include both written assignments and programming problems. The midterm and final will be open-book.

The course will be graded (for CS graduate students) on a pass/fail basis. Undergraduates will be assigned a letter grade.

Students are encouraged to form study groups to discuss homework problems and course material, but any work turned in for a grade must be solely the work of the individual. It is not acceptable to solve a problem collectively in a group and then write the answers up for individual submission. Copying or adaptation from a published book or article, or from previous core course documentation, is also unacceptable.

Lecture Notes

Lecture notes handed out in class may also be obtained here (in ps format).

Lecture Notes vol. 1

Homework

Homework 1 is due January 28, in class:

LaTeX source for homework 1

PS version of homework 1

Some macros useful for typesetting homeworks in LaTeX, designed by John Reynolds, can be found in:

diagmac.tex

catmac.tex

Some documentation on these macros:

diagmac.doc