CS 15-212: Principles of Programming
(Spring 2009)

Schedule of Classes


Sun 11 Jan
Lecture 1
Welcome and Course Introduction
We outline the course, its goals, and talk about various administrative issues.

Inductive Computing
We begin by reviewing a few simple proofs of numerical properties by mathematical and complete induction and infer their distinguishing factor as the necessary fallback on a base case. We use this observation to give a computational expression to the participating entities as inductive function definitions. We conclude by discussing the differences between inductively defined functions and recursive functions.
Mon 12 Jan
Recitation 1
Exercises on Inductive Computing
Tue 13 Jan
Lecture 2
Introduction to SML
This lecture introduces the technical underpinnings of the language Standard ML as well as some elementary constructs. In particular, it discusses the functional programming paradigm, the characteristics of strongly typed languages, and interpreter-based execution.
Wed 14 Jan
Recitation 2
SML, Style

Sun 18 Jan
Lecture 3
Inductive Data Structures
The concept of inductive definition, introduced in the specific case of natural numbers, is easily extended to generic data structures as long as they are finite. These correspond to the notion of freely-generated expressions from abstract algebra. We demonstrate this technique in the case of lists and trees. Such inductive definitions of data structures are paralleled by a simple generalization of traditional proofs by induction, called structural induction.
Mon 19 Jan
Recitation 3
Exercises on Inductive Data Structures
Tue 20 Jan
Lecture 4
Datatypes, Patterns, and Lists
The algebraic concept of inductive data structure is available in ML as the datatype mechanism. This provides a way to represent data in a natural fashion. We define datatypes for a variety of data structures and show how to write functions for them. We see that the structure of a datatype definition naturally leads to clausal function definitions, a convenient way to work with them based on pattern matching. We discuss at length ML's predefined support for lists and introduce the programming concept of polymorphism.
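As a minimal sketch of the datatype mechanism and clausal definitions described here, we can define a polymorphic binary tree and two functions over it. The names tree, size, toList, and t are our own illustrative choices, not taken from the lecture material.

```sml
(* A polymorphic binary tree: 'a is a type variable, so the same
   definition serves for trees of integers, strings, and so on. *)
datatype 'a tree = Leaf
                 | Node of 'a tree * 'a * 'a tree

(* size : 'a tree -> int
   One clause per constructor, selected by pattern matching. *)
fun size Leaf = 0
  | size (Node (l, _, r)) = size l + 1 + size r

(* toList : 'a tree -> 'a list
   In-order traversal, producing one of ML's predefined lists. *)
fun toList Leaf = []
  | toList (Node (l, x, r)) = toList l @ (x :: toList r)

(* A small example tree holding 1, 2, 3. *)
val t = Node (Node (Leaf, 1, Leaf), 2, Node (Leaf, 3, Leaf))
```

Each function has exactly one clause per constructor of the datatype, which is the shape that a proof by structural induction over trees would follow as well.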
Wed 21 Jan
Recitation 4
Exercises on Datatypes, Patterns, and Lists

Sun 25 Jan
Lecture 5
Proving Properties of Programs
Inductive definitions can be used not only to describe data structures, but also to give a formal specification to programming concepts such as typing and evaluation. We concentrate on ML evaluation and show how the methodology of structural induction can be used to prove properties about programs. We apply it to termination proofs, equivalence proofs, and to prove the correctness of an ML function with respect to a specification. Along the way, we stumble upon common techniques to prove uncooperative theorems by induction, in particular the concept of generalization.
Mon 26 Jan
Recitation 5
Exercises on Properties of Programs
Tue 27 Jan
Lecture 6
Declarations, Binding, and Scope
We take a closer look at how ML manages declarations. Declarations evaluate to environments, which collect a set of bindings of variables to values which can be used in subsequent declarations or expressions. We introduce the mechanisms provided in ML for local declarations and expand on the rules of scope, which explain how references to identifiers are resolved. This is somewhat tricky for recursive function declarations. We also discuss tail recursion, a form of recursion that is somewhat like the use of loops in imperative programming. This form of recursion is often especially efficient and easy to analyze. Accumulator arguments play an important role in tail recursion.
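To illustrate local declarations and accumulator arguments, here is a small sketch that sums a list of integers twice: once by plain recursion and once tail-recursively. The names sum, sumTail, and loop are hypothetical and chosen only for this example.

```sml
(* Plain recursion: the addition is performed after the recursive
   call returns, so the call is not a tail call. *)
fun sum [] = 0
  | sum (x :: xs) = x + sum xs

(* Tail-recursive version. The helper loop is declared locally with
   let, so it is not in scope outside of sumTail. *)
fun sumTail xs =
  let
    (* loop (ys, acc) returns acc plus the sum of ys.
       acc is the accumulator argument. *)
    fun loop ([], acc) = acc
      | loop (y :: ys, acc) = loop (ys, y + acc)
  in
    loop (xs, 0)
  end
```

The comment on loop states the generalized property one would prove by induction to show that sumTail agrees with sum.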
Wed 28 Jan
Recitation 6
Exercises on Declarations, Binding and Scope

Sun 1 Feb
Lecture 7
Representation Invariants
As inductive definitions get more complex, it becomes a challenge to convince oneself (and prove to others) that they are actually correct. The argument usually relies on invariants that the representation is expected to satisfy at any time. Making these invariants explicit is extremely useful. We demonstrate it on red/black trees, a relatively complex search tree.
Mon 2 Feb
Recitation 7
Exercises on Representation Invariants
Tue 3 Feb
Lecture 8
We discuss the separation between specification and implementation in the context of code reuse. The specification describes the functionalities of an abstract data type, with its syntactic aspects expressed as a module interface. The implementation consists of specific code that realizes these functionalities, code whose details are invisible to the user of a module. We also introduce advanced concepts such as parametric modules. The discussion is concretized by examining the specific module system of ML, in particular the concepts of signature, structure, functor and ascription.
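A possible illustration of signature, structure, ascription, and functor is a queue whose representation is hidden from its users. The names QUEUE, TwoListQueue, and Drain, and the two-list representation, are assumptions made for this sketch rather than the example used in class.

```sml
(* Specification: the operations a queue offers. The type 'a queue
   is left abstract. *)
signature QUEUE =
sig
  type 'a queue
  val empty   : 'a queue
  val enqueue : 'a * 'a queue -> 'a queue
  val dequeue : 'a queue -> ('a * 'a queue) option
end

(* Implementation: a pair (front, back) where back is kept reversed.
   The opaque ascription :> hides this representation from clients. *)
structure TwoListQueue :> QUEUE =
struct
  type 'a queue = 'a list * 'a list
  val empty = ([], [])
  fun enqueue (x, (front, back)) = (front, x :: back)
  fun dequeue ([], []) = NONE
    | dequeue ([], back) = dequeue (rev back, [])
    | dequeue (x :: front, back) = SOME (x, (front, back))
end

(* A parametric module: it works with any structure matching QUEUE. *)
functor Drain (Q : QUEUE) =
struct
  (* toList q removes elements until the queue is empty. *)
  fun toList q =
    case Q.dequeue q of
      NONE => []
    | SOME (x, q') => x :: toList q'
end

structure D = Drain (TwoListQueue)

val q = TwoListQueue.enqueue (3,
          TwoListQueue.enqueue (2,
            TwoListQueue.enqueue (1, TwoListQueue.empty)))
val items = D.toList q
```

Because of the opaque ascription, code outside TwoListQueue cannot treat a queue as a pair of lists, so the implementation could be replaced without affecting clients.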
Wed 4 Feb
Recitation 8
Exercises on Modularity

Sun 8 Feb
Lecture 9
First-Class Functions
Higher-order functions are functions that manipulate other functions. We introduce the notion of nameless functions and functions as first-class values. We discuss currying, a common transformation between some traditional functions and some higher-order functions, and study situations where one or the other representation is advantageous. We present some standard higher-order functions that allow us to work concisely with lists and other inductive data structures.
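The following sketch shows currying and a few standard higher-order list functions from the SML Basis Library (map, foldl, List.filter). The names addPair, add, increment, curry, and uncurry are illustrative definitions of our own.

```sml
(* Uncurried: takes both arguments at once, as a pair. *)
fun addPair (x, y) = x + y

(* Curried: takes one argument and returns a function awaiting the other. *)
fun add x y = x + y

(* Partial application of a curried function. *)
val increment = add 1

(* The transformation between the two forms is itself a
   higher-order function. *)
fun curry f x y = f (x, y)
fun uncurry f (x, y) = f x y

(* Nameless functions passed to standard list functions. *)
val squares = map (fn x => x * x) [1, 2, 3]
val total   = foldl (op +) 0 [1, 2, 3, 4]
val evens   = List.filter (fn x => x mod 2 = 0) [1, 2, 3, 4]
```

The curried form is convenient when a function is specialized once and reused, as with increment; the uncurried form fits functions such as foldl that expect an operation on pairs.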
Mon 9 Feb
Recitation 9
Exercises on Higher-Order Functions
Tue 10 Feb
Lecture 10
One very useful application of higher-order functions is as a way to control the execution of a program: continuations act as "functional accumulators." The basic idea of the technique is to implement a function f by defining a tail-recursive function f' that takes an additional argument, the continuation. This continuation is a function; it encapsulates the computation that should be done on the result of f. In the base case, instead of returning a result, we call the continuation. In the recursive case we augment the given continuation with whatever computation should be done on the result.
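The recipe above can be sketched on list summation. Here sum plays the role of f and sum' the role of f'; the wrapper sumCPS is a hypothetical name for the function that supplies the initial continuation.

```sml
(* f: the direct version, not tail recursive. *)
fun sum [] = 0
  | sum (x :: xs) = x + sum xs

(* f': sum' : int list -> (int -> 'a) -> 'a
   The extra argument k is the continuation. It says what to do
   with the sum of the list once that sum is known. *)
fun sum' [] k = k 0                                  (* base case: call k *)
  | sum' (x :: xs) k = sum' xs (fn r => k (x + r))   (* augment k *)

(* Recover the original function by passing the identity continuation. *)
fun sumCPS xs = sum' xs (fn r => r)
```

Every call to sum' is a tail call; the work that sum leaves pending on the call stack is recorded in the continuation instead.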
Wed 11 Feb
Recitation 10
Exercises on Continuations

Sun 15 Feb
Lecture 11
Puzzles and Games
Implementing games efficiently requires explicit and complex control of the execution. In this lecture, we look at games and their representation as a mathematical problem. We define games and game trees, and study two strategies for choosing the next move in a game.
Mon 16 Feb
Recitation 11
Exercises on Game Tree Search
Tue 17 Feb
Lecture 12
Exceptions are another way to control execution programmatically. We see that exceptions can be used not only to signal error conditions, but also in backtracking search procedures or other patterns of control where a computation needs to be partially undone. Exceptions are the first type of effect that we encounter; they may cause an evaluation to be interrupted or aborted.
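A small sketch of exceptions used for backtracking is the problem of making change. The exception Change and the function change are names chosen for this illustration, and we assume that coin values are positive and the amount is non-negative.

```sml
exception Change

(* change coins amount returns a list of coins, each drawn from
   coins and possibly repeated, whose sum is amount.
   It raises Change if no such list exists. *)
fun change _ 0 = []
  | change [] _ = raise Change
  | change (c :: cs) amount =
      if c > amount then change cs amount
      else
        (* Try using coin c. If that choice leads to a dead end,
           the handler undoes it and tries the remaining coins. *)
        ((c :: change (c :: cs) (amount - c))
         handle Change => change cs amount)
```

With coins [5, 2] and amount 16, the search first commits to three 5s, fails on the remaining 1, and backtracks to the answer [5, 5, 2, 2, 2].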
Wed 18 Feb
Recitation 12
Exercises on Exceptions

Sun 22 Feb
Lecture 13
Combinators are functions of functions, that is, higher-order functions used to combine functions. One example is the function composition operator. The basic idea is to think at the level of functions, rather than at the level of values returned by those functions. Combinators are defined using the pointwise principle.
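As an illustration of the pointwise principle, the sketch below lifts addition and multiplication from integers to integer-valued functions. The combinators plus, times, and const and the function id are hypothetical names; o is ML's built-in composition operator.

```sml
(* Pointwise sum and product: each is defined by what the
   resulting function does at an arbitrary point x. *)
fun plus (f, g) = fn x => f x + g x
fun times (f, g) = fn x => f x * g x

(* The constant function and the identity on integers. *)
fun const (c : int) = fn (_ : int) => c
fun id (x : int) = x

(* Built purely by combining functions: p x = x * x + 2 * x + 1. *)
val p = plus (times (id, id), plus (times (const 2, id), const 1))

(* Composition: q x = p (x + 1). *)
val q = p o (fn x => x + 1)
```

Note that the definition of p never mentions an argument: it is written entirely at the level of functions.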
Mon 23 Feb
Recitation 13
Midterm review
Tue 24 Feb
Wed 25 Feb
Recitation 14
Exercises on Combinators

Sun 1 Mar
Lecture 14
Co-Inductive Definitions
Inductive data structures such as lists and trees are meant to be finite. Surprisingly, many of the operations defined on them make sense also when removing the finiteness constraint, except that we take care that these operations return useful results even when applied to potentially infinite entities. Mathematically, such data structures are said to be co-inductively defined. We briefly examine how to prove properties about co-inductive definitions.
Mon 2 Mar
Recitation 15
Exercises on Co-Inductive Definitions
Tue 3 Mar
Lecture 15
Demand-Driven Computation
Data streams (as in YouTube for example) are a prominent computational instance of a co-inductive data structure. We discuss how to support them in a programming language through the concept of lazy evaluation. Functions in ML are evaluated eagerly, meaning that the arguments are reduced before the function is applied. An alternative is for function applications and constructors to be evaluated in a lazy manner, meaning expressions are evaluated only when their values are needed in a further computation. In an eager language such as ML, lazy evaluation can be simulated by relying on the fact that functions are values to "suspend" the computation. This style of evaluation is essential when working with potentially infinite data structures, such as streams, which arise naturally in many applications. Streams are then lazy lists whose values are determined by suspended computations that generate the next element of the stream only when forced to do so.
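A simplified sketch of this idea represents the tail of a stream as a function of type unit -> 'a stream, so that it is computed only when applied to (). The datatype and the names from, take, and smap are assumptions for this example; the representation used in class may differ, for instance by suspending the whole stream rather than only its tail.

```sml
(* A stream is empty, or an element together with a suspended tail. *)
datatype 'a stream = Nil
                   | Cons of 'a * (unit -> 'a stream)

(* from n is the infinite stream n, n+1, n+2, ...
   The recursive call sits under fn, so it is not evaluated
   until the tail is forced. *)
fun from n = Cons (n, fn () => from (n + 1))

(* take n s returns the first n elements of s as a list,
   forcing only as much of the stream as needed. Assumes n >= 0. *)
fun take 0 _ = []
  | take _ Nil = []
  | take n (Cons (x, rest)) = x :: take (n - 1) (rest ())

(* smap applies f to every element, lazily. *)
fun smap f Nil = Nil
  | smap f (Cons (x, rest)) = Cons (f x, fn () => smap f (rest ()))
```

Although from 1 denotes an infinite stream, take 5 (from 1) terminates because only five tails are ever forced.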
Wed 4 Mar
Recitation 16
Exercises on Demand-Driven Computation

Sun 8 Mar
Lecture 16
This lecture investigates the limits of computation. We introduce the Halting problem, a simple problem that no program will ever be able to solve. We then show that many other problems are not computable through two important techniques: diagonalization (which is a direct argument), and problem reduction (which shows that a problem is undecidable by giving a reduction from another undecidable problem).
Mon 9 Mar
Recitation 17
Exercises on Decidability
Tue 10 Mar
Lecture 17
State and Ephemeral Data Structures
The programming techniques used so far in the course have, for the most part, been "purely functional". Some problems, however, are more naturally addressed by keeping track of the "state" of an internal machine. Typically this requires the use of mutable storage. ML supports mutable cells, or references, that store values of a fixed type. The value in a mutable cell can be initialized, read, and changed (mutated), and these operations result in effects that change the store. Programming with references is often carried out with the help of imperative techniques. Imperative functions are used primarily for the way they change storage, rather than for their return values. References allow us to create data structures that are ephemeral, i.e., that change over time. Their main advantage is the ability to maintain state as a shared resource among many routines. Another advantage in some cases is the ability to write code that is more time-efficient than purely functional code. On the other hand, they are conceptually complex and therefore error-prone: our routines may accidentally and irreversibly change the contents of a data structure; variables may be aliases for each other. As a result it is much more difficult to prove the correctness of code involving ephemeral data structures.
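The three operations on references (ref to initialize, ! to read, := to change) and the aliasing hazard can be sketched as follows. The names newCounter, cell, a, and b are illustrative only.

```sml
(* newCounter : unit -> (unit -> int)
   Each call allocates a fresh cell. The returned function is
   imperative: calling it changes the store. *)
fun newCounter () =
  let
    val cell = ref 0                  (* initialize *)
  in
    fn () => (cell := !cell + 1;      (* read, then mutate *)
              !cell)
  end

(* Aliasing: a and b are two names for the same cell. *)
val a = ref 5
val b = a
val () = b := 7                       (* !a is now 7 as well *)
```

Two successive calls to the same counter return different values, so equational reasoning of the kind used for purely functional code no longer applies directly.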
Wed 11 Mar
Recitation 18
Exercises on References and Ephemeral Data Structures

Sun 15 Mar
Lecture 18
Although the Halting problem shows that there is no hope of building a program that gives a yes/no answer to many problems of interest, it is possible to write programs that return a yes answer when the answer is yes but may run forever when the answer is no. This is called a semi-decision procedure. For some other problems, it is always possible to correctly return a no, but a positive answer may never be returned. There is also a class of problems for which neither a yes nor a no answer is guaranteed to be returned in finite time.
Mon 16 Mar
Recitation 19
Exercises on Computability
Tue 17 Mar
Lecture 19
We continue with streams, and complete our implementation by introducing a memoizing delay function. Memoization ensures that a suspended expression is evaluated at most once. When a suspension is forced for the first time, its value is stored in a reference cell and simply returned when the suspension is forced again. The implementation that we present makes a subtle and elegant use of a "self-modifying" code technique with circular references.
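One way such a memoizing delay might look is sketched below. This is a hedged illustration, not necessarily the implementation presented in class; the names delay, cell, and first are our own.

```sml
(* delay : (unit -> 'a) -> (unit -> 'a)
   Returns a suspension that runs f at most once. *)
fun delay (f : unit -> 'a) : unit -> 'a =
  let
    (* cell holds the code to run when the suspension is forced.
       It starts with a placeholder and is overwritten just below. *)
    val cell = ref f

    (* On the first force: compute the value, then replace the
       contents of cell with a function that returns it directly. *)
    fun first () =
      let
        val v = f ()
      in
        cell := (fn () => v);
        v
      end

    (* Circular reference: cell now contains first, and first
       refers back to cell in order to overwrite itself. *)
    val () = cell := first
  in
    fn () => (!cell) ()
  end
```

Forcing the suspension a second time runs only the small function stored by first, so any effects of f happen once.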
Wed 18 Mar
Recitation 20
Exercises on Memoization

Sun 22 Mar
No class (Spring Break)
Mon 23 Mar
Tue 24 Mar
Wed 25 Mar

Sun 29 Mar
Lecture 20
Language Hierarchy and Regular Languages
With this class, we begin looking at the mathematical ingredients that are involved in building a programming language. We start by studying languages in general and classify them into a hierarchy based on their expressiveness and complexity. Several layers of this hierarchy are found in a typical interpreter. We examine in some detail regular languages and their relations to regular expressions and finite-state automata.
Mon 30 Mar
Recitation 21
Exercises on Language Hierarchy and Regular Languages
Tue 31 Mar
Lecture 21
Regular Expressions and Lexical Analysis
Regular expressions - and their underlying finite-state automata - are useful in many different applications, and are central to text processing languages and tools such as awk, Perl, emacs and grep. Regular expression pattern matching has a number of simple and elegant implementations in ML. Regular expressions are the key ingredient of lexical analysis, a pre-processing step carried out by many applications to recognize legitimate words. Examples include compiling programming languages, processing natural languages, or manipulating HTML pages to extract structure. As an example, we study a lexical analyzer for a simple language of arithmetic expressions.
Wed 1 Apr
Recitation 22
Exercises on Regular Expressions and Lexical Analysis

Sun 5 Apr
Lecture 22
Context-free grammars arise naturally in a variety of applications. The "Abstract Syntax Charts" in programming language manuals are one instance. The underlying machine for a context-free language is a pushdown automaton, which maintains a read-write stack that allows the machine to "count".
Mon 6 Apr
Recitation 23
Exercises on Grammars
Tue 7 Apr
Lecture 23
In this lecture we rely on context-free grammars to parse programs. We look at the two main parsing techniques, shift-reduce and recursive descent. Shift-reduce parsing uses a stack to delay application of rewrite rules, enabling operator precedence to be enforced. Recursive descent parsing is another style that uses recursion in a way that mirrors the grammar productions. Although parser generator tools exist for restricted classes of grammars, a direct implementation can allow greater flexibility and better error handling. We present an example of a shift-reduce parser for a grammar of arithmetic expressions.
Wed 8 Apr
Recitation 24
Exercises on Parsing

Sun 12 Apr
Lecture 24
Programming Language Semantics
Formal languages give us a way to determine whether a program is syntactically valid. Getting them to tell us whether it makes any sense, or how to compute a result, is better addressed by semantic means. In this lecture, we introduce one of the basic frameworks for doing so, and apply it to describe how to evaluate a program.
Mon 13 Apr
Recitation 25
Exercises on Programming Language Semantics
Tue 14 Apr
Lecture 25
We implement the evaluation semantics as an evaluator, and put it together with lexical analysis and parsing. The result is an interpreter that evaluates arithmetic expressions directly, rather than by constructing an explicit translation of the code into an intermediate language, and then into machine language, as a compiler does. Our first example uses the basic grammar of arithmetic expressions, interpreting them in terms of operations over the rational numbers. We then extend this simple language to include conditional statements, variable bindings, function definitions, and recursive functions.
Wed 15 Apr
Recitation 26
Exercises on Interpreters

Sun 19 Apr
Lecture 26
All techniques discussed so far assumed a sequential model of computation, where all the code was executed on a single machine. Whenever computation takes place on several processors at the same time, we have concurrency. One form of concurrency is parallel computing, where the computation is split among the available processors. Another form is distributed computing, where independent machines collaborate by exchanging information. In this lecture, we introduce this general taxonomy from a mathematical perspective and discuss some of the issues that distinguish concurrent from sequential systems, especially communication. We explore the main techniques to address these problems.
Mon 20 Apr
Recitation 27
Exercises on Concurrency
Tue 21 Apr
Lecture 27
Networked applications are an especially important case of distributed systems. We explore a few of the main concepts by showing how to write a simple web server.
Wed 22 Apr
Recitation 28
Final review

Mon 27 Apr

Iliano Cervesato