CS 15-212: Principles of Programming
(Spring 2008)

Course Information


Logistics

Lectures:  Mo,We   11:30 - 12:50 (room C008)
Recitations:  every Tu 10:30 - 11:20 (room A028)   and   Su 10:30 - 11:20 (room A041) as deemed necessary

Class Webpage:   http://qatar.cmu.edu/cs/15212

Instructor: Iliano Cervesato
Office hours:  by appointment
Office:  A128
Email: 

Co-instructor: Thierry Sans
Office hours:  by appointment
Office:  behind A128A
Email: 

Course Links

Coursework Calendar

Item      Posted                Due      Corrected
Hw1       14 Jan                26 Jan   30 Jan
Hw2       27 Jan                09 Feb   13 Feb
Hw3       10 Feb                23 Feb   27 Feb
Midterm   27 Feb                -        02 Mar
Hw4       02 Mar                15 Mar   19 Mar
Hw5       16 Mar                05 Apr   09 Apr
Hw6       06 Apr                19 Apr   23 Apr
Final     30 Apr (8:30, C008)   -        07 May

About this course


Description

This course introduces students who have experience with basic data structures and algorithms to more advanced skills, concepts, and techniques in programming and in Computer Science in general. This will be accomplished along three dimensions.

Prerequisites

You must have completed CS 15-211 (Fundamental Data Structures and Algorithms)

Software

The course relies extensively on the programming language Standard ML (SML) and related utilities, mainly ML-Lex, ML-Yacc and Concurrent ML. The particular implementation we will be working with is Standard ML of New Jersey (SML/NJ), version 110.65.

SML at CMU-Q

A reference build has been made available on the Unix clusters. To run it, you need to log in to your Unix account. In Windows, you do this by starting PuTTY and specifying unix.qatar.cmu.edu as the machine name. When the PuTTY window comes up, type sml, do your work, and then hit CTRL-D when you are done.

You can edit your files directly under Unix (the easiest way is to run the X-Win32 utility from Windows and then run the Emacs editor from the PuTTY window by typing emacs - see also this tutorial).

If you want to do all this from your own laptop, you first need to install X-Win32 from here. PuTTY is pre-installed in Windows.

SML on Your Own Laptop

If you want, you can install a personal copy of SML/NJ on your laptop. To do this, download this file and follow these instructions. Personal copies are for your convenience: all software will be evaluated on the reference environment on unix.qatar.cmu.edu. You need to make sure that your homework assignments work there before submitting them. To do so, transfer your files onto unix.qatar.cmu.edu and test them there, using the PSFTP utility which comes with PuTTY (or any of the many more user-friendly FTP front-ends).

Documentation

Useful documentation can be found on the SML/NJ web site. The following files will be particularly useful:

Readings

The 15-212 Wiki

The material for all lectures can be found on the 15-212 wiki. This is a wiki, not a textbook. The main differences are:

The Textbook

You have been given a copy of the textbook:

Use it mainly as a reference: the lectures will not follow it.

Further References

Grading

Schedule of Classes

Mon 14 Jan.
Lecture 1
Welcome and Course Introduction
Evaluation and Typing
We outline the course and its goals, and talk about various administrative issues. We also introduce the language ML, which is used throughout the course.
ML concepts:
Useful Review:
Tue 15 Jan.
Recitation 1
SML, Style
Further readings:
Useful Review:
Wed 16 Jan.
Lecture 2
Declarations, Binding, Scope, and Functions
We introduce declarations which evaluate to environments. An environment collects a set of bindings of variables to values which can be used in subsequent declarations or expressions. We also discuss the rules of scope which explain how references to identifiers are resolved. This is somewhat tricky for recursive function declarations.
Core concepts:
  • Declaration
  • Binding
  • Environment
  • Scope
ML concepts:
  • Declarations (val, type)
  • Local declarations (let)
  • Functions (fn, fun)
  • SML Basis library
Further readings:
Useful Review:
  • Types
  • Values and Expressions
  • Typing Rules
  • Evaluation Rules
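
As a small illustration of the binding and scope rules described for this lecture, the sketch below uses made-up names (x, addX, fact); it is not taken from the course material. It shows that a later declaration shadows an earlier one without changing functions that already refer to the old binding, and that a fun declaration puts the function name in scope inside its own body.

```sml
(* Hypothetical bindings, for illustration only. *)
val x = 1
fun addX n = n + x          (* refers to the binding x = 1 in scope here *)
val x = 10                  (* shadows x; addX still sees the old binding *)
val r1 = addX 5             (* uses x = 1 *)

(* let introduces local bindings whose scope ends at "end". *)
val r2 = let val y = x * 2 in y + 1 end   (* uses the new x = 10 *)

(* With fun, the name fact is already bound inside the body,
   which is what makes the recursive reference resolve. *)
fun fact n = if n = 0 then 1 else n * fact (n - 1)
val r3 = fact 5
```

Here r1 is 6, r2 is 21, and r3 is 120.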

Sun 20 Jan.
Recitation 1.5
Using and Editing a Wiki
Readings:
Further readings:
Useful Review:
Mon 21 Jan.
Lecture 3
Recursion and Induction
We review the methods of mathematical and complete induction and show how they can be applied to prove the correctness of ML functions. Key is an understanding of the operational semantics of ML. Induction can be a difficult proof technique to apply, since we often need to generalize the theorem we want to prove, before the proof by induction goes through. Sometimes, this requires considerable ingenuity. We also introduce clausal function definitions based on pattern matching.
ML concepts:
Further readings:
Useful Review:
  • Functions
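
The need to generalize a theorem before induction goes through can be seen on a small example. The sketch below is an assumption-laden illustration (the names power and powerAcc are hypothetical), written with clausal definitions.

```sml
(* power (b, n) computes b^n for n >= 0.
   Correctness follows by mathematical induction on n. *)
fun power (_, 0) = 1
  | power (b, n) = b * power (b, n - 1)

(* Accumulator version. Trying to prove powerAcc (b, n, 1) = b^n
   directly gets stuck, because the recursive call changes the third
   argument. The generalized claim
       powerAcc (b, n, acc) = acc * b^n   for all acc
   does go through by induction on n. *)
fun powerAcc (_, 0, acc) = acc
  | powerAcc (b, n, acc) = powerAcc (b, n - 1, b * acc)
```

The original statement is then the special case acc = 1 of the generalized one.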
Tue 22 Jan.
Recitation 2
Scoping in recursive functions; Complete induction
Readings:
Further readings:
Useful Review:
Wed 23 Jan.
Lecture 4
Datatypes, Patterns, and Lists
One of the most important features of ML is that it allows the definition of new types with so-called datatype declarations. This means that programs can be written to manipulate the data in a natural representation rather than in complex encodings. This goes hand-in-hand with clausal function definitions using pattern matching on given data types. We introduce lists and polymorphic datatypes and functions.
Core concepts:
ML concepts:
Further readings:
Useful Review:
  • Clausal function definition
  • Pattern matching
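
A minimal sketch of a polymorphic datatype with clausal functions over it follows. The tree type and the function names are chosen here for illustration and are not claimed to be the ones used in the lecture.

```sml
(* A polymorphic binary tree: 'a is the type of the elements. *)
datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree

(* One clause per constructor; the pattern names the components. *)
fun size Empty = 0
  | size (Node (l, _, r)) = size l + 1 + size r

fun inorder Empty = []
  | inorder (Node (l, x, r)) = inorder l @ (x :: inorder r)

(* Lists are themselves a datatype with constructors [] and ::. *)
fun len [] = 0
  | len (_ :: xs) = 1 + len xs

val t = Node (Node (Empty, 1, Empty), 2, Node (Empty, 3, Empty))
```

Both size and inorder work at any element type, for example on an int tree such as t or on a string tree.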

Sun 27 Jan.
Recitation 2.5
Fibonacci numbers
Readings:
Further readings:
Useful Review:
Mon 28 Jan.
Lecture 5
Structural Induction and Tail Recursion
We discuss the method of structural induction on recursively defined types. This technique parallels standard induction on predicates, but has a unique character of its own, and arises often in programming. We also discuss tail recursion, a form of recursion that is somewhat like the use of loops in imperative programming. This form of recursion is often especially efficient and easy to analyze. Accumulator arguments play an important role in tail recursion. As examples we consider recursively defined lists and trees
ML concepts:
Useful Review:
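
To make the role of accumulator arguments concrete, here is a sketch of list reversal in two styles. The names slowRev, revAcc and reverse are hypothetical.

```sml
(* Direct version: not tail recursive, and quadratic because of @. *)
fun slowRev [] = []
  | slowRev (x :: xs) = slowRev xs @ [x]

(* Tail-recursive version: the recursive call is the last thing done,
   and the accumulator acc holds the part reversed so far.
   Invariant, provable by structural induction on the first argument:
       revAcc (xs, acc) = slowRev xs @ acc *)
fun revAcc ([], acc) = acc
  | revAcc (x :: xs, acc) = revAcc (xs, x :: acc)

fun reverse xs = revAcc (xs, [])
```

The invariant is stated for an arbitrary acc; taking acc = [] gives reverse xs = slowRev xs.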
Tue 29 Jan.
Recitation 3
Lists; Equality types
Readings:
Further readings:
Useful Review:
Wed 30 Jan.
Lecture 6
Higher Order Functions and Staged Computation
We discuss higher order functions, specifically, passing functions as arguments, returning functions as values, and mapping functions over recursive data structures. Key to understanding functions as first class values is understanding the lexical scoping rules. We discuss staged computation based on function currying
Core concepts:
ML concepts:
  • Higher-order functions on lists (map, foldl, ...)
Further readings:
Useful Review:
  • Functions
  • Evaluation
  • Scope
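
The following sketch illustrates functions as arguments and results, and a simple form of staging through currying. All names (compose, add, staged, sq) are made up for this example; only map, foldl, List.tabulate and List.nth come from the SML Basis library.

```sml
(* Returning a function as a value. *)
fun compose (f, g) = fn x => f (g x)

(* A curried function can be partially applied. *)
fun add x y = x + y
val inc = add 1

(* Passing functions as arguments. *)
val doubled = map (fn x => 2 * x) [1, 2, 3]
val total = foldl (op +) 0 [1, 2, 3, 4]

(* Staging: the table depends only on n, so it is built once when
   staged is applied to n, and reused by every later call. *)
fun staged n =
  let
    val table = List.tabulate (n, fn i => i * i)
  in
    fn i => List.nth (table, i)
  end

val sq = staged 10
```

With these definitions, sq 3 and sq 7 both look up the table built by the single call staged 10.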

Mon 4 Feb.
Lecture 7
Data Structures
Core concepts:
  • Modularity
  • Abstract data types
  • Representation invariants
  • Binary search trees
ML concepts:
  • Signatures
  • Structures
  • Signature ascription
Further readings:
Useful Review:
  • Datatypes
Tue 5 Feb.
Recitation 4
Representation Invariants
We demonstrate a complicated representation invariant using Red/Black Trees. The main lesson is to understand the subtle interactions of invariants, data structures, and reliable code production. In order to write code satisfying a strong invariant, it is useful to proceed in stages. Each stage satisfies a simple invariant, and is provably correct. Together the stages satisfy the strong invariant
Core concepts:
  • Representation invariants
  • Weakening invariants
ML concepts:
Useful Review:
  • Structural induction
Wed 6 Feb.
Lecture 8
Functors and Substructures
A functor is a parameterized module that acts as a kind of function which takes zero or more structures as arguments and returns a new structure as result. Functors greatly facilitate hierarchical organization in large programs. In particular, as discussed in the next few lectures, they can enable a clean separation between the details of particular definition and higher-level structure, allowing the implementation of "generic" algorithms that are easier to debug and maintain, and that maximize code reuse
Core concepts:
  • Parameterized modules
  • Code reuse
ML concepts:
  • Functors
  • Substructures
Further readings:
Useful Review:
  • Functions
  • Structures
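
As a hedged sketch of how a functor separates a generic algorithm from the details of a particular definition, the code below builds sets as sorted lists over any ordered type. The names ORDERED, IntOrdered, ListSet and IntSet are assumptions made for this example.

```sml
signature ORDERED =
sig
  type t
  val compare : t * t -> order
end

structure IntOrdered : ORDERED =
struct
  type t = int
  val compare = Int.compare
end

(* The functor takes a structure as argument and returns a structure. *)
functor ListSet (Elem : ORDERED) =
struct
  type elem = Elem.t
  type set = elem list            (* invariant: strictly increasing *)

  val empty : set = []

  fun insert (x, []) = [x]
    | insert (x, y :: ys) =
        (case Elem.compare (x, y) of
             LESS => x :: y :: ys
           | EQUAL => y :: ys
           | GREATER => y :: insert (x, ys))

  fun member (_, []) = false
    | member (x, y :: ys) =
        (case Elem.compare (x, y) of
             LESS => false
           | EQUAL => true
           | GREATER => member (x, ys))
end

structure IntSet = ListSet (IntOrdered)
val s = IntSet.insert (2, IntSet.insert (3, IntSet.insert (1, IntSet.empty)))
```

Applying ListSet to another structure matching ORDERED would reuse the same insertion and membership code at a different element type.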

Sun 10 Feb.
Recitation 4.5
Recursion
Readings:
Further readings:
Useful Review:
Mon 11 Feb.
Lecture 9
Continuations
Continuations act as "functional accumulators." The basic idea of the technique is to implement a function f by defining a tail-recursive function f' that takes an additional argument, called the continuation. This continuation is a function; it encapsulates the computation that should be done on the result of f. In the base case, instead of returning a result, we call the continuation. In the recursive case we augment the given continuation with whatever computation should be done on the result. Continuations can be used to advantage for programming solutions to a variety of problems. In today's lecture we'll look at a simple example where continuations are used to efficiently manage a certain pattern of control. We'll see a related and more significant example in an upcoming lecture when we look at regular expressions
Core concepts:
  • Continuation
  • Functional accumulator
  • Control pattern
ML concepts:
Useful Review:
  • Higher-order functions
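
The recipe described above (call the continuation in the base case, extend it in the recursive case) can be sketched as follows. The functions sumK and prodK are illustrative assumptions and not necessarily the example used in the lecture.

```sml
(* sumK (xs, k) passes the sum of xs to the continuation k. *)
fun sumK ([], k) = k 0
  | sumK (x :: xs, k) = sumK (xs, fn s => k (x + s))

fun sum xs = sumK (xs, fn s => s)

(* A pattern of control: when a 0 is found, the continuation is
   discarded, so none of the pending multiplications is performed. *)
fun prodK ([], k) = k 1
  | prodK (0 :: _, _) = 0
  | prodK (x :: xs, k) = prodK (xs, fn p => k (x * p))

fun prod xs = prodK (xs, fn p => p)
```

Both functions are tail recursive; the work that a direct recursive version would leave on the stack is recorded in the continuation instead.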
Tue 12 Feb.
Recitation 5
Review
Readings:
Further readings:
Useful Review:
Wed 13 Feb.
Lecture 10
Regular Expressions
Regular expressions, and their underlying finite-state automata, are useful in many different applications, and are central to text processing languages and tools such as awk, Perl, emacs and grep. Regular expression pattern matching has a simple and elegant implementation in SML using continuation passing.
Core concepts:
  • Formal language
  • Regular expression
  • Continuation passing
  • Proof-directed debugging
ML concepts:
Useful Review:
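
A minimal sketch of a continuation-passing matcher follows. The datatype, the function names acc and accepts, and the progress check in the Star case are assumptions of this example, not necessarily the code presented in class.

```sml
datatype regexp =
    Zero                          (* matches nothing *)
  | One                           (* matches the empty string *)
  | Char of char
  | Plus of regexp * regexp       (* alternation *)
  | Times of regexp * regexp      (* concatenation *)
  | Star of regexp                (* iteration *)

(* acc r cs k is true if cs splits into a prefix matching r
   and a remainder accepted by the continuation k. *)
fun acc Zero _ _ = false
  | acc One cs k = k cs
  | acc (Char _) [] _ = false
  | acc (Char c) (d :: cs) k = (c = d) andalso k cs
  | acc (Plus (r1, r2)) cs k = acc r1 cs k orelse acc r2 cs k
  | acc (Times (r1, r2)) cs k = acc r1 cs (fn cs' => acc r2 cs' k)
  | acc (Star r) cs k =
      k cs orelse
      (* require r to consume input, otherwise Star One would loop *)
      acc r cs (fn cs' => cs' <> cs andalso acc (Star r) cs' k)

fun accepts r s =
  acc r (String.explode s) (fn [] => true | _ => false)
```

The initial continuation only accepts the empty remainder, so accepts r s holds exactly when the whole string matches.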

Sun 17 Feb.
Recitation 5.5
Currying, folding, and mapping
Further readings:
Useful Review:
Mon 18 Feb.
Lecture 11
Exceptions
Exceptions play an important role in the system of static and dynamic checks that make SML a safe language. Exceptions are the first type of effect that we will encounter; they may cause an evaluation to be interrupted or aborted. We have already seen simple uses of exceptions in the course, primarily to signal that invariants are violated or exceptional boundary cases are encountered. We now look a little more closely at what exceptions are and how they can be used. In addition to signaling error conditions, exceptions can sometimes also be used in backtracking search procedures or other patterns of control where a computation needs to be partially undone
Readings:
  • Effects
  • Static and dynamic checks
  • Exception handling
  • Backtracking
Useful Review:
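
One way to picture exceptions as a backtracking device is a change-making function that tries the largest coin first and undoes the choice if it leads to a dead end. The exception Change and the function change are names assumed for this sketch.

```sml
exception Change

(* change coins amt returns a list of coins, drawn from coins,
   that adds up to amt; it raises Change if none exists. *)
fun change _ 0 = []
  | change [] _ = raise Change
  | change (c :: cs) amt =
      if c > amt then change cs amt
      else
        (* try using c; if that fails later on, the handler
           abandons the attempt and continues without c *)
        ((c :: change (c :: cs) (amt - c))
         handle Change => change cs amt)
```

For instance, making 16 from coins 5 and 2 first tries three 5s, fails on the remaining 1, and backtracks to two 5s and three 2s.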
Tue 19 Feb.
Recitation 6
Tail Recursion vs Continuations
Readings:
Further readings:
Useful Review:
Wed 20 Feb.
Lecture 12
n-Queens
The same problem can be solved using several, very different, techniques. We examine a classic puzzle, the n-queens problem, and compare solutions that use exceptions, continuations and functions that return options
Readings:
  • N-Queens problem
Further readings:
Useful Review:
  • Exceptions
  • Continuations

Sun 24 Feb.
Recitation 6.5
Midterm review
Mon 25 Feb.
Review
Midterm review
Tue 26 Feb.
Recitation 7
Midterm review
Wed 27 Feb.
Midterm
Midterm

Sun 2 Mar.
Recitation 7.5
TBA
Readings:
Further readings:
Useful Review:
Mon 3 Mar.
Lecture 13
Mutation and State
The programming techniques used so far in the course have, for the most part, been "purely functional". Some problems, however, are more naturally addressed by keeping track of the "state" of an internal machine. Typically this requires the use of mutable storage. ML supports mutable cells, or references, that store values of a fixed type. The value in a mutable cell can be initialized, read, and changed (mutated), and these operations result in effects that change the store. Programming with references is often carried out with the help of imperative techniques. Imperative functions are used primarily for the way they change storage, rather than for their return values
Readings:
  • Mutable cell
  • Storage effect
  • References
Useful Review:
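
The three operations on a mutable cell (initialize, read, change) can be sketched as below. The names cell, makeCounter, next and alias are hypothetical.

```sml
val cell = ref 0                  (* initialize *)
val () = cell := !cell + 1        (* read with !, change with := *)

(* An imperative function: called for its effect on private state.
   Each call to makeCounter allocates a fresh cell. *)
fun makeCounter () =
  let
    val count = ref 0
  in
    fn () => (count := !count + 1; !count)
  end

val next = makeCounter ()
val a = next ()
val b = next ()

(* Two names for the same cell: a change through one is seen
   through the other. *)
val alias = cell
val () = alias := 42
val c = !cell
```

Here a is 1, b is 2, and c is 42 even though the assignment went through alias.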
Tue 4 Mar.
Recitation 8
Ascription, where, and functors
Readings:
Further readings:
Useful Review:
Wed 5 Mar.
Lecture 14
Ephemeral Data Structures
Previously, within the purely functional part of ML, we saw that all values were persistent. At worst, a binding might shadow a previous binding. As a result our queues and dictionaries were persistent data structures. Adding an element to a queue did not change the old queue; instead it created a new queue, possibly sharing values with the old queue, but not modifying the old queue in any way. Now that we are able to create cells and modify their contents we can create ephemeral data structures. These are data structures that change over time. The main advantage of such data structures is their ability to maintain state as a shared resource among many routines. Another advantage in some cases is the ability to write code that is more time-efficient than purely functional code. The disadvantages are error and complexity: our routines may accidentally and irreversibly change the contents of a data structure; variables may be aliases for each other. As a result it is much more difficult to prove the correctness of code involving ephemeral data structures. As always, it is a good idea to keep mutation to a minimum and to be careful about enforcing invariants. We present two examples. First, we consider a standard implementation of hash tables. We use arrays to implement generic hash tables as a functor parameterized by an abstract hashable equality type. Second, we revisit the queue data structure, now defining an ephemeral queue. The queue signature clearly indicates that internal state is maintained. Our implementation uses a pair of reference cells containing mutable lists, and highlights some of the subtleties involved when reasoning about references. We end the lecture with a few words about ML's value restriction. The value restriction is enforced by the ML compiler in order to avoid runtime type errors. All expressions must have well-defined lexically-determined static types.
Readings:
  • Ephemeral data structures
  • Maintaining state with mutable storage
  • Value restriction
Further readings:
Useful Review:
  • Reference cells

Sun 9 Mar.
Recitation 8.5
TBA
Readings:
Further readings:
Useful Review:
Mon 10 Mar.
Lecture 15
Streams, Demand-Driven Computation
Functions in ML are evaluated eagerly, meaning that the arguments are reduced before the function is applied. An alternative is for function applications and constructors to be evaluated in a lazy manner, meaning expressions are evaluated only when their values are needed in a further computation. Lazy evaluation can be implemented by "suspending" computations in function values. This style of evaluation is essential when working with potentially infinite data structures, such as streams, which arise naturally in many applications. Streams are lazy lists whose values are determined by suspended computations that generate the next element of the stream only when forced to do so
Readings:
  • Demand-driven computation
  • Eager vs. lazy evaluation
  • Suspensions
  • Streams as infinite lists
Further readings:
Useful Review:
  • Functions
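
Suspending a computation in a function value can be sketched with the following stream type, in which the tail of a stream is a function of type unit -> 'a stream. The datatype and the names from, take, smap and sfilter are assumptions of this example.

```sml
(* The tail is suspended: it is only computed when applied to (). *)
datatype 'a stream = Nil | Cons of 'a * (unit -> 'a stream)

(* The infinite stream n, n+1, n+2, ... *)
fun from n = Cons (n, fn () => from (n + 1))

(* Force only the first n elements. *)
fun take (0, _) = []
  | take (_, Nil) = []
  | take (n, Cons (x, rest)) = x :: take (n - 1, rest ())

fun smap f Nil = Nil
  | smap f (Cons (x, rest)) = Cons (f x, fn () => smap f (rest ()))

fun sfilter p Nil = Nil
  | sfilter p (Cons (x, rest)) =
      if p x then Cons (x, fn () => sfilter p (rest ()))
      else sfilter p (rest ())

val nats = from 0
val evens = sfilter (fn n => n mod 2 = 0) nats
```

Because ML is eager, from 0 would not terminate if the tail were not wrapped in fn () => ...; with the suspension, only the elements requested by take are ever generated.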
Tue 11 Mar.
Recitation 9
Arrays and mutable state
Readings:
Further readings:
Useful Review:
Wed 12 Mar.
Lecture 16
Memoization
We continue with streams, and complete our implementation by introducing a memoizing delay function. Memoization ensures that a suspended expression is evaluated at most once. When a suspension is forced for the first time, its value is stored in a reference cell and simply returned when the suspension is forced again. The implementation that we present makes a subtle and elegant use of a "self-modifying" code technique with circular references
Readings:
  • Memoization
  • Circular references
Further readings:
Useful Review:
  • Streams
  • Reference cells
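
A memoizing delay in the spirit described above can be sketched as follows. This is an illustration under assumptions (the names susp, delay, force and the placeholder function are invented here), not the implementation presented in the lecture.

```sml
type 'a susp = unit -> 'a

(* delay f returns a suspension that runs f at most once. *)
fun delay (f : unit -> 'a) : 'a susp =
  let
    (* placeholder contents, overwritten two lines below *)
    val cell = ref (fn () => raise Fail "suspension not initialized")
    (* first runs f, then overwrites the cell with a function that
       just returns the stored value: the "self-modifying" step *)
    fun first () =
      let val v = f ()
      in cell := (fn () => v); v end
    (* circular reference: cell holds first, and first mentions cell *)
    val () = cell := first
  in
    fn () => (!cell) ()
  end

fun force (s : 'a susp) = s ()
```

Forcing the same suspension twice runs the delayed computation only the first time; the second force just returns the saved value.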

Sun 16 Mar.
Recitation 9.5
TBA
Readings:
Further readings:
Useful Review:
Mon 17 Mar.
Lecture 17
Decidability, tractability, and tiling
In this and the next lecture we discuss the computability of functions in ML. By the Church-Turing thesis this is the same notion of computability as we have in recursion theory, with Turing machines, etc. There are two main ideas to show that certain functions are not computable: diagonalization (which is a direct argument), and problem reduction (which shows that a problem is undecidable by giving a reduction from another undecidable problem)
Readings:
  • Halting problem
  • Decision problem
  • Decision procedure
  • Diagonalization argument
  • Problem reduction
  • Equality of functions
Further readings:
Useful Review:
Tue 18 Mar.
Recitation 10
Operations on streams; Sequences and flip-flops
Readings:
Further readings:
Useful Review:
Wed 19 Mar.
Lecture 18
Computability
Although the undecidability of the Halting problem implies that no program can give a yes/no answer to many problems of interest, it is possible to write programs that return yes when the answer is yes but may run forever when the answer is no. This is called a semi-decision procedure. For some other problems, it is always possible to correctly return a no, but a positive answer may never be returned. There is also a class of problems for which neither a yes nor a no answer is guaranteed to be returned in finite time.
Readings:
  • Hierarchy of problems
  • Semi-decision
Further readings:
Useful Review:
  • Decision procedures
  • Halting problem

Mon 24 Mar.
No class (Spring Break)
Tue 25 Mar.
Wed 26 Mar.

Sun 30 Mar.
Recitation 10.5
TBA
Readings:
Further readings:
Useful Review:
Mon 31 Mar.
Lecture 19
Regular Expressions and Lexical Analysis
Many applications require some form of tokenization or lexical analysis to be carried out as a preprocessing step. Examples include compiling programming languages, processing natural languages, or manipulating HTML pages to extract structure. As an example, we study a lexical analyzer for a simple language of arithmetic expressions
Readings:
  • Language hierarchy
  • Compilation
  • Regular expressions
  • Tokenization
  • Lexical analysis
Further readings:
Useful Review:
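
A hand-written tokenizer for arithmetic expressions might look like the sketch below. The token type and the functions lexNum, lex and tokenize are assumptions made for this example; the course may instead rely on ML-Lex.

```sml
datatype token = NUM of int | PLUS | TIMES | LPAREN | RPAREN

(* Read a maximal run of digits, accumulating its value in n. *)
fun lexNum (n, []) = (n, [])
  | lexNum (n, c :: cs) =
      if Char.isDigit c
      then lexNum (10 * n + (ord c - ord #"0"), cs)
      else (n, c :: cs)

fun lex [] = []
  | lex (#"+" :: cs) = PLUS :: lex cs
  | lex (#"*" :: cs) = TIMES :: lex cs
  | lex (#"(" :: cs) = LPAREN :: lex cs
  | lex (#")" :: cs) = RPAREN :: lex cs
  | lex (c :: cs) =
      if Char.isSpace c then lex cs            (* skip white space *)
      else if Char.isDigit c then
        let val (n, rest) = lexNum (0, c :: cs)
        in NUM n :: lex rest end
      else raise Fail ("unexpected character " ^ str c)

fun tokenize s = lex (String.explode s)
```

The output is a flat list of tokens, which is the form a parser would consume next.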
Tue 1 Apr.
Recitation 11
Languages
Readings:
Further readings:
Useful Review:
Wed 2 Apr.
Lecture 20
Grammars
Context-free grammars arise naturally in a variety of applications. The "Abstract Syntax Charts" in programming language manuals are one instance. The underlying machine for a context-free language is a pushdown automaton, which maintains a read-write stack that allows the machine to "count"
Readings:
  • Context-free grammar
  • Pushdown automaton
Useful Review:
  • Language hierarchy
  • Compilation

Sun 6 Apr.
Recitation 11.5
TBA
Readings:
Further readings:
Useful Review:
Mon 7 Apr.
Lecture 21
Parsing
In this lecture we continue our discussion of context-free grammars, and demonstrate their role in parsing. Shift-reduce parsing uses a stack to delay application of rewrite rules, enabling operator precedence to be enforced. Recursive descent parsing is another style that uses recursion in a way that mirrors the grammar productions. Although parser generator tools exist for restricted classes of grammars, a direct implementation can allow greater flexibility and better error handling. We present an example of a shift-reduce parser for a grammar of arithmetic expressions
Readings:
  • Recursive descent parsing
  • Shift-reduce parsing
Further readings:
Useful Review:
  • Language hierarchy
  • Context-free grammar
Tue 8 Apr.
Recitation 12
TBA
Readings:
Further readings:
Useful Review:
Wed 9 Apr.
Lecture 22
Evaluation
We now put together lexical analysis and parsing with evaluation. The result is an interpreter that evaluates arithmetic expressions directly, rather than by constructing an explicit translation of the code into an intermediate language, and then into machine language, as a compiler does. Our first example uses the basic grammar of arithmetic expressions, interpreting them in terms of operations over the rational numbers. In this and the next lecture we extend this simple language to include conditional statements, variable bindings, function definitions, and recursive functions
Readings:
  • Interpreter
  • Evaluation
Further readings:
Useful Review:
  • Lexing
  • Parsing
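
To show what evaluating directly means, here is a sketch of an interpreter over an abstract syntax tree with variable bindings. It assumes the parser has already produced the tree, uses int where the lecture uses rational numbers, and all names (exp, lookup, eval) are hypothetical.

```sml
datatype exp =
    Num of int
  | Var of string
  | Add of exp * exp
  | Mul of exp * exp
  | Let of string * exp * exp       (* let x = e1 in e2 *)

(* The environment is a list of (variable, value) bindings;
   the most recent binding comes first, so it shadows older ones. *)
fun lookup (x, []) = raise Fail ("unbound variable " ^ x)
  | lookup (x, (y, v) :: env) = if x = y then v else lookup (x, env)

fun eval env (Num n) = n
  | eval env (Var x) = lookup (x, env)
  | eval env (Add (a, b)) = eval env a + eval env b
  | eval env (Mul (a, b)) = eval env a * eval env b
  | eval env (Let (x, e1, e2)) = eval ((x, eval env e1) :: env) e2

(* let x = 2 + 3 in x * x *)
val example = Let ("x", Add (Num 2, Num 3), Mul (Var "x", Var "x"))
val result = eval [] example
```

The value of result is 25; no intermediate or machine code is produced along the way.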

Sun 13 Apr.
Recitation 12.5
TBA
Readings:
Further readings:
Useful Review:
Mon 14 Apr.
Lecture 23
Concurrency
All techniques discussed so far assumed a sequential model of computation, where all the code was executed on a single machine. Whenever computation takes place on several processors at the same time, we have concurrency. One form of concurrency is parallel computing, where the computation is split among the available processors. Another form is distributed computing, where independent machines collaborate by exchanging information. In this lecture, we introduce this general taxonomy and discuss some of the issues that distinguish concurrent from sequential systems, especially communication. We explore the main techniques to address these problems.
Readings:
  • Parallelism
  • Distributed Systems
  • Shared-memory systems
  • Semaphores
  • Message passing
Further readings:
Useful Review:
Tue 15 Apr.
Recitation 13
Decidability, tractability, and tiling
Readings:
Further readings:
Useful Review:
Wed 16 Apr.
Lecture 24
Networking
Networked applications are an especially important case of distributed systems. We explore a few of the main concepts by showing how to write a simple web server.
Readings:
  • Asynchronous message passing
  • Protocols
Further readings:
Useful Review:

Sun 20 Apr.
Recitation 13.5
TBA
Readings:
Further readings:
Useful Review:
Mon 21 Apr.
Lecture 25
Combinators
Combinators are functions of functions, that is, higher-order functions used to combine functions. One example is ML's composition operator o. The basic idea is to think at the level of functions, rather than at the level of values returned by those functions. Combinators are defined using the pointwise principle. Currying makes this easy in ML. We first discuss combinators of functions of type int -> int. Then we discuss rewriting our regular expression matcher using combinators. We use staging: the regular expression pattern matching is in one stage, and the character functions are in another.
Readings:
  • Function spaces
  • Combinators
  • Pointwise principle
Further readings:
Useful Review:
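
The pointwise principle for functions of type int -> int can be sketched as below: an operation on integers is lifted to an operation on functions by applying it at every argument. The names plus, times, const, id, square, poly and h are assumptions made for this example.

```sml
(* Lift + and * from integers to functions, pointwise. *)
fun plus (f, g) = fn x => f x + g x
fun times (f, g) = fn x => f x * g x

fun const (c : int) = fn (_ : int) => c
fun id (x : int) = x

(* Built entirely from other functions, without naming an argument. *)
val square = times (id, id)
val poly = plus (square, plus (times (const 2, id), const 1))  (* x^2 + 2x + 1 *)

(* ML's composition operator o is itself a combinator. *)
val h = square o (fn x => x + 1)                               (* (x + 1)^2 *)
```

Since x^2 + 2x + 1 = (x + 1)^2, poly and h agree on every integer, although they are built from different combinators.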
Tue 22 Apr.
Recitation 14
Review
Readings:
Further readings:
Useful Review:
Wed 23 Apr.
Review
Final review

30 Apr
8:30-11:30
(C008)
Final
Final

Iliano Cervesato