CS 15-122: Principles of Imperative Computation
(Spring 2018)


Schedule of Classes


Outline


Mon 15 Jan
No class (Martin Luther King Day)
Tue 16 Jan
Lecture 1
Welcome and Course Introduction
We outline the course and its goals, and discuss various administrative issues.

A mysterious function ...
We examine a program we know nothing about, making hypotheses about what it is supposed to do. We notice that this function has no meaningful output for some inputs, which leads us to restrict its valid inputs using preconditions. We use a similar mechanism, postconditions, to describe the value it returns. Along the way, we get acquainted with C0 and its support for checking pre- and post-conditions. We then notice that this function doesn't return the expected outputs even for some valid inputs ...
Concepts:
  • Pre- and post-conditions
  • Testing
  • Contract support in C0
Thu 18 Jan
Lecture 2
Contracts
Contracts are program annotations that spell out what the code is supposed to do. They are the key to connecting algorithmic ideas to their implementation as a program. In this lecture, we illustrate the use of contracts by means of a simple C0 program. As we do so, we learn to verify loop invariants, an important type of contract; we see how contracts can help us write correct code; and we get acquainted with C0's automated support for validating contracts.
Concepts:
  • Loop invariants
  • Assertions
  • Using contracts to write correct programs
  • Contract support in C0
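To make this concrete, here is a small C0 function annotated with contracts in the style of these lectures (a minimal sketch of our own, not the course's actual example):

    int f(int b, int e)
    //@requires e >= 0;
    {
      int r = 1;
      int base = b;
      int exp = e;
      while (exp > 0)
      //@loop_invariant exp >= 0;
      {
        if (exp % 2 == 1) r = r * base;
        base = base * base;
        exp = exp / 2;
      }
      //@assert exp == 0;
      return r;
    }

Running it in coin or compiling with cc0 -d causes the //@ annotations to be checked dynamically.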
Fri 19 Jan
Recitation 1
C0 Basics
This recitation reviews elementary C0 constructs and practices reasoning about code.
Sat 20 Jan
Lab 1
Setup
This lab practices using Linux and running the C0 interpreter and compiler.
Tue 23 Jan
Lecture 3
Ints
In this lecture, we explore how number representation interplays with the ability to write effective contracts. We focus on integers and see how the binary representation called two's complement supports the laws of modular arithmetic, which C0 embraces. We also examine operations that exploit the bit pattern of a binary number to achieve compact representations of other entities of interest, and to manipulate them.
Concepts:
  • Representation of integers
  • Two's complement
  • Modular arithmetic
  • Bit-level operations
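For instance, bit-level operations let us read off parts of a word; a minimal sketch (the function is ours):

    /* Extract the i-th byte of x, counting from the least significant.
       C0 ints are 32-bit two's complement values. */
    int get_byte(int x, int i)
    //@requires 0 <= i && i <= 3;
    //@ensures 0 <= \result && \result < 256;
    {
      return (x >> (8 * i)) & 0xFF;
    }

Two's complement also validates identities such as -x == ~x + 1 for every int.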
Thu 25 Jan
Lecture 4
Arrays
In this lecture, we examine arrays as our first composite data structure, i.e., a data construction designed to hold multiple values, together with operations to access them. Accessing an array element outside of its range is undefined — it is a safety violation — and we see how to use contracts, in particular loop invariants, to ensure the safety of array accesses in a program. Arrays are stored in memory, which means that they are manipulated through an address. This raises the possibility of aliasing, a notorious source of bugs in carelessly written programs.
Concepts:
  • Arrays
  • Memory allocation
  • Safe access
  • Loop invariants for arrays
  • Aliasing
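The aliasing phenomenon in a nutshell (a made-up C0 snippet):

    int main() {
      int[] A = alloc_array(int, 10);  // cells are initialized to 0
      int[] B = A;                     // B is an alias of A, not a copy
      B[0] = 42;
      //@assert A[0] == 42;            // a write through B is visible through A
      return A[0];
    }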
Fri 26 Jan
Recitation 2
A Bit about Bytes
This recitation practices base conversion and writing code that manipulates bits.
Mon 29 Jan
Lab 2
A Reversal of Fortune
This lab practices working with bitwise operations on integers and with arrays.
Tue 30 Jan
Lecture 5
Searching Arrays
We practice reasoning about arrays by implementing a function that searches an array for a given value — this is the gateway to a whole class of useful operations. We notice that this function returns its result more quickly when the array is sorted. We write a specialized variant that assumes that the array is sorted, and show that it works correctly by reasoning about array bounds. The first (simpler but less efficient) version acts as a specification for the second (slightly more complex but often faster) one. Using the specification in the contract for the implementation is a standard technique that helps us write correct code.
Concepts:
  • Linear search
  • Reasoning about arrays
  • Sorted arrays
  • Performance as number of operations executed
  • Specification vs. implementation
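A sketch of the sorted-array variant; the helper specification functions developed in class (e.g., is_sorted) are assumed here and shown only as a comment:

    int search(int x, int[] A, int n)
    //@requires 0 <= n && n <= \length(A);
    // assuming in addition: is_sorted(A, 0, n)  (helper from lecture)
    {
      for (int i = 0; i < n; i++)
      //@loop_invariant 0 <= i && i <= n;
      {
        if (A[i] == x) return i;
        if (A[i] > x) return -1;  // A is sorted: x cannot occur further right
      }
      return -1;
    }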
Thu 1 Feb
Lecture 6
Sorting Arrays
We examine big-O notation as a mathematical tool to describe the running time of algorithms, especially for the purpose of comparing two algorithms that solve the same problem. As a case study, we use the problem of sorting an array, and for now a single sorting algorithm, selection sort. As we implement selection sort, we see that starting with contracts gives us high confidence that the resulting code will work correctly. Along the way, we develop a useful library of functions about sorted arrays to be used in contracts.
Concepts:
  • Big-O notation
  • Selection sort
  • Deliberate programming
  • Asymptotic complexity analysis
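A minimal sketch of selection sort with simple loop invariants (the full sortedness invariants, which use the helper library, are elided):

    void sort(int[] A, int n)
    //@requires 0 <= n && n <= \length(A);
    {
      for (int i = 0; i < n; i++)
      //@loop_invariant 0 <= i && i <= n;
      {
        int min = i;                       // index of smallest element seen
        for (int j = i + 1; j < n; j++)
        //@loop_invariant i <= min && min < j && j <= n;
        {
          if (A[j] < A[min]) min = j;
        }
        int tmp = A[i];                    // swap the minimum into position i
        A[i] = A[min];
        A[min] = tmp;
      }
    }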
Fri 2 Feb
Recitation 3
Function Family Reunion
This recitation practices understanding and using big-O notation.
Mon 5 Feb
Lab 3
Loopty-Loopty Loop
This lab practices testing and timing running code to estimate its complexity.
Tue 6 Feb
Lecture 7
Binary search
When searching for a value in a sorted array, examining the middle element allows us to discard half of the array at each step. The resulting algorithm, binary search, has logarithmic complexity, which is much better than the linear complexity of linear search. Achieving a correct imperative implementation can be tricky, however, and we once more use contracts as a mechanism to reach this goal.
Concepts:
  • Binary search
  • Divide-and-conquer
  • Deliberate implementation
  • Checking complex loop invariants
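A sketch of the implementation with its central loop invariant (the sortedness precondition is again left as a comment):

    int binsearch(int x, int[] A, int n)
    //@requires 0 <= n && n <= \length(A);
    // assuming in addition: is_sorted(A, 0, n)  (helper from lecture)
    {
      int lo = 0;
      int hi = n;
      while (lo < hi)
      //@loop_invariant 0 <= lo && lo <= hi && hi <= n;
      {
        int mid = lo + (hi - lo) / 2;   // never overflows, unlike (lo+hi)/2
        if (A[mid] == x) return mid;
        else if (A[mid] < x) lo = mid + 1;
        else hi = mid;
      }
      return -1;
    }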
Thu 8 Feb
Lecture 8
Quicksort
We use the key idea underlying binary search to implement two sorting algorithms with better complexity than selection sort. We examine one of them, quicksort, in detail, again using contracts to achieve a correct implementation, this time a recursive one. We observe that the asymptotic complexity of quicksort depends on the value of a quantity the algorithm uses (the pivot) and discuss ways to reduce the chances of making a bad choice for it. We conclude by examining another sorting algorithm, mergesort, which is immune to this issue.
Concepts:
  • Quicksort
  • Deliberate programming
  • Recursion
  • Best, average, and worst case complexity
  • Randomness
  • Choosing an algorithm for a problem
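A sketch of the recursive structure with a deliberately naive pivot choice, the first element; the lecture discusses why and how to do better:

    int partition(int[] A, int lo, int hi)
    //@requires 0 <= lo && lo < hi && hi <= \length(A);
    //@ensures lo <= \result && \result < hi;
    {
      int pivot = A[lo];
      int left = lo + 1;
      int right = hi;
      while (left < right)
      //@loop_invariant lo + 1 <= left && left <= right && right <= hi;
      {
        if (A[left] <= pivot) {
          left++;
        } else {
          int tmp = A[left];         // move a large element to the right end
          A[left] = A[right - 1];
          A[right - 1] = tmp;
          right--;
        }
      }
      A[lo] = A[left - 1];           // put the pivot between the two regions
      A[left - 1] = pivot;
      return left - 1;
    }

    void quicksort(int[] A, int lo, int hi)
    //@requires 0 <= lo && lo <= hi && hi <= \length(A);
    {
      if (hi - lo <= 1) return;
      int p = partition(A, lo, hi);
      quicksort(A, lo, p);
      quicksort(A, p + 1, hi);
    }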
Fri 9 Feb
Recitation 4
A Strange Sort of Proof
This recitation reviews proving the correctness of functions.
Mon 12 Feb
Lab 4
TA Training
This lab practices working with algorithms with radically different complexity for the same problem.
Tue 13 Feb
Lecture 9
Data structures
Arrays are homogeneous collections, where all components have the same type. Structs enable building heterogeneous collections, which combine components of different types. They are key to building pervasively used data structures. In C0, a struct resides in allocated memory and is accessed through an address, which brings up a new form of safety violation: the NULL pointer violation. We extend the language of contracts to reason about pointers.
Now that we have two ways to build complex collections, we start exploring the idea of segregating the definition of a data structure and the operations that manipulate it into a library. Code that uses this data structure only needs to be aware of its type, operations, and invariants, not the way they are implemented. This is the basis of a form of modular programming called abstract data types, in which client code uses a data structure exclusively through an interface, without being aware of the underlying implementation.
Concepts:
  • struct
  • Pointers
  • Abstract data types
  • Interfaces, client code and library code
  • Data structure invariants
  • Testing
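A small sketch of a struct allocated in memory and reached through a pointer, with a contract about NULL:

    struct point {
      int x;
      int y;
    };

    struct point* make_point(int x, int y)
    //@ensures \result != NULL;
    {
      struct point* p = alloc(struct point);  // allocation never returns NULL
      p->x = x;
      p->y = y;
      return p;
    }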
Thu 15 Feb
Lecture 10
Stacks and Queues
In this lecture, we examine the interface of two fundamental data structures, stacks and queues. We practice using the exported functions to write client code that implements operations on stacks and queues that are not provided by the interface. By relying only on the interface functions and their contracts, we can write code that is correct for any implementation of stacks and queues.
Concepts:
  • Interface of stacks and queues
  • Using an interface
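For example, a client can implement a peek operation on top of the interface alone (the type and function names below follow a typical int stack interface and should be treated as assumptions):

    // assumes from the interface: stack_t, push, pop, stack_empty
    int peek(stack_t S)
    //@requires S != NULL && !stack_empty(S);
    {
      int x = pop(S);   // only interface functions are used,
      push(S, x);       // so this works for any implementation
      return x;
    }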
Fri 16 Feb
Recitation 5
A queue_t Interface
This recitation practices programming against an interface.
Mon 19 Feb
Lab 5
Misclaculation
This lab practices understanding postfix notation and stack-based machines.
Tue 20 Feb
Lecture 11
Linked Lists
We observe that we can implement array-like collections using a struct that packages each element with a pointer to the next element. This idea underlies linked lists, a data structure pervasively used in computer science. Writing correct code over linked lists is tricky, however, as we often rely on stricter invariants than the language natively supports, in particular the absence of cycles. We develop code to be used in contracts to check for such common properties. We then use linked lists to write code that implements the stack interface, and similarly for queues. We could instead have given an array-based implementation, and we note the advantages and drawbacks of each choice.
Concepts:
  • Linked lists
  • Checking data structure invariants
  • Linked list implementation of stacks and queues
  • Choosing an implementation: trade-offs
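A sketch of the list type and of one such contract helper (it assumes a cycle-free list; on a cyclic one it would loop forever):

    struct list_node {
      int data;
      struct list_node* next;
    };
    typedef struct list_node list;

    /* Does following next pointers from start reach end? */
    bool is_segment(list* start, list* end) {
      for (list* p = start; p != NULL; p = p->next) {
        if (p == end) return true;
      }
      return end == NULL;  // the NULL terminator is reached from any list
    }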
Thu 22 Feb
Midterm 1
Midterm 1
Fri 23 Feb
Recitation 6
Link it All Together
This recitation practices working with linked lists.
Mon 26 Feb
Lab 6
List(en) Up!
This lab practices working with linked lists.
Tue 27 Feb
Lecture 12
Unbounded Arrays
When implementing a data structure for a homogeneous collection, using an array has the advantage that each element can be accessed in constant time, but the drawback that we must fix the number of elements a priori. Linked lists can have arbitrary length, but access takes linear time. Can we have the best of both worlds? Unbounded arrays rely on an array to store the data, but double it when we run out of room for new elements. The effect is that adding an element can be either very cheap or very expensive depending on how full the array is. However, in a series of insertions, each one will appear to take constant time on average. Showing that this is the case requires a technique called amortized analysis, which we explore at length in this lecture.
Concepts:
  • Better trade-offs
  • Amortized analysis
  • Unbounded arrays
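A sketch of insertion into an unbounded array (field names ours); the doubling branch is where the occasional O(n) cost arises:

    struct uba {
      int size;    // number of cells in use
      int limit;   // capacity, i.e. \length(data)
      int[] data;
    };

    void uba_add(struct uba* A, int x)
    //@requires A != NULL && 0 < A->limit && A->limit == \length(A->data);
    //@requires 0 <= A->size && A->size <= A->limit;
    {
      if (A->size == A->limit) {
        int[] B = alloc_array(int, 2 * A->limit);  // double (overflow ignored)
        for (int i = 0; i < A->size; i++) B[i] = A->data[i];
        A->data = B;
        A->limit = 2 * A->limit;
      }
      A->data[A->size] = x;
      A->size++;
    }

Amortized analysis shows that n insertions cost O(n) in total, i.e., constant time per insertion on average.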
Thu 1 Mar
Lecture 13
Hash Tables
Associative arrays are data structures that allow efficiently retrieving a value on the basis of a key: arrays are the special case where the only possible keys are valid indices into the array. One popular way to implement associative arrays is to use hash tables, which compute an array index out of each key and use that index to find the associated value. However, multiple keys can map to the same index, something called a collision. We discuss several approaches to dealing with collisions, focusing on one called separate chaining. The cost of an access depends on the contents of the hash table. While a worst-case analysis is useful, it is not typically representative of normal usage. We compute the average-case complexity of an access relative to a few simple parameters of the hash table.
Concepts:
  • Associative arrays AKA dictionaries AKA maps
  • Implementation using hash tables
  • Dealing with collisions
  • Randomness
  • Average case complexity
  • Genericity — part I: void pointers
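One small but essential ingredient is turning a hash value into a bucket index (the surrounding table structure is elided):

    int index_of(int hash, int capacity)
    //@requires capacity > 0;
    //@ensures 0 <= \result && \result < capacity;
    {
      int i = hash % capacity;
      if (i < 0) i += capacity;  // C0's % can be negative for negative hashes
      return i;
    }

With separate chaining, bucket i then holds a linked list of all the entries whose keys hash there.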
Fri 2 Mar
Recitation 7
Array Disarray
This recitation practices coding to achieve amortized cost.
Mon 5 Mar
Lab 7
Hashing
This lab practices understanding collisions in hash functions.
Tue 6 Mar
Lecture 14
Hash Dictionaries
In this lecture, we look at the interface of modular code in greater depth, using hash dictionaries as a case study. In this and many other examples, it is useful for the client to fill in some parameters used by the library code so that it delivers the functionality the client needs. One such parameter is the type of some quantities the library acts upon, keys in our case. It is also often the case that the client wants to provide some of the operations used by the library, here how to hash a key and how to compare elements. This is a first step towards making libraries generic, so that they implement the general functionality of a data structure but let the client choose specific details.
Concepts:
  • Adaptable libraries
  • Client-supplied operations
Thu 8 Mar
Lecture 15
Generic Data Structures
In large (and not-so-large) systems, it is common to make multiple uses of the same library, each instantiated with different parameters. This is not possible in C0, however. To achieve this goal, we look at a richer language, called C1. C1 provides two new features: generic pointers and function pointers. Generic pointers, void* in C, allow the same library type to be instantiated with different client types at once, which gives us a way to use a hash table library with both integers and strings as keys, for example. Function pointers allow a library to be instantiated with different client-supplied operations in the same program.
Concepts:
  • Genericity — part II: function pointers
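A minimal C1 sketch of a function pointer at work (all names ours):

    typedef int int_fn(int x);          // the type of functions from int to int

    int twice(int x) { return 2 * x; }

    void map(int[] A, int n, int_fn* f)
    //@requires 0 <= n && n <= \length(A) && f != NULL;
    {
      for (int i = 0; i < n; i++)
        A[i] = (*f)(A[i]);              // call the client-supplied function
    }

    // a client would call:  map(A, n, &twice);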
Fri 9 Mar
No class (mid-semester break)
Mon 12 Mar – Fri 16 Mar
No class (Spring break)
Mon 19 Mar
Lab 8
Legacy of the void*
This lab practices defining generic libraries.
Tue 20 Mar
Lecture 16
Binary Search Trees
We discuss trees as an approach to representing a collection, and binary search trees as an effective data structure to store and operate on sorted data. In particular, most operations on balanced binary search trees have a cost logarithmic in the amount of data they contain. Binary search trees can however become unbalanced over time.
Concepts:
  • Trees
  • Binary search trees
  • Ordering invariant
  • Exponential speedup
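A sketch of lookup: the ordering invariant (left subtree smaller than the root, right subtree larger) is what justifies discarding a subtree at each step:

    struct tree_node {
      int data;
      struct tree_node* left;
      struct tree_node* right;
    };
    typedef struct tree_node tree;

    bool contains(tree* T, int x)
    // assuming in addition: the ordering invariant holds of T
    {
      if (T == NULL) return false;
      if (x == T->data) return true;
      if (x < T->data) return contains(T->left, x);
      return contains(T->right, x);
    }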
Thu 22 Mar
Lecture 17
AVL trees
Self-balancing trees guarantee the performance advantages of binary search trees by making sure that common operations keep the tree roughly balanced. This assurance comes at the price of more complex code for these operations, which we keep manageable with correspondingly stronger invariants.
Concepts:
  • AVL trees
  • AVL invariants
  • Rotations
  • Experimental validation
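The basic rebalancing step is a rotation; here is a sketch of a left rotation, with all height bookkeeping omitted:

    // (uses the tree type from the binary search tree sketch above)
    tree* rotate_left(tree* T)
    //@requires T != NULL && T->right != NULL;
    {
      tree* R = T->right;
      T->right = R->left;   // R's left subtree moves under T
      R->left = T;          // T becomes R's left child
      return R;             // R is the new root of this subtree
    }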
Fri 23 Mar
Recitation 8
Rotating Rotations
This recitation practices restoring height invariants in trees.
Mon 26 Mar
Lab 9
This One's a Treet
This lab practices using dictionaries to avoid recomputing values.
Tue 27 Mar
Lecture 18
Priority Queues
We discuss priority queues, another classical data structure. Among the possible ways to implement priority queues, we consider min-heaps, a tree-based strategy that provides superior performance. We examine the data structure invariants of min-heaps and observe that we need to violate them temporarily while carrying out operations such as insertion. To conclude, we see that min-heaps are amenable to an efficient implementation based on arrays.
Concepts:
  • Priority queues
  • Implementation strategies
  • Heap-based implementation
  • Implementing heaps using arrays
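The array encoding rests on simple index arithmetic; with the root at index 1 (a common convention), node i's relatives are:

    int parent(int i)
    //@requires i > 1;
    { return i / 2; }

    int left_child(int i)
    //@requires i >= 1;
    { return 2 * i; }

    int right_child(int i)
    //@requires i >= 1;
    { return 2 * i + 1; }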
Thu 29 Mar
Lecture 19
Restoring Invariants
In this lecture, we write a library that implements min-heaps using arrays. Of particular interest is how the temporary violation of invariants needed to implement min-heap operations manifests itself at the level of contracts. Careful pre- and post-conditions are key to writing this code correctly.
Concepts:
  • Temporary violation of invariants
  • Controlling invariant violations using contracts
  • Min-heap library implementation
  • Heapsort
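A sketch of restoring the min-heap ordering after an insertion at index i, with the heap simplified to a bare int array (root at index 1, as above):

    void sift_up(int[] H, int i)
    //@requires 1 <= i && i < \length(H);
    {
      int j = i;
      while (j > 1 && H[j] < H[j / 2])
      //@loop_invariant 1 <= j && j <= i;
      {
        int tmp = H[j];      // swap the element with its parent
        H[j] = H[j / 2];
        H[j / 2] = tmp;
        j = j / 2;
      }
      // here H[j] is no smaller than its parent, or j is the root
    }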
Fri 30 Mar
Recitation 9
Heaps of Fun
This recitation practices using priority queues.
Mon 2 Apr
Lab 10
PQ puns are too hard
This lab practices using priority queues.
Tue 3 Apr
Lecture 20
Data Structures in C
With this lecture, we are moving from the safety of C0 to the more open-ended world of C. We introduce some basic concepts of C, in particular the C preprocessor and how macros written in this language can simulate some of the effects of C0's contracts. We also see how to compile C programs and discuss separate compilation. We conclude with C's primitives to manage memory, in particular the need to free allocated memory to prevent memory leaks.
Concepts:
  • The C preprocessor
  • Macros
  • Contracts in C
  • Compilation of C programs
  • Allocation and deallocation
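A sketch of how the preprocessor can simulate C0 contracts (the macro names are ours; the course provides a similar header):

    #include <assert.h>

    #ifdef DEBUG
    #define REQUIRES(COND) assert(COND)
    #define ENSURES(COND)  assert(COND)
    #else
    #define REQUIRES(COND) ((void)0)   /* compiled away outside debug builds */
    #define ENSURES(COND)  ((void)0)
    #endif

    int square(int x) {
      REQUIRES(x < 46341 && x > -46341);  /* avoid signed overflow: 46341^2 > 2^31 */
      int r = x * x;
      ENSURES(r >= 0);
      return r;
    }

Compiling with gcc -DDEBUG turns the checks on; without the flag they compile away entirely.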
Thu 5 Apr
Midterm 2
Midterm 2
Fri 6 Apr
Recitation 10
From C1 to Shining C
This recitation practices the main novelties of C.
Mon 9 Apr
Lab 11
Once you C1 you C them all
This lab practices translating C0 code to C and avoiding memory leaks.
Tue 10 Apr
Lecture 21
C's Memory Model
C provides a very flexible view of memory, which allows writing potentially dangerous code unless one is fully aware of the consequences of one's decisions. This lecture is about building this awareness. We see that, while C overlays an organization on the monolithic block of memory the operating system assigns to a running program, it also provides primitives that allow violating this organization. We focus on two such primitives, pointer arithmetic and the address-of operator. While some of their uses are legitimate, others are not. C's approach to many non-legitimate operations is to declare them undefined, which means that what happens when a program engages in them is decided by the specific implementation of the C compiler in use.
Concepts:
  • C's memory layout
  • Pointer arithmetic
  • Undefined behavior
  • The address-of operator
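A small C program exercising both primitives, and the boundary beyond which behavior becomes undefined:

    #include <stdio.h>

    int main(void) {
      int A[4] = {10, 20, 30, 40};
      int *p = A;                /* an array decays to a pointer to its first cell */
      printf("%d\n", *(p + 2));  /* pointer arithmetic: the same cell as A[2] */

      int x = 5;
      int *q = &x;               /* address-of: q points to the local variable x */
      *q = 7;                    /* writes through q change x */
      printf("%d\n", x);         /* prints 7 */

      /* *(p + 4) would be out of bounds: undefined behavior, not an error
         the compiler is required to catch */
      return 0;
    }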
Thu 12 Apr
Lecture 22
Types in C
In this lecture, we examine how C implements basic types, and what we as programmers need to be aware of as we use them. We begin with strings, which in C are just arrays of characters with the null character playing a special role. A variety of number types are available in C, but some of their characteristics are not defined by the language, most notably their size and what happens in case of overflow. As a consequence, different compilers make different choices, which complicates writing code that will work on arbitrary hardware.
Concepts:
  • Strings in C
  • Casting
  • Numbers in C
  • Implementation-defined behavior
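For instance, a C string carries no length field; its length must be recomputed by scanning for the terminator, as in this sketch of what the standard strlen does:

    #include <stddef.h>

    size_t string_length(const char *s) {
      size_t n = 0;
      while (s[n] != '\0')  /* the null character marks the end */
        n++;
      return n;
    }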
Fri 13 Apr
Recitation 11
C-ing is Believing
This recitation practices advanced C constructions.
Mon 16 Apr
Lab 12
All sorts of sorts
This lab practices working with pointers in C.
Tue 17 Apr
Lecture 23
Program as Data: the C0VM
Getting a program written in a high-level language to run on the machine's hardware can be achieved in a number of ways, with compilation to machine code and interpretation of the source language as the two extremes. We assess the benefits and drawbacks of each and introduce virtual machines as a modern trade-off. We explore this possibility through the C0VM, a virtual machine for C0. A compilation phase takes C0 code as input and outputs bytecode that can be run by the C0VM. We examine in depth the organization of the C0VM itself, understanding how its instructions are executed through a description called an operational semantics. We also describe some of the data structures needed to implement the C0VM.
Concepts:
  • Interpreters vs. compilers
  • Virtual machines
  • Programs as data
  • Transformation to bytecode
  • Operational semantics
  • Bytecode validation
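The heart of any such machine is a fetch-and-dispatch loop. The sketch below is ours: the opcode values are made up and all bytecode validation is omitted, so it is not the actual C0VM instruction set:

    #include <stdint.h>
    #include <stdio.h>

    enum { OP_PUSH = 0x10, OP_ADD = 0x60, OP_RETURN = 0xB0 };  /* hypothetical */

    int32_t execute(const uint8_t *code) {
      int32_t stack[256];                /* operand stack (overflow unchecked) */
      int sp = 0;
      for (size_t pc = 0; ; ) {
        switch (code[pc]) {
          case OP_PUSH:                  /* push the next byte as an operand */
            stack[sp++] = (int32_t)code[pc + 1];
            pc += 2;
            break;
          case OP_ADD: {                 /* pop two operands, push their sum */
            int32_t y = stack[--sp];
            int32_t x = stack[--sp];
            stack[sp++] = x + y;
            pc += 1;
            break;
          }
          case OP_RETURN:                /* result is the top of the stack */
            return stack[sp - 1];
          default:
            return -1;                   /* invalid opcode (real VMs validate) */
        }
      }
    }

    int main(void) {
      const uint8_t prog[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_RETURN };
      printf("%d\n", execute(prog));     /* prints 5 */
      return 0;
    }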
Thu 19 Apr
No class (Carnival)
Fri 20 Apr
No class (Carnival)
Mon 23 Apr
Lab 13
passwordLab
This lab practices understanding C0VM bytecode.
Tue 24 Apr
Lecture 24
Representing Graphs
Graphs provide a uniform way to represent many complex problems, for example the moves of a game. We define a minimal interface for working with undirected graphs and discuss two implementation strategies: adjacency matrices and adjacency lists, emphasizing the pros and cons of each. We also notice that not all graph-based problems need — or even can — use an explicit representation of the underlying graph.
Concepts:
  • Graphs
  • Implicit vs. explicit graphs
  • Adjacency matrices
  • Adjacency lists
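A sketch of the adjacency-list representation in C (field names ours):

    typedef unsigned int vertex;

    struct adjlist_node {
      vertex vert;                 /* one neighbor ... */
      struct adjlist_node *next;   /* ... in a linked list of neighbors */
    };

    struct graph {
      unsigned int size;           /* vertices are 0, 1, ..., size-1 */
      struct adjlist_node **adj;   /* adj[v] lists the neighbors of v */
    };

An adjacency matrix would instead store a size-by-size table of booleans: constant-time edge queries, but quadratic space even for sparse graphs.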
Thu 26 Apr
Lecture 25
Reachability in Graphs
When working with graphs, one basic question is whether a node is reachable from another node by following a path. The destination node is often described by a property of interest — e.g., being a winning board in the graph representing the moves of a game. We examine various approaches to solving this reachability problem, in particular depth-first and breadth-first search, each of which has its own advantages. We then discuss various approaches to implementing these strategies.
Concepts:
  • Path between nodes
  • Depth-first search
  • Breadth-first search
  • Implementation strategies
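A sketch of breadth-first reachability, reusing the graph declarations sketched under Lecture 24; a plain array serves as the queue, since every vertex is enqueued at most once (error checking omitted):

    #include <stdbool.h>
    #include <stdlib.h>

    bool reachable(struct graph *G, vertex start, vertex target) {
      bool *mark = calloc(G->size, sizeof(bool));
      vertex *queue = calloc(G->size, sizeof(vertex));
      size_t head = 0, tail = 0;
      queue[tail++] = start;                 /* enqueue the starting vertex */
      mark[start] = true;
      while (head < tail) {
        vertex v = queue[head++];            /* dequeue */
        if (v == target) { free(mark); free(queue); return true; }
        for (struct adjlist_node *p = G->adj[v]; p != NULL; p = p->next) {
          if (!mark[p->vert]) {              /* enqueue unvisited neighbors */
            mark[p->vert] = true;
            queue[tail++] = p->vert;
          }
        }
      }
      free(mark); free(queue);
      return false;
    }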
Fri 27 Apr
Recitation 12
Computing on the Edge
This recitation practices working with graphs.
Mon 30 Apr
Lab 14
Spend some Cycles Thinking
This lab practices working with graphs.
Tue 1 May
Lecture 26
Spanning Trees
Given a starting node in a graph, it is often useful to superimpose onto the graph a way to visit each of the remaining nodes exactly once by following edges. This is a spanning tree for that graph. Things get particularly interesting for graphs whose edges carry a weight (e.g., the distance between cities in the graph representing a road map). Then, spanning trees with the smallest cumulative weight, called minimum spanning trees, are of particular interest. We discuss Kruskal's algorithm, a classical method for computing a minimum spanning tree.
Concepts:
  • Spanning trees
  • Minimum spanning trees
  • Kruskal's algorithm
Thu 3 May
Lecture 27
Union-Find
A collection of nodes having a given property in a graph — e.g., forming a minimum spanning tree — can be represented in many ways, for example as any permutation of the list consisting of these nodes. All these ways are equivalent, and indeed they form an equivalence class. At each step of Kruskal's algorithm, some of these equivalence classes need to be combined into bigger equivalence classes, while ensuring that the underlying property — that of the result being a spanning tree — is still maintained. The union-find data structure maintains a canonical representative for each equivalence class, and the union-find algorithm efficiently determines the canonical representative when merging two equivalence classes.
Concepts:
  • Equivalence classes
  • Canonical representative
  • The union-find data structure
  • The union-find algorithm
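A compact sketch, array-based and with path compression (the lecture's actual interface may differ); in Kruskal's algorithm, an edge (u, v) is kept only when find(parent, u) != find(parent, v):

    /* parent[i] == i means i is the canonical representative of its class. */
    int find(int *parent, int i) {
      while (parent[i] != i) {
        parent[i] = parent[parent[i]];  /* path compression: skip a level */
        i = parent[i];
      }
      return i;
    }

    void uf_union(int *parent, int i, int j) {
      parent[find(parent, i)] = find(parent, j);  /* merge the two classes */
    }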
Fri 4 May
Recitation 13
Union-finding your Roots
This recitation practices understanding the union-find algorithm.
Thu 10 May
(5:30-8:30)
Final
Final Exam

© 2018 Iliano Cervesato iliano@cmu.edu