Lectures: | Tuesdays, 12:00-1:20 p.m., PH 125B |
| Wednesdays, 10:30-11:20 a.m., SH 324 |
| Thursdays, 12:00-1:20 p.m., PH 125B |
Instructor: | Todd C. Mowry, WeH 8123, 268-3725, tcm@cs.cmu.edu |
Office Hours: | Thursdays, 3:00-4:00 p.m., WeH 8123 |
TA: | Tiankai Tu, tutk@cs.cmu.edu |
Class Admin: | Jennifer Landefeld, WeH 8124, 268-4740, jennsbl@cs.cmu.edu |
Web Page: | www.cs.cmu.edu/afs/cs/academic/class/15418-s04/www/ |
Newsgroup: | cyrus.academic.cs.15-418 |
Handouts: | Electronic: /afs/cs.cmu.edu/academic/class/15418-s04/public |
| Hardcopies: In bins outside WeH 8124. |
We will be using the following textbooks in class:
The goal of this course is to provide a deep understanding of the fundamental principles and engineering tradeoffs involved in designing modern parallel computers (also known as "multiprocessors"), as well as the programming techniques needed to use these machines effectively. Parallel machines are already ubiquitous, from desktops to supercomputers, and they are expected to become even more commonplace in the future. However, few people exploit the potential processing power of these machines because they do not understand how to write efficient parallel programs. Because one cannot design a good parallel program without understanding how parallel machines are built, and vice versa, this course covers both parallel hardware and software design, as well as the impact each has on the other.
Course topics include naming shared data, synchronizing threads, and the latency and bandwidth associated with communication. Case studies of shared-memory, message-passing, data-parallel, and dataflow machines will be used to illustrate these techniques and tradeoffs. Programming assignments will be performed on one or more commercial multiprocessors, and there will be a significant course project.
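To make two of these topics concrete, the sketch below shows what "naming shared data" and "synchronizing threads" look like in a shared-memory program. It is purely illustrative (not an assignment, and not taken from the course materials): a hypothetical C program using POSIX threads in which several threads increment one shared counter under a mutex.

    /* Illustrative sketch only: shared-memory threads in C (POSIX threads).
     * All threads name the same variable ("counter"); the mutex provides
     * the synchronization that makes their concurrent updates safe.
     * Typical Unix build: gcc -pthread counter.c -o counter
     */
    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4
    #define NITERS   100000

    static long counter = 0;                        /* shared data */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg)
    {
        int i;
        (void)arg;                                  /* unused */
        for (i = 0; i < NITERS; i++) {
            pthread_mutex_lock(&lock);              /* synchronize */
            counter++;
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t tid[NTHREADS];
        int i;

        for (i = 0; i < NTHREADS; i++)
            pthread_create(&tid[i], NULL, worker, NULL);
        for (i = 0; i < NTHREADS; i++)
            pthread_join(tid[i], NULL);

        /* Without the mutex, lost updates would make this total unpredictable. */
        printf("counter = %ld (expected %d)\n", counter, NTHREADS * NITERS);
        return 0;
    }

Even this toy program hints at the course's performance themes: on a multiprocessor, each lock acquisition moves the cache line holding the counter between processors, so the mutex that makes the program correct is also a source of communication latency.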
This course is unusual in that its material is rarely offered to undergraduates. Because parallel processing has become such an important and mainstream technology, the time has come to integrate this material into the undergraduate systems curriculum.
15-213 (Intro to Computer Systems) is a strict prerequisite for this course. We will build directly upon the material presented in 15-213, including memory hierarchies, memory management, and basic networking.
While 18-347 (Intro to Computer Architecture) would be helpful for understanding the material in this course, it is not a prerequisite.
To complete your programming assignments and course projects, you will receive accounts on machines at the National Center for Supercomputing Applications (NCSA) and the Pittsburgh Supercomputing Center (PSC). Details will be provided later.
Important: the class will be allocated a finite (and not particularly large) amount of time on these machines, so please be careful not to waste it.
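For contrast with the shared-memory sketch above, here is a minimal message-passing sketch in C. It assumes an MPI installation, which is common on machines of this class but is an assumption here; the actual software environment on the NCSA and PSC machines will be covered in the account details.

    /* Illustrative sketch only: message passing with MPI.  Whether and how
     * MPI is installed on the course machines is an assumption; see the
     * forthcoming account details for the real environment.
     * Typical build/run: mpicc msg.c -o msg && mpirun -np 2 ./msg
     */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size, payload;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (size < 2) {
            if (rank == 0)
                printf("run with at least 2 processes\n");
        } else if (rank == 1) {
            payload = 42;
            /* No shared variables: data moves only by explicit messages. */
            MPI_Send(&payload, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        } else if (rank == 0) {
            MPI_Recv(&payload, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &status);
            printf("rank 0 of %d received %d from rank 1\n", size, payload);
        }

        MPI_Finalize();
        return 0;
    }

Note the contrast with the threaded sketch: here no memory is shared, so all communication, and therefore all of its cost, is explicit in the program text.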
Grades will be based on homework assignments, a project, two exams, and class participation.
Your overall grade is determined as follows:
Homework: | 25% |
Project: | 25% |
Exams: | 40% (20% each) |
Class Participation: | 10% |
Late assignments will not be accepted without prior arrangement.
Table 1 shows the tentative schedule. The idea is to cover the lecture material in roughly the first two-thirds of the semester (by meeting three days a week rather than two), so that you will have more time to devote to the class project in the final third of the semester and can take advantage of all of the lecture material in your project.
Class | Date | Day | Topic | Reading | Assignments |
1 | 1/13 | Tue | Why Study Parallel Architecture? | 1.1 | |
2 | 1/14 | Wed | Evolution of Parallel Architecture | 1.2 | |
3 | 1/15 | Thu | Fundamental Design Issues | 1.3-4 | |
4 | 1/20 | Tue | Parallel Programming: Overview I | 2.1-2 | L1 Out |
5 | 1/21 | Wed | Parallel Programming: Overview II | 2.3-4 | |
6 | 1/22 | Thu | Parallel Programming: Performance I | 3.1 | |
7 | 1/27 | Tue | Parallel Programming: Performance II | 3.2 | |
8 | 1/28 | Wed | Parallel Programming: Performance III | 3.3-4 | |
9 | 1/29 | Thu | Par. Prog: Case Studies & Implications | 3.5-6 | L2 Out |
10 | 2/3 | Tue | Workload-Driven Arch Evaluation I | 4.1 | L1 Due |
11 | 2/4 | Wed | Workload-Driven Arch Evaluation II | 4.2-3 | |
12 | 2/5 | Thu | Shared Memory Multiprocessors I | 5.1 | |
13 | 2/10 | Tue | Shared Memory Multiprocessors II | 5.3 | |
14 | 2/11 | Wed | Shared Memory Multiprocessors III | 5.4 | |
15 | 2/12 | Thu | Directory-Based Cache Coherence I | 8.1-5 | L3 Out |
16 | 2/17 | Tue | Directory-Based Cache Coherence II | 8.6-7, 8.9-11 | L2 Due |
17 | 2/18 | Wed | Relaxed Memory Consistency Models | 9.1 | |
18 | 2/19 | Thu | Snoop-Based Multiprocessor Design I | 6.1 | |
19 | 2/24 | Tue | Snoop-Based Multiprocessor Design II | 6.2 | |
20 | 2/25 | Wed | Earthquake Simulation Case Study | | |
| 2/26 | Thu | Exam I | | |
21 | 3/2 | Tue | Snoop-Based Multiprocessor Design III | 6.3-4 | L3 Due |
22 | 3/3 | Wed | Snoop-Based Multiprocessor Design IV | 6.5, 6.7 | |
23 | 3/4 | Thu | Synchronization | 5.5, 7.9, 8.8 | Project Proposal |
| | | Spring Break | | |
24 | 3/16 | Tue | Scalable Distributed Memory MPs I | 7.1-3 | |
25 | 3/17 | Wed | Scalable Distributed Memory MPs II | 7.4-8 | |
26 | 3/18 | Thu | Interconnection Network Design | 10.1-10 | |
27 | 3/23 | Tue | Latency Tolerance: Prefetching | 11.1, 11.6 | |
28 | 3/24 | Wed | Latency Tolerance: Multithreading | 11.7-9 | |
| 3/25 | Thu | | | Project Milestone 1 |
| 3/30 | Tue | Exam II | | |
29 | 4/8 | Thu | Terascale Computing System at PSC | | Project Milestone 2 |
| 4/22 | Thu | Project Poster Session | | Project Report Due |