CS 15-122: Principles of Imperative Computation
(Summer 2023)

Course Information

Logistics

Lectures:	MTWRF,	9:30-10:50am EDT	(TEP 1403)
Practice sessions:	MTWRF,	12:30-1:50pm EDT	(GHC CLSTR)
or	MTWRF,	2:00-3:20pm EDT	(GHC CLSTR)

Class web page: https://cs.cmu.edu/~15122

Course syllabus

Calendar of Classes [iCal format]

Click on a class day to go to that particular lecture or recitation. Due dates for homeworks are set in bold. The due date of the next homework blinks.

July 2023
U	M	T	W	R	F	S
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30	31

August 2023
U	M	T	W	R	F	S
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30	31

Coursework Calendar

No more written homework due
No more programming homework due
No exams left

Test	Percentage	Learning objectives	Posted	Due
Wr1	1.3%	1-3,11	3 Jul	Wed 5 Jul
Pg1	2.5%	1,12	4 Jul	Thu 6 Jul
Wr2	1.3%	1-3	5 Jul	Fri 7 Jul
Pg2	2.5%	12,15,16	6 Jul	Sat 8 Jul
Wr3	1.3%	1,2,4,12	7 Jul	Mon 10 Jul
Pg3	2.5%	1,12-16	8 Jul	Wed 12 Jul
Wr4	1.3%	1-4,21,27	10 Jul	Fri 14 Jul
Pg4	2.5%	1,18,17	12 Jul	Sat 15 Jul
Wr5	1.1%	6-10,15-17	15 Jul	Mon 17 Jul
Midterm1	12.5%	1-8	Tue 18 Jul	Tue 18 Jul
Pg5	2.5%	5,12,27	15 Jul	Wed 19 Jul
Wr6	1.3%	6-8,12,17	17 Jul	Fri 21 Jul
Pg6	1.6%	10	19 Jul	Sat 22 Jul
Wr7	1.3%	9,17,24,27	21 Jul	Mon 24 Jul
Pg7	3.4%	1,12-18	22 Jul	Wed 26 Jul
Wr8	1.3%	12,24,27	24 Jul	Fri 28 Jul
Pg8	2.5%	8,10-18	26 Jul	Sat 29 Jul
Wr9	1.1%	9,10,25-27	28 Jul	Mon 31 Jul
Midterm2	12.5%	1-8	Tue 1 Aug	Tue 1 Aug
Pg9	2.5%	9,12-18	29 Jul	Wed 2 Aug
Wr10	1.3%	2,13,25,27	31 Jul	Fri 4 Aug
Pg10	2.5%	10,15-20	2 Aug	Sat 5 Aug
Wr11	1.3%	18-20	4 Aug	Mon 7 Aug
Pg11	2.5%	5,15-20	5 Aug	Tue 8 Aug
Wr12	1.1%	18-20	7 Aug	Thu 10 Aug
Pg12	2.5%	5,15-20	5 Aug	Fri 11 Aug
Final	25%	1-27	Fri 11 Aug (12a-3p)	Fri 11 Aug (12a-3p)

Office Hours [iCal format]

Office hour rules:

Office hours are not a place to do homework with a TA on standby in case you get stuck: come to office hours once you have thought about your problem for some time and are still stuck.

About this course

Description

This course teaches imperative programming in a C-like language and methods for ensuring the correctness of imperative programs. It is intended for students who are familiar with elementary programming concepts such as variables, expressions, loops, arrays, and functions. Given these building blocks, students will learn the process and techniques needed to go from high-level descriptions of algorithms to correct imperative implementations, with specific applications to basic data structures and algorithms. Much of the course will be conducted in a subset of C amenable to verification, with a transition to full C near the end. This will be accomplished along three dimensions:

The main skill you will get out of this course is the ability to write code that is correct by design and accounts for the needs of its application context. You will learn about deliberate programming as a way to write high quality code, about assessing the performance of a program, and about comparing solutions to satisfy deployment constraints.
As you do so, you will gain exposure to fundamental concepts in Computer Science — as opposed to programming — such as abstraction, correctness, complexity, and modularity. This will also give you a vocabulary to communicate effectively and precisely with other computer scientists.
Our vehicle for achieving these objectives will initially be C0, a safe variant of C, and later C itself. Using them, you will gain exposure to a number of data structures and algorithms that are used pervasively in computer science. C is the language of choice for system-level applications, and both are representative of the popular imperative programming paradigm.

After completing 15-122, you will be able to take 15-213 (Introduction to Computer Systems), 15-210 (Parallel and Sequential Data Structures and Algorithms) and 15-214 (Principles of Software System Construction). Other prerequisites or restrictions may apply.

Prerequisites

You must have gotten a 5 on the AP Computer Science A exam or passed 15-112 (Fundamentals of Programming) or equivalent. You may also get permission from an advisor if you scored very high on the CS Assessment on Canvas.
It is strongly advised that you either have taken or take at the same time either 21-127 (Concepts of Mathematics) or 15-151 (Mathematical Foundations of Computer Science): historically, students who did not do so ended up learning less, spending considerably more time on the course and earning one letter grade lower than their peers who did, on average.

Past Offerings

	2023	2022	2021	2020	2019	2018	2017	2016	2015	2014	2013	2012	2011	2010
Fall		F22	F21	F20	F19	F18	F17	F16	F15	F14	F13	F12	F11	F10
Summer	N23	M22	N21	N20	N19	N18	N17	N16	M15	M14	M13	M12	S11
Spring	S23	S22	S21	S20	S19	S18	S17	S16	S15	S14	S13	S12	S11

How to do Well in this Course

Our goals are for you to succeed in this course and to teach you skills and concepts that will contribute to your success in life. To this end, we are providing you with lots of resources and the knowledge that comes from years of experience. We talked to some of the thousands of students who took this course before you; here is the advice they found particularly useful:

Do not stress over grades: your goal is to learn new and exciting things. Good grades follow naturally from deep learning (but not necessarily vice versa), and employers care about what you know, not what grade you got.
Participate: you will get a lot more from this class if you ask questions and engage with the course staff than if you are a fly on the wall — and it will be more fun.
Manage your time wisely: allocate sufficient time for homework and learning. Little adjustments can save you a whole lot of time later and have a huge impact on your performance. In particular, use class time to learn, review the material presented in lecture the same day, and schedule time for homework proactively.
Start homework early: racing against a deadline is so stressful! Starting early will remove that stress, lead to deeper learning and give you time to improve your solution if you feel like it.
Get all the help you need: we provide plenty of resources to help you succeed in this course, including office hours every day, online help 24-7, and friendly staff when you need them. Take advantage of them: they are there for you! The only thing we ask is that you plan a bit ahead: helping students takes time and there are not enough of us if everybody waits until the deadline.
Make time for fun: take a break from studying at least once a day — meet with friends, go for a walk, play sports, whatever gets you to reset your mind.

Feedback

It is our goal to make this course successful, stimulating and enjoyable. If at any time you feel that the course is not meeting your expectations or you want to provide feedback on how the course is progressing for you, please contact us. If we are not aware of a problem, we won't know to fix it. If you would like to provide anonymous comments, please use the feedback form on the course home page or slide a note under our doors.

Resources

Course Material

There is no textbook for this course. Lecture notes and other resources are provided through the Schedule tab of this page and on Ed. We do not require students to read lecture notes before lecture, but those who are interested in reading ahead can certainly do so.

The C0 Language

For most of the course, we use C0, a safe subset of C augmented with contracts. This language has been specifically designed to support the learning objectives in this course. It provides garbage collection (freeing students from dealing with low-level details of explicit memory management), fixed range modular integer arithmetic (avoiding complexities of floating point arithmetic and multiple data sizes), an unambiguous language definition (guarding against undefined behavior), and contracts (making code expectations explicit and localizing reasoning).
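As a rough illustration of what contracts look like, here is a minimal sketch written in plain C rather than C0. C0 expresses contracts as `//@requires`, `//@ensures`, and `//@loop_invariant` annotations; the `REQUIRES`, `ENSURES`, and `LOOP_INVARIANT` macros below are hypothetical stand-ins built on `assert`, and `array_max` and `is_upper_bound` are names invented for this example, not part of any course library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macros that imitate C0 contract annotations using assert. */
#define REQUIRES(cond)       assert(cond)
#define ENSURES(cond)        assert(cond)
#define LOOP_INVARIANT(cond) assert(cond)

/* Specification helper: returns 1 if m >= every element of A[0..hi). */
static int is_upper_bound(const int *A, size_t hi, int m) {
    for (size_t i = 0; i < hi; i++) {
        if (A[i] > m) return 0;
    }
    return 1;
}

/* Returns the largest element of the non-empty array A[0..n). */
int array_max(const int *A, size_t n) {
    REQUIRES(A != NULL && n > 0);                     /* precondition */
    int max = A[0];
    for (size_t i = 1; i < n; i++) {
        LOOP_INVARIANT(is_upper_bound(A, i, max));    /* holds before each iteration */
        if (A[i] > max) max = A[i];
    }
    ENSURES(is_upper_bound(A, n, max));               /* postcondition */
    return max;
}
```

For instance, calling `array_max` on the array `{3, 7, 2}` with `n = 3` returns 7, while calling it with `n = 0` trips the precondition check instead of reading out of bounds.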

The C Language

Towards the end, the course transitions to C in preparation for subsequent systems courses. Emphasis is on transferring positive habits developed with the use of C0, and on practical advice for avoiding the pitfalls and understanding the idiosyncrasies of C. We use the valgrind tool to test proper memory management.

From C0 to C: Basics tutorial
C Language Libraries
- C string library <string.h>
POSIX library standard — Unix standard library definitions including a library search functionality

Programming Environments

You are welcome to use any programming environment that suits you to write your programming assignments. However, all programming homework will be graded by running it on a Unix system using Autolab, so you may want to make sure it works on Andrew Unix. Popular environment choices include emacs, vim, VSCode, and sublime, but you should use what works for you: an environment that allows you to write code quickly and efficiently. Here are some useful links:

Unix	Emacs	Vim
Linux Commands Reference Card	Emacs Basics Game for learning emacs shortcuts Emacs Help Guide	Vim Introduction Vim Cheat Sheet type `vimtutor` at the terminal

Grading

This is a 10 unit course.

Tasks and Percentages

24 homework assignments: 45%
- 12 weekly written assignments: 15% (due on Gradescope at 5pm EDT on Mondays and Fridays, see calendar for exceptions)
- 12 weekly programming assignments: 30% (due on Autolab at 11pm EDT on Wednesdays and Saturdays, see calendar for exceptions)
To encourage good work and integrity, the instructors may invite students to their offices to explain their solutions. Should this happen, the students' explanations will become part of their grades for that homework.
- Assignments are individual unless explicitly instructed.
2 midterm exams: 12.5% each, in class, closed books, on Tuesday 18 July 2023 and Tuesday 1 August 2023
Final exam: 25%, 3 hours, closed books, on Friday 11 August 2023
Practice sessions and activities: 5%
Each practice session is graded on a 0-2.5 point scale, assigned as follows:
- 2.5 points for completing bonus exercises
- 2 points for completing standard exercises
- 1.5 points for completing some exercises but not quite enough for a good practice session
- 0 points for not attending or not working on exercises
There will be one activity worth 1 point during or after each lecture. Use the activities page to test your configuration and do the current activity (if there is one).
Each lab you attend gives you 1.5 points, assuming you pay attention and work on the exercises.
All you need to earn the 5% grade for this portion of the course is to accumulate 50 points overall. There are many more points than that up for grabs, so no sweat if you miss a practice session or two. Do the math: the course has
- 25 graded practice sessions
- Activities on most days
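The arithmetic behind these numbers can be checked with a small sketch. The weights and point values come from the lists above; the function names and the choice to count in tenths (so that 12.5% and 2.5 points stay integers) are assumptions made for this illustration only.

```c
#include <assert.h>

/* Course weights in tenths of a percent, copied from the task list. */
enum {
    WRITTEN_WEIGHT  = 150,  /* 12 written assignments: 15% */
    PROGRAM_WEIGHT  = 300,  /* 12 programming assignments: 30% */
    MIDTERM_WEIGHT  = 125,  /* each of the 2 midterms: 12.5% */
    FINAL_WEIGHT    = 250,  /* final exam: 25% */
    PRACTICE_WEIGHT = 50    /* practice sessions and activities: 5% */
};

/* Sum of all weights, in tenths of a percent. */
int total_weight(void) {
    return WRITTEN_WEIGHT + PROGRAM_WEIGHT + 2 * MIDTERM_WEIGHT
         + FINAL_WEIGHT + PRACTICE_WEIGHT;
}

/* Points (in tenths of a point) earned from practice sessions alone, given
   how many sessions earned 2 points and how many earned 2.5 points. */
int practice_points_tenths(int standard_sessions, int bonus_sessions) {
    assert(standard_sessions >= 0 && bonus_sessions >= 0);
    assert(standard_sessions + bonus_sessions <= 25);   /* 25 graded sessions */
    return standard_sessions * 20 + bonus_sessions * 25;
}
```

Under these assumptions the weights add up to 1000 tenths, that is 100%. Completing the standard exercises in all 25 sessions yields exactly 50 points, and completing the bonus exercises in 20 sessions does too, before counting any lecture activities.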

We are aiming to have homework and exams graded within two days of submission.

Accessing and Monitoring your Grades

Posted grades are accessible by clicking on the Grades tab of this page. After authenticating, you will be able to see your current grades and a projection of where you are headed given your past performance in the class. Use this application to take action if the trajectory does not lead to the grade you are hoping for.

Evaluation Criteria

Your assignments and exams are evaluated on the basis of:

Correctness: Your arguments should make sense, your proofs should be valid, and your program should work in the reference environment.
Elegance: Written material should be of the same quality as what a professional would write. No typos, no bad grammar, clarity is paramount. You are also expected to write code with good programming style. See the Guide to Success on Coding with Style on Ed about what constitutes good style.
For a small subset of assignments, the course staff will review all final submissions by hand. If there are significant style issues, they may give a non-passing grade on style, accompanied by a “FIX STYLE” annotation in their notes. Students who are told to fix their style must address these issues and discuss their revisions with a TA within 3 days of the homework grades being posted. Any TA or instructor can do style re-grading at any office hour; you do not have to go to the TA that assigned the grade.

Late Policy

This is a fast-paced course. The purpose of the late policy is to keep students from falling behind.

There are no late days for written assignments. The course staff will begin grading written homeworks the same day they are due, so we cannot accept any late homework.
Each student has 3 late days for programming assignments, and you may use at most one late day for each individual assignment. This means that for up to 3 programming deadlines you can turn in your assignment within 24 hours after the deadline. Autolab handles these late days automatically: you don't have to email the course staff, just turn in the assignment between 1 second late and 24 hours after the deadline. You can find how many late days you have used by clicking on "Gradebook" in Autolab. Once you have run out of late days, Autolab will accept and grade late submissions but will assign them 0 points.
We strongly advise students not to use late days in the first half of the course. Later assignments are more challenging and many courses have lots of deliverables towards the end of the semester. The second half of the semester is where late days are most needed. Note that the constraints of the semester timing mean that no late days can be used on the last programming assignment.
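The late-day rule for programming assignments can be summarized as a small decision function. This is only a sketch of the policy as stated above, not Autolab's actual logic: `earns_credit` is a hypothetical name, treating submissions more than 24 hours late as earning no credit is an assumption, and the exception for the last programming assignment is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

#define TOTAL_LATE_DAYS 3                  /* per student, programming only */
#define SECONDS_PER_DAY (24L * 60 * 60)

/* Hypothetical model: returns true if a programming submission made
   seconds_late after the deadline still earns credit, given the number
   of late days the student has already used on other assignments. */
bool earns_credit(long seconds_late, int late_days_used) {
    assert(0 <= late_days_used && late_days_used <= TOTAL_LATE_DAYS);
    if (seconds_late <= 0) return true;                /* on time */
    if (seconds_late > SECONDS_PER_DAY) return false;  /* at most one late day per assignment */
    return late_days_used < TOTAL_LATE_DAYS;           /* a late day must still be available */
}
```

With this model, a submission one minute late earns credit for a student who has used two late days, but not for a student who has already used all three.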

Aside from this, there will be no extensions on assignments in general. If you think you really really need an extension on a particular assignment, contact the instructors as soon as possible and before the deadline. Please be aware that extensions are entirely discretionary and will be granted only in exceptional circumstances outside of your control (e.g., due to severe illness or major personal/family emergencies, but not for competitions, club-related events or interviews). The instructors will require confirmation from University Health Services or your academic advisor, as appropriate.

Nearly all situations that make you run late on a homework assignment can be avoided with proper planning, often just by starting early. Here are some examples:

I have so many deadlines this week: you know your deadlines ahead of time — plan accordingly.
It's a minute before the deadline and the network is down: you always have multiple submissions — it's foolish to wait for the deadline for your first submission.
My computer crashed and I lost everything: Use Dropbox or similar to do real-time backup — recover your files onto AFS and finish your homework from a cluster machine.
My fraternity/sorority/club has that big event that is taking all my time: Schedule your extra-curricular activities around your classes, not vice versa.

Grade Appeals

We make mistakes too!
After each exam and homework assignment is graded, you will be able to access your score by clicking on the Grades tab of this page. We will make the utmost effort to be fair and consistent in our grading. If you notice any grading mistakes, proceed as follows:

Go to office hours and describe the putative grading mistake to a TA.
If the TA does not resolve the matter convincingly, you may request a regrade as follows: Write an email to Oliver Moss explaining where and why you think there was a mistake in grading. Make sure to specify which homework or exam this appeal is for. Write at most 3 lines for each response you are disputing.

Email requests to the course staff will not be accepted. Please do not make regrade requests on Ed.

All regrade requests must be received within 3 days of the work being handed back on Gradescope or Autolab, which we will announce in an Ed post.

Final Grades

This class is not curved. However, to ensure consistency across semesters, we set our grading standards in such a way as to compensate for the relative difficulty of exams.

What follows is a rough guide to how course grades will be established, not a precise formula: we will fine-tune cutoffs and other details as we see fit after the end of the course. This is meant to help you set expectations and take action if your trajectory in the class does not take you to the grade you are hoping for (see also the Grades tab on this page). So, here is a rough, very rough heuristic about the correlation between final grades and total scores:

A: above 90%
B: 80-90%
C: 70-80%
D: 60-70%

This assumes that the makeup of a student’s grade is not wildly anomalous: exceptionally low overall scores on exams, programming assignments, or written assignments will be treated on a case-by-case basis. In particular, students who are unable to demonstrate a basic proficiency with the C language in the last few programming assignments will receive a D in the class (this is because 15-122 is a prerequisite to 15-213, a very C-intensive course). For reference, almost a quarter of the students who received a B in Fall 2014 had a 90-100% average on programming assignments, an 80-90% average on written homeworks, and a 70-80% average on exams.

Precise grade cutoffs will not be discussed at any point during or after the semester. For students very close to grade boundaries, instructors may, at their discretion, consider participation in lecture and practice sessions, exam performance and overall grade trends when assigning the final grade.

Academic Integrity

You are expected to comply with the University Policy on Academic Integrity (see also The Word and Understanding Academic Integrity). The university policies and procedures on academic integrity will be applied rigorously. All students are required to fill out a form as part of their first assignment indicating that they understand and accept this policy.

The value of your degree depends on the academic integrity of yourself and your peers in each of your classes. It is expected that, unless otherwise instructed, the work you submit as your own is your own work and not someone or something else’s work or a collaboration between yourself and other(s).

The Policy (Summer'23)

You are allowed to clarify the writeup of homework assignments with other students, but not work on a solution or brainstorm answers with them.

You are welcome to freely discuss course material (lecture notes/slides, practice exams, lab handouts, recitation handouts, blank writtens and programming writeups) as well as to review graded assignments with students taking the course in the current semester. You may give or receive help with computer systems, compilers, debuggers, profilers, or other facilities (as long as answers and/or code are never visible).

You are not allowed to refer to solutions and/or code written by past or present students, or found on the web, not even to "double-check" your own solution. You may not post code from this course publicly (e.g., to Bitbucket or GitHub).

You are not allowed to use any materials from previous iterations of the course, including your own. You may not discuss or receive any help on homework assignments with students who have previously taken the course (excluding current TAs).

We will be using the MOSS system to detect software plagiarism. Whenever a programming assignment is similar to a homework from a previous course edition, we will run MOSS on all submissions from that edition as well. All solutions from the Web are also in MOSS — you should assume that if you were able to find it, we have already found it.

If you are uncertain whether your actions will violate this policy, please reach out to a member of course staff to ask beforehand.

Penalties and Specifics

Please read the University Policy on Academic Integrity carefully to understand the penalties associated with academic dishonesty at Carnegie Mellon. In this class, cheating/copying/plagiarism means obtaining all or part of a program or homework solution from another student or unauthorized source such as the Internet, having someone else do a homework or take an exam for you, knowingly or by negligence giving such information to another student, reusing answers or solutions from previous editions of the course, or giving or receiving unauthorized information during an examination. In general, each solution you submit (written assignment, programming assignment, midterm or final exam) must be your own work. In the event that you use information written by others in your solution, you must cite the source of this information (and receive prior permission if unsure whether this is permitted). It is considered cheating to compare complete or partial answers, copy or adapt others' solutions, or sit near another person who is taking (or has taken) the same course and complete the assignment together. Working on code together, showing code to another student and looking at another student's code are considered cheating. If you need help debugging, make a post on Ed or go to office hours. It is also considered cheating for a repeating student to reuse one's solutions from a previous semester, or any instructor-provided sample solution. It is a violation of this policy to hand in work for other students.

Your course instructors reserve the right to determine an appropriate penalty based on the violation of academic dishonesty that occurs. Penalties are severe: a typical violation of the university policy results in the student failing this course, but may go all the way to expulsion from Carnegie Mellon University. If you have any questions about this policy and any work you are doing in the course, please feel free to contact the instructors for help.

Repeat Students

If you took this course in full or in part in a past semester, we ask that you delete your previous work so you won't look at it. In particular, copying one's own solutions from an earlier semester is a violation of the academic integrity policy and will be handled as such. Doing so may save time close to a deadline but it will not have the effect of learning the material, which will be a serious handicap in exams.

Other Policies

Class presence and participation

Active participation by you and other students will ensure that everyone has the best learning experience in this class. We may take participation in lecture and practice sessions into account when setting final grades. Fire safety rules require that we never exceed the stated capacity of a classroom or cluster. For this reason, we require that you attend the lecture, lab, and recitation you are registered for.

Laptops and mobile devices

As research on learning shows, unexpected noises and movement automatically divert and capture people's attention, which means you are affecting everyone's learning experience if your cell phone, pager, laptop, etc., makes noise or is visually distracting during class. Therefore, please silence all mobile devices during class. You may use laptops for note-taking only, but please do so from the back of the classroom. Do not work on assignments for this or any other class while attending lecture or practice sessions.

Students with disabilities

If you wish to request an accommodation due to a documented disability, please inform your instructor and contact Disability Resources as soon as possible (access@andrew.cmu.edu). Once your accommodation has been approved, you will be able to request extra time for each exam separately by filling out this form a week in advance.

Research to Improve the Course

For this class, we are conducting research on student outcomes. This research will involve your work in this course. You will not be asked to do anything above and beyond the normal learning activities and assignments that are part of this course. You are free not to participate in this research, and your participation will have no influence on your grade for this course or your academic career at CMU. If you do not wish to participate or if you are under 18 years of age, please send an email to Chad Hershock (hershock@cmu.edu) with your name and course number. Participants will not receive any compensation. The data collected as part of this research may include student grades. All analyses of data from participants’ coursework will be conducted after the course is over and final grades are submitted. The Eberly Center may provide support on this research project regarding data analysis and interpretation. The Eberly Center for Teaching Excellence & Educational Innovation is located on the CMU-Pittsburgh Campus and its mission is to support the professional development of all CMU instructors regarding teaching and learning. To minimize the risk of breach of confidentiality, the Eberly Center will never have access to data from this course containing your personal identifiers. All data will be analyzed in de-identified form and presented in the aggregate, without any personal identifiers. If you have questions pertaining to your rights as a research participant, or want to report concerns about this study, please contact Chad Hershock (hershock@cmu.edu).

Getting Help

Personal Health

Take care of yourself.

Do your best to maintain a healthy lifestyle this semester by eating well, exercising, avoiding drugs and alcohol, getting enough sleep and taking some time to relax. This will help you achieve your goals and cope with stress.

All of us benefit from support during times of struggle. You are not alone. There are many helpful resources available on campus and an important part of the college experience is learning how to ask for help. Asking for support sooner rather than later is often helpful.

If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to help: call 412-268-2922 and visit their website. Consider reaching out to a friend, faculty or family member you trust for help getting connected to the support that can help.

If you or someone you know is feeling suicidal or in danger of self-harm, call someone immediately, day or night:

CaPS: 412-268-2922
Re:solve Crisis Network: 888-796-8226
If the situation is life threatening, call the police:

If you have questions about this or your coursework, please let us know.

Communication

For assistance with the written or oral communication assignments in this class, visit the Global Communication Center (GCC). GCC tutors can provide instruction on a range of communication topics and can help you improve your papers and presentations. The GCC is a free service, open to all students, and located in Hunt library. You can make tutoring appointments directly on the GCC website. You may also visit the GCC website to find out about communication workshops offered throughout the academic year.

External Academic Support

The Office of Academic Development is providing various services aimed at helping students master the contents of this course. These optional services are free and voluntary. They are led by trained leaders who have successfully completed the course. Leaders are not members of the course staff. These services are designed to supplement, not replace, class lectures and recitations. They do not cover homework.

One-on-one Tutoring: Students can book an appointment for a virtual tutoring session.

We ask that students do not seek help from upperclassmates who have successfully completed the course. Doing so often leads to violations of the academic integrity policy of the course. In particular, upperclassmates found to violate this policy will be reported and will incur a grade penalty.

Learning Objectives

Computational Thinking

Students who complete this course should be able to explain abstraction and other key computer science concepts, apply these fundamental concepts as problem-solving tools, and wield contracts as a tool for reasoning about the safety and correctness of programs. In particular, we expect students to be able to:

develop contracts (preconditions, postconditions, assertions, and loop invariants) that establish the safety and correctness of imperative programs.
develop and evaluate proofs of the safety and correctness of code with contracts.
develop and evaluate informal termination arguments for programs with loops and recursion.
evaluate claims of both asymptotic complexity and practical efficiency of programs by running tests on different problem sizes.
define the concept of programs as data, and write programs that use the concept.
defend the use of abstractions and interfaces in the presentation of algorithms and data structures.
identify the difference between specification and implementation.
compare different implementations of a given specification and different specifications that can be applied to a single implementation.
explain data structure manipulations using data structure invariants.
identify and evaluate the use of fundamental concepts in computer science as problem-solving tools:
1. order (sorted or indexed data),
2. asymptotic worst case, average case, and amortized analysis,
3. randomness and (pseudo-)random number generation, and
4. divide-and-conquer strategies.

Programming Skills

Students who complete this course should be able to read and write code for imperative algorithms and data structures. In particular, we expect students to be able to:

trace the operational behavior of small imperative programs.
identify, describe, and effectively use basic features of C0 and C:
1. integers as signed modular arithmetic,
2. integers as fixed-length bit vectors,
3. characters and strings,
4. Boolean operations with short-circuiting evaluation,
5. arrays,
6. loops (while and for),
7. pointers,
8. structs,
9. recursive and mutually recursive functions,
10. void pointers and casts between pointer types,
11. generic data structures using void and function pointers,
12. contracts (in C0), and
13. casts between different numeric types (in C).
translate between high-level algorithms and correct imperative code.
translate between high-level loop invariants and data structure invariants and correct contracts.
write code using external libraries when given a library interface.
develop, test, rewrite, and refine code that meets a given specification or interface.
develop and refine small interfaces.
document code with comments and contracts.
identify undefined and implementation-defined behaviors in C.
write, compile, and test C programs in a Unix-based environment using make, gcc, and valgrind.
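Two of the features listed above, integers as fixed-length bit vectors and modular arithmetic, can be illustrated with a short C sketch. The function names `get_byte` and `wrapping_add` are invented for this example. It uses `uint32_t` because unsigned arithmetic wraps around in C, whereas overflow on C's signed `int` is undefined behavior, which is one of the differences from C0 that the course covers.

```c
#include <assert.h>
#include <stdint.h>

/* Treats a 32-bit word as a bit vector and extracts the 8-bit field that
   starts at bit position shift, which must be 0, 8, 16, or 24. */
uint32_t get_byte(uint32_t word, unsigned shift) {
    assert(shift <= 24 && shift % 8 == 0);
    return (word >> shift) & 0xFFu;
}

/* Addition modulo 2^32: the result wraps around instead of overflowing. */
uint32_t wrapping_add(uint32_t a, uint32_t b) {
    return a + b;
}
```

For example, `get_byte(0xAABBCCDD, 8)` yields `0xCC`, and `wrapping_add(UINT32_MAX, 1)` yields 0.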

Algorithms and Data Structures

Students who complete this course should be able to describe the implementation of a number of basic algorithms and data structures, effectively employ those algorithms and data structures, and explain and interpret worst-case asymptotic complexity arguments. In particular, we expect students to be able to:

determine the big-O complexity of common code patterns.
compare common complexity classes like O(1), O(log n), O(n), O(n log n), O(n^2), and O(2^n).
explain the structure of basic amortized analysis proofs that use potential functions.
apply principles of asymptotic analysis and amortized analysis to new algorithms and data structures.
recognize properties of simple self-adjusting data structures.
recognize algorithms and data structures using divide-and-conquer.
describe and employ a number of basic algorithms and data structures:
1. integer algorithms,
2. linear search,
3. binary search,
4. sub-quadratic complexity sorting (mergesort and quicksort),
5. stacks and queues,
6. pseudo-random number generators,
7. hash tables,
8. priority queues,
9. balanced binary search trees,
10. disjoint-set data structures (union/find), and
11. simple graph algorithms.

Course Staff

Mascot

Honk !

Mascot's Buddy

Chonk

Instructors

Asa Frank

Cooper Pierce

Course Administrative Assistant

Oliver Moss

GHC 6006

Teaching Assistants

Alex Blass	Rishabh Cowlagi
Liz Chu	Jessie Fan
Arthur Jakobsson	James Kim
Delaynie McMillan	Arnav Sabharwal
Abby Shrack	Justin Sun
Tom Tang	Gongwei (David) Wang
Ruiqi Wang	Yu-Ching Wu
Zhaowei Zhang	Rocky Zhou

Schedule of Classes

At a glance ...

Outline

Weeks 1-2	Weeks 3-4	Weeks 5-6
Deliberate programming	Data structures	Transition to C

Before the First Day of Classes

You will be learning one of the early topics of 15-122 independently using the OLI platform. Here's how to access it:

Navigate to: https://oli.cmu.edu
In the upper right of the page, click "Sign In"; on the following screen, click on "CMU users sign in here" and log in using your Andrew credentials
On the "My courses" page, enter the course key into the "Register for a course" text box.
This is the course key to enter: ip-m23

Once on OLI, complete all of Unit 3 (modules 6-8), i.e., read the material and do the activities — it will take around 60 minutes. While we will start using this material only with lecture 2, we strongly encourage you to complete these modules by Sunday, July 2nd (you will likely not have time once the course starts)

Mon 3 Jul Lecture 0	Welcome and Course Introduction We outline the course, its goals, and talk about various administrative issues. Readings: Course syllabus Overview slides A mysterious function ... We examine a program we know nothing about, making hypotheses about what it is supposed to do. We notice that this function has no meaningful output for some inputs, which leads us to restrict its valid inputs using preconditions. We use a similar mechanism, postconditions, to describe the value it returns. Along the way, we get acquainted with C0 and its support for checking pre- and post-conditions. We then notice that this function doesn't return the expected outputs even for some valid inputs ... Concepts: Pre- and post-conditions Testing Contract support in C0 Readings: Lecture notes (sections 1-2, 9-11) Review slides (through "Loop Invariants") Code
Mon 3 Jul Practice 1	Setup This session practices using Linux and running the C0 interpreter and compiler. lab 1 exercises and starter code Step-by-step video
Tue 4 Jul	No class (Independence Day)
Wed 5 Jul Lecture 1	Contracts Contracts are program annotations that spell out what the code is supposed to do. They are the key to connecting algorithmic ideas to their implementation as a program. In this lecture, we illustrate the use of contracts by means of a simple C0 program. As we do so, we learn to verify loop invariants (an important type of contract), we see how contracts can help us write correct code, and we get acquainted with C0's automated support for validating contracts. Concepts: Loop invariants Assertions Using contracts to write correct programs Contract support in C0 Lecture notes (oli, pdf) Ints lecture notes Review slides Ints review slides Code
Wed 5 Jul Practice 2	C0 Basics This session reviews elementary C0 constructs and practices reasoning about code. Exercises
Thu 6 Jul Lecture 2	Arrays In this lecture, we examine arrays as our first composite data structure, i.e., a data construction designed to hold multiple values, together with operations to access them. Accessing an array element outside of its range is undefined — it is a safety violation — and we see how to use contracts, in particular loop invariants, to ensure the safety of array accesses in a program. Arrays are stored in memory, which means that they are manipulated through an address. This raises the possibility of aliasing, a notorious source of bugs in carelessly written programs. Arrays Memory allocation Safe access Loop invariants for arrays Aliasing Lecture notes (oli, pdf) Review slides Code
Thu 6 Jul Practice 3	A Bit about Bytes This session practices base conversion and writing code that manipulates bits. Exercises
Fri 7 Jul Lecture 3	Searching Arrays We practice reasoning about arrays by implementing a function that searches whether an array contains a given value; this is the gateway to a whole class of useful operations. We notice that this function returns its result more quickly when the array is sorted. We write a specialized variant that assumes that the array is sorted, and show that it works correctly by reasoning about array bounds. The first (simpler but less efficient) version acts as a specification for the second (slightly more complex but often faster). Using the specification in the contract for the implementation is a standard technique to help write correct code. Linear search Reasoning about arrays Sorted arrays Performance as number of operations executed Specification vs. implementation Lecture notes (oli, pdf) Review slides Code
Fri 7 Jul Practice 4	TA Training This session practices testing code. Exercises
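
The ideas of Lectures 0 through 3 (preconditions, postconditions, loop invariants, and searching a sorted array) can be sketched in plain C, with `assert` standing in for C0's `//@requires`, `//@ensures`, and `//@loop_invariant` annotations. This is a minimal sketch and not the course's code: the names `is_sorted` and `search_sorted` are chosen here for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Contract helper: is A[lo, hi) sorted in non-decreasing order? */
bool is_sorted(const int *A, int lo, int hi) {
  for (int i = lo; i + 1 < hi; i++) {
    if (A[i] > A[i + 1]) return false;
  }
  return true;
}

/* Returns an index i with A[i] == x, or -1 if x does not occur in A[0, n). */
int search_sorted(int x, const int *A, int n) {
  assert(n >= 0);                /* precondition (C0: //@requires) */
  assert(is_sorted(A, 0, n));    /* precondition (C0: //@requires) */
  int result = -1;
  for (int i = 0; i < n; i++) {
    assert(0 <= i && i < n);     /* invariant plus loop guard: A[i] is a safe access */
    if (A[i] == x) { result = i; break; }
    if (A[i] > x) break;         /* sorted array: x cannot occur further right */
  }
  /* postcondition (C0: //@ensures) */
  assert(result == -1 || (0 <= result && result < n && A[result] == x));
  return result;
}
```

The early exit on `A[i] > x` is what makes the sorted variant faster on average than plain linear search, while the contracts document why each array access is safe.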

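Lecture 2 mentions aliasing as a source of bugs. The sketch below, written in C rather than C0 and using hypothetical function names, contrasts a write through an alias (visible through the original name) with a write to a separate copy (not visible).

```c
#include <stdlib.h>
#include <string.h>

/* B aliases A: both names refer to the same allocated array. */
int write_through_alias(void) {
  int *A = calloc(3, sizeof(int));   /* three ints, all zero */
  if (A == NULL) abort();
  int *B = A;                        /* copies the address, not the array */
  B[0] = 42;
  int seen = A[0];                   /* the write through B is visible through A */
  free(A);                           /* free once; B must not be used afterwards */
  return seen;
}

/* C is a separate array with the same contents: no aliasing. */
int write_through_copy(void) {
  int *A = calloc(3, sizeof(int));
  int *C = calloc(3, sizeof(int));
  if (A == NULL || C == NULL) abort();
  memcpy(C, A, 3 * sizeof(int));     /* copy the elements */
  C[0] = 42;
  int seen = A[0];                   /* A is unaffected by the write to C */
  free(A);
  free(C);
  return seen;
}
```
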
Mon 10 Jul Lecture 4	Big-O We examine big-O notation as a mathematical tool to describe the running time of algorithms, especially for the purpose of comparing two algorithms that solve the same problem. As a case study, we use the problem of sorting an array, and for now a single sorting algorithm, selection sort. As we implement selection sort, we see that starting with contracts gives us high confidence that the resulting code will work correctly. Along the way, we develop a useful library of functions about sorted arrays to be used in contracts. Big-O notation Selection sort Deliberate programming Asymptotic complexity analysis Lecture notes (oli, pdf) Review slides arrayutil.c0 Code Selection sort folk dance
Mon 10 Jul Practice 5	Function Family Reunion This session practices understanding and using big-O notation. Exercises
Tue 11 Jul Lecture 5	Binary search When searching for a value in a sorted array, examining the middle element allows us to discard half of the array in the worst case. The resulting algorithm, binary search, has logarithmic complexity, which is much better than linear search (which is linear). Achieving a correct imperative implementation can be tricky, however, and we once more use contracts as a mechanism to reach this goal. Binary search Divide-and-conquer Deliberate implementation Checking complex loop invariants Lecture notes (oli, pdf) Review slides Code
Tue 11 Jul Practice 6	Fibonacci has Bad Internet This session practices working with algorithms with radically different complexity for the same problem. Exercises
Wed 12 Jul Lecture 6	Quicksort We use the key idea underlying binary search to implement two sorting algorithms with better complexity than selection sort. We examine one of them, quicksort, in detail, again using contracts to achieve a correct implementation, this time a recursive implementation. We observe that the asymptotic complexity of quicksort depends on the value of a quantity the algorithm uses (the pivot) and discuss ways to reduce the chances of making a bad choice for it. We conclude by examining another sorting algorithm, mergesort, which is immune to this issue. Quicksort Deliberate programming Recursion Best, average, and worst case complexity Randomness Choosing an algorithm for a problem Lecture notes Review slides Code Mergesort folk dance Quicksort folk dance
Wed 12 Jul Practice 7	A Strange Sort of Proof This session reviews proving the correctness of functions. Exercises
Thu 13 Jul Lecture 7	Libraries Arrays are homogeneous collections, where all components have the same type. `struct`s enable building heterogeneous collections, which allow combining components of different types. They are key to building pervasively used data structures. In C0, a `struct` resides in allocated memory and is accessed through an address, which brings up a new form of safety violation: the `NULL` pointer violation. We extend the language of contracts to reason about pointers. Now that we have two ways to build complex collections, we start exploring the idea of segregating the definition of a data structure and the operations that manipulate it into a library. Code that uses this data structure only needs to be aware of the type, operations and invariants of the data structure, not the way they are implemented. This is the basis of a form of modular programming called abstract data types, in which client code uses a data structure exclusively through an interface without being aware of the underlying implementation. Pointers Structs Abstract data types Interfaces, client code and library code Data structure invariants Testing Lecture notes (oli, pdf) Review slides (pointers and structs) Review slides (libraries) Code Tony Hoare about the `NULL` pointer (at 27:30)
Thu 13 Jul Practice 8	Misclaculation This session practices understanding postfix notation and stack-based machines. Exercises
Fri 14 Jul Lecture 8	Stacks and Queues In this lecture, we examine the interface of two fundamental data structures, stacks and queues. We practice using the exported functions to write client code that implements operations of stacks and queues that are not provided by the interface. By relying only on the interface functions and their contracts, we can write code that is correct for any implementation of stacks and queues. Interface of stacks and queues Using an interface Lecture notes (oli, pdf) Review slides Code How queues can turn your life around
Fri 14 Jul Practice 9	A `queue_t` Interface This session practices programming against an interface. Exercises
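
Lecture 5 notes that a correct binary search is tricky to write and that loop invariants help. The following is a minimal sketch in C, assuming the array is sorted in non-decreasing order; the function name `binsearch` and the use of `assert` for the invariant are illustrative choices, not the course's code.

```c
#include <assert.h>

/* Returns an index of x in the sorted array A[0, n), or -1 if x is absent. */
int binsearch(int x, const int *A, int n) {
  assert(n >= 0);                              /* precondition */
  int lo = 0;
  int hi = n;
  while (lo < hi) {
    /* loop invariant: x, if present, lies in A[lo, hi) */
    assert(0 <= lo && lo <= hi && hi <= n);
    int mid = lo + (hi - lo) / 2;              /* avoids overflow of lo + hi */
    if (A[mid] == x) return mid;
    if (A[mid] < x) lo = mid + 1;              /* discard A[lo, mid] */
    else hi = mid;                             /* discard A[mid, hi) */
  }
  return -1;
}
```

Each iteration halves the interval `[lo, hi)`, which is where the logarithmic bound comes from.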

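Lecture 6 observes that the cost of quicksort depends on the choice of pivot. As a sketch under stated assumptions, the version below always picks the midpoint of the range as the pivot (one possible choice, not necessarily the one used in lecture) and partitions in place; all names are illustrative.

```c
#include <assert.h>

static void swap(int *A, int i, int j) {
  int tmp = A[i];
  A[i] = A[j];
  A[j] = tmp;
}

/* Partitions A[lo, hi) around the value at index pi.
   Returns the final index of the pivot. */
static int partition(int *A, int lo, int pi, int hi) {
  int pivot = A[pi];
  swap(A, pi, hi - 1);               /* park the pivot at the right end */
  int left = lo;                     /* A[lo, left) holds elements < pivot */
  for (int i = lo; i < hi - 1; i++) {
    if (A[i] < pivot) {
      swap(A, i, left);
      left++;
    }
  }
  swap(A, left, hi - 1);             /* move the pivot to its final position */
  return left;
}

/* Sorts A[lo, hi) in place. */
void quicksort(int *A, int lo, int hi) {
  assert(0 <= lo && lo <= hi);       /* precondition */
  if (hi - lo <= 1) return;          /* 0 or 1 elements: already sorted */
  int pi = lo + (hi - lo) / 2;       /* assumption: midpoint pivot */
  int mid = partition(A, lo, pi, hi);
  assert(lo <= mid && mid < hi);
  quicksort(A, lo, mid);
  quicksort(A, mid + 1, hi);
}
```

A consistently bad pivot makes one side of every partition empty, which is the quadratic worst case; a pivot near the median gives the O(n log n) behavior.
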
Mon 17 Jul Lecture 9	Linked Lists We observe that we can implement array-like collections using a `struct` that packages each element with a pointer to the next element. This idea underlies linked lists, a data structure pervasively used in computer science. Writing correct code about linked lists is tricky, however, as we often rely on stricter invariants than natively supported, in particular the absence of cycles. We develop code to be used in contracts to check for common properties of this kind. We then use linked lists to write code that implements the stack interface, and similarly for queues. We could have given an array-based implementation, and we note the advantages and drawbacks of each choice. Linked lists Checking data structure invariants Linked list implementation of stacks and queues Choosing an implementation: trade-offs Lecture notes (oli, pdf) Review slides Code
Mon 17 Jul Practice 10	List(en) Up! This session practices working with linked lists. Exercises
Tue 18 Jul Midterm 1	Midterm 1
Wed 19 Jul Lecture 10	Unbounded Arrays When implementing a data structure for a homogeneous collection, using an array has the advantage that each element can be accessed in constant time, but the drawback that we must fix the number of elements a priori. Linked lists can have arbitrary length but access takes linear time. Can we have the best of both worlds? Unbounded arrays rely on an array to store the data, but double its size when we run out of space for new elements. The effect is that adding an element can be either very cheap or very expensive depending on how full the array is. However, a series of insertions will appear as if each one of them takes constant time on average. Showing that this is the case requires a technique called amortized analysis, which we explore at length in this lecture. Better trade-offs Amortized analysis Unbounded arrays Lecture notes Review slides The k-bit counter (10 minute video) Amortized analysis (15 minute video) Code
Wed 19 Jul Practice 11	Link it All Together This session practices working with linked lists. Exercises
Thu 20 Jul Lecture 11	Hashing Associative arrays are data structures that allow efficiently retrieving a value on the basis of a key: arrays are the special case where valid indices into the array are the only possible keys. One popular way to implement associative arrays is to use a hash table, which computes an array index out of each key and uses that index to find the associated value. However, multiple keys can map to the same index, something called a collision. We discuss several approaches to dealing with collisions, focusing on one called separate chaining. The cost of access depends on the contents of the hash table. While a worst case analysis is useful, it is not typically representative of normal usage. We compute the average case complexity of an access relative to a few simple parameters of the hash table. Genericity — part I: void pointers Associative arrays AKA dictionaries AKA maps Implementation using hash tables Dealing with collisions Randomness Average case complexity Lecture notes (generic pointers) `elem` (sec. 1) `void*` (sec. 4) Review slides (generic pointers) Lecture notes (hashing) Review slides (hashing) Code
Thu 20 Jul Practice 12	Hash This! This session practices understanding collisions in hash functions. Exercises
Fri 21 Jul Lecture 12	Hash Dictionaries In this lecture, we look at the interface of modular code at greater depth, using hash functions as a case study. In this and many other examples, it is useful for the client to fill in some parameters used by the library code so that it delivers the functionalities needed by the client. One such parameter is the type of some quantities the library acts upon, keys in our case. It is also often the case that the client wants to provide some of the operations used by the library, here how to hash a key and how to compare elements. This is a first step towards making libraries generic, so that they implement the general functionalities of a data structure but let the client choose specific details. Adaptable libraries Client-supplied operations Lecture notes Review slides Code
Fri 21 Jul Practice 13	Array Disarray This session practices coding to achieve amortized cost. Exercises
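
The doubling strategy described in Lecture 10 can be sketched as follows. This is an illustrative C version, not the course's library: the type `uba`, its field names, and the invariant `size < limit` are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
  int size;     /* number of elements in use */
  int limit;    /* capacity of data; invariant: 0 <= size < limit */
  int *data;
} uba;

uba *uba_new(void) {
  uba *U = malloc(sizeof(uba));
  if (U == NULL) abort();
  U->size = 0;
  U->limit = 1;
  U->data = malloc((size_t)U->limit * sizeof(int));
  if (U->data == NULL) abort();
  return U;
}

/* Appends x. Usually O(1); O(size) when the array has to be doubled. */
void uba_add(uba *U, int x) {
  assert(U != NULL && U->size < U->limit);
  U->data[U->size] = x;
  U->size++;
  if (U->size == U->limit) {         /* full: double the capacity */
    int *bigger = realloc(U->data, (size_t)U->limit * 2 * sizeof(int));
    if (bigger == NULL) abort();
    U->data = bigger;
    U->limit *= 2;
  }
  assert(U->size < U->limit);        /* invariant restored */
}

void uba_free(uba *U) {
  free(U->data);
  free(U);
}
```

Because the capacity doubles, the expensive copies happen at sizes 1, 2, 4, 8, and so on, so n insertions copy fewer than 2n elements in total. That is the intuition behind the constant amortized cost.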

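Lecture 9 points out that linked-list code often relies on the absence of cycles and that this can be checked in contracts. One well-known way to perform such a check is the tortoise-and-hare technique, sketched below in C; the node type and the name `is_acyclic` are assumptions made for this example.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct list_node list;
struct list_node {
  int data;
  list *next;
};

/* Returns true if following next pointers from start eventually reaches NULL. */
bool is_acyclic(list *start) {
  if (start == NULL) return true;
  list *t = start;                   /* tortoise: advances one node per step */
  list *h = start->next;             /* hare: advances two nodes per step */
  while (h != t) {
    if (h == NULL || h->next == NULL) return true;   /* hare fell off the end */
    t = t->next;
    h = h->next->next;
  }
  return false;                      /* hare caught up with tortoise: a cycle */
}
```
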
Mon 24 Jul Lecture 13	Generic Data Structures In large (and not-so-large) systems, it is common to make multiple uses of the same library, each instantiated with different parameters. This is not possible in C0, however. To achieve this goal, we look at a richer language, called C1. C1 provides two new features: generic pointers and function pointers. Generic pointers, `void *` in C, allow the same library type to be instantiated with different client types at once, which gives us a way to use a hash table library with both integers and strings as keys for example. Function pointers allow a library to be instantiated with different client-supplied operations in the same program. Genericity — part II: function pointers Lecture notes Review slides function pointers generic hash dictionaries Code
Mon 24 Jul Practice 14	Legacy of the void* This session practices defining generic libraries. Exercises
Tue 25 Jul Lecture 14	Binary Search Trees We discuss trees as an approach to representing a collection, and binary search trees as an effective data structure to store and operate on sorted data. In particular, most operations on balanced binary search trees have a cost that is logarithmic in the number of elements they contain. Binary search trees can however become unbalanced over time. Trees Binary search trees Ordering invariant Exponential speedup Lecture notes Review slides Code
Tue 25 Jul Practice 15	This One's a Treet This session practices using dictionaries to avoid recomputing values. Exercises
Wed 26 Jul Lecture 15	AVL trees Self-balancing trees guarantee the performance advantages of binary search trees by making sure that common operations keep the tree roughly balanced. This assurance comes at the price of more complex code for these operations, which rely on more complex invariants to tame it. AVL trees AVL invariants Rotations Experimental validation Lecture notes Review slides Code Animation
Wed 26 Jul Practice 16	Rotating Rotations This session practices restoring height invariants in trees. Exercises
Thu 27 Jul Lecture 16	Priority Queues We discuss priority queues, another classical data structure. Among the possible ways to implement priority queues, we consider min-heaps, a tree-based strategy that provides superior performance. We examine the data structure invariants of min-heaps and observe that we need to violate them during operations such as insertion. To conclude, we see that min-heaps are amenable to an efficient implementation based on arrays. Priority queues Implementation strategies Heap-based implementation Implementing heaps using arrays Lecture notes Review slides Code Animation
Thu 27 Jul Practice 17	Mind your P's and Q's This session practices using priority queues. Exercises
Fri 28 Jul Lecture 17	Restoring Invariants In this lecture, we write a library that implements min-heaps using arrays. Of particular interest is how the temporary violation of invariants needed to implement min-heap operations manifests at the level of contracts. Careful pre- and post-conditions are key to writing their code correctly. Temporary violation of invariants Controlling invariant violations using contracts Min-heap library implementation Heapsort Lecture notes Review slides Code
Fri 28 Jul Practice 18	Heaps of Fun This session practices using priority queues. Exercises
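
Lecture 13 combines generic pointers (`void *`) with function pointers so that one library can serve several client types in the same program. The sketch below illustrates the idea on a deliberately small "library" (a linear search) rather than on hash dictionaries; every name in it is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Library side: the element type is void*, equality is supplied by the client. */
typedef bool elem_equal_fn(void *x, void *y);

/* Returns the index of the first element of A[0, n) equal to x, or -1. */
int generic_find(void **A, int n, void *x, elem_equal_fn *equal) {
  for (int i = 0; i < n; i++) {
    if ((*equal)(A[i], x)) return i;
  }
  return -1;
}

/* Client 1: elements are pointers to int. */
bool int_equal(void *x, void *y) {
  return *(int *)x == *(int *)y;     /* cast back to the client's real type */
}

/* Client 2: elements are C strings. */
bool str_equal(void *x, void *y) {
  return strcmp((char *)x, (char *)y) == 0;
}
```

A client would call, for example, `generic_find(A, n, &key, &int_equal)` for integers and `generic_find(S, n, "cat", &str_equal)` for strings. The casts inside the client functions are unchecked in C, so passing the wrong equality function for a given array is undefined behavior.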

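Lectures 16 and 17 describe min-heaps stored in arrays and note that insertion temporarily violates the ordering invariant before restoring it. The following is a minimal sketch in C with a fixed capacity; placing the root at index 1, the constant `HEAP_LIMIT`, and all function names are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define HEAP_LIMIT 16                /* hypothetical fixed capacity */

typedef struct {
  int next;                          /* next free index; the root is at index 1 */
  int data[HEAP_LIMIT];              /* data[0] is unused */
} heap;

void heap_init(heap *H) {
  H->next = 1;
}

/* Ordering invariant: no node is smaller than its parent (at index i / 2). */
bool is_heap_ordered(const heap *H) {
  for (int i = 2; i < H->next; i++) {
    if (H->data[i / 2] > H->data[i]) return false;
  }
  return true;
}

void heap_add(heap *H, int x) {
  assert(H->next < HEAP_LIMIT && is_heap_ordered(H));   /* precondition */
  int i = H->next;
  H->data[i] = x;                    /* may violate the ordering at i only */
  H->next++;
  while (i > 1 && H->data[i / 2] > H->data[i]) {        /* sift up */
    int tmp = H->data[i];
    H->data[i] = H->data[i / 2];
    H->data[i / 2] = tmp;
    i = i / 2;
  }
  assert(is_heap_ordered(H));        /* postcondition: invariant restored */
}
```

The sift-up loop follows a single path from a leaf to the root, so insertion takes O(log n) steps.
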
Mon 31 Jul Lecture 18	Introduction to C With this lecture, we are moving from the safety of C0 to the more open-ended world of C. We introduce some basic concepts of C, in particular the C preprocessor and how macros written in this language can simulate some of the effects of C0's contracts. We also see how to compile C programs and discuss separate compilation. We conclude with C's primitives to manage memory, in particular the need to free allocated memory to prevent memory leaks. The C preprocessor Macros Contracts in C Compilation of C programs Allocation and deallocation Lecture notes Review slides Code
Mon 31 Jul Practice 19	From C1 to Shining C This session practices the main novelties of C. Exercises
Tue 1 Aug Midterm 2	Midterm 2
Wed 2 Aug Lecture 19	C's Memory Model C provides a very flexible view of memory, which allows writing potentially dangerous code unless one is fully aware of the consequences of one's decision. This lecture is about building this awareness. We see that, while C overlays an organization on the monolithic block of memory the operating system assigns to a running program, it also provides primitives that allow violating this organization. We focus on two such primitives, pointer arithmetic and address-of. While some uses are legitimate, others are not. C's approach to many non-legitimate operations is to declare them undefined, which means that what happens when a program engages in them is decided by the specific implementation of the C compiler in use. C's memory layout Pointer arithmetic Undefined behavior The address-of operator Lecture notes Review slides The C99 standard (all 552 pages of it) Code
Wed 2 Aug Practice 20	Once you C1 you C them all This session practices translating C0 code to C and managing memory leaks. Exercises
Thu 3 Aug Lecture 20	Types in C In this lecture, we examine how C implements basic types, and what we as programmers need to be aware of as we use them. We begin with strings, which in C are just arrays of characters with the `null` character playing a special role. A variety of number types are available in C, but some of their characteristics are not defined by the language, most notably their size and what happens in case of overflow. As a consequence, different compilers make different choices, which complicates writing code that will work on generic hardware. Strings in C Casting Numbers in C Implementation-defined behavior Lecture notes Review slides Code
Thu 3 Aug Practice 21	C-ing is Believing This session practices advanced C constructions. Exercises
Fri 4 Aug Lecture 21	Virtual Machines Getting a program written in a high-level language to run on the machine hardware can be achieved in a number of ways, with compilation to machine code and interpretation of the source language as the two extremes. We assess the benefits and drawbacks of each and introduce virtual machines as a modern trade-off. We explore this possibility through the C0VM, a virtual machine for C0. A compilation phase takes C0 code as input and outputs bytecode that can be run by the C0VM. We examine in depth the organization of the C0VM itself, understanding how its instructions are executed through a description called an operational semantics. We also describe some of the data structures needed to implement the C0VM itself. Interpreters vs. compilers Virtual machines Programs as data Transformation to bytecode Operational semantics Bytecode validation Lecture notes Review slides Code
Fri 4 Aug Practice 22	All sorts of sorts This session practices working with pointers in C. Exercises
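
Lecture 18 explains that preprocessor macros can simulate some of the effects of C0's contracts. One way this might look is sketched below: the macro names `REQUIRES` and `ENSURES` and the `DEBUG` switch are assumptions made for this example, and the course's own header may differ.

```c
#include <assert.h>

/* Hypothetical contract macros: checked only when DEBUG is defined. */
#ifdef DEBUG
#define REQUIRES(cond) assert(cond)
#define ENSURES(cond)  assert(cond)
#else
#define REQUIRES(cond) ((void)0)
#define ENSURES(cond)  ((void)0)
#endif

/* Midpoint of the range [lo, hi], computed without overflowing lo + hi. */
int midpoint(int lo, int hi) {
  REQUIRES(0 <= lo && lo <= hi);
  int mid = lo + (hi - lo) / 2;
  ENSURES(lo <= mid && mid <= hi);
  return mid;
}
```

Compiling with `gcc -DDEBUG` would turn the checks on; without the flag the macros expand to no-ops, much as C0 contracts are only checked when dynamic checking is requested.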

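Lecture 19 focuses on pointer arithmetic and the address-of operator. The sketch below shows one legitimate use of each; the function names are illustrative only.

```c
#include <stddef.h>

/* Sums A[0, n) by walking a pointer through the array instead of indexing. */
int sum_with_pointers(const int *A, size_t n) {
  int total = 0;
  /* A + n points one past the last element: it may be computed and compared,
     but dereferencing it would be undefined behavior. */
  for (const int *p = A; p != A + n; p++) {
    total += *p;
  }
  return total;
}

/* The caller passes &x so that the callee can modify the caller's variable. */
void increment(int *counter) {
  (*counter)++;
}
```

For an array `A`, the expressions `A[i]` and `*(A + i)` mean the same thing, and `&A[i]` is the same address as `A + i`. Moving a pointer outside the array (beyond one past the end) is where legitimate use stops.
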
Mon 7 Aug Lecture 22	Graph Representation Graphs provide a uniform way to represent many complex problems, for example the moves of a game. We define a minimal interface for working with undirected graphs and discuss two implementation strategies: adjacency matrices and adjacency lists, emphasizing the pros and cons of each. We also notice that not all graph-based problems need — or can — use an explicit representation of the underlying graph. Graphs Implicit vs. explicit graphs Adjacency matrices Adjacency lists Lecture notes Review slides Code
Mon 7 Aug Practice 23	passwordLab This session practices understanding C0VM bytecode. Exercises
Tue 8 Aug Lecture 23	Graph Search When working with graphs, one basic question is whether a node is reachable from another node by following a path. That destination node is often described by a property of interest (e.g., being a winning board in the graph representing the moves of a game). We examine various approaches to solving the reachability problem, in particular depth-first and breadth-first search, each of which has its own advantages. We then discuss various approaches to implementing these strategies. Path between nodes Depth-first search Breadth-first search Implementation strategies Lecture notes Review slides Code
Tue 8 Aug Practice 24	Computing on the Edge This session practices working with graphs. Exercises
Wed 9 Aug Lecture 24	Spanning Trees Given a starting node in a graph, it is often useful to superimpose onto the graph a way to reach each of the remaining nodes along a unique path of edges. This is a spanning tree for that graph. Things get particularly interesting for graphs whose edges carry a weight (e.g., the distance between cities in the graph representing a road map). Then, spanning trees with the smallest cumulative weight are of special interest: they are called minimum spanning trees. We discuss Kruskal's algorithm, a classical method for computing a minimum spanning tree. Spanning trees Minimum spanning trees Kruskal's algorithm Lecture notes Review slides
Wed 9 Aug Practice 25	Spend some Cycles Thinking This session practices working with graphs. Exercises
Thu 10 Aug Lecture 25	Union-Find A collection of nodes having a given property in a graph — e.g., being a minimum spanning tree — can be represented in many ways, for example as any permutation of the list consisting of these nodes. All these ways are equivalent, and indeed they form an equivalence class. At each step of Kruskal's algorithm, some of these equivalence classes need to be combined into bigger equivalence classes, while ensuring that the underlying property is still maintained — that of the result being a spanning tree. The union-find data structure maintains a canonical representative of this class. The union-find algorithm efficiently determines a canonical representative when merging two equivalence classes. Equivalence classes Canonical representative The union-find data structure The union-find algorithm Lecture notes Review slides Code Animation
Thu 10 Aug Practice 26	Final review
Fri 11 Aug (12a-3p) final	Final
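
Lecture 25 describes union-find as a way to maintain a canonical representative for each equivalence class. The sketch below is a deliberately plain C version: it uses a fixed capacity and omits refinements such as height tracking or path compression, so it illustrates the idea rather than the efficient variant. All names are hypothetical.

```c
#include <assert.h>

#define UF_MAX 64                    /* hypothetical fixed capacity */

typedef struct {
  int n;                             /* elements are 0, ..., n-1 */
  int parent[UF_MAX];                /* parent[i] == i means i is a representative */
} ufs;

/* Initially every element is alone in its own class. */
void uf_init(ufs *U, int n) {
  assert(0 <= n && n <= UF_MAX);
  U->n = n;
  for (int i = 0; i < n; i++) U->parent[i] = i;
}

/* Returns the canonical representative of the class containing i. */
int uf_find(ufs *U, int i) {
  assert(0 <= i && i < U->n);
  while (U->parent[i] != i) i = U->parent[i];
  return i;
}

/* Merges the classes of i and j (no effect if they are already the same). */
void uf_union(ufs *U, int i, int j) {
  int ri = uf_find(U, i);
  int rj = uf_find(U, j);
  if (ri != rj) U->parent[ri] = rj;
}
```

In Kruskal's algorithm, an edge (u, v) is added to the spanning tree only when `uf_find(U, u) != uf_find(U, v)`, after which the two classes are merged with `uf_union`.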

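Lectures 22 and 23 discuss adjacency matrices and breadth-first search. The two can be combined as in the following sketch, which answers the reachability question on a small undirected graph; the fixed vertex count `NV` and all function names are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define NV 6                         /* hypothetical fixed number of vertices */

typedef struct {
  bool adj[NV][NV];                  /* adj[u][v] is true iff u and v share an edge */
} graph;

void graph_init(graph *G) {
  for (int u = 0; u < NV; u++)
    for (int v = 0; v < NV; v++)
      G->adj[u][v] = false;
}

void graph_addedge(graph *G, int u, int v) {
  assert(0 <= u && u < NV && 0 <= v && v < NV);
  G->adj[u][v] = true;               /* undirected: record both directions */
  G->adj[v][u] = true;
}

/* Breadth-first search: is target reachable from start? */
bool bfs_reachable(const graph *G, int start, int target) {
  assert(0 <= start && start < NV && 0 <= target && target < NV);
  bool mark[NV] = {false};
  int queue[NV];                     /* each vertex is enqueued at most once */
  int front = 0;
  int back = 0;
  mark[start] = true;
  queue[back++] = start;
  while (front < back) {
    int v = queue[front++];
    if (v == target) return true;
    for (int w = 0; w < NV; w++) {
      if (G->adj[v][w] && !mark[w]) {
        mark[w] = true;
        queue[back++] = w;
      }
    }
  }
  return false;
}
```

With an adjacency matrix, finding the neighbors of a vertex costs O(v) regardless of how many neighbors it has, so this search takes O(v^2) time; an adjacency-list representation would bring that down to O(v + e).
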
2023 Iliano Cervesato

iliano@cmu.edu