Overview
Title: Cloud Computing
Units: 15-319: 9 units; 15-619: 12 units.
Pre-requisites: A grade of "C" or better in 15-213, Introduction to Computer Systems
Recitations: Every Tuesday at 8 AM and Thursday at 4:30 PM (Pittsburgh), GHC 4307.
Webpage: http://www.cs.cmu.edu/~msakr/15619-f14/
OLI Portal: http://oli.cmu.edu, accessed through the Carnegie Mellon Blackboard Portal: http://blackboard.andrew.cmu.edu
Description
This on-line course gives students an overview of the field of Cloud Computing, its enabling technologies, and its main building blocks, along with hands-on experience through 4 projects utilizing a public cloud (Amazon Web Services). Cloud computing services are being adopted widely across a variety of organizations and in many domains. Simply put, cloud computing is the delivery of computing as a service over a network, whereby distributed resources are rented, rather than owned, by an end user as a utility.
The course will introduce this domain and cover the topics of data centers, virtualization, cloud storage, and programming models. As an introduction, we will discuss the motivating factors, benefits, challenges, and service models. Modern data centers enable many of the economic and technological benefits of the cloud paradigm; hence, we will describe several concepts behind data center design and management. Next, we will focus on virtualization as a key cloud technique for offering software, computation, and storage services. We will study how CPU, memory, and I/O resources are virtualized, with examples from Xen and VMware, and present real use cases such as Amazon EC2. Subsequently, students will learn about different cloud storage concepts including data distribution, durability, consistency, and redundancy. HDFS, PVFS, HBase, Cassandra, and S3 will be presented as case studies. Students will understand the details of the MapReduce programming model and gain a broad overview of alternative programming models such as Pregel and GraphLab.
Students will work with Amazon Web Services, use them to rent and provision compute resources and then program and deploy applications that run on these resources. In addition, students will work with cloud storage systems and learn to develop applications in the MapReduce programming paradigm. The 15-619 students will have to complete an extra project which entails designing and implementing a complete web-service solution for querying big data. For the extra project, the students are evaluated based on the cost and performance of their web services.
Teaching Staff
Prof. Majd F. Sakr msakr@cs.cmu.edu, GHC 7006, x8-1161 Office hours: Tuesday, 3-4pm (Pittsburgh)
Teaching Assistants:
Andi Ni andin@andrew.cmu.edu
Bin Feng bfeng@andrew.cmu.edu
Felipe Faraco fsoaresf@andrew.cmu.edu
Jialiang Lin jialianl@andrew.cmu.edu
Jiang Xue jiangx@andrew.cmu.edu
Jiten Mehta jitenm@andrew.cmu.edu
Junqi Wang junqiw@andrew.cmu.edu
Kasipan Kanniah kkanniah@andrew.cmu.edu
Lina Li linal@andrew.cmu.edu
Lu Qu lqu@andrew.cmu.edu
Lu Zeng luzeng@andrew.cmu.edu
Luning Pan luningp@andrew.cmu.edu
Mrigesh Kalvani mkalvai@andrew.cmu.edu
Preston Lin yuyul@andrew.cmu.edu
Ravi Chandra Bandlamudi Venkata rbandlam@andrew.cmu.edu
Suhail Rehman suhailr@andrew.cmu.edu
Vivek Munagala vpm@andrew.cmu.edu
Xiaokang Zhang xiaokanz@andrew.cmu.edu
Yiqi Wu yiqiw@andrew.cmu.edu
Yishuang Pang yishuanp@andrew.cmu.edu
Yu Wu yuwu1@andrew.cmu.edu
TAs hold office hours in GHC 4122 and 4126; the office hour schedule is posted on Piazza.
Objectives
In this on-line course we plan to give students an overview of the field of Cloud Computing and an in-depth study of its enabling technologies and main building blocks. Students will gain hands-on experience solving relevant problems through projects that utilize existing public cloud tools. It is our objective that students develop the skills needed to become practitioners or carry out research projects in this domain. Specifically, the course has the following objectives:
Students will learn:
- the fundamental ideas behind Cloud Computing, the evolution of the paradigm, its applicability and benefits, as well as current and future challenges;
- the basic ideas and principles in data center design and management;
- different cloud resource management and sharing approaches;
- cloud storage technologies and relevant distributed file systems;
- the variety of programming models, developing working experience in one of them.
Learning Outcomes:
The primary learning outcomes of this course are five-fold. Students will be able to:
- Explain the core concepts of the cloud computing paradigm: how and why this paradigm shift came about, and the characteristics, advantages, and challenges brought about by the various models and services in cloud computing.
- Apply the fundamental concepts in datacenters to understand the tradeoffs in power, efficiency and cost.
- Identify resource management fundamentals, i.e., resource abstraction, sharing, and sandboxing, and outline their role in managing infrastructure in cloud computing.
- Illustrate the fundamental concepts of cloud storage and demonstrate their use in storage systems such as Amazon S3 and HDFS.
- Analyze various cloud programming models and apply them to solve problems on the cloud.
Detailed learning outcomes are in the PDF version.
Course Organization
Your participation in the course will involve several forms of activity:
- Going through the online coursework content for each unit.
- Completing the inline activities for each unit (“Learn by Doing” activities and “Did I Get This” review activities).
- Completing the graded checkpoint quizzes after each unit.
- Completing programming projects on AWS and submitting them through OLI.
Getting Help
For urgent communication with the teaching staff, it is best to send an email (preferred) or call the office phone. If you want to talk to a staff member in person, remember that our posted office hours are merely nominal times when we guarantee that we will be in our offices. You are always welcome to visit us outside of our office hours if you need help or want to talk about the course.
We ask that you follow a few simple guidelines. Prof. Sakr normally works with his office door open and welcomes visits from students whenever the door is open. However, if his door is closed, then he is busy with a meeting or a phone call and should not be disturbed.
We will use the course web-page as the central repository for all information about the class. Using the web-page, you can:
- Obtain copies of any handouts or assignments. This is especially useful if you miss class or you lose your copy.
- Find links to any electronic data you need for your assignments.
- Read clarifications and changes made to any assignments, schedules, or policies.
- Provide constructive feedback about the course.
Policies
Working Alone on Projects
Projects that are assigned to single students should be performed individually.
Handing in Projects
All assignments/projects are due at 11:59 PM EST (one minute before midnight) on the specified due date. All hand-ins are electronic and go through either the OLI Checkpoint system or Autolab, as specified in each individual project.
Appealing Grades
After each project phase is graded, you have seven calendar days to appeal your grade. All appeals must be submitted in writing. If you are still not satisfied after your appeal, please come and visit Prof. Sakr. If you have questions about an exam grade, please visit Prof. Sakr directly.
Assessment
Final Grade Assignment and Assessment methods
Inline activities (“Learn by Doing” and “Did I Get This”), which appear on most pages of the OLI course, are simple, non-graded activities that assess your comprehension of the material as you read through it. You are advised to complete all of the inline activities on a page before proceeding to the next page or module.
Each unit has a graded Checkpoint Quiz. You will have only one attempt at each quiz.
Cheating
Each project must be the sole work of the student turning it in, except for possible group projects. Projects will be closely monitored by automatic cheat checkers, and students may be asked to explain any suspicious similarities with any piece of code available. The following are guidelines on what collaboration is authorized and what is not:
What is cheating?
- Sharing code or other electronic files: either by copying, retyping, looking at, or supplying a copy of a file.
- Sharing written assignments: Looking at, copying, or supplying an assignment.
What is NOT cheating?
- Clarifying ambiguities or vague points in class handouts.
- Helping others use the computer systems, networks, compilers, debuggers, profilers, or other system facilities.
- Helping others with high-level design issues.
- Helping others debug their code.
Cheating in group projects will also be strictly monitored and penalized (similar to cheating in individual exams, assignments, or projects). Be aware of what constitutes cheating (and what does not) while interacting with students in other groups; the same rules on cheating as above apply to collaboration between two or more groups. You cannot share or use written assignments, code, or other electronic files from students in other groups. If you are unsure, ask the teaching staff.
Be sure to store your work in protected directories. The penalty for cheating is severe and might jeopardize your career; cheating is not worth the trouble. By cheating in the course, you are cheating yourself; the worst outcome of cheating is missing an opportunity to learn. In addition, you will be removed from the course with a failing grade, and we will place a record of the incident in your permanent record.