CS 213 Fall '98
Notes on Assignments and Exams

In the following, the term HOME213 refers to the home directory for the course, namely /afs/cs.cmu.edu/academic/class/15213-f98/.

The following information is in reverse order of assignment and exam, i.e., the most recent assignments and exams appear first.

Final exam ( ps, pdf, solution )

The average score was 79/100. The highest score was 98/100.

Lab Assignment L4 (ps, pdf)

The cookie monster now reports when groups make him happy to this web page.

Frequently Asked Questions:

Q1: What should readpacket() and printpacket() do? Does it matter where we decrypt the packet?
A1: The templates for readpacket() and printpacket() are just suggestions. If you define struct packet correctly, both of these functions should be very simple. However, you are free to define struct packet however you wish, or even to get rid of it entirely. (You may disregard the part of the assignment that says "don't change the interface to readpacket() and printpacket()".)

Q2: Should we sort the packets?
A2: No, pfr should return the packets as they come off the wire. You should make sure to print them in the specified form.

Q3: What ports are good for testing?
A3: Since we're sending messages on the low ports, using ports higher than 150 is a good idea. Otherwise, you might interfere with some other group.

Q4: In part 1 of the lab when I use ioctl() I don't get any packets. Why not?
A4: Don't use ioctl(). It may not work as you expect, and we have already set up your file descriptor to work in a reasonable fashion.

Q5: What happens when I read() from a packet filter file descriptor?
A5: In openpf() we configure the packet filter so that each call to read() on the file descriptor returned by openpf() returns exactly one Ethernet packet. Your reads should therefore have the form: if ((*packetsize = read(fd, buf, MAXPACKETSIZE)) < 0) err("read");

Q6: I notice that those splay tree guys have sent 15,213 cookies to the cookie monster. Coincidence?
A6: Yes.

Q7: Is it true that Dr. Evil is back?
A7: Unfortunately, Dr. Evil has cracked our cookienet and is taunting our hardworking 213 students on one of the cookienet ports.

Q8: How do I use packetfilter(7)? The man page is very confusing.
A8: We are not expecting you to use the packetfilter(7) interface. You can read this man page for your own interest and education, to learn how the low level interface to the network works. But we have set these low level things up for you.

Q9: What should we do with the last packet (with the size field equal to zero)?
A9: Use printf("%04d:EOM\n", seqnum) for this special packet.

Q10: I am calling read() twice. Once to read in the packet headers, and a second time to read in the body of the packet. On the second read it is giving me the headers again! What's up with this?
A10: Packet filters are weird. Each call to read() will return exactly 1 packet, and if you don't give read() enough space then it will truncate the packet. So your code should only call read() once for each packet, with a buffer large enough to hold the whole thing. [See the class newsgroup for more details...]
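The one-packet-per-read() behavior described above can be tried out without the cookienet. The following is only a sketch under stated assumptions: open_fake_pf() and read_packet() are hypothetical helpers, not part of the lab code, and a UNIX datagram socket pair stands in for the descriptor returned by openpf() because it also delivers one message per read() and drops whatever does not fit in the buffer. The value given for MAXPACKETSIZE is a guess; use the lab's own definition.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define MAXPACKETSIZE 1514 /* assumed value; the lab headers define the real one */

/* Hypothetical stand-in for openpf(): sv[0] plays the wire, sv[1] plays
 * the packet filter descriptor. Returns 0 on success, -1 on error. */
int open_fake_pf(int sv[2])
{
    return socketpair(AF_UNIX, SOCK_DGRAM, 0, sv);
}

/* Read one whole packet with a single read() call.
 * Returns the number of bytes stored in buf, or -1 on error.
 * If cap is smaller than the packet, the rest of the packet is lost. */
ssize_t read_packet(int fd, unsigned char *buf, size_t cap)
{
    ssize_t n = read(fd, buf, cap);
    if (n < 0)
        perror("read");
    return n;
}
```

If two packets are written to sv[0] and the first is read from sv[1] with a buffer that only holds its header, the next call does not return the first packet's body. It returns the second packet, starting with its header. Calling read_packet() with a buffer of MAXPACKETSIZE bytes avoids the problem.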

Q11: When I call read() on my connection to the cookie monster server, I am only getting half a line of input (say, "* ENCODE"), then on my next read I am getting the rest of the line plus the next line (say, " 32\r\nOK\r\n"). Are you just doing this to make our lives harder?
A11: No, this is how read() works. It will return data as soon as it gets it, and it does not care about line boundaries. It is up to you to buffer the data returned by read() until you get a whole line's worth, then parse it. One way of handling this is to ask the C library to do it for you: turn the file descriptor into a FILE * using fdopen(), and then use fgets() to read lines from the server, and fprintf() followed by fflush() to write data to the server. Note that once you have called fdopen(), it will cause problems if you ever use that file descriptor again (since the C library thinks it controls it). Do not call read() or write() on a file descriptor after turning it into a FILE *, and use fclose() to close the file. If you don't like calling fflush() all the time, you can change the buffering of a file using the function setvbuf().
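A minimal sketch of the fdopen() approach described in the answer above. The helper names read_line() and send_line() are made up for illustration, and the assumption that the server wants lines ending in "\r\n" is inferred only from the server output quoted in the question.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

/* Read one full line from the stream, however read() happened to split it.
 * The trailing "\r\n" (or "\n") is stripped.
 * Returns 0 on success, -1 on end of file or error. */
int read_line(FILE *in, char *line, size_t cap)
{
    if (fgets(line, (int)cap, in) == NULL)
        return -1;
    line[strcspn(line, "\r\n")] = '\0';
    return 0;
}

/* Write one line and push it out immediately.
 * Returns 0 on success, -1 on error. */
int send_line(FILE *out, const char *text)
{
    if (fprintf(out, "%s\r\n", text) < 0)
        return -1;
    return fflush(out) == 0 ? 0 : -1;
}
```

Typical use would be FILE *in = fdopen(fd, "r"); followed by repeated calls to read_line(in, line, sizeof line). After the fdopen() call, leave fd alone and finish with fclose(in).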

Exam 2 ( ps, pdf, solution, histogram, review-ps, review-pdf )

The average score was 48/70. There were two scores of 69/70.

Lab Assignment L3 (ps, pdf)

Some interesting solutions are available here.

You can see how some other groups are doing by taking a look here.

Frequently Asked Questions:

Q1: It says to call DataRef for each basic block in p and for each load or store instruction i in b. But what do we supply as a type for when it is a basic block?
A1: You should call DataRef for each load and store instruction only, not for each basic block. However, in order to find each instruction you must iterate over the basic blocks within a procedure. Like so:
for each procedure p {
    for each basic block b within p {
        for each instruction i within b {
            if i is a load or store instruction
                call DataRef(type, i)
        }
    }
}
See the Atom User Manual for examples of how to do this.

Q2: The assignment says in Step 2: "The InitCache and PrintCache routines..."
A2: This should read: "The InitCache and PrintResults routines..."

Q3: Is it okay for us to hardcode the number of bytes in a matrix since we will be working specifically with 64x64 matrices or should our code be able to decipher the size of the matrix somehow?
A3: Yes, you can hardcode the matrix size when you compute the copy bandwidth in your csim.anal.c routine. It is also OK for you to optimize fast-trans.c specifically to run fast for the 64x64 case. Your performance score will be computed from the performance of your fast-trans.c on the 64x64 case. So go crazy!

Q4: On page 4 of the writeup in the handin instructions: "...when you execute an instrumented binary, it should produce exactly one line of output, ..."
A4: This should read: "...when you execute an instrumented binary, it should produce exactly one line of output per group member, ..." The point is to be sure to call the handin_output() routine exactly once. This routine will print a line of output for each member of your group.

Q5: Can we call other routines from fast_trans(), such as bcopy?
A5: No.

Q6: In Step 2, the addresses in my traces are biased by some fixed amount from the reference traces.
A6: For some reason, maroon produces traces whose addresses are off by exactly 32 bytes from those produced by slate. We're not sure why this is so, but it's not a big deal and won't affect the correctness of your simulation.

Q7: How fast does my solution need to be in order to get full credit?
A7: Copy bandwidths of 72-74 are fairly straightforward to achieve. You'll need to be near or above 80 MB/s to get full performance credit.

Q8: In implementing fast_trans, are we allowed to do all of the loads and stores as 64-bit values?
A8: No. The virtual machine you're simulating uses 32 bit loads and stores.

Q9: My csim.inst.c compiles and runs, but all the addresses are "1"! Why doesn't it produce the correct addresses?
A9: You need to look at your call to AddCallProto() for DataRef(). If an argument is declared "int", then the *exact value* you pass to AddCallInst() will be passed through to DataRef(). If you declare an argument as "VALUE", this means you are asking Atom to compute a value for you, such as EffAddrValue.

Homework Assignment H4 (ps, pdf, solution )

Correction: The last sentence of the first paragraph on page 2 (Problem 1) incorrectly asks you to assume that each *block* has a set of LRU bits, rather than each *set*. Instead, it should read: "For your computation of C, assume that for each block there are an additional t tag bits and 1 valid bit, and that for each set there are an additional E(E-1) bits that implement the LRU replacement policy."
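The corrected sentence can be turned into a small calculation. This is a sketch, not the official solution: it assumes that C is the total number of bits in the cache including the data bits, and it uses S for the number of sets and B for the number of bytes per block, which are symbols the correction itself does not mention. The function name cache_bits() is made up.

```c
/* Total bits in a cache with S sets, E blocks per set, B data bytes per
 * block and t tag bits per block (assumed meaning of C in Problem 1). */
unsigned long cache_bits(unsigned long S, unsigned long E,
                         unsigned long B, unsigned long t)
{
    unsigned long per_block = 8 * B + t + 1;                /* data + tag + valid */
    unsigned long per_set = E * per_block + E * (E - 1);    /* blocks + LRU bits  */
    return S * per_set;
}
```

For a direct-mapped cache (E = 1) the LRU term E(E-1) is zero, as expected, since there is never a choice of which block to replace.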

Frequently Asked Questions:

Q1: How do we handle page faults?
A1: Actually, none of the references in problems 3-5 results in a page fault. If you're getting a page fault, you're doing something wrong.

If there were a page fault however (valid bit=0), then the correct response would be to stop and go on to the next problem, since the PPN in the invalid page table entry is no longer any good. In a real system, the OS would bring in the physical page, decide where to place it in physical memory, and then assign a new PPN, at which point you could use the new PPN to reference the cache.

Q2: For problem 2, am I correct in assuming that some bits may be used for 2 or more of the labels, e.g. a bit range could be used for the VPN and the TLBT?
A2: That's correct, the VPN and the TLBT share some of the same bits.

Q3: I'm confused about where the TLBI and TLBT are involved in a virtual address. I can't seem to find where this is explained in the class notes.
A3: The key is to remember that the TLB is just another cache that is referenced by the VPN. The TLBI and the TLBT partition the bits in the VPN. The TLBI occupies the low order bits of the VPN, and the TLBT occupies the remainder of the VPN bits. The exact number of TLBI bits is determined by the number of sets in the TLB.
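The partition described in this answer can be sketched in a few lines. The struct and function names are made up for illustration, and the page size and number of TLB sets are parameters here because the values used in the homework are not restated on this page.

```c
#include <assert.h>
#include <stdint.h>

struct tlb_fields {
    uint64_t vpn;   /* virtual page number               */
    uint64_t tlbi;  /* TLB index: low-order bits of VPN  */
    uint64_t tlbt;  /* TLB tag: remaining bits of VPN    */
};

/* page_bits is log2(page size), i.e. the number of page-offset bits.
 * tlb_sets is the number of sets in the TLB and must be a power of two. */
struct tlb_fields split_va(uint64_t va, unsigned page_bits, unsigned tlb_sets)
{
    struct tlb_fields f;
    unsigned index_bits = 0;

    assert(tlb_sets != 0 && (tlb_sets & (tlb_sets - 1)) == 0);
    while ((1u << index_bits) < tlb_sets)
        index_bits++;               /* index_bits = log2(tlb_sets) */

    f.vpn = va >> page_bits;
    f.tlbi = f.vpn & (tlb_sets - 1);
    f.tlbt = f.vpn >> index_bits;
    return f;
}
```

For example, with 64-byte pages (page_bits = 6) and a 4-set TLB, the address 0x3d4 has VPN 0xf, TLBI 3 and TLBT 3.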

Lab Assignment L2 (ps, pdf)

Start working on your malloc and free, using your own testing code (modify test_malloc.c). The automated testing server now works! You can get an updated Makefile in $CLASSDIR/L2/Makefile. You can get feedback on the correctness and performance of your implementation by submitting your malloc.c using "make validate NAME=groupname". You'll get a response via e-mail. You can see how other groups are doing by taking a look here. The codenames are randomly assigned (you'll be e-mailed your codename when you submit your malloc).

Don't forget to modify the "team_struct" in malloc.c.
The functions were updated after the handout was printed to (1) use unsigned long for sizes, and (2) deal with "void *" instead of "char *" for consistency with typical UNIX mallocs.
There is some confusion about the "minimum free block size". Assume that your free blocks have a word boundary tag at the start and end. Then your free block size is 16 bytes (two Alpha words). That's great. Of course, having more fields in the free block might be a good tradeoff for getting performance points.... Also, the minimum allocatable block size will be 16: so if you need only 8 bytes of bookkeeping for each allocated block, your smallest block size is 8 bytes for data plus 8 bytes of overhead.
The "performance" number reported by the timing server is the average time in seconds for a pair of malloc/free calls, computed across several benchmarks. The entries on the web page are sorted in increasing order based on this field.
The performance number will strongly reflect whether your L2_free is linear or constant time. If it is linear in the total number of blocks, rather than the number of free blocks, your performance test is likely to take forever and be killed by the server. You really need to shoot for O(#free blocks) or better.
The "overhead" is the fraction of the heap that your allocator wastes (i.e. doesn't let the application use for storage). This number is calculated on a simple access pattern of same-sized blocks until L2_malloc returns NULL. Worst case is 1.0.
The due date has been extended by two days. Please hand in your assignment (using make handin NAME=yourname) by Friday, October 23, 1998 at 11:59pm.
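The block-size arithmetic from the note on the minimum free block size, and the definition of "overhead", can be written out as follows. This is only a sketch: it assumes 8 bytes of bookkeeping per allocated block (the example case in the note) and 8-byte alignment, and the names block_size() and overhead_fraction() are made up. The timing server's exact measurement procedure is not reproduced here.

```c
#include <stddef.h>

#define WORD      8   /* one Alpha word, in bytes                        */
#define OVERHEAD  8   /* assumed bookkeeping bytes per allocated block   */
#define MIN_BLOCK 16  /* boundary tag word at the start and at the end   */

/* Total block size needed to satisfy a request for payload bytes. */
size_t block_size(size_t payload)
{
    size_t rounded = (payload + WORD - 1) & ~(size_t)(WORD - 1);
    size_t total = rounded + OVERHEAD;
    return total < MIN_BLOCK ? MIN_BLOCK : total;
}

/* Fraction of the heap the application could not use for storage. */
double overhead_fraction(size_t heap_bytes, size_t usable_bytes)
{
    return (double)(heap_bytes - usable_bytes) / (double)heap_bytes;
}
```

With these assumptions a request for 1 to 8 bytes occupies a 16-byte block, so an allocator serving only 8-byte requests would show an overhead of 0.5.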

Exam 1 ( ps, pdf, histogram, solution )

The average score was 57/75. The maximum score was 72/75.

Exam 1 will be held in class on Tuesday, October 6. Coverage will include all of the lecture material, up through (and including) class 9 (Sept. 22). Floating point will not be covered. In addition, you will be expected to have really done all of the work required for homeworks H1 and H2, as well as lab L1.

Here are some logistical issues:

The exam will be open book and open notes and handouts. You may bring any (printed) material you wish.
You will write all of your answers on the exam. No blue books are required.

To help you get started, we have created a practice exam available in postscript. This exam was created by adapting problems previously given in CS 347. They are fairly representative of the styles of problems you will encounter. These problems, and their answers, will be covered in recitations on Monday, October 5. Don't ask us to provide online answers to these problems. Note also that these practice problems do not cover the entire scope of the problems you'll encounter on the exam. Be sure to study the lecture notes and to review the homework and lab material.

Homework Assignment H3 (ps, pdf, solution )

This assignment should not take long. There are no hints or comments yet.

Lab Assignment L1 (ps, pdf)

The bomb statistics web page now contains the final results!
Dr. Evil has posted a useful message to the class newsgroup! Read it here.
A web page has been created which will tell you how far you are in defeating your bomb. It is here.
Check out the revised version of the Alpha Assembly Language Guide (Postscript, PDF.) It has updated descriptions on some of the byte-level operations you'll find in your bomb code.
Please request your bomb now! You get the bomb when I read my mail, and that will not happen at funny hours of the day. So request it before you need it, please!

Homework Assignment H2 (ps, pdf, solution)

Some advice: Problem 2 is fairly tricky. Skip it and come back to it.
HANDIN PROBLEM: The handin instructions on the assignment do not work! Please copy the new and improved Makefile from $CLASSDIR/H1/Makefile, then use the command make handin NAME=yourname to hand in your assignment. Replace yourname with the Andrew ID of the first member of your group. I have updated H1.ps to reflect this change.
I have received reports of the makefile still not working for some people. The solution is: make sure you are working on an Alpha system. This assignment depends on disassembling the object code, and it will differ dramatically if you are on a Sun.

Homework Assignment H1 (ps, pdf)

Due Date Extension: The due date for this assignment has been delayed by one full week. It will be due on Thurs., Sept. 10 at 12:01am.
Further Restrictions on function bang: Your code for this function may not call any other functions. This restriction is imposed since some of your other functions may make use of the ! operator.
Commenting: The assignment write up indicates that you should write a single line comment above each line of code. In fact, this rule will not be strictly enforced. You may write multiple lines or none. All we ask is that you give some indication via comments as to how your implementation works. This can be quite brief.
Systems, Languages, and Compilers: You are welcome to do your code development using any system or compiler you choose. Note, however, that the version you turn in must compile and run correctly on both the class Alphas and the Andrew Suns. If your code does not compile, you will lose lots of points. You have access to the exact system configurations we will be using, including the compiler (gcc), and setting of the compiler's command line flags. Make sure you test your code for these configurations before submitting it.
Extraneous include's, print Statements. Much of our testing of your code will be done automatically. You will be penalized for things that require us to go back and fix up your code before it can run. Make sure your file bits.c does not include any #include's, other than of bits.h. Make sure you have no print statements in your code.
c2c rule checking compiler. We will use a modified version of the c2c ANSI C compiler to check your programs for compliance with the coding rules. An Alpha binary version is available in the H1 directory. See the course newsgroup cmu.cs.class.cs213 for details.
Improved timing code. I fiddled with the timing code to make it less sensitive to the system load. Get the file "ftime.c" from the archive H1.tar on the class directory. You can do this with the command:
tar xf /afs/cs.cmu.edu/academic/class/15213-f98/H1/H1.tar ftime.c
and then recompile your code. This version seems to work quite well, even on a heavily loaded system.
A few hints. Here are a few little hints to help you on some of the more difficult parts of H1. They are intentionally terse. Use them to guide your thought process.
- isignBit: Look at the binary representations of 63 and 31.
- highByte: Look at the binary representations of 56 and 24.
- absval: (x >> signBit()) is interesting, both as a number and as a mask.
- bang: 0 is the only value for x such that both x and -x are nonnegative.
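To show where the last two hints lead, here is one possible sketch for a fixed 32-bit int. It is not the reference solution: the assignment's list of permitted operators is not reproduced on this page, so the operators used here are an assumption, and the names bang32() and absval32() are made up. The shift amount 31 plays the role of signBit() on a 32-bit machine.

```c
#include <stdint.h>

/* Returns 1 if x is zero and 0 otherwise, without the ! operator.
 * For any nonzero x, either x or -x has its sign bit set. */
int bang32(int32_t x)
{
    uint32_t ux = (uint32_t)x;
    uint32_t neg = 0u - ux;     /* two's-complement negation, cannot overflow */
    return (int)(((ux | neg) >> 31) ^ 1u);
}

/* Absolute value using x >> 31 both as a mask and as a number.
 * Assumes >> on a negative int is an arithmetic shift (true for gcc),
 * so mask is 0 for x >= 0 and -1 (all ones) for x < 0.
 * The result is undefined for the most negative int. */
int32_t absval32(int32_t x)
{
    int32_t mask = x >> 31;
    return (x ^ mask) - mask;   /* flips the bits, then adds 1, when x < 0 */
}
```

On the Alpha the same idea applies to 64-bit values with a shift of 63 instead of 31.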
