The following information is listed in reverse chronological order, i.e., the most recent assignments and exams appear first.
The cookie monster now reports on this web page when groups make him happy.
Frequently Asked Questions:
Q1: What should readpacket() and printpacket() do? Does it
matter where we decrypt the packet?
A1: The templates for readpacket() and printpacket() are just
suggestions. If you define struct packet correctly, both of these
functions should be very simple. However, you are free to define
struct packet however you wish, or even to get rid of it entirely.
(You may disregard the part of the assignment that says "don't change
the interface to readpacket() and printpacket()".)
Q2: Should we sort the packets?
A2: No, pfr should return the packets as they come off the
wire. You should make sure to print them in the specified form.
Q3: What ports are good for testing?
A3: Since we're sending messages on the low ports, using ports
higher than 150 is a good idea. Otherwise, you might
interfere with some other group.
Q4: In part 1 of the lab when I use ioctl() I don't get any
packets. Why not?
A4: Don't use ioctl(). It may not work as you expect, and we
have already set up your file descriptor to work in a reasonable
fashion.
Q5: What happens when I read() from a packet filter file descriptor?
A5: In openpf() we configure the packet filter such that each call to read(),
using the file descriptor returned by openpf(), returns exactly one ethernet
packet. So you want your reads to be something of the form:
    if ((*packetsize = read(fd, buf, MAXPACKETSIZE)) < 0)
        err("read");
Q6: I notice that those splay tree guys have sent 15,213 cookies to the
cookie monster. Coincidence?
A6:
Yes.
Q7: Is it true that Dr. Evil is back?
A7:
Unfortunately, Dr. Evil has cracked our cookienet and is taunting
our hardworking 213 students on one of the cookienet ports.
Q8: How do I use packetfilter(7)? The man page is very confusing.
A8:
We are not expecting you to use the packetfilter(7) interface. You
can read this man page for your own interest and education, to learn
how the low level interface to the network works. But we have set
these low level things up for you.
Q9: What should we do with the last packet (with the size field
equal to zero)?
A9: Use printf("%04d:EOM\n", seqnum) for this special packet.
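As a small illustration of that format string: %04d zero-pads the sequence number to at least four digits, so sequence number 7 is printed as "0007:EOM". The sketch below wraps the same format in a helper; format_eom() is a hypothetical name used only for this example and is not part of the lab handout.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the lab handout): writes the
 * end-of-message line for sequence number seqnum into out.
 * Returns the number of characters written, as snprintf() does.
 */
int format_eom(char *out, size_t outlen, int seqnum)
{
    assert(out != NULL && outlen > 0);
    /* Same format string as in the answer above. */
    return snprintf(out, outlen, "%04d:EOM\n", seqnum);
}
```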
Q10: I am calling read() twice. Once to read in the packet
headers, and a second time to read in the body of the packet. On the
second read it is giving me the headers again! What's up with this?
A10: Packet filters are weird. Each call to read() will return
exactly 1 packet, and if you don't give read() enough space then it
will truncate the packet. So your code should only call read() once
for each packet, with a buffer large enough to hold the whole thing.
[See the class newsgroup for more details...]
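The truncation behavior can be tried without the packet filter. The sketch below is only an analogy: it assumes a local AF_UNIX datagram socket pair behaves like the packet filter in this one respect (one message per read(), with the excess dropped when the buffer is too small). The function name read_one_datagram() and the 256-byte limit are made up for the example.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Stand-in for the packet filter (an assumption, not the lab's openpf()):
 * sends one len-byte datagram over a local socket pair and reads it back
 * with a buffer of bufsize bytes. Returns the byte count of the first
 * read(); *leftover is set to 1 if a second, non-blocking read finds data.
 */
ssize_t read_one_datagram(size_t len, size_t bufsize, int *leftover)
{
    int sv[2];
    char msg[256], buf[256];
    ssize_t n;

    assert(len <= sizeof msg && bufsize <= sizeof buf && leftover != NULL);
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    memset(msg, 'x', len);
    if (write(sv[0], msg, len) != (ssize_t)len) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* One read() consumes the whole datagram, even if buf is too small. */
    n = read(sv[1], buf, bufsize);

    /* Probe without blocking: the truncated tail is not waiting for us. */
    *leftover = recv(sv[1], buf, sizeof buf, MSG_DONTWAIT) > 0;

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

With a 100-byte message and a 10-byte buffer, the first read() returns 10 bytes and the remaining 90 are gone, which is why a header-then-body pair of reads cannot work on this kind of descriptor.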
Q11: When I call read() on my connection to the cookie monster
server, I am only getting half a line of input (Say, "* ENCODE")
then on my next read I am getting the rest of the line plus the
next line (say, " 32\r\nOK\r\n"). Are you just doing this to
make our lives harder?
A11: No, this is how read() works. It will return data as soon
as it gets it, and it does not care about line boundaries. It is up
to you to buffer the data returned by read() until you get a whole
line's worth, then parse it. One way of handling this is to ask the C
library to do it for you: turn the file descriptor into a FILE * using
fdopen(), and then use fgets() to read lines from the server, and
fprintf() followed by fflush() to write data to the server. Note that
once you have called fdopen(), it will cause problems if you ever use
that file descriptor again (since the C library thinks it controls
it). Do not call read() or write() on a file descriptor after turning
it into a FILE *, and use fclose() to close the file. If you don't
like calling fflush() all the time, you can change the buffering of a
file using the function setvbuf().
You can see how some other groups are doing by taking a look here.
Frequently Asked Questions:
Q1:
It says to call DataRef for each basic block in p and for each load or store
instruction i in b.
But what do we supply as the type when it is a basic block?
A1:
You should call DataRef for each load and store instruction only, not for
each basic block. However, in order to find each instruction you
must iterate over the basic blocks within a procedure. Like so:
for each procedure p {
    for each basic block b within p {
        for each instruction i within b {
            if i is a load or store instruction call DataRef(type, i)
        }
    }
}
See the Atom User Manual for examples of how to do this.
Q2:
The assignment says in Step 2: "The InitCache and PrintCache
routines..."
A2:
This should read: "The InitCache and PrintResults
routines..."
Q3:
Is it okay for us to hardcode the number of bytes in a matrix since we
will be working specifically with 64x64 matrices or should our code be
able to decipher the size of the matrix somehow?
A3: Yes, you can hardcode the matrix size when you compute
the copy bandwidth in your csim.anal.c routine. It is also OK for you
to optimize fast-trans.c specifically to run fast for the 64x64 case.
Your performance score will be computed from the performance of your
fast-trans.c on the 64x64 case. So go crazy!
Q4:
On page 4 of the writeup in the handin instructions:
"...when you execute an instrumented binary, it should produce
exactly one line of output, ..."
A4:
This should read: "...when you execute an instrumented binary,
it should produce exactly one line of output per group member, ..."
The point is to be sure to call the handin_output() routine
exactly once. This routine will print a line of output for
each member of your group.
Q5:
Can we call other routines from fast_trans(), such as bcopy?
A5:
No.
Q6:
In Step 2, the addresses in my traces
are biased by some fixed amount from the reference
traces.
A6:
For some reason, maroon produces traces whose addresses are off by
exactly 32 bytes from those produced by slate. We're not sure why this
is so, but it's not a big deal and won't affect the correctness of
your simulation.
Q7:
How fast does my solution need to be in order to get full credit?
A7:
Copy bandwidths of 72-74 MB/s are fairly straightforward to achieve.
You'll need to be near or above 80 MB/s to get full performance
credit.
Q8:
In implementing fast_trans, are we allowed to do all of the loads and
stores as 64-bit values?
A8:
No. The virtual machine you're simulating uses 32-bit loads
and stores.
Q9:
My csim.inst.c compiles and runs, but all the addresses are "1"! Why
doesn't it produce the correct addresses?
A9:
You need to look at your call to AddCallProto() for DataRef(). If an
argument is declared "int", then the *exact value* you pass to
AddCallInst() will be passed through to DataRef(). If you declare an
argument as "VALUE", this means you are asking Atom to compute a value
for you, such as EffAddrValue.
Frequently Asked Questions:
Q1: How do we handle page faults?
A1: Actually, none of the references in problems 3-5 results
in a page fault. If you're getting a page fault, you're doing something wrong.
If there were a page fault (valid bit = 0), however, the correct response would be to stop and go on to the next problem, since the PPN in the invalid page table entry is no longer any good. In a real system, the OS would bring in the physical page, decide where to place it in physical memory, and then assign a new PPN, at which point you could use the new PPN to reference the cache.
Q2: For problem 2, am I correct in assuming that some bits may be used for 2
or more of the labels, e.g. a bit range could be used for the VPN and the
TLBT?
A2: That's correct, the VPN and the TLBT share some of the same bits.
Q3: I'm confused about where the TLBI and TLBT are involved in
a virtual address. I can't seem to find where this is explained in
the class notes.
A3: The key is to remember that the TLB is just another cache
that is referenced by the VPN. The TLBI and the TLBT partition the
bits in the VPN. The TLBI occupies the low order bits of the VPN, and
the TLBT occupies the remainder of the VPN bits. The exact number of
TLBI bits is determined by the number of sets in the TLB.
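The partitioning described above can be sketched in a few lines of C. The function split_va() and the struct va_fields are hypothetical names for this illustration, and the parameters used in the usage note (6 page offset bits, a TLB with 4 sets, address 0x03d4) are assumed values, not the ones from the homework.

```c
#include <assert.h>

/* Fields of a virtual address (illustrative names, not from the notes). */
struct va_fields {
    unsigned vpn;   /* virtual page number */
    unsigned vpo;   /* virtual page offset */
    unsigned tlbi;  /* TLB index: low-order bits of the VPN */
    unsigned tlbt;  /* TLB tag: remaining high-order bits of the VPN */
};

/*
 * Splits va given the number of page offset bits and the number of TLB
 * index bits (log2 of the number of TLB sets).
 */
struct va_fields split_va(unsigned va, unsigned page_offset_bits,
                          unsigned tlb_index_bits)
{
    struct va_fields f;

    assert(page_offset_bits < 32 && tlb_index_bits < 32);
    f.vpo  = va & ((1u << page_offset_bits) - 1);
    f.vpn  = va >> page_offset_bits;
    f.tlbi = f.vpn & ((1u << tlb_index_bits) - 1);
    f.tlbt = f.vpn >> tlb_index_bits;
    return f;
}
```

For example, split_va(0x03d4, 6, 2) gives VPO = 0x14 and VPN = 0xf; the VPN then splits into TLBI = 0x3 (its low 2 bits, since 4 sets need 2 index bits) and TLBT = 0x3 (the rest).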
Exam 1 will be held in class on Tuesday, October 6. Coverage will include all of the lecture material, up through (and including) class 9 (Sept. 22). Floating point will not be covered. In addition, you will be expected to have really done all of the work required for homeworks H1 and H2, as well as lab L1.
Here are some logistical issues:
To help you get started, we have created a practice exam available in postscript. This exam was created by adapting problems previously given in CS 347. They are fairly representative of the styles of problems you will encounter. These problems, and their answers, will be covered in recitations on Monday, October 5. Don't ask us to provide online answers to these problems. Note also that these practice problems do not cover the entire scope of the problems you'll encounter on the exam. Be sure to study the lecture notes and to review the homework and lab material.
tar xf /afs/cs.cmu.edu/academic/class/15213-f98/H1/H1.tar ftime.c
and then recompile your code. This version seems to work quite well, even on a heavily loaded system.