15-410 Homework 2

This homework assignment is due Friday, May 4th at 23:59:59. As we intend to make solutions available on the web site immediately thereafter, please turn your solutions in on time. Late days are not available for this assignment.

Homework must be submitted (online) in either PostScript or PDF format (not: Microsoft Word, Word Perfect, Apple Works, LaTeX, XyWrite, WordStar, etc.). A plain text file (.text or .txt) is also acceptable, though it must conform to Unix expectations, meaning lines of no more than 120 characters separated by newline characters (note that this is not the Windows convention, and MacOS has two different conventions). Except as otherwise directed (in the crypto question), turn in your answers as .../$USER/hw2/$USER.pdf, .../$USER/hw2/$USER.ps, or .../$USER/hw2/$USER.text. If you use another filename, there is some risk that your solutions will not be credited to you. Please try not to submit your homework into your book-report directory or the other way around.

As usual, you may discuss this assignment with others, but you must then go off by yourself to write up the solution.

Question 1 - Public Key Practicum

As you go through the steps of working on this question, try to think carefully about what each step is accomplishing in terms of underlying cryptography primitives.

Follow the directions in gpg.html to generate a PGP key ring, containing public and private keys for digital signature and encryption purposes. Do not turn the key ring in to your hw2 directory! Instead, follow the directions on how to export the public key information from the key ring into a file, hw2/$USER.asc. Then create a secret message for the course staff, in hw2/$USER.secret.asc.

Question 2 - Producer/Consumer

In the IPC/RPC lecture, your instructor briefly claimed that IPC systems with blocking send() operations are "bad for the producer/consumer pattern". In this problem you will explore what that might mean.

What is "producer/consumer"? It is a communication pattern for multiple processes or threads, most simply one producer talking to one consumer. For example, one process might have the job of producing blocks of "tasteful" random numbers: it repeatedly draws a bunch of random numbers and checks the standard deviation of the numbers against a target; if the standard deviation is acceptable, the block is forwarded to the consumer using some communication channel, but if the standard deviation is too high, the producer discards the block and tries again. Obviously it is difficult to predict exactly when the producer will next send a "tasteful" block to the consumer, even though the mathematically inclined can probably say meaningful things about the long-term rate at which tasteful blocks will be generated.

Meanwhile, each time the consumer gets a block of "tasteful" random numbers, it uses them as the definition for some problem to solve (imagine some graph algorithm, or an optimization problem--something that requires an amount of computation that varies noticeably depending on the exact details of the inputs). When the consumer solves one of these problems, it happily records the answer in a log file somewhere and requests the next problem definition from the producer. Again, it is difficult to say how long each solution will take, though it may be possible to characterize the overall/average rate at which solutions can be generated.

The exact details of the connection between the producer and consumer don't matter too much. If they are threads in a task, maybe the "connection" is some shared-memory data structure managed by typical thread synchronization primitives. If they are independent processes on a Unix-like system, the connection might be a Unix pipe, operated on by read() and write() system calls.

Sometimes the consumer might be able to solve several problems in a row very quickly; in that case, when it tries to get a new problem from the producer one won't be ready yet, and the consumer will need to block. It shouldn't be surprising that a read() (or IPC receive) operation might need to block; this question is about when the corresponding write() (or IPC send) might block.

When the lecture refers to "blocking send()", what is meant is a "strictly-blocking" send operation that always blocks the sender if there isn't already a receiver waiting. Strictly-blocking send is nice, of course, if you are the entity responsible for implementing the communication channel: when either party arrives before the other, you block that party until the other arrives; once the second party arrives, you transfer data and unblock both parties; meanwhile, you don't need to allocate space to store (buffer) data for a potentially unbounded time between when a sender wants to send and when a receiver feels like receiving. Note that if you allow a sender to send multiple times without a matching reciever the amount of data you would be responsible for buffering might be unbounded.

Also note that a blocking send operation is fine for some communication patterns, such as client-server, in which the client always does a send to the server followed by a receive from the server. If the client can't do anything useful until it receives the answer from the server, it doesn't care whether it blocks twice (once in the send operation, until the server receives the request, and then once in the receive operation, until the response is available) or just once (if the IPC channel buffers the request and returns immediately from the send operation, then the client will immediately block in the receive operation until the response is ready).

Part A

Explain why a "strictly blocking send" is bad for the producer/consumer pattern. Your answer should assume a single producer thread/process and a single consumer thread/process sharing a single communication channel, and should ignore the effects of any other programs that might be running on the machine. Your answer should consider two cases, that of a uni-processor machine and that of a multi-processor machine. You should show at least one execution trace that clearly illustrates your concern. If you wish to make assumptions about the relative rates at which the producer and consumer run, you may.

Part B

Unix pipes (and also port-based IPC mechanisms) historically almost always ran on uni-processor machines, because historically almost all machines were uni-processors. However, Unix pipes (and also port-based IPC mechanisms) generally supported buffered send operations. Briefly speculate as to why send buffering was considered a good idea, even though, historically speaking, kernel memory for buffering was a limited resource.

Helpful Hint

By the way, if you think you are having AFS permission problems, try running the program located at
% /afs/cs.cmu.edu/academic/class/15410-s12/pub/access_hw2

[Last modified Tuesday May 01, 2012]