Due: Fri, 10 Mar 2017 23:59:00 -0500
The goal of this assignment is to verify something larger and more ambitious than what we’ve done before. Specifically, you will write a utility for losslessly compressing and decompressing files. Your program should run from the command line as:
./compression 1 SourceFile DestFile
to compress SourceFile
into DestFile
. Given a 0 instead of a 1, it should decompress SourceFile
into DestFile
. In your file, you should prove that calling decompress on compress is the identity function. Note, however, that you will not receive credit for implementing compress and decompress as the identity function! Be sure to document any references you use for your compression algorithms.
We have included three example files for you to run your compression and decompression routines on. We will test your solution on similar but not identical files. If your verified compression routine produces a smaller file for at least one of the three, you will receive full credit. If you compress all three, you will receive extra credit. The verified implementation that compresses the most will receive additional extra credit (in the event of a tie, the solution that runs the fastest will win). You can also get partial credit if you at least write a functional specification for compress and decompress and prove that they are inverses, even if you don’t manage to implement them successfully. You are encouraged to comment your code, so we can understand your design decisions.
Please also put the answer to the questions below in a file named answers.txt
.
In this assignment, your compression algorithm is lossless, which lends itself to a succinct specification. How would you succinctly specify the correctness of a lossy compression algorithm like JPG? This question is a bit speculative, so there isn’t necessarily a “correct” answer.
For this assignment, you get to choose your own verification tool: Dafny or Coq. To make the decision, think about the algorithms you want to implement, the proofs you’ll need to write about them, and your level of comfort using the two tools. Fair warning: When we used this assignment in a previous iteration of the class, the Dafny version proved very challenging, so please start early. This is the first iteration with Coq, so the difficulty of that choice is unknown (could be easier, or could be much harder). Your ambition for your compression algorithm will also affect the assignment’s difficulty.
Why did you choose the tool you did?
As you work through the assignment, please keep track of the experience, so you can answer the following questions.
How long did you spend on the assignment?
What were some of the biggest challenges you struggled with?
What did your tool of choice do well? What did it do poorly?
For the truly ambitious, do the assignment in both Coq and Dafny, ideally using the same algorithm in both (only change algorithms if your original choice appears to be infeasible in the other tool).
Compare and contrast your experience using the two different tools.
Completing this challenge successfully will give you up to 20 points out of the 10 points this homework is worth.
For this assignment, you’ll need some additional files. You’ll find the sample input files in the sample-input
directory.
The provided Dafny files should look reminiscent of the earlier Dafny assignment, so you may want to consult that documentation as well.
For this assignment, there’s a skeleton “solution” already present in compression.dfy
. It should compile and run successfully if you execute:
Your task is to flesh out the compress and decompress functional specifications and accompanying methods and proof (lossless
) to apply to a non-trivial implementation. Be sure to write as strong a specification as you can for Main
(i.e., formalize the loose specification above), similar to the file copy assignment.
Create a zip file dafny.zip
that contains all of the files necessary to verify and compile your program.
The grading script will expect the following commands to execute correctly, given your zip file:
If you decide to include additional files, please add a README explaining how they relate. The grading script will run Dafny over them as well, with the same parameters shown above.
The provided Coq file Compression.v
follows a structure similar to the Dafny file. It currently implements compression and decompression as a trivial identity function, making it easy to prove that decompress is the inverse of compress. The file relies on the Coq.io library, which you will need to install via OPAM (the package manager for OCaml). On Mac OS X, you can do this via:
You may need to run eval \
opam config env` in terminals where you did not run
opam init`.
Once you’ve installed Coq.io, you should be able to verify the current Compression.v
. Do this to extract the code in Compression.v
, and then
to build an executable, run:
This should give you an executable named main.native
that you can run.
Your main goal should be to implement an effective compression routine and prove the lossless
theorem. Proving additional properties about your program’s actual behavior using the effects system provided by Coq.io
would be a bonus.
You will likely want to consult the documentation for ListString, which provides things like LString.t
.
As another useful hint, remember that by default Coq understands =
to mean Propositional equality, which isn’t useful if you want to change your program’s run-time behavior based on an equality test. For that, you’ll want to try things like LString.eqb
.
Create a zip file coq.zip
that contains all of the files necessary to verify and compile your program.
The grading script will expect the following commands to execute correctly, given your zip file:
If you decide to include additional files, please add a README explaining how they relate. The grading script will run coqc over them as well.
Please upload to Blackboard one or both zip files that you created above.