Assignment 4: Compression

Due: Fri, 10 Mar 2017 23:59:00 -0500

Goal

The goal of this assignment is to verify something larger and more ambitious than what we’ve done before. Specifically, you will write a utility for losslessly compressing and decompressing files. Your program should run from the command line as:

./compression 1 SourceFile DestFile

to compress SourceFile into DestFile. Given a 0 instead of a 1, it should decompress SourceFile into DestFile. In your file, you should prove that calling decompress on compress is the identity function. Note, however, that you will not receive credit for implementing compress and decompress as the identity function! Be sure to document any references you use for your compression algorithms.

We have included three example files for you to run your compression and decompression routines on. We will test your solution on similar but not identical files. If your verified compression routine produces a smaller file for at least one of the three, you will receive full credit. If you compress all three, you will receive extra credit. The verified implementation that compresses the most will receive additional extra credit (in the event of a tie, the solution that runs the fastest will win). You can also get partial credit if you at least write a functional specification for compress and decompress and prove that they are inverses, even if you don’t manage to implement them successfully. You are encouraged to comment your code, so we can understand your design decisions.

Please also put the answer to the questions below in a file named answers.txt.

Question

In this assignment, your compression algorithm is lossless, which lends itself to a succinct specification. How would you succinctly specify the correctness of a lossy compression algorithm like JPG? This question is a bit speculative, so there isn’t necessarily a “correct” answer.

Choose Your Own Adventure

For this assignment, you get to choose your own verification tool: Dafny or Coq. To make the decision, think about the algorithms you want to implement, the proofs you’ll need to write about them, and your level of comfort using the two tools. Fair warning: When we used this assignment in a previous iteration of the class, the Dafny version proved very challenging, so please start early. This is the first iteration with Coq, so the difficulty of that choice is unknown (could be easier, or could be much harder). Your ambition for your compression algorithm will also affect the assignment’s difficulty.

Question

Why did you choose the tool you did?

As you work through the assignment, please keep track of the experience, so you can answer the following questions.

Question

How long did you spend on the assignment?

Question

What were some of the biggest challenges you struggled with?

Question

What did your tool of choice do well? What did it do poorly?

Challenge

For the truly ambitious, do the assignment in both Coq and Dafny, ideally using the same algorithm in both (only change algorithms if your original choice appears to be infeasible in the other tool).
Compare and contrast your experience using the two different tools.

Completing this challenge successfully will give you up to 20 points out of the 10 points this homework is worth.

Infrastructure

For this assignment, you’ll need some additional files. You’ll find the sample input files in the sample-input directory.

Dafny

The provided Dafny files should look reminiscent of the earlier Dafny assignment, so you may want to consult that documentation as well.

For this assignment, there’s a skeleton “solution” already present in compression.dfy. It should compile and run successfully if you execute:

dafny /noCheating:1 /timeLimit:120 compression.dfy IoNative.cs

Your task is to flesh out the compress and decompress functional specifications and accompanying methods and proof (lossless) to apply to a non-trivial implementation. Be sure to write as strong a specification as you can for Main (i.e., formalize the loose specification above), similar to the file copy assignment.

Submission Preparation

Create a zip file dafny.zip that contains all of the files necessary to verify and compile your program.

The grading script will expect the following commands to execute correctly, given your zip file:

unzip dafny.zip
dafny /noCheating:1 /timeLimit:120 compression.dfy IoNative.cs
./compression.exe 1 source.txt dst.txt
./compression.exe 0 dst.txt new_source.txt
diff source.txt new_source.txt
... other tests here ...

If you decide to include additional files, please add a README explaining how they relate. The grading script will run Dafny over them as well, with the same parameters shown above.

Coq

The provided Coq file Compression.v follows a structure similar to the Dafny file. It currently implements compression and decompression as a trivial identity function, making it easy to prove that decompress is the inverse of compress. The file relies on the Coq.io library, which you will need to install via OPAM (the package manager for OCaml). On Mac OS X, you can do this via:

$ brew install opam
$ opam init
$ opam repo add coq-released https://coq.inria.fr/opam/released
$ opam install -j4 -v coq-io-system
$

You may need to run eval \opam config env` in terminals where you did not run opam init`.

Once you’ve installed Coq.io, you should be able to verify the current Compression.v. Do this to extract the code in Compression.v, and then to build an executable, run:

$ ocamlbuild main.native -use-ocamlfind -package io-system
$

This should give you an executable named main.native that you can run.

Your main goal should be to implement an effective compression routine and prove the lossless theorem. Proving additional properties about your program’s actual behavior using the effects system provided by Coq.io would be a bonus.

You will likely want to consult the documentation for ListString, which provides things like LString.t.

As another useful hint, remember that by default Coq understands = to mean Propositional equality, which isn’t useful if you want to change your program’s run-time behavior based on an equality test. For that, you’ll want to try things like LString.eqb.

Submission Preparation

Create a zip file coq.zip that contains all of the files necessary to verify and compile your program.

The grading script will expect the following commands to execute correctly, given your zip file:

unzip coq.zip
coqc Compression.v
ocamlbuild main.native -use-ocamlfind -package io-system
./main.native 1 source.txt dst.txt
./main.native 0 dst.txt new_source.txt
diff source.txt new_source.txt
... other tests here ...

If you decide to include additional files, please add a README explaining how they relate. The grading script will run coqc over them as well.

Final Submission

Please upload to Blackboard one or both zip files that you created above.