Analysis of Algorithms: Project 4

Deadline: April 26 (Wednesday)

The goal of this project is to use Huffman codes for compressing images. Your task is to implement a function that constructs a table of Huffman codewords for an image file, that is, assigns a codeword to each pixel value in the file.

Download huffcode.h, huffcode.c, and mytable.c, as well as good old imageio.h and  imageio.c. The huffcode.c file contains a procedure for compressing images, which calls your function for constructing a codeword table. The mytable.c file includes the declaration of your function, called form_table; you need to write its body. The function inputs a table of pixel values and their frequencies, and it should compute a Huffman codeword for each pixel value.

The table of codewords is encoded as an array of HUFFTABLE structures, defined in huffcode.h; each element of this array represents a codeword for a specific pixel value. The HUFFTABLE structure includes a pixel value (called value), the frequency of its occurrence in the image file (count), the corresponding Huffman codeword (bitstring), and the number of bits in the codeword (bits). It also includes a maxbits value, which you may optionally use to store the length of the bitstring array, or any other information.

The compression procedure in huffcode.c determines the frequency of each pixel value, and then calls your function, which should assign appropriate bitstring and bits values. Note that bitstring is an array, and your function needs to allocate memory for this array. You may store any values in maxbits, as the compression procedure does not access maxbits.

As usual, you should not modify any files except mytable.c, and you should submit only mytable.c to heath@suntan.eng.usf.edu.

After implementing form_table, compile the compression procedure:

gcc -o huffcode huffcode.c mytable.c imageio.c -lm

Note that you need the -lm option with gcc, which indicates the use of the math library.

The syntax for compressing a file is as follows:

huffcode [-pgm | -data] file.pgm

The procedure writes the compressed data into a new file, with the .hf extension. The -pgm option means that the input file is a PGM image, and huffcode can use properties of images to improve compression. On the other hand, the -data option notifies the procedure that the file may not be in the PGM format. You may use this option to compress any files, not only images.

To decompress a file, use the following syntax:

huffcode -uc file.hf

The procedure then produces a new file with the .hfu extension.

If you enter the huffcode command without any option, it will check whether the input is a compressed file. If the file appears compressed, then the procedure will decompress it. Otherwise, huffcode will compress the file, using the -data mode.

Download the following six images and use them to test the Huffman compression:
 building.pgm (381039 bytes) Engineering Building (ENB)
 windows.pgm  (381039 bytes) Windows of ENB
 plants.pgm (381039 bytes) Small plants near ENB
 fountain.pgm (381039 bytes) Fountain on campus
 spot.pgm (261136 bytes) Dark spot (synthetic image)
 noise.pgm (261136 bytes) Random noise (synthetic)
For each file, first apply the Huffman procedure with the -pgm option, and then with the -data option; in addition, apply the gzip compression and compare its results with huffcode. The gzip procedure is based on a different compression technique, called the Lempel-Ziv algorithm; its syntax is as follows:

gzip file.pgm

Determine the sizes of the compressed images for each of the three compression techniques: huffcode -pgm, huffcode -data, and gzip. Enclose a table of the resulting sizes to your report, and answer the following questions:

 Back to the Projects page