Place these files in a lab8 folder. Before leaving lab, zip up the lab8 folder and hand the zip file in.
This lab uses facilities provided by RubyLabs. If you were not present for Lab 5 (with the time operation) or Lab 6 (with the Canvas), you will need to setup RubyLabs before beginning this lab.
Support Code
Store the following functions in huffman.rb:
def make_pq(table) pq = PriorityQueue.new for item in table do char = item.first freq = item.last node = Node.new(char, freq) pq << node end return pq end def build_tree(pq) while pq.length > 1 node1 = pq.shift node2 = pq.shift pq << Node.combine(node1, node2) end return pq.first end
Be sure you understand what each instruction in each function is doing and how each function works overall. In the file exercise1.txt, write a paragraph for each function, explaining how each function works overall. Your explanations should not just repeat the instructions above; instead, you should show deeper understanding and explain what is going on conceptually.
The Hawaiian Alphabet
NOTE: Be sure to include BitLab and load "huffman.rb" in irb before you do these exercises.In irb, define an array named table that contains two-element arrays, where each two-element array consists of a character followed by its frequency as a floating point number. Use the following data for your table, based on the Hawaaian alphabet and the relative frequency of each symbol in the set of all Hawaiian words:
' 0.068 A 0.262 E 0.072 H 0.045 I 0.084 K 0.106 L 0.044 M 0.032 N 0.083 O 0.106 P 0.030 U 0.059 W 0.009Your table definition will begin like this:
table = [ ["'", 0.068], ["A", 0.262], ...
In irb, use the make_pq function to create a priority queue consisting of all the nodes required for the Huffman tree for this alphabet. Verify that the nodes are increasing order in the priority queue based on frequency. You should see this if you display your priority queue:
[( W: 0.009 ), ( P: 0.030 ), ( M: 0.032 ), ( L: 0.044 ), ( H: 0.045 ), ( U: 0.059 ), ( ': 0.068 ), ( E: 0.072 ), ( N: 0.083 ), ( I: 0.084 ), ( K: 0.106 ), ( O: 0.106 ), ( A: 0.262 )]In irb, build the Huffman tree from the nodes in the priority queue by using the build_tree function in irb. Verify that you get the following tree:
( 1.000 ( 0.428 ( 0.195 ( 0.089 ( L: 0.044 ) ( H: 0.045 ) ) ( K: 0.106 ) ) ( 0.233 ( O: 0.106 ) ( 0.127 ( U: 0.059 ) ( ': 0.068 ) ) ) ) ( 0.572 ( A: 0.262 ) ( 0.310 ( 0.143 ( 0.071 ( M: 0.032 ) ( 0.039 ( W: 0.009 ) ( P: 0.030 ) ) ) ( E: 0.072 ) ) ( 0.167 ( N: 0.083 ) ( I: 0.084 ) ) ) ) )
In irb, show the Huffman code for "M" and "K".
Decode the following mystery word encoded using your Huffman tree:
0001110110110011(HINT: It might not be a Hawaiian word.)
In irb, encode the mystery word using the encode function from BitLab to see that you get the binary sequence above. Use the decode function to verify that you get the mystery word.
In the file exercise2.txt cut and paste your irb session that shows the results of these activities above.
Another Alphabet
In irb, define an array named table that contains two-element arrays, where each two-element array consists of a character followed by its frequency as a floating point number. Use the following data for your table, based on the alphabet and the relative frequency of each symbol shown below:
A 0.111 E 0.171 G 0.027 H 0.082 I 0.094 L 0.054 M 0.032 O 0.101 R 0.082 S 0.085 T 0.123 U 0.038
In irb, use the make_pq function to create a priority queue consisting of all the nodes required for the Huffman tree for this alphabet. Verify that the nodes are increasing order in the priority queue based on frequency.
In irb, build the Huffman tree from the nodes in the priority queue by using the build_tree function in irb.
Based on the resulting tree you get in irb, draw a picture on paper of the Huffman tree. HINT: You can also use irb to determine the Huffman code for each letter (as you did for letters M and K in the previous exercise) so it is easier to interpret the Huffman tree output. You will need to hand in this paper once you are done with the lab; be sure your name and andrew ID are on the paper.
Decode the following mystery word encoded using your Huffman tree:
0110001101000101100001100101110101
In irb, encode the mystery word using the encode function from BitLab to see that you get the binary sequence above. Use the decode function to verify that you get the mystery word.
In the file exercise3.txt cut and paste your irb session that shows the results of these activities above.
[EXPLORE MORE] Find a frequency table for letters in English words (using all letters A through Z) and build a Huffman Tree in Ruby using this table.