15110 SUMMER SESSION ONE - 2013

Lab 8 - Tuesday, June 11

Deliverables

  1. exercise1.txt
  2. huffman.rb
  3. exercise2.txt
  4. exercise3.txt
  5. Drawing of Huffman tree in exercise 3 on paper (hand in to CA)

Place these files in a lab8 folder. Before leaving lab, zip up the lab8 folder and hand the zip file in.

RubyLabs Information

This lab uses facilities provided by RubyLabs. If you were not present for Lab 5 (with the time operation) or Lab 6 (with the Canvas), you will need to setup RubyLabs before beginning this lab.

CA-Led Activities/Demonstration

Self-Directed Activities

  1. Support Code

    Store the following functions in huffman.rb:

    def make_pq(table)
      pq = PriorityQueue.new
      for item in table do
        char = item.first
        freq = item.last
        node = Node.new(char, freq)
        pq << node
      end
      return pq
    end
    
    def build_tree(pq)
      while pq.length > 1
        node1 = pq.shift
        node2 = pq.shift
        pq << Node.combine(node1, node2)
      end
      return pq.first
    end
    

    Be sure you understand what each instruction in each function is doing and how each function works overall. In the file exercise1.txt, write a paragraph for each function, explaining how each function works overall. Your explanations should not just repeat the instructions above; instead, you should show deeper understanding and explain what is going on conceptually.

  2. The Hawaiian Alphabet

    NOTE: Be sure to include BitLab and load "huffman.rb" in irb before you do these exercises.
    1. In irb, define an array named table that contains two-element arrays, where each two-element array consists of a character followed by its frequency as a floating point number. Use the following data for your table, based on the Hawaaian alphabet and the relative frequency of each symbol in the set of all Hawaiian words:

      '  0.068
      A  0.262
      E  0.072
      H  0.045
      I  0.084
      K  0.106
      L  0.044
      M  0.032
      N  0.083
      O  0.106
      P  0.030
      U  0.059
      W  0.009
      
      Your table definition will begin like this:

      table = [ ["'", 0.068], ["A", 0.262], ... 
      

    2. In irb, use the make_pq function to create a priority queue consisting of all the nodes required for the Huffman tree for this alphabet. Verify that the nodes are increasing order in the priority queue based on frequency. You should see this if you display your priority queue:

      [( W: 0.009 ), ( P: 0.030 ), ( M: 0.032 ), ( L: 0.044 ), 
      ( H: 0.045 ), ( U: 0.059 ), ( ': 0.068 ), ( E: 0.072 ), 
      ( N: 0.083 ), ( I: 0.084 ), ( K: 0.106 ), ( O: 0.106 ), 
      ( A: 0.262 )]
      
    3. In irb, build the Huffman tree from the nodes in the priority queue by using the build_tree function in irb. Verify that you get the following tree:

      ( 1.000 ( 0.428 ( 0.195 ( 0.089 
      ( L: 0.044 ) ( H: 0.045 ) ) ( K: 0.106 ) ) 
      ( 0.233 ( O: 0.106 ) ( 0.127 ( U: 0.059 ) 
      ( ': 0.068 ) ) ) ) ( 0.572 ( A: 0.262 ) 
      ( 0.310 ( 0.143 ( 0.071 ( M: 0.032 ) 
      ( 0.039 ( W: 0.009 ) ( P: 0.030 ) ) ) 
      ( E: 0.072 ) ) ( 0.167 ( N: 0.083 ) 
      ( I: 0.084 ) ) ) ) )
      

    4. In irb, show the Huffman code for "M" and "K".

    5. Decode the following mystery word encoded using your Huffman tree:

      0001110110110011
      
      (HINT: It might not be a Hawaiian word.)

      In irb, encode the mystery word using the encode function from BitLab to see that you get the binary sequence above. Use the decode function to verify that you get the mystery word.

      In the file exercise2.txt cut and paste your irb session that shows the results of these activities above.

  3. Another Alphabet

    1. In irb, define an array named table that contains two-element arrays, where each two-element array consists of a character followed by its frequency as a floating point number. Use the following data for your table, based on the alphabet and the relative frequency of each symbol shown below:

      A  0.111
      E  0.171
      G  0.027
      H  0.082
      I  0.094
      L  0.054
      M  0.032
      O  0.101
      R  0.082
      S  0.085
      T  0.123
      U  0.038
      

    2. In irb, use the make_pq function to create a priority queue consisting of all the nodes required for the Huffman tree for this alphabet. Verify that the nodes are increasing order in the priority queue based on frequency.

    3. In irb, build the Huffman tree from the nodes in the priority queue by using the build_tree function in irb.

    4. Based on the resulting tree you get in irb, draw a picture on paper of the Huffman tree. HINT: You can also use irb to determine the Huffman code for each letter (as you did for letters M and K in the previous exercise) so it is easier to interpret the Huffman tree output. You will need to hand in this paper once you are done with the lab; be sure your name and andrew ID are on the paper.

    5. Decode the following mystery word encoded using your Huffman tree:

      0110001101000101100001100101110101
      

      In irb, encode the mystery word using the encode function from BitLab to see that you get the binary sequence above. Use the decode function to verify that you get the mystery word.

      In the file exercise3.txt cut and paste your irb session that shows the results of these activities above.

  4. [EXPLORE MORE] Find a frequency table for letters in English words (using all letters A through Z) and build a Huffman Tree in Ruby using this table.