15110 Fall 2011 [Cortina/von Ronne]

Programming Assignment 5 - due Tuesday, October 11

Overview

For this assignment, you will create a Ruby source file containing a ruby function(s) implementing each of the problems described below. If you find it useful for any of the problems in this assignment, you may define additional helper functions that your primary function for that problem calls to solve a sub-problem. Definitions for these helper functions should be included in the same Ruby source file as the primary function they help. You should store a source file for each problem in a folder named pa5 (typo corrected).

Warning: This assignment makes use of RubyLabs. RubyLabs must have been set up correctly for this to work.

Note: You are responsible for protecting your solutions to these problems from being seen by other students either physically (e.g., by looking over your shoulder) or electronically. (More information can be found in the instructions for Programming Assignment 2.)

Problems

[2 points] The sum of the first n positive integers (1 + 2 + ... + n) where n > 0 can be computed recursively. The sum of the first n positive integers is n + the sum of the first n-1 positive integers. For example, the sum of the integers from 1 to 5 is equivalent to 5 + the sum of the integers from 1 to 4. (What do you think the base case is for this recursive definition?)

Define a ruby function rec_sum(n) that computes and returns the sum of the first n positive integers using recursion instead of loops or iteration. Place rec_sum(n) in the file rec_sum.rb

Hint: this function should have the same structure as the recusive factorial function in the Unit 5A lecture notes. It will, however, have a different base case and will have a slightly different recursive step.
[2 points] The merge sort algorithm uses a merge helper function that takes two sorted lists as parameters and merges them together creating a new sorted list that holds all of the elements contained in either of the parameters. In class you saw an iterative implementation of the merge part of merge sort. An alternative recursive version of this procedure rec_merge(a, b, c) can be defined that merges the sorted lists a and b appending those elements in sorted order to the end of sorted list c and returning the resulting sorted list. (It is assumed generally that the smallest element of list a and the smallest element of list b are both greater than or equal to the largest element of list c.) This recursive merge can be accomplished using the following algorithm:
1. If a.length is 0, then concatenate b on the end of c and return the resulting list.
2. If b.length is 0, then concatenate a on the end of c and return the resulting list.
3. If the first element of a is less than or equal to the first element of b, then
  1. Append the first element of a onto c.
  2. Recursively call rec_merge with the following parameters: the list a without the first element, followed by the list b, followed by the list c. Return the list returned by this recursive call.
4. Otherwise (when the first element of b is less than the first element of a)
  1. Append the first element of b onto c.
  2. Recursively call rec_merge with the following parameters: the list a, followed by the list b without the first element, followed by the list c. Return the list returned by this recursive call.
Define a Ruby function rec_merge(a, b, c) (in rec_merge.rb) implementing this algorithm. To append an element on to the end of a list (array), use the << operator. To concatenate a list on to the end of another list, you can use the concat function. For example, array b can be concatenated onto the end of array a using a.concat(b).

Example usage in irb:
```
>> load("rec_merge.rb")
=> true
>> rec_merge([3,4,6], [1,2,5,7,8], []) 
=> [1, 2, 3, 4, 5, 6, 7, 8]
>> rec_merge([3,4,5], [2,5,6], [0,1]) 
=> [0, 1, 2, 3, 4, 5, 5, 6]
```
NOTE: To test your function, you can also use the merge sort algorithm from class. You need the following alternate version of msort:
```
def msort(list)
    return list if list.length == 1
    halfway = list.length/2
    list1 = list[0..halfway-1]
    list2 = list[halfway..list.length-1]
    newlist1 = msort(list1)
    newlist2 = msort(list2)
    newlist = rec_merge(newlist1, newlist2, [])   # new function called here
    return newlist
end
```
Arrays can be used as stacks in Ruby quite easily by adding and removing elements only from the end of the array. (The end of the array acts like the top of the stack.) Arrays have a push method that adds an element to the end of the array and a pop method that removes and returns the element at the end of the array. For example, if s is an array acting as a stack, then we can push and pop as follows:
```
s.push(5)
s.push(6)
s.push(7)
print s.pop     # prints 7
print s.pop     # prints 6
print s.pop     # prints 5
```
The file rpn.rb contains a Ruby function evaluate that should take an an arithmetic expression in Reverse Polish Notation (RPN) and returns its numeric value. The arithmetic expression is stored as an array of strings containing integers and operators. For example, the RPN expression 3 4 + would be stored in the array as ["3", "4", "+"]. Download this file and save it in your pa5 directory.
1. [1 point] Complete the missing code for the RPN evaluation algorithm as discussed in class. Look at the comments in the function as a guide. Test your solution using irb by loading the function and then calling the function on RPN expressions where you know the answer.
  
  Example usage in irb:
```
>> load("rpn.rb")
=> true
>> evaluate(["3","4","+"])
=> 7
>> evaluate(["23","3","-","4","6","+","/"])
=> 2
```
2. [1 point] Ruby functions can be written to call this evaluator to calculate the value of expressions in prefix notation. For example, the arithmetic expression 3 + 4 can be calucated by the Ruby function:
```
def compute_rpn0()
    expr = ["3","4","+"]
    return evaluate(expr)
end
```
  Define a ruby function compute_rpn1() that uses the RPN evaluate method to compute and return the value of:
  5 + 6 * 7
  You will need to manually convert this expression to RPN first.
  
  Define a ruby function compute_rpn2() that uses the RPN evaluate method to compute and return the value of:
  (9 + 7) / (2 * (5 - 3))
  Again you will need to manually convert this expression to RPN first.
One common use of the linear search, binary search, and hash table lookup algorithms that we have introduced is to implement a dictionary where information is associated with key. Supporting this requires a slightly more complicated data structure and some small changes to the algorithm.

If one wanted to be able to look up how many times different words occured in some text, one would need a data structure that included the words and their number of occurrences rather than just a list of words. A simple way to organize this would be to use an array of "entries" where each entry is a two element array where the first element is the key (i.e., the word) and the second element is the value (i.e., the number of times that word occurs). An example of a ruby declaration of data organized in this way is:

wc_list = [["endoesophagitis", 27], ["repanel", 71], ["proverbial", 73], ["carroter", 74], ["saccharomycetic", 17], ["anarchoindividualist", 54], ["arboricultural", 55], ["inflaming", 14], ["triconodont", 47], ["haglike", 87]]
1. [2 points] Define a Ruby function wc_llookup(wc_list, word) (in llookup.rb), which performs a linear search over the entries in wc_list for word. The list wc_list, which is a word count list organized as described above. You should use the following algorithm, which requires wc_list and word:
  1. For each integer i from 0 to the (length of wc_list) - 1, do the following:
    1. Set entry to the wc_list[i].
    2. Set key to the first element of entry.
    3. Set value to the second element of entry.
    4. If key is equal to word, then return value
  2. Return nil (if the loop completes without returning a value).
  Usage example in irb:
```
>> load "llookup.rb"
=> true
>> wc_list = [["endoesophagitis", 27], ["repanel", 71], 
["proverbial", 73], ["carroter", 74], ["saccharomycetic", 17], 
["anarchoindividualist", 54], ["arboricultural", 55], 
["inflaming", 14], ["triconodont", 47], ["haglike", 87]] 
=>  [["endoesophagitis", 27], ["repanel", 71],
["proverbial", 73], ["carroter", 74], ["saccharomycetic", 17],
["anarchoindividualist", 54], ["arboricultural", 55],
["inflaming", 14], ["triconodont", 47], ["haglike", 87]] 
>> wc_llookup(wc_list, "haglike") 
=> 87
>> wc_llookup(wc_list, "carnegie")
=> nil
```
2. [2 points] The file word_count.rb includes a hash table implementation that also works with words as keys and numbers as values. It, however, requires RubyLabs (which you need to have set up) and your wc_llookup method to work. Download word_count.rb into your pa5 folder.
  
  Now, we'd like you to perform some experiments to compare linear search and hash table lookup, and then, record your observations in lookup.txt. In your text file, include the text of each bolded question below on its own line with your answers on separate lines in between the questions.
  If you haven't done so already, enter the definition for wc_list into irb and load your llookup.rb file into irb.
  
  Using the definition of wc_list that you already entered using irb, and the make_wc_hash_table method in word_count.rb create a corresponding hash table with 7 buckets using irb:
```
wc_htable = make_wc_hash_table(wc_list, 7)
```
  Print out wc_htable (typo corrected). (Hint: use p wc_htable instead of print wc_htable to better see wc_htable's structure.)
  
  On which keys does the hash function produce collisions? How can you find this out by looking at the hash table?
  
  Next, in irb use the make_random_wc_list and make_wc_hash_table functions in word_count.rb to create a word count list containing 10,000 entries and a hash table with 20,000 buckets containing the same 10,000 entries:
```
big_wc_list = make_random_wc_list(10000)
ht = make_wc_hash_table(big_wc_list, 20000)
```
  About how long did this take?
  
  Now choose the first, last, and middle words in the list using the following commands in irb:
```
best_word = big_wc_list[0].first
worst_word = big_wc_list[big_wc_list.length-1].first
middle_word = big_wc_list[big_wc_list.length/2].first
```
  What are these words?
  
  Finally, using irb, time how long it takes to lookup up each of these words using each of the two methods (typos corrected):
```
time { wc_llookup(big_wc_list, best_word) }
time { wc_llookup(big_wc_list, worst_word) }
time { wc_llookup(big_wc_list, middle_word) }
time { wc_htlookup(ht, best_word) }
time { wc_htlookup(ht, worst_word) }
time { wc_htlookup(ht, middle_word) }
```
  Record the time it took for each of these lookup operations to finish. Which method is fastest in the best case? Which method is fastest in the worst case? Which method is most consistent?
Submission

You should now have a pa5 directory that contains the required files, rec_sum.rb, rec_merge, rpn.rb, llookup.rb, word_count.rb, and lookup.txt, each—in turn—containing the corresponding function(s) or answers. Zip up your directory and upload it using the handin system. (The handin system will accept submissions beginning on Friday until the deadline Tuesday night.)

15110 Fall 2011 [Cortina/von Ronne]

Programming Assignment 5 - due Tuesday, October 11

Overview

Problems

Submission