15-121 SPRING 2010 [CORTINA]

HOMEWORK 10 - due Tuesday, April 23

WRITTEN PROBLEMS (10 pts)

  1. (2 pts) A Date is represented as a string in the format MM/DD/YYYY:

    public class Date implements Comparable {
            String dateString;
    
            public int compareTo(Date other) {
                    return this.dateString.compareTo(other.dateString);
            }
    
            // Other methods not shown
    }
    

    1. How are Date objects naturally ordered using the compareTo method? Give an example.

    2. Show how to define a DateComparator that has a compare method that orders two Date objects chronologically. That is, the method returns a negative integer if the first date is earlier than the second date, a positive integer if the first date is later than the second date, and 0 if the two dates are the same.

    3. Let d1 and d2 represent two Date objects that are already initialized. In Java, show how to print out the earlier of the two dates. (If the dates are the same, print out either one.)

    4. Let dateList be a LinkedList of Date. Assume this list is initialized with a collection of dates. Using Java, show how to sort this list of dates. (HINT: You can do this in one line.)

  2. (2 pts) Let showMap be a map from a television show to a television network. Let networkMap be a map from a television network to a set of television shows on that network. You may assume that a television show is shown on only one television network.

    1. Using Java, show how to output all of the television shows on the ABC television network using showMap. You should do this by iterating over all of the shows in the map, and seeing if each show maps to ABC.

    2. Using Java, show how to output all of the television shows on the FOX television network in alphabetical order using networkMap.

  3. (2 pts) Write a method that accepts an ArrayList and returns the number of unique elements in the list in O(n) expected time where n is the size of the list. HINT: Use one of the set or map data structures provided in the Java API.

    public static int countUnique(ArrayList<E> list) {
    
    }
    

  4. (2 pts) A hash table stores Date objects (see problem 1) in an array of length 10 using the following hashing function:
    (month + day) % 10

    The following dates are added to an initially empty hash table in the order shown:

    1. Show the resulting hash table if linear probing is used to resolve collisions.

    2. Show the resulting hash table if chaining (buckets) is used to resolve collisions.

  5. (2 pts) Consider a hash function H(k) that returns an index into a hash table given a key k. You have a 50-cell hash table indexed from 0 to 49. The keys are positive integers.

    1. Alice proposes the following hash function: H(k) = (int)(Math.random()*50). Explain why this hash function is a poor choice.

    2. Bob proposes the following hash function: H(k) = the sum of the digits in key k. Explain why this hash function is a poor choice.

PROGRAMMING PROJECT

Since the lyrics of most pop music contains words that repeat a number of times, a simple way to compress a lyric file is to create a map that stores each word once along with the positions of each word in the file.

For example, suppose the lyric consists of the lines:

What have I
 1     2  3                       <-- word position in lyrics
What have I
 4     5  6                       <-- word position in lyrics
What have I done to deserve this
 7     8  9  10  11   12     13   <-- word position in lyrics

We would form a map that maps each unique word to a list of word positions in the lyric. NOTE: The word position for a word at the end of a line is stored as a negative integer rather than a positive integer so you can recreate the lyrics later when you iterate through the words in the map.

Sample map for the lyric above (order of words may vary):

Word       Word Position(s)
===========================
WHAT       1, 4, 7
HAVE       2, 5, 8
I          -3, -6, 9
DONE       10 
TO         11
DESERVE    12
THIS       -13

Assignment

Download the following project file to start this project: Homework10.zip

This project currently contains only one class that models the analyzer using a HashMap that maps a lyric word to a list of integers representing the locations of the word in the lyrics.

  1. Complete the class for the lyric analyzer by adding the following methods listed below.

    1. Write an add method with the following signature:

      public void add(HashMap<String,ArrayList<Integer>> map, String lyricWord, int wordPosition)

      This method will determine if the given lyric word is in the map. If the word is not in the map, then a mapping is added that maps that word to a list containing the position of the word in the lyric. If the word is already in the map, then its word position is added to the list of word positions for this word. Do not create a new mapping if the lyric word is already in the map. Use the one that is already there and just update its list. Remember to negate the word position if the word is at the end of a line in the lyrics.

    2. Write a displayWords method with the following signature:

      public void displayWords(HashMap<String,ArrayList<Integer>> map)

      This method should display the words of the song along with the word positions of the word, one word per line, in alphabetical order. You should do this without creating another map. Instead, get a set of all the words stored in the map. Sort this set using one of the sort methods from the Java API. Then iterate over this sorted set and print out each word along with the word positions associated with each word. You may leave the negative integers in the word position list. (see sample output below) Iterate through the array of words using the enhanced for loop.

    3. Write a displayLyrics method with the following signature:

      public void displayLyrics(HashMap<String,ArrayList<Integer>> map)

      This method will display the lyrics for the song (in uppercase) stored in the map. Start with an empty array of strings whose size is the number of words in the lyric plus 1 (don't use cell 0). Initialize this array with empty strings (not null). Then, get a set of all of the words stored in the map. For each word, store the word in the array cells corresponding to its word position(s). If a word position is negative, add on an extra newline character to the word when you store the word in the array. Once you finish processing all words that are in the map, iterate through the array and print out each string, and you should see the lyrics appear, line by line. Note that if the original lyrics had blank lines, you won't have these in the reconstructed lyrics. Iterate through the array of words using the enhanced for loop.

    4. Write a method with the following signature:

      public int count(HashMap<String,ArrayList<Integer>> map)

      This method will return the total number of unique words in the lyric by analyzing the map.

    5. Write a method with the following signature:

      public String mostFrequentWord(HashMap<String,ArrayList<Integer>> map)

      This method will return the word that occurs most frequently in the lyric. Do this by getting a set of all the words in the map and then for each word, find the one that has the largest set of word positions. Use the enhanced for loop in your solution. Do not create any additional data structures to solve this problem. If there is a tie for the most frequent word, print out the most frequent word that comes first alphabetically.

  2. Complete the main method that asks the user for the name of a text file to open. The text file contains the lyrics for a pop song. You may assume that the files have all punctuation filtered out (except for apostrophes), but there may be uppercase and lowercase letters mixed together. For example, the string "Don't" is considered one word. Some lines of the file may also be empty. If the file is not found, exit the program with a warning message.

    Once you open the text file, read each word from the file one at a time, convert the word to uppercase, and insert the word into the map along with its word position using your add method.

    After you add all words to the map, display the word/position list using your displayWords method.

    Next, reconstruct and display the lyrics using your displayLyrics method.

    Then, display the total number of unique words in the lyrics by using your count method.

    Finally, output the word that occurs most frequently using your mostFrequentWord method.

    YOU SHOULD NOT PRECOMPUTE THE RESULTS OF ANY METHOD (e.g. count) AS YOU ARE READING THE INPUT DATA FILE. WHEN YOU READ THE INPUT DATA FILE, YOU SHOULD JUST BUILD THE MAP.

Testing/Sample Output

Here are three text files that you can try out.

minimal.txt
realgone.txt
anotherbrick.txt

The sample output for the simple lyric given above would look like this:

Input name of lyrics file: simplesong.txt

DESERVE [12]
DONE [10] 
HAVE [2, 5, 8]
I [-3, -6, 9]
THIS [-13]
TO [11]
WHAT [1, 4, 7]

WHAT HAVE I
WHAT HAVE I
WHAT HAVE I DONE TO DESERVE THIS

The number of unique words in the lyric is: 7

Most frequent word: HAVE

Your output should match the formatting of this sample output as closely as possible. When we grade your work, part of the grading procedure will involve running your program on the files above (and possibly other files) and electronically comparing your output to our expected output. Use the textual prompts and messages as shown for maximum credit. Your solutions should work in general, not just for the specific three files given above.

Using the API Effectively

You will be graded not only on the correctness of your final output but also on the process you use to solve this problem. Study the HashMap, HashSet, TreeMap and TreeSet classes in the Java API to determine which methods you'll use. If your solution is unnecessarily convoluted or complex, or if it uses unnecessary data structures, you will not receive full credit for the program. Plan out your solution first before you start writing code.

Documentation & Style

Write Javadoc comments in your lyric analyzer class to document all of the public features. Add additional comments in your methods to explain your code where necessary. Use appropriate variable names for self-documentation and indent properly.

Hand-In Instructions & Late Submissions

Your submitted zip file should include the complete Java program.

See the course website for instructions on how to hand in your program and the policy for late submissions.