Due Monday, November 4 @ 8:00pm on Gradescope
In this assignment, you’ll be writing the code for a binary search tree. It will be based on the tree that we construct and deal with in class. (So, it is ok to use code from our class demo code in your assignment.)
Download a pre-prepared Eclipse Project hw4.zip
and import it into Eclipse using these instructions. You will make all of your changes inside the Java files in this project.
For this assignment you have 5 submissions. You should write your own testcases while you work so that you don’t waste submissions. After you have used your submissions, you may continue to submit, but your submission will be penalized 4 points for every extra submission. (So, your 6th submission receives a -4, your 7th submission a -8, etc.)
You have two main tasks.
Finish the KeyValueBinarySearchTree
implementation with all of the requested methods.
Write the AnagramTree
class that can be used to find anagrams from a dictionary.
KeyValueBinarySearchTree
As the first step in this project, you will be building a key-value binary search tree. This is like a standard binary search tree, except that the order of the items in the tree is determined by a key associated with each item, instead of being determined by the item itself. Each node in the tree will store both the data
for the node as well as the key
associated with that data.
You will need to write the following methods in KeyValueBinarySearchTree.java
.
public KeyValueBinarySearchTree()
The constructor.
public void add(KeyType key, DataType data)
Add an item to the tree. You should determine the appropriate node to put the item based on the key, and at the node store both the key and the data.
Parameters:
key
— The key used for searching/finding the appropriate location for the item.
data
— The data to be stored at the location.
Exceptions: IllegalStateException
— if attempting to add a key that is already in the tree.
public int size()
Determines and returns the size of the tree (total number of nodes).
public boolean isEmpty()
Determine whether or not the tree is empty.
public int height()
Finds and returns the length of the longest path (number of edges) from the root to a leaf. Recall that the height of a 1-node tree (just the root) is 0 (as is the height of an empty tree) and the height of a node is the length of the longest path from that node to a leaf.
public boolean isBalanced()
Returns true if this tree is balanced, meaning that the height of the left subtree and the height of the right subtree of each node (not just the root) differ by no more than 1; returns false otherwise. An empty tree is (trivially) balanced.
public DataType find(KeyType key)
Searches the tree for the node represented by key and returns the data stored at the node.
Parameters: key
— The key to search for in the tree
Returns: The data at the node represented by key, or null if the key is not found in the tree.
public KeyValueBinarySearchTree<KeyType, DataType> clone()
Returns a new BST that is a shallow clone of the original, i.e., contains nodes with the same keys and data in exactly the same locations. The original tree should not be modified.
public ArrayList<DataType> asList()
Returns an ArrayList containing all of the data elements of the tree, sorted according to their keys.
private class TreeNode
The TreeNode class is a private inner class used (only) by this class
public String toString()
returns a String that prints tree top to bottom, right to left in a 90-degree rotated level view. Do not remove these toString methods, the autograder will make use of them.
AnagramTree
Next, you are going to write an anagram finder, AnagramTree
. (If you don’t know what an anagram is, try Wikipedia on the subject.)
The AnagramTree
will make use of the KeyValueBinarySearchTree
class to store lists of anagrams. At a high level, it will read in a file of words (one per line) and store them in a binary tree using their sorted letters as the search key. What does this mean? When you read a word from the file (a String
), you must sort it (by creating another, sorted, String that has all the letters of the original word in sorted order) and then insert the original word into the tree using the sorted word as the key. The sorted word (a String) will be the search key for the binary search tree, and all the words that have the same sorted form (like “rats” and “tars” and “arts”) will all be stored in an ArrayList
at the node with the sorted word key (in this case, with the key “arst”).
The AnagramTree
method loadWords
takes a file name and maximum word size and builds a tree with all the words that are less than or equal to the length specified. To do this, you read words from a file one line at a time, and for each word, if it is the correct length then construct its sorted form and add the original word into your tree, using the sorted form as the key. (Remember: The key finds the node, and the node contains an ArrayList
of words that are anagrams of the sorted key.)
If you use the small file and a maximum word size of 7, you should have 16 words in 9 nodes in the tree. If you use the big file with a maximum word size of 7, you should have 51913 words in 41121 nodes. (Or, with a maximum word size of 8, 80314 words in 66538 nodes.)
We have provided some simple driver code in AnagramTester.java
that should use your AnagramTree
class. Your code must be compatible with this tester as provided.
You will need to write the following methods in AnagramTree.java
.
public AnagramTree()
The constructor.
public void loadWords(String filename, int maxLen)
Reads in words of length <= maxLen and stores them in ArrayLists in the tree, indexed by the sorted form of the word.
This method does not have an explicit efficiency requirement, but if you are too inefficient then the autograder will timeout.
Parameters:
filename
— The file to read words from.
maxLen
— The maximum length of a word.
public boolean isEmpty()
Returns whether or not the tree is empty (has no nodes)
public int size()
Determines and returns the size of the tree (number of nodes) storing the anagrams.
public int numWords()
Return the total number of words that have been added to the tree.
public ArrayList<String> findMatches(String word)
Searches the tree given a word and returns a list of all the words that are anagrams of it.
Parameters: word
— A word
Returns: An ArrayList containing all the words in the tree that are anagrams of word
.
There are multiple parts of the grading of this assignment:
For the first 90 points, your submission will be auto-graded on Gradescope.
For the next 5 points, your submission will be manually graded to check for good implementation methodologies. (Did you use a good approach to solving the problems?)
For the next 5 points, your submission will be manually graded to check for good testcases that you include in the main method. (Do you have 2-3 basic testcases for each method, and do they all execute automatically?)
Your code will also be checked for style. The parts of style that can be checked automatically (things like spacing, indentation, the use of CamelCase, etc.) are automatically checked by the autograder. Other parts of style, such as choosing good variable names, will be checked manually. Autograded style guide violations can result in, at most, -10 points. Manually checked style guide violations can result in, at most, -5 points.
You will submit your program to Gradescope. Log in to the system and you will see the homework. Once there, you need to submit a zip
file containing your code. Lucky for you, however, Eclipse can create this zip file for you. Check out these instructions for exporting. On Gradescope, you’ll submit that exported zip
file. On the page that follows your submission, you will see your live score (out of 90). If you receive a lower score than you expect, you can click on the score to see the feedback from the autograder.
In this homework, we are not providing any testcases.
For KeyValueBinarySearchTree
you need to provide at least two of your own testcases for each of the new methods. At least one of those should be non-basic. All of your testcases should be in the KeyValueBinarySearchTree
class included with the skeleton code.
For AnagramTree
you need to provide at least three of your own complete testcases that demonstrate overall functionality. (And, of course, we expect that you will also be testing manually using the AnagramTester
).
You must follow the model of our testcases from previous assignments (meaning you print when you start, print the results (pass/fail) when you finish, etc.) Additionally, you must comment each testcase with a note describing what it tests.
When grading, in addition to counting testcases we will also look at the quality of what you are testing.
You may only use the following Java libraries in this assignment:
java.util.ArrayList
java.util.Scanner
java.io.FileNotFoundException
java.io.FileReader
java.util.Arrays
If you have questions, please post to Discord. The course staff, including the instructor, monitor and answer questions there.
When posting to Discord, if you include any code make sure your question is posted privately to the course staff, and not to the entire class.
Do not change the names of any of the provided methods or files. You may, however, add additional methods as needed.
After you submit to Gradescope, make sure you check your score. If you aren’t sure how to do this, then ask.
There is no partial credit on Gradescope testcases. Your Gradescope score is your Gradescope score.
Read the last bullet point again. Seriously, we won’t go back later and increase your Gradescope score for any reason. Even if you worked really hard and it was only a minor error…
You can submit to Gradescope multiple times. So, after you submit you should check your score. If it isn’t a 90 you can keep working until it is.
Don’t forget to make sure your code conforms to the style guide. We’ll be taking a look at that!
The style guide has a bunch of small rules with regard to indentation and spacing that are hard to follow perfectly. Luckily, Eclipse can handle it for you! Go to Source -> Format
in Eclipse and watch all of your spacing and indentation problems get fixed automatically. (You should do this before every submission.)