[SOLUTIONS] Homework 1 - Due: Tuesday Feb. 6, 1:30pm, in class.
SELECT salary, teamID, year FROM Salaries WHERE salary = (SELECT MAX(salary) FROM Salaries);
salary | teamID | year |
26000000 | NYA | 2005 |
SELECT p.first_name, p.last_name FROM PLAYERS p, SALARIES s WHERE s.teamID = 'PIT' and s.year = 1985 and p.playerID = s.playerID and p.rightHanded = 'L';
first_name | last_name |
Andy | Hassler |
Larry | McWilliams |
Joe | Orsulak |
Rod | Scurry |
Jason | Thompson |
SELECT DISTINCT stadium FROM Teams WHERE teamName = 'New York Yankees';
stadium |
Polo Grounds IV |
Shea Stadium |
Yankee Stadium I |
Yankee Stadium II |
z_value = zorder(x, y, n){
    x_bits = make_binary(x)
    y_bits = make_binary(y)
    z_bits = ()
    if (length(x_bits) > n or length(y_bits) > n)
        exit("input out of range")
    # x_bits and y_bits are assumed to be left-padded with zeros to exactly n bits
    for bit_counter in 1:n {
        z_bits = cat(z_bits, x_bits[bit_counter])
        z_bits = cat(z_bits, y_bits[bit_counter])
    }
    return(make_decimal(z_bits))
}
Explanation:
Make sure to check that the input is within the range set by the -n parameter. That is, x and y must be < 2^n in order to fit in the 2^n x 2^n grid. If so, write each coordinate as an n-bit binary number (padded with leading zeros), then make a new binary number by concatenating the first (leftmost) bit of the x-coordinate, then the first bit of the y-coordinate, then the second bit of the x-coordinate, and so on. Translate this number to base 10 and output.
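The interleaving described above can be sketched in Python. This is only an illustration, not the graded solution: the function name mirrors the pseudocode, and using `format` to get zero-padded n-bit strings is an assumption about how `make_binary` behaves.

```python
def zorder(x: int, y: int, n: int) -> int:
    """Interleave the n-bit binary forms of x and y, x bit first."""
    if not (0 <= x < 2 ** n and 0 <= y < 2 ** n):
        raise ValueError(f"input is out of range (must be < 2^{n})")
    # Assumption: coordinates are left-padded with zeros to exactly n bits.
    x_bits = format(x, f"0{n}b")
    y_bits = format(y, f"0{n}b")
    # Take bits pairwise from the left: x1 y1 x2 y2 ...
    z_bits = "".join(xb + yb for xb, yb in zip(x_bits, y_bits))
    return int(z_bits, 2)


print(zorder(0, 0, 2))  # 0
print(zorder(0, 1, 3))  # 1
```

Calling `zorder(10, 10, 3)` raises `ValueError`, because 10 does not fit in a 2^3 x 2^3 grid.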
(x_value, y_value) = izorder(z, n){
    x_bits = ()
    y_bits = ()
    z_bits = make_binary(z)
    # z_bits is assumed to be left-padded with zeros to exactly 2n bits
    z_bit_counter = 1
    for xy_bit_counter in 1:n {
        x_bits = cat(x_bits, z_bits[z_bit_counter])
        z_bit_counter++
        y_bits = cat(y_bits, z_bits[z_bit_counter])
        z_bit_counter++
    }
    return(make_decimal(x_bits), make_decimal(y_bits))
}
Explanation:
Reverse the process above. Translate the input to binary, then put the first bit as the first bit of the x-coordinate, the second bit as the first bit of the y-coordinate, the third bit as the second bit of the x-coordinate, and so on. Translate the final x and y-coordinates to decimal and output.
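The reverse mapping can be sketched the same way. Again this is a hedged illustration: the range check on z and the zero-padding to 2n bits are assumptions that the pseudocode leaves implicit.

```python
from typing import Tuple


def izorder(z: int, n: int) -> Tuple[int, int]:
    """Split a 2n-bit z-value back into its (x, y) coordinates."""
    if not 0 <= z < 4 ** n:
        raise ValueError(f"input is out of range (must be < 2^{2 * n})")
    # Assumption: z is left-padded with zeros to exactly 2n bits.
    z_bits = format(z, f"0{2 * n}b")
    x_bits = z_bits[0::2]  # 1st, 3rd, 5th, ... bits belong to x
    y_bits = z_bits[1::2]  # 2nd, 4th, 6th, ... bits belong to y
    return int(x_bits, 2), int(y_bits, 2)


print(izorder(0, 5))   # (0, 0)
print(izorder(15, 2))  # (3, 3)
```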
command | arguments | output |
zorder | -n 2 0 0 | 0 |
zorder | -n 3 0 1 | 1 |
izorder | -n 5 0 | (0,0) |
izorder | -n 2 15 | (3,3) |
zorder | -n 3 10 10 | Error: input is out of range (10 > 2^3) |
zorder | -n 3 10 11 | Error: input is out of range (10 > 2^3) |
(closest_pair_node_1, closest_pair_node_2, distance) = naive_closest_pair(root_node){
    closest_pair_node_1 = -1
    closest_pair_node_2 = -1
    current_min_distance = 999999999
    for each leaf_node in get_leaves(root_node){
        temp_closest_pair_node_1 = leaf_node
        temp_closest_pair_node_2 = get_nearest_non_self_neighbor(leaf_node)
        temp_min_distance = distance(temp_closest_pair_node_1, temp_closest_pair_node_2)
        if (temp_min_distance < current_min_distance) {
            current_min_distance = temp_min_distance
            closest_pair_node_1 = temp_closest_pair_node_1
            closest_pair_node_2 = temp_closest_pair_node_2
        }
    }
    return(closest_pair_node_1, closest_pair_node_2, current_min_distance)
}
Explanation:
Following the hint's advice, traverse the tree. For each leaf point, issue a 2-nearest-neighbor query (k = 2 so we don't find ourselves as our nearest neighbor). Calculate the distance between each leaf point and its nearest (non-self) neighbor. Print the smallest distance found and the IDs of these closest points.
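A minimal Python sketch of this naive approach follows. It does not use an R-tree: the linear scan in `nearest_non_self_neighbor` is a stand-in for the 2-nearest-neighbor query, and the point IDs and coordinates are made up for illustration.

```python
import math


def nearest_non_self_neighbor(points, query_id):
    """Linear-scan stand-in for the R-tree 2-nearest-neighbor query."""
    qx, qy = points[query_id]
    best_id, best_dist = None, math.inf
    for other_id, (ox, oy) in points.items():
        if other_id == query_id:
            continue  # never report a point as its own neighbor
        d = math.hypot(qx - ox, qy - oy)
        if d < best_dist:
            best_id, best_dist = other_id, d
    return best_id, best_dist


def naive_closest_pair(points):
    """Query every point for its nearest neighbor and keep the best pair."""
    best = (None, None, math.inf)
    for point_id in points:
        neighbor_id, d = nearest_non_self_neighbor(points, point_id)
        if d < best[2]:
            best = (point_id, neighbor_id, d)
    return best


# Hypothetical data: id -> (x, y)
points = {1: (0.0, 0.0), 2: (3.0, 4.0), 3: (3.0, 5.0), 4: (10.0, 10.0)}
print(naive_closest_pair(points))  # (2, 3, 1.0)
```

Because neighbors are excluded by ID rather than by coordinates, two distinct points at the same location are reported as a pair with distance 0.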
GLOBAL:
    closest_pair_node_1 = -1
    closest_pair_node_2 = -1
    current_min_distance = 999999999

# initial call: recursive_closest_pair(root_node, root_node)
recursive_closest_pair(root_node_1, root_node_2){
    for each node_1 in children(root_node_1){
        for each node_2 in children(root_node_2){
            if (is_leaf(node_1) AND is_leaf(node_2)){
                if (node_1 != node_2){    # do not pair a point with itself
                    temp_min_distance = distance(node_1, node_2)
                    if (temp_min_distance < current_min_distance) {
                        current_min_distance = temp_min_distance
                        closest_pair_node_1 = node_1
                        closest_pair_node_2 = node_2
                    }
                }
            } else if (min_min_dist(node_1, node_2) < current_min_distance){
                (closest_pair_node_1, closest_pair_node_2, current_min_distance) = recursive_closest_pair(node_1, node_2)
            }
            # otherwise prune: skip this pair of branches and continue with the next pair
        }
    }
    return(closest_pair_node_1, closest_pair_node_2, current_min_distance)
}
Explanation:
This algorithm is detailed nicely in these slides and this paper by Corral et al.
We again follow the hint's advice and traverse the tree, this time with two pointers, p1 and p2. At each level going down the tree, we calculate the minimum possible distance between any point in p1's branch and any point in p2's branch, using the MBRs that bound the two branches at that level. This distance is called the minmindistance. If the minmindistance is greater than or equal to our current closest pair's distance, we prune this pair of branches (that is, we stop our traversal there), since no pair of points inside them can beat our current best. If the minmindistance is less than our current best, we proceed as before since there is the possibility of finding closer pairs of points. We follow this procedure recursively until we reach the leaf level, at which point we calculate the distance between all points in p1's and p2's leaf MBRs.
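The pruning rule can be sketched in Python on a tiny hand-built tree. Everything here is an assumption made for illustration: the `Node` layout, the MBR encoding as `(xmin, ymin, xmax, ymax)`, and the sample points are not taken from the assignment's R-tree code.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    mbr: Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
    children: List["Node"] = field(default_factory=list)  # empty for leaves
    points: List[Tuple[int, float, float]] = field(default_factory=list)  # (id, x, y)


def min_min_dist(a, b):
    """Smallest possible distance between a point in MBR a and one in MBR b."""
    dx = max(0.0, a[0] - b[2], b[0] - a[2])  # horizontal gap, 0 if overlapping
    dy = max(0.0, a[1] - b[3], b[1] - a[3])  # vertical gap, 0 if overlapping
    return math.hypot(dx, dy)


def closest_pair(node_a, node_b, best=(None, None, math.inf)):
    # Prune: no pair inside these two branches can beat the current best.
    if min_min_dist(node_a.mbr, node_b.mbr) >= best[2]:
        return best
    if not node_a.children and not node_b.children:
        # Leaf level: compare all points, skipping a point paired with itself.
        for id_a, xa, ya in node_a.points:
            for id_b, xb, yb in node_b.points:
                if id_a == id_b:
                    continue
                d = math.hypot(xa - xb, ya - yb)
                if d < best[2]:
                    best = (id_a, id_b, d)
        return best
    # Descend into every pair of children (a leaf stands in for itself).
    for child_a in node_a.children or [node_a]:
        for child_b in node_b.children or [node_b]:
            best = closest_pair(child_a, child_b, best)
    return best


# Hypothetical two-leaf tree
leaf_1 = Node(mbr=(0, 0, 1, 1), points=[(1, 0.0, 0.0), (2, 1.0, 1.0)])
leaf_2 = Node(mbr=(5, 5, 5, 6), points=[(3, 5.0, 5.0), (4, 5.0, 6.0)])
root = Node(mbr=(0, 0, 5, 6), children=[leaf_1, leaf_2])
print(closest_pair(root, root))  # (3, 4, 1.0)
```

In this run the pair (leaf_1, leaf_2) is never expanded: its minmindistance of about 5.66 already exceeds the distance of about 1.41 found inside leaf_1.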
This algorithm can be improved by ordering our search so that we look at the most promising MBR pairs (those with the smallest minmindistance) first. This decreases our current minimum distance as quickly as possible, thus pruning as many future searches as possible.
As before, after we are done searching we print the smallest distance found and the IDs of these closest points.
The closest pairs found are the same for both algorithms (although the recursive solution will find them more quickly).
For the small script the closest pair was:
point_1 | point_2 | distance |
666 | 777 | 0 |
Some people were confused because points 666 and 777 had the same coordinates. But since they had different node_id values, they were a valid closest pair. The closest pair of non-identical points was:
point_1 | point_2 | distance |
555 | 666 | 1.4 |
point_1 | point_2 | distance |
866 | 869 | 50 |