Milestone Results

Group Members: Anvesh Komuravelli, Samir Sapra, and Jenn Tam

We examine all STAMP benchmarks using the Fair runtime from RSTM. The runtime description can be found here and the respective PPOPP, 2009 paper can be found here.

To learn more about the STAMP benchmarks, please read:
STAMP: Stanford Transactional Applications for Multi-Processing Chi Cao Minh, JaeWoong Chung, Christos Kozyrakis, Kunle Olukotun In IISWC '08: Proceedings of The IEEE International Symposium on Workload Characterization, Sept. 2008.

In what follows, we show the distributions of transaction lengths for the various STAMP benchmarks with different parameters. One significant parameter is the number of threads which are running concurrently, which we varied as 1, 2, 4 and 8 for all the benchmarks. To study the distribution of the transaction lengths, we make use of the following plots.

Boxplot - A boxplot is a convenient way of graphically depicting groups of numerical data through their five-number summaries - the smallest observation (sample minimum), lower quartile (Q1), median (Q2), upper quartile (Q3), and largest observation (sample maximum). A boxplot may also indicate which observations, if any, might be considered outliers.

Distribution plot - We zoomed into the transaction lengths boxplot omitting outlier transactions and plotted the histogram plot for those data.

Goodness of fit plot - We tried to fit the distribution we got in the above plot into known distribution curves which might help us design our own contention manager to perform the best for the STAMP benchmarks and get an insight into the best possible contention manager.

To view any graph in more detail please right click on the graph and click "view image".
Please click on a bechmark to jump to that section:

genome
kmeans
labyrinth
bayes
intruder
ssca2
vacation
yada

genome

This benchmark implements a gene sequencing program that reconstructs the gene sequence from segments of a larger gene. The algorithm used to sequence the gene has three phases:

Remove duplicate segments by using hash-set
Match segments using Rabin-Karp string search algorithm - Cycles are prevented by tracking starts/ends of matched chains
Build sequence

The first two steps make up the bulk of the execution time and are parallelized.

For the test runs, the following parameters were used, as suggested for this STAMP benchmark.

Gene length : 16384
Length of each segment : 64
Number of segments : 16777216


Boxplot of entire distribution for 1 thread	Histogram showing majority lengths of transactions for 1 thread (exact percentage in title of graph)	Bar graph of log-likelihood values for 1 thread


Boxplot of entire distribution for 2 threads	Histogram showing majority lengths of transactions for 2 threads (exact percentages in title of graph)	Bar graph of log-likelihood valules for 2 threads


Boxplot of entire distribution for 4 threads	Histogram showing majority lengths of transactions for 4 threads (exact percentages in title of graph)	Bar graph of log-likelihood valules for 4 threads


Boxplot of entire distribution for 8 threads	Histogram showing majority lengths of transactions for 8 threads (exact percentages in title of graph)	Bar graph of log-likelihood valules for 8 threads

top

kmeans

K-means is a partition-based method and is arguably the most commonly used clustering technique. K-means represents a cluster by the mean value of all objects contained in it. Given the user-provided parameter k, the initial k cluster centers are randomly selected from the database. Then, K-means assigns each object to its nearest cluster center based on the similarity function. For example, for spatial clustering, usually the Euclid distance is used to measure the closeness of two objects. Once the assignments are completed, new centers are found by finding the mean of all the objects in each cluster. This process is repeated until two consecutive iterations generate the same cluster assignment. The clusters produced by the K-means algorithm are sometimes called "hard" clusters, since any data object either is or is not a member of a particular cluster.

For the test runs, the following parameters were used, as suggested for this STAMP benchmark. Runs are observed for both low contention and high contention scenarios.

Number of clusters : 40 (low contention), 15 (high contention)
Threshold value : 0.00001
Input image file : picks 16 uniformly distributed random centers and then selects 16384 random points of 24 dimensions around those centers using a gaussian distribution.


Low contention - Boxplot of entire distribution for 1 thread	Low contention - Histogram showing majority lengths of transactions for 1 thread (exact percentage in title of graph)	Low contention - Bar graph of log-likelihood values for 1 thread


Low contention - Boxplot of entire distribution for 2 threads	Low contention - Histogram showing majority lengths of transactions for 2 threads (exact percentages in title of graph)	Low contention - Bar graph of log-likelihood valules for 2 threads


Low contention - Boxplot of entire distribution for 4 threads	Low contention - Histogram showing majority lengths of transactions for 4 threads (exact percentages in title of graph)	Low contention - Bar graph of log-likelihood valules for 4 threads


Low contention - Boxplot of entire distribution for 8 threads	Low contention - Histogram showing majority lengths of transactions for 8 threads (exact percentages in title of graph)	Low contention - Bar graph of log-likelihood valules for 8 threads


High contention - Boxplot of entire distribution for 1 thread	High contention - Histogram showing majority lengths of transactions for 1 thread (exact percentage in title of graph)	High contention - Bar graph of log-likelihood values for 1 thread


High contention - Boxplot of entire distribution for 2 threads	High contention - Histogram showing majority lengths of transactions for 2 threads (exact percentages in title of graph)	High contention - Bar graph of log-likelihood valules for 2 threads


High contention - Boxplot of entire distribution for 4 threads	High contention - Histogram showing majority lengths of transactions for 4 threads (exact percentages in title of graph)	High contention - Bar graph of log-likelihood valules for 4 threads


High contention - Boxplot of entire distribution for 8 threads	High contention - Histogram showing majority lengths of transactions for 8 threads (exact percentages in title of graph)	High contention - Bar graph of log-likelihood valules for 8 threads

top

labyrinth

Given a maze, this benchmark finds the shortest-distance paths between pairs of starting and ending points. In this algorithm, the maze is represented as a grid, where each grid point can contain connections to adjacent, non-diagonal grid points. The algorithm searches for a shortest path between the start and end points of a connection by performing a breadth-first search and labeling each grid point with its distance from the start. This expansion phase will eventually reach the end point if a connection is possible. A second traceback phase then forms the connection by following any path with a decreasing distance. This algorithm is guaranteed to find the shortest path between a start and end point; however, when multiple paths are made, one path may block another.

For the test runs, the following parameters were used, as suggested for this STAMP benchmark.

Used a file which has a randomly generated 512x512x6 maze grid and 64 uniformly selected random start/end point pairs


Boxplot of entire distribution for 1 thread	Histogram showing majority lengths of transactions for 1 thread (exact percentage in title of graph)	Bar graph of log-likelihood values for 1 thread


Boxplot of entire distribution for 2 threads	Histogram showing majority lengths of transactions for 2 threads (exact percentages in title of graph)	Bar graph of log-likelihood valules for 2 threads


Boxplot of entire distribution for 4 threads	Histogram showing majority lengths of transactions for 4 threads (exact percentages in title of graph)	Bar graph of log-likelihood valules for 4 threads


Boxplot of entire distribution for 8 threads	Histogram showing majority lengths of transactions for 8 threads (exact percentages in title of graph)	Bar graph of log-likelihood valules for 8 threads

top

bayes

This application implements an algorithm for learning Bayesian networks, which is an important part of machine learning. Typically, neither the probability distributions nor the conditional dependences among them are known or solvable for a human; thus Bayesian networks are often learned from observed data. The particular algorithm implements a hill-climbing strategy that uses both local and global search. For efficient probability distribution estimations, the adtree data structure is used.

For the test runs, the following parameters were used, as suggested for this STAMP benchmark.

Number of variables : 32
Maximum number of parents : 10
Percent chance of having a parent : 40
Number of random records generated and used as inputs : 4096
Edge insert penalty : 2
Maximum number of edges learnt per variable : 8
Random seed : 1


Boxplot of entire distribution for 1 thread	Histogram showing majority lengths of transactions for 1 thread (exact percentage in title of graph)	Bar graph of log-likelihood values for 1 thread


Boxplot of entire distribution for 2 threads	Histogram showing majority lengths of transactions for 2 threads (exact percentages in title of graph)	Bar graph of log-likelihood valules for 2 threads


Boxplot of entire distribution for 4 threads	Histogram showing majority lengths of transactions for 4 threads (exact percentages in title of graph)	Bar graph of log-likelihood valules for 4 threads


Boxplot of entire distribution for 8 threads	Histogram showing majority lengths of transactions for 8 threads (exact percentages in title of graph)	Bar graph of log-likelihood valules for 8 threads

top

intruder

Signature-based network intrusion detection systems (NIDS) scan network packets for matches against a known set of intrusion signatures. Network packets are processed in parallel and go through three phases: capture, reassembly, and detection. The main data structure in the capture phase is a simple FIFO queue, and the reassembly phase uses a dictionary (implemented by a self-balancing tree) that contains lists of packets that belong to the same session. In the TM version included in STAMP, the capture and reassembly phases are each enclosed by transactions. Hence, the code for each phase is as simple as that with coarse-grain locks but hopefully achieves good performance through optimistic concurrency. When operating on these data structures, this benchmark has relatively short transactions. It also has moderate to high levels of contention depending on how often the reassembly phase rebalances its tree. Overall, since two of the three phases are spent in transactions, this benchmark has a moderate amount of total transactional execution time.

For the test runs, the following parameters were used, as suggested for this STAMP benchmark.

Percentage of attacks : 10
Maximum number of packets per stream : 16
Total number of streams : 20000
Random seed : 1


Boxplot of entire distribution for 1 thread	Histogram showing majority lengths of transactions for 1 thread (exact percentage in title of graph)	Bar graph of log-likelihood values for 1 thread


Boxplot of entire distribution for 2 threads	Histogram showing majority lengths of transactions for 2 threads (exact percentages in title of graph)	Bar graph of log-likelihood valules for 2 threads


Boxplot of entire distribution for 4 threads	Histogram showing majority lengths of transactions for 4 threads (exact percentages in title of graph)	Bar graph of log-likelihood valules for 4 threads


Boxplot of entire distribution for 8 threads	Histogram showing majority lengths of transactions for 8 threads (exact percentages in title of graph)	Bar graph of log-likelihood valules for 8 threads

top

ssca2

The Scalable Synthetic Compact Applications~2 (SSCA2)benchmark is comprised of four kernels that operate on a large, directed, weighted multi-graph. These four graph kernels are commonly used in applications ranging from computational biology to security. STAMP focuses on Kernel 1, which constructs an efficient graph data structure using adjacency arrays and auxiliary arrays.

The transactional version of SSCA2 has threads adding nodes to the graph in parallel and uses transactions to protect accesses to the adjacency arrays. Since this operation is relatively small, not much time is spent in transactions. Additionally, the length of the transactions and the sizes of their read and write sets is relatively small. The amount of contention is the application is also relatively low as the large number of graph nodes lead to infrequent concurrent updates of the same adjacency list.

For the test runs, the following parameters were used, as suggested for this STAMP benchmark.

-s20 -i1.0 -u1.0 -l3 -p3
Problem scale : 15
Probability of inter-clique : 1.0
Unidirectional probability : 1.0
Maximum length of a path : 3
Maximum number of parallel edges : 3


Boxplot of entire distribution for 1 thread	Histogram showing majority lengths of transactions for 1 thread (exact percentage in title of graph)	Bar graph of log-likelihood values for 1 thread


Boxplot of entire distribution for 2 threads	Histogram showing majority lengths of transactions for 2 threads (exact percentages in title of graph)	Bar graph of log-likelihood valules for 2 threads


Boxplot of entire distribution for 4 threads	Histogram showing majority lengths of transactions for 4 threads (exact percentages in title of graph)	Bar graph of log-likelihood valules for 4 threads


Boxplot of entire distribution for 8 threads	Histogram showing majority lengths of transactions for 8 threads (exact percentages in title of graph)	Bar graph of log-likelihood valules for 8 threads

top

vacation

This benchmark implements a travel reservation system powered by a non-distributed database. The workload consists of several client threads interacting with the database via the system's transaction manager.

The database is consists of four tables: cars, rooms, flights, and customers. The first three have relations with fields representing a unique ID number, reserved quantity, total available quantity, and price. The table of customers tracks the reservations made by each customer and the total price of the reservations they made. The tables are implemented as Red-Black trees.

For the test runs, the following parameters were used, as suggested for this STAMP benchmark. Runs are observed for both low contention and high contention scenarios.

Number of queries per task : 2 (low contention), 4 (high contention)
Number of relations queried : 90 (low contention), 60 (high contention)
Number of user tasks : 98 (low contention), 90 (high contention)
Number of possible relations : 1048576
Number of tasks : 4194304


Low contention - Boxplot of entire distribution for 1 thread	Low contention - Histogram showing majority lengths of transactions for 1 thread (exact percentage in title of graph)	Low contention - Bar graph of log-likelihood values for 1 thread


High contention - Boxplot of entire distribution for 1 thread	High contention - Histogram showing majority lengths of transactions for 1 thread (exact percentage in title of graph)	High contention - Bar graph of log-likelihood values for 1 thread

top

yada

This benchmark implements Ruppert's algorithm for Delaunay mesh refinement.

For the test runs, the following parameters were used, as suggested for this STAMP benchmark.

Angle constraint : 15 degrees


Boxplot of entire distribution for 1 thread	Histogram showing majority lengths of transactions for 1 thread (exact percentage in title of graph)	Bar graph of log-likelihood values for 1 thread


Boxplot of entire distribution for 2 threads	Histogram showing majority lengths of transactions for 2 threads (exact percentages in title of graph)	Bar graph of log-likelihood valules for 2 threads


Boxplot of entire distribution for 4 threads	Histogram showing majority lengths of transactions for 4 threads (exact percentages in title of graph)	Bar graph of log-likelihood valules for 4 threads


Boxplot of entire distribution for 8 threads	Histogram showing majority lengths of transactions for 8 threads (exact percentages in title of graph)	Bar graph of log-likelihood valules for 8 threads

top