Problem Based Benchmark Suite (2020)

exptSeq Data Generator:

exptSeq -t {int,double} <n> <filename>

This generator creates a sequence of n values, with repeats, drawn from an exponential distribution, and outputs them in the sequence file format.

In particular, it first generates n possible values v1, v2, ..., vn uniformly at random from a given range (depending on the type), and then picks each element of the sequence from among those, choosing the ith value with probability 1/(i ln n). The purpose of the distribution is to test codes on inputs with a varying number of duplicates, including some values that are highly duplicated (e.g., approximately a 1/(ln n) fraction of the elements will have value v1).

The generator supports both double-precision floating-point values and integers. The integer version selects the n possible values uniformly at random from 0 up to the maximum possible value for a two's complement 32-bit integer (2,147,483,647). The double-precision version selects the n possible values uniformly at random in the range [0, 1].
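
The two steps described above can be sketched in Python as follows. This is only an illustration of the distribution, not the PBBS source: the function name expt_seq_sketch, its parameters, and the seed handling are hypothetical, and the sketch returns a list rather than writing the sequence file format. It assumes the weights 1/i are normalized by their exact sum (the harmonic number H_n, which is close to ln n for large n), so the probabilities only approximately equal 1/(i ln n).

```python
import random
from itertools import accumulate

INT_MAX = 2**31 - 1  # largest two's complement 32-bit integer (2,147,483,647)


def expt_seq_sketch(n, kind="int", seed=0):
    """Return a list of n values drawn roughly as described above.

    Hypothetical helper for illustration; not part of PBBS.
    """
    if n <= 0:
        return []
    rng = random.Random(seed)

    # Step 1: pick n candidate values v1..vn uniformly from the type's range.
    if kind == "int":
        candidates = [rng.randint(0, INT_MAX) for _ in range(n)]
    elif kind == "double":
        # Note: random() samples [0, 1), so exactly 1.0 never appears here.
        candidates = [rng.random() for _ in range(n)]
    else:
        raise ValueError("kind must be 'int' or 'double'")

    # Step 2: give candidate i (1-based) weight 1/i. random.choices
    # normalizes by the total weight H_n, which is close to ln n.
    cum_weights = list(accumulate(1.0 / i for i in range(1, n + 1)))

    # Draw n elements with replacement according to those weights.
    return rng.choices(candidates, cum_weights=cum_weights, k=n)
```

For example, expt_seq_sketch(20000, "int") returns 20000 integers in which the most frequent value accounts for roughly a 1/(ln n) fraction of the elements, while most of the other candidates appear rarely or not at all.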

last modified 17:46, 20 Sep 2020

This project has been funded by the following sources:
Intel Labs Academic Research Office for the Parallel Algorithms for Non-Numeric Computing Program,
National Science Foundation, and
IBM Research.