Load Trace Archive

This page provides pointers to the load traces discussed in these papers:
  • P. Dinda, The Statistical Properties of Host Load, to appear in Scientific Programming, Fall 1999. (This is a greatly extended version of the study published at LCR '98 and presents results for both sets of traces. Currently available as technical report CMU-CS-98-175: abstract, postscript.)

  • P. Dinda, D. O'Hallaron, An Evaluation of Linear Models for Host Load Prediction, to appear in HPDC '99. Currently available as technical report CMU-CS-98-148, School of Computer Science, Carnegie Mellon University, November 1998: abstract, postscript.

    The first paper (and an earlier description of the work given at LCR '98) presents a detailed statistical study of these load traces. The second paper studies the performance of different linear models for host load prediction. The study is based on data-mining a large-scale trial of applying different randomly selected prediction models to randomly selected portions of these load traces. The traces are also used in other parts of my thesis work on prediction-based best-effort real-time systems for distributed interactive applications.

    Because of the large total size of these files (>800 MB), the machine that serves them may change from time to time. It is therefore advisable to bookmark this page rather than the URLs where the files currently reside. If you have difficulty accessing the traces, or if you are interested in accessing the raw data of the prediction study, please contact pdinda@cs.cmu.edu.

    Load trace playback

    You can now download a tool to play these load traces on a machine of your choice.

    File format and conversion

    The traces are provided in their raw form and in a slightly processed form, which is much easier to download and use. Some selected traces are also available in compressed ascii.

    The format of the raw traces is a sequence of pairs of binary IEEE doubles stored in DEC Alpha byte order (little-endian). The first number of each pair is a timestamp, and the second number is the five-second load average sampled at that point.
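
    As an illustration of the raw format described above, the following Python sketch decodes a raw trace into (timestamp, load) pairs. It is not part of the archive's tools. The function name `read_raw_trace` is hypothetical, and the sketch assumes that Alpha byte order means little-endian and that any trailing partial record should be ignored.

```python
import struct

def read_raw_trace(data, byteorder="<"):
    """Decode raw trace bytes into a list of (timestamp, load) tuples.

    byteorder is a struct prefix: "<" for little-endian (assumed here to be
    the Alpha byte order) or ">" for big-endian.
    """
    record = struct.Struct(byteorder + "dd")  # two 8-byte IEEE doubles
    # Assumption: drop any trailing bytes that do not form a full record.
    usable = len(data) - len(data) % record.size
    return list(record.iter_unpack(data[:usable]))
```

    A trace file can be decoded with `read_raw_trace(open(path, "rb").read())`. If the values look like garbage, passing `byteorder=">"` plays the same role as swapping byte orders in the conversion program.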

    The "combine_host.pl" program will combine the data of a group of host files in the correct time order.

    The "bo_to_text.c" program will convert a trace to whitespace-delimited ascii. If the output looks like garbage, supply the "-pervert" option to swap byte orders.

    The trace filenames have a standard form which encodes the name of the machine (`hostname`), and the time (`date`) the trace was started, all sanitized for compatibility with both Unix and NT. For example,

    axp0.psc.edu.Thu_Aug_21_22-01-59_EDT_1997.trace
    
    is a trace that was started on August 21, 1997 at 22:01:59 Eastern Daylight Time on the machine axp0.psc.edu. All of the traces are 3600 samples long (one hour), except for the first group of PSC traces, which are 86400 samples long (one day). Trace runs were done consecutively, so trace files from a single machine can generally be concatenated in time order. Samples are taken at one-second intervals.
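
    The filename convention can be parsed mechanically. The following Python sketch splits a trace filename into host, start time, and time zone abbreviation, and orders the files of one host by start time. The names `parse_trace_name` and `sort_by_start` are hypothetical, and the pattern is inferred from the single example above, so other filenames in the archive may need adjustments.

```python
import re
from datetime import datetime

MONTHS = {name: i for i, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Pattern inferred from the one example filename (an assumption).
# "_+" before the day tolerates a space-padded day turned into underscores.
PATTERN = re.compile(
    r"^(?P<host>.+)\.[A-Za-z]{3}_(?P<mon>[A-Za-z]{3})_+(?P<day>\d{1,2})"
    r"_(?P<h>\d{2})-(?P<m>\d{2})-(?P<s>\d{2})"
    r"_(?P<tz>[A-Za-z]+)_(?P<year>\d{4})\.trace$")

def parse_trace_name(name):
    """Return (hostname, local start time, time zone abbreviation)."""
    m = PATTERN.match(name)
    if m is None:
        raise ValueError("unrecognized trace filename: " + name)
    start = datetime(int(m["year"]), MONTHS[m["mon"]], int(m["day"]),
                     int(m["h"]), int(m["m"]), int(m["s"]))
    return m["host"], start, m["tz"]

def sort_by_start(names):
    """Order filenames by local start time (no time zone conversion)."""
    return sorted(names, key=lambda n: parse_trace_name(n)[1])
```

    The start time is kept as local wall-clock time, so ordering is only reliable among files that share a time zone abbreviation. When in doubt, the timestamps stored inside the traces are the safer ordering key.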

    In the processed form, the traces from a specific host are concatenated in time order as described above and converted into network byte order (big-endian). Note that while individual raw traces are contiguous, processed traces may have gaps at times where no raw trace data was available. If in doubt, use the raw traces or use the timestamp fields in the processed traces. The processed August 1997 traces from the PSC also have a measurement error corrected. Due to an off-by-one error in the trace gathering tool, every hundredth measurement was corrupted. In the processed traces, each such measurement is replaced with an interpolation from the surrounding measurements.
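
    Because processed traces may contain gaps, it can help to locate them from the timestamps before analysis. The following Python sketch reads a processed trace and reports the gaps. The names `read_processed_trace` and `find_gaps` are hypothetical. The sketch assumes the processed files keep the same layout of double pairs as the raw files, and the 0.5 second tolerance around the one-second sampling interval is an arbitrary choice.

```python
import struct

def read_processed_trace(data):
    """Decode processed trace bytes (network byte order) into
    a list of (timestamp, load) tuples."""
    record = struct.Struct("!dd")  # "!" selects network (big-endian) order
    # Assumption: drop any trailing bytes that do not form a full record.
    usable = len(data) - len(data) % record.size
    return list(record.iter_unpack(data[:usable]))

def find_gaps(samples, interval=1.0, tolerance=0.5):
    """Return (last_ts_before_gap, first_ts_after_gap) pairs where two
    consecutive timestamps are more than interval + tolerance apart."""
    gaps = []
    for (t0, _), (t1, _) in zip(samples, samples[1:]):
        if t1 - t0 > interval + tolerance:
            gaps.append((t0, t1))
    return gaps
```

    Splitting the sample list at the reported gaps yields contiguous segments that can be analyzed as evenly sampled series.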

    Traces

    Traces were collected for two time periods, late August 1997 and February to March 1998 on roughly the same group of machines. There are two groups of machines. The first is the Alpha cluster at the Pittsburgh Supercomputing Center (PSC). Of this group, the machines axpfea, axpfeb, and axp0-3 are interactive machines, while the remainder (axp4-10) are batch machines. The second group of machines (CMU) are compute servers (mojave, sahara), a testbed (manchester1-8), and desktop workstations (remainder) in the CMCL here at CMU.

    All of the machines run Digital Unix. Because Digital Unix does not change the five second load average value more frequently than approximately every two seconds, we sampled periodically at a rate of 1 Hz. This captures all of the dynamics the operating system makes available to us.

    Some interesting traces in compressed ascii format

    The following traces are in a two column whitespace-delimited ascii format. The first column is the (floating point) time stamp in seconds and the second column is the measured load value (floating point). The traces have been gzipped to save space.
  • axp0.psc.edu (August 97): a heavily loaded, highly variable interactive machine on the PSC cluster.
  • axp7.psc.edu (August 97): a more lightly loaded batch machine on the PSC cluster that has interesting epochal behavior.
  • sahara.cmcl.cs.cmu.edu (August 97): a moderately loaded, big-memory compute server in the CMCL.
  • themis.nectar.cs.cmu.edu (August 97): a moderately loaded desktop machine.
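
    The compressed ascii traces can be read directly without unpacking them first. The following Python sketch loads a gzipped two-column trace into (timestamp, load) pairs. The function name `read_ascii_trace` is hypothetical, and skipping lines with fewer than two fields is an assumption about how to treat blank or malformed lines.

```python
import gzip

def read_ascii_trace(path):
    """Read a gzipped, whitespace-delimited two-column trace and
    return a list of (timestamp, load) tuples."""
    samples = []
    with gzip.open(path, "rt", encoding="ascii") as f:
        for line in f:
            fields = line.split()
            if len(fields) < 2:
                continue  # assumption: ignore blank or short lines
            samples.append((float(fields[0]), float(fields[1])))
    return samples
```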

    All traces in binary format

    Here are all of the individual traces in binary format.
  • CMU, August 1997 (basic stats, processed traces, raw traces)
  • CMU, February-March 1998 (basic stats, processed traces, raw traces)
  • PSC, August 1997 (basic stats, processed traces, raw traces)
  • PSC, February-March 1998 (basic stats, processed traces, raw traces)

    Contact Peter A. Dinda for more information.