The prefetching algorithm has a few compile-time parameters, which we
consistently set as follows: cache line size = 32 bytes, effective cache size = 500 bytes, prefetch latency = 300 cycles, and
policy on unknown loop bounds = assume a small number of iterations.
The cache line size precisely matches the architecture, while the other
parameters are more heuristic in nature. As discussed in Section
, we choose the effective cache size to be a small fraction (roughly 6%)
of the actual size (8 Kbytes) as a first approximation to the effects of
cache conflicts. The prefetch latency indicates to the compiler how
many cycles in advance it should try to prefetch a reference (i.e., the
latency parameter in equation ( )). The prefetch latency is
larger than 75 cycles, the minimum miss-to-memory penalty, to account for
bandwidth-related delays. For cases where loop bounds cannot be resolved at
compile-time, we assume the number of iterations to be small, which tends
to overestimate what remains in the cache. Later, in Section , we will
consider the effects of varying these parameters.
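
To make the role of these parameters concrete, the following is a minimal sketch and not the compiler's actual implementation. The names `PrefetchParams`, `iterations_ahead`, and `fits_in_effective_cache` are hypothetical. The sketch also assumes that the number of iterations to prefetch ahead is the prefetch latency divided by the estimated cycles per loop iteration, rounded up; the equation referenced above may differ in detail.

```python
import math
from dataclasses import dataclass

# Values stated in the text.
ACTUAL_CACHE_BYTES = 8 * 1024      # actual cache size (8 Kbytes)
MIN_MISS_PENALTY_CYCLES = 75       # minimum miss-to-memory penalty


@dataclass(frozen=True)
class PrefetchParams:
    """Hypothetical container for the compile-time parameters."""
    cache_line_bytes: int = 32             # matches the architecture
    effective_cache_bytes: int = 500       # heuristic fraction of the actual size
    prefetch_latency_cycles: int = 300     # heuristic, above the 75-cycle minimum
    assume_small_unknown_bounds: bool = True  # policy on unknown loop bounds


def iterations_ahead(params: PrefetchParams, cycles_per_iteration: int) -> int:
    """Assumed rule: prefetch ceil(latency / cycles per iteration) iterations ahead."""
    if cycles_per_iteration <= 0:
        raise ValueError("cycles_per_iteration must be positive")
    return math.ceil(params.prefetch_latency_cycles / cycles_per_iteration)


def fits_in_effective_cache(params: PrefetchParams, bytes_touched: int) -> bool:
    """Assumed test: data is expected to stay cached if it fits the effective size."""
    return bytes_touched <= params.effective_cache_bytes


params = PrefetchParams()
```

Under these assumptions, a loop body estimated at 40 cycles per iteration would be prefetched `iterations_ahead(params, 40)` iterations ahead, and a 4 Kbyte working set would not be expected to remain in the cache, even though it fits in the actual 8 Kbyte cache.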