Although non-binding prefetching lets the compiler ignore, from a correctness standpoint, the fact that it is compiling for a multiprocessor, there are performance reasons why it should still take multiprocessing into account. In this subsection, we discuss the first of these reasons: communication between processors can increase the miss rate by causing additional coherence misses (e.g., misses due to invalidations under an invalidation-based cache coherence protocol).
To see how communication affects the miss rate, consider the
example in the accompanying figure. In this example, two
processors are both accessing location A, and both processors
initially have copies of A in their caches in a "shared"
state. Processor 1 loads A twice. Assume that
during the interval between these loads, no other locations are accessed by
Processor 1 that would interfere with A in the cache. If this
access pattern occurred on a uniprocessor, it would be reasonable to expect
the second load of A to hit in the cache, since A has not been
replaced by other accesses since it was first loaded. However, in the
multiprocessor scenario shown in the figure, Processor 2 stores to
location A during this interval, thus invalidating the copy of A in
Processor 1's cache and causing a cache miss the second time
Processor 1 loads A. Such coherence
misses should be taken into account by the compiler during its analysis
phase when it is predicting which references to prefetch.
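
The two-processor scenario described above can be reproduced with a toy simulation. The following Python is only a minimal sketch under stated assumptions: it models a simplified three-state invalidation-based protocol with no capacity or conflict replacements, and the names `State`, `Cache`, and `run_example` are hypothetical, introduced here for illustration rather than taken from any real system.

```python
from enum import Enum


class State(Enum):
    """Simplified per-line coherence states (an assumption of this sketch)."""
    INVALID = "invalid"
    SHARED = "shared"
    MODIFIED = "modified"


class Cache:
    """One processor's cache; tracks only the coherence state of each address."""

    def __init__(self, name):
        self.name = name
        self.lines = {}   # address -> State
        self.peers = []   # the other processors' caches

    def state(self, addr):
        return self.lines.get(addr, State.INVALID)

    def load(self, addr):
        """Perform a load and report whether it was a 'hit' or a 'miss'."""
        if self.state(addr) is not State.INVALID:
            return "hit"
        # Miss: any peer holding a modified copy is downgraded to shared.
        for peer in self.peers:
            if peer.state(addr) is State.MODIFIED:
                peer.lines[addr] = State.SHARED
        self.lines[addr] = State.SHARED
        return "miss"

    def store(self, addr):
        """Perform a store, invalidating every other cached copy."""
        for peer in self.peers:
            peer.lines[addr] = State.INVALID
        self.lines[addr] = State.MODIFIED


def run_example(remote_store):
    """Processor 1 loads A twice; Processor 2 optionally stores to A in between."""
    p1, p2 = Cache("Processor 1"), Cache("Processor 2")
    p1.peers, p2.peers = [p2], [p1]
    # Both processors initially hold A in the shared state.
    p1.lines["A"] = State.SHARED
    p2.lines["A"] = State.SHARED

    first = p1.load("A")
    if remote_store:
        p2.store("A")   # invalidates Processor 1's copy of A
    second = p1.load("A")
    return first, second


if __name__ == "__main__":
    print("no intervening store:", run_example(remote_store=False))
    print("intervening store:   ", run_example(remote_store=True))
```

Without the intervening store, both of Processor 1's loads hit, which matches the uniprocessor expectation. With the store, the second load becomes a coherence miss even though nothing was replaced in Processor 1's cache. An analysis that reasons only about replacements would not predict this miss.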