Earlier in Section , we discussed the
possibility of placing prefetched data in a separate target buffer, rather
than the normal cache hierarchy. We suggested that as a general policy,
this was not a good idea. However, there is one case where it may make
sense: when prefetched data is used only once (i.e., it has no locality).
Such data will not be reused after it is brought into the cache, and if a
large amount of it is placed in the cache, it may displace other data that
would have benefited from locality. One possibility, therefore, is to issue
uncached prefetches whenever the compiler determines that a reference has
no locality.
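
As a rough illustration of a prefetch that is not meant to linger in the cache, the sketch below uses the GCC/Clang builtin `__builtin_prefetch` with its locality argument set to 0. This is only a software hint, not the separate target buffer discussed here: the hardware is free to ignore it or to place the line in the cache anyway. The function name `sum_stream` and the prefetch distance `PREFETCH_AHEAD` are hypothetical and untuned.

```c
#include <stddef.h>

/* Hypothetical prefetch distance, in elements; a real value would be tuned. */
#define PREFETCH_AHEAD 16

/* Sum a stream whose elements are each read exactly once (no locality). */
double sum_stream(const double *data, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n) {
            /* rw = 0: prefetch for a read.
             * locality = 0: no temporal locality is expected, so the data
             * need not be kept in the cache after it is used. */
            __builtin_prefetch(&data[i + PREFETCH_AHEAD], 0, 0);
        }
        total += data[i];
    }
    return total;
}
```

The result is the same with or without the hint; only the cache behavior may differ, and whether it does depends on the target processor.
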
Using uncached prefetches has two complications. First, if the compiler
incorrectly predicts that a reference has no locality when it in fact
does, performance will suffer. For example, the
A[j][0] reference in Figure may or may not
have temporal locality, depending on the size of n. If the compiler
assumes that n is large, and therefore that A[j][0] has no locality,
uncached prefetches for A[j][0] will hurt performance whenever n turns
out to be small. Uncached prefetches should therefore be used only when
the compiler is certain that a reference has no locality.
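
The conservative policy above can be sketched as a small decision function. Because the figure is not reproduced here, the sketch assumes a loop shape in which A[j][0] is referenced for j = 0..n-1 inside an inner loop that an outer loop repeats, and that each row of A starts on its own cache line. The cache parameters and all names (`CACHE_BYTES`, `LINE_BYTES`, `column_reuse_fits`, `choose_prefetch`) are hypothetical, and the capacity test ignores conflict misses and other data competing for the cache.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cache parameters, for illustration only. */
#define CACHE_BYTES 8192u
#define LINE_BYTES  32u

typedef enum { PREFETCH_CACHED, PREFETCH_UNCACHED } prefetch_kind;

/* Under the assumed loop shape, A[j][0] for j = 0..n-1 touches n distinct
 * cache lines. Return true if those n lines could all stay resident until
 * the next outer-loop iteration, i.e. the reference has temporal locality. */
bool column_reuse_fits(size_t n)
{
    return n <= CACHE_BYTES / LINE_BYTES;
}

/* Choose an uncached prefetch only when n is known and the reuse provably
 * does not fit; when n is unknown, fall back to a normal cached prefetch. */
prefetch_kind choose_prefetch(bool n_known, size_t n)
{
    if (n_known && !column_reuse_fits(n)) {
        return PREFETCH_UNCACHED;
    }
    return PREFETCH_CACHED;
}
```

With these assumed parameters, a known n of 257 or more selects the uncached prefetch, while an unknown n always selects the cached one, which reflects the requirement that the compiler be certain before bypassing the cache.
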
The second complication, as we mentioned earlier in Section
, is the hardware complexity of building the
separate target buffer. Given this complexity, it is unclear whether
it would be worth building such a structure even to support uncached
prefetches.