Caches are a critical first step toward coping with memory latency, but they are
not a panacea, as the earlier figures showed. Caches reduce latency from a
memory access to a cache
access whenever data items are found in the cache. The likelihood of
access whenever data items are found in the cache. The likelihood of
finding data in the cache depends not only on the size and organization of
the cache, but also on the inherent locality of reference within the
application. Locality can occur in both time and space: temporal
locality is the tendency for a recently accessed item to be accessed again
soon, and spatial locality is the tendency for items near a
recently accessed item to be accessed soon. Since most applications exhibit
a reasonable amount of locality, caches are generally quite useful. As a
result, most commercial RISC microprocessors provide support for cache
hierarchies, including on-chip primary instruction and data caches. The
benefits of caches in multiprocessors have also been recognized:
despite the complication of keeping shared writable data
coherent [6], a number of multiprocessors with caches
have been implemented [55, 47, 41, 3]. Therefore,
caches are an integral part of the memory latency solution, and the
remaining techniques we discuss build upon caching as a foundation.