To reclaim some of the lost performance potential illustrated in the earlier
figures, techniques for coping with
memory latency are essential. These techniques fall broadly into two
categories: those that reduce latency, and those that tolerate
latency. Techniques for reducing latency include caching data and making
the best use of those caches through locality optimizations. Techniques for
tolerating latency include buffering and pipelining references,
prefetching, and multithreading. We will briefly discuss each of these
techniques in this subsection to show how prefetching fits into the overall
approach to hiding latency, and to motivate why prefetching itself is worth
studying.