Software-controlled prefetching is only one of several approaches for coping with latency. We discussed a number of these techniques in Section , and in Section we demonstrated that locality optimizations and prefetching are complementary. While locality optimizations are strictly compiler-based, we will now consider several other techniques that require architectural support, to see how they compare and interact with software-controlled prefetching. We begin in Section by comparing hardware-controlled prefetching with software-controlled prefetching. Next, in Section , we discuss relaxed memory consistency models, which are primarily useful for hiding write latency in multiprocessors. Finally, in Section , we evaluate multithreading, a technique for hiding latency by exploiting parallelism across multiple threads of execution.