For software-controlled prefetching, the overheads include both instruction overhead and memory hierarchy contention. Hardware-based techniques obviously do not suffer from instruction overhead, which is one of their advantages. However, they should suffer at least as much (if not more) from memory hierarchy contention, since the hardware probes the cache for each predicted reference, rather than only for references predicted to suffer cache misses (as software does). Consequently, primary cache tag contention may be quite high for hardware-based approaches. In addition, hardware-based techniques may suffer from TLB contention, since they must somehow deal with virtual addresses in order to follow access patterns across physical page boundaries. The TLB may need to be accessed either on every reference or once each time a page boundary is crossed, and these lookups may contend with the TLB accesses of normal memory instructions. In the software-controlled case, TLB access is straightforward, since translation is already a normal part of processing any instruction that references memory.
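As a concrete illustration of the instruction overhead on the software side, the sketch below shows how explicit prefetch instructions might be inserted only for references expected to miss (here, once per cache line). The array, loop, cache line size, and prefetch distance are illustrative assumptions rather than examples drawn from this study; the intrinsic used is GCC's __builtin_prefetch.

```c
#include <stddef.h>

#define LINE_DOUBLES 8   /* assumed: 64-byte cache lines, 8-byte doubles */
#define PF_DIST      64  /* assumed prefetch distance, in array elements */

double sum_with_prefetch(const double *a, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        /* One extra instruction per cache line -- the "instruction
         * overhead" -- issued only for references predicted to miss.
         * Accesses that fall within an already-prefetched line do not
         * probe the cache again, unlike a hardware prefetcher that
         * examines every reference. */
        if (i % LINE_DOUBLES == 0 && i + PF_DIST < n)
            __builtin_prefetch(&a[i + PF_DIST], 0 /* read */, 3 /* high locality */);
        sum += a[i];
    }
    return sum;
}
```

Note that the address passed to the prefetch instruction is an ordinary virtual address, translated through the same TLB path as any load or store, which is why the software-controlled case needs no additional translation mechanism.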