Our algorithm for scheduling prefetches has two goals. First, we would like to minimize the overhead of the inserted prefetches. Second, we would like the prefetches to be as effective as possible at eliminating cache misses. To address the first goal, we use loop splitting techniques such as peeling and unrolling to isolate those dynamic instances for which the prefetch predicates are true, so that prefetches are issued only when they are needed. To address the second goal, we use software pipelining to schedule the prefetches so that the data arrives in the cache just before it is needed. This section discusses both steps of our scheduling algorithm.
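To make the two steps concrete, the following is a minimal sketch of how a simple reduction loop might look after both transformations. It is an illustration under stated assumptions, not the output of our algorithm: the function name `sum_prefetched`, the line size `LINE_ELEMS` (4 elements per cache line), and the prefetch distance `PF_AHEAD` (16 elements) are hypothetical values chosen for the example, the array is assumed to start on a cache line boundary, and the prefetch itself is expressed with the GCC/Clang builtin `__builtin_prefetch`.

```c
#include <stddef.h>

/* Hypothetical parameters, chosen only for illustration. */
#define LINE_ELEMS 4   /* array elements per cache line (assumed)          */
#define PF_AHEAD   16  /* prefetch distance in elements, a multiple of     */
                       /* LINE_ELEMS (assumed, not a measured value)       */

/* Sums a[0..n-1]. Assumes a is aligned to a cache line boundary. */
double sum_prefetched(const double *a, size_t n)
{
    double sum = 0.0;
    size_t i;

    /* Prolog (software pipelining): issue the first prefetches only,
     * one per cache line, with no computation. */
    for (i = 0; i < PF_AHEAD && i < n; i += LINE_ELEMS)
        __builtin_prefetch(&a[i]);

    /* Steady state: unrolled by LINE_ELEMS (4 here) so that exactly one
     * prefetch is issued per cache line, without a conditional test.
     * The loop bound keeps both the loads and the prefetch address
     * inside the array. */
    for (i = 0; i + PF_AHEAD + LINE_ELEMS <= n; i += LINE_ELEMS) {
        __builtin_prefetch(&a[i + PF_AHEAD]);
        sum += a[i];
        sum += a[i + 1];
        sum += a[i + 2];
        sum += a[i + 3];
    }

    /* Epilog: remaining iterations, with no prefetches. Their data is
     * covered by earlier prefetches, except possibly a final partial
     * line. */
    for (; i < n; i++)
        sum += a[i];

    return sum;
}
```

In this sketch, unrolling removes the per-iteration test of the prefetch predicate, while the prolog and epilog loops shift each prefetch `PF_AHEAD` elements ahead of the iteration that uses the data. The additions are performed in the same order as in the original loop, so the result is unchanged.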