The SUIF compiler performs interprocedural constant propagation to
statically determine as many loop bounds as possible. The loop bounds are
often needed to determine the volume of data accessed by each loop, which
in turn is used to decide whether a loop is within the localized
iteration space (see the algorithm in Figure ).
When the loop bounds remain unknown, the compiler may be faced with
deciding whether an unknown volume of data fits within the cache. To
resolve such cases, our algorithm uses one of two policies: either (i)
unknown loop counts are assumed small, and hence the data would tend to fit
in the cache; or (ii) unknown loop counts are assumed large, and therefore
the data would tend not to fit. For our experiments in the previous
section, we consistently used the former policy of assuming unknown loop
counts to be small, which tends to overestimate what remains in the cache.
We now compare this with the latter policy of assuming unknown loop counts
to be large.
When the compiler assumes unknown iteration counts to be large rather than
small, it produces identical code for 11 of the 13 benchmarks-the two
benchmarks that change are OCEAN and MG. For OCEAN, the difference in
performance is negligible. However, MG performs 4%worse with the
large-loop policy, as shown in Figure (a). In this
case, the benefit of the extra prefetches is more than offset by increased
instruction overhead. Although assuming small loop counts happens to be
better in this case, the opposite could easily be true in programs where
unknown iteration counts are actually large. One solution may be to resolve
loop counts through profiling feedback or to generate adaptive code, as we
will discuss later in Section
. However, the
more interesting result of this experiment is how rarely loop bound
uncertainty affects performance. We observe that while nearly half the
benchmarks contain loops with unknown bounds, in most cases this has no
impact on locality, due to the patterns of data reuse within those loops.