Float Sum combine1: Maximum use of data abstraction: 11.43 cycles/element
Float Sum combine2: Take vec_length() out of loop: 9.55 cycles/element
Float Sum combine3: Array reference to vector data: 2.86 cycles/element
Float Sum combine4: Array reference, accumulate in temporary: 2.86 cycles/element
Float Sum combine4p: Pointer reference, accumulate in temporary: 2.86 cycles/element
Float Sum Array code, unrolled by 2: 2.86 cycles/element
Float Sum combine5p: Pointer code, unrolled by 3, for loop: 2.87 cycles/element
Float Sum Array code, unrolled by 3, while loop: 2.86 cycles/element
Float Sum Array code, unrolled by 4: 2.87 cycles/element
Float Sum Array code, unrolled by 8: 2.88 cycles/element
Float Sum Array code, unrolled by 16: 2.86 cycles/element
Float Sum Pointer code, unrolled by 2: 2.86 cycles/element
Float Sum Pointer code, unrolled by 3: 2.86 cycles/element
Float Sum Pointer code, unrolled by 4: 2.88 cycles/element
Float Sum Pointer code, unrolled by 8: 2.87 cycles/element
Float Sum Pointer code, unrolled by 16: 2.87 cycles/element
Float Sum combine6: Array code, unrolled by 2, Superscalar x2: 1.44 cycles/element
Float Sum Array code, unrolled by 4, Superscalar x2: 1.42 cycles/element
Float Sum Array code, unrolled by 8, Superscalar x2: 1.44 cycles/element
Float Sum Array code, unrolled by 3, Superscalar x3: 0.96 cycles/element
Float Sum Array code, unrolled by 4, Superscalar x4: 0.96 cycles/element
Float Sum Array code, unrolled by 8, Superscalar x4: 0.97 cycles/element
Float Sum Array code, unrolled by 6, Superscalar x6: 0.98 cycles/element
Float Sum Array code, unrolled by 8, Superscalar x8: 0.98 cycles/element
Float Sum Array code, unrolled by 10, Superscalar x10: 0.96 cycles/element
Float Sum Array code, unrolled by 12, Superscalar x6: 0.96 cycles/element
Float Sum Array code, unrolled by 12, Superscalar x12: 0.96 cycles/element
Float Sum Pointer code, unrolled by 8, Superscalar x2: 1.43 cycles/element
Float Sum Pointer code, unrolled by 8, Superscalar x4: 0.97 cycles/element
Float Sum Pointer code, unrolled by 8, Superscalar x8: 0.98 cycles/element
Float Sum Pointer code, unrolled by 9, Superscalar x3: 0.98 cycles/element
Float Sum Array code, Unroll x2, Superscalar x2, noninterleaved: 1.43 cycles/element
Float Sum Array code, unrolled by 2, different associativity: 1.43 cycles/element
Float Sum Array code, unrolled by 3, Different Associativity: 1.27 cycles/element
Float Sum Array code, unrolled by 4, Different Associativity: 0.95 cycles/element
Float Sum Array code, unrolled by 6, Different Associativity: 0.97 cycles/element
Float Sum Array code, unrolled by 8, Different Associativity: 0.98 cycles/element
Float Sum SSE code, 1*VSIZE-way parallelism: 0.72 cycles/element
Float Sum SSE code, 2*VSIZE-way parallelism: 0.36 cycles/element
Float Sum SSE code, 4*VSIZE-way parallelism: 0.23 cycles/element
Float Sum SSE code, 8*VSIZE-way parallelism: 0.24 cycles/element
Float Sum SSE code, 12*VSIZE-way parallelism: 0.23 cycles/element
Float Sum SSE code, 2*VSIZE-way parallelism, reassociate: 0.41 cycles/element
Float Sum SSE code, 4*VSIZE-way parallelism, reassociate: 0.23 cycles/element
Float Sum SSE code, 8*VSIZE-way parallelism, reassociate: 0.24 cycles/element
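
The labels in the results above ("accumulate in temporary", "unrolled by 2, Superscalar x2", "different associativity") name loop transformations whose source is not included in this output. As a rough illustration only, here is a minimal C sketch of what such variants might look like for a float sum. The function names (`sum_single_acc`, `sum_unroll2_acc2`, `sum_unroll2_reassoc`) and the plain `float *` interface are hypothetical, not the benchmarked code, and the sketch makes no claim to reproduce the timings listed.

```c
#include <stddef.h>

/* Hypothetical baseline in the spirit of "accumulate in temporary":
 * one accumulator, so every add depends on the previous add. */
float sum_single_acc(const float *data, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += data[i];
    return acc;
}

/* Hypothetical "unrolled by 2, Superscalar x2" variant: two independent
 * accumulators, so the two adds in one iteration do not depend on each other. */
float sum_unroll2_acc2(const float *data, size_t n)
{
    float acc0 = 0.0f, acc1 = 0.0f;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        acc0 += data[i];        /* even-indexed elements */
        acc1 += data[i + 1];    /* odd-indexed elements  */
    }
    for (; i < n; i++)          /* leftover element when n is odd */
        acc0 += data[i];
    return acc0 + acc1;
}

/* Hypothetical "unrolled by 2, different associativity" variant: a single
 * accumulator, but the pair is summed first, so only one add per iteration
 * sits on the accumulator's dependency chain. */
float sum_unroll2_reassoc(const float *data, size_t n)
{
    float acc = 0.0f;
    size_t i = 0;
    for (; i + 1 < n; i += 2)
        acc = acc + (data[i] + data[i + 1]);
    for (; i < n; i++)          /* leftover element when n is odd */
        acc += data[i];
    return acc;
}
```

All three return the same value when every partial sum is exactly representable, but floating-point addition is not associative, so the second and third variants can round differently from the baseline on general input. A compiler will typically not perform these rewrites for floats on its own unless relaxed floating-point options are enabled.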