Updates 4/28/2003 |
We have posted our final report, introducing a new approach to
instruction matching for vectorization. [.ps | .pdf | .dvi]
|
Updates 4/14/2003 |
Please see our project milestone report
and Overview of Optimizations
for an update.
|
Introduction |
Some modern processors allow the same operation (for example, an addition)
to be performed on multiple values using one instruction and wide
registers. A 128-bit register can be configured to store four 32-bit values,
for example, all of which are added or multiplied simultaneously.
Not only does this lead to a more efficient program (one instruction can execute more quickly than four), but packing multiple values into a single register also makes better use of the register file.

We wish to create optimizations that leverage these SIMD (Single Instruction, Multiple Data) instructions to optimize loop induction variables. Folding several values into one register has a cost, but by focusing the optimization on loop variables we hope to gain performance in two ways: the folding and unfolding code moves to the boundaries of the loop, where it runs once rather than on every iteration, and the instruction count and register usage inside the loop both go down.

The Sony PlayStation 2 game console is built around a modified MIPS processor, the Toshiba R5900 (the Emotion Engine), which provides numerous SIMD instructions operating on 128-bit wide registers. We wish to implement our optimizations in a version of GCC that targets the Emotion Engine.
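The folding idea can be illustrated with a small C sketch. This is only a model under stated assumptions, not the GCC implementation: the `vec4` struct stands in for one 128-bit register with four 32-bit lanes, and `vec4_add`, `step_scalar`, `step_packed`, and the start values and strides are all hypothetical names and numbers chosen for the example. On real SIMD hardware the body of `vec4_add` would be a single packed-add instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one 128-bit register holding four 32-bit lanes. */
typedef struct { int32_t lane[4]; } vec4;

/* Lane-wise add: models what one packed 32-bit add instruction would do. */
static vec4 vec4_add(vec4 a, vec4 b) {
    vec4 r;
    for (int k = 0; k < 4; k++)
        r.lane[k] = a.lane[k] + b.lane[k];
    return r;
}

/* Scalar baseline: four induction variables, four adds per iteration. */
static void step_scalar(int n, int32_t out[4]) {
    assert(n >= 0);
    int32_t i = 0, j = 10, p = 100, q = -5;
    for (int t = 0; t < n; t++) {
        i += 1;
        j += 2;
        p += 4;
        q += 3;
    }
    out[0] = i; out[1] = j; out[2] = p; out[3] = q;
}

/* Packed version: fold once before the loop, one (modeled) packed add
   per iteration, unfold once after the loop. */
static void step_packed(int n, int32_t out[4]) {
    assert(n >= 0);
    vec4 iv = {{0, 10, 100, -5}};       /* fold: start values of i, j, p, q */
    const vec4 step = {{1, 2, 4, 3}};   /* per-iteration strides */
    for (int t = 0; t < n; t++)
        iv = vec4_add(iv, step);
    for (int k = 0; k < 4; k++)         /* unfold at the loop exit */
        out[k] = iv.lane[k];
}
```

Calling `step_scalar(n, a)` and `step_packed(n, b)` with the same `n` fills `a` and `b` with identical values. The difference is that the packed loop body contains one add where the scalar body contains four, and the pack and unpack work sits outside the loop.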
|
Literature Search |
The literature we will survey includes work by Corinna Lee and previous
course projects. The remaining entries are prior work on optimization
with SIMD instructions.
E. Hogan, G. Judd, and S. Sinnamohideen. Automatically Identifying Opportunities for Using Special Purpose Instructions. 15-740 Course Project.
D. DeVries and C.G. Lee. A Vectorizing SUIF Compiler. In Proceedings of the First SUIF Compiler Workshop, pp. 59-67, January 1996.
C.G. Lee and M.G. Stoodley. Simple Vector Microprocessors for Multimedia Applications. Accepted for publication in the 31st Annual International Symposium on Microarchitecture.
A. Bik, et al. Efficient Exploitation of Parallelism on Pentium III and Pentium 4 Processor-Based Systems. Intel Technology Journal, 1Q 2001.
D. Naishlos et al. Compiler Vectorization Techniques for a Disjoint SIMD Architecture. IBM Research Report, November 2002. This work deals with optimizations specifically for digital signal processing with vector registers. Apparently, in order to do this well, non-traditional optimizations are required.
M.G. Stoodley and C.G. Lee. Vector Microprocessors for Desktop Computing. Submitted for publication to the 26th Annual International Symposium on Computer Architecture.
GCC 2.95.2 Online Documentation.
Sony Computer Entertainment Inc. EE Core Instruction Set Manual Version 5.0
|
Plan of Attack |
Week-by-week schedule (last week left off for slippage):
Should we run into too many complications with the GCC implementation, we can first create a prototype that targets the Intel x86 architecture's SIMD instructions under SUIF.
|
Project Milestone |
By April 14th we hope to have a version of GCC that will print out
comments in the code which identify the instructions that are to be
optimized. This implies that we've defined our framework, identified
the loops and induction variables, and have found the specific
instructions to optimize.
|
Resources Needed |
Ryan and Steven will be working with GCC 2.95.2 set up to cross-compile
from a PC Linux host system to the PlayStation 2 Emotion Engine.
|
Getting Started |
So far, Ryan has been reading through a few of the above papers.
|
Project web page |
http://www.cs.cmu.edu/~sosman/classes/compilers/project/
|