Table of Contents
porky - do assorted code transformations
porky [ options ] infile outfile { infile outfile } *
The porky program performs various transformations on
SUIF version 1 code. Command line options specify which
transformations are performed. These options fall into
two broad categories. The first category is transformations
to subsets of SUIF. The purpose of each such transformation
is to allow subsequent passes to make simplifying
assumptions, such as the assumption that there are no
branches into a loop body. The other category of transformations
is those that try to rearrange the code to make
it easier for subsequent passes to get information, but
don't get rid of any particular constructs. For example,
the -forward-prop transformation tries to move as much
computation as possible into the bound expressions of each
loop but can't guarantee any particular nice form for the
results.
This pass expects two or more file specifications on the
command line, paired into input and then output files.
For each input file, there must be a corresponding output
file. If more than one input/output pair is specified,
all the input files must have had their global symbol
table information merged so that they can form a single
file set (see the documentation for the linksuif pass).
The output files will also have merged global symbol table
information.
In addition to the file specifications, there can be any
number of options given on the command line. All options
begin with -. If no options at all are given, the SUIF
file will simply be read and written back again without
any transformations being performed.
- -V n
- This sets the verbosity level to the integer n.
The default is verbosity level 0. At higher verbosity
levels, more comments will be written to
standard output about what porky is doing. Verbosity
levels above three have the same effect as
level three.
- -fortran
-
This tells porky to look at the SUIF file in Fortran
mode to see call-by-reference variables. For
some kinds of analysis, such as that required for
forward propagation (the -forward-prop option),
this gives porky more information, so it can do a
better job. It is illegal to specify this option
if the source of the SUIF code was not Fortran.
- -iterate
-
This flag says to keep doing all the specified
optimizations as long as any of them make progress.
- -max-iters n
-
This sets the maximum number of iterations done by
the -iterate option to the integer n. The default
is to have no limit on the number of iterations.
- -no-glob-merge
-
This option only affects the -cse, -loop-invariants,
and -fred-loop-invariants options. It says
not to consider references to global variables as
loads for the purposes of common sub-expression
elimination and loop-invariant moving. The default
is to essentially consider all references to global
variables as loads, so multiple dynamic references
to the same global variable will, if possible, be
changed to a read of the global variable into a
local followed by multiple references to the local
variable.
- -fast
- When doing data-flow analysis on structured control-flow,
for example for the -dead-code or -cse
passes, a faster algorithm is used when the -fast
flag is given. Some precision is lost, but it seldom
matters and the speed-up is often very important.
If a porky pass is taking an exceptionally
long time, try it with -fast.
- -Dfors
-
This dismantles all TREE_FORs.
- -Dloops
-
This dismantles all TREE_LOOPs.
- -Difs
- This dismantles all TREE_IFs.
- -Dblocks
-
This dismantles all TREE_BLOCKs.
- -Darrays
-
This dismantles all SUIF array instructions.
- -Dmbrs
-
This dismantles all mbr (multi-way branch) instructions.
- -Dmins
-
This dismantles all min instructions.
- -Dmaxs
-
This dismantles all max instructions.
- -Dabss
-
This dismantles all abs (absolute value) instructions.
- -Ddivfloors
-
This dismantles all divfloor instructions.
- -Ddivceils
-
This dismantles all divceil instructions.
- -Dmods
-
This dismantles all mod instructions.
- -Dmemcpys
-
This dismantles all memcpy instructions.
- -Dimins
-
This dismantles all integer min instructions.
- -Dimaxs
-
This dismantles all integer max instructions.
- -Diabss
-
This dismantles all integer abs instructions.
- -Dfcmmas
-
This dismantles all SUIF divfloor, divceil, min,
max, abs, and mod instructions. This must be done
before mexp/mgen as that back end can't handle
these instructions. It is equivalent to specifying all of
-Ddivfloors, -Ddivceils, -Dmins, -Dmaxs, -Dabss, and
-Dmods.
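As an illustration of what dismantling these instructions amounts to, here is a minimal C sketch. It assumes that divfloor rounds the quotient toward negative infinity and divceil toward positive infinity, as the names suggest; the function names are hypothetical and the code porky actually emits may differ.

```c
#include <assert.h>

/* divfloor: quotient rounded toward negative infinity (assumed semantics). */
int divfloor_expanded(int a, int b)
{
    int q = a / b;              /* C division truncates toward zero */
    int r = a % b;
    if (r != 0 && ((r < 0) != (b < 0)))
        q -= 1;                 /* truncation rounded up, so step down */
    return q;
}

/* divceil: quotient rounded toward positive infinity (assumed semantics). */
int divceil_expanded(int a, int b)
{
    int q = a / b;
    int r = a % b;
    if (r != 0 && ((r < 0) == (b < 0)))
        q += 1;                 /* truncation rounded down, so step up */
    return q;
}

/* min and abs reduce to a compare and a select. */
int min_expanded(int a, int b) { return a < b ? a : b; }
int abs_expanded(int a)        { return a < 0 ? -a : a; }

/* Floor and ceiling differ by one exactly when b does not divide a. */
void check_expansions(void)
{
    for (int a = -20; a <= 20; a++)
        for (int b = -5; b <= 5; b++) {
            if (b == 0)
                continue;
            int exact = (a % b == 0);
            assert(divceil_expanded(a, b) - divfloor_expanded(a, b)
                   == (exact ? 0 : 1));
        }
}
```

A back end that only understands ordinary division, remainder, comparison, and branching can handle every operation used here.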
- -Dfcimmas
-
This dismantles all SUIF divfloor and divceil
instructions, and also all integer min, max, and
abs instructions. This must be done before the
Iwarp software pipeliner because that can handle
floating point min, max, and abs instructions but
not divfloor and divceil or integer versions of
min, max, and abs. It is equivalent to specifying all of
-Ddivfloors, -Ddivceils, -Dimins, -Dimaxs, and -Diabss.
- -defaults
-
This does the default options to be used right
after the front end, to turn some non-standard SUIF
that the front end produces into standard SUIF. It
also does some things, like constant folding and
removing empty symbol tables, to make the code as
simple as possible without losing information. It
is equivalent to all of the options -fixbad, -for-bound,
-no-index-mod, -no-empty-fors, -no-empty-table,
-control-simp, and -fold.
- -fixbad
-
This fixes ``bad'' nodes. This is used as part of
the default expansion after the front end. Many
analysis passes count on the simplifying assumptions
about control flow that they can make after
this pass.
The effects are as follows:
* Any jump or branch from inside a TREE_FOR or
TREE_LOOP to a label immediately following the
TREE_FOR or TREE_LOOP (i.e. with no intervening instructions
that might do anything) is changed to use as
its target the break label of that TREE_FOR or
TREE_LOOP. This will save some nodes from being
dismantled because they will contain branches to
break labels instead of arbitrary outside labels.
* TREE_FOR nodes with GELE comparison are broken
into two TREE_FORs and a TREE_IF to decide between
them.
* Any TREE_FOR, TREE_LOOP, or TREE_IF node containing
a branch or jump, or the target label used by a
branch or jump, is entirely dismantled, unless both
the branch or jump and all its possible targets are
within the same instruction list. ``Contains'' in
this case means at any level down in nested sublists.
There is also an exception for branches or
jumps to labels that are defined as part of the
parent node, such as ``continue'' or ``break''
nodes for loops and fors and the targets such as
toplab() and jumpto() used as part of the test code
in TREE_IF and TREE_LOOP nodes. Note that this
applies only to the immediate children of the node
defining such a label; if TREE_LOOP A contains
TREE_LOOP B and within B's body there is a jump to
the ``continue'' label of A, then B will be dismantled
but A will not.
* Any FOR with a test of ``equal to'' or ``not
equal to'' is dismantled and a warning message is
printed.
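The GELE split described above can be pictured with the following C sketch. It assumes that a GELE comparison means the direction of the test (``less than or equal to'' or ``greater than or equal to'') is chosen at run time by the sign of the step; the function names are hypothetical.

```c
#include <assert.h>

/* Before: one loop whose test direction depends on the sign of step. */
long sum_gele(int lb, int ub, int step)
{
    long sum = 0;
    for (int i = lb; step >= 0 ? i <= ub : i >= ub; i += step)
        sum += i;
    return sum;
}

/* After: an "if" picks one of two loops, each with a fixed test. */
long sum_split(int lb, int ub, int step)
{
    long sum = 0;
    if (step >= 0) {
        for (int i = lb; i <= ub; i += step)
            sum += i;
    } else {
        for (int i = lb; i >= ub; i += step)
            sum += i;
    }
    return sum;
}

/* Both forms agree for every non-zero step in a small range. */
void check_split(void)
{
    for (int lb = -5; lb <= 5; lb++)
        for (int ub = -5; ub <= 5; ub++)
            for (int step = -3; step <= 3; step++)
                if (step != 0)
                    assert(sum_gele(lb, ub, step) == sum_split(lb, ub, step));
}
```

Each nested GELE loop doubles the number of loop copies, which is why the split depth is limited.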
- -fixbadstrict
-
This has the same effect as -fixbad except that it
uses a stricter definition of ``bad''. In this
case, any node containing a jump or branch, or possible
target of a jump or branch, is dismantled,
even if they are both within the same instruction
list. There are only two exceptions:
jumps/branches to the toplab() label of a TREE_LOOP
from nodes in the test part of that TREE_LOOP; and
jumps/branches to the jumpto() label of a TREE_IF
from nodes in the header part of that TREE_IF.
These only apply to nodes in the first level test
or header list, not to nested sub-lists. These two
exceptions reflect the fact that TREE_LOOP test
parts and TREE_IF header parts are supposed to have
potential jumps to these labels and only degenerate
cases wouldn't have them.
- -max-gele-split-depth depth
-
This sets to depth the maximum nesting depth of FOR
loops with GELE comparisons that -fixbad or
-fixbadstrict will split into two FORs and an IF to
decide between them. Any more deeply nested GELE
FOR loops will be dismantled. This limits the maximum
code explosion to a factor of 2**depth. The default
depth limit is 5.
- -for-bound
-
This dismantles TREE_FORs unless porky can tell
that the upper bound and step are both loop constants.
- -no-index-spill
-
This dismantles TREE_FORs with a spilled index
variable.
- -no-index-mod
-
This dismantles TREE_FORs for which the index variable
might be modified by the TREE_FOR body.
- -no-empty-fors
-
This dismantles TREE_FORs with empty bodies.
- -no-call-expr
-
This takes any calls that are within expression
trees out of the expression trees and creates new
local variables for them to write their results
into, then substitutes a reference to that local
variable in the expression tree.
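In C terms, the effect is roughly the following sketch. The names square, call_tmp, nested_call, and hoisted_call are hypothetical and only serve the illustration.

```c
#include <assert.h>

int square(int v) { return v * v; }

/* Before: the call sits inside a larger expression tree. */
int nested_call(int a, int b)
{
    return a + square(b) * 2;
}

/* After: the call writes into a new local, which the expression reads. */
int hoisted_call(int a, int b)
{
    int call_tmp = square(b);
    return a + call_tmp * 2;
}

/* Both forms compute the same value. */
void check_call_expr(void)
{
    for (int a = -3; a <= 3; a++)
        for (int b = -3; b <= 3; b++)
            assert(nested_call(a, b) == hoisted_call(a, b));
}
```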
- -no-empty-table
-
This dismantles all TREE_BLOCKs that have empty
symbol tables.
- -fix-ldc-types
-
This puts the correct types on all ldc (load constant)
instructions that load symbol addresses.
This is needed after the front end because parts of
the types of symbols may be completed only after a
procedure that references the symbol is written
out. For example, p might be declared ``extern
char p[]'', then used in various procedures, then
later defined ``char p[30]''. The complete type
information isn't needed at the earlier stage, but
in SUIF we must use one symbol and use it consistently,
so the symbol's type must change to the
completed type. At that point any ldc instructions
already written will have the wrong type and must
be fixed up by this pass.
- -no-struct-copy
-
This gets rid of all structure copies, whether
through copy instructions, load-store pairs, or
memcopy instructions. They are replaced with memcopies
of integer sized chunks that cover all the
bits of the structure. This option is useful
before a RISC back end, so that it doesn't have to
generate code for multi-word moves.
- -no-sub-vars
-
This removes all sub-variables and replaces uses of
them with uses of their root ancestors, with the
appropriate offsets.
- -globalize
-
This changes all static local variables into global
variables in the file symbol table. It will do
this unconditionally to all static locals, so after
this pass there will no longer be any static
locals. Both the variable and its var_def will be
moved. If any annotations on the var_sym or
var_def refer to anything not visible in the file
symbol table (other than static locals that will
soon be moved to the file symbol table), or
operands or instructions, such annotations will be
deleted. If the type of the static local is not
visible in the file symbol table, its type will be
changed to the nearest approximation to that type
which can be made in the file symbol table and all
uses of the symbol will be changed to include casts
to the original type.
- -array-glob cutoff-size
-
This makes all statically allocated arrays with
size greater than cutoff-size and type visible in
the inter-file global symbol table into globals
with inter-file scope (external linkage). That is,
it will move static local arrays and arrays with
file scope that meet the size limit into the inter-file
global symbol table. The variables are
renamed if necessary to avoid conflict with existing
global symbols. Note that to be safe this pass
should only be run on code that has been linksuif'ed
with all other source files to make sure all
global namespace conflicts are discovered. The
motivating use for this pass is to make all arrays
visible to the object-level linker so that array
alignment specifications given to the linker will
apply to all possible arrays. This allows cache-line
alignment of arrays when alignment
specifications cannot be given to a back-end C compiler.
- -glob-autos
-
This changes the behavior of the -array-glob flag
to affect automatic local arrays as well as static
local arrays, provided there is a guarantee of no
recursion, by way of the ``no recursion'' annotation.
- -guard-fors
-
This adds ``if'' nodes around some tree_for nodes
to ensure that whenever any tree_for node is executed,
the landing pad and first iteration will
always be executed. Any tree_for nodes that
already have ``guarded'' annotations will be unaffected
because this condition is already guaranteed.
All tree_for nodes end up with ``guarded''
annotations after this pass is done. This pass
also empties out the landing pads of tree_fors; after
they are guarded, it is legal to simply move
the landing pad in front of the tree_for, so this
pass does so.
- -no-ldc-sym-offsets
-
This breaks all load constant instructions of a symbol
and non-zero offset into an explicit addition
of the offset.
- -only-simple-var-ops
-
This puts in explicit loads and stores for access
to all variables that are not local, non-static,
non-volatile variables without the addr_taken flag
set.
- -kill-enum
-
This replaces all uses of enumerated types with a
corresponding plain integer type. This is useful
if a pass doesn't want to see any enumerated types,
just the corresponding plain integer types. It is
also useful before s2c if the back-end C compiler
to run after s2c may not handle enumerated types as
the SUIF code does (for example, a particular enumerated
type may be treated as an ``unsigned 8-bit
integer'' by SUIF but the same enumerated type declaration
for the back-end compiler might be treated
as a ``signed 32-bit integer'').
- -fold
-
This folds constants wherever possible.
- -reassociate
-
This tries to reassociate the result of any arrays
that are dismantled so that the dependence on the
index variable of the nearest enclosing TREE_FOR is
a simple linear expression, if possible. Since
arrays are dismantled only if the -Darrays option
is used, there is no effect if -Darrays is not
specified.
- -control-simp
-
This simplifies TREE_IFs for which this pass can
tell that one branch or the other always executes,
leaving only the instructions from the branch that
executes and any parts of the test section that
might have side effects. It also removes entirely
any TREE_FORs which it can tell will never be executed.
- -forward-prop
-
This forward propagates the calculation of local
variables into uses of those variables when possible.
The idea is to give more information about
loop bounds and array indexing for doing dependence
analysis and loop transformations, or generally to
any pass doing analysis.
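A minimal C sketch of the idea, with hypothetical names: the computation of the local variable last is propagated into the loop bound, so a later pass sees the bound directly in terms of n.

```c
#include <assert.h>

/* Before: the loop bound is hidden behind the local variable last. */
long sum_before(const int *a, int n)
{
    int last = n - 1;
    long sum = 0;
    for (int i = 0; i <= last; i++)
        sum += a[i];
    return sum;
}

/* After: the calculation of last appears in the bound expression. */
long sum_after(const int *a, int n)
{
    long sum = 0;
    for (int i = 0; i <= n - 1; i++)
        sum += a[i];
    return sum;
}

/* Both forms agree on every prefix of a small array. */
void check_forward_prop(void)
{
    int data[5] = { 4, -1, 7, 0, 2 };
    for (int n = 0; n <= 5; n++)
        assert(sum_before(data, n) == sum_after(data, n));
}
```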
- -copy-prop
-
This does copy propagation, which is the same as
forward propagation limited to expressions that are
simple local variables (i.e. if there is a simple
copy from one local variable into another, uses of
the source variable will replace the destination
variable where the copy is live).
- -const-prop
-
This does simple constant propagation.
- -ivar
- This does simple induction variable detection. It
replaces the uses of the induction variable within
the loop by expressions of the loop index and moves
the incrementing of the induction variable outside
the loop.
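The following C sketch shows the kind of rewrite described, using hypothetical names. The variable j advances by a fixed amount each iteration, so its uses can be expressed in terms of the loop index i and its update can be done once after the loop.

```c
#include <assert.h>

/* Before: j is an induction variable stepping by 3 each iteration. */
int ivar_before(int *out, int n)
{
    int j = 5;
    for (int i = 0; i < n; i++) {
        out[i] = j;
        j += 3;
    }
    return j;
}

/* After: uses of j become expressions of i; the update leaves the loop. */
int ivar_after(int *out, int n)
{
    int j = 5;
    for (int i = 0; i < n; i++)
        out[i] = j + 3 * i;
    if (n > 0)
        j += 3 * n;             /* total increment, applied once */
    return j;
}

/* Both forms fill the array identically and leave j with the same value. */
void check_ivar(void)
{
    int x[6], y[6];
    for (int n = 0; n <= 6; n++) {
        assert(ivar_before(x, n) == ivar_after(y, n));
        for (int i = 0; i < n; i++)
            assert(x[i] == y[i]);
    }
}
```

With the update removed, the loop body no longer carries a dependence on j from one iteration to the next.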
- -reduction
-
This finds simple instances of reduction. It moves
the summation out of the loop.
- -for-mod-ref
-
This puts mod/ref annotations on TREE_FORs. It
assumes that the address of a symbol is never
stored anywhere, which is valid for Fortran, but
usually not for C.
- -privatize
-
This privatizes all variables listed in the ``privatizable''
annotation on each TREE_FOR.
- -scalarize
-
This turns local array variables into collections
of element variables when all uses of the array are
loads or stores of known elements. It will partly
scalarize multi-dimensional arrays if they can be
scalarized in some but not all dimensions.
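A small C sketch of the simplest case, with hypothetical names: every use of the local array is a load or store with a constant index, so each element can become its own scalar variable.

```c
#include <assert.h>

/* Before: a local array whose elements are only accessed by constant index. */
int pair_before(int x, int y)
{
    int pair[2];
    pair[0] = x + y;
    pair[1] = x - y;
    return pair[0] * pair[1];
}

/* After: one element variable per array element. */
int pair_after(int x, int y)
{
    int pair_0 = x + y;
    int pair_1 = x - y;
    return pair_0 * pair_1;
}

/* Both forms compute the same value. */
void check_scalarize(void)
{
    for (int x = -4; x <= 4; x++)
        for (int y = -4; y <= 4; y++)
            assert(pair_before(x, y) == pair_after(x, y));
}
```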
- -know-bounds
-
This replaces comparisons of upper and lower bounds
of a loop inside the loop body with the known
result of that comparison. This is particularly
useful after multi-level induction variables have
been replaced.
- -cse
- This does simple common sub-expression elimination.
- -dead-code
-
This does simple dead-code elimination.
- -dead-code0
-
This does even simpler (flow insensitive) dead-code
elimination.
- -unused-syms
-
This removes symbols that are never referenced and
have no external linkage, or that have external
linkage but are not defined in this file (i.e. no
procedure body or var_def). Static procedures that
are never referenced but have bodies will be
removed, but only if this pass is re-run, because
by the time porky figures out that it is safe to
delete a procedure, it will already have been written.
The ``-iterate'' option does not help with this problem,
because that iterates within procedures, not
across all procedures; porky cannot iterate on the
entire file because it keeps only one procedure in
memory at a time.
- -unused-types
-
This removes types that are never referenced.
- -loop-invariants
-
This moves the calculation of loop-invariant
expressions outside loop bodies.
- -fred-loop-invariants
-
This is the same as the -loop-invariants flag
except that it only considers moving instructions
marked with the Fred annotation.
- -bitpack
-
This combines local variables that are used only as
single bits (i.e. assigned only one, zero, or the
value of another bit variable, and are never
addressed), packing them together into variables of
type ``unsigned int'' and using bitwise operations
to set and extract the appropriate bits. This can
be useful in some cases if it allows register allocation
of the packed bits, though in other cases
the cost associated with the bitwise operations
will outweigh the savings.
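The following C sketch illustrates the packing, with hypothetical variable names and bit positions; how porky chooses bit positions is not specified here.

```c
#include <assert.h>

/* Before: two locals that only ever hold zero or one. */
int flags_before(int x)
{
    int is_neg = 0, is_odd = 0;
    if (x < 0)
        is_neg = 1;
    if (x % 2 != 0)
        is_odd = 1;
    return is_neg + 2 * is_odd;
}

/* After: both flags share one unsigned int (bit 0: is_neg, bit 1: is_odd). */
int flags_after(int x)
{
    unsigned int packed = 0;
    if (x < 0)
        packed |= 1u << 0;
    if (x % 2 != 0)
        packed |= 1u << 1;
    return (int)((packed >> 0) & 1u) + 2 * (int)((packed >> 1) & 1u);
}

/* Both forms agree over a small range of inputs. */
void check_bitpack(void)
{
    for (int x = -8; x <= 8; x++)
        assert(flags_before(x) == flags_after(x));
}
```

The shifts and masks in the second version are the extra cost that has to be weighed against the saved registers.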
- -if-hoist
-
This moves certain ``if'' nodes up in the code
under some circumstances that can allow the test for
the if to be eliminated. The ``if'' nodes that are
candidates to be hoisted are those that have a condition
depending on only a single variable. If
that is the case, and in the code preceding the
``if'' (on the same tree_node_list) there is
another ``if'' which assigns a constant to the
value of that condition variable in either the
``then'' or ``else'' part, this will duplicate the
original ``if'' node and put it in both the
``then'' and ``else'' parts of the higher ``if''
node, if this is legal. This is useful for code
which has ``chains'' of ``if'' nodes; that is, the
body of one sets a variable that is used as a test
in a later ``if''. After hoisting, the constant
value can often be propagated into the condition in
one of the branches of the ``if''. In simple cases
where the flag is cleared before the higher ``if''
and then set only in one of its branches, the test
can be eliminated in both parts.
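A C sketch of the simple case mentioned last, with hypothetical names. The third function shows the result once the constant value of the flag has been propagated and the now-constant tests have been simplified away, which is assumed here to be done by other options such as -const-prop and -control-simp.

```c
#include <assert.h>

/* Before: the first "if" sets a flag that the second "if" tests. */
int chain_before(int x)
{
    int found = 0;              /* flag cleared before the first "if" */
    int result = 0;
    if (x > 10)
        found = 1;
    else
        result = -1;
    if (found)
        result += x;
    return result;
}

/* After hoisting: the second "if" is copied into both branches. */
int chain_hoisted(int x)
{
    int found = 0;
    int result = 0;
    if (x > 10) {
        found = 1;
        if (found)              /* found is known to be 1 here */
            result += x;
    } else {
        result = -1;
        if (found)              /* found is known to be 0 here */
            result += x;
    }
    return result;
}

/* After propagating the constants: both copied tests are gone. */
int chain_simplified(int x)
{
    int result = 0;
    if (x > 10)
        result += x;
    else
        result = -1;
    return result;
}

/* All three forms agree. */
void check_if_hoist(void)
{
    for (int x = 0; x <= 20; x++) {
        assert(chain_before(x) == chain_hoisted(x));
        assert(chain_before(x) == chain_simplified(x));
    }
}
```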
- -find-fors
-
This builds tree_for nodes out of tree_loop nodes
for which a suitable index variable and bounds can
be found.
- -glob-priv
-
Do some code transformations to help with privatization
of global variables across calls. It looks
for ``possible global privatizable'' annotations on
proc_syms. In each such annotation it expects to
find a list of global variables. It changes the
code so that a new parameter is added to the procedure
for each symbol in the annotation, and all
uses of the symbol are replaced by indirect references
through the new parameter, and at callsites
the location of that symbol is passed. If the procedure
is a Fortran procedure, the new parameter is
a call-by-ref parameter. It arranges for this to
work through arbitrary call graphs of procedures.
The result is code that has the same semantics but
in which the globals listed in each of these annotations
are never referenced directly, but instead
a location to use is passed as a parameter. If the
annotations are put on the input code properly,
this allows privatization of global variables to be
done as if the globals were local.
- -build-arefs
-
Add array reference instructions in place of
pointer arithmetic where possible. This helps
dependence analysis of programs that were originally
written in C, for example.
- -for-norm
-
Normalize all ``for'' loops to have lower bound of
zero, step size of one, and ``less than or equal
to'' test.
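As a rough C sketch of such a normalization, assuming a positive step: the new index k counts iterations from zero, and the original index is recomputed from k. The names and the way the new upper bound is computed are assumptions made for illustration, not necessarily what porky generates.

```c
#include <assert.h>

/* Before: arbitrary lower bound and a positive step. */
long norm_before(int lb, int ub, int step)
{
    long sum = 0;
    for (int i = lb; i <= ub; i += step)
        sum += i;
    return sum;
}

/* After: k runs from 0 in steps of 1 with a "<=" test. */
long norm_after(int lb, int ub, int step)
{
    long sum = 0;
    /* Last value of k; -1 makes the loop run zero times when ub < lb. */
    int last = (ub >= lb) ? (ub - lb) / step : -1;
    for (int k = 0; k <= last; k++)
        sum += lb + k * step;   /* the original index, recomputed from k */
    return sum;
}

/* Both forms agree for positive steps over a small range of bounds. */
void check_for_norm(void)
{
    for (int lb = -6; lb <= 6; lb++)
        for (int ub = -6; ub <= 6; ub++)
            for (int step = 1; step <= 4; step++)
                assert(norm_before(lb, ub, step) == norm_after(lb, ub, step));
}
```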
- -ucf-opt
-
Do simple optimizations on unstructured control
flow (branches and labels). The optimizations are
done simultaneously in such a way that the result
cannot benefit from any more of these optimizations
-- the output run through this pass again will not
change. The following optimizations are performed:
* Labels that are not the target of any possible
branch are removed.
* Uses of labels that are followed by unconditional
jumps or other labels without any intervening executable
code are changed to uses of the last label
that must always be executed before some executable
code, and those labels are removed.
* Unreachable code is removed.
* Branches that would end up in the same place
before any code is executed as they would if they
were not taken are removed.
* Conditional branches followed in the code by
unconditional branches without any intervening executable
code, followed without any intervening executable
code by the label that is the target of the
conditional branch, are changed to reverse the condition,
change its target to that of the unconditional
branch, and remove the conditional branch.
That is,
if (cond)
goto L1;
goto L2;
L1:
is replaced by
if (!cond)
goto L2;
L1:
(and L1 is removed if it is not a target of some
other instruction).
- -uncbr
-
Replace call-by-reference scalar variables with
copy-in, copy-out. This is useful when a later
pass, such as a back-end compiler after s2c, will
not have access to call-by-ref form. Instead of
seeing pointer references that might alias with
anything, this will allow the pass to see a local
variable. Note that without the ``-fortran'' flag,
this pass has no effect because without the ``-fortran''
flag, porky won't see anything in call-by-ref
form to begin with.
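Seen from C, copy-in, copy-out looks roughly like the sketch below. The names are hypothetical, and the two forms are equivalent only under the assumption that x does not alias log, which is the kind of guarantee Fortran argument rules provide.

```c
#include <assert.h>

/* Before: every access to the scalar goes through the pointer. */
void scale_before(int *x, int *log, int n)
{
    for (int i = 0; i < n; i++) {
        *x = *x * 2;
        log[i] = *x;
    }
}

/* After: copy in, work on a local variable, copy out. */
void scale_after(int *x, int *log, int n)
{
    int x_local = *x;           /* copy-in */
    for (int i = 0; i < n; i++) {
        x_local = x_local * 2;
        log[i] = x_local;
    }
    *x = x_local;               /* copy-out */
}

/* Both forms leave the scalar and the log array in the same state. */
void check_uncbr(void)
{
    int v = 3, w = 3, l1[4], l2[4];
    scale_before(&v, l1, 4);
    scale_after(&w, l2, 4);
    assert(v == w);
    for (int i = 0; i < 4; i++)
        assert(l1[i] == l2[i]);
}
```

In the second form a C compiler can keep x_local in a register for the whole loop without reasoning about pointers.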
- -loop-cond
-
Move all loop-invariant conditionals that are
inside a TREE_LOOP or TREE_FOR outside the outermost
loop.
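A C sketch of the effect, with hypothetical names: the condition use_double does not change inside the loop, so it is tested once outside and each branch gets its own copy of the loop.

```c
#include <assert.h>

/* Before: a loop-invariant condition is tested on every iteration. */
void fill_before(int *a, int n, int use_double)
{
    for (int i = 0; i < n; i++) {
        if (use_double)
            a[i] = 2 * i;
        else
            a[i] = i;
    }
}

/* After: the conditional is outside the loop. */
void fill_after(int *a, int n, int use_double)
{
    if (use_double) {
        for (int i = 0; i < n; i++)
            a[i] = 2 * i;
    } else {
        for (int i = 0; i < n; i++)
            a[i] = i;
    }
}

/* Both forms fill the array identically for either value of the flag. */
void check_loop_cond(void)
{
    int x[5], y[5];
    for (int flag = 0; flag <= 1; flag++) {
        fill_before(x, 5, flag);
        fill_after(y, 5, flag);
        for (int i = 0; i < 5; i++)
            assert(x[i] == y[i]);
    }
}
```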
- -child-scalarize
-
This turns array references with constant indexes
that point to array elements that exactly overlap
scalar variables with the same type (through the
sub-variable mechanism) into uses of those scalar
variables.
- -child-scalarize-aggressive
-
This is the same as the ``-child-scalarize'' flag
except that if a sub-variable that exactly overlaps
doesn't exist but the array is already a descendant
of a variable with group type, a new sub-variable
will be added to meet the requirement. If the
array isn't already under a group type super-variable,
new sub-variables aren't added because that
would tend to complicate some kinds of analysis.
If the array is already part of a group, the complication
of sub-variables is already there, so
it's assumed to be worth it to add another sub-variable.
- -kill-redundant-line-marks
-
This removes all mark instructions that contain
nothing but line information that is followed immediately
by another line information mark.
- -nest
- This attempts to turn non-perfectly nested loop
nests into perfectly nested loop nests by pulling
conditionals as far out as possible. This is particularly
useful for pulling out loop guarding
expressions to restore nests that were originally
perfectly nested.
- -delinearize
-
This attempts to turn 1-dimensional array references
to multi-dimensional arrays into multi-dimensional
array references. It will only do so if it
can prove that the new indices obey all bound restrictions.
- -form-arrays
-
This flag causes ``form array'' annotations to be
read and arrays to be formed based on them. See
the comments for the k_form_array annotation name
in the ``useful'' library for details. If any of
the original variables were themselves arrays, it's
best to run porky again, this time with the
``-chain-arefs'' flag, after ``-form-arrays'' is
done.
- -chain-arefs
-
This causes porky to attempt to chain together multiple
array reference instructions in series into a
single array reference instruction.
- -form-all-arrays
-
This causes porky to find and mark all sets of compatible
variables that can be formed into arrays.
All such sets are marked with ``form array'' annotations,
so if this pass is followed by porky with
the ``-form-arrays'' flag, the variables will actually
be transformed into arrays.
- -ub-from-var
-
This flag causes porky to attempt to extract upper
bound information for array reference instructions
from the variables for the arrays being referenced.
- -extract-consts
-
This causes porky to attempt to replace uses of
variables with ``is constant'' annotations with constants
based on the static initialization information.
- -extract-array-bounds
-
This causes porky to try to improve the array bound
information by replacing variables used in array
bounds with constants by looking for constant
assignments to those variables at the start of the
scope of each such array type.
- -cse-no-pointers
-
This flag only has an effect when used with the
``-cse'' flag. When used that way, it causes common
sub-expression elimination to be suppressed on
sub-expressions with pointer type. This is useful
for avoiding creating temporary variables with
pointer type, which inhibits conversion back to
Fortran as a back end. It also generally helps
avoid breaking up address arithmetic expressions
which are often better left intact for the back
end.
- -breakup cutoff-size
-
This flag causes porky to attempt to break up
expression trees with more than cutoff-size
instructions into smaller expression trees reading
and writing new temporary variables. It will not
create any temporary variables with pointer type.
This is useful for back ends that have trouble with
really large expression trees, but which are better
off with address computations not broken up. This
is particularly useful when converting to Fortran
because of Fortran's limit of 19 continuation
lines; it can also be useful for C if the back-end
compiler has hard-coded limits on line sizes or
expression sizes, or for machine-code back-ends
that can't handle expression trees that are very
large.
- -mark-constants
-
This flag causes porky to put ``is constant'' annotations
on all statically allocated var_syms for
which porky can prove the annotation applies.
- -fix-addr-taken
-
This flag causes porky to set the is_addr_taken
flag of each variable to TRUE or FALSE depending on
whether or not its address is actually taken, for
each variable used only in the given fileset. Any
variable which may be used outside the current
fileset will have its is_addr_taken flag set TRUE
if its address is taken in the given fileset, and
otherwise the is_addr_taken flag of that variable
will not be changed.
The original expander was written and maintained for the
old SUIF system by Michael Wolf. Chris Wilson rewrote
this expander for the SUIF 1.x system and added the other
features to create porky.