Man page for porky.1

Name

porky - do assorted code transformations

Synopsis

porky [ options ] infile outfile { infile outfile } *

Description

The porky program performs various transformations on SUIF version 1 code. Command line options specify which transformations are performed. These options fall into two broad categories. The first category is transformations to subsets of SUIF. The purpose of each such transformation is to allow subsequent passes to make simplifying assumptions, such as the assumption that there are no branches into a loop body. The other category is transformations that try to rearrange the code to make it easier for subsequent passes to get information, but that don't get rid of any particular constructs. For example, the -forward-prop transformation tries to move as much computation as possible into the bound expressions of each loop but can't guarantee any particular nice form for the results.

This pass expects two or more file specifications on the command line, paired into input and then output files. For each input file, there must be a corresponding output file. If more than one input/output pair is specified, all the input files must have had their global symbol table information merged so that they can form a single file set (see the documentation for the linksuif pass). The output files will also have merged global symbol table information.

In addition to the file specifications, there can be any number of options given on the command line. All options begin with -. If no options at all are given, the SUIF file will simply be read and written back again without any transformations being performed.

Options that Set Operating Modes

-V n: This sets the verbosity level to the integer n. The default is verbosity level 0. At higher verbosity levels, more comments will be written to standard output about what porky is doing. Verbosity levels above three have the same effect as level three.
-fortran: This tells porky to look at the SUIF file in Fortran mode to see call-by-reference variables. For some kinds of analysis, such as that required for forward propagation (the -forward-prop option), this gives porky more information, so it can do a better job. It is illegal to specify this option if the source of the SUIF code was not Fortran.
-iterate: This flag says to keep doing all the specified optimizations as long as any of them make progress.
-max-iters n: This sets the maximum number of iterations done by the -iterate option to the integer n. The default is to have no limit on the number of iterations.
-no-glob-merge: This option only affects the -cse, -loop-invariants, and -fred-loop-invariants options. It says not to consider references to global variables as loads for the purposes of common sub-expression elimination and loop-invariant moving. The default is to essentially consider all references to global variables as loads, so multiple dynamic references to the same global variable will, if possible, be changed to a read of the global variable into a local followed by multiple references to the local variable.
-fast: When doing data-flow analysis on structured control-flow, for example for the -dead-code or -cse passes, a faster algorithm is used when the -fast flag is given. Some precision is lost, but it seldom matters and the speed-up is often very important. If a porky pass is taking an exceptionally long time, try it with -fast.

Options Causing Transformations to Subsets of SUIF

-Dfors: This dismantles all TREE_FORs.
-Dloops: This dismantles all TREE_LOOPs.
-Difs: This dismantles all TREE_IFs.
-Dblocks: This dismantles all TREE_BLOCKs.
-Darrays: This dismantles all SUIF array instructions.
-Dmbrs: This dismantles all mbr (multi-way branch) instructions.
-Dmins: This dismantles all min instructions.
-Dmaxs: This dismantles all max instructions.
-Dabss: This dismantles all abs (absolute value) instructions.

-Ddivfloors: This dismantles all divfloor instructions.
-Ddivceils: This dismantles all divceil instructions.
-Dmods: This dismantles all mod instructions.
-Dmemcpys: This dismantles all memcpy instructions.
-Dimins: This dismantles all integer min instructions.
-Dimaxs: This dismantles all integer max instructions.
-Diabss: This dismantles all integer abs instructions.
-Dfcmmas: This dismantles all SUIF divfloor, divceil, min, max, abs, and mod instructions. This must be done before mexp/mgen as that back end can't handle these instructions. It is equivalent to specifying all of -Ddivfloors, -Ddivceils, -Dmins, -Dmaxs, -Dabss, and -Dmods.
-Dfcimmas: This dismantles all SUIF divfloor and divceil instructions, and also all integer min, max, and abs instructions. This must be done before the Iwarp software pipeliner because that can handle floating point min, max, and abs instructions but not divfloor and divceil or integer versions of min, max, and abs. It is equivalent to specifying all of -Ddivfloors, -Ddivceils, -Dimins, -Dimaxs, and -Diabss.
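
As a rough illustration of what dismantling divfloor and divceil involves, the sketch below expresses both using only C's truncating division. It assumes that divfloor rounds the quotient toward negative infinity and divceil toward positive infinity; the function names div_floor and div_ceil are hypothetical, and this is not necessarily the code porky emits.

```c
#include <assert.h>

/* Quotient rounded toward negative infinity (assumed divfloor meaning). */
int div_floor(int n, int d)
{
    assert(d != 0);
    int q = n / d;                      /* C division truncates toward zero */
    if ((n % d != 0) && ((n < 0) != (d < 0)))
        q -= 1;                         /* inexact, negative quotient: step down */
    return q;
}

/* Quotient rounded toward positive infinity (assumed divceil meaning). */
int div_ceil(int n, int d)
{
    assert(d != 0);
    int q = n / d;
    if ((n % d != 0) && ((n < 0) == (d < 0)))
        q += 1;                         /* inexact, positive quotient: step up */
    return q;
}
```

The two functions differ from plain division only when the division is inexact, which is why a back end without these instructions can still be served by ordinary division plus a comparison.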
-defaults: This does the default options to be used right after the front end, to turn some non-standard SUIF that the front end produces into standard SUIF. It also does some things, like constant folding and removing empty symbol tables, to make the code as simple as possible without losing information. It is equivalent to all of the options -fixbad, -for-bound, -no-index-mod, -no-empty-fors, -no-empty-table, -control-simp, and -fold.
-fixbad: This fixes ``bad'' nodes. This is used as part of the default expansion after the front end. Many analysis passes count on the simplifying assumptions about control flow that they can make after this pass.

The effects are as follows:

* Any jump or branch from inside a TREE_FOR or TREE_LOOP to a label immediately following the TREE_FOR or TREE_LOOP (i.e. no intervening instructions that might do anything) is changed to use as its target the break label of that TREE_FOR or TREE_LOOP. This will save some nodes from being dismantled because they will contain branches to break labels instead of arbitrary outside labels.

* TREE_FOR nodes with GELE comparison are broken into two TREE_FORs and a TREE_IF to decide between them.

* Any TREE_FOR, TREE_LOOP, or TREE_IF node containing a branch or jump, or the target label used by a branch or jump, is entirely dismantled, unless both the branch or jump and all its possible targets are within the same instruction list. ``Contains'' in this case means at any level down in nested sublists. There is also an exception for branches or jumps to labels that are defined as part of the parent node, such as ``continue'' or ``break'' nodes for loops and fors and the targets such as toplab() and jumpto() used as part of the test code in TREE_IF and TREE_LOOP nodes. Note that this applies only to the immediate children of the node defining such a label; if TREE_LOOP A contains TREE_LOOP B and within B's body there is a jump to the ``continue'' label of A, then B will be dismantled but A will not.

* Any FOR with a test of ``equal to'' or ``not equal to'' is dismantled and a warning message is printed.

-fixbadstrict: This has the same effect as -fixbad except that it uses a stricter definition of ``bad''. In this case, any node containing a jump or branch, or possible target of a jump or branch, is dismantled, even if they are both within the same instruction list. There are only two exceptions: jumps/branches to the toplab() label of a TREE_LOOP from nodes in the test part of that TREE_LOOP; and jumps/branches to the jumpto() label of a TREE_IF from nodes in the header part of that TREE_IF. These only apply to nodes in the first level test or header list, not to nested sub-lists. These two exceptions reflect the fact that TREE_LOOP test parts and TREE_IF header parts are supposed to have potential jumps to these labels and only degenerate cases wouldn't have them.
-max-gele-split-depth depth: This sets to depth the maximum nesting depth of FOR loops with GELE comparisons that -fixbad or -fixbadstrict will split into two FORs and an IF to decide between them. Any more deeply nested GELE FOR loops will be dismantled. This limits the maximum code explosion to a factor of 2**depth. The default depth limit is 5.
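
The splitting of a GELE loop can be pictured in C as follows. This sketch assumes that a GELE comparison means the loop test is ``less than or equal to'' for a positive step and ``greater than or equal to'' for a negative step, with the sign known only at run time; the function names are hypothetical.

```c
#include <assert.h>

/* A loop whose direction is only known at run time. */
long sum_gele(long lb, long ub, long step)
{
    long total = 0;
    assert(step != 0);
    for (long i = lb; (step > 0) ? (i <= ub) : (i >= ub); i += step)
        total += i;
    return total;
}

/* The same loop after splitting: one test on the step selects
   between two loops, each with a fixed comparison. */
long sum_split(long lb, long ub, long step)
{
    long total = 0;
    assert(step != 0);
    if (step > 0) {
        for (long i = lb; i <= ub; i += step)
            total += i;
    } else {
        for (long i = lb; i >= ub; i += step)
            total += i;
    }
    return total;
}
```

Because each split duplicates the loop body, a nest of such loops doubles in size at every level, which is where the 2**depth bound comes from.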
-for-bound: This dismantles TREE_FORs unless porky can tell that the upper bound and step are both loop constants.
-no-index-spill: This dismantles TREE_FORs with a spilled index variable.
-no-index-mod: This dismantles TREE_FORs for which the index variable might be modified by the TREE_FOR body.
-no-empty-fors: This dismantles TREE_FORs with empty bodies.
-no-call-expr: This takes any calls that are within expression trees out of the expression trees and creates new local variables for them to write their results into, then substitutes a reference to that local variable in the expression tree.
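
A minimal before-and-after sketch of this transformation, written in C rather than SUIF. The callee square and the temporary name call_tmp are hypothetical.

```c
/* Hypothetical callee used only for illustration. */
static int square(int v)
{
    return v * v;
}

/* Before: the call sits inside a larger expression tree. */
int before_no_call_expr(int a, int b)
{
    return a + square(b) * 2;
}

/* After: the call writes its result into a new local variable,
   and the expression tree reads that variable instead. */
int after_no_call_expr(int a, int b)
{
    int call_tmp = square(b);   /* hypothetical name for the new local */
    return a + call_tmp * 2;
}
```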
-no-empty-table: This dismantles all TREE_BLOCKs that have empty symbol tables.
-fix-ldc-types: This puts the correct types on all ldc (load constant) instructions that load symbol addresses. This is needed after the front end because parts of the types of symbols may be completed only after a procedure that references the symbol is written out. For example, p might be declared ``extern char p[]'', then used in various procedures, then later defined ``char p[30]''. The complete type information isn't needed at the earlier stage, but in SUIF we must use one symbol and use it consistently, so the symbol's type must change to the completed type. At that point any ldc instructions already written will have the wrong type and must be fixed up by this pass.
-no-struct-copy: This gets rid of all structure copies, whether through copy instructions, load-store pairs, or memcopy instructions. They are replaced with memcopies of integer sized chunks that cover all the bits of the structure. This option is useful before a RISC back end, so that it doesn't have to generate code for multi-word moves.
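
The effect can be approximated in C as shown below. The structure triple and both function names are hypothetical, and the chunking loop is only one plausible way to cover all the bits of the structure.

```c
#include <string.h>

struct triple { int a; int b; int c; };   /* hypothetical structure */

/* Before: a single structure assignment. */
void copy_whole(struct triple *dst, const struct triple *src)
{
    *dst = *src;
}

/* After: a series of integer-sized memory copies covering every byte. */
void copy_chunks(struct triple *dst, const struct triple *src)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    size_t off;

    for (off = 0; off + sizeof(int) <= sizeof *src; off += sizeof(int))
        memcpy(d + off, s + off, sizeof(int));
    /* Copy any tail smaller than an int (empty for this structure). */
    if (off < sizeof *src)
        memcpy(d + off, s + off, sizeof *src - off);
}
```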
-no-sub-vars: This removes all sub-variables and replaces uses of them with uses of their root ancestors, with the appropriate offsets.
-globalize: This changes all static local variables into global variables in the file symbol table. It will do this unconditionally to all static locals, so after this pass there will no longer be any static locals. Both the variable and its var_def will be moved. If any annotations on the var_sym or var_def refer to anything not visible in the file symbol table (other than static locals that will soon be moved to the file symbol table), or operands or instructions, such annotations will be deleted. If the type of the static local is not visible in the file symbol table, its type will be changed to the nearest approximation to that type which can be made in the file symbol table and all uses of the symbol will be changed to include casts to the original type.
-array-glob cutoff-size: This makes all statically allocated arrays with size greater than cutoff-size and type visible in the inter-file global symbol table into globals with inter-file scope (external linkage). That is, it will move static local arrays and arrays with file scope that meet the size limit into the inter-file global symbol table. The variables are renamed if necessary to avoid conflict with existing global symbols. Note that to be safe this pass should only be run on code that has been linksuif'ed with all other source files to make sure all global namespace conflicts are discovered. The motivating use for this pass is to make all arrays visible to the object-level linker so that array alignment specifications given to the linker will apply to all possible arrays. This allows cache-line alignment of arrays when alignment specifications cannot be given to a back-end C compiler.
-glob-autos: This changes the behavior of the -array-glob flag to affect automatic local arrays as well as static local arrays, provided there is a guarantee of no recursion, by way of the ``no recursion'' annotation.
-guard-fors: This adds ``if'' nodes around some tree_for nodes to ensure that whenever any tree_for node is executed, the landing pad and first iteration will always be executed. Any tree_for nodes that already have ``guarded'' annotations will be unaffected because this condition is already guaranteed. All tree_for nodes end up with ``guarded'' annotations after this pass is done. This pass also empties out the landing pads of tree_fors: after they are guarded, it is legal to simply move the landing pad in front of the tree_for, so this pass does so.
-no-ldc-sym-offsets: This breaks all load constant instructions of a symbol and non-zero offset into an explicit addition of the offset.
-only-simple-var-ops: This puts in explicit loads and stores for access to all variables that are not local, non-static, non-volatile variables without the addr_taken flag set.
-kill-enum: This replaces all uses of enumerated types with a corresponding plain integer type. This is useful if a pass doesn't want to see any enumerated types, just the corresponding plain integer types. It is also useful before s2c if the back-end C compiler to run after s2c may not handle enumerated types as the SUIF code does (for example, a particular enumerated type may be treated as an ``unsigned 8-bit integer'' by SUIF but the same enumerated type declaration for the back-end compiler might be treated as a ``signed 32-bit integer'').

Options Causing Other Transformations

-fold: This folds constants wherever possible.
-reassociate: This tries to reassociate the result of any arrays that are dismantled so that the dependence on the index variable of the nearest enclosing TREE_FOR is a simple linear expression, if possible. Since arrays are dismantled only if the -Darrays option is used, there is no effect if -Darrays is not specified.
-control-simp: This simplifies TREE_IFs for which this pass can tell that one branch or the other always executes, leaving only the instructions from the branch that executes and any parts of the test section that might have side effects. It also removes entirely any TREE_FORs which it can tell will never be executed.
-forward-prop: This forward propagates the calculation of local variables into uses of those variables when possible. The idea is to give more information about loop bounds and array indexing for doing dependence analysis and loop transformations, or generally to any pass doing analysis.
-copy-prop: This does copy propagation, which is the same as forward propagation limited to expressions that are simple local variables (i.e. if there is a simple copy from one local variable into another, uses of the source variable will replace the destination variable where the copy is live).
-const-prop: This does simple constant propagation.
-ivar: This does simple induction variable detection. It replaces the uses of the induction variable within the loop by expressions of the loop index and moves the incrementing of the induction variable outside the loop.
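
The following C sketch shows the kind of rewrite described here, under the assumption of a simple counted loop starting at zero. The function names and the increment of 3 are hypothetical.

```c
/* Before: j is an induction variable incremented on every iteration. */
int ivar_before(int n, int *j_inout)
{
    int j = *j_inout;
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += j;
        j += 3;
    }
    *j_inout = j;
    return sum;
}

/* After: uses of j inside the loop become expressions of the loop
   index, and the update of j is done once, outside the loop. */
int ivar_after(int n, int *j_inout)
{
    int j = *j_inout;
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += j + 3 * i;
    if (n > 0)
        j += 3 * n;
    *j_inout = j;
    return sum;
}
```

With the increment gone from the body, the iterations no longer depend on each other through j, which is what later dependence analysis benefits from.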
-reduction: This finds simple instances of reduction. It moves the summation out of the loop.
-for-mod-ref: This puts mod/ref annotations on TREE_FORs. It assumes that the address of a symbol is never stored anywhere, which is valid for Fortran, but usually not for C.
-privatize: This privatizes all variables listed in the annotation privatizable on each TREE_FOR.
-scalarize: This turns local array variables into collections of element variables when all uses of the array are loads or stores of known elements. It will partly scalarize multi-dimensional arrays if they can be scalarized in some but not all dimensions.
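
In C terms the idea looks roughly like this; the element variable names v_0 and v_1 are hypothetical.

```c
/* Before: a local array whose every access uses a known element. */
int scalarize_before(int x, int y)
{
    int v[2];
    v[0] = x + 1;
    v[1] = y * 2;
    return v[0] + v[1];
}

/* After: each element becomes its own scalar variable. */
int scalarize_after(int x, int y)
{
    int v_0, v_1;               /* hypothetical names for the element variables */
    v_0 = x + 1;
    v_1 = y * 2;
    return v_0 + v_1;
}
```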
-know-bounds: This replaces comparisons of upper and lower bounds of a loop inside the loop body with the known result of that comparison. This is particularly useful after multi-level induction variables have been replaced.
-cse: This does simple common sub-expression elimination.
-dead-code: This does simple dead-code elimination.
-dead-code0: This does even simpler (flow insensitive) dead-code elimination.
-unused-syms: This removes symbols that are never referenced and have no external linkage, or that have external linkage but are not defined in this file (i.e. no procedure body or var_def). Static procedures that are never referenced but have bodies will be removed, but only if this pass is re-run, because by the time porky figures out that it is safe to delete a procedure, it will already have been written. The -iterate option does not help with this problem, because it iterates within procedures, not across all procedures; porky cannot iterate on the entire file because it keeps only one procedure in memory at a time.
-unused-types: This removes types that are never referenced.
-loop-invariants: This moves the calculation of loop-invariant expressions outside loop bodies.
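
A small C sketch of loop-invariant motion, with hypothetical function and variable names:

```c
/* Before: a * b + 1 is recomputed on every iteration. */
void scale_before(int *out, int n, int a, int b)
{
    for (int i = 0; i < n; i++)
        out[i] = i * (a * b + 1);
}

/* After: the invariant expression is computed once, outside the loop. */
void scale_after(int *out, int n, int a, int b)
{
    int inv = a * b + 1;        /* hypothetical name for the new temporary */
    for (int i = 0; i < n; i++)
        out[i] = i * inv;
}
```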
-fred-loop-invariants: This is the same as the -loop-invariants flag except that it only considers moving instructions marked with the Fred annotation.
-bitpack: This combines local variables that are used only as single bits (i.e. assigned only one, zero, or the value of another bit variable, and are never addressed), packing them together into variables of type ``unsigned int'' and using bitwise operations to set and extract the appropriate bits. This can be useful in some cases if it allows register allocation of the packed bits, though in other cases the cost associated with the bitwise operations will outweigh the savings.
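
The packing can be sketched in C as below. The bit assignment (bit 0 for is_neg, bit 1 for is_odd) and all names are hypothetical choices made for the illustration.

```c
/* Before: two locals that are only ever assigned zero or one. */
int flags_before(int x)
{
    int is_neg = 0;
    int is_odd = 0;
    if (x < 0)
        is_neg = 1;
    if (x % 2 != 0)
        is_odd = 1;
    return is_neg && is_odd;
}

/* After: both flags live in one unsigned int and are set and
   extracted with bitwise operations. */
int flags_after(int x)
{
    unsigned int packed = 0u;   /* bit 0: is_neg, bit 1: is_odd */
    if (x < 0)
        packed |= 1u << 0;
    if (x % 2 != 0)
        packed |= 1u << 1;
    return ((packed >> 0) & 1u) && ((packed >> 1) & 1u);
}
```

The extra shifts and masks in the second version are the cost that the text above warns may outweigh the savings.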
-if-hoist: This moves certain ``if'' nodes up in the code under some circumstances that can allow the test for the if to be eliminated. The ``if'' nodes that are candidates to be hoisted are those that have a condition depending on only a single variable. If that is the case, and in the code preceding the ``if'' (on the same tree_node_list) there is another ``if'' which assigns a constant to the value of that condition variable in either the ``then'' or ``else'' part, this will duplicate the original ``if'' node and put it in both the ``then'' and ``else'' parts of the higher ``if'' node, if this is legal. This is useful for code which has ``chains'' of ``if'' nodes; that is, the body of one sets a variable that is used as a test in a later ``if''. After hoisting, the constant value can often be propagated into the condition in one of the branches of the ``if''. In simple cases where the flag is cleared before the higher ``if'' and then set only in one of its branches, the test can be eliminated in both parts.
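
The simple case described in the last sentence might look like this in C. The function names and the flag variable are hypothetical.

```c
/* Before: a chain of ifs, where the first sets the flag that the
   second tests. */
int hoist_before(int x)
{
    int flag = 0;
    int r = 0;
    if (x > 10) {
        r += 1;
        flag = 1;
    }
    if (flag)
        r += 100;
    return r;
}

/* After hoisting: the second if is duplicated into both branches of
   the first, where the value of flag is a known constant. */
int hoist_after(int x)
{
    int flag = 0;
    int r = 0;
    if (x > 10) {
        r += 1;
        flag = 1;
        if (flag)               /* flag is known to be 1 here */
            r += 100;
    } else {
        if (flag)               /* flag is known to be 0 here */
            r += 100;
    }
    return r;
}
```

A later constant propagation and control simplification step could then remove both inner tests.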
-find-fors: This builds tree_for nodes out of tree_loop nodes for which a suitable index variable and bounds can be found.
-glob-priv: Do some code transformations to help with privatization of global variables across calls. It looks for ``possible global privatizable'' annotations on proc_syms. In each such annotation it expects to find a list of global variables. It changes the code so that a new parameter is added to the procedure for each symbol in the annotation, and all uses of the symbol are replaced by indirect references through the new parameter, and at callsites the location of that symbol is passed. If the procedure is a Fortran procedure, the new parameter is a call-by-ref parameter. It arranges for this to work through arbitrary call graphs of procedures. The result is code that has the same semantics but in which the globals listed in each of these annotations are never referenced directly, but instead a location to use is passed as a parameter. If the annotations are put on the input code properly, this allows privatization of global variables to be done as if the globals were local.
-build-arefs: Add array reference instructions in place of pointer arithmetic where possible. This helps dependence analysis of programs that were originally written in C, for example.
-for-norm: Normalize all ``for'' loops to have lower bound of zero, step size of one, and ``less than or equal to'' test.
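
As an illustration only, the C sketch below normalizes a loop with a positive step and a ``less than'' test. The restriction to positive steps, the function names, and the way the new upper bound is computed are assumptions of the sketch, not a description of porky's implementation.

```c
#include <assert.h>

/* Before: arbitrary lower bound and step, ``less than'' test. */
long norm_before(long lb, long ub, long step)
{
    long total = 0;
    assert(step > 0);
    for (long i = lb; i < ub; i += step)
        total += i;
    return total;
}

/* After: lower bound of zero, step of one, ``less than or equal to''
   test; the original index is recomputed from the new one. */
long norm_after(long lb, long ub, long step)
{
    long total = 0;
    assert(step > 0);
    /* Index of the last iteration, or -1 if the loop never runs. */
    long last = (ub > lb) ? (ub - lb - 1) / step : -1;
    for (long n = 0; n <= last; n++) {
        long i = lb + n * step;
        total += i;
    }
    return total;
}
```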
-ucf-opt: Do simple optimizations on unstructured control flow (branches and labels). The optimizations are done simultaneously in such a way that the result cannot benefit from any more of these optimizations -- the output run through this pass again will not change. The following optimizations are performed:

* Labels that are not the target of any possible branch are removed.

* Uses of labels that are followed by unconditional jumps or other labels without any intervening executable code are changed to uses of the last label that must always be executed before some executable code, and those labels are removed.

* Unreachable code is removed.

* Branches that would end up in the same place before any code is executed as they would if they were not taken are removed.

* A conditional branch followed by an unconditional branch without any intervening executable code, followed in turn without any intervening executable code by the label that is the target of the conditional branch, is changed by reversing the condition, changing its target to that of the unconditional branch, and removing the unconditional branch. That is,

    if (cond)
        goto L1;
    goto L2;
    L1:

is replaced by

    if (!cond)
        goto L2;
    L1:

(and L1 is removed if it is not a target of some other instruction).

-uncbr: Replace call-by-reference scalar variables with copy-in, copy-out. This is useful when a later pass, such as a back-end compiler after s2c, will not have access to call-by-ref form. Instead of seeing pointer references that might alias with anything, this will allow the pass to see a local variable. Note that without the ``-fortran'' flag, this pass has no effect because without the ``-fortran'' flag, porky won't see anything in call-by-ref form to begin with.
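
Expressed in C, with a pointer parameter standing in for a Fortran call-by-reference scalar, copy-in, copy-out looks roughly like this. The function names are hypothetical.

```c
/* Before: every access goes through the reference parameter. */
void incr_twice_ref(int *x)
{
    *x += 1;
    *x *= 2;
}

/* After: the value is copied into a local on entry, all work is
   done on the local, and the result is copied back on exit. */
void incr_twice_copy(int *x)
{
    int local = *x;             /* copy-in */
    local += 1;
    local *= 2;
    *x = local;                 /* copy-out */
}
```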
-loop-cond: Move all loop-invariant conditionals that are inside a TREE_LOOP or TREE_FOR outside the outermost loop.
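
A C sketch of moving a loop-invariant conditional out of a loop, with hypothetical function names:

```c
/* Before: the test on mode is repeated on every iteration even
   though mode never changes inside the loop. */
void fill_before(int *out, int n, int mode)
{
    for (int i = 0; i < n; i++) {
        if (mode)
            out[i] = i;
        else
            out[i] = -i;
    }
}

/* After: the conditional is outside, selecting between two loops. */
void fill_after(int *out, int n, int mode)
{
    if (mode) {
        for (int i = 0; i < n; i++)
            out[i] = i;
    } else {
        for (int i = 0; i < n; i++)
            out[i] = -i;
    }
}
```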
-child-scalarize: This turns array references with constant indexes that point to array elements that exactly overlap scalar variables with the same type (through the sub-variable mechanism) into uses of those scalar variables.
-child-scalarize-aggressive: This is the same as the ``-child-scalarize'' flag except that if a sub-variable that exactly overlaps doesn't exist but the array is already a descendant of a variable with group type, a new sub-variable will be added to meet the requirement. If the array isn't already under a group type super-variable, new subvariables aren't added because that would tend to complicate some kinds of analysis. If the array is already part of a group, the complication of sub-variables is already there, so it's assumed to be worth it to add another subvariable.
-kill-redundant-line-marks: This removes all mark instructions that contain nothing but line information that is followed immediately by another line information mark.
-nest: This attempts to turn non-perfectly nested loop nests into perfectly nested loop nests by pulling conditionals as far out as possible. This is particularly useful for pulling out loop guarding expressions to restore nests that were originally perfectly nested.
-delinearize: This attempts to turn 1-dimensional array references to multi-dimensional arrays into multi-dimensional array references. It will only do so if it can prove that the new indices obey all bound restrictions.
-form-arrays: This flag causes ``form array'' annotations to be read and arrays to be formed based on them. See the comments for the k_form_array annotation name in the ``useful'' library for details. If any of the original variables were themselves arrays, it's best to run porky again, this time with the ``-chain-arefs'' flag, after ``-form-arrays'' is done.
-chain-arefs: This causes porky to attempt to chain together multiple array reference instructions in series into a single array reference instruction.
-form-all-arrays: This causes porky to find and mark all sets of compatible variables that can be formed into arrays. All such sets are marked with ``form array'' annotations, so if this pass is followed by porky with the ``-form-arrays'' flag, the variables will actually be transformed into arrays.
-ub-from-var: This flag causes porky to attempt to extract upper bound information for array reference instructions from the variables for the arrays being referenced.
-extract-consts: This causes porky to attempt to replace uses of variables that have ``is constant'' annotations with constants based on the static initialization information.
-extract-array-bounds: This causes porky to try to improve the array bound information by replacing variables used in array bounds with constants by looking for constant assignments to those variables at the start of the scope of each such array type.
-cse-no-pointers: This flag only has an effect when used with the ``-cse'' flag. When used that way, it causes common sub-expression elimination to be suppressed on sub-expressions with pointer type. This is useful for avoiding creating temporary variables with pointer type, which inhibits conversion back to Fortran as a back end. It also generally helps avoid breaking up address arithmetic expressions which are often better left intact for the back end.
-breakup cutoff-size: This flag causes porky to attempt to break up expression trees with more than cutoff-size instructions into smaller expression trees reading and writing new temporary variables. It will not create any temporary variables with pointer type. This is useful for back ends that have trouble with really large expression trees, but which are better off with address computations not broken up. This is particularly useful when converting to Fortran because of Fortran's limit of 19 continuation lines; it can also be useful for C if the back-end compiler has hard-coded limits on line sizes or expression sizes, or for machine-code back-ends that can't handle expression trees that are very large.
-mark-constants: This flag causes porky to put ``is constant'' annotations on all statically allocated var_syms for which porky can prove the annotation applies.
-fix-addr-taken: This flag causes porky to set the is_addr_taken flag of each variable to TRUE or FALSE depending on whether or not its address is actually taken, for each variable used only in the given fileset. Any variable which may be used outside the current fileset will have its is_addr_taken flag set TRUE if its address is taken in the given fileset, and otherwise the is_addr_taken flag of that variable will not be changed.

History

The original expander was written and maintained for the old SUIF system by Michael Wolf. Chris Wilson rewrote this expander for the SUIF 1.x system and added the other features to create porky.
