Go to the first, previous, next, last section, table of contents.


Array Types

A SUIF array type defines a new type that is a one-dimensional vector with elements of another type. Multi-dimensional arrays are handled as nested arrays (i.e. arrays of arrays) as in C. The size of an array is specified by the upper and lower bounds. Each bound is either an integer expression or unknown. If the bound is an integer expression, it can either be an integer known at compile time, or a SUIF variable symbol representing an integer value to be computed at run time.

The upper and lower bounds of an array type are stored in objects of the array_bound class. An array bound can be one of three things, an integer constant, a SUIF variable symbol, or entirely unknown. The is_constant, is_variable, and is_unknown methods check for these three conditions. If the bound is a constant, the constant method returns the integer value of the bound. Likewise, if it is a variable, the variable method returns a pointer to the variable symbol. The print method is also available to print out the value of an array bound. The global variable unknown_bound is a predefined array bound that can be used when a bound is unknown.

A type node with a TYPE_ARRAY operator is stored in an array_type object. The array_type class contains three fields: the lower and upper bounds and the element type. The lower_bound, upper_bound, and elem_type methods retrieve the contents of these fields, and the set_lower_bound, set_upper_bound, and set_elem_type methods change their values. The SUIF definition of arrays says that the elements will be stored adjacently. If this would violate alignment restrictions, the element types must be padded with extra space. For example, if some machine has 24 bit integers with a 32 bit alignment requirement, the 24 bit integers may not be used directly as array elements; instead, one would have to create a structure of size 32 bits that contains a 24 bit integer at offset zero and no other fields, and then use arrays of this structure type.

The bounds of an array type tell two things: the number of elements and the how the index of an array access instruction will be interpreted. The number of elements determines the size of the array -- the size will be the number of elements times the size of each element. Any array type with an upper or lower bound that is anything but an integer constant will not have a size that is known at compile time, so such a type cannot be used anywhere a known size is required, such as for a variable type, an expression type, or the type of an object copied by a memcpy instruction.

The other time when array bounds are meaningful is in an array access (a SUIF in_array instruction). There each index is required to be between the given bounds or the result is undefined. The location referred to is determined by subtracting the lower bound from the index and then multiplying by the size of the next lower dimension, or element size if there are no more dimensions in the array access instruction. The size in this case is not necessarily known at compile time and is considered to be the difference between the upper and lower bounds plus one (the number of elements) times the size of the next lower dimension, or the element size if this is the smallest dimension. The bounds are evaluated after all sub-expressions of the array access instruction have been evaluated and before the result of the array access instruction itself. Any bounds needed in this calculation must evaluate to integers when this array access instruction is executed, so they must be integer constants or variable symbols, and if variable symbols they must have defined values at this point. Note that this implies that the array type for an array access may not have unknown lower bounds and may have only one unknown upper bound, that of the outermost dimension. The are_bounds_unknown method checks if the upper bound of an array type is unknown.

Note also that all of this information from the array type is also available in a different form from the in_array class, between the bound expressions and offset_op expression. The results are required to be the same at run time whether the array type bound information is used or whether the bound and offset_op expressions are used, or else the resulting behavior is undefined. This redundancy allows more flexibility in analyzing array access patterns. Code transformation passes can treat the in_array sub-expressions like any other expressions and propagate additional information in, and dependence analysis on individual accesses can look at these general expressions. Code dealing with objects and types only has to deal with the symbolic placeholders.

Array bounds are subject to certain constraints. The upper bound may never be less than the lower bound. If both are constants, this constraint applies to those constants. If one or both are symbolic bounds, this constraint must hold for every array access for which this is the array type. Symbolic bounds are typically local variables that are assigned once each at the very beginning of a procedure and used only as bounds, so their values have the same scope as the types they are used in. This corresponds directly to the declarations in Fortran code of array types with non-constant bounds, but this particular form is not required by SUIF, and heavily transformed code may be somewhat different. What is required is that symbolic bounds not be sub-variables or call-by-reference parameters, since these are really just short-hand for other expressions (see section Features for Compiling Fortran).


Go to the first, previous, next, last section, table of contents.