
Name

s2c - convert a SUIF file to C

Synopsis

s2c [ options ] SUIF-file [C-file]

Description

The s2c program reads the specified SUIF file and prints out its translation into the Standard C language, if possible. If the C-file specification is given, it is used for output; otherwise the C code is written to standard output.

The output conforms to the ANSI/ISO specifications for Standard C. Each output file is self-contained and does not use #include to include any other files. The output is not machine independent, however. The SUIF system determines the sizes of all data types and the locations of fields in structures and elements in arrays. To produce C output that retains type information, which helps debugging, keeps the C code readable, and makes optimization easier for the back-end C compiler, SUIF structures and arrays are translated into the corresponding C types, and direct structure and array accesses are translated into their C versions. In doing all of this, s2c makes assumptions about how the back-end C compiler will lay out and align the data within structures and arrays. It must also make assumptions about the sizes of all types when doing pointer arithmetic or array indexing.

For these reasons, the C compiler used as a back end must use the same data size, alignment, and layout rules used by snoot, the SUIF C front end. If a particular existing C compiler is to be used as a back end for SUIF, snoot must have its configuration parameters set up to match that compiler. See the snoot documentation for more information.
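
One way to see which rules a candidate back-end compiler follows is to build a small probe with it and compare the numbers against the snoot configuration. The following is only a sketch; struct probe and print_layout are hypothetical names chosen for illustration and are not part of SUIF.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical structure, chosen only to expose padding and alignment. */
struct probe {
    char   tag;
    int    count;
    double value;
};

/* Print the sizes and field offsets used by the compiler that builds
   this file, so they can be compared with the snoot configuration. */
void print_layout(FILE *out)
{
    fprintf(out, "sizeof(short)   = %lu\n", (unsigned long)sizeof(short));
    fprintf(out, "sizeof(int)     = %lu\n", (unsigned long)sizeof(int));
    fprintf(out, "sizeof(long)    = %lu\n", (unsigned long)sizeof(long));
    fprintf(out, "sizeof(double)  = %lu\n", (unsigned long)sizeof(double));
    fprintf(out, "sizeof(void *)  = %lu\n", (unsigned long)sizeof(void *));
    fprintf(out, "offset of count = %lu\n",
            (unsigned long)offsetof(struct probe, count));
    fprintf(out, "offset of value = %lu\n",
            (unsigned long)offsetof(struct probe, value));
    fprintf(out, "sizeof(probe)   = %lu\n",
            (unsigned long)sizeof(struct probe));
}
```

Calling print_layout(stdout) from a test program built with the back-end compiler prints one line per quantity.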

Options

-logic-simp
Do simple logical simplification on the resulting C code before writing it.

-pseudo
Write pseudo-C code if the SUIF code cannot be represented by correct C code. This can happen, for example, if the SUIF code uses integer types with a size different from any of those available in the target C compiler. The default is to abort if correct C code cannot be written. In either case, a diagnostic message is printed to standard output.

The pseudo-C written sticks as close to C as possible. The idea is that even if the C code can't be run, the pseudo-C provides an easy-to-read way of looking at the contents of a SUIF file.

-keep-casts
Put a C type cast everywhere there was a SUIF convert instruction. The default is to fold type casts away where C would implicitly do type conversion anyway.
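
As a hand-written illustration of the difference (this is not actual s2c output, and the function names are hypothetical), the two functions below compute the same value. The first spells out the conversion the way -keep-casts would be expected to; the second relies on the implicit conversion that C performs on assignment.

```c
/* Conversion written as an explicit cast. */
double widen_kept(int i)
{
    double d;
    d = (double)i;
    return d;
}

/* Same conversion left implicit: C converts int to double on
   assignment, so the cast adds nothing. */
double widen_folded(int i)
{
    double d;
    d = i;
    return d;
}
```
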
-omit-header
Omit the header comment that normally comes at the very top of the output C file. By default this header is inserted to record when the file was created and exactly what version of s2c created it.
-no-warn
Do not issue any warning messages.
-always-intnum
When writing integer types, never use C types; always use pseudo-types of the form intn, where n is the size in bits, e.g. int8 instead of char.
-array-exprs
Write out SUIF array expressions directly as if C allowed array expressions. By default, SUIF expressions with array type are handled by casting their locations to pointers to dummy structures of the appropriate size and then using the structure type to access them. The semantics of C turn anything with array type into a pointer and implicitly take the location of it, so expressions of array type cannot exist in real C code. With this option, the output is not valid C, and cannot even be evaluated with any consistent semantics, because both arrays and locations of arrays are written the same way and cannot be distinguished. This flag exists to make certain unusual kinds of SUIF code more readable.
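
The default technique can be sketched by hand as follows. This is an illustration of the idea only, not actual s2c output; the structure and function names are hypothetical, and the dummy structure is assumed to have the same size as the array it stands in for.

```c
/* Hypothetical dummy structure standing in for the type int[4]. */
struct dummy_int4 {
    int elems[4];
};

/* Copy a whole array as one value by casting both locations to
   pointers to the dummy structure and assigning through them. */
void copy_int4(int dst[4], int src[4])
{
    *(struct dummy_int4 *)dst = *(struct dummy_int4 *)src;
}
```
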
-annotes-all
Write out all annotations on the SUIF code as C comments. Annotations used internally by s2c cannot be written out as comments, but all others will be.
-annotes-named name
Write out all annotations with the annotation name name. No other annotations will be written unless they are specified by other command-line options.
-annotes-opcode opcode
Write out all annotations on instructions with opcode opcode.
-annotes-object-kind kind
Write out all the annotations on one specific kind of object, where kind is one of the following:

fses -- all file set entries

nodes -- all tree nodes (except tree_instr nodes, because no permanent annotations are allowed on tree_instrs)

loops -- all ``loop'' nodes

fors -- all ``for'' nodes

ifs -- all ``if'' nodes

blocks -- all ``block'' nodes

all-instrs -- all instructions

symtabs -- all symbol tables

all-syms -- all symbols

vars -- all variable symbols

proc-syms -- all procedure symbols

labels -- all label symbols

var-defs -- all variable definitions

types -- all types

-gcc-bug
Write C code that uses a work-around to avoid a bug in the way gcc handles initialization of unnamed bit-fields. The ANSI rules say that unnamed bit-fields are to be ignored in initialization of structures, and s2c assumes this when writing output code. Current versions of gcc, however, try to assign initializers to the unnamed bit-fields, so all the initializers get assigned to the wrong fields, resulting in compile-time warnings and errors, and run-time failures. The work-around causes s2c to put in named fields to replace unnamed bit-fields whenever possible.
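
The two shapes can be illustrated by hand as follows. This is a sketch, not actual s2c output, and the field name pad0 is hypothetical. With a conforming compiler both objects end up with low equal to 1 and high equal to 2.

```c
/* Structure with an unnamed bit-field used purely as padding. */
struct with_unnamed {
    unsigned int low  : 4;
    unsigned int      : 4;   /* unnamed: skipped by initializers */
    unsigned int high : 8;
};

/* Under the ANSI rules, 1 initializes low and 2 initializes high. */
struct with_unnamed ansi_form = { 1, 2 };

/* Work-around shape: the padding is given a name, so every
   initializer lines up with a named field. */
struct with_named {
    unsigned int low  : 4;
    unsigned int pad0 : 4;
    unsigned int high : 8;
};

struct with_named workaround_form = { 1, 0, 2 };
```
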
-drop-bounds
Copy the bounds of ``for'' loops into temporary variables if they contain loads, uses of symbols whose addresses may have been taken (including global variables), or SUIF intrinsics that might be turned into control flow, such as io_max instructions or io_gen macros. The idea is to make it clear to the back-end C compiler that the bounds are loop invariant, so that loop optimizations such as software pipelining may be done.
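
A hand-written sketch of the transformation follows; it is not actual s2c output, and the names limit and bound_tmp are hypothetical. In the first function the store through a could, as far as the C compiler knows, modify the global bound, so the bound must be reloaded on every iteration. In the second the bound is visibly fixed before the loop starts.

```c
/* Global bound: a store through a pointer might change it. */
int limit = 0;

/* Bound read directly from the global in the loop test. */
void double_all_direct(int *a)
{
    int i;
    for (i = 0; i < limit; i++)
        a[i] *= 2;
}

/* Bound copied into a temporary before the loop. */
void double_all_dropped(int *a)
{
    int i;
    int bound_tmp = limit;
    for (i = 0; i < bound_tmp; i++)
        a[i] *= 2;
}
```
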
-ll-suffix
If the type of an integer constant is ``long long'' use the suffix ``ll'' and if the type is ``unsigned long long'' use the suffix ``ull'' instead of trying to cast the constant to the given type. This is an extension to the ANSI ``l'' and ``ul'' suffixes on integer constants. Since ``long long'' types are not part of the ANSI C standard, the standard says nothing of suffixes for integer constants to get these types. This is left as an optional switch because some implementations that accept ``long long'' might not accept these two new suffixes. The idea is to keep the default behavior as portable as possible.
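
The two spellings can be compared in a hand-written sketch (not actual s2c output; the function names are hypothetical). It assumes a compiler that accepts both the ``long long'' types and the ``ll'' and ``ull'' suffixes.

```c
/* Constant given its type with a cast. */
long long ll_by_cast(void)
{
    return (long long)42;
}

/* Same constant given its type with the ll suffix. */
long long ll_by_suffix(void)
{
    return 42ll;
}

/* Unsigned variants of the two spellings. */
unsigned long long ull_by_cast(void)
{
    return (unsigned long long)42;
}

unsigned long long ull_by_suffix(void)
{
    return 42ull;
}
```
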
-explicit-zero-init
Make all initializations to zero explicit. Normally, static data that is initialized to zero is given no explicit initialization unless it's the initialization of a structure or array with nonzero initializers to come, in which case some zero initializers are needed to position the nonzero initializers properly. This takes advantage of the fact that ANSI C guarantees all static data is implicitly initialized to zero if there is no explicit initialization. This flag overrides that feature to force explicit initialization even when the data is zero. This flag may be useful under some circumstances when using a Solaris C compiler as a back-end compiler, because the Solaris compiler reportedly puts explicitly and implicitly initialized data in different segments. This causes the source code of bison to break because bison redefines optind, which is also in libc, and which segment optind is in determines which version of optind the linker tries to use.
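
The difference can be shown in a hand-written sketch (not actual s2c output; the variable names are hypothetical). All three objects have static storage duration, so the first is zero even though it has no initializer.

```c
/* Default shape: zero-valued data with no initializer at all. */
int counter_implicit;

/* Shape with explicit zero initialization: the zero is written out. */
int counter_explicit = 0;

/* Written the same way in both modes: the leading zeros are needed
   to put the nonzero value in the third element. */
int table[3] = { 0, 0, 7 };
```
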
-limit-escape-sequences
Do not produce simple alphabetic escape sequences other than \n in string constants or string literals. That is, no \a, \b, \f, \r, \t, or \v. Instead, other representations, such as \007, are used. This is a work-around for some back-end compilers that don't recognize all the ANSI C required alphabetic escape sequences. The DEC OSF 3.2 cc compiler, for example, doesn't recognize \a.
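
On a system that uses ASCII, the alphabetic and octal spellings denote the same characters, as this small sketch checks. The variable and function names are hypothetical.

```c
#include <string.h>

/* Alert and horizontal tab written with alphabetic escapes. */
const char *alert_tab_alpha = "\a\t";

/* The same two characters written with octal escapes:
   \007 is alert (BEL) and \011 is horizontal tab in ASCII. */
const char *alert_tab_octal = "\007\011";

/* Returns 1 when both spellings denote the same characters. */
int same_escapes(void)
{
    return strcmp(alert_tab_alpha, alert_tab_octal) == 0;
}
```
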
-fill-field-space-with-char-arrays
Use character arrays to pad structures instead of bit fields, when possible. This is a work-around for back-end compilers that lay out bit fields differently.
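
The two padding styles can be sketched by hand as follows (not actual s2c output; the structure names and the field name pad are hypothetical). Whether the two structures end up with the same layout depends on the bit-field rules of the compiler in use, which is the reason this option exists.

```c
#include <stddef.h>

/* Gap between the fields filled with an unnamed bit-field. */
struct gap_bitfield {
    int first;
    unsigned int : 16;
    int second;
};

/* Same gap filled with a character array instead. */
struct gap_chars {
    int  first;
    char pad[2];
    int  second;
};
```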

Annotations

The s2c program reserves all annotation names beginning with the prefix ``s2c '' for its own use. Some annotations it uses internally, and some are recognized on input files and may be used to affect the behavior of s2c. The following are the annotations s2c recognizes on its input.

``s2c comments''
This annotation is used to specify comments to be inserted into the C code. Its data should be a list of any number of strings, each of which is interpreted as a separate comment. Each comment should consist of all the text between the ``/*'' and ``*/'' markers. Comments can be put on SUIF ``mark'' instructions, in which case they will be put on lines of their own, or they may be put on any other SUIF object and they will be put in the C code near the C code that most directly results from that SUIF object.

``s2c pragma''
This annotation is used to specify pragmas to be inserted into the C code. s2c looks for this annotation on file_set_entries, symbol tables, and io_mrk instructions. There may be multiple pragma annotations on an object. Each such annotation generates one pragma line in the output.

The exact form of the output line is the string ``#pragma'' followed by a space followed by the printed representations of the immeds in the annotation, separated by a space if there are multiple immeds. Strings are printed without extra quotes around them or any other interpretation, so a pragma annotation with a single string immed as its data allows the form of the pragma line to be specified exactly. All other immeds are printed in natural ways.

An example of the use of this annotation would be with a global symbol table containing an annotation of

["s2c pragma": "no side effects" <sqrt,0>]

which would give this output line:

#pragma no side effects sqrt

``s2c preamble pragma''
This annotation is just like the ``s2c pragma'' annotation, with two differences: it has an effect only when it is placed on the global symbol table, and its pragmas are printed before the declarations for the global symbol table instead of after those declarations.

``s2c pound line''
This annotation provides a general way to insert pre-processing directives into the C code. It is similar to the ``s2c pragma'' annotation except that instead of beginning with ``#pragma'' and a space, the lines in the C code begin with only ``#'' and no space, followed directly by the immeds. This allows ``#define'', ``#ifdef'', ``#line'', or any other pre-processing directive to be inserted into the code. Whoever creates these annotations is of course responsible for ensuring that they are valid pre-processing directives and that they interact properly with whatever else s2c puts in the C file.

An example of the use of this annotation would be with an io_mrk instruction containing an annotation of

["s2c pound line": "line 37 \"file.c\""]

which would give this output line:

#line 37 "file.c"

``s2c genop format''
This annotation may be used on the global symbol table to specify the way that SUIF ``io_genop'' instructions will be written. Note that ``io_genop'' instructions cannot in general be translated to valid C code, so this annotation is useful only with the -pseudo command-line option. In that case, s2c functions to make the SUIF code easy to read for a human. To that end, this annotation allows flexibility in the way ``io_genop'' instructions are written.

There are two forms that are recognized for the data of this annotation. The first is two strings. The first string specifies the name of a ``genop'' and the other is a format string for printing the ``genop''. The other form for the data is a string, then an integer, then another string. In this case the first and last string have the same meanings as before, but the integer specifies the number of arguments, and only ``genop''s with that number of arguments use that format string.

For each ``genop'' to be printed, if there is a format annotation specifying the right number of arguments, that is used. Otherwise, if there is a format annotation that does not specify a number of arguments, that is used. If neither of these cases applies, the default method of printing ``genop'' instructions is used: they are printed as function calls with the name of the ``genop'' as the function name.

Each format string is interpreted as follows. The ``%'' character is used as an escape in the format string -- other characters are generally printed directly. The following escape sequences are recognized:

* %% -- print one ``%'' character

* %a -- print one of the arguments

* %n text %m -- print all remaining unmatched arguments, separated by text if there is more than one, or nothing if there are no unmatched arguments

Note that there can be any number of ``%a'' directives, possibly before and after the ``%n'' directive, but there may only be one ``%n'' directive. Arguments to all ``%a'' directives are matched first, either from the beginning of the argument list for those preceding a ``%n'' directive or from the end for those coming after one. It is an error for there to be too few arguments to match all ``%a'' directives, but it is not an error to have too many arguments.

Within a ``%n'' directive, the text string may include ``%%'', which translates to one ``%'', but no other occurrences of ``%''.

Note that the default format if nothing else is given is equivalent to ``<name>(%n, %m)'' if there are any arguments or ``<name>'' if there are no arguments, where <name> is the name of the generic instruction.

EXAMPLES:

Output is shown for a ``genop'' with 0, 1, 2, and 3 arguments.

Format          0 arguments   1 argument   2 arguments     3 arguments
fun(%n, %m)     fun()         fun(op1)     fun(op1, op2)   fun(op1, op2, op3)
fun()           fun()         fun()        fun()           fun()
{%a: %n, %m}    <error>       {op1: }      {op1: op2}      {op1: op2, op3}
%a ? %a : %a    <error>       <error>      <error>         op1 ? op2 : op3
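
The matching rules above can be sketched in C as follows. This is an independent illustration written from the description in this section, not the code s2c itself uses; the names expand_genop and put are hypothetical, and the output buffer is assumed to have room for at least one byte.

```c
#include <string.h>

/* Append up to n bytes of s to the NUL-terminated buffer out. */
static void put(char *out, size_t cap, const char *s, size_t n)
{
    size_t len = strlen(out);
    if (len + 1 >= cap)
        return;                          /* buffer already full */
    if (n > cap - 1 - len)
        n = cap - 1 - len;               /* truncate to fit */
    memcpy(out + len, s, n);
    out[len + n] = '\0';
}

/* Expand fmt for the given arguments into out.  Returns 0 on success,
   -1 on a malformed format or too few arguments for the %a directives. */
int expand_genop(const char *fmt, const char *const *args, int nargs,
                 char *out, size_t cap)
{
    const char *p;
    int before = 0, after = 0, seen_n = 0, front = 0, back, i;

    /* Pass 1: validate and count %a directives before and after %n. */
    for (p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        switch (*++p) {
        case '%': break;
        case 'a': if (seen_n) after++; else before++; break;
        case 'n':
            if (seen_n) return -1;       /* only one %n is allowed */
            seen_n = 1;
            /* Skip the separator; only %% may appear before %m. */
            for (p++; !(p[0] == '%' && p[1] == 'm'); p++) {
                if (*p == '\0') return -1;
                if (*p == '%') { if (p[1] != '%') return -1; p++; }
            }
            p++;                         /* now on the 'm' */
            break;
        default: return -1;
        }
    }
    if (before + after > nargs)
        return -1;                       /* too few arguments */
    back = nargs - after;                /* first argument used after %n */

    /* Pass 2: emit the text. */
    out[0] = '\0';
    seen_n = 0;
    for (p = fmt; *p != '\0'; p++) {
        if (*p != '%') { put(out, cap, p, 1); continue; }
        switch (*++p) {
        case '%': put(out, cap, "%", 1); break;
        case 'a': {
            const char *arg = seen_n ? args[back++] : args[front++];
            put(out, cap, arg, strlen(arg));
            break;
        }
        case 'n': {
            const char *sep = p + 1, *end = sep, *q;
            while (!(end[0] == '%' && end[1] == 'm'))
                end += (*end == '%') ? 2 : 1;
            /* Unmatched arguments, separated by the text before %m. */
            for (i = before; i < nargs - after; i++) {
                if (i > before)
                    for (q = sep; q < end; q++) {
                        if (*q == '%') q++;   /* %% prints one % */
                        put(out, cap, q, 1);
                    }
                put(out, cap, args[i], strlen(args[i]));
            }
            p = end + 1;                 /* now on the 'm' */
            seen_n = 1;
            break;
        }
        }
    }
    return 0;
}
```

The sketch makes two passes because the ``%a'' directives that follow ``%n'' take their arguments from the end of the list, so their count must be known before the ``%n'' directive can be expanded.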

See Also

snoot(1)

History

Robert French wrote an s2c program for an earlier version of the SUIF system. Due primarily to limitations of the SUIF format of that time, the output of the early version wasn't quite correct C code, but instead provided a useful way to format a SUIF file for easy reading. Todd Smith updated portions of the code to compensate for drastic changes in the SUIF system. Chris Wilson made more updates and rewrote parts of the program to produce correct C output.

