Forum: CompilerIssue: CONSTANT-COMPILABLE-TYPES
References: CLtL pp. 56, 77-80, 324
Issue CONSTANT-MODIFICATION
Issue CONSTANT-CIRCULAR-COMPILATION
Issue CONSTANT-COLLAPSING
Issue QUOTE-SEMANTICS
Issue LOAD-OBJECTS
Issue CONSTANT-FUNCTION-COMPILATION
Category: CLARIFICATION, ADDITION
Edit history: 11/07/88, V1 by Cris Perdue
11/14/88, V2 by Cris Perdue
11/22/88, V3 by Cris Perdue
12/20/88, V4 by Cris Perdue
01/06/89, V5 by Sandra Loosemore (minor editorial
clarifications, expand discussion)
03/05/89, V6 by Cris Perdue (more response to comments,
especially from Moon and and from Loosemore)
03/05/89, V7 by Loosemore (more editorial tweaks)
03/13/89, V8 by Loosemore (discussion)
03/22/89, V9 by Loosemore (restructure)
04/03/89, V10 by Loosemore (amendment)
Status: Proposal SPECIFY passed March 1989.
Problem description:
CLtL does not specify what objects can be in compiled constants and it
does not say what relationship there is to be between the constant
passed to the compiler and the one that is established by compiling
and then loading its file. Relevant remarks in CLtL concerning file
compilation and the definition of QUOTE suggest that the compiler
handles constants in ways that are not actually possible.
Introduction to the proposal:
The proposal CONSTANT-COMPILABLE-TYPES:SPECIFY attempts to spell out
what types can appear in compiled constants, and what it means when
they appear.
The key is a definition of an equivalence relationship between Lisp
objects, "similarity as constants". Code passed through the file
compiler and then loaded must behave as though quoted constants in it
are "similar" to quoted constants in the corresponding source code.
Issue CONSTANT-COLLAPSING addresses the issue of whether, for two
objects that are not EQL in the source code (but which are similar as
constants), the corresponding objects in the compiled code may be
EQL.
Issue CONSTANT-CIRCULAR-COMPILATION addresses the issue of whether,
for two objects that are EQL in the source code, the corresponding
objects in the compiled code must also be EQL.
Comments within the text of the proposal are enclosed in double angle
brackets, <<like this>>.
Proposal: CONSTANT-COMPILABLE-TYPES:SPECIFY
An object may be used as a quoted constant processed by COMPILE-FILE
if the compiler can guarantee that the resulting constant established
by loading the compiled file is "similar as a constant" to the
original.
Some types of objects, such as streams, are not supported in constants
processed by the file compiler. Such objects may not portably appear
as constants in code processed with COMPILE-FILE. Conforming
implementations are required to handle such objects either by having
the compiler and/or loader reconstruct an equivalent copy of the
object in some implementation-specific manner; or by having the
compiler signal an error.
Of the types supported in constants, some are treated as aggregate
objects. For these types, being similar as constants is defined
recursively. We say that an object of these types has certain "basic
attributes", and to be similar as a constant to another object, the
values of the corresponding attributes of the two objects must also be
similar as constants.
This kind of definition has problems with any circular or "infinitely
recursive" object such as a list that is an element of itself. We use
the idea of depth-limited comparison, and say that two objects are
similar as constants if they are similar at all finite levels. This
idea is implicit in the definitions below, and applies in all the
places where attributes of two objects are required to be similar as
constants.
The following terms are used throughout this proposal:
The term "constant" refers to a quoted or self-evaluating constant,
not a named (defconstant) constant.
The term "source code" is used to refer to the objects constructed
when COMPILE-FILE calls READ, and additional objects constructed by
macroexpansion during COMPILE-FILE.
The term "compiled code" is used to refer to objects constructed by
LOAD.
Two objects are defined to be "similar as a constant" if and only if
they are both of one of the types listed below and satisfy the
additional requirements listed for that type.
Number
Two numbers are similar as constants if they are of the same type
and represent the same mathematical value.
Character
Two characters are similar as constants if they both represent
the same character.
<<Note that this definition has to depend on the results of the
Character Set proposals. The intent is that this be compatible with
how EQL is defined on characters.>>
Symbol
Issue COMPILE-FILE-SYMBOL-HANDLING defines how the file compiler
and loader handle interned symbols.
An uninterned symbol in the source code is similar as a constant
to an uninterned symbol in the compiled code if their print names
are similar as constants.
Package
A package in the source code is similar as a constant to a package in
the compiled code if their names are similar as constants. Note that
the loader finds the corresponding package object as if by calling
FIND-PACKAGE with the package name as an argument. An error is
signalled if no package of that name exists at load time.
Random-state
Let us say that two random-states are functionally equivalent if
applying RANDOM to them repeatedly always produces the same
pseudo-random numbers in the same order.
Two random-states are similar as constants if and only if copies of
them made via MAKE-RANDOM-STATE are functionally equivalent.
Note that a constant random-state object cannot be used as the "state"
argument to the function RANDOM (because RANDOM side-effects this
data structure).
Cons
Two conses are similar as constants if the values of their respective
CAR and CDR attributes are similar as constants.
Array
Two arrays are similar as constants if the corresponding values each
of the following attributes are similar as constants:
For 1-dimensional arrays:
LENGTH, ARRAY-ELEMENT-TYPE, and ELT for all valid indices.
For arrays of other dimensions:
ARRAY-DIMENSIONS, ARRAY-ELEMENT-TYPE, AREF for all valid indices.
In addition, if the array in the source code is a SIMPLE-ARRAY, then
the corresponding array in the compiled code must also be a
SIMPLE-ARRAY. If the array in the source code is displaced, has a
fill pointer, or is adjustable, the corresponding array in the
compiled code is permitted to lack any or all of these qualities.
Hash Table
Two hash tables are similar as constants if they meet the following
three requirements:
(1) They both have the same test (e.g., they are both EQL hash tables).
(2) There is a unique one-to-one correspondence between the keys of
the two tables, such that the corresponding keys are similar as
constants.
(3) For all keys, the values associated with two corresponding keys
are similar as constants.
If there is more than one possible one-to-one correspondence between
the keys of the two tables, the results are unspecified. A conforming
program cannot use such a table as a constant.
Pathname
Two pathnames are similar as constants if all corresponding pathname
components are similar as constants.
Stream, Readtable, Method
Objects of these types are not supported in compiled constants.
Function
Issue CONSTANT-FUNCTION-COMPILATION specifies how the compiler and
loader handle constant functions.
Structure, Standard-object
<<There is a cl-cleanup issue, LOAD-OBJECTS, pending which proposes
a mechanism for dealing with objects.>>
Rationale:
For the benefit of users, this proposal tries to define a fairly large
set of types that all Common Lisp implementations are to handle. It
also attempts to leave room for implementations to differ. Some
implementations have made opposing choices because the language
doesn't specify one over the other. Some implementations already
handle constants that this proposal defines as not valid in Common
Lisp programs, and that is useful to users of those systems.
Different implementors have different amounts of resources to apply to
their system, and implementations differ in their whole approach in
some cases.
This proposal appears to reflect user demand and appears not to exceed
the capabilities of most implementations of the language.
Current practice:
>From Gail Zacharias (Nov 14): "Coral pretty much implements this
proposal (I think we currently coalesce hash table keys, but that's
just a bug that will be fixed). We also fasdump packages (using the
package name) and compiled functions (but not foreign functions). For
symbols, we dump the name, and if (roughly speaking) the symbol would
get printed with a package prefix, we also dump the package name and
load the symbol into that package (otherwise it gets loaded into the
current load-time package)."
>From David Gray (Nov 9): "The Explorer can compile constant functions,
read tables, and hash tables; an error is signalled for a stream. A
package object used to break the compiler but in release 5 it has been
fixed to generate instructions to call FIND-PACKAGE on the package
name at load time." (Nov 15): [The Explorer does not guarantee
retention of displaced-to and displaced-index-offset attributes.]
"The Explorer also does not currently support dumping closures (either
compiled or evaluated), although non-closure compiled functions can be
dumped."
>From David Moon (Jan 24): "Symbolics Genera current practice: aside
from some current bugs we have with circular structures of certain
types and with preserving the identity of CONSes under EQ, this is
more or less consistent with our current practice, if you made the
changes implied by my earlier comments. We preserve the :displaced-to
and :fill-pointer array attributes. I doubt that we do what the
proposal says for hash-tables, readtables, and random-states. We
support dumping compiled and interpreted functions, but not closures,
which in effect means we don't support dumping functions."
>From Sandra Loosemore (Mar 3): "UCL currently can handle only
constants that are of type number, character, symbol, cons,
simple-vector, or string (which it turns into simple-string). It
signals an error if an attempt is made to compile any other kind of
object as a constant."
Adoption cost:
Not known. Probably moderate or low -- for most implementations. The
cost would be to implementors rather than users since this part of the
language is currently underspecified. The author believes the cost
will be reasonable for KCL, an implementation where there is some
concern about this issue.
This proposal is close to compatible with the Franz, Lucid, Coral,
Texas Instruments, and Symbolics implementations. It is probably
compatible or nearly compatible with other "Lisp Machine"
implementations.
Benefits:
Users would be able to use aggregate objects in constants with
confidence about the behavior of their code.
Conversion cost:
Where this proposal *requires* different behavior than an existing
implementation, there is a conversion cost for users of that
implementation. It appears that this cost will be small, less than
the cost of leaving things unspecified.
Esthetics:
Since there is no adequate definition at present, a fuller definition
would be more esthetic.
Discussion:
This proposal does leave some user-visible attributes of objects
unspecified across the compile-and-load process, except that they must
be consistent with the attributes that must be retained. This
situation is a compromise between the desire for full specification on
the one hand, and on the other hand the desire to leave freedom for
different implementations to remain different and to support some
optimizations such as compacting hash tables and "simplifying" arrays.
Proposals will be entertained for tighter specification of datatypes
such as arrays.
The definition of similarity for random-states supports the
possibility of random states that are immutable because of being in
compiled constants.
Readtables need not be supported by an implementation. If a readtable
contains only symbols to represent functions, here is Cris Perdue's
suggested spec for similarity of readtables:
Character syntax type for each character in the table;
function for each readmacro character, mappings for
dispatch macros; whether terminating or nonterminating
for each readmacro.
Interest has been expressed by a number of people including users, in
support for user-definable "dumping" of CLOS objects and structure
instances. The cleanup issue LOAD-OBJECTS deals with this.
This subsumes the issue CONSTANT-ARRAY-ATTRIBUTES.
Earlier versions of this proposal specified an additional constraint
on uninterned symbols, requiring EQLness to be preserved across the
entire file. However, this special case was removed because it was
thought that including it in this issue made its presentation
unnecessarily complicated, since preservation of EQLness is really a
separate issue (CONSTANT-CIRCULAR-COMPILATION). A consequence of the
decision to remove the special casing for uninterned symbols is that,
unless we accept one of the two CONSTANT-CIRCULAR-COMPILATION
proposals that requires EQLness of constants to be preserved, the
behavior for uninterned symbols will be rather strange. PCL will
reportedly break if uninterned symbols that are EQL in the source code
do not remain EQL in the compiled code.