-*- Dictionary: design; Package: C -*-

System building:

When we go to new data structure representations, Genesis will need to be
mostly rewritten. Since it is badly in need of a rewrite anyway, this is no
great loss. We should think about what a good bootstrapping process is
before we do the rewrite. In particular, we should consider what sort of a
bootstrapping environment we require. Since the compiler will run in any CL,
it would be nice to provide a bootstrap path that will also run in any CL.

Genesis has some fairly gratuitous implementation dependence. There isn't
any very good reason why Mach VM allocation and I/O are used to build the
image. Integer arrays could easily be used, and a "write-n-bytes" interface
(write-string in CL) would do writing with acceptable efficiency.

The convoluted interface between the loader and Genesis is also a
non-portability. Obviously we can't assume an arbitrary CL will have our
fasloader. We could allow conditional compilation of the fasloader for use
with Genesis so that it wouldn't clobber the real loader. But there are
still some problems, since the fasloader is fairly implementation dependent
for efficiency reasons. I guess these things are susceptible to inefficient
portable implementations, though.

Another possibility would be to go back to a reader-based cold-load format
distinct from the fasl format:
 - Genesis would be slower.
 - System hacking more difficult, since cold file can't be loaded (but not a
   big deal now that we have incremental compilation.)
 + Genesis much easier to understand, more portable (important, since
   Genesis is often modified in a port).
 + Remove cold-load considerations from normal dumper/loader, simplifying
   them and potentially improving efficiency (fasl format could be tuned to
   an implementation without breaking Genesis).
 + Fasl dumper can use CMU CL extensions, since it isn't used during a
   bootstrap.
 - Constant sharing won't happen in kernel functions (unless sharing is done
   by Genesis, which would be more tense anyway, since it would cross file
   boundaries.) Debugging info is probably the most important case.

Another bootstrapping problem with Genesis is that it often works by
creating the object at cold-load time, and then transducing the bits into
the cold-load image. This is a portability problem, since it uses internal
primitives to access the bits. It is also a bootstrapping problem, since the
bits may not be the same.

The biggest problem is with floats, since the target float format may be
different from the current float format (may have more precision). There are
similar problems with character bits, codes, etc. But I guess these aren't
really Genesis problems, since the lossage will first happen when the
compiler reads the constant in the source.

Also byte/bit ordering. Genesis should be parameterized by the target
byte/bit order (shouldn't care about bootstrap byte/bit order, since written
in CL). Code might be a problem in a bootstrap. Assembler might need some
sort of parameterizable byte swapping.

Structures are also a problem, in that there is no portable way to get at
the structure's guts, and the current non-portable way makes strong
assumptions about the bootstrap structure representation being the same as
the target representation. To do this right, we need some primitive in the
bootstrap implementation that will yield the slot values for a structure
along with the corresponding names.

One way to get this effect would be to alter the #S reader in Genesis so
that it reads as some recognizable cookie holding the value/name pairs. Then
all we would have to do is ensure that when the cold dumper prints
structures, it doesn't use any defined print functions. But the print
functions are defined using our DEFSTRUCT, so we can control that.
In a bootstrap, the print functions would never be in effect, since the
printer would be using whatever native database there is for print
functions. Once bootstrapped, we can suppress structure print functions by
some kind of printer flag.

Cross compilation would have to be done with our defstruct (instantiated
through the compiler's global environment, or through package hackery). The
cold dumper would dump our defstruct description format. Genesis would then
use this info along with the slot value/name pairings to construct
structures in the target format.

When we change the structure header to point to a type descriptor, rather
than just holding the symbol type name, life potentially becomes more
difficult for Genesis. Probably we would do something similar to the package
system: postpone the initialization of the type system until kernel core
initialization. We would build a data structure holding the names of all the
structure types dumped as constants along with all the instances. Then at
startup time, we can go and bash the headers to point to the appropriate
type descriptors.

The main uses for structure constants in cold load are type descriptors
(defstruct descriptions) and debug info.

There is also a bootstrapping problem with array types. Given the array type
cleanup, it is guaranteed that
    (typep (make-array ... :element-type <type>) '(array <type>))
But guaranteeing
    (typep '#.(make-array ... :element-type <type>) '(array <type>))
in a bootstrap seems impossible. The problem is that the bootstrap
implementation need not have the same specialized array types as the target
implementation. Since there isn't any way to specify array element type to
the reader, problems only arise when we construct constants at compile time,
and then at run time assume them to be of the constructed type.
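
The array-type guarantee above can be tried out in any CL. The following is
only a sketch: MAKE-TYPED-VECTOR is a hypothetical helper invented for the
illustration, and the element types are arbitrary choices. It shows that the
run-time guarantee holds, while the actual specialization chosen is up to
the implementation, which is why a constant built in the bootstrap CL may
not have the type the target expects.

```lisp
;;; Sketch only.  MAKE-TYPED-VECTOR is a hypothetical helper.

(defun make-typed-vector (type)
  "Build a small vector specialized (as far as this CL allows) to TYPE."
  (make-array 4 :element-type type :initial-element 0))

;; The run-time guarantee: true in every conforming implementation, since
;; both MAKE-ARRAY and the ARRAY type specifier upgrade the element type.
(typep (make-typed-vector '(unsigned-byte 8)) '(array (unsigned-byte 8)))

;; What the vector actually is depends on the implementation.  The result
;; may be (UNSIGNED-BYTE 4), (UNSIGNED-BYTE 8), T, or something else, so a
;; constant built in the bootstrap CL need not satisfy the target's notion
;; of (ARRAY (UNSIGNED-BYTE 3)).
(upgraded-array-element-type '(unsigned-byte 3))
```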
[But if we go through the printer and reader, then all array type
information other than string and bit-vector will be lost, so even once we
are bootstrapped, kernel code couldn't depend on specialized vector types in
constants.]

One attractive bootstrapping scheme would be to use a text-based cold load
format with Genesis, but the initial core would only contain the bare
necessities to run the loader: file and terminal streams, necessary system
interfaces, and the loader. When the core starts up, it just reads a file to
load, and loads everything else. We can eliminate any peculiarities of the
cold load process by loading the initial files again, this time as fasl
files. So Genesis can punt on dumping any part of the initial code that
isn't absolutely necessary for it to run (i.e. debug info). Also, Genesis
only needs to handle constants of types needed by the bootstrap loader.
Structure, i-vector and weird number dumping are probably unnecessary.

Note that using the text cold-load file representation doesn't eliminate the
need for the fasdumper to run in the bootstrap dialect, since we must
generate fasl files for the entire system. But this would really be
necessary anyway, unless we put the entire system (including compiler) into
the initial core, and always rebuilt the core for each change until the
compiler was running native.

 + Simplifies Genesis because only simple stuff needs to be dumped.
 + Makes initial core build faster and less frequently necessary, making the
   less efficient/more awkward text cold load file more feasible
   (simplifying Genesis.)


Structure fasdumping:

A better way to dump structures would be to dump them as slot name/value
pairs. This has the advantage of being insensitive to the actual structure
representation: the dumper doesn't need to know how the structure is laid
out. This also will result in much more reasonable behavior when loading a
constant for a structure type that has been redefined.
A possible implementation would be to dump the names as keywords, and then
fop-funcall the default constructor function. (This would require the
existence of a default constructor, but then so does #S.)
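
The name/value scheme might look roughly like the sketch below. None of
these names come from the real dumper or loader: POINT, *SLOT-INFO*,
DUMP-STRUCTURE, DEFAULT-CONSTRUCTOR and LOAD-STRUCTURE are all hypothetical,
the slot table stands in for whatever defstruct description the dumper would
really consult, and the MAKE-<type> naming of the default constructor is an
assumption. A plain APPLY stands in for fop-funcall.

```lisp
;;; Sketch only.  POINT, *SLOT-INFO*, DUMP-STRUCTURE, DEFAULT-CONSTRUCTOR and
;;; LOAD-STRUCTURE are hypothetical names, not part of the real dumper/loader.

(defstruct point x y)                   ; default constructor is MAKE-POINT

;; Portable CL cannot enumerate a structure's slots, so this table stands in
;; for the defstruct description: (type (slot-name accessor) ...).
(defparameter *slot-info*
  '((point (x point-x) (y point-y))))

(defun dump-structure (object)
  "Return (TYPE :SLOT VALUE ...), independent of how OBJECT is laid out."
  (let ((type (type-of object)))
    (cons type
          (loop for (name accessor) in (cdr (assoc type *slot-info*))
                collect (intern (symbol-name name) :keyword)
                collect (funcall accessor object)))))

(defun default-constructor (type)
  "Assume the default constructor is named MAKE-<type> in TYPE's package."
  (intern (concatenate 'string "MAKE-" (symbol-name type))
          (symbol-package type)))

(defun load-structure (dumped)
  "Rebuild a structure by applying the default constructor to the
keyword/value pairs, much as fop-funcall would at load time."
  (apply (default-constructor (first dumped)) (rest dumped)))
```

For example, (load-structure (dump-structure (make-point :x 1 :y 2))) builds
a fresh POINT with the same slot values. Because the slots are matched by
keyword rather than by position, the dumped form does not depend on slot
order or layout in whatever definition is current at load time.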