-*- Dictionary: int:design; Package: C -*-

loading of fasl files with non-binary extensions or with :contents :binary
doesn't work?  (slowloads?)
allocate value cells on cstack for no-xep functions
allocate bignum stub not known values.
Finish merging CLX top-level changes?
SSA flow optimizations?
CMU 16e release
compile for RT, EAPC
Glue paper
CLOS

The main things that will probably happen are:
 -- Motif-based inspector/debugger/UI
 -- Finish the HPPA port
 -- full ANSI compliance except possibly for PCL problems
 -- finish the space-saving byte-compilation option
 -- a port for >386 PCs running Mach & maybe Windows NT
 -- Merge Eliot Moss's generational GC

Might happen:
 -- a native CLOS
 -- We will probably add low-level light-weight-process support when
    integrating the GC.  We probably won't get around to adding all of the
    locks necessary to make things concurrent & re-entrant.
 -- We have an undergrad part-timer just starting work on an RS6000 port.

Structure stuff:

Layout:
 -- contains information needed by the runtime system for type tests & GF
    dispatching
 -- changes whenever an open-coded accessor would be invalidated
 -- may be more than one, because the type was redefined in-core, or because
    the compiler created it.  Normally not more than one valid layout.

Class:
 -- Is the "name" of a defined type
 -- Holds the current run-time info about this type, including the current
    layout.
 -- Not duplicated at compile-time when redefined for bootstrapping; only the
    layout and layout-info are duplicated.  see INFO TYPE COMPILER-LAYOUT

Defstruct description & other meta-info:
 -- Used by the compiler (and by introspective tools?)

Todo:
 -- defstruct terminates the top-level form so that subsequent code gets the
    new layout.  Put some cookie at the end of the expansion which forces a
    synch?
 -- bootstrapping: need to make a working compiler with the new stuff.
    Deal with accessor-for in old or new format?

Dylan:
 -- naming problems.  We'd like to be able to support class names that aren't
    Lisp symbols in the dumper.  One solution would be to continue to use
    load-time-value to reference Dylan classes (the bootstrapping issues can
    be ignored for now.)

PCL stuff:
 -- Fix structure-class stuff
 -- integrate class system w/ PCL
 -- use new object representation w/ layout instead of wrapper?  Old wrapper
    junk is in LAYOUT-INFO.
 -- Add new condition system with its own instance type.

Dylan: array/vector type system confused.  aref not generic.  Can you make
new array types?  New vector types?  Is a 1d-array a vector?

Top level hacks:

Add a new IR1 entry: ir1-convert-top-level-lambda.  This converts a lambda
and returns an XEP for it.  The XEP is tagged with the :top-level-external
kind, which prevents it from being deleted even when there are no references.
(flush the old :top-level-xep kind and top-level form coalescing.)

Maybe we should have a new original-function kind for top-level definitions?
Or not...  After the initial local call analysis, it would only confuse
things.  The main issue is that either this entry needs to pre-create the XEP
for top-level functions, or it has to somehow let local-call know that this
function needs an XEP (and shouldn't be deleted.)  I suppose we could also
have a per-component list of the top-level functions & maybe some other
information (a hashtable?)

I guess the :TOP-LEVEL component kind would be deprecated, since this is
mainly an efficiency hack to separate top-level code which installs global
definitions, and would no longer be necessary.
There is also the question of how we handle top-level code that can't be just
fasl-converted: one answer would be to fasl-convert a funcall to a new dummy
no-arg function.  We could then also eliminate the :TOP-LEVEL functional
kind.

How important is it to be able to efficiently close over top-level variables?
[i.e. not combine the top-level code with run-time.]  If we still want this,
then some of the existing top-level stuff will have to remain.  [And
analogous to the fasl funcall, we would also need a core-funcall. ?]

Instead of representing pending top-level forms just as a list of lambdas,
have a more general format where entries can also be a call to some function
with constant or top-level-lambda arguments.  This can be converted into a
fop-funcall with arguments.

new eval-when semantics

Local allocation:

Must break TN load/save out of the VOP generator.  Change vop-info to
reference lists of vop-op-info structures which include costs, load-scs, and
additionally a load-functions vector to find the load function, and a load
predicate function (or NIL).

Jam together optimized saving and load-tn allocation.  Allocate the same
load-tn for adjacent uses of the same constant or other TN.  In code
generation, we do necessary loads before calling each generator, and write
back results after calling.  But if we have previously loaded a particular
load-tn, then we need not load it again.  Except for debugger interactions,
we can delay doing any saves of result load-TNs until the end of the block,
or until we need to free up the register.  This could save some stores in the
case where we have multiple writes to the same TN within a block.  Probably
pretty rare, and SSA conversion will eliminate any such constructs.

Might want to change saving to be more implicit, and to work using the same
mechanism.  This would minimize the risk of spurious moves, such as
initializing a stack slot and then reloading the value, even though the
initial value is still in a register.

Library:
    pseudo-scheme: with-standard-io-syntax for translator?
	pprint-table hackery for #t, #f, ().
    new MK utilities, JZ?
    new feebs
    Ilisp
    series, other Waters hacks?
	arm:{mk,jz}-packages.txt
	pseudo-scheme, rees, comp.lang.scheme
    aliens

Inspect and describe?

Should it sort-of-work to load-foreign, save-lisp, restart, then
re-load-foreign?  Of course, if the code isn't reloaded at the exact same
address, then all references will be wrong.

Documentation:
    Format latexinfo stuff 2-sided
    no REMOTE package, really all in WIRE
    *suppress-values-declaration*?
    DEBUG:*STACK-TOP-HINT*
    DI:FLUSH-FRAMES-ABOVE
    Move encapsulation documentation elsewhere?  Is it ENCAPSULATED-DEFINITION
	or RAW-DEFINITION, or what?
    One naive question that the user manual was unable to answer:
	How do I quit Lisp cleanly?
    update doc-diff to understand latex aux files.
	#| Function indexing case-sensitive. |#
    Add \fill when notes cause lines to wrap.
    Fix documentation for fasl file types.
    Talk about interpreter.  Consistently use "evaluator" vs. "compiler".
    Many comments still refer to "passing locations" where this concept is
	now mostly obsolete.
    Time GC overhead?  Document room.  Document vm:space-usage, etc?
    document & export *use-implementation-types* as a portability feature?
    What is the meaning of :constant SCs in define-sc?  Other aspects of the
	VM definition.
    document debug:*auto-eval-in-frame*
    Glossary of compiler error messages.  (assign error numbers, or terse
	ids?)
(apply #'+ (mapcar #'sqrt list-of-numbers)) is bad; you might mention the
CLtL2 alternative:
    (reduce #'+ list-of-numbers :key #'sqrt)
[Except reduce is horribly inefficient...  Probably not much worse than
apply #'+, though.]

ANSI:
    defstruct keyword constructor should not bind slot name vars.  (See the
	example after the bug list below.)
    stream external format, external format in OPEN, etc.
    New eval-when/top-level semantics.
    dynamic-extent
    EQUALP defined on hash-tables.
    MAP-INTO
    Any type can be a declaration specifier?
    Free declarations do not include initforms?  In lambda arg defaults?
    What types can type spec args be?  Is (rational 0.5) o.k.?
    compiler severity conditions
    Signal appropriate conditions instead of always SIMPLE-ERROR.
    CLOS conditions/define-condition semantics
    CLOS type system integration
    structure-class semantics
    make-load-form GF

CLOS symbols:
    ADD-METHOD ALLOCATE-INSTANCE BUILT-IN-CLASS CALL-METHOD CALL-NEXT-METHOD
    CHANGE-CLASS CLASS CLASS-NAME CLASS-OF COMPUTE-APPLICABLE-METHODS
    DEFCLASS DEFGENERIC DEFINE-METHOD-COMBINATION DEFMETHOD DESCRIBE-OBJECT
    ENSURE-GENERIC-FUNCTION FIND-CLASS FIND-METHOD GENERIC-FUNCTION
    INITIALIZE-INSTANCE INVALID-METHOD-ERROR MAKE-INSTANCE
    MAKE-INSTANCES-OBSOLETE MAKE-METHOD METHOD METHOD-COMBINATION
    METHOD-COMBINATION-ERROR METHOD-QUALIFIERS NEXT-METHOD-P
    NO-APPLICABLE-METHOD NO-NEXT-METHOD SLOT-BOUNDP SLOT-EXISTS-P
    SLOT-MAKUNBOUND SLOT-MISSING SLOT-UNBOUND SLOT-VALUE STANDARD
    STANDARD-CLASS STANDARD-GENERIC-FUNCTION STANDARD-METHOD STANDARD-OBJECT
    UNBOUND-SLOT UNBOUND-SLOT-INSTANCE UPDATE-INSTANCE-FOR-DIFFERENT-CLASS
    UPDATE-INSTANCE-FOR-REDEFINED-CLASS WITH-ACCESSORS WITH-SLOTS

Conditions:
    CELL-ERROR-NAME PRINT-NOT-READABLE PRINT-NOT-READABLE-OBJECT
    SIMPLE-CONDITION-FORMAT-STRING => SIMPLE-CONDITION-FORMAT-CONTROL
    PARSE-ERROR WITH-CONDITION-RESTARTS
    COMPILE-FILE-PATHNAME COMPILER-MACRO (?) DEFINE-SETF-EXPANDER
    GET-SETF-EXPANSION DYNAMIC-EXTENT FILE-STRING-LENGTH FUNCTION-KEYWORDS
    MAKE-LOAD-FORM MAKE-LOAD-FORM-SAVING-SLOTS MAP-INTO SPECIAL-OPERATOR-P
    STREAM-EXTERNAL-FORMAT STYLE-WARNING

CLtL1 compatibility:
    COMPILER-LET DEFINE-SETF-METHOD GET-SETF-METHOD
    GET-SETF-METHOD-MULTIPLE-VALUE SPECIAL-FORM-P STRING-CHAR STRING-CHAR-P

Probably correct as they are (what is COMPILER-MACRO?):
    COMPILER-MACROEXPAND COMPILER-MACROEXPAND-1

SPARC usability:
    Stack overflow/recursive errors.

Stuff for other ports:
    don't need generic <=, >=, since they are never used.
    reserved locations in SC definition.  nl5, l2.

Sunos:
    Put all bugs in bugs.txt
    clear-input doesn't work (on PTY?)

Usability:
    SYSTEM function?
    tree shaker
    native PCL
    Sparc tuning
    Gen GC
    More ports (HPPA, 386, RS/6000)
    Space tuning
    Testing
    Improve environment (graphic debugger, new Motif interface, etc.)

RT Stuff:
    RT float stuff
	do mode hackery: operations to get/set the two status regs
	lisp code for floating-point-modes, (setf floating-point-modes)
	make float exceptions precise, sync'ing if necessary.
	change debugger float access to use make-single/double-float instead
	    of having sap ref vops?

Bugs:

(defun create-default-constructor (defstruct creator)
  (collect ((arglist (list '&key)) (types) (values)) ; o.k. if vals, dies
						      ; with assert failure o/w.
    (dolist (slot (dd-slots defstruct))
      (let ((dum (gensym))
	    (name (dsd-name slot)))
	(arglist `((,(intern (string name) "KEYWORD") ,dum)))
	(types (dsd-type slot))
	(values dum)))
    (funcall creator defstruct (dd-constructor defstruct)
	     (arglist) (values) (types) (values))))

expt, sin, etc., don't seem to be doing argument count checking.

Do something about the save-lisp fd-stream interaction.
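To illustrate the keyword-constructor item in the ANSI list above (BOX,
WIDTH and HEIGHT are made-up names for the example): under ANSI semantics a
slot initform is evaluated in the environment of the DEFSTRUCT itself, so it
must not see a binding of another slot's name to the supplied value.

    (defvar width 7)          ; outer binding that happens to share the slot's name

    (defstruct box
      (width 1)
      (height (* width 2)))   ; ANSI: this WIDTH is the DEFVAR above, not the slot

    (box-height (make-box :width 100))
    ;; => 14 under the ANSI rule.  A keyword constructor that (incorrectly)
    ;; binds slot-name variables around the initforms would return 200.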
In: DEFUN IMAGE-NOSWAP
  (INDEX* SRCINC (INDEX1- HEIGHT))
--> THE VALUES PROG1 LET
==> (* (THE ARRAY-INDEX SRCINC) (THE ARRAY-INDEX (INDEX1- HEIGHT)))
Note: Unable to recode as shift and add due to type uncertainty:
  [there was no message....]

make the DEBUG quality be in LISP, not EXT.  Flush the old DEBUG-INFO
quality.

Bogus frames instead of interrupted frames when you get an internal error in
the debugger?  Causes the wrong function name to be printed, because we
search for the first interrupted frame.

Some race condition where a closed file may not yet exist for TRUENAME, e.g.
in COMPILE-FILE printing the output file?

inspecting dotted pairs seems broken.

i/o of broken weak pointers seems broken.  This apparently breaks the
inspection of structures with weak pointers in them.

==== Bug
COMPILED-FUNCTION-P returns T on an interpreted function such as #

==== Suggestion
format-universal-time:
    :PRINT-MERIDIAN NIL doesn't print in 24h-format, just skips am/pm.
    should be able to define new formats
    should be able to set the default format
    should perhaps print unknown timezone in "rfc822" format (+0100 for -1,
	etc.)

concatenate transforms can't really hack full-sized strings, since they want
to represent bit offsets as fixnums.

XOR boxes don't work in the inspector on a color display.

Why are all your scripts in csh?  I think an sh script is generally more
portable.  I could not run some of your csh scripts on a standard Sun.  If
you want, I can give you the equivalent sh scripts.

lose more gracefully when some loser converts a dotted list.

CLOS functionality:
    Integrate with type system
    CLOS-ify condition system (?)
    integrate make-load-form

CLOS run-time compilation reduction:

More args:
    Add &more, more-argument and more-values.  Need a new more-values VOP.
    When a use of more-values is the only argument to a tail MV-call, then we
    can avoid an extra BLT by passing the more-arg base pointer to
    TAIL-CALL-VARIABLE.

    Change discriminating functions to use a single &more arg, and pass
    values through with more-values.  Need fewer discriminating functions,
    since we don't need a variant for each number of fixed args: only for
    different patterns of discriminators.

Figure out how precompilation works (or why it doesn't work.)

Add special-case threaded interpreters for:
    LAP code
    secondary discriminating functions
    method combinations

Figure out what the cached customized method stuff is doing, and how often it
will run-time compile.

Get some sort of CLOS benchmark.

CLOS instrumentation:
    How often can IVs be assigned a single offset?
    Properties of the class hierarchy: How close to single-inheritance?  Is
	"mixin" usage recognizable, and can it be special-cased?
    What are the dynamic properties of GF dispatch?  How predictable is a
	particular call?  How well would the "same as last time" heuristic
	with methods validating the cache work?
    How multi are methods?  Can you statistically associate methods with
	classes in a conventional way?
    How many classes are there?  Could every class have a class-indexed
	mapping table to tell where methods and IVs are located in that
	subclass?  That is, how big is N^2?

CLOS peak speed:

Maintain a database about each program which contains information about the
structure of the class hierarchy, what IVs there are, what methods are
defined, and where.  This helps prevent us from making mistakes at compile or
load time about where to locate IVs, which codes to assign to generic
functions, etc.
In a finalized compilation mode, this information can be assumed true at
compile time, and we will give a compile-time error if we discover that it
wasn't really right.

One issue is how we locate the source when we go back to compile combined
methods and generic functions.  It seems like it would result in a
prohibitive heap size to keep all the code in core as s-expressions.  For
methods that directly appear in the source, we can just record the source
location.  Any methods resulting from macroexpansion I guess we can write out
into temporary files, and record that location.

This sounds like earlier ideas about automatic block compilation.  I guess we
could also record information about the call graph, and maybe do some block
compilation too.  The main difference between this and the earlier ideas is
that the information should persist separately in the filesystem, and would
be generated by the compiler.  This is instead of being in-core and being
generated at load-time.  So it would not be necessary to load a system in
order to compile it, and in the non-frozen mode, you could keep using an old
system description for as long as you wanted (though it would pay to update
it eventually.)

This also sounds kind of like the defsystem/modules/compilation-order ideas
we were kicking around.  I suppose we want a moderately general database with
a functional interface, instead of simply a data-file format.  Can you say
OODB?

It seems kind of attractive putting summary information in fasl files, but
not really.  In our current practice, we often delete fasl files.  Also, the
information should all be semantic or source-textual, and independent of the
target machine.  So you could share one database across multiple
architectures.  So it seems that the database is more source-like.  But you
might not want to have parallel files in the source directory:
 -- it would clutter them
 -- you might want to have one file in more than one configuration.
 -- They aren't really source files, wouldn't want to be managed with RCS,
    and would want to be easily deletable.
 -- people might want to base derived configurations on system software, such
    as defining subclasses of system classes.

Anyway, it seems we would want the database more-or-less parallel to the
source tree.  But the structure of the database would reflect the program
structure, and not the source tree structure (except insofar as they are the
same.)  For example, you might have a search-list database: where
database:/foo.fndb would have the information about the FOO function in some
package.  We don't have to use a separate file for *everything*, but using a
file for each extensible region means we can get the filesystem to do most of
our storage management.  Might want to have a C program to sort and index the
information dumped out by Lisp.

Or for something less general, how about supposing that all the information
(other than the source itself) can be kept in core.  And we have this file
format which represents deltas against the database.  You can either write
out the entire database, or write out deltas.  Deltas can be incrementally
appended to the database file, giving an updated database.  Writing the
entire database produces a "rationalized" version with the minimal number of
deltas.  Or I suppose you could do something in-between.

You certainly would want some capability for multiple databases
(sub-databases) for e.g. system code vs. user code.  Splitting by packages
would still make sense, though users also need to be able to define deltas
w.r.t. system code, e.g.
to subclass STREAM.  The main point here is that if a given programmer works
on several unrelated programs, there is no need to suck in the database for
everything.

How does this integrate with GLOBALDB (if at all)?  It does seem like this
would supersede it, though we would presumably keep about the same interface
and in-core data structures.  But we need a more powerful interface that can
describe things such as the slots in a type descriptor hanging off of some
info type.

Could have a different Fdefinition object for each call, and have the called
function validate the call and clobber it to point to a new cached method if
a different one applies this time.  The Fdefinition object has the raw PC, so
we can actually vector into different places in the code object, e.g. to call
a function that validates the cache, vs. one that knows it is valid.

What do you have to do to validate a last-time-call cache?  We would assume
that any class or method redefinitions would explicitly invalidate the cache.
So, the only issue is whether the discriminated classes are the same and the
EQL specializers match.  If the class match is not exact, then the cache
*might* be invalid.  But if we happen to know that there are no methods on
subclasses, then any subclass is o.k. too.  So if there was a quick way to
test for subclassness, then it might be interesting to cache the method for
the largest superclass which has an unambiguous method.  (A sketch of such a
cache check appears below.)

The cache data is the function that you call (stored in a fdefn).  You also
need a cache tag to determine if the data is valid.  So we need to be able to
associate some tag information with each fdefinition (at a call site.)  For
the classic single-object dispatch, we only need the class for which the
cache is valid.  But, in general, we need multiple caches and EQL objects.
Note that the representation of the tag information can be method-specific.
Whenever we jump in at the validating entry, we can assume that the tag is in
the format expected by that method (as long as the cache miss routine can
figure that out and initialize it.)  So we could just have room for one
cached class, and use an indirect vector when we need more information.  We
could use the FDEFN-NAME slot for the cache tag.

We also need to somewhat maintain a global data structure of all the call
sites so that we can invalidate them on redefinition (but this data structure
could be discarded if redefinition is forbidden.)

Interestingly, the cache-last strategy handles streams quite well, which is a
case that method cloning can't really deal with.  And if cache-last is
failing, it could be that cloning could solve the problem.  But with things
like "redisplay yourself" or "optimize yourself", you would often have a call
site that iterates over many objects, doing the same operation on each.  In
this case, the best you can do is a jump-table.  Most conventionally, the
jump-table (vector of function pointers) is associated with the receiver's
class.  You index or hash into the table according to the operation (generic
function) being done.  PCL is different in that it puts the dispatch table in
the generic function.  This is encouraged by multi-methods and EQL
specializers, but you could automatically introduce secondary dispatch code
after the primary dispatch.

Secondary dispatch code (especially for EQL specializers) is a lot like
combined methods, in that we would really like to smash them together into
one function.  You can think of EQL methods as being conditionally run by the
method combination.
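A minimal sketch of the last-time-call cache check described above.  All
structure and function names here are hypothetical; the real thing would live
in the fdefn/raw-PC machinery rather than in a portable structure.

    (defstruct call-site-cache
      (cached-function nil)   ; method function to call when the cache hits
      (cached-class nil)      ; exact class of the primary argument last time
      (miss-function nil))    ; full discrimination; also refills the cache

    (defun dispatch-through-cache (cache primary &rest more-args)
      (if (eq (class-of primary) (call-site-cache-cached-class cache))
	  ;; Hit: same exact class as last time, so the cached method applies
	  ;; (assuming redefinitions explicitly invalidate the cache).
	  (apply (call-site-cache-cached-function cache) primary more-args)
	  ;; Miss: do the full dispatch and let it update the cache fields.
	  (apply (call-site-cache-miss-function cache) cache primary more-args)))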
Must the primary key for a GF always be the first argument?  Need it bear any
relationship to the argument precedence order associated with the GF?

One way to implement the primary key approach would be to associate an
integer ID with each GF (at least if named.)  Each class could have an
operation lookup table with a base and length.  Just subtract out the base,
and see if (<= 0 index length).  If in range, index into the table and call
that function.

Might consider having the last-call cache jump to different local miss
routines, depending on how it missed.  In particular, the cache check must
see if all of the arguments are still instances.  If we miss, but the primary
key is still an instance, then we might want to do some inline (in the
method) cache filling for the "frob yourself" case.  So if the next method
isn't the same as the last, it would directly chain to the next method,
instead of invoking some horrible out-of-line cache filling.

Yet another problem is what you do with first-class generic functions.  PCL
does this pretty well by associating methods with the GF, but quite possibly
to the detriment of the normal case where a given compiled call is calling a
known GF with a known number of arguments.  It seems that the best solution
might be to regard this as a totally separate problem.  In particular, the
result of GENERIC-FUNCTION is probably best treated more like syntactic sugar
for TYPECASE (any method for speeding up subclass tests would be
advantageous, e.g. FREEZE-TYPE.)

Speed up IV access:
 -- Structure-like representation eliminates indirection.
 -- A freeze-type declaration allows the slot offset to be determined at
    compile time.

Speed up GF dispatch:
 -- More args?
 -- Improve code for PLAP or rip out the LAP stuff.  Where is LAP used?
 -- Delete effectless VOPs in the lifetime post-pass when we discover that
    the results are all dead?  Would eliminate spurious initial values.
 -- Change BUILTIN-WRAPPER-OF to be
    (svref *builtin-wrapper-table* (get-type x))
 -- Eliminate secondary dispatch functions?
 -- Compile-time optimizations:
    -- Use fdefinition objects and call-named for methods when we can
       determine which method is being called.  At load-time, we would
       resolve the signature to a fdefinition object.  This still allows
       methods to be incrementally redefined, and if we are careful, also
       allows new methods to be defined.  To support definition of new
       methods, we need to be sure that all calls which might end up calling
       different methods use different FDEFNs.  After all methods are loaded,
       we can determine which calls can directly call a method (if the
       signature is unambiguous) and which calls must call some sort of
       discriminating function (which might be specialized to incorporate
       information from the signature.)
    -- Have type specifiers to specify an exact class; freeze-type specifies
       no new subclasses.  This can help to determine which method.  For
       example, if a method arg is a frozen leaf class, then we know that the
       arg is of that exact class.
    -- Duplicate methods for particular exact argument classes so that the
       method can be compiled with exact argument class knowledge (both for
       IVs and GFs.)

If we use any interpreted implementation strategies, have an adaptive
fallback to native compilation for code that ends up being used heavily.

Speed up make-instance:
 -- Either make the constructor mechanism work automatically, or
 -- Figure out a better way.
 -- There should be some way to get structure-like allocation efficiency.

Are combined methods efficient?  i.e.
call-next-method shouldn't do anything; all the code should be precompiled
together.

CLOS format unification w/ structures & bootstrapping:

Change structures to have a type descriptor instead of a name.  Points to
DEFSTRUCT-DESCRIPTION, which inherits WRAPPER.

Put PCL in the cold load?  Or at least, allow constant instances and some
other stuff in the initial core.

PCL w/o the compiler or real interpreter?  It seems doable, as long as we can
precompile or interpret all of the discriminating functions, method
combinations and secondary dispatch functions that we will normally need.
Don't need to handle weird method combinations w/o the compiler.

What is all that bootstrap shit doing anyway?  Can we dump out all the
interesting classes, wrappers, etc. as constants, and use the make-load-form
stuff to back-patch?  Actually, we probably want to have genesis allocate the
wrappers and set up vital slots, but the actual class structure can be
glommed onto this at initialization time.  This is because we need to be able
to do structure-eq type tests very early on, so structure instances need to
have the correct wrapper.  But then functions can only get their hands on the
correct wrapper to do an EQ test by using load-time eval, so perhaps this
isn't a hard constraint.  We would, however, have to ensure that nobody
attempts a type test until we have both initialized all the tags and also
built a data structure that can be used to find the correct tag for a type
name (for load-time-value processing of tag refs.)

Ultimately structure wrappers would be in the INFO TYPE STRUCTURE-INFO (or
whatever it is called.)  But during initialization, we would need a
non-structure-based data structure (i.e. an alist or property) which would do
the mapping for us.  For structures, genesis could just create the wrappers
as needed, and record them on the plist of the name.  When we actually run
the defstruct form, we look on the plist to see if we have already created
the info.  We can fill in the various wrapper slots at that time.  (Possibly
a few wrappers might have to be hand-initialized early on.)

Note that if we use the ordinary load-time-value mechanism to get our hands
on the wrapper in type tests, then we won't be able to do a type test until
the top-level form for the testing function has run.  This is different from
the current situation, and would probably break everything.  What we could do
is dump these wrapper references using a special FOP, and have Genesis
resolve those references as well.

Create fundamental metaobjects as DEFSTRUCT structure-classes, then change
the metaclass once you've got them.  The MOP generally doesn't use MI (not at
all?)

Generic functions definitely won't be set up by genesis, and unless
absolutely necessary, won't be special-case bootstrapped ahead of the time
that their top-level form runs.  It seems that the worst circularity is in
using GFs to determine the applicable methods for a GF.  This needs to be
explicitly bottomed out at some point.  It does seem kind of gratuitous to
have both "early" and "magic" GFs, though.

Code cleanup:

Perhaps when there are multiple default defstruct constructors, we should
declare them so that they are known to be defined.

Yanking a big input into a shell buffer running a Lisp doesn't work.  (on
mach?)

Report GC overhead in TIME.

Debugger cleanup:

function-end breakpoints don't work for the :known-return convention.

In debug source, store the original pathname, not the truename, if it is an
absolute pathname.  Note that we must define target: by default for this to
work.
Dynamic state restoring for debug-return.

make function-debug-function return the main entry

Editor breakpoint stuff is broken.  Doesn't actually set breakpoints.
Sometimes prompts to choose between one breakpoint.  The delete-buffer hook
doesn't work when the slave is dead.  breakpoint-lists should be in the
server-info, not a global list.

Function-end break locations not known locations?

Hide internal frames (use the function address to determine.)

In the debugger, an error always seems to report the original error function,
rather than the function that actually got the new error.  (The new escape
frame is not recognized as such, so the old one is used.)

Interpreter environment and source context access commands (and primitives).
Interpreter breakpoints.  (function-end as well.)
debug-return
Stepping commands.

Make sure that all the stuff exported by DI is available through a command:
catchers, etc.  Some occasionally useful old commands are missing.

We probably want some switches to make old code and users work with the new
compiler.

Compiler cleanup:

Is there a problem with merging tail sets in local call conversion now that
we un-tailify calls?

Problems with block termination:
 -- It isn't the case that a terminated call to a known function can be
    trusted not to return, since someone might always say
    (the nil (cons a b)).
 -- When there is a NIL assertion on a known function, we can terminate the
    block and then optimize away the call (e.g. because its value is unused.)
    This results in arbitrary code jumping to the component tail.  In
    particular, a conditional may transfer directly to the tail.

When both args to check-bounds are constant, shouldn't it just do the check
at compile time?  It currently doesn't.

Add an "Internal Error" banner when the compiler goes into the debugger.

Add a handler around eval-when (compile).

Invalidate IR1 conversion of interpreter functions when macros are redefined.
(Keep track of expanded macros in the IR1 COMPONENT, possibly interesting in
compiled code as well.)  We just scan the cache, looking for functions that
expand the macro being redefined.  Could speed this up by having a hash table
of all macros expanded by some function in the cache.  We would have to
search if the macro is present in this table.  An entry could be deleted if
we do the search and don't find any current uses.  (A sketch of such a table
appears below.)

Misc:

Bug report debugger command?  Would stick a backtrace together with
print-herald, "df" "/usr/cs/etc/version" and user comments in a message and
mail it to gripe.

*inhibit-tail-recursion*
    Prevents any full calls from being considered tail-recursive.  [Do a
    post-pass to IR1 clearing tail-p flags.]

*ignore-type-declarations*
    Ignore all type declarations.  Makes code with broken type declarations
    work.

In: DEFUN WRITE-AND-MAYBE-WAIT
  (* (FLOAT COUNT) 10.0)
Note: Forced to do GENERIC-* (cost 25).

Some load nops not being deleted that should be:
VOP RETURN t15[CS0]>t36[NL0] t18[CS1]>t37[CNAME] t35[A0] {1}
A0:     LW      NL0, CFP, 0
A4:     NOP
A8:     LW      CNAME, CFP, 4
AC:     MOVE    CSP, CFP
B0:     MOVE    CFP, NL0

Fix type error in clx/text.lisp text-widths.

Flush function-info-predicate-type now that we aren't using it for structure
types?  Change primitive predicates over to using it?

Reference to a malformed function name is apt to flame out in FBOUNDP because
the globaldb stuff just passes the name through.

Doing force-output on the compiler-error stream is causing output to be
forced on the error file all the time.  Something of an inefficiency, though
probably dwarfed by the overhead of the terminal output.
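A rough sketch of the macro-expansion dependency table suggested in the
macro-redefinition note above.  Names are hypothetical, and
FLUSH-CACHED-FUNCTION stands in for whatever actually invalidates a cache
entry.

    (defvar *expanded-macro-users* (make-hash-table :test #'eq)
      "Maps a macro name to the cached functions whose IR1 conversion expanded it.")

    (defun note-macro-expansion (macro-name cached-function)
      ;; Called during IR1 conversion whenever MACRO-NAME is expanded.
      (pushnew cached-function (gethash macro-name *expanded-macro-users*)))

    (defun invalidate-macro-users (macro-name)
      ;; Called when MACRO-NAME is redefined: flush only the dependent entries.
      (dolist (fun (gethash macro-name *expanded-macro-users*))
	(flush-cached-function fun))            ; hypothetical cache flusher
      (remhash macro-name *expanded-macro-users*))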
Stuff for other ports:
    update the instruction database for nop deletion.
    add emission of entry-point in call, static-fun and assembly/support.
    don't need generic <=, >=, since they are never used.
    change SPARC over to move to/from word/fixnum.  Add notes.
    move to/from word/fixnum: allow descriptor-reg.
    reserved locations in SC definition.
    eql/fixnum
    fix check-fixnum primitive type template definition
    make sure sap allocators (and other move functions?) have notes.

Better error if an immediate-constant TN's SC is not allowed by the primitive
type (i.e. is not a constant SC of any allowed SC.)

Change ir1-finalize to compare the derived function return type with the type
for any previous definition, giving a note if the new type is not a subtype
of the old.  Requires remembering the old defined type somewhere.

Vector structure linearization in GC.  Scavenge the last descriptor object
transported before resuming scavenging at the clean pointer.

GC advice (GGC interaction..)

Compiler support for clearing unused storage.

Correctness issues, backward compatibility enhancements:

Change *trace-print-level* to default to the plain *print-level* if its value
is NIL.

Export *EVENT-NOTE-THRESHOLD* from EXT: (?)

Series:
    scan-array array &rest dimensions
    collect-array series shape &rest dimensions
	Dimensions is the dimension numbers for the order in which to scan
	the array (i.e. row-major is 0, 1, 2, ... and column-major is n, n-1,
	n-2, ...).  If no dimensions are specified, scan in row-major order.
	The SCAN-ARRAY result is alterable.  SHAPE is a list of the sizes of
	the dimensions, as returned by ARRAY-DIMENSIONS.

    Or maybe:
    SCAN-ARRAY array &rest index-series
    COLLECT-ARRAY shape series &rest index-series
	The index-series args are used to form the array indices for
	accessing a series of array elements.  The result length is the
	length of the shortest index series.  If no index-series are
	supplied, scan in row-major order.  (A usage sketch appears below.)

    Would also be good if we could have a collector that used the
    alterability information to build a result whose structure copies the
    argument.  Or maybe one that took an array arg and copied its dimensions?
    But that is just eliminating a call to array-dimensions.

Debugger:

Make the source command work on top-level forms.  Need to dump most of the
source, and backpatch only the offsets, or something.  We could patch all
offsets at EOF and use form offsets for top-level forms, or we could add some
FOP that patches each TLF offset.

Fix and spiff up the Hemlock debugger interface.

Cleverly print the supplied args when printing an XEP frame (including more
args.)

debug-return, at least for the standard return convention.

Error system interface:

Fix old-style FUNCTION declaration to accept spread result types.

Add a COMPATIBILITY-NOTE function.

Deal better with out-of-order function arg signature changes.

Some stuff to detect multiple definitions in the same file or in different
files?  Location info for variables too?

Is there a variable that is supposed to be bound to the input file of compile
in ANSI?

Defer wrong-type arg warnings until the end of compilation to see if a
forward reference was correctly redefined?

Have a "searchable" syntax for error messages?

Continuation substitution can push compile-time type error messages back too
far.  It seems that sometimes it might be better to use the DEST as source
context rather than the erroneous use (if the use is buried in the guts of a
macro.)  In this case, it is more likely that the DEST is wrong (use of the
wrong function or call syntax), rather than the use returning the wrong type.
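Hypothetical usage of the second SCAN-ARRAY proposal above (the index-series
form).  SCAN-ARRAY itself does not exist yet; SCAN and COLLECT-SUM are the
ordinary Series operations.  This sums the diagonal of a matrix by scanning
with explicit index series.

    (let* ((a #2A((1 2)
		  (3 4)))
	   (i (scan '(0 1))))            ; index series 0, 1
      (collect-sum (scan-array a i i)))  ; proposed operator; => 5 (= 1 + 4)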
Perhaps think of a way to avoid screens of dead code notes when we fuck up
big.  I guess we could record the compiler-error-contexts for all the
messages, then at the end of the compilation we could use the source path to
try to figure out the outermost enclosing forms.  (That is, delete all notes
for forms that appear inside some form we also have a note for.)

Note: if we extend the source path to include the lexenv at the time of the
original/derived transition, then we can see if a deleted var is really the
same one accessible in the source, or just has the same name, instead of
counting on the var being present in the source form.

3] The compiler should have told me why it was eliminating code, something
   like:
	Eliminating inaccessible code.  Code is inaccessible since FOO always
	returns nil.
   It probably wouldn't have been so hard to find this if the compiler hadn't
   seemingly been so peculiar in which pieces of the code it eliminated.

Give a deletion note for various non-flow conditions?  Constant predicates
and unused flushable functions?

Figure out some way to notice compile-time type errors that are detected
after type check generation.

Make the unexpected EOF error do a better job of finding the form start (i.e.
ignore comments.)

Perhaps add some sort of general efficiency note in the old sense that could
be printed for bad implementations (static call VOP) when there is no good
implementation that could even potentially apply?  Is an efficiency note
called for whenever we do a full call for a function that has templates?

Don't flame about "unable to trust output type assertion" except in extreme
policies (brevity 0?).

Way to handle the compiler condition cleanup: make a new function
Compiler-Style-Warning that signals style warnings.  This function will be
used instead of Compiler-Warning for things that probably indicate a problem,
but can't be proven to cause a run-time error.  Compiler-Note will be
reserved for cases where there is no reason to suppose that the program won't
work, mainly for efficiency notes (anything else?).  Compiler-Note would
signal a different condition, which should probably be a subtype of
Style-Warning.  That way, at least in error output, we can distinguish
between totally harmless conditions that may persist in finished code vs.
things that should definitely be attended to, if for no other reason than to
prevent real problems from being obscured.  (A sketch of this split appears
below.)

The general idea is that innocuous warnings such as "bound but not
referenced" and serious notes like calls with the wrong number of args would
be style notes.  Probably dead code notes should be style notes too, since we
would like notes to be totally ignorable.

Make sure that COMPILER-ERROR doesn't try to insert proxy code when it runs
at random times.  For example, we don't want some random error not really
attributable to the running compilation and not in IR1 conversion to try to
insert proxy code.

In compile-file, etc., we look for errors before inside WCU, so print-summary
warnings are not considered when computing the return values.  Maybe add a
WCU keyword that causes it to return the two status values?

Enhancements:

Better dead code notes (know when a variable is different.)  Put the lexenv
in the source-path somehow so we can recognize different bindings.  Also
useful for advanced debug info?

In describe, print type expander info and other type info.

hash/equality stuff.  Make the dumper use an S-A-C hashtable.

Make use of the backend features list for better cross compilation/building
support.
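A sketch of the condition split proposed above, assuming the ANSI condition
system; the exact superclasses and the signalling protocol are placeholders.

    (define-condition compiler-style-warning (simple-condition style-warning)
      ())

    (define-condition compiler-note (simple-condition style-warning)
      ())
    ;; COMPILER-NOTE as a STYLE-WARNING subtype matches the note above; it
    ;; could instead be an even milder, non-warning condition.

    (defun compiler-style-warning (format-control &rest format-arguments)
      ;; Used where COMPILER-WARNING is too strong: probable problems that
      ;; can't be proven to cause a run-time error.
      (warn 'compiler-style-warning
	    :format-control format-control
	    :format-arguments format-arguments))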
Genesis marks as referenced all exported symbols in the cold load?  All
symbols in a cold load form?  Generally, dump all info.

==== Suggestion
There should be a way to be warned about redefining things.  [i.e. in a
different file.]

Move range checks on irrational functions to the transform args, so that we
get better efficiency notes.

Transforms for complex arithmetic.

Add notes to all DEFTRANSFORMs.

Automatically copy assigned arguments into their own variables so that
backtrace shows the original values, and so that we can take types from FTYPE
declarations.

Put dependency info into fasl files for a make utility?  Relates to
per-function dependency info (macros called, etc?)  We can represent which
file each macro was compiled from, and perhaps the actual value of any
constants referenced.

block compilation control

Give better warnings when some function isn't an entry that should be.  At
least reliably make it undefined, and preferably add a new function kind,
which is a block-local function.  Rip out note-name-defined in the %defun
translator?

Improve accountability of IR1 optimizers.  Perhaps associate with the event
mechanism?  Extend events to have at least succeed/fail counters?  Note that
function+signature is the name of the optimizer.  But we might want to talk
more explicitly about what is going on.

Make DESCRIBE say what sort of info the compiler has about a function:
transforms (source, ir1 and optimizer), type inference and code generators.
Is it foldable?, etc.

Add void, void-object types.

Change slowload to use the compiler's reading stuff so that we can give the
functions a :FILE source info (allowing editing definitions of interpreted
functions.)

New sequence iterator transforms as a basis for reimplementing sequence
functions.  Especially for n-ary functions like concatenate, some, every and
map.

Separate inline expansion from IR1, take the function's cost into
consideration.  Don't preclude other optimizations.

User level &more

Cleanup:

Divide compiler files into subdirectories: front, back and runtime.

Merge the hemlock new_compiler branch back onto the trunk.

Fix old-style FUNCTION declarations (change to FTYPE).

Change keywords in FUNCTION types to not be implicitly keywordified.

Make sure that all of the miscellaneous unimportant return values that X3J13
defined are in fact correct, both in the functions and in fndb.

Package restructuring of the system.

Rewrite coerce.  (Type also has a coerce method?  makes sense...)

Tuning:

The compiler is calling data-vector-set a lot.  Data-vector-ref & set are
much more pessimal than they need to be due to repeated use of array type
predicates.

Constant and global-var caching could be a pretty big win for many programs
(10%-25%).  Constant caching would help global var access, and loop
invariants would help it more (orthogonal).  Cache some float constants in
float regs?  Add support for float constants directly in the code object?

Inline declarations don't work right on LABELS functions.  And block
compilation inlines are also messed up.  IR1-C-G-INLINE won't inline expand
functionals.

named call unboxed args/return value?  Static functions?  Might make some
sort of sense with the fdefinition object stuff.  Assume that the number
entry will always have the same type signature; it must have been compiled
with the same declaration (check at load time).  Keyword entries too?

Save space by not macroexpanding operand load/save in VOP generator
functions.  For load, we could just have a function that takes the TN-REF
(and perhaps the load-scs vector), and returns the TN to bind in the body.
[And then there's :LOAD-IF].  Something similar for results...

Possible block compilation targets:
    IR2tran, ir1opt
    system utilities: globaldb, type stuff.  Perhaps allow semi-inline
	expansion of the cache wrapper for get-info-value in ir1tran?
    Assembler/assem-opt?
    Debug-dump?
    fasl dumper?
    reader.
    local call
    ir1opt
    find-initial-dfo
    copyprop?  constraint prop?  ltn?  checkgen?
    semi-inline sset operations?
    semi-inline type operations?
    Semi-inline any defun-cached wrapper?

Space tuning:
 -- compact debug-info
 -- reduce compiler back-end code bloat.
 -- fix compiler GC lossage.
 -- Tense static function call.
 -- Make functions with the most calls static?
 -- Space optimizations: code hoisting, on IR2 or assembler.
    Subroutinization?  Though these optimizations probably wouldn't get much
    now, they would be good for cleaning up after optimizations that
    introduce code, such as inline expansion, generalized IF optimization,
    loop replication for making the zero-iteration test invariant, etc.

Code speed improvements:
 -- Policy tuning: add DEFPOLICY and make critical parts of the system run
    unsafe.
 -- Compile critical parts of the system with efficiency notes, flushing
    generic arithmetic, etc.  Add some block compilation.
 -- Improved opportunistic inline expansion: delay inlining until after IR1
    transforms run.
 -- Misc IR1 optimizations: arithmetic expression reorganization to expose
    constants, etc.
 -- Static call convention.  Fixed arg, value.  A space win as well as a time
    win (if not more so.)  Non-descriptor operands?
 -- IR2 optimizations: loop-invariant and GCS.
 -- Loop detection, for input to representation selection, pack, SSA
    conversion.
 -- Advanced IR1 optimizations: SSA conversion.
 -- Automatic inline expansion/block compilation.
 -- CLOS support & optimization.
 -- New localized pack: minimize saving costs by only saving and restoring
    when we need to.  Cache constants.  Add support for callee-saves
    registers.
 -- Inter-routine register allocation (?)
 -- Do intra-block lifetime analysis on assembly code, and do code
    reorganization before register allocation.  Do simple move elimination on
    assembly code.

Compiler tuning/cleanup:

Reduce size of backend code from the VM definition.

Tune the assembly optimizer.

Lifetime representation changes to speed up IR2 stuff and reduce the size.
Change to a conflict-set from ref order for intra-vop conflicts.

Add in-buffer support back to FD streams.

Fix nconc to have a transform, since we can't inline expand it due to rest
args.  Append too?  Have append-2, append-3, etc?  Other than saving space &
time when actually doing a call, there is no advantage to >2 arg ops.

Not clear that punting optimized saving and copy propagation actually saves
compile time.

Change optimize qualities to be represented as floats internally?

Put the symbol hash in the symbol header?

Note: it might be legal for SXHASH to EQ hash any objects in static or r/o
space that EQUAL compares with EQ.  Not clear how big a win this is.  This
might not be allowed by the new similar-as-a-constant interpretation, but the
new interpretation does allow a sensible hash of vectors, structures (?),
etc.

Allow local macros around inline function definitions by extracting the local
macros and reintroducing them in a macrolet around the body of the inline
expansion.  [### Or if inlines were dumped as compacted IR1 (with macros
expanded) then there would be no problem.]

Add a TYPEP or EQ constraint when we see a SETQ?

Transform (ASH i n) to (IF (MINUSP n) (%ASHR i (- n)) (%ASHL i n)).  This
puts the left/right decision in IR1 where it can be optimized.
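Illustrative only: the ASH rewrite above expressed as a source-level compiler
macro.  In the compiler this would be an IR1 transform, and %ASHR/%ASHL are
the internal right/left shift primitives assumed by the note.

    (define-compiler-macro ash (integer count)
      (let ((i (gensym "I")) (n (gensym "N")))
	`(let ((,i ,integer) (,n ,count))
	   (if (minusp ,n)
	       (%ashr ,i (- ,n))     ; shift right by the magnitude
	       (%ashl ,i ,n)))))     ; shift left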
Add negation collapsing: (- (- )) becomes (identity (identity )) using an IR1
optimizer.  So (ASH x (- n)) could become (%ASHR x n).

Flush UNDEFINED-VALUE with an unsafe policy.

Use an IR1 optimizer that substitutes the prev for the cont when the prev has
no dest.  Would need an IR2 convert method then, since we could no longer use
inline expansion.

Type stuff:

Perhaps we really need a separate NODE-ASSERTED-TYPE or something.  There are
some sorts of type errors that may fail to get detected due to our quietly
ignoring inconsistent node-derived-types.  But we can't warn, because in dead
code we may make inconsistent inferences.

Can we test SATISFIES types at compile time?  Even if it doesn't get an
error, it might return the wrong answer.

Structure TYPEP tests not recognized by constraint propagation after
transformation.  Introduce some funny primitive in the expansion that allows
for this?

For some reason, when we have a type error like:
    (the (values t t) (values a a a))
we get only one value back, which seems a bit random.  Presumably someone is
seeing NIL and saying "one value".  Probably other things fuck up with NIL
too.

Do something about non-overlapping integer unions.  Currently primitive-type
of (or (integer -1 -1) (integer 1 10)) loses.

Do the type check even with a full call or safe template when the
continuation's assertion is stronger than the assertion on that operand in
the function/template type.

Is there enough context in the approximate use warnings (incompatible
definition, etc.)?  [### It seems that the unknown function location info
could be used...]

Misc correctness issues and frivolous optimizations:

Clear lambda-var-indirect in preanalyzed variables that are still set after
optimization, but no longer closed over.

The value returned from PROCLAIM and DECLAIM is confusing.

(defun test (fn-symbol)
  (funcall (the symbol fn-symbol)))
Note: Called function might be a symbol, so must coerce at run-time.

coerce doesn't hack deftypes.

Fix the *default-cookie*/locally interaction in main.  (don't capture
proclaims, or something.)

Is there a problem with preanalyzing top-level environment stuff, then
reanalyzing?  In particular, can the format of a closure ever change after
someone has already been compiled?  Is it really true that only
null-environment functions (DEFUNs) are referenced across component
boundaries?

Probably should only transform >= to (not <) when we know we have a rational,
since this identity doesn't hold in the presence of IEEE NaNs: with a NaN
argument both < and >= are false, so (not <) would wrongly return true.

Should decode-float signal some IEEE exception if passed an infinity or NaN?

Change %unary-round to look at the current rounding mode.

Should an optimize declaration at the lambda head not affect the XEP (e.g.
suppressing argument count checking)?

Perhaps forbid local calls within WITH-STACK-ALIEN?  I guess we could just
inhibit use of known call in any component where WITH-STACK-ALIEN is used.

Fix up the tail-p/tail-set stuff so that correctness doesn't depend on IR1
optimize being done to completion.  Note also hacks with retaining the tail
set when the RETURN is deleted due to all calls being TR (see DELETE-RETURN.)

The last fixed entry is not called but still compiled when there is a more
arg.

At entry analyze, delete any :OPTIONAL lambda with no refs.  Control analyze
will delete the blocks.  [Should have some flag that prevents deletion-driven
reoptimization from happening after environment analysis so that we wouldn't
accidentally let-convert something in control analysis.]
Sometime, jam together the lifetime post-pass and the pack pre-passes into
one loop over ir2-blocks, with multiple loops over each block.

Handling of named constants is odd.  It seems that we would like to be able
to fold together named and anonymous uses of the same constant.  Does Common
Lisp allow this?  What would be the significance of entering the same
Constant structure under multiple names in *free-variables*?

Recompute the DFO more often so that we are sure all unreachable code is
flushed?  Perhaps on policy?  It would be useful to know if DFO deleted any
blocks (but I guess delete-block will be setting component-reoptimize).  Make
sure everyone that should be marking blocks as needing to be optimized is
doing so.  This primarily concerns control optimizations, although there may
also be missing places in IR1 optimization itself.

Move Continuation-Starts-Block and other utilities out of IR1tran.

Probably not a big deal, but it seems that with more-arg XEPs, some
unnecessary type checking is being done on the more-arg count.

Somehow make the Define-VOP lifetime info less error-prone?  A number of bugs
have been due to failing to allocate argument or result temps when they are
needed.  I suppose if we made this easier, as proposed elsewhere, then we
would be less likely to lose.  Maybe we could somehow make this the default?

May want to give the arguments to XEPs and the keyword value temporaries in
more entries the actual variable names, so that users can figure out what
happened when they get an argument type error.  [### Or maybe in a
keyword/more entry, we just want to mark the more arg as such so that the
debugger can print it?]

Hemlock cleanup:

process redisplay update problem

Make the buffer writable before trying to trash it by reading a file.

Change "Backward Up List" to skip over prefix chars.

Some problem with creating new spelling dictionaries via the file option.

Combined scroll and refresh (a la MWM click-to-raise) almost always shows the
wrong stuff.

Hemlock:

Check for repeated words in "Check Buffer Spelling"?

decache line length for TS streams on window change

"Find Unbalanced Parens" and "Check Indentation" (flags lines in the region
indented incorrectly.)

make TTY redisplay special-case indentation (use clear-to-eol and cursor
positioning.)

Fix indentation for "obviously" non-call lists?  Like:
    '(foo bar bax rax)
or:
    (:start 3 :end 1)
Maybe have alternate indentation possibilities and have tab alternate between
them?  Perhaps tab should be changed to "Lisp Re-indent Line", and it
initially does indent, but then alternates between a fixed repertoire of
alternatives.

Some sort of "fairness" criterion in serve-event to prevent runaway slave
output from wedging the editor?

Undo last spelling correction/auto fill interaction lossage.

Sometimes when TTY Hemlock wedges, the Hemlock input handler is left active,
and the TTY is still in raw mode.

A "correct if unique" flavor that just "accepts" a misspelling, w/o entering
it in the dict.

Don't display on the TTY when not in the editor.

Untabify

elisp compatibility, GNU emacs user interface

default TTY baud rate hackery doesn't work at 19.2k?

If you compile a defun that isn't closed off, give an error; don't quietly
compile the next defun.

Renaming slave buffers doesn't update the eval server string table names.

Completion doesn't do arm:News/Comp.la...