To: "Timothy B. Moore"
Subject: Re: some questions about writing vops
In-reply-to: Your message of Sat, 11 May 91 11:36:44 -0600.
             <9105111736.AA09224@defmacro.utah.edu>
Date: Mon, 13 May 91 16:44:13 EDT
From: Rob.MacLachlan@LISP-PMAX2.SLISP.CS.CMU.EDU

    For the multiplication and division vops, I would like the
    arguments to come in in the millicode argument registers (nl2 and
    nl3), and declare that these values will get trashed by the
    millicode.  How would you write the :args spec in the vop for this?

The usual idiom is:

(define-vop (*/signed)
  (:args (x :scs (signed-reg) :target nl2)
         (y :scs (signed-reg) :target nl3))
  (:results (r :scs (signed-reg)))
  (:temporary (:sc signed-reg :offset nl2-offset
               :from (:argument 0) :to (:result 0) :target r)
              nl2)
  (:temporary (:sc signed-reg :offset nl3-offset :from (:argument 1))
              nl3)
  (:generator xxx
    (move nl2 x)
    (move nl3 y)
    ...call...
    (move r nl2)))

This assumes that you have constants nl2-offset and nl3-offset, and
that the result is left in nl2.  Move should be a macro or
pseudo-instruction that uses LOCATION= to see whether the source and
destination actually differ.  The :TARGET specs tell the register
allocator to attempt to make the moves unnecessary.  The lifetime
specs are needed on the temps, since temps are by default assumed to
conflict with all arguments and results, making targeting impossible.

A brief digression on these lifetime declarations: the register
allocator doesn't look at the assembly code to determine when
registers are actually read and written.  Instead, the DEFINE-VOP user
must declare argument and result lifetimes such that the register
allocator computes the correct conflict set for each TN.  Declarations
are based on a splitting up of VOP evaluation into discrete times:

    :LOAD (:ARGUMENT 0) ... (:ARGUMENT n)
    (:EVAL 0) ... (:EVAL n) (:RESULT 0) ...
    (:RESULT n) :SAVE

The register allocator uses these declarations to compute what might
be called a "virtual reference order", which is used to determine
which operands and temps are concurrently live.  Unfortunately, with
complex VOPs it is non-trivial to get the lifetime declarations right.
And if you get them wrong, the VOP may fail in only some contexts.

    Well, it works...  Also, there isn't really unsigned immediate
    data on the hppa.  I created 3 kinds of signed data SCs:
    immediate5, immediate11, and immediate14, corresponding to the
    lengths available in different instructions.

Actually, we've been moving away from having many different immediate
constant SCs, in favor of having just one immediate-constant SC and
using the :CONSTANT VOP argument restriction facility.  For example,
instead of saying:

(define-vop (foo)
  (:args (x :scs (any-reg immediate11)))
  (:generator xxx
    (sc-case x
      (any-reg ...register case...)
      (immediate11 ...immediate case...))))

[Actually, it is also useful to have a special SC for immediate
constants that are cached in registers.  For example, on the MIPS we
have NULL and ZERO SCs.]

Do:

;;; This vop handles just the register case:
(define-vop (foo)
  (:args (x :scs (any-reg)))
  (:generator xxx
    ...register case...))

;;; This vop handles just the constant case:
(define-vop (foo/c)
  (:info x)
  (:arg-types (:constant (signed-byte 11)))
  (:generator (- xxx n)
    ...constant case...))

There are several advantages to the latter:
 -- The constraint on the constant can be any Lisp type, eliminating
    the need to figure out all the interesting subsets of integers and
    define immediate constant SCs for them.
 -- The constant case (which is often mostly unrelated) can be in a
    different VOP, allowing you to allocate fewer temporaries, etc.

I think that the only port we have which consistently uses only a
single immediate-constant SC is the RT, but I suggest that you take
this approach in your work.
The main reason why the MIPS doesn't is that the :CONSTANT argument
type stuff was added after the MIPS backend was written.

    Will there be any problem specifying these storage classes for the
    arguments of vops that take unsigned arguments (like the
    /unsigned=>unsigned arithmetic vops)?

Which constant (and other) SCs can be loaded into a particular operand
SC depends on what move functions you define.  On the RT, these are
the two move functions that load integers:

(define-move-function (load-immediate 1) (vop x y)
  ((null immediate)
   (any-reg word-pointer-reg descriptor-reg))
  (let ((val (tn-value x)))
    (etypecase val
      (integer
       (inst li y (fixnum val)))
      (null
       (move y null-tn))
      (symbol
       (load-symbol y val))
      (character
       (inst li y (logior (ash (char-code val) type-bits)
                          base-character-type))))))

(define-move-function (load-number 1) (vop x y)
  ((immediate)
   (signed-reg unsigned-reg))
  (inst li y (tn-value x)))

The general moving, loading, and storing is in move.lisp.  Character
and float related moves are in char.lisp and float.lisp.

  Rob

Here are some introductory notes that William wrote when we were
training Bill Chiles.  Probably you've figured most of this out
anyway...
________________________________________________________________
-*- Mode: Text, Fill, Spell -*-

In order to aid in portability (and for other reasons), the compiler
does not convert directly from source code to object code, but instead
runs through several internal representations in the middle (two to be
exact, IR1 and IR2).  Almost all of the machine-dependent aspects of
the compiler are contained in IR2, so we will emphasize that.  But
before explaining how the IR2 is used to generate code, we need to
introduce some terms/concepts first.

A Storage Base (SB) corresponds to some storage resource.  Typical
storage bases are the different register files, the different stacks,
and constants.  Storage bases are either :finite, :unbounded, or
:non-packed.
Finite storage bases are things like register files, where some finite
number of elements will fit.  Unbounded storage bases are things like
stacks, which can grow.  Non-packed storage bases are for things like
immediates and constants that don't have to be packed into some
resource.

A Storage Class (SC) corresponds to a way or place an object can be
represented.  Each storage class uses some storage base, and can
optionally restrict the locations in that storage base that can be
used.  Each storage class has a list of alternate SCs and constant
SCs.  The list of alternates is used when the compiler wants to use
more of a particular SC than will fit in the SB that SC uses.  When
the compiler runs out of locations in the SB, it can spill one or more
objects into any of the SC's alternate SCs.  The list of constant SCs
is just that: the list of SCs that are used to represent any constants
that would have otherwise been represented with this SC (if it were
not for the fact that they were constant).

A Primitive Type is the compiler's idea of what the most basic name
for a particular object is.  Primitive types contain a list of allowed
SCs.  This means that any object of that primitive type can only be
represented by those SCs (or their alternate SCs and constant SCs).
This is how the compiler assures that GC disciplines will be followed.
By controlling the allowed SCs for a primitive type, and controlling
the allowed locations and/or the SB for the SC, you can control
exactly where and how objects of a particular primitive type will be
represented.  If an object can be represented in more than one way at
different times, there must be separate SCs for each representation.
Also, if the representation does not contain the type of the object,
that representation must have its own SC; otherwise it would be
impossible to tell the different types of objects in that SC apart.
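[Editorial sketch: each backend declares its SBs, SCs, and primitive
types in its vm.lisp.  The forms below are schematic, loosely
patterned on the MIPS backend; the exact macro syntax varies between
backends and compiler versions, and the particular sizes, location
numbers, and SC names here are illustrative, not taken from any real
port.]

;; A finite SB (a register file) and two non-packed SBs for constants.
(define-storage-base registers :finite :size 32)
(define-storage-base constant :non-packed)
(define-storage-base immediate-constant :non-packed)

;; A boxed-register SC drawing on some of those registers, with a
;; stack SC as its spill alternate and the SCs that stand in for it
;; when the value is a compile-time constant.
(define-storage-class descriptor-reg 0 registers
  :locations (2 3 4 5)
  :alternate-scs (control-stack)
  :constant-scs (constant immediate))

;; A primitive type names the SCs its objects may legally inhabit.
(def-primitive-type fixnum (any-reg signed-reg))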
For example, all descriptor representations could use the same SC, but
32-bit raw numbers and 32-bit single floats would each need their own
SC so that they could be distinguished from each other and from
descriptor objects.  (In practice, there will probably be more than
one SC for descriptor objects, to distinguish those objects that GC
must scavenge from those objects that GC can ignore.)

Temporary Names (TNs) are used by the compiler to represent a
particular value or object.  The compiler allocates TNs for every
variable in the source code, in addition to any temporary values that
it might need to introduce into complex expressions.

The program is represented in IR2 as a set of basic blocks, each block
consisting of a sequence of Virtual OPerations (VOPs).  Each VOP
corresponds to a small sequence of code.  VOPs take TNs as arguments,
use TNs as temporaries, and produce TNs as results.  Actually, it
would be more correct to say that the code sequence that corresponds
to the VOP takes arguments from the locations specified by the
argument TNs and puts results in the locations specified by the result
TNs, possibly using additional locations specified by any temporary
TNs.  This distinction is important conceptually.  VOPs don't create
or destroy TNs; they just extract info from them.

When a VOP is defined, the argument and result TNs can be restricted
to some particular SC or SCs.  It is very important to understand that
this restriction will not control whether or not the compiler tries to
use the VOP, but instead allows the VOP definition mechanism to
automatically coerce the argument or result from one representation to
another.  For example, a VOP can restrict an argument to a register
SC, and the DEFINE-VOP macro will arrange for any stack TNs to be
loaded into a temporary load-TN which is in the correct SC.
Additionally, a primitive type can be listed for each argument or
result.
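[Editorial sketch: a complete VOP showing both kinds of restriction,
in the style of the fixnum arithmetic VOPs in the existing backends.
The instruction spelling and the cost are illustrative, not from any
particular port.]

(define-vop (fast-+/fixnum)
  (:translate +)                 ; used to translate calls to +
  (:policy :fast-safe)
  (:args (x :scs (any-reg))      ; SC restriction: triggers automatic
         (y :scs (any-reg)))     ; loading into a register load-TN
  (:arg-types tagged-num tagged-num)  ; primitive type restriction
  (:results (r :scs (any-reg)))
  (:result-types tagged-num)
  (:generator 2                  ; cost: "reasonable best case cycles"
    (inst add r x y)))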
In this case, the compiler will only use the VOP if the primitive
types of the TNs match those listed for the VOP.  Note the difference
between specifying a primitive type restriction and an SC restriction
on a VOP.  The primitive type restriction controls whether the VOP is
used or not, while the SC restriction just controls what automatic
loading the VOP will do on your behalf.

VOPs are either emitted by name, or are marked as translating a
function known to the compiler.  Primitive type restrictions are
really only useful for controlling whether or not a particular VOP
should be used to translate a particular function.

________________________________________________________________
To: "Timothy B. Moore"
Subject: Re: some questions about writing vops
In-reply-to: Your message of Tue, 14 May 91 14:39:09 -0600.
             <9105142039.AA22873@defmacro.utah.edu>
Date: Tue, 14 May 91 17:28:18 EDT
From: Rob.MacLachlan@LISP-PMAX2.SLISP.CS.CMU.EDU

    Just to cement (ha!) my understanding of lifetimes, :eval is some
    time after the argument tns are loaded and before the result tns
    are.  Does (:temporary (:sc non-descriptor-reg :to :eval) temp)
    mean that temp is live until the result tns are loaded?  Does
    (:eval n) mean anything, or is there only one :eval time?

Hmmm...  Well, the way I think about it is that:

    :LOAD is the beginning of time, which is when all argument values
        are initialized.
    (:ARGUMENT n) is the time that the N'th argument is *read*.
    (:EVAL n) are arbitrary intermediate points in the VOP evaluation,
        after all arguments have been read and before any results have
        been written.  It is rare to use more than one :EVAL stage,
        but you can use as many as you want.
    (:RESULT n) is the time that the N'th result is written.
    :SAVE is the end of time, at which all results are read.

But :ARGUMENT, :RESULT, etc., are really just naming conventions, with
very little inherent semantics.  By default, the life of the N'th
argument ends at (:ARGUMENT n), and the life of the N'th result begins
at (:RESULT n).
You can override those defaults using :TO/:FROM in the argument or
result spec.

(define-vop (foo)
  (:args (x :target x-temp)
         (y :target y-temp)
         (z :target z-temp))
  (:results (r))
  (:temporary (:from (:argument 0) :to (:result 0) :target r) x-temp)
  (:temporary (:from (:argument 1)) y-temp)
  (:temporary (:from (:argument 2)) z-temp)
  (:temporary (:from (:eval 0) :to (:eval 1)) random))

Here is the picture I use for visualizing the lifetimes of the above
VOP:

              ---------->  the arrow of time

          ld    a0    a1    a2    e0    e1    r0    sv
          \/    \/    \/    \/    \/    \/    \/    \/
X         ------)(----------------)(------ R
Y         ---------)(---------------------
Z         ------------)(------------------
RANDOM                   ---------------)(----)

    On using :constant, presumably this works for vops that are
    translations, but not for vops emitted by name?

Yes, that is correct.

    One last question: the :generator values for vops seem kind of
    arbitrary.  Are they basically an estimate of the number of
    instructions executed by the vop, with adjustments to discourage
    some translations?

Practically speaking, the costs are used to sort the applicable VOPs,
and hence to determine which VOP to use when more than one applies.
In addition to using this ordering, the compiler also assumes that the
cost has some real-world significance when deciding what efficiency
notes to print.  The rule for assigning costs that I have been using
is that the cost should be the "reasonable best case number of
cycles."  For example, I assume that targeting will succeed, that
operands will be in registers, that there will be no TLB/cache misses,
etc.

  Rob
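
[Editorial sketch: concretely, the cost sorting can be seen by filling
in the xxx placeholders in the foo/foo-c pair from the first message.
The numbers are illustrative; both VOPs translate the same function,
and when the second form's :CONSTANT restriction is satisfied, both
apply and the cheaper one is chosen.]

;;; Register case: assume the best case is two cycles.
(define-vop (foo)
  (:translate foo)
  (:args (x :scs (any-reg)))
  (:results (r :scs (any-reg)))
  (:generator 2
    ...register case...))

;;; Constant case: one cycle in the best case, so when both VOPs
;;; apply, the compiler sorts this one first and uses it.
(define-vop (foo/c)
  (:translate foo)
  (:info x)
  (:arg-types (:constant (signed-byte 11)))
  (:results (r :scs (any-reg)))
  (:generator 1
    ...constant case...))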