To: "Timothy B. Moore"
Subject: Re: some questions about writing vops
In-reply-to: Your message of Sat, 11 May 91 11:36:44 -0600.
             <9105111736.AA09224@defmacro.utah.edu>
Date: Mon, 13 May 91 16:44:13 EDT
From: Rob.MacLachlan@LISP-PMAX2.SLISP.CS.CMU.EDU

    For the multiplication and division vops, I would like the
    arguments to come in in the millicode argument registers (nl2 and
    nl3), and declare that these values will get trashed by the
    millicode.  How would you write the :args spec in the vop for this?

The usual idiom is:

(define-vop (*/signed)
  (:args (x :scs (signed-reg) :target nl2)
         (y :scs (signed-reg) :target nl3))
  (:results (r :scs (signed-reg)))
  (:temporary (:sc signed-reg :offset nl2-offset
               :from (:argument 0) :to (:result 0) :target r)
              nl2)
  (:temporary (:sc signed-reg :offset nl3-offset :from (:argument 1))
              nl3)
  (:generator xxx
    (move nl2 x)
    (move nl3 y)
    ...call...
    (move r nl2)))

This assumes that you have constants nl2-offset and nl3-offset, and
that the result is left in nl2.  Move should be a macro or
pseudo-instruction that uses LOCATION= to see whether the source and
destination actually differ.  The :TARGET specs tell the register
allocator to attempt to make the moves unnecessary.  The lifetime
specs are needed on the temps, since temps are by default assumed to
conflict with all arguments and results, making targeting impossible.

A brief digression on these lifetime declarations: the register
allocator doesn't look at the assembly code to determine when
registers are actually read and written.  Instead, the DEFINE-VOP user
must declare argument and result lifetimes such that the register
allocator computes the correct conflict set for each TN.  Declarations
are based on a splitting up of VOP evaluation into discrete times:

    :LOAD (:ARGUMENT 0) ... (:ARGUMENT n)
    (:EVAL 0) ... (:EVAL n) (:RESULT 0) ...
    (:RESULT n) :SAVE

The register allocator uses these declarations to compute what might
be called a "virtual reference order", which is used to determine
which operands and temps are concurrently live.  Unfortunately, with
complex VOPs it is non-trivial to get the lifetime declarations right.
And if you get them wrong, the VOP may fail in only some contexts.

    Well, it works...  Also, there isn't really unsigned immediate
    data on the hppa.  I created 3 kinds of signed data SCs:
    immediate5, immediate11, and immediate14, corresponding to the
    lengths available in different instructions.

Actually, we've been moving away from having many different immediate
constant SCs, in favor of having just one immediate-constant SC and
using the :CONSTANT VOP argument restriction facility.  For example,
instead of saying:

(define-vop (foo)
  (:args (x :scs (any-reg immediate11)))
  (:generator xxx
    (sc-case x
      (any-reg ...register case...)
      (immediate11 ...immediate case...))))

[Actually, it is also useful to have a special SC for immediate
constants that are cached in registers.  For example, on the MIPS we
have NULL and ZERO SCs.]

Do:

;;; This vop handles just the register case:
(define-vop (foo)
  (:args (x :scs (any-reg)))
  (:generator xxx
    ...register case...))

;;; This vop handles just the constant case:
(define-vop (foo/c)
  (:info x)
  (:arg-types (:constant (signed-byte 11)))
  (:generator (- xxx n)
    ...constant case...))

There are several advantages to the latter:
 -- The constraint on the constant can be any Lisp type, eliminating
    the need to figure out all the interesting subsets of integers and
    define immediate constant SCs for them.
 -- The constant case (which is often mostly unrelated) can be in a
    different VOP, allowing you to allocate fewer temporaries, etc.

I think that the only port we have which consistently uses only a
single immediate-constant SC is the RT, but I suggest that you take
this approach in your work.
The main reason why the MIPS doesn't is that the :CONSTANT argument
type stuff was added after the MIPS backend was written.

    Will there be any problem specifying these storage classes for the
    arguments of vops that take unsigned arguments (like the
    /unsigned=>unsigned arithmetic vops)?

Which constant (and other) SCs can be loaded into a particular operand
SC depends on what move functions you define.  On the RT, these are
the two move functions that load integers:

(define-move-function (load-immediate 1) (vop x y)
  ((null immediate)
   (any-reg word-pointer-reg descriptor-reg))
  (let ((val (tn-value x)))
    (etypecase val
      (integer
       (inst li y (fixnum val)))
      (null
       (move y null-tn))
      (symbol
       (load-symbol y val))
      (character
       (inst li y (logior (ash (char-code val) type-bits)
                          base-character-type))))))

(define-move-function (load-number 1) (vop x y)
  ((immediate)
   (signed-reg unsigned-reg))
  (inst li y (tn-value x)))

The general moving, loading, and storing is in move.lisp.  Character
and float related moves are in char.lisp and float.lisp.

  Rob

Here are some introductory notes that William wrote when we were
training Bill Chiles.  Probably you've figured most of this out
anyway...
________________________________________________________________
-*- Mode: Text, Fill, Spell -*-

In order to aid in portability (and for other reasons), the compiler
does not convert directly from source code to object code, but instead
runs through several internal representations in the middle (two to be
exact, IR1 and IR2).  Almost all of the machine-dependent aspects of
the compiler are contained in IR2, so we will emphasize that.  But
before explaining how the IR2 is used to generate code, we need to
introduce some terms/concepts first.

A Storage Base (SB) corresponds to some storage resource.  Typical
storage bases are the different register files, the different stacks,
and constants.  Storage bases are either :finite, :unbounded, or
:non-packed.
Finite storage bases are things like register files, where some finite
number of elements will fit.  Unbounded storage bases are things like
stacks, which can grow.  Non-packed storage bases are for things like
immediates and constants that don't have to be packed into some
resource.

A Storage Class (SC) corresponds to a way or place an object can be
represented.  Each storage class uses some storage base, and can
optionally restrict the locations in that storage base that can be
used.  Each storage class has a list of alternate SCs and constant
SCs.  The list of alternates is used when the compiler wants to use
more of a particular SC than will fit in the SB that SC uses.  When
the compiler runs out of locations in the SB, it can spill one or more
objects into any of the SC's alternate SCs.  The list of constant SCs
is just that: the list of SCs that are used to represent any constants
that would have otherwise been represented with this SC (if it were
not for the fact that they were constant).

A Primitive Type is the compiler's idea of what the most basic name
for a particular object is.  Primitive types contain a list of allowed
SCs.  This means that any object of that primitive type can only be
represented by those SCs (or their alternate SCs and constant SCs).
This is how the compiler assures that GC disciplines will be followed.
By controlling the allowed SCs for a primitive type, and controlling
the allowed locations and/or the SB for the SC, you can control
exactly where and how objects of a particular primitive type will be
represented.  If an object can be represented in more than one way at
different times, there must be separate SCs for each representation.
Also, if the representation does not contain the type of the object,
that representation must have its own SC; otherwise it would be
impossible to tell the different types of objects in that SC apart.
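[Editorial sketch: each backend declares its SBs, SCs, and primitive
types in its vm.lisp.  The forms below are schematic, loosely
patterned on the MIPS backend; the exact macro syntax varies between
backends and compiler versions, and the particular sizes, location
numbers, and SC names here are illustrative, not taken from any real
port.]

;; A finite SB (a register file) and two non-packed SBs for constants.
(define-storage-base registers :finite :size 32)
(define-storage-base constant :non-packed)
(define-storage-base immediate-constant :non-packed)

;; A boxed-register SC drawing on some of those registers, with a
;; stack SC as its spill alternate and the SCs that stand in for it
;; when the value is a compile-time constant.
(define-storage-class descriptor-reg 0 registers
  :locations (2 3 4 5)
  :alternate-scs (control-stack)
  :constant-scs (constant immediate))

;; A primitive type names the SCs its objects may legally inhabit.
(def-primitive-type fixnum (any-reg signed-reg))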
For example, all descriptor representations could use the same SC, but
32-bit raw numbers and 32-bit single floats would each need their own
SC so that they could be distinguished from each other and from
descriptor objects.  (In practice, there will probably be more than
one SC for descriptor objects, to distinguish those objects that GC
must scavenge from those objects that GC can ignore.)

Temporary Names (TNs) are used by the compiler to represent a
particular value or object.  The compiler allocates TNs for every
variable in the source code, in addition to any temporary values that
it might need to introduce into complex expressions.

The program is represented in IR2 as a set of basic blocks, each block
consisting of a sequence of Virtual OPerations (VOPs).  Each VOP
corresponds to a small sequence of code.  VOPs take TNs as arguments,
use TNs as temporaries, and produce TNs as results.  Actually, it
would be more correct to say that the code sequence that corresponds
to the VOP takes arguments from the locations specified by the
argument TNs and puts results in the locations specified by the result
TNs, possibly using additional locations specified by any temporary
TNs.  This distinction is important conceptually.  VOPs don't create
or destroy TNs; they just extract info from them.

When a VOP is defined, the argument and result TNs can be restricted
to some particular SC or SCs.  It is very important to understand that
this restriction will not control whether or not the compiler tries to
use the VOP, but instead allows the VOP definition mechanism to
automatically coerce the argument or result from one representation to
another.  For example, a VOP can restrict an argument to a register
SC, and the DEFINE-VOP macro will arrange for any stack TNs to be
loaded into a temporary load-TN which is in the correct SC.
Additionally, a primitive type can be listed for each argument or
result.
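[Editorial sketch: a complete VOP showing both kinds of restriction,
in the style of the fixnum arithmetic VOPs in the existing backends.
The instruction spelling and the cost are illustrative, not from any
particular port.]

(define-vop (fast-+/fixnum)
  (:translate +)                 ; used to translate calls to +
  (:policy :fast-safe)
  (:args (x :scs (any-reg))      ; SC restriction: triggers automatic
         (y :scs (any-reg)))     ; loading into a register load-TN
  (:arg-types tagged-num tagged-num)  ; primitive type restriction
  (:results (r :scs (any-reg)))
  (:result-types tagged-num)
  (:generator 2                  ; cost: "reasonable best case cycles"
    (inst add r x y)))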
In this case, the compiler will only use the VOP if the primitive
types of the TNs match those listed for the VOP.  Note the difference
between specifying a primitive type restriction and an SC restriction
on a VOP.  The primitive type restriction controls whether the VOP is
used or not, while the SC restriction just controls what automatic
loading the VOP will do on your behalf.

VOPs are either emitted by name, or are marked as translating a
function known to the compiler.  Primitive type restrictions are
really only useful for controlling whether or not a particular VOP
should be used to translate a particular function.

________________________________________________________________
To: "Timothy B. Moore"
Subject: Re: some questions about writing vops
In-reply-to: Your message of Tue, 14 May 91 14:39:09 -0600.
             <9105142039.AA22873@defmacro.utah.edu>
Date: Tue, 14 May 91 17:28:18 EDT
From: Rob.MacLachlan@LISP-PMAX2.SLISP.CS.CMU.EDU

    Just to cement (ha!) my understanding of lifetimes, :eval is some
    time after the argument tns are loaded and before the result tns
    are.  Does (:temporary (:sc non-descriptor-reg :to :eval) temp)
    mean that temp is live until the result tns are loaded?  Does
    (:eval n) mean anything, or is there only one :eval time?

Hmmm...  Well, the way I think about it is that:

    :LOAD is the beginning of time, which is when all argument values
        are initialized.
    (:ARGUMENT n) is the time that the N'th argument is *read*.
    (:EVAL n) are arbitrary intermediate points in the VOP evaluation,
        after all arguments have been read and before any results have
        been written.  It is rare to use more than one :EVAL stage,
        but you can use as many as you want.
    (:RESULT n) is the time that the N'th result is written.
    :SAVE is the end of time, at which all results are read.

But :ARGUMENT, :RESULT, etc., are really just naming conventions, with
very little inherent semantics.  By default, the life of the N'th
argument ends at (:ARGUMENT n), and the life of the N'th result begins
at (:RESULT n).
You can override those defaults using :TO/:FROM in the argument or
result spec.

(define-vop (foo)
  (:args (x :target x-temp)
         (y :target y-temp)
         (z :target z-temp))
  (:results (r))
  (:temporary (:from (:argument 0) :to (:result 0) :target r) x-temp)
  (:temporary (:from (:argument 1)) y-temp)
  (:temporary (:from (:argument 2)) z-temp)
  (:temporary (:from (:eval 0) :to (:eval 1)) random))

Here is the picture I use for visualizing the lifetimes of the above
VOP:

              ---------->  the arrow of time

          ld    a0    a1    a2    e0    e1    r0    sv
          \/    \/    \/    \/    \/    \/    \/    \/
X         ------)(----------------)(------ R
Y         ---------)(---------------------
Z         ------------)(------------------
RANDOM                   ---------------)(----)

    On using :constant, presumably this works for vops that are
    translations, but not for vops emitted by name?

Yes, that is correct.

    One last question: the :generator values for vops seem kind of
    arbitrary.  Are they basically an estimate of the number of
    instructions executed by the vop, with adjustments to discourage
    some translations?

Practically speaking, the costs are used to sort the applicable VOPs,
and hence to determine which VOP to use when more than one applies.
In addition to using this ordering, the compiler also assumes that the
cost has some real-world significance when deciding what efficiency
notes to print.  The rule for assigning costs that I have been using
is that the cost should be the "reasonable best case number of
cycles."  For example, I assume that targeting will succeed, that
operands will be in registers, that there will be no TLB/cache misses,
etc.

  Rob
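
[Editorial sketch: concretely, the cost sorting can be seen by filling
in the xxx placeholders in the foo/foo-c pair from the first message.
The numbers are illustrative; both VOPs translate the same function,
and when the second form's :CONSTANT restriction is satisfied, both
apply and the cheaper one is chosen.]

;;; Register case: assume the best case is two cycles.
(define-vop (foo)
  (:translate foo)
  (:args (x :scs (any-reg)))
  (:results (r :scs (any-reg)))
  (:generator 2
    ...register case...))

;;; Constant case: one cycle in the best case, so when both VOPs
;;; apply, the compiler sorts this one first and uses it.
(define-vop (foo/c)
  (:translate foo)
  (:info x)
  (:arg-types (:constant (signed-byte 11)))
  (:results (r :scs (any-reg)))
  (:generator 1
    ...constant case...))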