Consider the integral power procedure. Each example in this section is
presented in Scheme and in C. For code generation the Scheme example
assumes the implementation provides a compile
procedure, and the C
example uses the DCG system [DCG].
(define (power exp base) (if (= 0 exp) 1 (let* ((n (power (>> exp) base)) (nn (* n n))) (if (odd? exp) (* nn base) nn))))
double pow(double base, int exp) { int t, bit = 1; int square = base; double result = 1.0;while (bit <= exp) { if (bit & exp) result *= square; bit = bit << 1; square *= square; } return result; }
This code looks clean and efficient, but often it is far from optimal. If the exponent were known to be, eg 20 (), you could instead use this specialized version:
(define (power-20 base) (let* ((base-2 (* base base)) (base-4 (* base-2 base-2)) (base-8 (* base-4 base-4)) (base-16 (* base-8 base-8))) (* base-4 base-16)))
double pow_20(double base) { double b2, b4, b8, b16;b2 = base * base; b4 = b2 * b2; b8 = b4 * b4; b16 = b8 * b8; return b16 * b4; }
Since the control flow, conditionals, and shifting of the original code depended only on the exponent, they can be eliminated. On typical RISC machines, this will run much faster (on a MIPS R3000 the C version ran 2.7 times faster; on an Alpha 2.4 times).
Sometimes the exponent isn't known at compile time, but you do know that, for whichever values arrives later, it will be called many times (perhaps 10,000). This situation is the motivation for run time code generation (RTCG), the synthesis of specialized machine code at run time [note run-time]. Using this technique, a generating extension is called with just the exponent. It returns a procedure that finishes the computation:
(define (power-gen exp) (define (loop exp) (if (= 0 exp) 1 `(let* ((n ,(loop (>> exp))) (nn (* n n))) ,(if (odd? exp) '(* nn base) 'nn)))) (compile `(lambda (base) ,(loop exp))))The lisp compiler is called to convert the sexpr data into a procedure. [note removing-times-one].
DCG is an efficient, retargetable, C-callable interface for RTCG [DCG]. The `D' stands for `Dynamic' (which is actually a better term
than `run time'). Using it, power_gen
looks something like this:
typedef double (*FPtr)(double);This compiler will run very fast.Symbol loop(int exp, Symbol base) { Symbol n, nn; if (1 == exp) return scnstd(cnstd(1.0)); n = loop(exp >> 1, base); nn = multd(n, n); if (exp & 1) nn = multd(nn, base); return nn; }
FPtr power_gen(int exp) { int ncalls = 0; char name[20]; Symbol base, res;
base = sargi(); dcg_param_alloc(&base, ncalls); regtree(retd(loop(exp, base))); sprintf(name, "power_%d", exp); return dcg_gen(sfunc(name), &base, ncalls); }
The following sections follow a generalization of this procedure typically occuring in 3D graphics. But this represents only one possible path: in a mathematical library, you might want to provide a more general purpose power procedure that handled negative and then rational exponents. You might also support complex numbers. Such features require additional tests, but as long as they depend solely on the exponent, a compiler can eliminate them.