

















#### A graph-coloring problem A graph is *k*-colorable if every node in the graph can be colored with one of k colors such that two adjacent nodes do not have the same color Assigning k registers = Coloring with k colors W v <- 1 X w <v + 3 <w eax u <v edx u t <u + x ecx <w <t <-11 Carnegie Mellon

## **History**

For early architectures, register allocation was not very important

Early work by Cocke (in 1971) proposed the idea that register allocation can be viewed as a graph coloring problem

Chaitin was the first to implement this idea for the IBM 370 PL/1 compiler, in 1981

In 1982, at IBM, Chaitin's allocator was used for the PL.8 compiler for the IBM 801 RISC system

Today, register allocation is the most essential of code optimizations

Carnegie Mellon 🛃

chool of Computer Scie

## History, cont'd

Motivated by the first MIPS architecture, Chow and Hennessy developed priority-based graph coloring in 1984 "top down" coloring

Another popular algorithm for register allocation based on graph coloring is due to Briggs in 1992 "bottom up" coloring

Carnegie Mellon





#### **Edges of Interference Graph** Building the interference graph Intuitively: for each defining inst i Two variables interfere if they overlap at some point in the program. let x be temp defined at inst i Algorithm: for all y in LIVE-IN of succ(i) At each point in program. insert an edge between x and y enter an edge for every pair of variables at that point An optimized definition & algorithm for edges: v <- 1 {} w <v + 3 {v} For each defining inst i w х w + v {w,v} x <-Let x be definition at inst i u <v {w,x,v} For each variable y live at end of inst i t <- u + x {w,u,x} insert an edge between x and y u <х $\{x, w, u, t\}$ Faster? <-{w,u,t} w Better quality? A = 2 (A<sub>2</sub>) {D} {A,D} <- t Edge between A and D {u,t} <- u {u} **Carnegie Mellon** Carneg of Computer Scien





















# **Graph coloring**

Once we have the interference graph, we can attempt register allocation by searching for a K-coloring This is an NP-complete problem ( $K \ge 3$ )\*

But a linear-time simplification algorithm by Kempe (in 1879) tends to work well in practice

\* [1] H. Bodlaender, J. Gustedt, and J. A. Telle, "Linear-time register allocation for a fixed number of registers," in *Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms*, pp. 574–583, Society for Industrial and Applied Mathematics, 1998

[2] S. Kannan and T. Proebsting, "Register allocation in structured programs," in *Proceedings of the sixth annual ACM-SIAM symposium on Discrete algorithms*, pp. 360–368, Society for Industrial and Applied Mathematics, 1995.
[3] M. Thorup, "All structured programs have small tree width and good register allocation," *Inf. Comput.*, vol. 142, no. 2, pp. 159–181, 1998.

Computer Science

# Kempe's algorithm

Basic observation:

- given a graph that contains a node with degree less than K, the graph is K-colorable iff the graph without that node is K-colorable
- this is called the "degree<K" rule
- So, step #1 of Kempe's algorithm:
  - iteratively remove nodes with degree<K

Carnegie Mellon 🌺

ol of Computer Scien















# **Spilling to Memory**

#### **RISC Architectures**

- Only load and store can access memory
  - · every use requires load
  - · every def requires store
  - · create new temporary for each location

#### **CISC** Architectures

- can operate on data in memory directly
  - makes writing compiler easier(?), but isn't necessarily faster
- pseudo-registers inside memory operands still have to be handled

Carnegie Mellon

# Spilling a use

For an instruction like -t <- (u,v)If u is marked as an actual spill, transform to  $-\underline{u}' := \mathbf{u}$  (*i.e.*, *a load instruction*)  $-t <- (\underline{u}',v)$ where u' is a new temp  $\mathbf{u}$  and  $\underline{u}'$  are special:  $-\mathbf{u}$  is spilled and thus *unallocatable*  $-\underline{u}'$  is marked as *unspillable* 

Carnegie Mellon

# Spilling a def

For an instruction like -t <- (u,v)If t is marked as an actual spill, transform to  $-\underline{t}' <- (u,v)$   $-t := \underline{t}'$  (*i.e., a store instruction*) where t' is a new temp t and  $\underline{t}'$  are special: -t is spilled and thus *unallocatable*  $-\underline{t}'$  is marked as *unspillable* 

Carnegie Mellon





# **Spill code generation**

The effect of spill code generation is to turn long live ranges into a bunch of tiny live ranges

This introduces new temps

Hence, register allocation must start over from scratch whenever spill code is generated

Carnegie Mellon

Carnegie Mellon

School of Computer Sci



# What to Spill? When choosing potential spill node want: A node that makes graph easier to color Fewer spills later A node that isn't "expensive" to spill An expensive node would slow down the program if spilled We can apply heuristics both when choosing potential spill nodes and when choosing actual spill nodes

not required to spill node that we popped off stack and can't color















































# Why Two Methods?

•Why not?

•With Briggs, one needs to look at all neighbors of a & b •With George, only need to look at neighbors of a.

Carnegie Mellon

So:

Use George if one of a & b has very large degree
 Use Briggs otherwise







## **Alternative Allocators**

#### Local/Global Allocation

gcc's approach, unless -fnew-ra

Carnegie Mellon

- Allocate "local" pseudo-registers
   Lifetime contained within basic block
  - Register sufficiency no longer NP-Complete!
- Allocate global pseudo-registers
  - Single pass global coloring
  - Coloring heuristic may reverse local allocation
- Reload pass to fix spills (allocator does not generate spill code)
- Can also do global then local
- Advantages? Disadvantages?

## In Chaitin's words

"...since I was a mathematician, the register allocation kept getting simpler and faster as I understood better what was required. I preferred to base algorithms on a simple, clean idea that was intellectually understandable rather than write **complicated** *ad hoc* **computer code**...

So I regard the success of this approach, which has been the basis for much future work, as a triumph of the power of a simple mathematical idea over ad hoc hacking. Yes, **the real world is messy and complicated**, but one should try to base algorithms on clean, comprehensible mathematical ideas and only complicate them when absolutely necessary. In fact, certain instructions were omitted from the 801 architecture because they would have unduly complicated register allocation..."









Complexity of Register Allocation



## **Complexity of Register Allocation**

Complexity of local register allocation?

- linear algorithm for register sufficiency

### SSA Form?

- interference graph is turns out to be both perfect<sup>1</sup> and chordal<sup>2</sup>
  - · can color in linear time
- BUT all bets are off after SSA elimination<sup>3</sup>

 Philip Brisk, Foad Dabiri, Jamie Macbeth, and Majid Sarrafzadeh. Polynomial time graph coloring register allocation. In 14th International Workshop on Logic and Synthesis. ACM Press, 2005.
 Schastian Hack. Interference graphs of programs in SSA-form. Technical Report ISSN 1432-7864, Universitat Karlsruhe, 2005.
 Bens Palsberg and Fernando Magno Quintao Pereira Register allocation after classical SSA elimination is NP-complete. In Proceedings of FOSSACS'06, Foundations of Software Science and Computation Structures. Springer-Verlag (LNCS), Vienna, Austria, March 2006.

Carnegie Mellon

## **Complexity of Register Allocation**

Complexity of optimizing spill code? – NP-complete even without control flow<sup>1</sup> Complexity of optimal coalescing? – NP-complete<sup>2</sup>

 Martin Farach and Vincenzo Liberatore. On local register allocation. In 9th ACMSIAM symposium on Discrete Algorithms, pages 564 { 573. ACM Press, 1998.

2] Andrew W. Appel and Lal George. Optimal spilling for cisc machines with few registers. In Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation, pages 243–253. ACM Press, 2001. Carrogie Mellon

of Computer Science