Load Shedding

The preceding heuristic attempts to cluster mutually interested objects together in order to reduce the number of replicas in the system. However, it does not attempt to balance the load on servers in the system. The object placement component uses a separate load shedding heuristic to offload objects from a node when it is too heavily loaded (i.e., above

). However, this heuristic must also try to avoid increasing the number of replicas in the system to reduce the danger of oscillations resulting with the clustering heuristic. It proceeds as follows:

When

, node

attempts to offload primaries from itself to more lightly loaded nodes in the system. To decide which primary objects to offload,

first traverses the (undirected) interest graph formed by the objects (both primary and replica) in its object store and finds the connected components formed by interest links. The motivation here is that a component can be migrated to another node without increasing aggregate load in the system since no object in the component maintains a link to

. Hence, at worst, migrating the entire component only maintains existing inter-node interest links and does not create new ones. We can actually ignore the replicas in the interest graph when forming components; however, by including replicas in the component, we are guaranteed to eliminate them away from

when migrating the entire component, since no other object on

is interested in them. Hence, the savings that

obtains by migrating a component

, the set of primary objects in the component, is:

$\displaystyle savings(C) ~~=~~ \sum_{x \in C} f_x + \sum_{k \neq n} weight(C,k)~,$

$\displaystyle weight(C)$	$\displaystyle =$	$\displaystyle ~~\sum_{x \in C} u_x \cdot interested(x,n) + \sum_{y \in I_n} u_y~,$
$\displaystyle I_n$	$\displaystyle =$	$\displaystyle ~~\{ y ~\vert~ \forall k \neq n, y \in interests(C,k) \}~,$

The load discovery primitive described above is used to discover nodes that are below

. A random subset of these nodes is picked as migration candidates (to avoid everyone offloading to the same nodes).

can decide to migrate a component of objects

to node

if:

$\displaystyle highwater ~~<~~ load_m + \sum_{x \in C} f_x + \sum_{k \neq m} weight(C,k)~,$

We consider migrating smaller clusters first, because they are less likely to oscillate between nodes than larger ones and they impact fewer objects. The candidates are examined in ascending order by their load and the heuristic offloads clusters to the candidate with the lightest load (updating the estimated load as we decide to move a cluster to a node) until the savings brings

's load below