Next: State of the art Up: Improving traditional task assignment Previous: Improving traditional task assignment Contents

Task assignment in server farms

Distributed servers have become increasingly common because they allow for increased computing power while being cost-effective and easily scalable. In a distributed server system, jobs (tasks) arrive and must each be dispatched to exactly one of several host machines for processing. The rule for assigning jobs to host machines is known as the task assignment policy. The choice of the task assignment policy has a significant effect on the performance perceived by users. Designing a distributed server system thus often comes down to choosing the ``best'' possible task assignment policy for the given model and user requirements. While devising new task assignment policies is easy, analyzing even the simplest policies can prove to be difficult. Many of the long standing open questions in queueing theory involve the performance analysis of task assignment policies.

In this chapter, we consider the particular model of a distributed server system in which hosts are homogeneous and the execution of jobs is nonpreemptive (run-to-completion), i.e. the execution of a job cannot be interrupted and subsequently resumed. We consider both the case where jobs are immediately dispatched upon arrival to one of the host machines for processing (the immediate dispatching model), and the case where jobs are held in a central queue until requested by a host machine (the central queue model). The immediate dispatching of jobs is sometimes crucial for scalability and efficiency, as it is important that the router not become a bottleneck. On the other hand, the central queue model allows us to delay the decision of where to assign jobs, and this can lead to better utilization of resources and lower response time of jobs.

The immediate dispatching and central queue models with the nonpreemptive assumption are motivated by servers at supercomputing centers, where jobs are typically run to completion. The schedulers at supercomputing centers typically only support run-to-completion within a server machine for the following reasons. First, the memory requirements of jobs tend to be huge, making it expensive to swap out a job's memory for timesharing [52]. Also, many operating systems that enable timesharing for single-processor jobs do not facilitate preemption among several processors in a coordinated fashion [157]. Our models are also consistent with validated stochastic models used to study a high volume web sites [82,184], studies of scalable systems for computers within an organization [166], and telecommunication systems with heterogeneous servers [24].

Next: State of the art Up: Improving traditional task assignment Previous: Improving traditional task assignment Contents

Takayuki Osogami 2005-07-19