Improving traditional task assignment policy (Chapter 5)

In Chapter 5, we consider a fundamental question in multiserver systems: ``Which job should be assigned to which server?'' Specifically, we study task assignment policies in a server farm architecture shown in Figure 1.1(a), where our objective is to minimize the mean response time. We assume that the execution of a job cannot be interrupted and subsequently resumed; i.e., a job is run to completion (nonpreemptive). Our model is motivated by servers at supercomputing centers and other systems, where jobs are typically run to completion.

When service demands (job sizes) have high variability, it has been shown analytically and empirically that a size-based task assignment policy, Dedicated, far outperforms other common task assignment policies^1.5 [65,174]. In the Dedicated policy, some servers are designated as the ``short servers'' and others as the ``long servers.'' Short jobs are always sent to the short servers and long jobs to the long servers. The idea behind Dedicated is that, under highly variable service demands, it is important to isolate short jobs from the long jobs, as having short jobs wait behind long jobs is very wasteful. Given the extremely high variability of service demands under many computer workloads [46,47,66,113,174], Dedicated is preferable to other policies.

However Dedicated is still not optimal. One problem is that Dedicated can lead to situations where the servers are not fully utilized: five consecutive short jobs may arrive, with no long job, resulting in an idle long server. This is especially likely in common computer workloads, where there are many short jobs and just a few very long jobs, resulting in longer idle periods between the arrivals of long jobs.

The possible low utilization of Dedicated motivates us to study a new task assignment policy, the size-based task assignment with cycle stealing under immediate dispatching, SBCS-ID (see Figure 1.8). The idea behind SBCS-ID is that we would like to segregate jobs by size so as to provide isolation for short jobs, but during times when the long job server is free, we would like to steal the long server's idle cycles and use those to serve incoming short jobs. We will see that SBCS-ID provides significant improvement over Dedicated.

**Figure 1.8:** *The* `SBCS-ID` policy. An arriving long job is always dispatched to the long server. An arriving short job first checks to see if the long server is idle. If so, the short job is dispatched to the long server; otherwise, the arriving short job is dispatched to the short server. Jobs at each server are serviced in FCFS order.
$\includegraphics[width=.57\linewidth]{fig/model/SBCS.eps}$

However, under SBCS-ID, only those short jobs arriving after the long server has entered an idle period can steal idle cycles of the long server. Hence, if a short job arrives just before the long server enters an idle period, the short job is not eligible for stealing idle cycles of the long server.

This motivates us to study another new task assignment policy, the size-based task assignment with cycle stealing under central queue, SBCS-CQ (see Figure 1.9). Under SBCS-CQ, arriving jobs are held in a central queue, instead of being dispatched immediately. The central queue delays the decision of which jobs should be processed by which server, and this enables more short jobs to utilize the idle cycles of the long server.

Although cycle stealing is an old concept, the performance analysis of policies with cycle stealing has eluded researchers as it involves multidimensional Markov chains. We will analyze the performance of SBCS-ID and SBCS-CQ via DR. Our analysis shows that the short jobs can benefit by an order of magnitude under both SBCS-ID and SBCS-CQ as compared to Dedicated, while the long jobs are penalized only by a small percentage. We also find that SBCS-CQ is a superior strategy to SBCS-ID from the perspective of both the short jobs and the long jobs. We will also study the effect of service demand variability, the effect of increasing the arrival rate of the short jobs and the number of short servers, and the effect of stealing idle cycles of short servers.

**Figure 1.9:** *The* `SBCS-CQ` policy. All jobs are held in a central queue. Whenever the short server becomes idle, it picks the first short job in the queue to run. Whenever the long server becomes idle, it picks the first long job in the queue. However, if there is no long job, the long server picks the first short job in the queue.
$\includegraphics[width=.45\linewidth]{fig/model/SBCS-CQ.eps}$