By far the most common task assignment policies used in the immediate dispatching model are Random and Round-Robin. The Random policy assigns an incoming job to each server with probability , where is the number of servers, while the Round-Robin policy assigns jobs to servers in a cyclic order. Random and Round-Robin are simple, but they neither maximize utilization of the hosts, nor minimize mean response time.
In the immediate dispatching model, the Shortest-Queue policy has been shown to be optimal under various constraints and objective functions under the conditions that the service demand has an exponential distribution and job sizes are not known a priori [49,106,124,192,209]. Under the Shortest-Queue policy, incoming jobs are immediately dispatched to the host with the fewest number of jobs.
In the central queue model, the M/G/k/FCFS policy has been proven to minimize mean response time and maximize utilization under the condition that the service demand has an exponential distribution and job sizes are not known a priori [210]. The M/G/k/FCFS policy holds all jobs at the central queue, and when a host becomes free, it receives a job from the central queue in the order of their arrivals (first come first served, FCFS). The M/G/k/FCFS policy is provably identical to the Least-Work-Remaining policy in the immediate dispatching model [64]. The Least-Work-Remaining policy assumes that the jobs sizes are known a priori, and it sends each job to the host with the least total remaining work.
While policies like Shortest-Queue and M/G/k/FCFS perform well when job sizes have an exponential distribution (or distribution with increasing failure rate), they perform poorly when the job size distribution has higher variability [106,201]. In such cases, it has been shown analytically and empirically that the Dedicated policy far outperforms these other policies with respect to minimizing mean response time [65,174]. In the Dedicated policy, some hosts are designated as the ``short hosts'' and others as the ``long hosts.'' Short jobs are always sent to the short hosts and long jobs to the long hosts. Observe that the Dedicated policy is defined for both the immediate dispatching model and the central queue model, and the two models are identical under the Dedicated policy. The Dedicated policy is popular in practice (e.g. Cornell Theory Center) where different host machines have different duration limitations: 0-1/2 hour, 1/2 - 2 hours, 2 - 4 hours, etc., and users must specify an estimated required service demand for each job. The Dedicated policy performs well when job sizes have high variability because it isolates shorts jobs from the long jobs, as waiting behind the long jobs is costly. The Dedicated policy is also popular in supermarkets and banks, where an ``express'' queue is created for ``short'' jobs.
The Dedicated policy requires to know job sizes upon their arrivals; however, even when the job size is not known, a policy similar to Dedicated, known as the TAGS policy (task assignment by guessing size) works almost as well when job sizes have high variability. Like Dedicated, the TAGS policy significantly outperforms other policies that do not segregate jobs by size [64].