Figure 4.1(a) shows the optimal number of servers as
a function of the load and variability of the job size,
where the job size has a two-phase PH distribution.
All of our results are expressed as a
function of the variability of the job size distribution
(specifically, its squared coefficient of variation, $C^2$) and the
server load, $\rho$. While other factors, e.g. the exact form of the
job size distribution, might affect our results, we posit that
load and variability are the most relevant factors.
As proved by Stidham [188],
a single server minimizes mean response time when
$C^2 = 1$ (exponential distribution)
or when
$C^2 = 1/2$ (Erlang-2 distribution).
Observe, however, that under high job size variability and/or high
load, the optimal number of servers is more than 1; we prefer multiple
slow servers to a single fast server. For example, at load
and
, we see that three
servers are best. Computations are only done for up to six
servers; the level curves shown would continue into the upper
right portion of the plot if a larger number of servers were considered.
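As a sanity check on the exponential case ($C^2 = 1$), where a single server is optimal, the following is a minimal sketch based on the standard Erlang-C formula for an M/M/k queue. It assumes a mean job size of 1, a total capacity of 1 split evenly over k servers (each of speed 1/k), and FCFS service. The function names are illustrative and do not come from the text.

```python
from math import factorial


def erlang_c(k, a):
    """Probability that an arrival must queue in an M/M/k system.

    k: number of servers; a: offered load lambda/mu, which must satisfy a < k.
    """
    rho = a / k
    tail = a**k / (factorial(k) * (1.0 - rho))
    head = sum(a**n / factorial(n) for n in range(k))
    return tail / (head + tail)


def mmk_mean_response(k, rho):
    """Mean response time of M/M/k with k servers of speed 1/k each.

    Assumes mean job size 1 and total capacity 1, so the arrival rate is rho.
    """
    mu = 1.0 / k    # per-server service rate
    lam = rho       # arrival rate (load times total capacity)
    a = lam / mu    # offered load, equal to k * rho
    mean_wait = erlang_c(k, a) / (k * mu - lam)
    return 1.0 / mu + mean_wait


if __name__ == "__main__":
    for k in range(1, 7):
        print(k, round(mmk_mean_response(k, 0.5), 3))
```

Under these assumptions, at load 0.5 the sketch gives a mean response time of 2 for one server and about 2.67 for two, and the value keeps growing with k, in line with the single-server optimality stated above for exponential job sizes.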
Figure 4.1(b) shows that for any particular job size
variability, $C^2$, having a larger number of slower servers may
reduce the mean response time up to a point, after which further
increasing the number of servers increases the mean response time. To
understand why, note that by increasing the number of servers (while
maintaining fixed total capacity), we
allow short jobs to avoid queueing behind long jobs: specifically,
an arriving short job is more likely to find a server free. Thus,
increasing the number of servers mitigates the impact of job size variability, thereby
improving performance. If the number of servers is too high,
however, servers are more likely to be idle, under-utilizing the
system resources.
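This trade-off can be seen on a toy trace. The sketch below computes the mean response time of a FCFS queue with k servers of speed 1/k each (total capacity 1), given arrival times and job sizes. The trace, one long job followed by four short ones, is a hypothetical example chosen for illustration; it is not data from Figure 4.1, and the function name is our own.

```python
import heapq


def mean_response_time(arrivals, sizes, k):
    """Mean response time under FCFS with k servers of speed 1/k each.

    arrivals: non-decreasing arrival times; sizes: job sizes measured as
    service times on a single speed-1 server (hypothetical inputs).
    """
    free_at = [0.0] * k  # min-heap of the times at which servers become free
    total = 0.0
    for t, size in zip(arrivals, sizes):
        earliest = heapq.heappop(free_at)  # next server to become free
        start = max(t, earliest)           # wait if all servers are busy
        finish = start + size * k          # a speed-1/k server is k times slower
        heapq.heappush(free_at, finish)
        total += finish - t
    return total / len(sizes)


if __name__ == "__main__":
    # Hypothetical trace: one long job, then four short jobs.
    arrivals = [0.0, 1.0, 2.0, 3.0, 4.0]
    sizes = [10.0, 1.0, 1.0, 1.0, 1.0]
    for k in (1, 2, 3, 5):
        print(k, mean_response_time(arrivals, sizes, k))
```

On this trace the mean response time is 10 with one server, 6.8 with two, 8.8 with three, and 14 with five. Two servers let the short jobs bypass the long one, while with five servers every job is served so slowly that the benefit is lost.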