Summary

To summarize the above observations, the optimal number of servers depends on the loads and on the job size variabilities. Whether a few fast servers are preferable to many slow servers can be understood by combining the following three rules of thumb.

1. A few fast servers have an advantage over many slow servers in terms of utilization. When the load is lower, the difference in utilization becomes greater, and as a result the advantage of a few fast servers becomes greater (for both the high and low priority jobs).
2. Many slow servers reduce the impact of job size variability on mean response time, since they allow small jobs to avoid queueing behind large jobs. When the variability of job sizes is higher and/or when the load is higher, this effect becomes greater, and as a result the advantage of many slow servers becomes greater (for both the high and low priority jobs).
3. Many slow servers reduce the impact of prioritization on low priority jobs. When the mean and/or variability of the high priority job size is larger and/or when the load of the high priority jobs is higher, this effect becomes greater, and as a result the advantage of many slow servers becomes greater (for low priority jobs).
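
The first two rules can be illustrated numerically with a minimal sketch. It is not the analysis used in this chapter: it assumes a single job class (no priorities), Poisson arrivals, a total service capacity fixed at 1 and split evenly over k servers, and a mean job size of 1 on a speed-1 server. The queueing probability is the standard Erlang C formula, which is exact for exponential job sizes. For other job size distributions the sketch scales the M/M/k mean waiting time by (1 + C^2)/2, where C^2 is the squared coefficient of variation of the job size; this is a classical two-moment approximation that is exact only for k = 1. The names `erlang_c`, `approx_mean_response_time`, and `best_k` are hypothetical helpers introduced here for illustration.

```python
def erlang_c(k, a):
    """Probability that an arrival must queue in an M/M/k queue.

    k: number of servers; a: offered load (arrival rate / per-server rate).
    Requires a < k for stability.
    """
    b = 1.0  # Erlang B blocking probability with 0 servers
    for n in range(1, k + 1):
        b = a * b / (n + a * b)  # standard Erlang B recursion
    rho = a / k
    return b / (1.0 - rho * (1.0 - b))


def approx_mean_response_time(k, rho, scv=1.0):
    """Mean response time with k servers of speed 1/k at load rho.

    ASSUMPTIONS (not from the text): single class, Poisson arrivals,
    mean job size 1 at speed 1, and the two-moment scaling (1 + scv) / 2
    of the M/M/k waiting time. Exact for scv == 1 or k == 1.
    """
    mu = 1.0 / k   # service rate of each slow server
    lam = rho      # arrival rate equals load when total capacity is 1
    wait_mmk = erlang_c(k, lam / mu) / (k * mu - lam)
    return 0.5 * (1.0 + scv) * wait_mmk + 1.0 / mu


def best_k(rho, scv, max_k=50):
    """Number of servers in 1..max_k minimizing the approximate mean response time."""
    return min(range(1, max_k + 1),
               key=lambda k: approx_mean_response_time(k, rho, scv))


if __name__ == "__main__":
    for rho in (0.1, 0.5, 0.9):
        for scv in (1.0, 10.0, 100.0):
            print(f"load={rho}, C^2={scv}: best k = {best_k(rho, scv)}")
```

Under these assumptions, a single fast server is best when job sizes are exponential, and its relative advantage over k slow servers is largest at low load (rule 1). Raising the variability or the load makes the waiting-time term dominate, so the minimizing k moves above 1 (rule 2). The third rule involves priorities and is not captured by this single-class sketch.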

The third rule has an interesting implication for the overall mean response time of a dual priority system, since prioritizing small jobs has the same effect as having multiple servers, in the sense that both allow small jobs to avoid queueing behind large jobs. Thus, when small jobs have higher priority, a smaller number of (fast) servers is preferable. Conversely, since prioritizing large jobs has the opposite effect, a larger number of (slow) servers is preferable when large jobs have higher priority.

The above rules immediately provide answers to the first three questions that we posed in the introduction. Our discussion of Figure 4.5 provides an answer to the fourth question. In particular, the mean response time (both overall and per class) can be dramatically different for different numbers of servers, and in all studied cases there exists an "optimal" number of servers, such that using fewer or more servers results in worse performance under highly variable service distributions.

Takayuki Osogami 2005-07-19