Our analysis illuminates principles governing the performance (in particular, the mean response time) of multiserver systems as compared to single server systems. These principles can be characterized primarily by three rules of thumb.
In fact, the third rule has an important implication for the design of scheduling policies in multiserver systems: prioritizing small jobs to improve mean response time is not as effective in multiserver systems as in single server systems.
We have also proposed an approximate analysis, DR-A, of the (per class or overall) mean response time in an M/PH/ queue with many priority classes. DR-A is based on DR, but unlike the two approximations introduced in Section 3.6 (DR-PI and DR-CI), its running time does not grow as the number of priority classes increases. The accuracy of DR-A, as well as that of two existing approximations (BB and MK-N), is evaluated against simulation, and the results are discussed extensively. We find that the error in DR-A is within 5% for a range of loads and job size variabilities, while the error in BB and MK-N can be as high as 50%. Since BB is based on the assumption that the effect of priority is similar in a single server system and a multiserver system, the error in BB can also be explained by the above observation that prioritizing small jobs to improve mean response time is not as effective in multiserver systems as in single server systems.
In this chapter, we primarily limit our discussion to the question ``how many servers are best?'' However, the analysis of the M/PH/ queue with priority classes via DR is broadly applicable to capacity planning for multiserver systems with multiple priority classes. For example, in [204], we study the impact of system tasks on the performance of user tasks in the context of dependability systems, where system tasks have higher priority than user tasks for the purpose of fault recovery, fault isolation, fault masking, intrusion detection, virus checking, etc.
The results in this chapter can be used to infer how performance is affected by changing the number of servers, or by prioritizing some of the jobs, in more complex multiserver systems. For example, in the rest of the thesis, we consider multiserver systems consisting of multiple queues and multiple servers, where each server behaves differently from the others. We typically assume that there are no priorities within each queue and that there is only a single server of each type. By contrast, the model studied in this chapter has multiple homogeneous servers and a single queue in which some jobs have priority over other jobs. The results in this chapter can therefore be used to infer how the mean response time is affected in the systems studied in the rest of the thesis when there are priorities within each queue and multiple servers of each type.
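To make concrete the kind of question involved in changing the number of servers, the following is a minimal sketch for a much simpler model than the one analyzed in this chapter: an M/M/k queue (exponential job sizes, no priority classes), evaluated with the standard Erlang C formula rather than with DR. The function names and the convention of splitting a fixed total service rate evenly across the servers are assumptions made for illustration only.

```python
from math import factorial


def erlang_c(k, a):
    """Probability that an arrival must wait in an M/M/k queue.

    k: number of servers; a: offered load lambda/mu, which must satisfy a < k.
    """
    rho = a / k
    tail = a**k / factorial(k) / (1.0 - rho)
    head = sum(a**n / factorial(n) for n in range(k))
    return tail / (head + tail)


def mean_response_time_mmk(lam, k, total_rate):
    """Mean response time in an M/M/k queue with total capacity held fixed.

    Assumption for illustration: each of the k servers has rate total_rate / k.
    """
    mu = total_rate / k              # per-server service rate
    a = lam / mu                     # offered load
    if a >= k:
        raise ValueError("unstable: arrival rate exceeds total capacity")
    mean_wait = erlang_c(k, a) / (k * mu - lam)
    return mean_wait + 1.0 / mu      # queueing delay plus service time


if __name__ == "__main__":
    # Tabulate mean response time as total capacity is split over more servers.
    for k in (1, 2, 4, 8):
        print(k, mean_response_time_mmk(lam=0.5, k=k, total_rate=1.0))
```

This sketch covers only the exponential, single-class case. It does not capture the effects of job size variability or of priority classes, which require the analysis of this chapter.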