An important problem in designing multiserver systems is how to configure the system, given limited resources and user requirements. For example, a system architect may want to decide whether to buy a few fast (but expensive) servers or many slow (but cheap) servers to minimize mean response time, given a limited budget. In this chapter, we study this problem in a particular setting: Is it preferable to use a single fast server of speed $s$, or $k$ slow servers each of speed $s/k$, to minimize mean response time? What is the optimal $k$? Although our analysis via DR extends to more general settings that take into account the actual cost of a server as a function of its speed, this would complicate the analysis and obscure the insights into the fundamental nature of multiserver systems. Our goal in addressing the above problem is to illuminate principles on how the number of servers affects the performance of multiserver systems, principles that are useful in configuring such systems.
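To make the trade-off concrete, the following minimal sketch (ours, not part of the original analysis) compares the two configurations in the simplest case, an M/M/$k$ queue with exponential job sizes, using the standard Erlang-C formula; the function and parameter names are our own. With exponential sizes, the single fast server ($k=1$) comes out best, the classical baseline that the PH analysis in this chapter refines.

\begin{verbatim}
from math import factorial

def mm_k_mean_response(lam, k, total_speed):
    """Mean response time E[T] of an M/M/k queue in which the total
    service capacity `total_speed` is split evenly over k servers.
    Jobs have exponential size with mean 1 and arrive at rate lam."""
    mu = total_speed / k                 # per-server service rate
    a = lam / mu                         # offered load in Erlangs
    rho = a / k                          # per-server utilization
    assert rho < 1, "unstable: need lam < total_speed"
    # Erlang-C: probability that an arriving job must queue
    head = sum(a ** n / factorial(n) for n in range(k))
    tail = a ** k / (factorial(k) * (1 - rho))
    p_queue = tail / (head + tail)
    # E[T] = E[wait] + E[service], with E[wait] = P(queue)/(k*mu - lam)
    return p_queue / (k * mu - lam) + 1 / mu

for k in (1, 2, 4, 8):                   # system load rho = 0.7 throughout
    print(k, round(mm_k_mean_response(0.7, k, 1.0), 3))
\end{verbatim}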
The question of ``how many servers are best'' in exactly our setting has been asked by a stream of research [120,130,131,172,175,188], but all of this work is limited to the case where jobs are served in the order of their arrival (first come first served, FCFS). However, real-world systems are often not simply FCFS systems in which all jobs have equal importance. Rather, there is often inherent scheduling in the system that allows high priority jobs to move ahead of low priority jobs. For example, the high priority jobs may carry more importance, e.g., representing users who pay more. Alternatively, high priority jobs may simply have small service demands, since giving priority to short jobs reduces overall mean response time. Either way, priorities are fundamental to many modern systems.
This motivates the question of what the optimal system configuration is (a few fast servers or many slow servers) in a setting with different priority classes. Specifically, we assume that the high priority jobs and the low priority jobs arrive according to independent Poisson processes, that their service demands have phase type (PH) distributions (as defined in Section 2.2), and that jobs within each class are served in FCFS order. Under these assumptions, the system can be modeled as a foreground-background (FB) process, as discussed in Section 3.4, which allows us, for the first time, to analyze this system via DR.
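As a concrete illustration of the PH service distributions assumed here (Section 2.2 gives the formal definition), the sketch below, which is our own and not part of the original text, represents a two-phase hyperexponential distribution as a PH distribution $(\alpha, T)$ with arbitrary example parameters, and computes its mean and squared coefficient of variation from the standard moment formula $E[S^n] = (-1)^n\, n!\, \alpha T^{-n} \mathbf{1}$.

\begin{verbatim}
import numpy as np

# Two-phase hyperexponential (H2) written as a PH distribution (alpha, T):
# with probability p the job is exponential(mu1), otherwise exponential(mu2).
# The parameters are arbitrary, chosen only to give high variability.
p, mu1, mu2 = 0.4, 2.0, 0.5
alpha = np.array([p, 1 - p])        # initial phase probabilities
T = np.array([[-mu1, 0.0],          # PH subgenerator: no transitions between
              [0.0, -mu2]])         # phases, absorption at rate mu_i in phase i

one = np.ones(2)
Tinv = np.linalg.inv(T)
m1 = -alpha @ Tinv @ one            # E[S]   = -alpha T^{-1} 1
m2 = 2 * alpha @ Tinv @ Tinv @ one  # E[S^2] = 2 alpha T^{-2} 1
c2 = m2 / m1 ** 2 - 1               # squared coefficient of variation
print(f"mean = {m1:.3f}, C^2 = {c2:.3f}")
\end{verbatim}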
In particular, DR allows us to address the following questions:
We also compare the analysis via DR with existing approximations in the literature, and find that the existing approximations can lead to qualitatively misleading conclusions regarding the optimal number of servers. Since the existing approximations have poor accuracy and DR becomes computationally expensive as the number of servers and priority classes grows, we propose a new approximation based on DR, which we refer to as DR-A (DR with class aggregation). DR-A allows us to efficiently analyze the performance of multiserver systems with many priority classes. We study the error in the existing approximations and in DR-A extensively, and find that the error in DR-A is within 5% for a range of parameters, far smaller than that of the existing approximations.
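The details of DR-A appear in Section 4.5; one basic ingredient of any such class aggregation, however, is computing the service distribution of the merged class. The sketch below is our own illustration of that single step, not the thesis's algorithm: merging independent Poisson classes yields a Poisson process whose jobs draw their service demand from the arrival-rate-weighted mixture of the per-class distributions, so each moment of the aggregate is the rate-weighted average of the per-class moments.

\begin{verbatim}
def aggregate_class_moments(rates, moments):
    """Moments of the service distribution obtained by merging several
    Poisson job classes into one: the merged arrivals are Poisson with
    the summed rate, and an arrival belongs to class i with probability
    rates[i] / sum(rates), so each moment of the aggregate is the
    rate-weighted mixture of the per-class moments.

    rates:   per-class Poisson arrival rates [lam_1, ..., lam_m]
    moments: per-class moments, moments[i] = (E[S_i], E[S_i^2], E[S_i^3])
    """
    total = sum(rates)
    weights = [lam / total for lam in rates]
    agg = tuple(sum(w * m[j] for w, m in zip(weights, moments))
                for j in range(3))
    return total, agg

# Example: two classes with exponential sizes of mean 1 and mean 4
# (moments of exp(mu) are n!/mu^n).
lam_total, (m1, m2, m3) = aggregate_class_moments(
    [0.3, 0.2],
    [(1.0, 2.0, 6.0), (4.0, 32.0, 384.0)])
\end{verbatim}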
The rest of this chapter is organized as follows. In Section 4.2, we discuss prior work dealing with our question of ``how many servers are best?'' In Section 4.3, we answer question 1 above in the case of an M/PH/$k$/FCFS queue with a single priority class. In Section 4.4, we address all four questions above for the case of an M/PH/$k$ dual-priority queue. In Section 4.5, we propose DR-A and study the accuracy of DR-A and the existing approximations.