### Browsed byCategory: Reliability Theory

Extensions of the Single Server Efficiency Model

## Abstract

Editor’s note: this paper comprises the third chapter of the PhD dissertation by Rachel Traylor. Visit here and here to see chapters one and two, respectively. Herein we further generalize the single server model presented in [3]. In particular, we consider a multichannel server under the cases of both singular and clustered tasks. In the instance of singular tasks, we present a load balancing allocation scheme and obtain a stochastic breakdown rate process, as well as derive the conditional survival function as a result. The model of a multichannel server taking in clustered tasks gives rise to two possibilities, namely, independent and correlated channels. We derive survival functions for both of these scenarios.

## Load Balancing Allocation for a Multichannel Server

### Model Description

Previously, we had assumed that a web server functions as a single queue that attempts to process jobs as soon as they arrive. These jobs originally brought a constant stress $\eta$ to the server, with the system stress reducing by $\eta$ at the completion of each job.

Now, suppose we have a server partitioned into K channels. Denote each channel as $Q_k,$ $k = 1,\ldots,K$. Jobs arrive via a nonhomogenous Poisson process with rate $\lambda(t)$. Upon arrival, each job falls (or is routed) to the channel with the shortest queue length. If all queue lengths are equal or multiple channels have the shortest length, the job will enter one of the appropriate queues with equal probability.

We retain the previous notation for the baseline breakdown rate, or hazard function. This is denoted by $r_0(t)$ and is the hazard function under an idle system. We also retain the assumption that the arrival times $\mathbf{T}$ are independent. In addition, the service times $\mathfrak{W}$ are i.i.d. with distribution $G_W(w)$. We assume that all channels are serving jobs at the same time, i.e., a job can be completed from any queue at any time. We do not require load balancing for service. In other words, any queue can empty with others still backlogged. We also retain the FIFO service policy for each queue.

Since we have now “balanced,” or distributed, the load of jobs in the server, not all jobs will cause additional stress to the system. Suppose all jobs bring the same constant stress $\eta$ upon arrival. Under load balancing, we will define the additional stress to the system as $\eta\max_k|Q_k|$. Figure 1 shows an example server with current stress of $4\eta$.

### Examples

Due to the dynamic nature of arrival times, allocation to queues, and service times, we have many possible configurations of jobs at any point in time. Therefore, the allocation scheme adds an additional layer of variation to the service times and order of service. The placement of jobs in the various queues (and thus the order of service and service times) is wholly dependent on all arrival times and service times of the prior arrivals. The following examples illustrate the effect on the workload stress added to the system in various scenarios.

###### Figure 2(b). Breakdown Rate Process Trajectory

Example 1Suppose for simplicity we have 2 channels. Suppose at the time of observation of the system, 3 jobs have arrived and none have finished. WLOG, suppose job 3 fell into $Q_{1}$. See Figure 2(a). The stress to the system at $t=t_{\text{obs}}$ is $r_{0}(t_{\text{obs}}) + 2\eta$, as shown in Figure 2(b).

Note in Example 1 that Job 2 does not add any additional stress to the system. Job 1 sees an empty queue upon arrival, and $\max_{K}|Q_{K}| = 1$ when it falls into any particular queue. Job 2 arrives as Job 1 is still being processed, and thus the placement of Job 1 forces Job 2 into the empty channel. Since $\max_{K}|Q_{K}|$ is still 1, the stress to the system doesn’t change. Job 3 arrives as Jobs 1 and 2 are in service, and thus its choice of queue is irrelevant due to the configuration of the two queues at $T_{3}$. Regardless of which queue Job 3 falls into, $\max_{K}|Q_{K}| = 2$. Thus the arrival of Job 3 increases the breakdown rate by $\eta$ again.

The next example shows the change in system stress Job 1 from Example 1 when one job has finished processing before $T_{3}$.

###### Figure 3(b). Example 2 Breakdown Rate Process Trajectory

Example 2. Consider the same two-channel system from Example 1. However, now suppose WLOG that $T_{3} < T_{1}+W_{1}$. In other words, service for Job 1 was completed before Job 3 arrived. Hence Job 3 will fall into the opposite queue as Job 2. The stress to the system at the time of observation would be $r_{0}(t) + \eta$. See Figures 3(a) and 3(b).

In this scenario, the workload due to Job 3 does not contribute any additional stress to the server. Also observe that upon completion of Job 1, the workload stress to the server does not decrease, as Job 2 still resides in the system and is being served.

Contrast this behavior with the breakdown rate process given in [3]. In the single-channel, single-server model described in both [1] and [3], each job adds stress to the server upon arrival. Under the load balancing allocation scheme, the additional stress to the server depends on the arrival and service times of all prior jobs. From a stochastic perspective, this breakdown rate process has full memory.

The examples above illustrate that $\max_{K}|Q_{K}|$ depends on the intersection of the intervals $I_{j} = [T_{j}, T_{j} + W_{j}]$, $j = 1,\ldots,N(t)$. The next section details the methodology to obtain the configuration of jobs in the server at time $t$ by deccomposition of $\bigcup_{j=1}^{N(t)}I_{j}$ into disjoint atoms and derives the stochastic breakdown rate process under the load balancing allocation scheme.

### Breakdown Rate Process and Conditional Survival Function

Let $\epsilon = (\epsilon_{1},\ldots,\epsilon_{N(t)})$ be a $N(t)$-tuple whose components $\epsilon_{j}\in\{\emptyset, c\}$, where $\emptyset$ denotes the empty set, and $c$ denotes the complement of the set. Let $E = \{\epsilon \mathrel{:} \epsilon_{j}\in\{\emptyset, c\}\}$ denote the set of all possible $\epsilon$, excepting $\epsilon = (c,\ldots,c)$. Then by Lemma 1 (see Appendix),

$$\bigcup_{j=1}^{N(t)}I_{j} = \bigcup_{\epsilon \in E}\bigcap_{j=1}^{N(t)}I_{j}^{\epsilon_{j}}$$

Remark. $\cap_{j=1}^{N(t)}I_{j}^{\epsilon_{j}}$ indicates which jobs are still in the server at time $t$. The union is disjoint; thus only one $\epsilon$ will describe the server configuration at any given time $t$. For example, if 3 jobs have arrived to the server at time $t_{\text{obs}}$, $|E| = 3\times 2 - 1 = 5$. These may be enumerated:

• $I_{1} \cap I_{2} \cap I_{3}$

• $I_{1}^{c} \cap I_{2}^{c} \cap I_{3}$

• $I_{1} \cap I_{2}^{c} \cap I_{3}^{c}$
• $I_{1}^{c} \cap I_{2} \cap I_{3}$

• $I_{1}^{c} \cap I_{2} \cap I_{3}^{c}$

As an illustration, refer to Example 1. All three jobs are in the system at $t = t_{\text{obs}}$ (that is, none have completed service), and thus $t_{\text{obs}} \in I_{1} \cap I_{2} \cap I_{3}$. Expanding, $t_{\text{obs}} \in [T_{1}, T_{1}+W_{1}]$, $[T_{2}, T_{2}+W_{2}]$, and $[T_{3}, T_{3}+W_{3}]$.

Compare the case with that of Example 2. In this case, three jobs have arrived at $t = t_{\text{obs}}$, but Job 1 has finished by $t_{\text{obs}}$. Thus $t_{\text{obs}} \not\in I_{1}$, but since Jobs 2 and 3 are still in the system, $t_{\text{obs}} \in I_{2} \cap I_{3}$. Thus $t_{\text{obs}} \in I_{1}^{c} \cap I_{2} \cap I_{3}$.

Now, since the additional workload stress to the server is a multiple of $\eta\max_{K}|Q_{K}|$, it remains to derive the appropriate multiplier that accounts for the number of jobs that contribute additional stress to the system.

Let $n = \sum_{j=1}^{N(t)}\mathbf{1}(\epsilon_{j} = \emptyset | \epsilon_{j} \in \epsilon)$ for a particular $\epsilon$, and let $\alpha_{\epsilon}$ be the multiplier that indicates the number of jobs that contribute stress $\eta$ to the system. Under [1] and the generalization in [3], every uncompleted job in the system contributes stress, thus $\alpha_{\epsilon} = n$.

Under the load balancing scheme, $\alpha_{\epsilon} = \lfloor\frac{n+1}{K}\rfloor$, where $K$ is the number of channels in the server. This is due to the allocation scheme’s attempts to evenly distribute jobs across channels. Thus, for Example 1, $n=3$, and $K=2$, meaning $\alpha_{\epsilon} = 2$, as illustrated in Figure 2(b) and for Example 2, $\alpha_{\epsilon} = \lfloor\frac{3+1}{2}\rfloor = 1$, as in Figure 3(b).

Then, the stochastic breakdown rate process under the load balancing allocation scheme is given by
$$\mathcal{B}(t) = r_{0}(t) + \eta\sum_{\epsilon \in E}\alpha_{\epsilon}\mathbf{1}_{I_{1}^{\epsilon_{1}}\cap I_{2}^{\epsilon_{2}} \cap I_{N(t)}^{\epsilon_{N(t)}}}(t)$$

Under this expression, only one indicator function will be nonzero at any given point in time, since all atoms are disjoint. Now, $I_{1}^{\epsilon_{1}} \cap I_{2}^{\epsilon_{2}} \cap \ldots \cap I_{N(t)}^{\epsilon_{N(t)}}$ may be expressed as one interval $[L_{\epsilon},R_{\epsilon}]$, where

\begin{aligned}L_{\epsilon}&=\max\left(\{T_{j}\mathrel{:}\epsilon_{j} = \emptyset\}_{j=1}^{N(t)}\right);\\R_{\epsilon}&=\min\left(\{T_{j} + W_{j}\mathrel{:}\epsilon_{j} = \emptyset \}_{j=1}^{N(t)},\{T_{j}\mathrel{:}\epsilon_{j} = c\}_{j=1}^{N(t)}\right)\end{aligned}

Thus, for a server with $K$ channels under a load balancing routing scheme with all jobs bringing constant stress $\eta$, the breakdown rate process $\mathcal{B}(t)$ may be expressed as

$$\mathcal{B}(t) = r_{0}(t) + \eta\sum_{\epsilon \in E}\alpha_{\epsilon}\mathbf{1}_{[L_{\epsilon}, R_{\epsilon}]}(t)\qquad(*)$$

Thus, the conditional survival function under the load balancing scheme is given by
\begin{aligned}S_{Y|\mathfrak{T},\mathfrak{W}, N(t)}(t|\mathfrak{t},\mathfrak{w}, n) &= e^{-\int_{0}^{t}\mathcal{B}(s)ds}\\&=\bar{F}_{0}(t)\exp\left(-\eta\int_{0}^{t}\sum_{\epsilon \in E}\alpha_{\epsilon}\mathbf{1}_{[L_{\epsilon}, R_{\epsilon}]}(s)ds\right)\\&=\bar{F}_{0}(t)\exp\left(-\eta\sum_{\epsilon\in E}\alpha_{\epsilon}\min(t-L_{\epsilon},R_{\epsilon})\right) \end{aligned}

### Remarks

Finding the survival function of the single-channel environment relied on the independence of the set of arrival times and service times. From $(*)$, the independence is clearly lost. As noted before, the random breakdown process has full memory, and thus is completely dependent upon the entire trajectory up to $t = t_{\text{obs}}$.

## Clustered Tasks in a Multichannel Server

###### Figure 4. Clustered Tasks in a Multichannel Server

The previous multichannel server model in Section 1 implicitly assumed each job comes with one task, and all channels are identical in their ability to serve any task brought by a job. A classic illustration is a block of registers at a retail establishment. Each customer will survey the length of the various queues at each register before choosing the shortest queue. Viewing each of these separate registers as a channel in a single server under these conditions gave rise to the load balancing allocation model detailed in the previous section. This section presents a different interpretation of a multichannel, single-server model.

Suppose a server has multiple channels $Q_{1},\ldots,Q_{K}$, but each channel serves a different type of task. A customer arrives to the server and may select any number from $0$ to $K$ tasks for the server to perform. Said customer will select each possible task $j$ with probability $p_{j}$. Figure 4 illustrates an example of such a situation in which three customers visit the server and each customer picks a different number and set of tasks at random. A customer is considered fully serviced (i.e. the job is complete) upon completion of the last task belonging to that particular customer.

### Model Assumptions

The following mathematical assumptions are made for the multichannel server with clustered tasks:

1. Customers arrive to the server with $K$ channels via a nonhomogenous Poisson process (NHPP) with intensity $\lambda(t)$.
2. The breakdown rate of the idle server is given by $r_{0}(t)$.
3. Each channel corresponds to a different task the server can perform.
4. The selection of each task is a Bernoulli random variable with probability $p_{k}$. Thus the number of tasks selected by each customer is a binomial random variable.
5. The workload stress to the server is a constant multiple $\eta$ of the number of tasks requested by the customer, i.e. the additional stress is given by $\eta N$, where $N$ is the number of tasks requested.
6. The PDF of each channel’s service time is given by $g_{i}(w)$, $i = 1,\ldots ,K$. Since the customer’s service is not complete until all requested tasks have finished, the service life distribution for the customers is given by $\max_{i}G_{i}(w)$.

Under these assumptions, this model is a special interpretation of the random stress environment developed in [3]. In this case, the random workload stress is $\eta N$, where $N$ is a binomial random variable, and the service life distribution $G_{W}(w) = \max_{i}G_{i}(w)$, which may be easily obtained through the mathematical properties of order statistics. Two variations are considered in the  next section: independent channels and correlated channels.

### Independent Channels in a Clustered Task Server

Suppose the selection probabilities for each task in the server are identical, that is, $p_{1}=p_{2}=\ldots=p_{K}=p$. Then $N\sim\text{Bin}(K,p)$. Using Theorem 3 in [3], the survival function of the multichannel server is given in the following theorem.

Theorem 1 (Survival Function of Multichannel Server with Clustered Tasks and Independent Channels). Suppose model assumptions (1)-(6) above are satisfied. In addition, assume $p_{1}=p_{2}=\ldots=p_{K}=p$. Then the survival function of the server is given by
\begin{aligned}S_{Y}(t)&=\bar{F}_{0}(t)\exp\left(-K\eta\left[e^{-\eta t}\left(1-p+pe^{-\eta t}\right)^{K-1}-p(1-p)^{K-1}\right]\right.\\&\qquad\left.\times\int_{0}^{t}m(t-w)\bar{G}_{W}(w)dw\right)\end{aligned}

where $m(x) = \int_{0}^{x}\lambda(s)ds$, $\bar{F}_{0}(t) = e^{-\int_{0}^{t}r_{0}(s)ds}$, $\bar{G}_{W}(w) = 1-G_{W}(w)$, and $G_{W}(w) = \max_{i}G_{i}(w)$.

### Correlated Channels in a Cluster Server

Now suppose the server tasks are correlated, in that the selection of one particular task may affect the selection of any or all of the other tasks. Thus the channels are a sequence of dependent Bernoulli random variables. The construction of dependent Bernoulli random variables is given in [2], and a summary is given.

#### Dependent Bernoulli Random Variables and the Generalized Binomial Distribution

Korzenwioski [2] constructs a sequence of dependent Bernoulli random variables using a binary tree that distributes probability mass over dyadic partitions of $[0,1]$. Let $0\leq\delta\leq1$, $0, and $q=1-p$. Then define the following quantities:
\begin{aligned}q^{+}\mathrel{:=}q+\delta p&\qquad p^{+}\mathrel{:=}p+\delta q\\q^{-}\mathrel{:=}q(1-\delta)&\qquad p^{-}\mathrel{:=}p(1-\delta)\end{aligned}

The quantities above satisfy the following conditions:
\begin{aligned}q^{+}+p^{-}=q^{-}&+p^{+}=q+p=1\\qq^{+}+pq^{-}=q&,\quad qp^{-}+pp^{+}=1\end{aligned}

###### Figure 5. Construction of Dependent Bernoulli Random Variables

Figure 5 shows the construction shows the dependencies. The following examples using coin flips illustrate the effect of the dependency coefficient $\delta$:

Example 1 $(\delta=1)$. For $\delta = 1$, $q^{+} = q+p=1, q^{-} = 0$, $p^{+} = p+q = 1$, and $p^{-} = 0$. Supposing the first coin flip $\epsilon_{1} = 1$. Then every successive $\epsilon_{i}$ will also be $1$. Similarly if $\epsilon_{1} = 0$. Thus the result of the first coin flip completely determines the outcomes of all the rest.

Example 2 $(\delta = 0)$For $\delta = 0$, $q^{+} = q^{-} = q$, and $p^{+} = p^{-} = p$. Thus, the first coin flip (and all subsequent ones) have no effect on the ones that follow.

Example 3 $(\delta = \frac{1}{4})$Suppose $p=q=\frac{1}{2}$. Then $p^{+} = q^{+} = \frac{5}{8}$, and $p^{-} = q^{-} = \frac{3}{8}$. Then the subsequent outcomes $\epsilon_{i}$, $i \geq 2$ are more likely to match the outcomes of $\epsilon_{1}$ than not.

Now suppose $p = \frac{1}{4}$, $q = \frac{3}{4}$. Then $p^{+} = \frac{7}{16}$$p^{-} = \frac{3}{16}$, $q^{+} = \frac{13}{16}$, and $q^{-} = \frac{9}{16}$. In this example of an unfair coin, the dependency coefficient $\delta$ still attempts to skew the results following the first coin flip in favor of the outcome of $\epsilon_{1}$. However, the dependency here heightens the effect of $\epsilon_{1} = 0$ on subsequent flips, and cannot overcome the discrepancy between the probability of success and failure to skew $\epsilon_{i}$, $i \geq 2$ in favor of a $1$ following the outcome of $\epsilon_{1} = 1$.

Using these dependent Bernoulli random variables, [2] presents a Generalized Binomial Distribution for identically distributed but dependent Bernoulli random variables.

Generalized Binomial Distribution

Let $X = \sum_{i=1}^{n}\epsilon_{i}$, where $\epsilon_{i}$, $i = 1,\ldots ,n$ are identically distributed Bernoulli random variables with probability of success $p$ and dependency coefficient $\delta$.  Then

$$P(X=k) = q\tbinom{n-1}{k}(p^{-})^{k}(q^{+})^{n-1-k} + p\tbinom{n-1}{k-1}(p^{+})^{k-1}(q^{-})^{n-1-(k-1)}$$

#### Survival Function of Correlated Channels in a Cluster Server

Suppose the selection of tasks may be modeled by the dependent Bernoulli random variables given in the previous section. That is, suppose the customer selects Tasks 1-$K$ in sequence, and the selection or rejection of Task 1 affects all subsequent tasks by a dependency coefficient $\delta$. From [2], the correlation between task selections $\epsilon_{i}$ and $\epsilon_{j}$ is given by
$$\rho = \text{Cor}(\epsilon_{i},\epsilon_{j})=\begin{cases}\delta,&i=1;j=2,\ldots,K\\\delta^{2},&i\neq j;i,j\geq 2\end{cases}$$

This illustrates the dependency of Tasks 2-$K$ on the outcome of Task 1, and notes that while Tasks 2-$K$ are still correlated with each other, the dependency is much lower. In a similar fashion to the independent channel server, the survival function is derived.

Theorem 2 (Survival Function of Multichannel Server with Clustered Tasks and Dependent Channels). Suppose model assumptions 1-6 are satisfied. In addition, suppose the selection of channels 1-$K$ are determined by identically distributed Bernoulli random variables with dependency coefficient $\delta$ as defined in [2]. Then the survival function of the server is given by
$$S_{Y}(t)=\bar{F}_{0}(t)\exp\left(-\eta\int_{0}^{t}m(t-w)\bar{G}_{W}(w)\mathscr{S}(w)dw\right)$$

where $m(x) = \int_{0}^{x}\lambda(s)ds$, and

\begin{aligned}\mathscr{S}(w)&=\sum_{n=0}^{K}e^{-\eta n w}\sum_{j=0}^{K-n-1}\tbinom{K-1}{n-1, j,K-1-n-j}p^{K-1-j}(1-p)^{j+1}\delta^{K-1-n-j}(1-\delta)^{n}\\&\quad+\sum_{n=0}^{K}ne^{-\eta nw}\sum_{i=0}^{n-1}\tbinom{K-1}{K-1-n,i,n-1-i}p^{i+1}(1-p)^{K-n}\delta^{n-1-j}(1-\delta)^{K-n-j}\end{aligned}

## Conclusion

The generalized model of a server under random workload proposed in [3] admits further expansion by way of relaxing the assumption that incoming tasks have exactly one queue to enter on arrival. In considering a server partitioned into several channels, a cost is incurred, namely that additional stress to the server is dependent upon arrival and service times of all previous jobs. However, even under these circumstances, we may obtain a breakdown rate process and satisfactory conditional survival function for the server, and the door is opened to further discussion. By examining the multichannel server, we consider the interrelations of the channels themselves, and derive survival functions to meet the case when the channels are independent as well as when they are correlated.

### References

[1] Ki Hwan Cha and Eui Yong Lee. A stochastic breakdown model for an unreliable web server system and an optimal admission control policy. Journal of Applied Probability, 48(2):453–466, 2011.

[2] Andrzej Korzeniowski. On correlated random graphs. Journal of Probability and Statistical Science, 11:43–58, 2013.

[3] R. Traylor. Stochastic reliability of a server under random workload. Academic Advances of the CTO, 1, 2017.

On Server Efficiency

## Abstract

Editor’s note: This paper comprises the second chapter of the PhD dissertation by Rachel Traylor. Cha and Lee defined a mathematical notion of server performance by measuring efficiency $\psi$ defined as the long run average number of jobs completed per unit time. The service time distribution heavily influences the shape of the server efficiency as a function of a constant arrival rate $\lambda$. Various classes of distributions are studied in order to find sufficient conditions for the existence of a single maximum. The existence of a maximum allows for simple binary control policies to handle traffic and optimize performance.

## Introduction, Motivation, and Background

Cha and Lee [1] studied the reliability of a single server under a constant stress workload, and also defined a notion of server efficiency $\psi$ for a given intensity $\lambda(t)$ as the long-run average number of jobs completed per unit time as a way to measure server performance. With the number of jobs completed as $M$, the efficiency is defined

$$\psi := \lim\limits_{t \to \infty}\frac{E[M(t)]}{t}$$

Upon breakdown and rebooting, the server is assumed to be ‘as good as new’, in that performance of the server does not degrade during subsequent reboots. In addition, the model assumes the arrival process after reboot, denoted $\{N^{*}(t), t \geq 0\}$, is a nonhomogenous Poisson process with the same intensity function $\lambda(t)$ as before, and that $\{N^{*}(t), t \geq 0\}$ is independent of the arrival process before reboot. In a practical setting, this model assumes no ‘bottlenecking’ of arrivals occurs in the queue during server downtime that would cause an initial flood to the rebooted server. In addition, the reboot time is assumed to follow a continuous distribution $H(t)$ with expected value $\nu$. This process is a renewal reward process, with the renewal $\{R_{n}\} = \{M_{n}\}$, the number of jobs completed. The length of a renewal cycle is $Y_{n} + H_{n}$, where $Y_{n}$ is the length of time the server was operational, and $H_{n}$ is the time to reboot after a server crash. Then, by [2],

$$\psi = \frac{E[M]}{E[Y]+ \nu}$$

where $M$ is the number of jobs completed in a particular renewal cycle, $\nu$ is the mean time to reboot of the server, and $Y$ is the length of a particular renewal cycle. Then, using the definition of $\psi$ the following closed form of the efficiency of a server under all assumptions of Cha and Lee’s model is derived.

Theorem 1 (Server Efficiency under Cha/Lee)
Suppose $\{N(t), t \geq 0\}$ is a nonhomogenous Poisson process with intensity $\lambda(t)\geq 0$. Then the efficiency is given by

\begin{aligned}\psi&=\frac{1}{\int_{0}^{\infty}S_{Y}(t)dt + \nu}\left[\exp\left(-\int_{0}^{t}r_{0}(x)dx-\int_{0}^{t}\lambda(x)dx + a(t) + b(t)\right)\right.\\&\qquad\qquad\left.\times\left(r_{0}(t)a(t)+\eta a(t)b(t) \right)\right]\end{aligned} where $a(t) = \int_{0}^{t}e^{-\eta v}g_{W}(v)m(t-v)dv$, $b(t) = \int_{0}^{t}e^{-\eta(t-r)}\bar{G}_{W}(t-r)\lambda(r)dr$, $\bar{G}_{W}(x) = 1-\int_{0}^{x}g_{W}(s)ds$, and $m(x) = \int_{0}^{x}\lambda(s)ds$.

### Numerical Example and Control Policies

As an illustrative example, Cha and Lee considered the case when $\lambda(t) \equiv \lambda$,
$r_{0}(t) \equiv r_{0} = 0.2$, $\eta = 0.01$, $\nu = 1$, and $g_{W}(w) = we^{-w^{2}/2}$ (the PDF of the Rayleigh distribution). As shown in Figure 1, there exists a $\lambda^{*}$ such that $\psi(\lambda)$ is maximized. Thus one may implement the obvious optimal control policy for server control to avoid server overload:

(1) If the real time arrival rate $\lambda < \lambda^{*}$, do not interfere with arrivals.

(2) If $\lambda \geq \lambda^{*}$, facilitate some appropriate measure of interference.

Examples of interference for a web server in particular include rejection of incoming requests or possible re-routing. Cha and Lee give an interference policy of rejection with probability $1-\frac{\lambda^{*}}{\lambda}$.

The Rayleigh distribution used in Figure 1 has applications in physics, typically when the magnitude of a vector is related to its directional components. It is a special case of the Weibull distribution which is widely used in survival analysis, failure analysis, weather forecasting, and communications. These distributions are not typically used to model service times. The exponential distribution is the most common due to its memoryless properties, followed by the Erlang and uniform distributions.

The efficiency, $\psi$, under the Rayleigh distribution example in Figure 1 shows the existence of a $0 < \lambda^{*} < \infty$ such that $\psi(\lambda)$ is maximized at $\lambda^{*}$. This useful feature of $\psi(\lambda)$ in this case allows for the implementation of a simple control policy for arrivals to the server to prevent overload, given above.

Numerical simulations under a variety of possible distribution classes, including convex, concave, exponential, uniform, and Erlang suggest that the mathematical properties of $\psi$ are heavily influenced by the choice and characteristics of service time distribution $g_{W}(w)$. In particular, it is of interest to seek sufficient conditions of $g_{W}(w)$ that will guarantee the existence of a $\lambda^{*}$ that maximizes $\psi$. This is done for the uniform, compact support, and Erlang classes. Furthermore, it is shown under certain conditions, not only does the server efficiency lack a maximum, but $\psi$ increases without bound. This is not representative of real server behavior, and thus of mathematical and practical interest to note for further modeling.

## Efficiency of a Server under Uniform Service Life Distribution

Suppose $\lambda(x) \equiv \lambda$, and suppose $r_{0}(x) \equiv r_{0} = \max_{x \in (0,\infty)}r_{0}(x)$. The efficiency $\psi$ is given for constant $\eta$ and $\lambda$ here:
$\psi(\lambda) = \frac{1}{\int_{0}^{\infty}S_{Y}(t)dt + \nu}\left[\int_{0}^{\infty}\exp\left(-r_{0}t-\lambda t + a(t)+b(t)\right)(r_{0}+b(t))a(t)dt\right]$

where $S_{Y}(t)$ is the survival function of the node, $a(t) = \int_{0}^{t}e^{-\eta v}g(v)(t-v)dv$, $b(t) = \int_{0}^{t}e^{-\eta(t-r)}\bar{G}(t-r)dr$,
$g(v)$ is the pdf of the service time distribution, and $\bar{G}(x) = 1-\int_{0}^{x}g(s)ds$.

The following theorem gives sufficient conditions for the uniform distribution and $\eta$ that guarantee the existence of a finite maximum efficiency.

Theorem 2(Efficiency under Uniform Service Distribution)
Suppose the service life distribution is given by Uniform($c,d$) for some $0. Then if
$\sigma > \frac{ce^{-c\eta}}{\sqrt{12}\phi(-\eta)(1+\eta(c+d))+c\eta - 1}$ where $\phi(-\eta)$ the standard deviation of the service life $W$, $\psi(\lambda)$ has a maximum on $(0,\infty)$ is the moment generating function of a uniform distribution evaluated at $-\eta$.

Numerical simulations suggest that $\psi$ increases without bound for $c=0, d>1$. The following lemma proves this fact.

Lemma
Suppose the service life distribution is given by Uniform(0,d), with $d>1$. Then $\psi$ increases without bound.

This is worth discussing here. It makes no sense whatsoever for the efficiency of a server to increase forever as the arrival rate increases. So what’s happening here? Notice that if the uniform distribution include 0 as an endpoint, then there is a positive probability that the service time for a job will be exactly 0 (in whatever time units you like). This is impossible in reality. A small service time is possible, but it’s still strictly greater than 0. What we see here is that using distributions with positive support at 0 causes issues in the efficiency function; it’s not the fault of the definition of efficiency, but rather a consequence of using distributions that cannot mirror reality. The next section explores this further with a more broad class of distributions.

## Extension of the Uniform Distribution: Compact Support

The ideas and techniques presented in the previous section yield a powerful extension to any service life distribution with compact support away from 0. Supposing $g_{W}(w)$ has compact support $[a,b]$, it may be bounded above by a positively scaled uniform distribution. In practice, service times are finite and nonzero, thus this extension allows for very simple control policies to be implemented for a much larger class of distributions.

Theorem 3 (Efficiency under Compact Support)
Let $g_{W}(w)$ be the pdf of the service times having compact support $[c,d], d > 0$. Let $m = \max_{w}g_{W}(w) < \infty$, and $R = (d-c)$ be the length of the support. Then $\psi(\lambda)$ has a maximum if $m < \frac{c}{R\eta +e^{-b\eta}-e^{-a\eta}+\eta e^{-a\eta}}$.

Example 1

Let $g_{W}(w) = \frac{2}{5}w\mathbb{1}_{[2,3]}(w)$, and let $r_{0} = \nu = \eta = 1$. Then $m = \frac{6}{5}$. By Theorem 2, $m < \frac{2}{e^{-2}+e^{-3}+1}\approx 1.678$. Thus, the existence of a maximum $\psi$ is guaranteed. Figure 2 gives the numerical result with step size of 0.1. The maximum occurs around $\lambda^{*} = 0.5$

Example 2: Sufficient conditions aren’t always necessary conditions
The condition given in Theorem 2 is rather weak and is only a sufficient condition. As an illustration, consider the same increasing density shifted to a support of $[1,2]$. Thus
$g_{W}(w) = \frac{2}{3}w\mathbb{1}_{[1,2]}(w)$. Retain all other assumptions from Example 1. Then
$$m = \frac{4}{3} > \frac{1}{e^{-2}+1} \approx 0.88$$ which violates the condition. However, consider Figure 3 above. Clearly $\psi$ has a maximum at approximately $\lambda^{*} = 0.8$
The condition given in Theorem 2 relies on overestimating $g_{W}(w)$ by a constant. If the variance $\sigma_{W}^{2}$ of $g_{W}(w)$ is large, the maximum of the pdf will decrease and be more comparable to the rest of the distribution. In these cases, bounding $g_{W}(w)$ by its max $m$ over the support $[a,b]$ will give a reasonable approximation. However, in the case of a small support and high enough skew compared to the size and location of the support, much of the mass is concentrated at the right end of the support, and $m$ will be higher. In these cases, bounding $g_{W}(w)$ by $m$ will result in a large amount of overestimation, and thus the condition may fail, but $\psi$ still has a maximum. The numerical example wherein $g_{W}(w) = \frac{2}{3}w\mathbb{1}_{[1,2]}(w)$ illustrates the conservative nature of this type of estimation.

## Efficiency under Erlang Service Distribution

Now suppose $g(v)$ is of the Erlang class but shifted $\delta > 0$ to the right. For motivation, consider that service times can never be 0 in a practical setting. Then the PDF and the complement of the CDF are given by
\begin{aligned}g(v;k,\gamma,\delta) &= \left\{\begin{array}{lr}0, & 0 \leq v \leq \delta \\\frac{\gamma^{k}(v-\delta)^{k-1}e^{-\gamma(v-\delta)}}{(k-1)!}, & v\geq \delta\end{array}\right.\\\bar{G}(v;k,\gamma,\delta) &= \left\{\begin{array}{lr}1, & 0 \leq v \leq \delta \\e^{\gamma(\delta-v)}\sum_{j=0}^{k-1}\frac{\gamma^{j}(v-\delta)^{j}}{j!}, & v\geq \delta\end{array}\right.\end{aligned}

Theorem 4 (Server Efficiency under Erlang Distribution)
Let $\delta > 0, \eta > 0$. Let $\alpha(\delta) = \left(\frac{\gamma}{\gamma + \eta}\right)^{k}e^{-\eta\delta} + \frac{\gamma^{k}e^{\gamma\delta}(k-1)}{(\eta + \gamma)^{k-1}}$, and $0 < \beta(\delta,\eta) < 1$. If the service life distribution is of the $\delta-$shifted Erlang class, then $\psi(\lambda)$ has a maximum in $\lambda$ on $(0,\infty)$ for $\delta,\eta$ such that $\alpha(\delta) + \beta(\delta,\eta) < 1$.

Thus, we see that the Erlang service distribution produces a maximum as long as the support is shifted away from 0.

## Efficiency under Exponential Distribution

The exponential distribution is the most common service distribution assumed for a queue. The M/M/1 queue assumes a Poisson distribution for arrival times, and an exponential distribution or service times. However, the exponential distribution has most of its mass concentrated at or near 0. As we have seen in other distributions, this becomes a problem, both in considering what makes sense in practical applications, and in its effect on the efficiency.
Suppose $g(v)$ is of the exponential class. That is, $g(v) = \gamma e^{-\gamma v}, \gamma > 0$. It will be proven that under certain conditions on $\eta$ and $\gamma$, an exponential $g(v)$ causes $\psi$ to increase without bound.

Theorem 5 (Efficiency under Exponential Distribution)
Suppose $g(v) = \gamma e^{-\gamma v}$. Then if $\frac{2\gamma}{\gamma + \eta} > 1 + \frac{2}{\frac{2}{\gamma} + W\left(-\frac{\gamma}{\gamma +\eta}e^{-2-\frac{2\eta}{\gamma}}\right)}$, $\psi(\lambda) \to \infty$,  as $\lambda \to \infty$ where $W(\cdot)$ is the Lambert W-function.

## Extension to Random Stress and Nonconstant Intensity

All the above theorems regarding the server efficiency  assumed constant stress $\eta$ and constant intensity $\lambda$. This section generalizes the analyses in the previous sections for nonconstant intensity and random stress.

The stochastic reliability models in both [1]  and the previous chapter both assumed a time-dependent intensity $\lambda(t)$. By setting $\lambda \equiv \max_{t}\lambda(t)$, the established efficiency theorems provide a conservative set of conditions under which a maximum efficiency may be obtained. In these cases, $\psi$ is actually a function of all possible $\lambda_{\max}$.

If the job stresses are random, as in Theorem 2 of Traylor (2016), the above sections may still be utilized. Assume the sample space for $\mathcal{H}$ is compact and nonnegative. The efficiency is given by Theorem 2 of Traylor (2016).
WLOG, again suppose the sample space of $\mathcal{H}$ is discrete, given by $\{\eta_{1},...,\eta_{m}\}$ with respective probabilities $p_{i}, i =1,...,m$. Now suppose all mass is concentrated at $\eta_{[m]}$. Let $a_{m}(t) = \lambda\int_{0}^{t}e^{-\eta_{m}v}g(v)(t-v)dv$, and $b_{m}(t) = \lambda\int_{0}^{t}e^{-\eta_{m}(t-r)}\bar{G}_{W}(t-r)dr$. Then, the following are true:
(1) $E_{\mathcal{H}}[a(t)+b(t)] \leq a_{m}(t) + b_{m}(t)$

(2)$E_{\mathcal{H}}[a(t)] \leq a_{m}(t)$

(3) $E_{\mathcal{H}}[\mathcal{H} a(t)b(t)] \leq \eta_{m}a_{m}(t)b_{m}(t)$

Thus, by replacing the expectations in (1)-(3) with their respective upper bounds in the efficiency theorems, analyses of the efficiency for the uniform, compact support, and Erlang classes may proceed as previously detailed. These estimates are conservative but sufficient.

For an exponential service life distribution and random stress, create a lower bound for the expectations in
(1)-(3) by concentrating all mass at $\eta_{[1]}$. Then the conditions from Theorem 5 guarantee an explosion for the exponential distribution.

## Implications and Numerical Illustrations

### Uniform Service Distribution

Click on an image in the gallery to expand and explore the efficiency of a server under different instances of the uniform service distribution.
For the uniform service distribution, both the variance of $g_{W}(w)$ and the location of the support affect the efficiency itself. The gallery above shows $\psi(\lambda)$ for various uniform distributions as an illustration. In all four cases, $r_{0} = \nu =\eta =1$. The variance of a uniform distribution is given by $\sigma^{2} = \frac{d-c}{12}$. With the exception of the fourth gallery image, which illustrates the explosion of $\psi$ when the distribution has positive mass at 0, Table 1 below compares possible values of $\psi$ for different Uniform distributions. Notice that while the variance $\sigma^{2}$ does affect the range of $\psi$ by several orders of magnitude, the location of the support has a much more powerful effect. Thus, if all service times are equally likely, a server is less efficient if it is consistently but mildly slow (Uniform[10,11]) compared to an inconsistent server (Uniform[1,500]).

$$\begin{array}{l|rrrr}g_{W}(w) & \sigma^{2}&\mu& \text{Approximate Range of } \psi & \text{Approximate } \lambda^{*}\\\text{Uniform}[1,2] & 1/12 &1.5 & (0,0.012) & 1.3\\ \text{Uniform}[10,11] & 1/12 &10.5 & (0, 2\times10^{-11}) & 0.1 \\ \text{Uniform}[1,500] & 499/12 & 250.5 & (0,3\times 10^{-5}) & 1.6\end{array}$$

### Increasing but Compact Service Distribution

Click to expand and explore.

$$\begin{array}{l|rrrr}g_{W}(w) & \sigma^{2}&\mu& \text{Approx. Range of } \psi & \text{Approx. } \lambda^{*}\\\small{\frac{2w}{3}}\mathbb{1}_{[1,2]}(w)& \approx 0.0802&\approx 1.56 & (0,7\times10^{-3}) & 0.75\\\small{\frac{2w}{5}}\mathbb{1}_{[2,3]}(w)&\approx 0.0822 &\approx 2.53 & (0, 7\times10^{-4}) & 0.5 \\\small{\frac{2w}{500^{2}-1}}\mathbb{1}_{[1,500]}(w)& \approx 13888& 333.35 & (0,1.3\times 10^{-7})& 1.4\end{array}$$

As an illustration of a distribution on compact support, consider the class of increasing densities $g_{W}(w) = cx\mathbb{1}_{[a,b]}(x)$. Several examples are given in the above gallery table. For both compact supports of length 1, the variance is approximately the same, but the mean changes, producing an order of magnitude decrease in efficiency. Compared to the compact support of length 499, with a much larger mean, the efficiency decreases by 3 orders of magnitude. Notice, however, that the decline after the maximum is much less sharp.

### Erlang Service Distribution

Click on an image to explore.

$$\begin{array}{l|rrrr}g_{W}(w) & \sigma^{2}&\mu& \text{Approx. Range of } \psi & \text{Approx. } \lambda^{*}\\\text{Erlang}(2,1)& 2&2 & (0,0.7)& 9\\\text{Erlang}(9,1)&9 &9 & (0, 4\times10^{-6}) & 0.5 \\\text{Rayleigh}(1)& (4-\pi)/2& \sqrt{\pi/2} & (0,0.9)& 8\end{array}$$ The gallery above gives two examples of $\psi$ under an Erlang distribution. Notice the change in the efficiency as the mean increases. Here, since $\lambda =1$, $\sigma^{2} = \mu$, so the mean likely has the largest effect on $\psi$.

### Comparing all the examples

Comparing all examples, the Rayleigh(1) service distribution imposes the highest maximum efficiency, followed closely by the Erlang(2,1) service distribution, with the Uniform[1,2] service distribution following. $\lambda^{*}$ under the Erlang(2,1) service distribution is larger than for the Rayleigh(1) service distribution, indicating that a server whose service times follow the former distribution can handle a larger arrival intensity before its efficiency begins to decline than the latter.

The means for the Rayleigh(1), Erlang(2,1), and Uniform[1,2] distributions are similar, as shown in the tables, but the Uniform[1,2] distribution has equal probability of any service time in its support with large negative excess kurtosis. It is posulated that kurtosis, skew, and variance play large roles in the behavior and range of $\psi$. Compare the efficiency under the Erlang(2,1) service distribution with the efficiency under the Erlang(9,1) service distribution. Not only is the mean much lower for the Erlang(2,1) distribution, but the distribution is more strongly positive-skewed than the Erlang(9,1). Thus, more mass is concentrated at the left side of the distribution, indicating that the service times are more often shorter.

Finally, to note the effect of the typical stress level $\eta$ on the range of $\psi$, we compare the original Cha/Lee figure with our Rayleigh figure above. The service distribution and all other quantities remain the same, but Cha and Lee’s numerical example set $\eta = 0.01$, whereas the third image in the Erlang gallery shows $\psi$ under $\eta = 1$. The range of $\psi$ decreases by two orders of magnitude with a 100 fold increase in $\eta$, with the shape remaining similar. In addition, the location of the maximum $\lambda^{*}$ also inversely varies by the same magnitude.

Studying the efficiency under various service distributions aids not only in deciding when to implement a server intervention, but also aids in evaluating the performance of various servers given their service times.

## Conclusion

Many questions have been opened up about the various effects of the service distribution and stress on the server efficiency. The conditions given above are merely sufficient conditions, and have not been proven necessary. Studying the efficiency as a function of the arrival rate under various service distributions and stress provides a way to understand the performance of a server and the various factors that can affect it mathematically.

## References

1.  CHA, K. H., AND LEE, E. Y. A stochastic breakdown model for an unreliable web server system and an optimal admission control policy. Journal of Applied Probability 48, 2 (2011), 453–466.
2.  ROSS, S. Stochastic Processes, 2nd edition.
3.  TRAYLOR, R. Stochastic reliability of a server under random workload. Ph.D. thesis (2016).
Stochastic Reliability of a Server under a Random Workload

## Abstract

Editor’s note: This paper is the first chapter from a PhD thesis published in 2016 by Rachel Traylor. We generalize a 2011 model from Cha and Lee that gave a closed form for the survival function of a server under a random workload, and with random service times. In this work, the constant stress assumption of Cha and Lee is relaxed and allowed to be random; that is, every request to the server brings a random stress or workload to the server for the duration of its stay until completion. The efficiency measure of a random stress server is defined as the long run average number of jobs completed per unit time, and provides a way to measure server performance.

## Introduction

There are many types of systems which can be dubbed servers, such as a retail checkout counter, a shipping company, a web server, or a customer service hotline. All of these systems have common general behavior. Requests or customers arrive via a stochastic process, the service times vary randomly, and each request stresses the server if only temporarily. A general stochastic model that describes the reliability of such a server can provide the necessary information for optimal resource allocation and efficient task scheduling, leading to significant cost savings for businesses and improved performance metrics [6]. Such topics have been studied in literature for several decades [1, 2, 3, 20].

Much attention was devoted to reliability principles that model software failures and bug fixes, starting with Jelinski and Moranda in 1972 [11]. The hazard function under this model shows the time between the i-th failure and the (+ 1)st failure. Littlewood (1980) [16] extended this initial reliability model for software by assuming differences in error size [12].

These models have been extended into software testing applications [4, 5, 19] and optimal software release times [7, 8, 17, 22]. The explosion of e-commerce and the resulting increase in internet traffic have led to the development of reliability models for Web applications. Heavy traffic can overload and crash a server; thus, various control policies for refreshing content and admission of page requests were created [10, 13, 15, 18, 23].

In particular, Cha and Lee (2011) [9] proposed a stochastic breakdown model for an unreliable web server whose requests arrive at random times according to a nonhomogenous Poisson process and bring a constant stress factor to the system that dissipates upon service completion. The authors provide a fairly general survival function under any service distribution $g_{W}(w)$, define server efficiency to measure performance, and illustrate a possible admission control policy due to an observed property of the server efficiency under a specific numerical example.

Thus far, no extensions of [9] have been proposed. This work generalizes the model put forth by Cha and Lee in a variety of ways. First, the assumption of constant job stress is relaxed and replaced by a random variable, and a new survival function and efficiency equation are derived. This work, while suitable for IT applications, is general enough for use in almost any industry, including logistics, retail, manufacturing, and engineering systems.

## Background (Cha and Lee’s Model)

### System Description and Survival Function

Cha and Lee considered a web server wherein each request arrives via a nonhomogenous Poisson process $\{N(t): t \geq 0\}$ with intensity function $\lambda(t)$. Each request adds a constant stress $\eta$, increasing the breakdown rate for the duration of service. Suppose $r_{0}(t)$ is the breakdown rate of the idle server. Then the breakdown rate process $\mathscr{B}(t)$ is defined as

$$\mathscr{B}(t) := r_{0}(t) + \eta\sum_{j=1}^{N(t)}\mathbb{1}(T_{j} \leq t \leq T_{j} + W_{j})$$

where $N(t), \{T_{j}\}_{j=1}^{N(t)}, \{W_{j}\}_{j=1}^{N(t)}$ are the random variables that describe the number of arrivals, arrival times, and service times, respectively. It is assumed that $\{T_{j}\}_{j=1}^{N(t)}$ are independent of each other. Furthermore, the $\{W_{j}\}_{j=1}^{N(t)} \sim g_{W}(w)$ are i.i.d. and are mutually independent of all $T_{j}$‘s.

Under these conditions, Cha and Lee proved the following theorem that gives the explicit form of the survival function under this model:

Theorem.

Suppose that $\{N(t), t \geq 0\}$ is a nonhomogenous Poisson process with intensity function $\lambda(t)$, i.e. $m(t) \equiv \int_{0}^{t}\lambda(x)dx$. Assuming the conditional survival function is given by

$$P\left(Y > t \left| N(t), \{T_{j}\}_{j=1}^{N(t)}, \{W_{j}\}_{j=1}^{N(t)}\right.\right) = \bar{F}_{0}(t)\exp\left(-\eta\sum_{j=1}^{N(t)}\min(W_{j}, t-T_{j})\right)$$

and $m(t)$ has an inverse, the survival function of $Y$ is given by

$$S_{Y}(t) = \bar{F}_{0}(t)\exp\left(-\eta\int_{0}^{t}\exp(-\eta w)\bar{G}_{W}(w)m(t-w)dw\right)$$

and the hazard function of $Y$, denoted $r(t)$, is given by

$$r(t) = r_{0}(t) + \eta\int_{0}^{t}e^{-\eta w}\bar{G}_{W}(w)\lambda(t-w)dw$$

### Efficiency of the Server

It is natural to develop some measure of server performance. Cha and Lee measure such performance by defining the efficiency, $\psi$, of the web server as the long-run expected number of jobs completed per unit time. That is, with the number of jobs completed as $M$, the efficiency is defined

$$\psi := \lim\limits_{t \to \infty}\frac{E[M(t)]}{t}$$

Upon breakdown and rebooting, the server is assumed to be ‘as good as new’, in that performance of the server does not degrade during subsequent reboots. In addition, the model assumes the arrival process after reboot, denoted $\{N^{*}(t), t \geq 0\}$, is a nonhomogenous Poisson process with the same intensity function $\lambda(t)$ as before, and that $\{N^{*}(t), t \geq 0\}$ is independent of the arrival process before reboot. In a practical setting, this model assumes no ‘bottlenecking’ of arrivals occurs in the queue during server downtime that would cause an initial flood to the rebooted server. In addition, the reboot time is assumed to follow a continuous distribution $H(t)$ with expected value $\nu$. This process is a renewal reward process, with the renewal $\{R_{n}\} = \{M_{n}\}$, the number of jobs completed. The length of a renewal cycle is $Y_{n} + H_{n}$, where $Y_{n}$ is the length of time the server was operational, and $H_{n}$ is the time to reboot after a server crash. Then, by [21],

$$\psi = \frac{E[M]}{E[Y]+ \nu}$$

where $M$ is the number of jobs completed in a particular renewal cycle, $\nu$ is the mean time to reboot of the server, and $Y$ is the length of a particular renewal cycle. Then, using the definition of efficiency, the following closed form of the efficiency of a server under all assumptions of Cha and Lee’s model is derived.

Theorem. (Efficiency of the Constant Stress Server)
Suppose $\{N(t), t \geq 0\}$ is a nonhomogenous Poisson process with intensity $\lambda(t)\geq 0$. Then the efficiency is given by

\begin{aligned}\psi &= \frac{1}{\int_{0}^{\infty}S_{Y}(t)dt + \nu}\left[\exp\left(-\int_{0}^{t}r_{0}(x)dx-\int_{0}^{t}\lambda(x)dx + a(t) + b(t)\right)\right.\\&\kern{10em}\biggl.\times\left(r_{0}(t)a(t) + \eta a(t)b(t)\right)\biggr]\end{aligned}

where $a(t) = \int_{0}^{t}e^{-\eta v}g_{W}(v)m(t-v)dv$, $b(t) = \int_{0}^{t}e^{-\eta(t-r)}\bar{G}_{W}(t-r)\lambda(r)dr$, $\bar{G}_{W}(x) = 1-\int_{0}^{x}g_{W}(s)ds$, and $m(x) = \int_{0}^{x}\lambda(s)ds$.

### Numerical Examples and Control Policies

As an illustrative example, Cha and Lee considered the case when $\lambda(t) \equiv \lambda$,
$r_{0}(t) \equiv r_{0} = 0.2$, $\eta = 0.01$, $\nu = 1$, and $g_{W}(w) = we^{-w^{2}/2}$ (the PDF of the Rayleigh distribution). As shown in the figure above, there exists a $\lambda^{*}$ such that $\psi(\lambda)$ is maximized. Thus one may implement the obvious optimal control policy for server control to avoid server overload:

1. If the real time arrival rate $\lambda < \lambda^{*}$, do not interfere with arrivals.
2. If $\lambda \geq \lambda^{*}$, facilitate some appropriate measure of interference.

Examples of interference for a web server in particular include rejection of incoming requests or possible re-routing. Cha and Lee give an interference policy of rejection with probability $1-\frac{\lambda^{*}}{\lambda}$. The next section presents a generalization of the above model by relaxing the assumption that the job stress $\eta$ is constant. This yields a far more generalized model that can encompass random stress with any distribution.

## Random Stress Reliability Model

The basic model here is very similar to Cha and Lee’s. The main difference is that we relax the constant stress assumption and allow it to be random.

Assume that each job $j$ coming into the server adds a random stress $\mathcal{H}_{j}$ to the server for the duration of its time in the system. Suppose $\{\mathcal{H}_{j}\}_{j=1}^{N(t)} \sim \mathcal{H}$, and are i.i.d. where WLOG $\mathcal{H}$ is a discrete random variable with a finite sample space $S = \{\eta_{i}: \eta_{i} \in \mathbb{R}^{+}, i = 1,...,m \text{ for } m \in \mathbb{N}\}$ and probability distribution given by

$$P(\mathcal{H} = \eta_{i}) = p_{i}, \quad i = 1,\ldots,m$$

The following assumptions from Cha and Lee are retained:

1. (CL1) Requests arrive via a nonhomogenous Poisson Process $\{N(t), t \geq 0\}$ with intensity $\lambda(t)$.
2. (CL2) Arrival times $\{T_{j}\}_{j=1}^{N(t)}$ are independent.
3. (CL3) Service times $\{W_{j}\}_{j=1}^{N(t)}$ are i.i.d. with pdf $g_{W}(w)$ and mutually independent of all arrival times.

Then, the random stress breakdown rate (RSBR) process $\mathscr{B}(t)$ is given by

$$\mathscr{B}(t) = r_{0}(t) + \sum_{j=1}^{N(t)}\mathcal{H}_{j}\mathbb{1}(T_{j} < t \leq T_{j} + W_{j}), \quad t \geq 0$$

Compare the sample trajectory shown in the figure above to the sample trajectory of breakdown rate under the original model (see page 2). The random stress brought by each request still disappears upon job completion, but the effect on the server is no longer deterministic. Thus, for the same set of arrival times and respective completion times, the realization of the breakdown rate process under the RSBR model has one more element of variation.

Let $Y$ be the random time to breakdown of the web server given the workload from client requests. Let $\mathfrak{T} = \{T_{j}\}_{j=1}^{N(t)}$, $\mathfrak{W} = \{W_{j}\}_{j=1}^{N(t)}$, and $\mathfrak{H} = \{\mathcal{H}_{j}\}_{j=1}^{N(t)}$, with observed values $\mathfrak{t} = \{t_{j}\}_{j=1}^{N(t)}$, $\mathfrak{w} = \{w_{j}\}_{j=1}^{N(t)}$, and $\mathfrak{h} = \{\eta_{i_{j}}\}_{i_{j}=1}^{N(t)}$. Then the conditional survival function of the server, given the arrival process of client requests ($N(t)$), job stresses ($\mathfrak{H}$), service times ($\mathfrak{W}$), and arrival times ($\mathfrak{T}$) is

\begin{aligned}S_{\scriptstyle{Y|N(t), \mathfrak{T}, \mathfrak{W}, \mathfrak{H} }}(t|n, \mathfrak{t}, \mathfrak{w}, \mathfrak{h}) &=e^{ -\int_{0}^{t}\mathscr{B}(x)dx}\\&= \bar{F}_{0}(t)e^{-\sum_{j=1}^{N(t)} \mathcal{H}_{i}\min(W_{j}, t-T_{j})}\end{aligned}

where $\bar{F}_{0}(t) = \exp\left(-\int_{0}^{t}r_{0}(x)dx\right)$.

### Survival Function for the RSBR Server

Under the RSBR generalization, the survival function of the server is given in the following theorem.

Theorem 1. (Survival Function for a Random Stress Server)
Suppose that jobs arrive to a server according to a nonhomogenous Poisson process
$\{N(t), t \geq 0\}$ with intensity function $\lambda(t) \geq 0$ and $m(t) \equiv E[N(t)] = \int_{0}^{t}\lambda(x)dx$. Let the arrival times $\{T_{j}\}_{j=1}^{N(t)}$ be independent, and let the service times $\{W_{j}\}_{j=1}^{N(t)} g_{W}(w)$ be i.i.d. and mutually independent of all arrival times. Assume the random job stresses $\mathcal{H}_{j} \sim \mathcal{H}$. Then

$$S_{Y}(t) = \bar{F}_{0}(t)\exp\left(- E_{\mathcal{H}}\left[\mathcal{H}\int_{0}^{t}e^{-\mathcal{H} w}m(t-w)\bar{G}_{W}(w)dw\right]\right)$$

where $\bar{F}_{0}(t) = \exp\left(-\int_{0}^{t}r_{0}(s)ds\right)$.

The compound failure rate function (or hazard function) $r(t)$ is given by

$$r(t) = -\frac{d}{dt}\ln(S_{Y}(t)) = r_{0}(t) + E_{\mathcal{H}}\left[\mathcal{H}\int_{0}^{t}e^{-\mathcal{H} w}m(t-w)\bar{G}_{W}(w)dw\right]$$

See [14] or [21].

## Efficiency Measure of a Server under RSBR

Upon server crash, the server must be rebooted. This section gives the server efficiency as defined in [9], but for the RSBR model. The server efficiency given in Theorem 1 holds quite a few similarities to that of [9], except that, like Theorem 1, the expectation must be taken over the random stress variable. The following assumptions from the original model are retained:

1. (E1) The arrival process after rebooting, $\{N^{rb}(t), t \geq 0\}$, remains a nonhomogenous Poisson process with the same intensity function $\lambda(t), t \geq 0$ as before.
2. (E2) $\{N^{rb}(t), t \geq 0\}$ is independent of the arrival process of client requests before rebooting. Hence, $\{N^{rb}(t), t \geq 0\} = \{N(t), t \geq 0\}$, since it retains all the same characteristics as before.
3. (E3) The time to reboot the server follows a continuous distribution $H(t)$ with mean $\nu$.

Recall that $M(t)$ is defined as the total number of jobs completed by the server during the time $(0,t]$. Also, recall the definition of server efficiency from [9]:

$$\psi \equiv \lim\limits_{t \to \infty}\frac{E[M(t)]}{t}$$

The efficiency of the server under a random stress environment is given in the following theorem.

Theorem (Server Efficiency under Random Stress Environment)
Suppose that $\{N(t): t \geq 0\}$ is a nonhomogenous Poisson process with intensity function $\lambda(t), t \geq 0$. Suppose also the conditions of Theorem 1 and the conditions (E1)-(E3) are met. Then the efficiency of the server is given by

\begin{aligned}\psi &= \frac{1}{\int_{0}^{\infty}S_{Y}(t)dt + \nu}\\&\kern{2em}\times\left\{\int_{0}^{\infty}e^{-\int_{0}^{t}r_{0}(x)-\int_{0}^{t}\lambda(x)dx+E_{\mathcal{H}}[a(t)+b(t)]}(r_{0}(t)E_{\mathcal{H}}[a(t)] + E_{\mathcal{H}}[\mathcal{H} a(t)b(t)])dt \right\}\end{aligned}

where

$$a(t) = \int_{0}^{t}e^{-\mathcal{H} v}g_{W}(v)m(t-v)dv$$

and

$$b(t) = \int_{0}^{t}e^{-\mathcal{H}(t-r)}\bar{G}_{W}(t-r)\lambda(r)dr.$$

## Remarks and Implications

Note that $\mathcal{H}$ was assumed discrete, but the proofs of Theorems 1 and 2 are unaffected if $\mathcal{H}$ is continuous. Thus the generality of this model is significantly stronger than in [9]. In practical considerations, these integrals will need to be numerically evaluated.

For certain distributions of $\mathcal{H}$, $S_{Y}(t)$ has a fairly compact form. Subsequent installments will examine the case where $\mathcal{H}$ has a binomial distribution, formed from both independent and dependent Bernoulli trials. We will also explore the effects of various service life distributions on $\psi$.

## Conclusion

This paper has given a simple but large generalization of [9]. The random stress model allows for the reality that we likely will not know the stress a request will bring to the server, but may know its distribution. Subsequent installments will further study the properties of the RSBR model and extend it to other single-server models, and networks.