Abstract: We construct rings of typed ordered fuzzy numbers whose component functions are of a common form. As these rings also contain improper fuzzy numbers (OFNs whose membership "functions" are actually just relations), we develop a set of operations to convert an improper fuzzy number to a proper one of the same type.
Abstract: This paper proposes a new trapezoidal ordered fuzzy number representation of windowed time series, based on the idea of a Japanese candlestick, as an alternative to those previously proposed in the literature. We define and illustrate several descriptive statistics based on the information contained in the ordered fuzzy numbers. We use financial trading data from three automotive companies as a case study. The approach can be extended to other forms of time series data to offer insight into other data-driven situations.
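As a rough illustration of the candlestick idea, the sketch below summarizes one window of a series by its open, high, low, and close, and arranges them as four ordered support points. The function name `candlestick_trapezoid` and the rule that a rising window runs low to high while a falling window runs high to low are assumptions made for illustration; the abstract does not specify the paper's actual construction.

```python
from typing import Sequence, Tuple


def candlestick_trapezoid(window: Sequence[float]) -> Tuple[float, float, float, float]:
    """Summarize one window as four ordered points (hypothetical encoding).

    The candlestick of a window is (open, high, low, close). Here the
    orientation of the four points follows the direction of the window:
    rising windows go low -> open -> close -> high, falling windows go
    high -> open -> close -> low. This orientation rule is an assumption.
    """
    if not window:
        raise ValueError("window must contain at least one observation")
    open_, close = window[0], window[-1]
    low, high = min(window), max(window)
    if close >= open_:
        return (low, open_, close, high)
    return (high, open_, close, low)


rising = candlestick_trapezoid([10.0, 12.0, 9.0, 11.0])
falling = candlestick_trapezoid([11.0, 12.0, 9.0, 10.0])
```

Encoding the direction in the ordering of the points is one way to keep the trend information that a plain (unordered) trapezoidal fuzzy number would discard.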
Abstract: This paper generalizes the negative binomial random variable by generating it from a sequence of first-kind dependent Bernoulli trials under the identity permutation. The PMF, MGF, and various moments are provided, and it is proven that the distribution is indeed an extension of the standard negative binomial random variable. We examine the effect of complete dependence of the Bernoulli trials on the generalized negative binomial random variable. We also show that the generalized geometric random variable is a special case of the generalized negative binomial random variable, but the generalized negative binomial random variable cannot be generated from a sum of i.i.d. generalized geometric random variables.
Abstract: This paper discusses the notion of horizontal dependency in sequences of first-kind dependent categorical random variables. We examine the necessary and sufficient conditions for a sequence of first-kind dependent categorical random variables to be identically distributed when the conditional probability distributions of the variables after the first are permuted from the identity permutation used in previous works.
Abstract: This paper generalizes the notion of the geometric distribution to allow for dependent Bernoulli trials generated from dependency generators as defined in Traylor and Hathcock’s previous work. The generalized geometric distribution describes a random variable X that counts the number of dependent Bernoulli trials until the first success. The main result of the paper is that X can count dependent Bernoulli trials from any dependency structure and retain the same distribution. Other characterizations and properties of the generalized geometric distribution are given, including the MGF, mean, variance, skew, and entropy.
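To make "trials until the first success" concrete for dependent trials, here is a minimal sketch under one specific, assumed dependency structure: every trial after the first depends only on the first trial, succeeding with probability p + δ(1 − p) after an initial success and p − δp after an initial failure. This form, the symbol δ as used here, and the function names are assumptions for illustration; the abstract does not state the construction or the resulting PMF.

```python
def marginal_success_prob(p: float, delta: float) -> float:
    """Marginal success probability of any trial after the first.

    Assumed construction: P(success | first succeeded) = p + delta * (1 - p),
    P(success | first failed) = p - delta * p.
    """
    q = 1.0 - p
    return p * (p + delta * q) + q * (p - delta * p)


def first_success_pmf(k: int, p: float, delta: float) -> float:
    """P(X = k), where X is the index of the first success (k >= 1).

    X = 1 means the first trial succeeds. For k >= 2 the first trial failed,
    so every later trial succeeds with probability p - delta * p.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if k == 1:
        return p
    p_minus = p - delta * p
    return (1.0 - p) * (1.0 - p_minus) ** (k - 2) * p_minus


# With delta = 0 the trials are independent and the usual geometric PMF appears.
independent_case = first_success_pmf(3, 0.3, 0.0)
```

Under this assumed structure each trial still has marginal success probability p, even though the trials are dependent, and the PMF sums to 1 whenever δ < 1.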
Abstract: Traffic to any server is rarely constant over time. In addition, the workload brought by each service request is typically unknown in advance, and each request may bring a different workload to the server. Cha and Lee (2011) proposed a reliability model where each request brings an identical and constant workload. In this paper, we generalize the model to allow for requests to bring an unknown random stress to the server. Jobs arrive at the server via a nonhomogeneous Poisson process. Service times are random and i.i.d. Each job adds a random stress $H_{j} \sim H$ to the breakdown rate of the server until the job is completed. The survival function of such a server and the efficiency of the server are derived.
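The arrival mechanism described above can be simulated with the standard thinning method for nonhomogeneous Poisson processes. The sketch below is not taken from the paper: the function name `simulate_nhpp`, the example intensity, and the requirement that the caller supply an upper bound `rate_max` on the intensity are all assumptions for illustration.

```python
import math
import random
from typing import Callable, List


def simulate_nhpp(
    rate: Callable[[float], float],
    rate_max: float,
    horizon: float,
    rng: random.Random,
) -> List[float]:
    """Simulate arrival times on (0, horizon] by thinning.

    Candidate arrivals come from a homogeneous Poisson process with rate
    rate_max; a candidate at time t is kept with probability
    rate(t) / rate_max. Requires 0 <= rate(t) <= rate_max on the horizon.
    """
    if rate_max <= 0:
        raise ValueError("rate_max must be positive")
    arrivals: List[float] = []
    t = 0.0
    while True:
        t += rng.expovariate(rate_max)  # gap to the next candidate arrival
        if t > horizon:
            return arrivals
        if rng.random() <= rate(t) / rate_max:
            arrivals.append(t)


# Example intensity (hypothetical): oscillates between 0.5 and 1.5.
def example_rate(t: float) -> float:
    return 1.0 + 0.5 * math.sin(t)


arrival_times = simulate_nhpp(example_rate, 1.5, 10.0, random.Random(0))
```

Each simulated arrival time could then be paired with a random service time and a random stress draw to study the breakdown rate numerically.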
Abstract: This paper develops a more general theory of sequences of dependent categorical random variables, extending the works of Korzeniowski (2013) and Traylor (2017) that studied first-kind dependency in sequences of Bernoulli and categorical random variables, respectively. A more natural form of dependency, sequential dependency, is defined and shown to retain the property of identically distributed but dependent elements in the sequence. The cross-covariance of sequentially dependent categorical random variables is proven to decrease exponentially in the dependency coefficient δ as the distance between the variables in the sequence increases. We then generalize the notion of vertical dependency to describe the relationship between a categorical random variable in a sequence and its predecessors, and define a class of generating functions for such dependency structures. The main result of the paper is that any sequence of dependent categorical random variables generated from a function in the class $C_{\delta}$ that is dependency continuous yields identically distributed but dependent random variables. Finally, a graphical interpretation is given and several examples from the generalized vertical dependency class are illustrated.
Abstract: Server resource allocation and traffic management are a large area of research and business concern in order to ensure proper functionality and maintenance procedures. As a result, good server reliability models that can incorporate workload and traffic stress are necessary. This paper generalizes previous dynamic server reliability models for partitioned servers with clustered-task selection by relaxing the assumption that the correlation between channels in the server remains constant. We allow the correlation to vary deterministically with time, or as a function of a random process in discrete or continuous time. The explicit form of the survival function is derived in such cases. Numerical illustrations demonstrate the dangers of erroneously assuming independence among channels, which can lead to costly and unnecessary interventions in the system. In addition, we numerically explore the effects of a variable correlation on the survival function.
Abstract: Categorical random variables are a common staple in machine learning methods and other applications across disciplines. Correlation often exists within categorical predictors and has been noted to affect the effectiveness of various algorithms, such as feature ranking and random forests. We present a mathematical construction of a sequence of identically distributed but dependent categorical random variables, and give a generalized multinomial distribution to model the probability of counts of such variables.
Abstract: Suppose a single server has $K$ channels, each of which performs a different task. Customers arrive at the server via a nonhomogeneous Poisson process with intensity $\lambda(t)$ and select 0 to $K$ tasks for the server to perform. Each channel services the tasks in its queue independently, and the customer’s job is complete when the last task selected is complete. The stress to the server is a constant multiple $\eta$ of the number of tasks selected by each customer, and thus the stress added to the server by each customer is random. Under this model, we provide the survival function for such a server in both the case of independently selected channels and correlated channels. A numerical comparison of expected lifetimes for various arrival rates is given, and the relationship between the dependency of channel selection and expected server lifetime is presented.
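For the independent-selection case, the random stress per customer can be illustrated with a short calculation. The sketch below assumes each channel k is selected independently with its own probability (the list `select_probs`, a hypothetical parameterization not given in the abstract) and computes the distribution of the number of tasks selected and the expected stress $\eta$ times that number.

```python
from typing import List, Sequence


def task_count_pmf(select_probs: Sequence[float]) -> List[float]:
    """PMF of the number of tasks selected, for independent channels.

    select_probs[k] is the assumed probability that channel k is selected.
    The result has length K + 1; entry n is P(exactly n tasks selected).
    """
    pmf = [1.0]  # with zero channels considered, zero tasks with certainty
    for p in select_probs:
        nxt = [0.0] * (len(pmf) + 1)
        for n, mass in enumerate(pmf):
            nxt[n] += mass * (1.0 - p)  # this channel is not selected
            nxt[n + 1] += mass * p      # this channel is selected
        pmf = nxt
    return pmf


def expected_stress(eta: float, select_probs: Sequence[float]) -> float:
    """Expected stress per customer: eta times the expected task count."""
    return eta * sum(select_probs)


pmf = task_count_pmf([0.5, 0.5])
mean_stress = expected_stress(2.0, [0.5, 0.5])
```

By linearity of expectation, the mean stress per customer is the same whether or not the channel selections are correlated; correlation changes the shape of the task-count distribution, not its mean.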
Abstract: Modern storage systems orchestrate a group of disks to achieve their performance and reliability goals. Even though such systems are designed to withstand the failure of individual disks, failure of multiple disks poses a unique set of challenges. We empirically investigate disk failure data from a large number of production systems, specifically focusing on the impact of disk failures on RAID storage systems. Our data covers about one million SATA disks from six disk models for periods up to 5 years. We show how observed disk failures weaken the protection provided by RAID. The count of reallocated sectors correlates strongly with impending failures. With these findings we designed RAIDShield, which consists of two components. First, we have built and evaluated an active defense mechanism that monitors the health of each disk and replaces those that are predicted to fail imminently. This proactive protection has been incorporated into our product and is observed to eliminate 88% of triple disk errors, which are 80% of all RAID failures. Second, we have designed and simulated a method of using the joint failure probability to quantify and predict how likely a RAID group is to face multiple simultaneous disk failures, which can identify disks that collectively represent a risk of failure even when no individual disk is flagged in isolation. We find in simulation that RAID-level analysis can effectively identify most vulnerable RAID-6 systems, improving the coverage to 98% of triple errors. We conclude with discussions of operational considerations in deploying RAIDShield more broadly and new directions in the analysis of disk errors. One interesting approach is to combine multiple metrics, allowing the values of different indicators to be used for predictions. 
Using newer field data that reports an additional metric, medium errors, we find that the relative efficacy of reallocated sectors and medium errors varies across disk models, offering an additional way to predict failures.
Abstract: There are many types of systems which can be dubbed servers, e.g. a retail checkout counter, a shipping company, a web server, or a customer service hotline. All of these systems share common general behavior: requests or customers arrive via a stochastic process, the service times vary randomly, and each request increases the stress on the server for some interval of time. A general stochastic model that describes the reliability of a server can provide the necessary information for optimal resource allocation and efficient task scheduling, leading to significant cost savings and improved performance metrics. In this work, we consider several generalizations of existing stochastic reliability models that incorporate random workloads, load-balancing allocation, and clustered tasks. The efficiency of the described servers is studied extensively in order to facilitate the design and implementation of control policies for fast-paced environments such as IT applications. Finally, a method to determine the reliability of any network of general servers, both correlated and uncorrelated, is presented.