# Vertical Dependency in Sequences of Categorical Random Variables

##### For the full text, including proofs, download the pdf here.

We repeat a section from *A Generalized Multinomial Distribution from Dependent Categorical Random Variables* to review the original first-kind (FK) dependency introduced by Korzeniowski. For the full details, see the previous work on the topic.

Korzeniowski defined the notion of dependence in a way we will refer to here as *dependence of the first kind* (FK dependence). Suppose (\epsilon_{1},...,\epsilon_{N}) is a sequence of Bernoulli random variables, and P(\epsilon_{1} = 1) = p. Then, for \epsilon_{i}, i \geq 2, we weight the probability of each binary outcome toward the outcome of \epsilon_{1}, adjusting the probabilities of the remaining outcomes accordingly.

Formally, let 0 \leq \delta \leq 1, and q = 1-p. Then define the following quantities:

\begin{aligned}p^{+} := P(\epsilon_{i} = 1 | \epsilon_{1} = 1) = p + \delta q &\qquad q^{-} := P(\epsilon_{i} = 0 | \epsilon_{1} = 1) = q -\delta q\\p^{-} := P(\epsilon_{i} = 1 | \epsilon_{1} = 0) = p-\delta p&\qquad q^{+} := P(\epsilon_{i} = 0 | \epsilon_{1} = 0) = q + \delta p\end{aligned}

Given the outcome i of \epsilon_{1}, the probability that each subsequent Bernoulli variable \epsilon_{2}, \epsilon_{3},\ldots, \epsilon_{N} also takes outcome i is p^{+} when i = 1 and q^{+} when i = 0. The probability of the opposite outcome is decreased to q^{-} and p^{-}, respectively.
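The sampling scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the original work: the function name `sample_fk_bernoulli` and its parameters are hypothetical, and it assumes that \epsilon_{2},\ldots,\epsilon_{N} are drawn independently of one another once \epsilon_{1} is known.

```python
import random


def sample_fk_bernoulli(n, p, delta, rng=None):
    """Draw one FK-dependent Bernoulli sequence of length n (hypothetical helper)."""
    rng = rng or random.Random()
    q = 1.0 - p
    # The first variable is an ordinary Bernoulli(p) draw.
    first = 1 if rng.random() < p else 0
    # Probability that a later variable equals 1, given the first outcome:
    # p + delta*q if eps_1 = 1, and p - delta*p if eps_1 = 0.
    p_one = p + delta * q if first == 1 else p - delta * p
    # Assumption: later variables are conditionally independent given eps_1.
    rest = [1 if rng.random() < p_one else 0 for _ in range(n - 1)]
    return [first] + rest


# Empirical frequency of a 1 at each position over many simulated sequences.
rng = random.Random(0)
draws = [sample_fk_bernoulli(5, 0.3, 0.5, rng) for _ in range(20000)]
freq = [sum(seq[i] for seq in draws) / len(draws) for i in range(5)]
```

With \delta = 0 the sequence reduces to independent Bernoulli draws, and with \delta = 1 every variable copies \epsilon_{1}. In the simulation, each entry of `freq` should land close to p = 0.3.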

The figure above illustrates the possible outcomes of a sequence of such dependent Bernoulli variables. Korzeniowski showed that, despite this conditional dependency, P(\epsilon_{i} = 1) = p \quad\forall i. That is, the sequence of Bernoulli variables is identically distributed, with correlation shown to be

\text{Cor}(\epsilon_{i}, \epsilon_{j}) = \left\{\begin{array}{lr}\delta, & i=1, \quad j \geq 2 \\\delta^{2}, &i \neq j, \quad i,j \geq 2\end{array}\right.
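Both claims can be checked directly from the conditional probabilities. The following is a short verification sketch rather than a replacement for the proofs in the full text, and it assumes that \epsilon_{i} and \epsilon_{j} (i \neq j, i,j \geq 2) are conditionally independent given \epsilon_{1}. Conditioning on \epsilon_{1} gives

\begin{aligned}P(\epsilon_{i} = 1) &= p(p + \delta q) + q(p - \delta p) = p^{2} + pq = p,\\\text{Cov}(\epsilon_{1}, \epsilon_{i}) &= p(p + \delta q) - p^{2} = \delta pq,\\\text{Cov}(\epsilon_{i}, \epsilon_{j}) &= p(p + \delta q)^{2} + q(p - \delta p)^{2} - p^{2} = \delta^{2} pq.\end{aligned}

Since every variable has variance pq, dividing each covariance by pq recovers the correlations \delta and \delta^{2}.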

These identically distributed but correlated Bernoulli random variables yield a Generalized Binomial distribution with a similar form to the standard binomial distribution.

In the previous work, the concept of Bernoulli FK dependence was extended to categorical random variables. That is, given a sequence of categorical random variables with K categories and P(\epsilon_{1} = i) = p_{i}, i = 1,\ldots,K, the conditional probabilities for j \geq 2 are

\begin{aligned}P(\epsilon_{j} = i | \epsilon_{1} = i) &= p_{i}^{+} = p_{i} + \delta(1-p_{i});\\P(\epsilon_{j} = k | \epsilon_{1} = i) &= p_{k}^{-} = p_{k}-\delta p_{k}, \quad i \neq k, \:\:k = 1,\ldots,K.\end{aligned}

Traylor proved that FK dependent categorical random variables remain identically distributed, and showed that the cross-covariance matrix of categorical random variables has the same structure as the correlation between FK dependent Bernoulli random variables. In addition, the concept of a generalized binomial distribution was extended to a generalized multinomial distribution.
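As a small numerical illustration of the categorical case, the conditional probabilities can be arranged in a K \times K table and the marginal distribution recovered with the law of total probability. This is a sketch only: the helper name `fk_conditional_probs` and the example values of p and \delta are made up for illustration, and the categories are indexed from 0 in the code rather than from 1.

```python
def fk_conditional_probs(p, delta):
    """Row i is the distribution of eps_j (j >= 2) given eps_1 = i (hypothetical helper)."""
    K = len(p)
    return [
        [
            # Matching category: p_k + delta*(1 - p_k); any other category: p_k - delta*p_k.
            p[k] + delta * (1.0 - p[k]) if k == i else p[k] - delta * p[k]
            for k in range(K)
        ]
        for i in range(K)
    ]


# Illustrative values, not taken from the original work.
p = [0.2, 0.3, 0.5]
delta = 0.4
cond = fk_conditional_probs(p, delta)

# Law of total probability: P(eps_j = k) = sum_i P(eps_1 = i) * P(eps_j = k | eps_1 = i).
marginal = [sum(p[i] * cond[i][k] for i in range(len(p))) for k in range(len(p))]
```

Each row of `cond` sums to 1, and `marginal` agrees with `p` up to floating-point error, which is the identically distributed property in this example.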

In the next section, we will explore a different type of dependency structure, *sequential dependency*.