Vertical Dependency in Sequences of Categorical Random Variables

For the full text, including proofs, download the pdf here.

Cross-Covariance Matrix

The K \times K cross-covariance matrix \Lambda^{m,n} of \epsilon_{m} and \epsilon_{n} in a sequentially dependent categorical sequence has entries \Lambda_{i,j}^{m,n} that give the cross-covariance \text{Cov}([\epsilon_{m} = i], [\epsilon_{n} = j]), where [\cdot] denotes an Iverson bracket. In the FK-dependent case, the entries of \Lambda^{m,n} are given by

\Lambda^{1,n}_{ij}=\left\{\begin{array}{lc}\delta p_{i}(1-p_{i}), & i = j \\-\delta p_{i}p_{j}, & i \neq j,\end{array}\right.\quad\,n \geq 2,


\Lambda^{m,n}_{ij}=\left\{\begin{array}{lc}\delta^{2}p_{i}(1-p_{i}), & i = j \\-\delta^{2}p_{i}p_{j}, & i \neq j,\end{array}\right. \quad\,n > m; m\neq 1

Thus, the cross-covariance between any two \epsilon_{m} and \epsilon_{n} in the FK-dependent case is never smaller than \delta^{2} times the independent cross-covariance. In the sequentially dependent case, the cross-covariances of \epsilon_{m} and \epsilon_{n} decrease in powers of \delta as the distance between the two variables in the sequence increases.
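The two FK-dependent formulas above can be sketched in a few lines of Python. This is only an illustration of the stated entries, not code from the paper: the function names `base_covariance` and `fk_cross_covariance` and the sample values of p and \delta are assumptions chosen for the example, and exact fractions are used so the entries can be compared without rounding.

```python
from fractions import Fraction

def base_covariance(p):
    """Entries p_i(1 - p_i) on the diagonal and -p_i p_j off the diagonal."""
    K = len(p)
    return [[p[i] * (1 - p[i]) if i == j else -p[i] * p[j] for j in range(K)]
            for i in range(K)]

def fk_cross_covariance(p, delta, m, n):
    """Lambda^{m,n} under FK dependence, for 1-indexed positions 1 <= m < n."""
    if not 1 <= m < n:
        raise ValueError("expected 1 <= m < n")
    # Factor delta when the first variable is involved, delta^2 otherwise.
    factor = delta if m == 1 else delta ** 2
    return [[factor * entry for entry in row] for row in base_covariance(p)]

# Hypothetical example values (not taken from the paper).
p = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
delta = Fraction(1, 2)
lam_1_4 = fk_cross_covariance(p, delta, 1, 4)
lam_2_4 = fk_cross_covariance(p, delta, 2, 4)
for row in lam_2_4:
    print([str(entry) for entry in row])
```

Note that the result does not depend on n beyond the requirement n > m, which reflects the statement that the FK-dependent cross-covariance does not decay with distance.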

Theorem 1 (Cross-Covariance of Dependent Categorical Random Variables).

Let \Lambda^{m,n} denote the K \times K cross-covariance matrix of \epsilon_{m} and \epsilon_{n}, with m \leq n \leq N, in a sequentially dependent sequence of categorical random variables of length N, defined as \Lambda^{m,n} = E[(\epsilon_{m} - E[\epsilon_{m}])(\epsilon_{n} - E[\epsilon_{n}])]. Then the entries of the matrix are given by

\Lambda^{m,n}_{ij} = \left\{\begin{array}{lc}\delta^{n-m}p_{i}(1-p_{i}), & i = j \\-\delta^{n-m} p_{i}p_{j}, & i \neq j.\end{array}\right.
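As a sanity check on the theorem, the following sketch computes the same matrix a second way and compares the two. It assumes the one-step conditional probabilities used in the examples of this post, namely p_{j}^{+} = p_{j} + \delta(1-p_{j}) when the previous variable equals j and p_{j}^{-} = p_{j} - \delta p_{j} otherwise, and it uses the fact that every variable in the sequence has marginal distribution p. All function names and the sample values are hypothetical.

```python
from fractions import Fraction

def transition_matrix(p, delta):
    """Assumed one-step conditionals P(eps_{k+1} = j | eps_k = i): p_j^+ if i == j, else p_j^-."""
    K = len(p)
    return [[p[j] + delta * (1 - p[j]) if i == j else p[j] - delta * p[j]
             for j in range(K)] for i in range(K)]

def mat_mul(a, b):
    """Plain matrix product of two K x K lists of lists."""
    K = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(K)) for j in range(K)]
            for i in range(K)]

def cross_cov_from_chain(p, delta, m, n):
    """Cov([eps_m = i], [eps_n = j]) built from the (n - m)-step conditionals."""
    K = len(p)
    step = transition_matrix(p, delta)
    power = [[Fraction(int(i == j)) for j in range(K)] for i in range(K)]
    for _ in range(n - m):
        power = mat_mul(power, step)
    # Joint probability is p_i * P(eps_n = j | eps_m = i); subtract p_i p_j.
    return [[p[i] * power[i][j] - p[i] * p[j] for j in range(K)] for i in range(K)]

def cross_cov_theorem(p, delta, m, n):
    """Entries as stated in Theorem 1."""
    K = len(p)
    scale = delta ** (n - m)
    return [[scale * (p[i] * (1 - p[i]) if i == j else -p[i] * p[j])
             for j in range(K)] for i in range(K)]

# Hypothetical example values (not taken from the paper).
p = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
delta = Fraction(1, 3)
for m, n in [(1, 2), (2, 3), (1, 4), (3, 7)]:
    print((m, n), cross_cov_from_chain(p, delta, m, n) == cross_cov_theorem(p, delta, m, n))
```

Exact fractions are used rather than floats so that the comparison is an equality test and not a tolerance check.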

The pairwise covariance between two Bernoulli variables in a sequentially dependent sequence is given in the following corollary.

Corollary 1

Denote P(\epsilon_{i} = 1) = p; i = 1,\ldots,n, and let q=1-p. Under sequential dependence,
\text{Cov}(\epsilon_{m}, \epsilon_{n}) = pq\delta^{n-m}.

We give some examples to illustrate.


Example 1 (Bernoulli Random Variables).

To find the covariance between \epsilon_{2} and \epsilon_{3}, note that the set S = \{\omega \in \Omega_{3}^{2}\,:\,\omega_{2} = 1, \omega_{3} = 1\} is given by S = \{(1,1,1),(0,1,1)\}, so that P(\epsilon_{2} = 1, \epsilon_{3} = 1) = P(S). Thus,

\begin{array}{rl}\text{Cov}(\epsilon_{2},\epsilon_{3})&=P(\epsilon_{2} = 1, \epsilon_{3} = 1)-P(\epsilon_{2} = 1)P(\epsilon_{3} = 1)\\&=pp^{+}p^{+} + qp^{-}p^{+}-p^{2}\\&= p^{+}(pp^{+} + qp^{-})-p^{2}\\&= p(p^{+}-p)\\&= pq\delta\end{array}

Similarly, for \epsilon_{1} and \epsilon_{3},

\begin{array}{rl}\text{Cov}(\epsilon_{1}, \epsilon_{3}) &= P(\epsilon_{1} = 1, \epsilon_{3} = 1)-P(\epsilon_{1} = 1)P(\epsilon_{3} = 1)\\&= (pp^{+}p^{+} + pq^{-}p^{-})-p^{2} \\&= p((p+\delta q)^{2} + pq(1-\delta)^{2})-p^{2}\\&= p(p^{2} + pq+\delta^{2}q^{2} + \delta^{2}pq)-p^{2}\\&= p(p(p+q) + \delta^{2}q(q+p))- p^{2} \\&= p(p + \delta^{2}q)-p^{2}\\&= pq\delta^{2}\end{array}
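The hand calculations in this example can be reproduced by brute force: enumerate every 0/1 sequence of length 3, weight it by its probability, and sum over the outcomes where both variables equal 1. The sketch below assumes p^{+} = p + \delta q and p^{-} = p - \delta p, as used in the derivation, with each variable depending on its immediate predecessor. The function names and the values of p and \delta are made up for illustration.

```python
from fractions import Fraction
from itertools import product

def sequence_probability(omega, p, delta):
    """Probability of a 0/1 sequence under the assumed sequential dependence."""
    q = 1 - p
    prob = p if omega[0] == 1 else q
    for prev, cur in zip(omega, omega[1:]):
        # P(cur = 1 | prev) is p^+ after a 1 and p^- after a 0.
        p_one = p + delta * q if prev == 1 else p - delta * p
        prob *= p_one if cur == 1 else 1 - p_one
    return prob

def covariance(m, n, length, p, delta):
    """Cov(eps_m, eps_n) by summing over all 2**length outcomes (1-indexed m, n)."""
    joint = sum(sequence_probability(w, p, delta)
                for w in product((0, 1), repeat=length)
                if w[m - 1] == 1 and w[n - 1] == 1)
    # Each variable has marginal P(eps = 1) = p, so the product term is p * p.
    return joint - p * p

# Hypothetical example values (not taken from the paper).
p, delta = Fraction(3, 10), Fraction(1, 4)
q = 1 - p
print(covariance(2, 3, 3, p, delta) == p * q * delta)       # compare with pq*delta
print(covariance(1, 3, 3, p, delta) == p * q * delta ** 2)  # compare with pq*delta^2
```

The same `covariance` function can be called with a longer `length` to check Corollary 1 for larger gaps n - m, at the cost of enumerating 2^{length} outcomes.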


Example 2 (Categorical Random Variables).

Suppose we have a sequence of categorical random variables, where K = 3. Then
[\epsilon_{m} = i] is the Bernoulli random variable with the binary outcome of 1 if \epsilon_{m} = i and 0 if not. Thus,

\text{Cov}([\epsilon_{m} = i], [\epsilon_{n} = j]) = P(\epsilon_{m} = i, \epsilon_{n}= j)-P(\epsilon_{m} = i)P(\epsilon_{n} = j).

We have shown that every \epsilon_{n} in the sequence is identically distributed, so P(\epsilon_{m} = i) = p_{i} and P(\epsilon_{n} = j) = p_{j}.

\begin{array}{rl}\text{Cov}([\epsilon_{2} = 1], [\epsilon_{3} = 1]) &=(p_{1}p_{1}^{+}p_{1}^{+} + p_{2}p_{1}^{-}p_{1}^{+} + p_{3}p_{1}^{-}p_{1}^{+})-p_{1}^{2} \\&= p_{1}^{+}(p_{1}p_{1}^{+} + p_{2}p_{1}^{-} + p_{3}p_{1}^{-})-p_{1}^{2} \\&= p_{1}^{+}p_{1}-p_{1}^{2} \\&=\delta p_{1}(1-p_{1})\end{array}

\begin{array}{rl}\text{Cov}([\epsilon_{2} = 1], [\epsilon_{3} = 2]) &=(p_{1}p_{1}^{+}p_{2}^{-} + p_{2}p_{1}^{-}p_{2}^{-}+p_{3}p_{1}^{-}p_{2}^{-})-p_{1}p_{2} \\&= p_{2}^{-}(p_{1}p_{1}^{+} + p_{2}p_{1}^{-} + p_{3}p_{1}^{-})-p_{1}p_{2}\\&= p_{1}p_{2}^{-}-p_{1}p_{2} \\&=-\delta p_{1}p_{2}\end{array}

We may obtain the other entries of the matrix in a similar fashion. So, the cross-covariance matrix for \epsilon_{2} and \epsilon_{3} with K = 3 categories is given by

\Lambda^{2,3}=\delta\begin{pmatrix}p_{1}(1-p_{1})&-p_{1}p_{2}&-p_{1}p_{3}\\-p_{1}p_{2} & p_{2}(1-p_{2}) &-p_{2}p_{3}\\-p_{1}p_{3} &-p_{2}p_{3} & p_{3}(1-p_{3})\end{pmatrix}

Note that if \epsilon_{2} and \epsilon_{3} are independent, then the cross-covariance matrix is all zeros, because δ = 0.