# Vertical Dependency in Sequences of Categorical Random Variables

## Cross-Covariance Matrix

The $K \times K$ cross-covariance matrix $\Lambda^{m,n}$ of $\epsilon_{m}$ and $\epsilon_{n}$ in a sequentially dependent categorical sequence has entries $\Lambda^{m,n}_{ij}$ that give the cross-covariance $\text{Cov}([\epsilon_{m} = i], [\epsilon_{n} = j])$, where $[\cdot]$ denotes an Iverson bracket. In the FK-dependent case, the entries of $\Lambda^{m,n}$ are given by

$$\Lambda^{1,n}_{ij}=\left\{\begin{array}{lc}\delta p_{i}(1-p_{i}), & i = j \\-\delta p_{i}p_{j}, & i \neq j,\end{array}\right.\quad\,n \geq 2,$$

and

$$\Lambda^{m,n}_{ij}=\left\{\begin{array}{lc}\delta^{2}p_{i}(1-p_{i}), & i = j \\-\delta^{2}p_{i}p_{j}, & i \neq j,\end{array}\right. \quad\,n > m,\; m\neq 1.$$

Thus, in the FK-dependent case, each entry of the cross-covariance between any two $\epsilon_{m}$ and $\epsilon_{n}$ is never smaller in magnitude than $\delta^{2}$ times the corresponding covariance entry of a single categorical variable, $p_{i}(1-p_{i})$ or $-p_{i}p_{j}$. In the sequentially dependent case, the cross-covariances of $\epsilon_{m}$ and $\epsilon_{n}$ decrease in powers of $\delta$ as the distance between the two variables in the sequence increases.
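The two FK formulas above can be evaluated directly. The following is a minimal sketch, not a reference implementation: the function names, the probability vector `p`, and the value of `delta` are hypothetical choices made for illustration, and categories are 0-indexed in the code.

```python
def base_covariance(p):
    """Entries p_i(1 - p_i) on the diagonal and -p_i p_j off the diagonal."""
    K = len(p)
    return [[p[i] * (1 - p[i]) if i == j else -p[i] * p[j] for j in range(K)]
            for i in range(K)]


def fk_cross_covariance(p, delta, m, n):
    """Lambda^{m,n} in the FK-dependent case, for 1 <= m < n.

    The scale factor is delta when m == 1 and delta**2 otherwise,
    following the two piecewise formulas in the text.
    """
    if not 1 <= m < n:
        raise ValueError("requires 1 <= m < n")
    factor = delta if m == 1 else delta ** 2
    return [[factor * entry for entry in row] for row in base_covariance(p)]


# Hypothetical inputs, chosen only for illustration.
p = [0.5, 0.3, 0.2]
delta = 0.4

lam_1_5 = fk_cross_covariance(p, delta, 1, 5)  # scaled by delta
lam_2_5 = fk_cross_covariance(p, delta, 2, 5)  # scaled by delta**2
print(lam_1_5[0][0], lam_2_5[0][0])
```

Note that the factor depends only on whether $m = 1$, not on the distance $n - m$, which is the contrast with the sequentially dependent case drawn in the paragraph above.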

Theorem 1 (Cross-Covariance of Dependent Categorical Random Variables).

Let $\Lambda^{m,n}$ denote the $K \times K$ cross-covariance matrix of $\epsilon_{m}$ and $\epsilon_{n}$ in a sequentially dependent sequence of categorical random variables of length $N$, with $m \leq n \leq N$, defined as $\Lambda^{m,n} = E[(\epsilon_{m} - E[\epsilon_{m}])(\epsilon_{n} - E[\epsilon_{n}])]$. Then the entries of the matrix are given by

$$\Lambda^{m,n}_{ij} = \left\{\begin{array}{lc}\delta^{n-m}p_{i}(1-p_{i}), & i = j \\-\delta^{n-m} p_{i}p_{j}, & i \neq j.\end{array}\right.$$
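As a rough illustration of Theorem 1, the closed form can be tabulated for increasing distances $n - m$. This is a sketch under stated assumptions: the function name and the numerical inputs are hypothetical, and categories are 0-indexed in the code.

```python
def sequential_cross_covariance(p, delta, m, n):
    """Entries of Lambda^{m,n} from Theorem 1, for 1 <= m <= n."""
    if not 1 <= m <= n:
        raise ValueError("requires 1 <= m <= n")
    scale = delta ** (n - m)  # equals 1 when m == n
    K = len(p)
    return [[scale * (p[i] * (1 - p[i]) if i == j else -p[i] * p[j])
             for j in range(K)]
            for i in range(K)]


# Hypothetical inputs, chosen only for illustration.
p = [0.5, 0.3, 0.2]
delta = 0.4

# Print one diagonal and one off-diagonal entry as the gap n - m grows.
for gap in range(4):
    lam = sequential_cross_covariance(p, delta, 1, 1 + gap)
    print(gap, lam[0][0], lam[0][1])
```

Each step in the gap multiplies every entry by another factor of `delta`, so both entries shrink geometrically toward zero.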

The pairwise covariance between two Bernoulli variables in a sequentially dependent sequence is given in the following corollary.

Corollary 1

Denote $P(\epsilon_{i} = 1) = p$; $i = 1,\ldots,n$, and let $q=1-p$. Under sequential dependence,
$$\text{Cov}(\epsilon_{m}, \epsilon_{n}) = pq\delta^{n-m}.$$

We give some examples to illustrate.

Example 1 (Bernoulli Random Variables).

To find the covariance between $\epsilon_{2}$ and $\epsilon_{3}$, note that the set $S = \{\omega \in \Omega_{3}^{2}\,:\,\omega_{2} = 1, \omega_{3} = 1\}$ is given by $S = \{(1,1,1),(0,1,1)\}$, so that $P(\epsilon_{2} = 1, \epsilon_{3} = 1) = P(S)$. Thus,

$$\begin{array}{rl}\text{Cov}(\epsilon_{2},\epsilon_{3})&=P(\epsilon_{2} = 1, \epsilon_{3} = 1)-P(\epsilon_{2} = 1)P(\epsilon_{3} = 1)\\&=pp^{+}p^{+} + qp^{-}p^{+}-p^{2}\\&= p^{+}(pp^{+} + qp^{-})-p^{2}\\&= p(p^{+}-p)\\&= pq\delta\end{array}$$

Similarly,
$$\begin{array}{rl}\text{Cov}(\epsilon_{1}, \epsilon_{3}) &= P(\epsilon_{1} = 1, \epsilon_{3} = 1)-P(\epsilon_{1} = 1)P(\epsilon_{3} = 1)\\&= (pp^{+}p^{+} + pq^{-}p^{-})-p^{2} \\&= p((p+\delta q)^{2} + pq(1-\delta)^{2})-p^{2}\\&= p(p^{2} + pq+\delta^{2}q^{2} + \delta^{2}pq)-p^{2}\\&= p(p(p+q) + \delta^{2}q(q+p))- p^{2} \\&= p(p + \delta^{2}q)-p^{2}\\&= pq\delta^{2}\end{array}$$
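The two covariances in Example 1 can be checked numerically by summing over every binary sequence. The sketch below assumes the conditional probabilities implied by the expansions above, namely $p^{+} = p + \delta q$ for a 1 following a 1 and $p^{-} = p - \delta p$ for a 1 following a 0; the function names and the values of `p` and `delta` are hypothetical.

```python
from itertools import product


def sequence_probability(omega, p, delta):
    """Probability of the binary sequence omega under the assumed
    sequential dependence (each variable depends on its predecessor)."""
    q = 1 - p
    p_plus = p + delta * q    # assumed P(eps_n = 1 | eps_{n-1} = 1)
    p_minus = p - delta * p   # assumed P(eps_n = 1 | eps_{n-1} = 0)
    prob = p if omega[0] == 1 else q
    for prev, cur in zip(omega, omega[1:]):
        p_one = p_plus if prev == 1 else p_minus
        prob *= p_one if cur == 1 else 1 - p_one
    return prob


def bernoulli_covariance(p, delta, m, n, N):
    """Cov(eps_m, eps_n) by enumerating all 2**N sequences (m, n are 1-indexed)."""
    joint = marg_m = marg_n = 0.0
    for omega in product((0, 1), repeat=N):
        w = sequence_probability(omega, p, delta)
        marg_m += w * omega[m - 1]
        marg_n += w * omega[n - 1]
        joint += w * omega[m - 1] * omega[n - 1]
    return joint - marg_m * marg_n


# Hypothetical inputs, chosen only for illustration.
p, delta = 0.3, 0.5
q = 1 - p

# Each line prints the enumerated value next to the closed form.
print(bernoulli_covariance(p, delta, 2, 3, N=3), p * q * delta)
print(bernoulli_covariance(p, delta, 1, 3, N=3), p * q * delta ** 2)
```

Enumeration is exponential in `N`, so this is only practical as a check on short sequences; the closed form $pq\delta^{n-m}$ is what one would use in practice.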

Example 2 (Categorical Random Variables).

Suppose we have a sequence of categorical random variables with $K = 3$ categories. Then $[\epsilon_{m} = i]$ is the Bernoulli random variable that takes the value 1 if $\epsilon_{m} = i$ and 0 otherwise. Thus,

$$\text{Cov}([\epsilon_{m} = i], [\epsilon_{n} = j]) = P(\epsilon_{m} = i \wedge \epsilon_{n}= j)-P(\epsilon_{m} = i)P(\epsilon_{n} = j).$$

We have shown that all $\epsilon_{n}$ in the sequence are identically distributed, so $P(\epsilon_{m} = i) = p_{i}$ and $P(\epsilon_{n} = j) = p_{j}$. For $m = 2$ and $n = 3$, this gives

$$\begin{array}{rl}\text{Cov}([\epsilon_{2} = 1], [\epsilon_{3} = 1]) &=(p_{1}p_{1}^{+}p_{1}^{+} + p_{2}p_{1}^{-}p_{1}^{+} + p_{3}p_{1}^{-}p_{1}^{+})-p_{1}^{2} \\&= p_{1}^{+}(p_{1}p_{1}^{+} + p_{2}p_{1}^{-} + p_{3}p_{1}^{-})-p_{1}^{2} \\&= p_{1}^{+}p_{1}-p_{1}^{2} \\&=\delta p_{1}(1-p_{1})\end{array}$$

$$\begin{array}{rl}\text{Cov}([\epsilon_{2} = 1], [\epsilon_{3} = 2]) &=(p_{1}p_{1}^{+}p_{2}^{-} + p_{2}p_{1}^{-}p_{2}^{-}+p_{3}p_{1}^{-}p_{2}^{-})-p_{1}p_{2} \\&= p_{2}^{-}(p_{1}p_{1}^{+} + p_{2}p_{1}^{-} + p_{3}p_{1}^{-})-p_{1}p_{2}\\&= p_{1}p_{2}^{-}-p_{1}p_{2} \\&=-\delta p_{1}p_{2}\end{array}$$

We may obtain the other entries of the matrix in a similar fashion. So, the cross-covariance matrix for $\epsilon_{2}$ and $\epsilon_{3}$ with $K = 3$ categories is given by

$$\Lambda^{2,3}=\delta\begin{pmatrix}p_{1}(1-p_{1})&-p_{1}p_{2}&-p_{1}p_{3}\\-p_{1}p_{2} & p_{2}(1-p_{2}) &-p_{2}p_{3}\\-p_{1}p_{3} &-p_{2}p_{3} & p_{3}(1-p_{3})\end{pmatrix}$$

Note that if $\epsilon_{2}$ and $\epsilon_{3}$ are independent, then $\delta = 0$ and the cross-covariance matrix is all zeros.
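The matrix $\Lambda^{2,3}$ from Example 2 can also be recovered by brute-force enumeration over all $3^{3}$ sequences. The sketch below assumes the conditional probabilities suggested by the derivation: $p_{j}^{+} = p_{j} + \delta(1-p_{j})$ when a category repeats and $p_{j}^{-} = p_{j}(1-\delta)$ when it changes. The function names and inputs are hypothetical, and categories are 0-indexed in the code.

```python
from itertools import product


def transition(prev, cur, p, delta):
    """Assumed P(eps_n = cur | eps_{n-1} = prev) under sequential dependence."""
    if cur == prev:
        return p[cur] + delta * (1 - p[cur])  # p_j^+
    return p[cur] * (1 - delta)               # p_j^-


def cross_covariance_by_enumeration(p, delta, m, n, N):
    """Lambda^{m,n} from the joint distribution of all K**N sequences."""
    K = len(p)
    joint = [[0.0] * K for _ in range(K)]
    for omega in product(range(K), repeat=N):
        w = p[omega[0]]
        for prev, cur in zip(omega, omega[1:]):
            w *= transition(prev, cur, p, delta)
        joint[omega[m - 1]][omega[n - 1]] += w
    # Marginals of eps_m (rows) and eps_n (columns) from the joint table.
    row = [sum(r) for r in joint]
    col = [sum(joint[i][j] for i in range(K)) for j in range(K)]
    return [[joint[i][j] - row[i] * col[j] for j in range(K)]
            for i in range(K)]


# Hypothetical inputs, chosen only for illustration.
p = [0.5, 0.3, 0.2]
delta = 0.4

lam_23 = cross_covariance_by_enumeration(p, delta, 2, 3, N=3)
closed_form = [[delta * (p[i] * (1 - p[i]) if i == j else -p[i] * p[j])
                for j in range(3)]
               for i in range(3)]

# Print each enumerated row next to the matching closed-form row.
for r_enum, r_closed in zip(lam_23, closed_form):
    print([round(x, 6) for x in r_enum], [round(x, 6) for x in r_closed])
```

Calling the same function with `delta = 0.0` gives a matrix of zeros up to floating-point error, matching the independence remark above.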