Vertical Dependency in Sequences of Categorical Random Variables

The full text, including proofs, is available as a PDF download.

Sequentially Dependent Random Variables

Figure 2. Probability Mass Flow of Sequentially Dependent Categorical Random Variables, K=3.
Figure 3. Probability Mass Flow of FK Dependent Categorical Random Variables, K=3.

While FK dependence yielded some interesting results, a more realistic type of dependence is sequential dependence, where the outcome of a categorical random variable depends with coefficient δ on the outcome of the variable immediately preceding it in the sequence. Put formally, if we let \mathcal{F}_{n} = \{\epsilon_{1},\ldots,\epsilon_{n-1}\}, then P(\epsilon_{n}|\epsilon_{1},\ldots,\epsilon_{n-1}) = P(\epsilon_{n}|\epsilon_{n-1}) \neq P(\epsilon_{n}). That is, \epsilon_{n} only has direct dependence on the previous variable \epsilon_{n-1}. We keep the same weighting as for FK dependence. That is,

\begin{array}{lr}P(\epsilon_{n} = j | \epsilon_{n-1} = j) = p_{j}^{+} = p_{j} + \delta(1-p_{j}),&\\ P(\epsilon_{n} = j | \epsilon_{n-1} = i) = p_{j}^{-} = p_{j}-\delta p_{j}, &\:\: j = 1,\ldots,K;\:\:i \neq j\end{array}

As a comparison, for FK dependence, P(\epsilon_{n}|\epsilon_{1},\ldots,\epsilon_{n-1}) = P(\epsilon_{n}|\epsilon_{1}) \neq P(\epsilon_{n}). That is, \epsilon_{n} only has direct dependence on \epsilon_{1}, and

\begin{array}{lr}P(\epsilon_{n} = j | \epsilon_{1} = j) = p_{j}^{+} = p_{j} + \delta(1-p_{j}),&\\ P(\epsilon_{n} = j | \epsilon_{1} = i) = p_{j}^{-} = p_{j}-\delta p_{j},&\:\:j = 1,\ldots,K;\:\: i \neq j\end{array}

Let \epsilon = (\epsilon_{1},\ldots,\epsilon_{n}) be a sequence of categorical random variables of length n (either independent or dependent) where the number of categories for all \epsilon_{i} is K. Denote \Omega_{n}^{K} as the sample space of this random sequence. For example,

\Omega_{3}^{3} = \{(1,1,1), (1,1,2), (1,1,3), (1,2,1), (1,2,2),\ldots,(3,3,1),(3,3,2),(3,3,3)\}
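As a rough illustration, the sample space and the weighted probabilities p_{j}^{+} and p_{j}^{-} defined above can be sketched in Python. The function names (sample_space, p_plus, p_minus) and the numerical values of p and δ are assumptions chosen for this sketch; they do not come from the paper.

```python
from itertools import product

def sample_space(n, K):
    """All sequences of length n over categories 1..K, in lexicographic order."""
    return list(product(range(1, K + 1), repeat=n))

def p_plus(p, j, delta):
    """p_j^+ = p_j + delta * (1 - p_j); categories are 1-indexed."""
    return p[j - 1] + delta * (1 - p[j - 1])

def p_minus(p, j, delta):
    """p_j^- = p_j - delta * p_j; categories are 1-indexed."""
    return p[j - 1] - delta * p[j - 1]

# Hypothetical parameters, for illustration only.
p = [0.5, 0.3, 0.2]
delta = 0.4

omega = sample_space(3, 3)
print(len(omega))   # 27 sequences in Omega_3^3
print(omega[:5])    # the first few elements, in the order listed above
```

For any conditioning category j, the weights p_{j}^{+} and p_{i}^{-} (i \neq j) sum to 1, so each conditional distribution is a valid probability distribution over the K categories.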

Dependency structures like FK dependence and sequential dependence determine the probability that a sequence \epsilon of length n takes a particular value \omega = (\omega_{1},\ldots,\omega_{n}) \in \Omega_{n}^{K}. For example, if the variables are independent, P((1,2,1)) = p_{1}^{2}p_{2}. Under FK dependence, P((1,2,1)) = p_{1}p_{2}^{-}p_{1}^{+}, because the third outcome matches the first. Under sequential dependence, P((1,2,1)) = p_{1}p_{2}^{-}p_{1}^{-}, because the third outcome differs from the second, the one immediately preceding it. See Figures 2 and 3 for a comparison of the probability mass flows of sequential dependence and FK dependence. Sequentially dependent sequences of categorical random variables remain identically distributed but dependent, just like FK-dependent sequences. That is,


Lemma 1

Let \epsilon = (\epsilon_{1},\ldots,\epsilon_{n}) be a sequentially dependent categorical sequence of length n with K categories. Then

P(\epsilon_{j} = i) = p_{i}; \qquad i = 1,\ldots,K;\:\:j = 1,\ldots,n;\:\: n \in \mathbb{N}
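Lemma 1 can be checked numerically for small cases by summing sequence probabilities over the sample space. The following is a minimal sketch, not a proof; the function names (seq_prob, marginal), the structure labels, and the values of p and δ are assumptions made for illustration.

```python
from itertools import product

def seq_prob(omega, p, delta, structure):
    """Probability of the sequence omega (1-indexed categories).

    structure is one of "independent", "fk" (each variable depends on the
    first), or "sequential" (each variable depends on the previous one).
    """
    if structure not in ("independent", "fk", "sequential"):
        raise ValueError("unknown structure: " + structure)
    prob = p[omega[0] - 1]
    for t in range(1, len(omega)):
        j = omega[t]
        if structure == "independent":
            prob *= p[j - 1]
            continue
        ref = omega[0] if structure == "fk" else omega[t - 1]
        if j == ref:
            prob *= p[j - 1] + delta * (1 - p[j - 1])   # p_j^+
        else:
            prob *= p[j - 1] - delta * p[j - 1]         # p_j^-
    return prob

def marginal(pos, cat, n, p, delta, structure):
    """P(epsilon_pos = cat), found by summing over all sequences of length n."""
    K = len(p)
    return sum(seq_prob(w, p, delta, structure)
               for w in product(range(1, K + 1), repeat=n)
               if w[pos - 1] == cat)

# Hypothetical parameters, for illustration only.
p = [0.5, 0.3, 0.2]
delta = 0.4

for pos in range(1, 5):
    print(pos, [round(marginal(pos, cat, 4, p, delta, "sequential"), 10)
                for cat in (1, 2, 3)])
```

With these parameters, each position's marginal distribution should agree with p up to floating-point rounding, which is consistent with the lemma. The same seq_prob function evaluates the three versions of P((1,2,1)) discussed in the text by switching the structure argument.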