40 Latent Variable Models

40.1 Definition

Latent variables (or hidden variables) are random variables that are present in the model, but unobserved.

We will denote latent variables by \(Z\), and we will assume \[(X_1, Z_1), (X_2, Z_2), \ldots, (X_n, Z_n) {\; \stackrel{\text{iid}}{\sim}\;}F_{{\boldsymbol{\theta}}}.\] A realized value of \(Z\) is \(z\), \({\boldsymbol{Z}}= (Z_1, Z_2, \ldots, Z_n)^T\), etc.

The EM algorithm and variational inference are two standard approaches for performing inference in models that involve latent variables.

Bayesian models are a special case of latent variable models: the unobserved random parameters are latent variables.

40.2 Empirical Bayes Revisited

In the earlier EB example, we supposed that \(X_i | \mu_i \sim \mbox{Normal}(\mu_i, 1)\) for \(i=1, 2, \ldots, n\), where these random variables are independent, and also that \(\mu_i {\; \stackrel{\text{iid}}{\sim}\;}\mbox{Normal}(a, b^2)\).

The unobserved parameters \(\mu_1, \mu_2, \ldots, \mu_n\) are latent variables. In this case, \({\boldsymbol{\theta}}= (a, b^2)\).
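As a concrete illustration, here is a minimal simulation sketch of this model in Python; the true values \(a = 2\), \(b = 1\), the sample size, and all variable names are assumptions made for illustration, not part of the original example. It uses the fact that marginally \(X_i \sim \mbox{Normal}(a, 1 + b^2)\), so \((a, b^2)\) can be estimated from the observed data alone.

```python
import numpy as np

rng = np.random.default_rng(0)
n, a, b = 10_000, 2.0, 1.0     # assumed true values, for illustration only

# Latent variables: mu_i ~ Normal(a, b^2); unobserved in practice
mu = rng.normal(loc=a, scale=b, size=n)
# Observed data: X_i | mu_i ~ Normal(mu_i, 1)
x = rng.normal(loc=mu, scale=1.0)

# Marginally X_i ~ Normal(a, 1 + b^2), so theta = (a, b^2) can be
# estimated from the observed x alone:
a_hat = x.mean()
b2_hat = max(x.var() - 1.0, 0.0)    # truncated at 0 since b^2 >= 0
print(a_hat, b2_hat)
```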

40.3 Normal Mixture Model

Suppose \({X_1, X_2, \ldots, X_n}{\; \stackrel{\text{iid}}{\sim}\;}F_{{\boldsymbol{\theta}}}\) where \({\boldsymbol{\theta}}= (\pi_1, \ldots, \pi_K, \mu_1, \ldots, \mu_K, \sigma^2_1, \ldots, \sigma^2_K)\) with pdf

\[ f({\boldsymbol{x}}; {\boldsymbol{\theta}}) = \prod_{i=1}^n \sum_{k=1}^K \pi_k \frac{1}{\sqrt{2\pi\sigma^2_k}} \exp \left\{ -\frac{(x_i - \mu_k)^2}{2 \sigma^2_k} \right\}. \]
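To make this likelihood concrete, the sketch below evaluates \(\log f({\boldsymbol{x}}; {\boldsymbol{\theta}})\) at a given \({\boldsymbol{\theta}}\), assuming NumPy and SciPy are available; the function name and the use of `logsumexp` for numerical stability are illustrative choices, not a prescribed implementation.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm

def normal_mixture_loglik(x, pi, mu, sigma2):
    """Log-likelihood log f(x; theta) of a K-component Normal mixture."""
    x = np.asarray(x)[:, None]                  # shape (n, 1), broadcasts against (K,)
    # log( pi_k * Normal(x_i; mu_k, sigma^2_k) ), shape (n, K)
    log_terms = np.log(pi) + norm.logpdf(x, loc=mu, scale=np.sqrt(sigma2))
    # logsumexp over k gives log f(x_i; theta); summing over i gives the log-likelihood
    return logsumexp(log_terms, axis=1).sum()

# Example evaluation at an arbitrary theta
print(normal_mixture_loglik([0.1, 2.3, -1.0],
                            pi=[0.5, 0.5], mu=[0.0, 2.0], sigma2=[1.0, 1.0]))
```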

The MLEs of the unknown parameters cannot be found analytically. This is a common mixture model to work with in applications, so we need to be able to estimate its parameters.

There is a latent variable model that produces the same marginal distribution and likelihood function. Let \({\boldsymbol{Z}}_1, {\boldsymbol{Z}}_2, \ldots, {\boldsymbol{Z}}_n {\; \stackrel{\text{iid}}{\sim}\;}\mbox{Multinomial}_K(1, {\boldsymbol{\pi}})\) where \({\boldsymbol{\pi}}= (\pi_1, \ldots, \pi_K)\). Note that \(Z_{ik} \in \{0, 1\}\) and \(\sum_{k=1}^K Z_{ik} = 1\). Let \([X_i | Z_{ik} = 1] \sim \mbox{Normal}(\mu_k, \sigma^2_k)\), where \(\{X_i | {\boldsymbol{Z}}_i\}_{i=1}^{n}\) are jointly independent.
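A minimal sketch of sampling from this latent variable formulation follows; the parameter values and names are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
pi = np.array([0.3, 0.7])        # assumed mixing proportions
mu = np.array([0.0, 4.0])        # assumed component means
sigma2 = np.array([1.0, 0.25])   # assumed component variances

# Z_i ~ Multinomial_K(1, pi): a one-hot indicator of the component
z = rng.multinomial(1, pi, size=n)     # shape (n, K)
k = z.argmax(axis=1)                   # the index with z_{ik} = 1
# X_i | Z_{ik} = 1 ~ Normal(mu_k, sigma^2_k)
x = rng.normal(loc=mu[k], scale=np.sqrt(sigma2[k]))
print(z)
print(x)
```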

The joint pdf is

\[ f({\boldsymbol{x}}, {\boldsymbol{z}}; {\boldsymbol{\theta}}) = \prod_{i=1}^n \prod_{k=1}^K \left[ \pi_k \frac{1}{\sqrt{2\pi\sigma^2_k}} \exp \left\{ -\frac{(x_i - \mu_k)^2}{2 \sigma^2_k} \right\} \right]^{z_{ik}}. \]

Note that

\[ f({\boldsymbol{x}}, {\boldsymbol{z}}; {\boldsymbol{\theta}}) = \prod_{i=1}^n f(x_i, {\boldsymbol{z}}_i; {\boldsymbol{\theta}}). \] It can be verified that \(f({\boldsymbol{x}}; {\boldsymbol{\theta}})\) is the marginal distribution of this latent variable model:

\[ f(x_i ; {\boldsymbol{\theta}}) = \sum_{{\boldsymbol{z}}_i} f(x_i, {\boldsymbol{z}}_i; {\boldsymbol{\theta}}) = \sum_{k=1}^K \pi_k \frac{1}{\sqrt{2\pi\sigma^2_k}} \exp \left\{ -\frac{(x_i - \mu_k)^2}{2 \sigma^2_k} \right\}. \]
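This marginalization can also be checked numerically. The sketch below evaluates the joint density in its product form at each of the \(K\) one-hot values of \({\boldsymbol{z}}_i\), sums over them, and compares the result to the mixture density; all parameter values are assumed for illustration.

```python
import numpy as np
from scipy.stats import norm

pi = np.array([0.3, 0.7])          # assumed parameter values
mu = np.array([0.0, 4.0])
sigma2 = np.array([1.0, 0.25])
K = len(pi)
xi = 1.5                           # a single observed value

# pi_k * Normal(x_i; mu_k, sigma^2_k) for each k
comp = pi * norm.pdf(xi, loc=mu, scale=np.sqrt(sigma2))

# f(x_i, z_i; theta) in its product form, summed over the K one-hot z_i
total = sum(np.prod(comp ** np.eye(K)[k]) for k in range(K))

# The sum over z_i recovers the mixture density f(x_i; theta)
assert np.isclose(total, comp.sum())
print(total, comp.sum())
```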

40.4 Bernoulli Mixture Model

Suppose \({X_1, X_2, \ldots, X_n}{\; \stackrel{\text{iid}}{\sim}\;}F_{{\boldsymbol{\theta}}}\) where \({\boldsymbol{\theta}}= (\pi_1, \ldots, \pi_K, p_1, \ldots, p_K)\) with pmf

\[ f({\boldsymbol{x}}; {\boldsymbol{\theta}}) = \prod_{i=1}^n \sum_{k=1}^K \pi_k p_k^{x_i} (1-p_k)^{1-x_i}. \]

As in the Normal mixture model, the MLEs of the unknown parameters cannot be found analytically.

As before, there is a latent variable model that produces the same marginal distribution and likelihood function. Let \({\boldsymbol{Z}}_1, {\boldsymbol{Z}}_2, \ldots, {\boldsymbol{Z}}_n {\; \stackrel{\text{iid}}{\sim}\;}\mbox{Multinomial}_K(1, {\boldsymbol{\pi}})\) where \({\boldsymbol{\pi}}= (\pi_1, \ldots, \pi_K)\). Note that \(Z_{ik} \in \{0, 1\}\) and \(\sum_{k=1}^K Z_{ik} = 1\). Let \([X_i | Z_{ik} = 1] \sim \mbox{Bernoulli}(p_k)\), where \(\{X_i | {\boldsymbol{Z}}_i\}_{i=1}^{n}\) are jointly independent.
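Sampling from this formulation mirrors the Normal case; in the sketch below, the parameter values and names are again assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
pi = np.array([0.4, 0.6])    # assumed mixing proportions
p = np.array([0.1, 0.8])     # assumed success probabilities

# Z_i ~ Multinomial_K(1, pi), then X_i | Z_{ik} = 1 ~ Bernoulli(p_k)
z = rng.multinomial(1, pi, size=n)
k = z.argmax(axis=1)
x = rng.binomial(1, p[k])    # Bernoulli(p_k) draws
print(z)
print(x)
```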

The joint pmf is

\[ f({\boldsymbol{x}}, {\boldsymbol{z}}; {\boldsymbol{\theta}}) = \prod_{i=1}^n \prod_{k=1}^K \left[ \pi_k p_k^{x_i} (1-p_k)^{1-x_i} \right]^{z_{ik}}. \]
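As in the Normal mixture model, it can be verified that summing over the possible values of \({\boldsymbol{z}}_i\) recovers the marginal pmf:

\[ f(x_i ; {\boldsymbol{\theta}}) = \sum_{{\boldsymbol{z}}_i} f(x_i, {\boldsymbol{z}}_i; {\boldsymbol{\theta}}) = \sum_{k=1}^K \pi_k p_k^{x_i} (1-p_k)^{1-x_i}. \]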