# 40 Latent Variable Models

## 40.1 Definition

Latent variables (or hidden variables) are random variables that are present in the model, but unobserved.

We will denote latent variables by $$Z$$, and we will assume $(X_1, Z_1), (X_2, Z_2), \ldots, (X_n, Z_n) {\; \stackrel{\text{iid}}{\sim}\;}F_{{\boldsymbol{\theta}}}.$ A realized value of $$Z$$ is $$z$$, $${\boldsymbol{Z}}= (Z_1, Z_2, \ldots, Z_n)^T$$, etc.

The EM algorithm and variational inference involve latent variables.

Bayesian models are a special case of latent variable models: the unobserved random parameters are latent variables.

## 40.2 Empirical Bayes Revisited

In the earlier EB example, we supposed that $$X_i | \mu_i \sim \mbox{Normal}(\mu_i, 1)$$ for $$i=1, 2, \ldots, n$$ where these rv’s are independent, and also that $$\mu_i {\; \stackrel{\text{iid}}{\sim}\;}\mbox{Normal}(a, b^2)$$.

The unobserved parameters $$\mu_1, \mu_2, \ldots, \mu_n$$ are latent variables. In this case, $${\boldsymbol{\theta}}= (a, b^2)$$.
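As a quick illustrative sketch (our own, not part of the original example), we can simulate this hierarchical model and recover $$(a, b^2)$$ by method of moments, using the fact that marginally $$X_i \sim \mbox{Normal}(a, b^2 + 1)$$. All variable names here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, a, b2 = 10_000, 2.0, 3.0  # hypothetical true values

mu = rng.normal(a, np.sqrt(b2), size=n)  # latent means (unobserved in practice)
x = rng.normal(mu, 1.0)                  # observed data: X_i | mu_i ~ Normal(mu_i, 1)

# Method-of-moments estimates from the marginal X_i ~ Normal(a, b^2 + 1)
a_hat = x.mean()
b2_hat = x.var() - 1.0  # since Var(X_i) = b^2 + 1
```

Note that the latent $$\mu_i$$ are never used in forming the estimates; only the observed $$x$$ is.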

## 40.3 Normal Mixture Model

Suppose $${X_1, X_2, \ldots, X_n}{\; \stackrel{\text{iid}}{\sim}\;}F_{{\boldsymbol{\theta}}}$$ where $${\boldsymbol{\theta}}= (\pi_1, \ldots, \pi_K, \mu_1, \ldots, \mu_K, \sigma^2_1, \ldots, \sigma^2_K)$$ with pdf

$f({\boldsymbol{x}}; {\boldsymbol{\theta}}) = \prod_{i=1}^n \sum_{k=1}^K \pi_k \frac{1}{\sqrt{2\pi\sigma^2_k}} \exp \left\{ -\frac{(x_i - \mu_k)^2}{2 \sigma^2_k} \right\}.$

The MLEs of the unknown parameters cannot be found analytically. This is a common mixture model to work with in applications, so we need to be able to estimate its parameters.
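Although the MLEs have no closed form, the log-likelihood itself is easy to evaluate numerically, which is what iterative methods like EM rely on. A minimal sketch, with function and variable names of our own choosing:

```python
import numpy as np

def normal_mixture_loglik(x, pi, mu, sigma2):
    """Log-likelihood of a K-component Normal mixture at data x."""
    x = np.asarray(x)[:, None]  # shape (n, 1), broadcasts against (K,) parameters
    # Per-observation mixture density: sum_k pi_k * N(x_i; mu_k, sigma2_k)
    dens = pi / np.sqrt(2 * np.pi * sigma2) * np.exp(-(x - mu) ** 2 / (2 * sigma2))
    return np.log(dens.sum(axis=1)).sum()

rng = np.random.default_rng(1)
# Hypothetical data: two well-separated components
x = np.concatenate([rng.normal(0, 1, 50), rng.normal(5, 1, 50)])
ll = normal_mixture_loglik(x, np.array([0.5, 0.5]),
                           np.array([0.0, 5.0]), np.array([1.0, 1.0]))
```

Evaluating at the data-generating parameters gives a higher log-likelihood than, say, collapsing both means to zero, which is the kind of comparison an optimizer exploits.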

There is a latent variable model that produces the same marginal distribution and likelihood function. Let $${\boldsymbol{Z}}_1, {\boldsymbol{Z}}_2, \ldots, {\boldsymbol{Z}}_n {\; \stackrel{\text{iid}}{\sim}\;}\mbox{Multinomial}_K(1, {\boldsymbol{\pi}})$$ where $${\boldsymbol{\pi}}= (\pi_1, \ldots, \pi_K)$$. Note that $$Z_{ik} \in \{0, 1\}$$ and $$\sum_{k=1}^K Z_{ik} = 1$$. Let $$[X_i | Z_{ik} = 1] \sim \mbox{Normal}(\mu_k, \sigma^2_k)$$, where $$\{X_i | {\boldsymbol{Z}}_i\}_{i=1}^{n}$$ are jointly independent.

The joint pdf is

$f({\boldsymbol{x}}, {\boldsymbol{z}}; {\boldsymbol{\theta}}) = \prod_{i=1}^n \prod_{k=1}^K \left[ \pi_k \frac{1}{\sqrt{2\pi\sigma^2_k}} \exp \left\{ -\frac{(x_i - \mu_k)^2}{2 \sigma^2_k} \right\} \right]^{z_{ik}}.$

Note that

$f({\boldsymbol{x}}, {\boldsymbol{z}}; {\boldsymbol{\theta}}) = \prod_{i=1}^n f(x_i, {\boldsymbol{z}}_i; {\boldsymbol{\theta}}).$ It can be verified that $$f({\boldsymbol{x}}; {\boldsymbol{\theta}})$$ is the marginal distribution of this latent variable model:

$f(x_i ; {\boldsymbol{\theta}}) = \sum_{{\boldsymbol{z}}_i} f(x_i, {\boldsymbol{z}}_i; {\boldsymbol{\theta}}) = \sum_{k=1}^K \pi_k \frac{1}{\sqrt{2\pi\sigma^2_k}} \exp \left\{ -\frac{(x_i - \mu_k)^2}{2 \sigma^2_k} \right\}.$

## 40.4 Bernoulli Mixture Model

Suppose $${X_1, X_2, \ldots, X_n}{\; \stackrel{\text{iid}}{\sim}\;}F_{{\boldsymbol{\theta}}}$$ where $${\boldsymbol{\theta}}= (\pi_1, \ldots, \pi_K, p_1, \ldots, p_K)$$ with pdf

$f({\boldsymbol{x}}; {\boldsymbol{\theta}}) = \prod_{i=1}^n \sum_{k=1}^K \pi_k p_k^{x_i} (1-p_k)^{1-x_i}.$

As in the Normal mixture model, the MLEs of the unknown parameters cannot be found analytically.

As before, there is a latent variable model that produces the same marginal distribution and likelihood function. Let $${\boldsymbol{Z}}_1, {\boldsymbol{Z}}_2, \ldots, {\boldsymbol{Z}}_n {\; \stackrel{\text{iid}}{\sim}\;}\mbox{Multinomial}_K(1, {\boldsymbol{\pi}})$$ where $${\boldsymbol{\pi}}= (\pi_1, \ldots, \pi_K)$$. Note that $$Z_{ik} \in \{0, 1\}$$ and $$\sum_{k=1}^K Z_{ik} = 1$$. Let $$[X_i | Z_{ik} = 1] \sim \mbox{Bernoulli}(p_k)$$, where $$\{X_i | {\boldsymbol{Z}}_i\}_{i=1}^{n}$$ are jointly independent.

The joint pdf is

$f({\boldsymbol{x}}, {\boldsymbol{z}}; {\boldsymbol{\theta}}) = \prod_{i=1}^n \prod_{k=1}^K \left[ p_k^{x_i} (1-p_k)^{1-x_i} \right]^{z_{ik}}.$
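The same simulation check works here: generating from the latent-variable form and discarding $${\boldsymbol{z}}$$ reproduces the mixture marginal, whose success probability is $$P(X_i = 1) = \sum_{k=1}^K \pi_k p_k$$. A sketch with illustrative parameter values of our own:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
pi = np.array([0.4, 0.6])   # mixing proportions
p = np.array([0.2, 0.9])    # component success probabilities

z = rng.choice(2, size=n, p=pi)   # latent component for each observation
x = rng.binomial(1, p[z])         # X_i | Z_ik = 1 ~ Bernoulli(p_k)

p_marginal = pi @ p               # 0.4*0.2 + 0.6*0.9 = 0.62
p_hat = x.mean()
```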