38 Empirical Bayes
38.1 Rationale
Under the scenario that \(\boldsymbol{X} | \theta {\; \stackrel{\text{iid}}{\sim}\;}F_{\theta}\) with prior distribution \(\theta \sim F_{\tau}\), we still have to determine a value for the prior parameter(s) \(\tau\).
The empirical Bayes approach uses the observed data to estimate the prior parameter(s), \(\tau\).
This is especially useful for high-dimensional data, where many parameters are simultaneously drawn from the prior and one or more observations are made per parameter realization.
38.2 Approach
The usual approach is to first integrate out the parameter to obtain
\[ f(\boldsymbol{x} ; \tau) = \int f(\boldsymbol{x} | \theta) f(\theta ; \tau) d\theta. \]
An estimation method (such as maximum likelihood) is then applied to the marginal \(f(\boldsymbol{x} ; \tau)\) to obtain \(\hat{\tau}\), and inference proceeds as usual under the assumption that \(\theta \sim f(\theta ; \hat{\tau})\).
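As a concrete sketch of this recipe (not part of the original notes), the snippet below numerically maximizes the marginal log-likelihood for the Normal–Normal model worked out in the next section. The simulated data, the variable names, and the use of scipy.optimize.minimize are assumptions made for illustration; the closed-form estimates derived below make this numerical step unnecessary here, but it shows the general approach when no closed form is available.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Simulate data for illustration (assumed): mu_i ~ Normal(a, b^2), X_i | mu_i ~ Normal(mu_i, 1)
rng = np.random.default_rng(1)
a_true, b_true, n = 2.0, 1.5, 500
mu = rng.normal(a_true, b_true, size=n)
x = rng.normal(mu, 1.0)

def neg_marginal_loglik(tau, x):
    # tau = (a, log b); marginally X_i ~ Normal(a, 1 + b^2) after integrating out mu_i
    a, log_b = tau
    marginal_sd = np.sqrt(1.0 + np.exp(log_b) ** 2)
    return -np.sum(norm.logpdf(x, loc=a, scale=marginal_sd))

# MLE of the prior parameters tau = (a, b) based on the marginal likelihood f(x; tau)
fit = minimize(neg_marginal_loglik, x0=np.array([0.0, 0.0]), args=(x,))
a_hat, b_hat = fit.x[0], np.exp(fit.x[1])
print(a_hat, b_hat)  # should be close to a_true and b_true
```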
38.3 Example: Normal
Suppose that \(X_i | \mu_i \sim \mbox{Normal}(\mu_i, 1)\) for \(i=1, 2, \ldots, n\) where these rv’s are independent. Also suppose that \(\mu_i {\; \stackrel{\text{iid}}{\sim}\;}\mbox{Normal}(a, b^2)\).
\[ f(x_i ; a, b) = \int f(x_i | \mu_i) f(\mu_i; a, b) \, d\mu_i \ \implies \ X_i \sim \mbox{Normal}(a, 1+b^2). \]
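One way to verify this marginal distribution (a short check, not spelled out in the original derivation) is to write \(X_i\) as the prior draw plus independent \(\mbox{Normal}(0, 1)\) noise:

\[\begin{align*}
X_i & = \mu_i + \varepsilon_i, \qquad \mu_i \sim \mbox{Normal}(a, b^2), \ \varepsilon_i \sim \mbox{Normal}(0, 1), \ \mu_i \perp \varepsilon_i \\
\implies X_i & \sim \mbox{Normal}(a, \, b^2 + 1),
\end{align*}\]

since a sum of independent Normal rv's is Normal, with means and variances adding.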
\[ \implies \hat{a} = \overline{x}, \ 1+\hat{b}^2 = \frac{\sum_{k=1}^n (x_k - \overline{x})^2}{n} \]
\[\begin{align*}
\operatorname{E}[\mu_i | x_i] & = \frac{1}{1+b^2}a + \frac{b^2}{1+b^2}x_i \\
\implies \hat{\operatorname{E}}[\mu_i | x_i] & = \frac{1}{1+\hat{b}^2}\hat{a} + \frac{\hat{b}^2}{1+\hat{b}^2}x_i \\
& = \frac{n}{\sum_{k=1}^n (x_k - \overline{x})^2} \overline{x} + \left(1-\frac{n}{\sum_{k=1}^n (x_k - \overline{x})^2}\right) x_i
\end{align*}\]
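The closed-form plug-in above translates directly into a few lines of code. Below is a minimal sketch (the simulated data and variable names are assumptions for illustration, not from the notes) that computes \(\hat{a}\), \(1+\hat{b}^2\), and the empirical Bayes posterior means, which shrink each \(x_i\) toward \(\overline{x}\).

```python
import numpy as np

# Simulate data for illustration (assumed): mu_i ~ Normal(a, b^2), X_i | mu_i ~ Normal(mu_i, 1)
rng = np.random.default_rng(42)
a_true, b_true, n = 2.0, 1.5, 1000
mu = rng.normal(a_true, b_true, size=n)
x = rng.normal(mu, 1.0)

# Empirical Bayes estimates of the prior parameters from the marginal Normal(a, 1 + b^2)
a_hat = x.mean()
one_plus_b2_hat = np.mean((x - a_hat) ** 2)   # estimates 1 + b^2
shrink = 1.0 / one_plus_b2_hat                # weight on the grand mean, 1 / (1 + b_hat^2)

# Plug-in posterior means: shrink each x_i toward the grand mean x-bar
eb_est = shrink * a_hat + (1.0 - shrink) * x

# Compare mean squared error of x_i vs. the shrunken estimates against the true mu_i
print(np.mean((x - mu) ** 2), np.mean((eb_est - mu) ** 2))
```

In this simulation the shrunken estimates typically have smaller average squared error for the \(\mu_i\) than the raw observations \(x_i\), which is the practical payoff of the empirical Bayes plug-in.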