34 Estimation
34.1 Assumptions
We will assume that \((X_1, X_2, \ldots, X_n) | \theta {\; \stackrel{\text{iid}}{\sim}\;}F_{\theta}\) with prior distribution \(\theta \sim F_{\tau}\) unless stated otherwise. Shorthand for the former is \(\boldsymbol{X} | \theta {\; \stackrel{\text{iid}}{\sim}\;}F_{\theta}\).
We will write the pdf or pmf of \(X\) as \(f(x | \theta)\) as opposed to \(f(x ; \theta)\) because in the Bayesian framework this actually represents conditional probability.
We will write the pdf or pmf of \(\theta\) as \(f(\theta)\) or \(f(\theta ; \tau)\) or \(f(\theta | \tau)\). Always remember that prior distributions require parameter values, even if we don’t explicitly write them.
34.2 Posterior Distribution
The posterior distribution of \(\theta | \boldsymbol{X}\) is obtained through Bayes’ theorem:
\[\begin{align*} f(\theta | \boldsymbol{x}) & = \frac{f(\boldsymbol{x} | \theta) f(\theta)}{f(\boldsymbol{x})} = \frac{f(\boldsymbol{x} | \theta) f(\theta)}{\int f(\boldsymbol{x} | \theta^*) f(\theta^*) d\theta^*} \\ & \propto L(\theta ; \boldsymbol{x}) f(\theta) \end{align*}\]

34.3 Posterior Expectation
A very common point estimate of \(\theta\) in Bayesian inference is the posterior expected value:
\[\begin{align*} \operatorname{E}[\theta | \boldsymbol{x}] & = \int \theta f(\theta | \boldsymbol{x}) d\theta \\ & = \frac{\int \theta L(\theta ; \boldsymbol{x}) f(\theta) d\theta}{\int L(\theta ; \boldsymbol{x}) f(\theta) d\theta} \end{align*}\]

34.4 Posterior Interval
The Bayesian analog of the frequentist confidence interval is the \(1-\alpha\) posterior interval, where \(C_{\ell}\) and \(C_{u}\) are determined so that:
\[ 1-\alpha = \Pr(C_\ell \leq \theta \leq C_u | \boldsymbol{x}) \]
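The posterior, its expectation, and a central posterior interval can all be computed directly from the grid form of Bayes’ theorem. The sketch below uses a hypothetical Beta–Bernoulli setup (the prior, sample size, and success count are illustrative choices, not from the text); conjugacy gives a closed-form posterior to check against.

```python
import numpy as np

# Hypothetical data: n = 20 Bernoulli(theta) trials with s = 14 successes,
# and a Beta(2, 2) prior on theta (all numbers are illustrative).
n, s = 20, 14
a, b = 2.0, 2.0

# Discretize theta and apply Bayes' theorem on the grid:
# f(theta | x) proportional to L(theta; x) f(theta).
theta = np.linspace(1e-6, 1 - 1e-6, 100_000)
log_post = (s + a - 1) * np.log(theta) + (n - s + b - 1) * np.log(1 - theta)
post = np.exp(log_post - log_post.max())
post /= post.sum()                      # normalize (the denominator of Bayes' theorem)

# Posterior expectation E[theta | x].
post_mean = np.sum(theta * post)

# Central 95% posterior interval (C_l, C_u) from the discrete posterior CDF.
cdf = np.cumsum(post)
C_l = theta[np.searchsorted(cdf, 0.025)]
C_u = theta[np.searchsorted(cdf, 0.975)]

# Conjugacy check: the exact posterior is Beta(a + s, b + n - s),
# whose mean is (a + s) / (a + b + n).
exact_mean = (a + s) / (a + b + n)
print(post_mean, exact_mean, (C_l, C_u))
```

The grid approximation agrees with the conjugate posterior mean to several decimal places, and the same cumulative-sum trick yields equal-tailed intervals for any grid-representable posterior.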
34.5 Maximum A Posteriori Probability
The maximum a posteriori probability (MAP) estimate is the value (or values) of \(\theta\) that maximize the posterior pdf or pmf:
\[\begin{align*} \hat{\theta}_{\text{MAP}} & = \operatorname{argmax}_\theta f(\theta | \boldsymbol{x}) \\ & = \operatorname{argmax}_\theta L(\theta ; \boldsymbol{x}) f(\theta) \end{align*}\]

This is a frequentist-esque use of the Bayesian framework.
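Because \(f(\boldsymbol{x})\) does not depend on \(\theta\), the MAP estimate can be found by maximizing \(L(\theta ; \boldsymbol{x}) f(\theta)\) alone. A minimal sketch, again assuming a hypothetical Beta–Bernoulli setup with illustrative numbers, where the posterior mode is also known in closed form:

```python
import numpy as np

# Hypothetical setup (illustrative numbers): Bernoulli data with a Beta prior.
n, s = 20, 14
a, b = 2.0, 2.0

# MAP: argmax over theta of L(theta; x) f(theta); the normalizing
# constant f(x) is free of theta, so it can be ignored.
theta = np.linspace(1e-6, 1 - 1e-6, 100_000)
log_unnorm = (s + a - 1) * np.log(theta) + (n - s + b - 1) * np.log(1 - theta)
theta_map = theta[np.argmax(log_unnorm)]

# Check: the Beta(a + s, b + n - s) posterior has mode
# (a + s - 1) / (a + b + n - 2) when both shape parameters exceed 1.
exact_mode = (a + s - 1) / (a + b + n - 2)
print(theta_map, exact_mode)
```

Maximizing the log of the unnormalized posterior, as above, is numerically safer than maximizing the product itself.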
34.6 Loss Functions
Let \(\mathcal{L}\left(\theta, \tilde{\theta}\right)\) be a loss function for a given estimator \(\tilde{\theta}\). Examples are
\[ \mathcal{L}\left(\theta, \tilde{\theta}\right) = \left(\theta - \tilde{\theta}\right)^2 \mbox{ or } \mathcal{L}\left(\theta, \tilde{\theta}\right) = \left|\theta - \tilde{\theta}\right|. \]
Note that, where the expected value is over \(f(\boldsymbol{x}; \theta)\):
\[\begin{align*} \operatorname{E}\left[\left(\theta - \tilde{\theta}\right)^2\right] & = \left(\operatorname{E}\left[\tilde{\theta}\right] - \theta\right)^2 + \operatorname{Var}\left(\tilde{\theta}\right) \\ & = \mbox{bias}^2 + \mbox{variance} \end{align*}\]

34.7 Bayes Risk
The Bayes risk, \(R\left(\theta, \tilde{\theta}\right)\), is the expected loss with respect to the posterior:
\[ R\left(\theta, \tilde{\theta}\right) = {\operatorname{E}}\left[ \left. \mathcal{L}\left(\theta, \tilde{\theta}\right) \right| \boldsymbol{x} \right] = \int \mathcal{L}\left(\theta, \tilde{\theta}\right) f(\theta | \boldsymbol{x}) d\theta \]
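On a grid, this integral is a weighted sum over the posterior. The sketch below evaluates the Bayes risk of an arbitrary estimate under squared-error loss for a hypothetical Beta(16, 8) posterior (illustrative numbers, e.g. from a Beta(2, 2) prior and 14 successes in 20 Bernoulli trials), and checks it against the identity \(\operatorname{E}[(\theta - c)^2 | \boldsymbol{x}] = \operatorname{Var}(\theta | \boldsymbol{x}) + (\operatorname{E}[\theta | \boldsymbol{x}] - c)^2\):

```python
import numpy as np

# Grid approximation of a hypothetical Beta(16, 8) posterior.
theta = np.linspace(1e-6, 1 - 1e-6, 100_000)
post = theta ** 15 * (1 - theta) ** 7
post /= post.sum()

def bayes_risk(theta_tilde):
    """Posterior expected squared-error loss of the estimate theta_tilde."""
    return np.sum((theta - theta_tilde) ** 2 * post)

post_mean = np.sum(theta * post)
post_var = np.sum((theta - post_mean) ** 2 * post)

# Identity: E[(theta - c)^2 | x] = Var(theta | x) + (E[theta | x] - c)^2.
c = 0.5
print(bayes_risk(c), post_var + (post_mean - c) ** 2)
```

The identity makes clear that the risk is smallest when \(c\) equals the posterior mean, which previews the result of the next section.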
34.8 Bayes Estimators
The Bayes estimator minimizes the Bayes risk.
The posterior expectation \({\operatorname{E}}\left[ \left. \theta \right| \boldsymbol{x} \right]\) minimizes the Bayes risk of \(\mathcal{L}\left(\theta, \tilde{\theta}\right) = \left(\theta - \tilde{\theta}\right)^2\).
The median of \(f(\theta | \boldsymbol{x})\), calculated by \(F^{-1}_{\theta | \boldsymbol{x}}(1/2)\), minimizes the Bayes risk of \(\mathcal{L}\left(\theta, \tilde{\theta}\right) = \left|\theta - \tilde{\theta}\right|\).
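Both claims can be verified numerically by scanning candidate estimates and minimizing each Bayes risk on a grid. The sketch below again assumes a hypothetical Beta(16, 8) posterior (illustrative numbers):

```python
import numpy as np

# Grid approximation of a hypothetical Beta(16, 8) posterior.
theta = np.linspace(1e-6, 1 - 1e-6, 100_000)
post = theta ** 15 * (1 - theta) ** 7
post /= post.sum()

# Candidate estimates on a fine grid.
candidates = np.linspace(0.01, 0.99, 981)

# Bayes risk of each candidate under squared-error and absolute-error loss.
sq_risk = [np.sum((theta - c) ** 2 * post) for c in candidates]
abs_risk = [np.sum(np.abs(theta - c) * post) for c in candidates]

best_sq = candidates[np.argmin(sq_risk)]
best_abs = candidates[np.argmin(abs_risk)]

# Compare to the posterior mean and the posterior median F^{-1}(1/2).
post_mean = np.sum(theta * post)
post_median = theta[np.searchsorted(np.cumsum(post), 0.5)]
print(best_sq, post_mean)     # squared-error loss -> posterior mean
print(best_abs, post_median)  # absolute-error loss -> posterior median
```

Up to grid resolution, the squared-error minimizer matches the posterior mean and the absolute-error minimizer matches the posterior median, as the two results above state.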