# 36 Priors

## 36.1 Conjugate Priors

A conjugate prior for a data generating distribution is a prior distribution chosen so that the posterior distribution belongs to the same family as the prior.

Conjugate priors are useful because they make the calculation of the posterior straightforward.

There is a systematic method for calculating conjugate priors for exponential family distributions.

## 36.2 Example: Beta-Bernoulli

Suppose $$\boldsymbol{X} | p {\; \stackrel{\text{iid}}{\sim}\;}\mbox{Bernoulli}(p)$$ and suppose that $$p \sim \mbox{Beta}(\alpha, \beta)$$.

\begin{align*} f(p | \boldsymbol{x}) & \propto L(p ; \boldsymbol{x}) f(p) \\ & = p^{\sum x_i} (1-p)^{\sum (1-x_i)} p^{\alpha - 1} (1-p)^{\beta-1} \\ & = p^{\alpha - 1 + \sum x_i} (1-p)^{\beta - 1 + \sum (1-x_i)} \\ & \propto \mbox{Beta}(\alpha + \sum x_i, \beta + \sum (1-x_i)) \end{align*}

Therefore, ${\operatorname{E}}[p | \boldsymbol{x}] = \frac{\alpha + \sum x_i}{\alpha + \beta + n}.$
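The update above amounts to adding the observed counts to the prior parameters. A minimal Python sketch of that bookkeeping follows; the function names, the prior values $$\alpha = \beta = 2$$, and the data are hypothetical choices made only for illustration.

```python
def beta_bernoulli_posterior(alpha, beta, xs):
    """Return the Beta posterior parameters after observing 0/1 data xs."""
    successes = sum(xs)              # sum of x_i
    failures = len(xs) - successes   # sum of (1 - x_i)
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical data: 6 successes and 2 failures, with a Beta(2, 2) prior.
xs = [1, 0, 1, 1, 0, 1, 1, 1]
a_post, b_post = beta_bernoulli_posterior(2.0, 2.0, xs)
post_mean = beta_mean(a_post, b_post)
print(a_post, b_post, post_mean)
```

Here the posterior is Beta(8, 4), and its mean equals $$(\alpha + \sum x_i)/(\alpha + \beta + n) = 8/12$$, matching the formula above.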

## 36.3 Example: Normal-Normal

Suppose $$\boldsymbol{X} | \mu {\; \stackrel{\text{iid}}{\sim}\;}\mbox{Normal}(\mu, \sigma^2)$$, where $$\sigma^2$$ is known, and suppose that $$\mu \sim \mbox{Normal}(a, b^2)$$.

Then it can be shown that $$\mu | \boldsymbol{x} \sim \mbox{Normal}({\operatorname{E}}[\mu | \boldsymbol{x}], {\operatorname{Var}}(\mu | \boldsymbol{x}))$$ where

${\operatorname{E}}[\mu | \boldsymbol{x}] = \frac{b^2}{\frac{\sigma^2}{n} + b^2} \overline{x} + \frac{\frac{\sigma^2}{n}}{\frac{\sigma^2}{n} + b^2} a$

${\operatorname{Var}}(\mu | \boldsymbol{x}) = \frac{b^2 \frac{\sigma^2}{n}}{\frac{\sigma^2}{n} + b^2}$
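The posterior mean is a weighted average of $$\overline{x}$$ and the prior mean $$a$$, with weights determined by $$b^2$$ and $$\sigma^2/n$$. A short Python sketch of the two formulas, with hypothetical data and hyperparameters ($$\sigma^2 = 1$$, $$a = 0$$, $$b^2 = 1$$) assumed purely for illustration:

```python
def normal_normal_posterior(xs, sigma2, a, b2):
    """Posterior mean and variance of mu for Normal data with known variance.

    xs: observations; sigma2: known data variance;
    a, b2: mean and variance of the Normal prior on mu.
    """
    n = len(xs)
    xbar = sum(xs) / n
    se2 = sigma2 / n                 # sigma^2 / n
    weight = b2 / (se2 + b2)         # weight placed on the sample mean
    post_mean = weight * xbar + (1.0 - weight) * a
    post_var = b2 * se2 / (se2 + b2)
    return post_mean, post_var


# Hypothetical data with sample mean 5.2.
xs = [4.8, 5.6, 5.1, 4.9, 5.6]
post_mean, post_var = normal_normal_posterior(xs, sigma2=1.0, a=0.0, b2=1.0)
print(post_mean, post_var)
```

With these values the posterior mean is pulled from 5.2 toward the prior mean 0, and the posterior variance $$1/6$$ is smaller than both $$b^2 = 1$$ and $$\sigma^2/n = 0.2$$.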

## 36.4 Example: Dirichlet-Multinomial

This is a problem on Homework 3!

## 36.5 Example: Gamma-Poisson

This is a problem on Homework 3!

## 36.6 Jeffreys Prior

If we do inference based on a prior $$\theta \sim F_{\tau}$$ to obtain $$f(\theta | \boldsymbol{x}) \propto L(\theta; \boldsymbol{x}) f(\theta)$$, this inference may not be invariant to transformations of $$\theta$$, such as $$\eta = g(\theta)$$.

If we utilize a Jeffreys prior, meaning a prior that satisfies

$f(\theta) \propto \sqrt{I(\theta)}$

where $$I(\theta)$$ is the Fisher information, then the prior will be invariant to transformations of $$\theta$$. To verify this, we would want to show that $$f(\theta) \propto \sqrt{I(\theta)}$$ implies $$f(\eta) \propto \sqrt{I(\eta)}$$.
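A sketch of that argument for a scalar parameter, under the added assumption that $$\eta = g(\theta)$$ is one-to-one and differentiable: the density of $$\eta$$ follows from the change of variables formula, and applying the chain rule to the score gives $$I(\eta) = I(\theta) \left( \frac{d\theta}{d\eta} \right)^2$$. Combining the two,

\begin{align*}
f(\eta) & = f(\theta) \left| \frac{d\theta}{d\eta} \right| \\
& \propto \sqrt{I(\theta)} \left| \frac{d\theta}{d\eta} \right| \\
& = \sqrt{I(\theta) \left( \frac{d\theta}{d\eta} \right)^2} \\
& = \sqrt{I(\eta)}
\end{align*}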

## 36.7 Examples: Jeffreys Priors

Normal$$(\mu, \sigma^2)$$, $$\sigma^2$$ known: $$f(\mu) \propto 1$$

Normal$$(\mu, \sigma^2)$$, $$\mu$$ known: $$f(\sigma) \propto \frac{1}{\sigma}$$

Poisson$$(\lambda)$$: $$f(\lambda) \propto \frac{1}{\sqrt{\lambda}}$$

Bernoulli$$(p)$$: $$f(p) \propto \frac{1}{\sqrt{p(1-p)}}$$
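As a check on the Bernoulli case, the Fisher information can be computed directly as the expected squared score over $$x \in \{0, 1\}$$. The Python sketch below does this; the function names are hypothetical and the evaluation points are arbitrary.

```python
import math


def bernoulli_fisher_information(p):
    """E[(d/dp log f(X; p))^2] for X ~ Bernoulli(p), summing over x in {0, 1}."""
    score_at_1 = 1.0 / p             # d/dp log(p)
    score_at_0 = -1.0 / (1.0 - p)    # d/dp log(1 - p)
    return p * score_at_1 ** 2 + (1.0 - p) * score_at_0 ** 2


def jeffreys_unnormalized(p):
    """Unnormalized Jeffreys prior: square root of the Fisher information."""
    return math.sqrt(bernoulli_fisher_information(p))


for p in (0.1, 0.3, 0.5, 0.9):
    # The two printed values should agree up to floating point error.
    print(p, jeffreys_unnormalized(p), 1.0 / math.sqrt(p * (1.0 - p)))
```

The result $$f(p) \propto p^{-1/2} (1-p)^{-1/2}$$ is the kernel of a Beta$$(1/2, 1/2)$$ distribution, so in this case the Jeffreys prior is also a conjugate prior.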

## 36.8 Improper Prior

An improper prior is a prior such that $$\int f(\theta) d\theta = \infty$$. Nevertheless, sometimes it still may be the case that $$f(\theta | \boldsymbol{x}) \propto L(\theta; \boldsymbol{x}) f(\theta)$$ yields a probability distribution.

Take for example the case where $$\boldsymbol{X} | \mu {\; \stackrel{\text{iid}}{\sim}\;}\mbox{Normal}(\mu, \sigma^2)$$, where $$\sigma^2$$ is known, and suppose that $$f(\mu) \propto 1$$. Then $$\int f(\mu) d\mu = \infty$$, but $$f(\mu | \boldsymbol{x}) \propto L(\mu; \boldsymbol{x}) f(\mu)$$ gives

$\mu | \boldsymbol{x} \sim \mbox{Normal}\left(\overline{x}, \sigma^2/n\right)$

which is a proper probability distribution.
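This can be checked numerically by normalizing the likelihood over a grid of $$\mu$$ values, since the flat prior contributes only a constant. The Python sketch below uses a simple Riemann sum; the data, the grid range, and the function name are hypothetical, and the grid is assumed wide enough that the truncated tails are negligible.

```python
import math


def flat_prior_posterior_grid(xs, sigma2, lo, hi, steps=20001):
    """Normalize the Normal likelihood in mu on an evenly spaced grid."""
    h = (hi - lo) / (steps - 1)
    grid = [lo + i * h for i in range(steps)]
    # Log-likelihood up to an additive constant; the flat prior adds nothing.
    loglik = [-sum((x - m) ** 2 for x in xs) / (2.0 * sigma2) for m in grid]
    top = max(loglik)
    weights = [math.exp(v - top) for v in loglik]
    total = sum(weights) * h         # finite, so the posterior is proper
    density = [w / total for w in weights]
    return grid, density, h


# Hypothetical data with sample mean 5.2 and known variance 1.
xs = [4.8, 5.6, 5.1, 4.9, 5.6]
sigma2 = 1.0
grid, density, h = flat_prior_posterior_grid(xs, sigma2, lo=0.0, hi=10.0)
post_mean = sum(m * d for m, d in zip(grid, density)) * h
post_var = sum((m - post_mean) ** 2 * d for m, d in zip(grid, density)) * h
print(post_mean, post_var)
```

The grid-based mean and variance should be close to $$\overline{x} = 5.2$$ and $$\sigma^2/n = 0.2$$, in agreement with the Normal$$(\overline{x}, \sigma^2/n)$$ posterior.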