# 35 Classification

## 35.1 Assumptions

Let $$(X_1, X_2, \ldots, X_n) | \theta {\; \stackrel{\text{iid}}{\sim}\;}F_\theta$$ where $$\theta \in \Theta$$ and $$\theta \sim F_{\tau}$$. Let $$\Theta_0, \Theta_1 \subseteq \Theta$$ so that $$\Theta_0 \cap \Theta_1 = \varnothing$$ and $$\Theta_0 \cup \Theta_1 = \Theta$$.

Given observed data $$\boldsymbol{x}$$, we wish to classify whether $$\theta \in \Theta_0$$ or $$\theta \in \Theta_1$$.

This is the Bayesian analog of hypothesis testing.

## 35.2 Prior Probability on H

Let $$H$$ be a random variable such that $$H=0$$ when $$\theta \in \Theta_0$$ and $$H=1$$ when $$\theta \in \Theta_1$$.

From the prior distribution on $$\theta$$, with density $$f(\theta)$$, we can calculate

$\Pr(H=0) = \int_{\theta \in \Theta_0} f(\theta) d\theta$

and $$\Pr(H=1) = 1-\Pr(H=0)$$.
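As a rough numerical illustration of this integral, the sketch below assumes a Beta(2, 5) prior on $$\theta \in (0, 1)$$ and takes $$\Theta_0 = (0, 0.5]$$. The prior, the boundary, and the helper names `beta_pdf` and `integrate` are all assumptions made for this example, not part of the setup above.

```python
from math import gamma

def beta_pdf(theta, a, b):
    """Density of a Beta(a, b) distribution at theta (assumed prior)."""
    norm = gamma(a + b) / (gamma(a) * gamma(b))
    return norm * theta ** (a - 1) * (1 - theta) ** (b - 1)

def integrate(g, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

# Assumed example: theta ~ Beta(2, 5), Theta_0 = (0, 0.5], Theta_1 = (0.5, 1)
a, b = 2.0, 5.0
prior_h0 = integrate(lambda t: beta_pdf(t, a, b), 0.0, 0.5)  # Pr(H = 0)
prior_h1 = 1.0 - prior_h0                                    # Pr(H = 1)
print(prior_h0, prior_h1)
```

For this particular prior the integral has the closed form $$57/64$$, so the numerical value can be checked against it.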

## 35.3 Posterior Probability

Using Bayes' theorem, we can also calculate

\begin{align*} \Pr(H=0 | \boldsymbol{x}) & = \frac{f(\boldsymbol{x} | H=0) \Pr(H=0)}{f(\boldsymbol{x})} \\ & = \frac{\int_{\theta \in \Theta_0} f(\boldsymbol{x} | \theta) f(\theta) d\theta}{\int_{\theta \in \Theta} f(\boldsymbol{x} | \theta) f(\theta) d\theta} \end{align*}

Note that $$\Pr(H=1 | \boldsymbol{x}) = 1-\Pr(H=0 | \boldsymbol{x})$$.
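The ratio of integrals can be approximated numerically. The sketch below assumes $$n$$ iid Bernoulli($$\theta$$) observations with $$k$$ successes, a Uniform(0, 1) prior, and $$\Theta_0 = (0, 0.5]$$. The model, prior, and function names are illustrative assumptions.

```python
def integrate(g, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

def likelihood(theta, k, n):
    """f(x | theta) for n iid Bernoulli(theta) draws with k successes."""
    return theta ** k * (1 - theta) ** (n - k)

def prior_pdf(theta):
    """Uniform(0, 1) prior density (an assumption for this sketch)."""
    return 1.0

def posterior_h0(k, n, boundary=0.5):
    """Pr(H = 0 | x), with Theta_0 = (0, boundary]."""
    def integrand(t):
        return likelihood(t, k, n) * prior_pdf(t)
    numer = integrate(integrand, 0.0, boundary)  # integral over Theta_0
    denom = integrate(integrand, 0.0, 1.0)       # integral over Theta
    return numer / denom

post_h0 = posterior_h0(k=8, n=10)
post_h1 = 1.0 - post_h0
print(post_h0, post_h1)
```

With 8 successes in 10 trials, most of the posterior mass sits above 0.5, so `post_h0` is small. Under these assumptions the posterior is Beta(9, 3), which gives the exact value $$67/2048$$ for comparison.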

## 35.4 Loss Function

Let $$\mathcal{L}\left(\tilde{H}, H\right)$$ be such that

\begin{align*} \mathcal{L}\left(\tilde{H}=1, H=0 \right) & = c_{I}\\ \mathcal{L}\left(\tilde{H}=0, H=1 \right) & = c_{II} \end{align*}

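for some $$c_{I}, c_{II} > 0$$, with zero loss for a correct classification: $$\mathcal{L}\left(\tilde{H}=0, H=0 \right) = \mathcal{L}\left(\tilde{H}=1, H=1 \right) = 0$$.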

## 35.5 Bayes Risk

The Bayes risk, $$R\left(\tilde{H}, H\right)$$, is

\begin{align*} \operatorname{E}\left[ \mathcal{L}\left(\tilde{H}, H\right) \right] & = c_{I} \Pr(\tilde{H}=1, H=0) + c_{II} \Pr(\tilde{H}=0, H=1) \\ & = c_{I} \Pr(\tilde{H}=1 | H=0) \Pr(H=0) \\ & \quad\quad + c_{II} \Pr(\tilde{H}=0 | H=1) \Pr(H=1) \end{align*}

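Notice how this balances what frequentists call Type I error, $$\Pr(\tilde{H}=1 | H=0)$$, and Type II error, $$\Pr(\tilde{H}=0 | H=1)$$, weighting each by its cost and by the prior probability of the corresponding hypothesis.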
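To make the two terms concrete, the following sketch evaluates the Bayes risk for a family of simple cutoff rules. It assumes $$n = 10$$ iid Bernoulli($$\theta$$) observations with $$k$$ successes, a Uniform(0, 1) prior, $$\Theta_0 = (0, 0.5]$$, and costs $$c_{I} = 1$$, $$c_{II} = 3$$. The rule "set $$\tilde{H}=1$$ when $$k \geq$$ cutoff" and all names are assumptions chosen for illustration.

```python
from math import comb

def integrate(g, lo, hi, steps=2_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

def joint_prob(k, n, lo, hi):
    """Pr(K = k, lo < theta <= hi) under a Uniform(0, 1) prior (assumed)."""
    return integrate(lambda t: comb(n, k) * t ** k * (1 - t) ** (n - k), lo, hi)

def bayes_risk(cutoff, n, c_I, c_II):
    """Bayes risk of the rule: H_tilde = 1 when k >= cutoff, else 0."""
    # Pr(H_tilde = 1, H = 0): decide 1 while theta is in Theta_0
    type_I = sum(joint_prob(k, n, 0.0, 0.5) for k in range(cutoff, n + 1))
    # Pr(H_tilde = 0, H = 1): decide 0 while theta is in Theta_1
    type_II = sum(joint_prob(k, n, 0.5, 1.0) for k in range(cutoff))
    return c_I * type_I + c_II * type_II

n, c_I, c_II = 10, 1.0, 3.0
risks = {cutoff: bayes_risk(cutoff, n, c_I, c_II) for cutoff in range(n + 2)}
best_cutoff = min(risks, key=risks.get)
print(best_cutoff, risks[best_cutoff])
```

The two extreme cutoffs are useful sanity checks. A cutoff of 0 always decides $$\tilde{H}=1$$, so the risk reduces to $$c_{I} \Pr(H=0) = 0.5$$. A cutoff of $$n+1$$ always decides $$\tilde{H}=0$$, giving $$c_{II} \Pr(H=1) = 1.5$$. Intermediate cutoffs trade one error type against the other.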

## 35.6 Bayes Rule

The estimate $$\tilde{H}$$ that minimizes $$R\left(\tilde{H}, H\right)$$ is

$\tilde{H}=1 \mbox{ when } \Pr(H=1 | \boldsymbol{x}) \geq \frac{c_{I}}{c_{I} + c_{II}}$

and $$\tilde{H}=0$$ otherwise.
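To see where the threshold comes from, compare the posterior expected loss of each decision, assuming a correct classification incurs zero loss:

\begin{align*} \operatorname{E}\left[ \left. \mathcal{L}\left(\tilde{H}=1, H\right) \right| \boldsymbol{x} \right] & = c_{I} \Pr(H=0 | \boldsymbol{x}) = c_{I} \left(1 - \Pr(H=1 | \boldsymbol{x})\right) \\ \operatorname{E}\left[ \left. \mathcal{L}\left(\tilde{H}=0, H\right) \right| \boldsymbol{x} \right] & = c_{II} \Pr(H=1 | \boldsymbol{x}) \end{align*}

Deciding $$\tilde{H}=1$$ is no worse than deciding $$\tilde{H}=0$$ exactly when $$c_{I} \left(1 - \Pr(H=1 | \boldsymbol{x})\right) \leq c_{II} \Pr(H=1 | \boldsymbol{x})$$, which rearranges to $$\Pr(H=1 | \boldsymbol{x}) \geq c_{I} / (c_{I} + c_{II})$$. Minimizing the posterior expected loss for every $$\boldsymbol{x}$$ also minimizes the Bayes risk. A minimal sketch of the rule in code, with hypothetical function names, could look like this:

```python
def posterior_expected_loss(decision, post_h1, c_I, c_II):
    """Posterior expected loss of a decision, given Pr(H = 1 | x) = post_h1."""
    if decision == 1:
        return c_I * (1.0 - post_h1)  # wrong only if H = 0
    return c_II * post_h1             # wrong only if H = 1

def bayes_rule(post_h1, c_I, c_II):
    """Return 1 when Pr(H = 1 | x) >= c_I / (c_I + c_II), else 0."""
    threshold = c_I / (c_I + c_II)
    return 1 if post_h1 >= threshold else 0

# With c_I = 1 and c_II = 3 the threshold is 0.25
decision = bayes_rule(post_h1=0.3, c_I=1.0, c_II=3.0)
print(decision)
```

Raising $$c_{I}$$ relative to $$c_{II}$$ raises the threshold, so stronger posterior evidence is needed before deciding $$\tilde{H}=1$$.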