28 MLE Examples: Two-Samples

28.1 Comparing Two Populations

So far we have concentrated on analyzing \(n\) observations from a single population.

However, suppose that we want to do inference to compare two populations?

The framework we have described so far is easily extended to accommodate this.

28.2 Two RVs

If \(X\) and \(Y\) are independent rv’s then:

\[{\rm E}[X - Y] = {\rm E}[X] - {\rm E}[Y]\]

\[{\rm Var}(X-Y) = {\rm Var}(X) + {\rm Var}(Y)\]

28.3 Two Sample Means

Let \(X_1, X_2, \ldots, X_{n_1}\) be iid rv’s with population mean \(\mu_1\) and population variance \(\sigma^2_1\).

Let \(Y_1, Y_2, \ldots, Y_{n_2}\) be iid rv’s with population mean \(\mu_2\) and population variance \(\sigma^2_2\).

Assume that the two sets of rv’s are independent. Then when the CLT applies to each set of rv’s, as \(\min(n_1, n_2) \rightarrow \infty\), it follows that

\[ \frac{\overline{X} - \overline{Y} - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma^2_1}{n_1} + \frac{\sigma^2_2}{n_2}}} \stackrel{D}{\longrightarrow} \mbox{Normal}(0,1)\]

28.4 Two MLEs

Suppose \(X_1, X_2, \ldots, X_n \stackrel{{\rm iid}}{\sim} F_{\theta}\) and \(Y_1, Y_2, \ldots, Y_m \stackrel{{\rm iid}}{\sim} F_{\gamma}\) with MLEs \(\hat{\theta}_n\) and \(\hat{\gamma}_m\), respectively. Then as \(\min(n, m) \rightarrow \infty\),

\[\frac{\hat{\theta}_n - \hat{\gamma}_m - (\theta - \gamma)}{\sqrt{\hat{{\operatorname{se}}}(\hat{\theta}_n)^2 + \hat{{\operatorname{se}}}(\hat{\gamma}_m)^2}} \stackrel{D}{\longrightarrow} \mbox{Normal}(0,1).\]

28.5 Poisson

Let \(X_1, X_2, \ldots, X_{n_1}\) be iid \(\mbox{Poisson}(\lambda_1)\) and \(Y_1, Y_2, \ldots, Y_{n_2}\) be iid \(\mbox{Poisson}(\lambda_2)\).

We have \(\hat{\lambda}_1 = \overline{X}\) and \(\hat{\lambda}_2 = \overline{Y}\). As \(\min(n_1, n_2) \rightarrow \infty\),

\[ \frac{\hat{\lambda}_1 - \hat{\lambda}_2 - (\lambda_1 - \lambda_2)}{\sqrt{\frac{\hat{\lambda}_1}{n_1} + \frac{\hat{\lambda}_2}{n_2}}} \stackrel{D}{\longrightarrow} \mbox{Normal}(0,1). \]

28.6 Normal (Unequal Variances)

Let \(X_1, X_2, \ldots, X_{n_1}\) be iid \(\mbox{Normal}(\mu_1, \sigma^2_1)\) and \(Y_1, Y_2, \ldots, Y_{n_2}\) be iid \(\mbox{Normal}(\mu_2, \sigma^2_2)\).

We have \(\hat{\mu}_1 = \overline{X}\) and \(\hat{\mu}_2 = \overline{Y}\). As \(\min(n_1, n_2) \rightarrow \infty\),

\[ \frac{\hat{\mu}_1 - \hat{\mu}_2 - (\mu_1 - \mu_2)}{\sqrt{\frac{\hat{\sigma}^2_1}{n_1} + \frac{\hat{\sigma}^2_2}{n_2}}} \stackrel{D}{\longrightarrow} \mbox{Normal}(0,1). \]

28.7 Normal (Equal Variances)

Let \(X_1, X_2, \ldots, X_{n_1}\) be iid \(\mbox{Normal}(\mu_1, \sigma^2)\) and \(Y_1, Y_2, \ldots, Y_{n_2}\) be iid \(\mbox{Normal}(\mu_2, \sigma^2)\).

We have \(\hat{\mu}_1 = \overline{X}\) and \(\hat{\mu}_2 = \overline{Y}\). As \(\min(n_1, n_2) \rightarrow \infty\),

\[ \frac{\hat{\mu}_1 - \hat{\mu}_2 - (\mu_1 - \mu_2)}{\sqrt{\frac{\hat{\sigma}^2}{n_1} + \frac{\hat{\sigma}^2}{n_2}}} \stackrel{D}{\longrightarrow} \mbox{Normal}(0,1) \]

where

\[ \hat{\sigma}^2 = \frac{\sum_{i=1}^{n_1}(X_i - \overline{X})^2 + \sum_{i=1}^{n_2}(Y_i - \overline{Y})^2}{n_1 + n_2} \]

28.8 Binomial

Let \(X \sim \mbox{Binomial}(n_1, p_1)\) and \(Y \sim \mbox{Binomial}(n_2, p_2)\).

We have \(\hat{p}_1 = X/n_1\) and \(\hat{p}_2 = Y/n_2\). As \(\min(n_1, n_2) \rightarrow \infty\),

\[ \frac{\hat{p}_1 - \hat{p}_2 - (p_1 - p_2)}{\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}} \stackrel{D}{\longrightarrow} \mbox{Normal}(0,1). \]

28.9 Example: Binomial CI

A 95% CI for the difference \(p_1 - p_2\) can be obtained by unfolding the above pivotal statistic:

\[\left((\hat{p}_1 - \hat{p}_2) - 1.96 \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \right.,\]

\[\left. (\hat{p}_1 - \hat{p}_2) + 1.96 \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \right)\]

28.10 Example: Binomial HT

Suppose we wish to test \(H_0: p_1 = p_2\) vs \(H_1: p_1 \not= p_2\).

First form the \(z\)-statistic:

\[ z = \frac{\hat{p}_1 - \hat{p}_2 }{\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}. \]

Now, calculate the p-value:

\[ {\rm Pr}(|Z^*| \geq |z|) \]

where \(Z^*\) is a Normal(0,1) random variable.