16 Multivariate RVs
16.1 Multinomial
Suppose \boldsymbol{X} (an m-vector) is \mbox{Multinomial}_m(n, \boldsymbol{p}), where \boldsymbol{p} is an m-vector such that \sum_{i=1}^m p_i = 1. It has pmf
f(\boldsymbol{x}; \boldsymbol{p}) = {n \choose x_1 \ x_2 \ \cdots \ x_m} p_1^{x_1} p_2^{x_2} \cdots p_m^{x_m}
where
{n \choose x_1 \ x_2 \ \cdots \ x_m} = \frac{n!}{x_1! x_2! \cdots x_m!} and \sum_{i=1}^m x_i = n.
The Multinomial distribution generalizes the Binomial distribution from two categories to m categories. It models n independent trials, each of which falls into category i with probability p_i (for i=1, 2, \ldots, m). The random variable X_i counts the trials that fall into category i, so the counts are constrained by \sum_{i=1}^m X_i = n.
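As a rough check of the pmf, the formula can be evaluated by hand and compared with base R's dmultinom. This is a minimal sketch: the helper name multinom_pmf and the example values are illustrative assumptions, and the helper assumes every p_i is strictly positive.

```r
# Multinomial pmf computed directly from the formula, on the log scale.
# `multinom_pmf` is a hypothetical helper name, not a base R function.
# Assumes all entries of p are strictly positive.
multinom_pmf <- function(x, p) {
  n <- sum(x)
  log_coef <- lgamma(n + 1) - sum(lgamma(x + 1))  # log of n! / (x_1! ... x_m!)
  exp(log_coef + sum(x * log(p)))
}

x <- c(2, 1, 3)          # counts per category, so n = 6
p <- c(0.2, 0.3, 0.5)    # category probabilities, summing to 1

by_hand  <- multinom_pmf(x, p)
built_in <- dmultinom(x, prob = p)
all.equal(by_hand, built_in)

# With m = 2 the Multinomial reduces to the Binomial.
all.equal(dmultinom(c(2, 3), prob = c(0.3, 0.7)), dbinom(2, size = 5, prob = 0.3))
```

For these values the multinomial coefficient is 60 and the product of powers is 0.0015, so both calculations give 0.09.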
It can be calculated that
{\operatorname{E}}[X_i] = np_i, \quad {\operatorname{Var}}(X_i) = n p_i (1-p_i),
{\operatorname{Cov}}(X_i, X_j) = -n p_i p_j \quad (i \not= j).
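These moments can be checked approximately by simulation. The sketch below uses base R's rmultinom. The values of n and p, the seed, and the number of draws are arbitrary choices for illustration.

```r
set.seed(1)
n <- 20
p <- c(0.2, 0.3, 0.5)

# rmultinom returns one draw per column; transpose so that rows are draws.
draws <- t(rmultinom(50000, size = n, prob = p))

emp_mean <- colMeans(draws)
emp_cov  <- cov(draws)

theo_mean <- n * p
# Diagonal entries are n p_i (1 - p_i); off-diagonal entries are -n p_i p_j.
theo_cov  <- n * (diag(p) - outer(p, p))

round(emp_mean - theo_mean, 2)   # differences should be close to 0
round(emp_cov - theo_cov, 2)     # differences should be close to 0
```

Each row of theo_cov sums to zero, which reflects the constraint that the counts always add up to n.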
16.2 Multivariate Normal
The n-vector \boldsymbol{X} has a Multivariate Normal distribution, written \boldsymbol{X} \sim \mbox{MVN}_n(\boldsymbol{\mu}, \boldsymbol{\Sigma}), where \boldsymbol{\mu} is the n-vector of population means and \boldsymbol{\Sigma} is the n \times n variance-covariance matrix, assumed here to be positive definite so that \boldsymbol{\Sigma}^{-1} exists. Its pdf is
f(\boldsymbol{x}; \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{\sqrt{(2 \pi)^n |\boldsymbol{\Sigma}|}} \exp \left\{ -\frac{1}{2} (\boldsymbol{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\boldsymbol{x} - \boldsymbol{\mu}) \right\}.
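One way to sanity-check a density of this form is to code it directly, with normalizing constant (2 \pi)^{-n/2} |\boldsymbol{\Sigma}|^{-1/2}, and compare it with cases where the answer is known. The sketch below uses only base R. The helper name mvn_pdf and the example values are assumptions made for illustration.

```r
# Multivariate Normal density evaluated on the log scale.
# `mvn_pdf` is a hypothetical helper name, not a library function.
mvn_pdf <- function(x, mu, Sigma) {
  n <- length(mu)
  d <- x - mu
  quad <- drop(t(d) %*% solve(Sigma, d))   # (x - mu)^T Sigma^{-1} (x - mu)
  log_det <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
  exp(-0.5 * (n * log(2 * pi) + log_det + quad))
}

mu    <- c(1, -1)
Sigma <- diag(c(4, 9))   # uncorrelated components with sd 2 and 3
x     <- c(0.5, 2)

# With a diagonal Sigma the joint density factors into univariate Normals.
joint   <- mvn_pdf(x, mu, Sigma)
product <- dnorm(x[1], mean = 1, sd = 2) * dnorm(x[2], mean = -1, sd = 3)
all.equal(joint, product)

# With n = 1 the formula reduces to the univariate Normal density.
all.equal(mvn_pdf(0.3, 0, matrix(1)), dnorm(0.3))
```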
Fun fact: \boldsymbol{\Sigma}^{-1/2} (\boldsymbol{X}-\boldsymbol{\mu}) \sim \mbox{MVN}_n(\boldsymbol{0}, \boldsymbol{I}).
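This standardization can be illustrated numerically. The sketch below is one possible construction, not the only one: it takes \boldsymbol{\Sigma}^{-1/2} to be the symmetric inverse square root obtained from an eigendecomposition, and it simulates draws through a Cholesky factor. The particular \boldsymbol{\mu}, \boldsymbol{\Sigma}, seed, and sample size are arbitrary.

```r
Sigma <- matrix(c(4,   1.2,
                  1.2, 1), nrow = 2, byrow = TRUE)
mu <- c(1, -1)

# Symmetric inverse square root from Sigma = V diag(lambda) V^T.
eig <- eigen(Sigma, symmetric = TRUE)
Sigma_inv_sqrt <- eig$vectors %*% diag(1 / sqrt(eig$values)) %*% t(eig$vectors)

# Exact check: the transformed covariance is A Sigma A^T with A = Sigma^{-1/2}.
whitened_cov <- Sigma_inv_sqrt %*% Sigma %*% t(Sigma_inv_sqrt)
round(whitened_cov, 10)

# Simulation check: draw X = mu + L Z with Sigma = L L^T, then standardize.
set.seed(1)
L <- t(chol(Sigma))                       # chol() returns the upper factor
Z <- matrix(rnorm(2 * 20000), nrow = 2)   # columns are independent N(0, I) draws
X <- mu + L %*% Z                         # columns are MVN(mu, Sigma) draws
Y <- Sigma_inv_sqrt %*% (X - mu)

round(rowMeans(Y), 2)   # should be close to (0, 0)
round(cov(t(Y)), 2)     # should be close to the identity matrix
```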
16.3 Dirichlet
The Dirichlet distribution models an m-vector \boldsymbol{X} so that 0 \leq X_i \leq 1 and \sum_{i=1}^m X_i = 1. It is a generalization of the Beta distribution. The rv \boldsymbol{X} \sim \mbox{Dirichlet}_m(\boldsymbol{\alpha}), where \boldsymbol{\alpha} is an m-vector with \alpha_i > 0, has pdf
f(\boldsymbol{x}; \boldsymbol{\alpha}) = \frac{\Gamma\left( \sum_{i=1}^m \alpha_i \right)}{\prod_{i=1}^m \Gamma(\alpha_i)} \prod_{i=1}^m x_i^{\alpha_i-1}.
It can be calculated that, with \alpha_0 = \sum_{k=1}^m \alpha_k,
{\operatorname{E}}[X_i] = \frac{\alpha_i}{\alpha_0}, \quad {\operatorname{Var}}(X_i) = \frac{\alpha_i (\alpha_0 - \alpha_i)}{\alpha_0^2 (\alpha_0 + 1)},
{\operatorname{Cov}}(X_i, X_j) = \frac{- \alpha_i \alpha_j}{\alpha_0^2 (\alpha_0 + 1)} \quad (i \not= j).
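These moments can also be checked approximately by simulation. The sketch below does not use a Dirichlet package. Instead it relies on a standard construction that is not part of the text above: normalizing independent Gamma(\alpha_i, 1) draws so that they sum to 1 yields a Dirichlet(\boldsymbol{\alpha}) vector. The value of \boldsymbol{\alpha}, the seed, and the sample size are arbitrary.

```r
set.seed(1)
alpha  <- c(2, 3, 5)
alpha0 <- sum(alpha)
N <- 50000

# Swapped-in sampler (an assumption, not from the text): column j holds
# Gamma(alpha_j, 1) draws, and each row is normalized to sum to 1.
G <- matrix(rgamma(N * length(alpha), shape = rep(alpha, each = N)), nrow = N)
X <- G / rowSums(G)

theo_mean <- alpha / alpha0
# Diagonal entries are alpha_i (alpha0 - alpha_i) / (alpha0^2 (alpha0 + 1));
# off-diagonal entries are -alpha_i alpha_j / (alpha0^2 (alpha0 + 1)).
theo_cov <- (diag(alpha) * alpha0 - outer(alpha, alpha)) / (alpha0^2 * (alpha0 + 1))

round(colMeans(X) - theo_mean, 3)   # differences should be close to 0
round(cov(X) - theo_cov, 3)         # differences should be close to 0
```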
16.4 In R
For the Multinomial, base R contains the functions dmultinom and rmultinom.
For the Multivariate Normal, there are several packages that work with this distribution. One choice is the package mvtnorm, which contains the functions dmvnorm and rmvnorm.
For the Dirichlet, there are several packages that work with this distribution. One choice is the package MCMCpack, which contains the functions ddirichlet and rdirichlet.
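A brief usage sketch of these functions follows. The parameter values are arbitrary, and the mvtnorm and MCMCpack calls are wrapped in requireNamespace checks on the assumption that those packages may not be installed.

```r
set.seed(1)

# Base R: Multinomial pmf and random draws (one draw per column).
pm <- dmultinom(c(2, 1, 3), prob = c(0.2, 0.3, 0.5))
dr <- rmultinom(3, size = 6, prob = c(0.2, 0.3, 0.5))
print(pm)
print(dr)

# mvtnorm: Multivariate Normal density and random draws (one draw per row).
if (requireNamespace("mvtnorm", quietly = TRUE)) {
  mu    <- c(0, 0)
  Sigma <- matrix(c(1, 0.5, 0.5, 2), nrow = 2)
  print(mvtnorm::dmvnorm(c(0.1, -0.2), mean = mu, sigma = Sigma))
  print(mvtnorm::rmvnorm(3, mean = mu, sigma = Sigma))
}

# MCMCpack: Dirichlet density and random draws (one draw per row).
if (requireNamespace("MCMCpack", quietly = TRUE)) {
  alpha <- c(2, 3, 5)
  print(MCMCpack::ddirichlet(c(0.2, 0.3, 0.5), alpha))
  print(MCMCpack::rdirichlet(3, alpha))
}
```

Note the difference in layout: rmultinom stores each draw in a column, whereas rmvnorm and rdirichlet store each draw in a row.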