16 Multivariate RVs

16.1 Multinomial

Suppose \(\boldsymbol{X}\) (an \(m\)-vector) is \(\mbox{Multinomial}_m(n, \boldsymbol{p})\), where \(\boldsymbol{p}\) is an \(m\)-vector such that \(\sum_{i=1}^m p_i = 1\). It has pmf

\[ f(\boldsymbol{x}; \boldsymbol{p}) = {n \choose x_1 \ x_2 \ \cdots \ x_m} p_1^{x_1} p_2^{x_2} \cdots p_m^{x_m} \]

where

\[{n \choose x_1 \ x_2 \ \cdots \ x_m} = \frac{n!}{x_1! x_2! \cdots x_m!}\] and \(\sum_{i=1}^m x_i = n\).

The Multinomial distribution generalizes the Binomial distribution from two categories to \(m\). It models \(n\) independent trials, each of which falls into exactly one of \(m\) categories, with category \(i\) occurring with probability \(p_i\) (for \(i=1, 2, \ldots, m\)). The random variable \(X_i\) counts the trials that fall into category \(i\), so the counts are constrained by \(\sum_{i=1}^m X_i = n\).

It can be calculated that

\[{\operatorname{E}}[X_i] = np_i, \quad {\operatorname{Var}}(X_i) = n p_i (1-p_i),\]

\[{\operatorname{Cov}}(X_i, X_j) = -n p_i p_j \quad (i \not= j).\]
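As a rough check of the pmf and the moment formulas above, the following sketch compares the formula-based pmf with base R's dmultinom and compares simulated moments from rmultinom with their theoretical values. The values of n, p, and x, the seed, and the number of draws are arbitrary choices made for illustration.

```r
# Illustrative values (assumptions): n = 10 trials over m = 3 categories.
n <- 10
p <- c(0.2, 0.3, 0.5)
x <- c(2, 3, 5)

# pmf from the formula: multinomial coefficient times the product of p_i^x_i
pmf_manual <- factorial(n) / prod(factorial(x)) * prod(p^x)

# pmf from base R
pmf_base <- dmultinom(x, size = n, prob = p)

# Theoretical moments
mean_theory <- n * p
cov_theory <- -n * outer(p, p)          # off-diagonal entries: -n p_i p_j
diag(cov_theory) <- n * p * (1 - p)     # diagonal entries: n p_i (1 - p_i)

# Simulated moments (each column of rmultinom's output is one draw)
set.seed(1)
draws <- rmultinom(1e5, size = n, prob = p)
mean_sim <- rowMeans(draws)
cov_sim <- cov(t(draws))
```

Comparing mean_sim with mean_theory and cov_sim with cov_theory should show close agreement. The negative covariances reflect the constraint that the counts sum to \(n\): a larger count in one category leaves fewer trials for the others.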

16.2 Multivariate Normal

The \(n\)-vector \(\boldsymbol{X}\) has a Multivariate Normal distribution, written \(\boldsymbol{X} \sim \mbox{MVN}_n(\boldsymbol{\mu}, \boldsymbol{\Sigma})\), where \(\boldsymbol{\mu}\) is the \(n\)-vector of population means and \(\boldsymbol{\Sigma}\) is the \(n \times n\) variance-covariance matrix, assumed here to be positive definite so that \(\boldsymbol{\Sigma}^{-1}\) exists. Its pdf is

\[ f(\boldsymbol{x}; \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{\sqrt{(2 \pi)^n |\boldsymbol{\Sigma}|}} \exp \left\{ -\frac{1}{2} (\boldsymbol{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\boldsymbol{x} - \boldsymbol{\mu}) \right\}. \]

Fun fact: \(\boldsymbol{\Sigma}^{-1/2} (\boldsymbol{X}-\boldsymbol{\mu}) \sim \mbox{MVN}_n(\boldsymbol{0}, \boldsymbol{I})\).
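The standardization fact can be checked numerically. The sketch below uses only base R. It evaluates the density with normalizing constant \(\left((2\pi)^n |\boldsymbol{\Sigma}|\right)^{-1/2}\), builds one valid choice of \(\boldsymbol{\Sigma}^{-1/2}\) (the symmetric one, from the eigendecomposition), and confirms that standardized draws have covariance close to the identity. The parameter values, the seed, and the helper name dmvn_manual are assumptions made for illustration.

```r
# Illustrative parameters (assumptions) for n = 2.
mu <- c(1, -1)
Sigma <- matrix(c(2.0, 0.6,
                  0.6, 1.0), nrow = 2)

# MVN density written directly from the pdf formula (hypothetical helper)
dmvn_manual <- function(x, mu, Sigma) {
  n <- length(mu)
  d <- x - mu
  quad <- drop(t(d) %*% solve(Sigma, d))   # (x - mu)^T Sigma^{-1} (x - mu)
  exp(-0.5 * quad) / sqrt((2 * pi)^n * det(Sigma))
}

# Symmetric inverse square root of Sigma from its eigendecomposition
eig <- eigen(Sigma, symmetric = TRUE)
Sigma_inv_half <- eig$vectors %*% diag(1 / sqrt(eig$values)) %*% t(eig$vectors)

# Draw from MVN(mu, Sigma) as mu + L z, where Sigma = L L^T and z is standard Normal
set.seed(1)
N <- 1e5
L <- t(chol(Sigma))                        # chol() returns the upper factor
Z <- matrix(rnorm(2 * N), nrow = 2)
X <- mu + L %*% Z                          # each column is one draw

# Standardize each draw: Sigma^{-1/2} (x - mu)
W <- Sigma_inv_half %*% (X - mu)
cov_W <- cov(t(W))                         # should be close to the identity
```

The reason this works is that \({\operatorname{Var}}(\boldsymbol{A}\boldsymbol{X}) = \boldsymbol{A} \boldsymbol{\Sigma} \boldsymbol{A}^T\), so taking \(\boldsymbol{A} = \boldsymbol{\Sigma}^{-1/2}\) gives \(\boldsymbol{\Sigma}^{-1/2} \boldsymbol{\Sigma} \boldsymbol{\Sigma}^{-1/2} = \boldsymbol{I}\), and linear transformations of a Multivariate Normal remain Multivariate Normal.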

16.3 Dirichlet

The Dirichlet distribution models an \(m\)-vector \(\boldsymbol{X}\) so that \(0 \leq X_i \leq 1\) and \(\sum_{i=1}^m X_i = 1\). It is a generalization of the Beta distribution. The rv \(\boldsymbol{X} \sim \mbox{Dirichlet}_m(\boldsymbol{\alpha})\), where \(\boldsymbol{\alpha}\) is an \(m\)-vector with every \(\alpha_i > 0\), has pdf

\[ f(\boldsymbol{x}; \boldsymbol{\alpha}) = \frac{\Gamma\left( \sum_{i=1}^m \alpha_i \right)}{\prod_{i=1}^m \Gamma(\alpha_i)} \prod_{i=1}^m x_i^{\alpha_i-1}. \]

It can be calculated that \[{\operatorname{E}}[X_i] = \frac{\alpha_i}{\alpha_0}, \quad {\operatorname{Var}}(X_i) = \frac{\alpha_i (\alpha_0 - \alpha_i)}{\alpha_0^2 (\alpha_0 + 1)}, \quad {\operatorname{Cov}}(X_i, X_j) = \frac{- \alpha_i \alpha_j}{\alpha_0^2 (\alpha_0 + 1)} \quad (i \not= j),\] where \(\alpha_0 = \sum_{k=1}^m \alpha_k\).
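These moments can be checked by simulation in base R. The sketch below relies on a standard construction: if \(G_1, \ldots, G_m\) are independent with \(G_i \sim \mbox{Gamma}(\alpha_i, 1)\), then the normalized vector \(G_i / \sum_k G_k\) is \(\mbox{Dirichlet}_m(\boldsymbol{\alpha})\). The function name rdirichlet_sketch, the parameter values, and the seed are assumptions made for illustration.

```r
# Illustrative parameters (assumptions) for m = 3.
alpha <- c(2, 3, 5)
alpha0 <- sum(alpha)

# Theoretical moments from the formulas above
mean_theory <- alpha / alpha0
cov_theory <- -outer(alpha, alpha) / (alpha0^2 * (alpha0 + 1))
diag(cov_theory) <- alpha * (alpha0 - alpha) / (alpha0^2 * (alpha0 + 1))

# Draw Dirichlet vectors by normalizing independent Gamma(alpha_i, 1) variables
# (hypothetical helper, not a package function)
rdirichlet_sketch <- function(N, alpha) {
  m <- length(alpha)
  G <- matrix(rgamma(N * m, shape = rep(alpha, each = N)), nrow = N)
  G / rowSums(G)   # each row is one draw and sums to 1
}

set.seed(1)
X <- rdirichlet_sketch(1e5, alpha)
mean_sim <- colMeans(X)
cov_sim <- cov(X)
```

With \(m = 2\) the variance formula reduces to the familiar Beta variance, \(\alpha_1 \alpha_2 / \left[(\alpha_1 + \alpha_2)^2 (\alpha_1 + \alpha_2 + 1)\right]\), which is one way to see the Dirichlet as a generalization of the Beta.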

16.4 In R

For the Multinomial, base R contains the functions dmultinom and rmultinom.

For the Multivariate Normal, there are several packages that work with this distribution. One choice is the package mvtnorm, which contains the functions dmvnorm and rmvnorm.

For the Dirichlet, there are several packages that work with this distribution. One choice is the package MCMCpack, which contains the functions ddirichlet and rdirichlet.
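A brief usage sketch of the functions named above follows. All parameter values are arbitrary illustrations, and the mvtnorm and MCMCpack calls are guarded so that they run only when those packages are installed. Note the differing output layouts: rmultinom returns one draw per column, while rmvnorm and rdirichlet return one draw per row.

```r
# Multinomial: base R, no extra package needed
pmf <- dmultinom(c(2, 3, 5), prob = c(0.2, 0.3, 0.5))       # pmf at one count vector
counts <- rmultinom(4, size = 10, prob = c(0.2, 0.3, 0.5))  # 4 draws, one per column

# Multivariate Normal: mvtnorm (skipped if the package is not installed)
if (requireNamespace("mvtnorm", quietly = TRUE)) {
  mu <- c(1, -1)
  Sigma <- matrix(c(2, 0.6, 0.6, 1), nrow = 2)
  dens_mvn <- mvtnorm::dmvnorm(c(0, 0), mean = mu, sigma = Sigma)  # pdf at one point
  draws_mvn <- mvtnorm::rmvnorm(4, mean = mu, sigma = Sigma)       # 4 draws, one per row
}

# Dirichlet: MCMCpack (skipped if the package is not installed)
if (requireNamespace("MCMCpack", quietly = TRUE)) {
  alpha <- c(2, 3, 5)
  dens_dir <- MCMCpack::ddirichlet(c(0.2, 0.3, 0.5), alpha)  # pdf at one point
  draws_dir <- MCMCpack::rdirichlet(4, alpha)                # 4 draws, one per row
}
```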