The Multivariate Normal Distribution

Defn: $Z \sim N(0,1)$ iff
$$ f_Z(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2} . $$

Defn: $Z \in \mathbb{R}^p \sim MVN(0, I)$ if and only if $Z = (Z_1, \ldots, Z_p)^T$ with the $Z_i$ independent and each $Z_i \sim N(0,1)$.

In this case according to our theorem
$$ f_Z(z_1, \ldots, z_p) = \prod_{i=1}^p \frac{1}{\sqrt{2\pi}} e^{-z_i^2/2} = (2\pi)^{-p/2} \exp\left( -z^T z / 2 \right) ; $$

superscript $T$ denotes matrix transpose.

Defn: $X \in \mathbb{R}^p$ has a multivariate normal distribution if it has the same distribution as $AZ + \mu$ for some $\mu \in \mathbb{R}^p$, some $p \times q$ matrix of constants $A$, and $Z \sim MVN(0, I_q)$.

$p > q$, or $A$ singular: $X$ does not have a density.

$A$ invertible: derive the multivariate normal density by change of variables: $X = AZ + \mu$ means $Z = A^{-1}(X - \mu)$, with Jacobian $\partial z / \partial x = A^{-1}$.

So
$$ f_X(x) = f_Z\left( A^{-1}(x - \mu) \right) \left| \det A^{-1} \right| = \frac{\exp\left\{ -(x-\mu)^T (A^{-1})^T A^{-1} (x-\mu)/2 \right\}}{(2\pi)^{p/2} \left| \det A \right|} . $$

Now define $\Sigma = AA^T$ and notice that
$$ \Sigma^{-1} = (A^T)^{-1} A^{-1} = (A^{-1})^T A^{-1} $$

and
$$ \det \Sigma = \det A \det A^T = (\det A)^2 . $$

Thus $f_X$ is
$$ \frac{\exp\left\{ -(x-\mu)^T \Sigma^{-1} (x-\mu)/2 \right\}}{(2\pi)^{p/2} (\det \Sigma)^{1/2}} ; $$

the $MVN(\mu, \Sigma)$ density. Note the density is the same for all $A$ such that $AA^T = \Sigma$. This justifies the notation $MVN(\mu, \Sigma)$.
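As a numerical sanity check on this formula, here is a short Python sketch (the helper name `mvn_density` and the evaluation point are illustrative, and the code is hard-coded to the $2\times 2$ case): when $\Sigma = I$ the joint density should factor into a product of two standard normal densities.

```python
import math

def mvn_density(x, mu, sigma):
    """MVN(mu, Sigma) density at x for the 2x2 case:
    exp(-(x-mu)^T Sigma^{-1} (x-mu)/2) / ((2*pi)^{p/2} det(Sigma)^{1/2})."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]   # 2x2 inverse
    u = [x[0] - mu[0], x[1] - mu[1]]
    quad = sum(u[i] * inv[i][j] * u[j] for i in range(2) for j in range(2))
    return math.exp(-quad / 2) / (2 * math.pi * math.sqrt(det))

# standard normal density, for comparison
phi = lambda z: math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

x = (0.3, -1.1)
f_joint = mvn_density(x, (0.0, 0.0), [[1.0, 0.0], [0.0, 1.0]])
```

With $\Sigma = I$, `f_joint` agrees with `phi(0.3) * phi(-1.1)`, as the factorization of the $MVN(0, I)$ density predicts.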

For which $\mu$, $\Sigma$ is this a density?

Any $\mu$, but if $x \in \mathbb{R}^p$ then
$$ x^T \Sigma x = x^T A A^T x = (A^T x)^T (A^T x) = y^T y \ge 0 $$

where $y = A^T x$. The inequality is strict except for $y = 0$, which (for $A$ invertible) is equivalent to $x = 0$. Thus $\Sigma$ is a positive definite symmetric matrix.

Conversely, if $\Sigma$ is a positive definite symmetric matrix then there is a square invertible matrix $A$ such that $AA^T = \Sigma$, so that there is a $MVN(\mu, \Sigma)$ distribution. ($A$ can be found via the Cholesky decomposition, e.g.)
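A minimal pure-Python sketch of the Cholesky factorization mentioned above (the function name and the example matrix are illustrative choices): it produces a lower triangular $A$ with $AA^T = \Sigma$.

```python
import math

def cholesky(sigma):
    """Lower triangular A with A A^T = Sigma, for positive definite Sigma."""
    n = len(sigma)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(A[i][k] * A[j][k] for k in range(j))
            if i == j:
                A[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                A[i][j] = (sigma[i][j] - s) / A[j][j]
    return A

sigma = [[4.0, 2.0, 0.6], [2.0, 3.0, 0.2], [0.6, 0.2, 1.0]]
A = cholesky(sigma)
# reconstruct Sigma as A A^T
recon = [[sum(A[i][k] * A[j][k] for k in range(3)) for j in range(3)]
         for i in range(3)]
```

The reconstruction `recon` matches `sigma` to rounding error, and the entries above the diagonal of `A` are exactly zero.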

When $A$ is singular $X$ will not have a density: there is an $a \neq 0$ such that $P(a^T X = a^T \mu) = 1$; $X$ is confined to a hyperplane.

Still true: the distribution of $X$ depends only on $\Sigma = AA^T$: if $AA^T = BB^T$ then $AZ + \mu$ and $BZ + \mu$ have the same distribution.
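The hyperplane phenomenon can be seen directly by simulation. In this hypothetical sketch $A$ has rank 1 and $a = (1, -1)^T$ satisfies $a^T A = 0$, so every draw of $X = AZ + \mu$ has $a^T X = a^T \mu$ exactly (the numbers are arbitrary illustrative choices).

```python
import random

random.seed(1)
mu = (2.0, 5.0)
A = [[1.0, 0.0], [1.0, 0.0]]   # singular: rank 1
a = (1.0, -1.0)                # a^T A = 0, so a^T X = a^T mu always

draws = []
for _ in range(1000):
    z = (random.gauss(0, 1), random.gauss(0, 1))
    x = (A[0][0] * z[0] + A[0][1] * z[1] + mu[0],
         A[1][0] * z[0] + A[1][1] * z[1] + mu[1])
    draws.append(a[0] * x[0] + a[1] * x[1])
```

Every entry of `draws` equals $a^T \mu = -3$ up to rounding: the distribution of $X$ lives on the line $x_1 - x_2 = -3$.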

Expectation, moments

Defn: If $X \in \mathbb{R}^p$ has density $f$ then
$$ E(g(X)) = \int g(x) f(x) \, dx $$

for any $g$ from $\mathbb{R}^p$ to $\mathbb{R}$.

FACT: if $Y = g(X)$ for a smooth $g$ (mapping $\mathbb{R} \to \mathbb{R}$) then
$$ E(Y) = \int y f_Y(y) \, dy = \int g(x) f_X(x) \, dx = E(g(X)) $$

by the change of variables formula for integration. This is good because otherwise we might have two different values for $E(Y)$.

Linearity: $E(aX + bY) = aE(X) + bE(Y)$ for real $a$ and $b$.

Defn: The $r$th moment (about the origin) of a real rv $X$ is $\mu_r' = E(X^r)$ (provided it exists). We generally use $\mu$ for $E(X)$.

Defn: The $r$th central moment is
$$ \mu_r = E\left[ (X - \mu)^r \right] . $$

We call $\sigma^2 = \mu_2$ the variance.

Defn: For an $\mathbb{R}^p$ valued random vector $X$,
$$ \mu_X = E(X) $$

is the vector whose $i$th entry is $E(X_i)$ (provided all entries exist).

Fact: the same idea is used for random matrices: $E(M)$ is the matrix with entries $E(M_{ij})$.

Defn: The ($p \times p$) variance covariance matrix of $X$ is
$$ \mathrm{Var}(X) = E\left[ (X - \mu)(X - \mu)^T \right] , $$

which exists provided each component $X_i$ has a finite second moment.

Example moments: If $Z \sim N(0,1)$ then
$$ E(Z) = \int_{-\infty}^{\infty} z \, \frac{e^{-z^2/2}}{\sqrt{2\pi}} \, dz = 0 $$

and (integrating by parts)
$$ E(Z^r) = \int z^r \, \frac{e^{-z^2/2}}{\sqrt{2\pi}} \, dz = (r-1) E(Z^{r-2}) $$

so that
$$ E(Z^r) = (r-1)(r-3) \cdots 1 $$

for $r$ even; $E(Z^r) = 0$ for $r$ odd. Remembering that $E(Z) = 0$ and $E(Z^2) = 1$,
we find that $\mathrm{Var}(Z) = 1$.
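The moment recursion can be checked by numerical integration. This sketch (the integration limits and grid size are arbitrary illustrative choices; tails beyond $|z|=12$ are negligible) approximates $E(Z^r)$ and should reproduce $E(Z^2)=1$, $E(Z^4)=3$, $E(Z^6)=15$ and $E(Z^3)=0$.

```python
import math

def normal_moment(r, lo=-12.0, hi=12.0, n=40000):
    """Approximate E(Z^r), Z ~ N(0,1), by trapezoidal integration of
    z^r times the standard normal density."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        z = lo + i * h
        w = 0.5 if i in (0, n) else 1.0   # trapezoid endpoint weights
        total += w * z ** r * math.exp(-z * z / 2) / math.sqrt(2 * math.pi)
    return total * h
```

The odd moments vanish by symmetry of the integrand; the even ones follow the pattern $(r-1)(r-3)\cdots 1$.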

If now $X \sim N(\mu, \sigma^2)$, that is, $X = \sigma Z + \mu$, then $E(X) = \mu$ and
$$ \mathrm{Var}(X) = \sigma^2 \, \mathrm{Var}(Z) = \sigma^2 . $$

In particular, we see that our choice of notation $N(\mu, \sigma^2)$ for the distribution of $\sigma Z + \mu$ is justified; $\sigma^2$ is indeed the variance.

Similarly for $X = AZ + \mu$ we have $E(X) = \mu$ and
$$ \mathrm{Var}(X) = E\left[ (X - \mu)(X - \mu)^T \right] = A \, E(ZZ^T) \, A^T = AA^T = \Sigma . $$

Note the use of the easy calculation: $E(Z) = 0$ and
$$ \mathrm{Var}(Z) = E(ZZ^T) = I . $$

Moments and independence

Theorem: If $X_1, \ldots, X_p$ are independent and each $X_i$ is integrable then $X = X_1 \cdots X_p$ is integrable and
$$ E(X_1 \cdots X_p) = E(X_1) \cdots E(X_p) . $$

Moment Generating Functions

Defn: The moment generating function of a real valued $X$ is
$$ M_X(t) = E\left( e^{tX} \right) , $$

defined for those real $t$ for which the expected value is finite.

Defn: The moment generating function of $X \in \mathbb{R}^p$ is
$$ M_X(u) = E\left( e^{u^T X} \right) , $$

defined for those vectors $u$ for which the expected value is finite.

Example: If $Z \sim N(0,1)$ then
$$ M_Z(t) = \int \frac{e^{tz - z^2/2}}{\sqrt{2\pi}} \, dz = e^{t^2/2} . $$
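The closed form $M_Z(t) = e^{t^2/2}$ can be verified numerically; this sketch uses the same kind of trapezoidal integration as before (limits and grid are illustrative choices).

```python
import math

def mgf_std_normal(t, lo=-12.0, hi=12.0, n=40000):
    """Approximate M_Z(t) = E(exp(tZ)) for Z ~ N(0,1) by trapezoidal
    integration of exp(t z) against the standard normal density."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        z = lo + i * h
        w = 0.5 if i in (0, n) else 1.0
        total += w * math.exp(t * z - z * z / 2) / math.sqrt(2 * math.pi)
    return total * h
```

For moderate $t$ the numeric answer matches $e^{t^2/2}$ closely; for large $|t|$ the integration window would need widening, reflecting the fact that the mgf is controlled by the tails.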

Theorem: If $M_X(t)$ is finite for all $t$ in a neighbourhood of $0$ then

1. Every moment of $X$ is finite.

2. $M_X$ is $C^\infty$ (in fact $M_X$ is analytic).

3. $\mu_k' = \frac{d^k}{dt^k} M_X(0)$.

Note: $C^\infty$ means $M_X$ has continuous derivatives of all orders. Analytic means $M_X$ has a convergent power series expansion in a neighbourhood of each $t$.

The proof, and many other facts about mgfs, rely on techniques of complex variables.

Characterization & MGFs

Theorem: Suppose $X$ and $Y$ are $\mathbb{R}^p$ valued random vectors such that
$$ M_X(u) = M_Y(u) $$

for $u$ in some open neighbourhood of $0$ in $\mathbb{R}^p$. Then $X$ and $Y$ have the same distribution.

The proof relies on techniques of complex variables.

MGFs and Sums

If $X_1, \ldots, X_p$ are independent and $Y = \sum X_i$ then the mgf of $Y$ is the product of the mgfs of the individual $X_i$:
$$ M_Y(t) = \prod_{i=1}^p M_{X_i}(t) , $$

or $\log M_Y = \sum \log M_{X_i}$. (Also true for multivariate $X_i$.)

Example: If $Z_1, \ldots, Z_p$ are independent $N(0,1)$ then
$$ M_{\sum a_i Z_i}(t) = \prod_{i=1}^p e^{a_i^2 t^2 / 2} = \exp\left( \sum a_i^2 t^2 / 2 \right) . $$

Conclusion: If $Z \sim MVN(0, I)$ then
$$ a^T Z \sim N\left( 0, \sum a_i^2 \right) = N(0, a^T a) . $$
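A quick Monte Carlo check of this conclusion (the coefficient vector and sample size are arbitrary illustrative choices): samples of $a^T Z$ should have mean near 0 and variance near $a^T a$.

```python
import random

random.seed(0)
a = (1.0, -2.0, 0.5)
a_t_a = sum(ai * ai for ai in a)   # 1 + 4 + 0.25 = 5.25

n = 100000
# each sample is a^T Z with Z having independent N(0,1) coordinates
samples = [sum(ai * random.gauss(0, 1) for ai in a) for _ in range(n)]
mean = sum(samples) / n
var = sum((s - mean) ** 2 for s in samples) / n
```

The sample mean is close to 0 and the sample variance close to $5.25 = a^T a$, consistent with $a^T Z \sim N(0, a^T a)$.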

Example: If $X = AZ + \mu$ then $u^T X = u^T \mu + (A^T u)^T Z$, so
$$ M_X(u) = e^{u^T \mu} \, e^{\| A^T u \|^2 / 2} = \exp\left( u^T \mu + u^T \Sigma u / 2 \right) . $$

Theorem: Suppose $X = AZ + \mu$ and $Y = BZ^* + \nu$ where $Z \sim MVN(0, I_q)$ and $Z^* \sim MVN(0, I_r)$. Then $X$ and $Y$ have the same distribution if and only if the following two conditions hold:

1. $\mu = \nu$.

2. $AA^T = BB^T$.

Alternatively: if $X$, $Y$ are each MVN, then $E(X) = E(Y)$ and $\mathrm{Var}(X) = \mathrm{Var}(Y)$ imply that $X$ and $Y$ have the same distribution.

Proof: If 1 and 2 hold, the mgf of $X$ is
$$ M_X(u) = \exp\left( u^T \mu + u^T AA^T u / 2 \right) = \exp\left( u^T \nu + u^T BB^T u / 2 \right) = M_Y(u) . $$

Thus $X$ and $Y$ have the same distribution. Conversely, if $X$ and $Y$ have the same distribution then they have the same mean and variance.

Thus the mgf is determined by $\mu$ and $\Sigma$.

Theorem: If $X \sim MVN(\mu, \Sigma)$ then there is a matrix $A$ such that $X$ has the same distribution as $AZ + \mu$ for $Z \sim MVN(0, I)$.

We may assume that $A$ is symmetric and non-negative definite, or that $A$ is upper triangular, or that $A$ is lower triangular.

Proof: Pick any $A$ such that $AA^T = \Sigma$, such as $A = \Sigma^{1/2}$ from the spectral decomposition. Then $AZ + \mu \sim MVN(\mu, \Sigma)$.

From the symmetric square root $A = A^T$ we can produce an upper triangular square root by the Gram-Schmidt process: if $A$ has rows $a_1^T, \ldots, a_p^T$ then let $b_p$ be $a_p / \| a_p \|$. Choose $b_{p-1}$ proportional to $a_{p-1} - (a_{p-1}^T b_p) b_p$, scaled so that $b_{p-1}$ has unit length. Continue in this way; you automatically get $a_i^T b_j = 0$ if $j < i$. If $B$ has columns $b_1, \ldots, b_p$ then $B$ is orthogonal and $AB$ is an upper triangular square root of $\Sigma$.
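The row-by-row Gram-Schmidt construction above can be sketched in Python. Here `A` is an arbitrary invertible square root (the matrix entries are illustrative); the sketch builds the orthonormal vectors $b_p, b_{p-1}, \ldots, b_1$ from the bottom row up and returns $R = AB$, which should come out upper triangular with $RR^T = AA^T$.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def upper_triangular_root(A):
    """Gram-Schmidt the rows of A from the last row up, giving an
    orthogonal B (columns b_1..b_p); return R = A B, which is upper
    triangular with R R^T = A A^T."""
    p = len(A)
    b = [None] * p
    for j in range(p - 1, -1, -1):
        v = list(A[j])                       # start from row a_j
        for k in range(j + 1, p):            # remove components along later b's
            c = dot(v, b[k])
            v = [vi - c * bki for vi, bki in zip(v, b[k])]
        norm = math.sqrt(dot(v, v))
        b[j] = [vi / norm for vi in v]       # unit length
    # R[i][j] = a_i . b_j
    return [[dot(A[i], b[j]) for j in range(p)] for i in range(p)]

A = [[2.0, 1.0, 0.0], [0.5, 1.5, 1.0], [1.0, 0.0, 2.0]]
R = upper_triangular_root(A)
sigma = [[dot(A[i], A[j]) for j in range(3)] for i in range(3)]   # A A^T
recon = [[dot(R[i], R[j]) for j in range(3)] for i in range(3)]   # R R^T
```

Since $B$ is orthogonal, $RR^T = ABB^TA^T = AA^T$, and the below-diagonal entries of $R$ vanish because $a_i \in \mathrm{span}\{b_i, \ldots, b_p\}$.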

Variances, Covariances, Correlations

Defn: The covariance between $X \in \mathbb{R}^p$ and $Y \in \mathbb{R}^q$ is
$$ \mathrm{Cov}(X, Y) = E\left[ (X - \mu_X)(Y - \mu_Y)^T \right] . $$

This is a $p \times q$ matrix.

Properties:

• $\mathrm{Cov}(X, X) = \mathrm{Var}(X)$.

• Cov is bilinear:
$$ \mathrm{Cov}(AX + a, Y) = A \, \mathrm{Cov}(X, Y) $$

and
$$ \mathrm{Cov}(X, BY + b) = \mathrm{Cov}(X, Y) \, B^T . $$

Properties of the $MVN(\mu, \Sigma)$ distribution

1: All margins are multivariate normal: if
$$ X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}, \quad \mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} $$

and
$$ \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} $$

then $X_1 \sim MVN(\mu_1, \Sigma_{11})$.

2: $MX + \nu \sim MVN(M\mu + \nu, M \Sigma M^T)$: an affine transformation of a MVN vector is MVN.

3: If
$$ \Sigma_{12} = 0 $$

then $X_1$ and $X_2$ are independent.

4: All conditionals are normal: the conditional distribution of $X_1$ given $X_2 = x_2$ is $MVN\left( \mu_1 + \Sigma_{12} \Sigma_{22}^{-1} (x_2 - \mu_2), \; \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} \right)$.

Proof of (1): If $M = \begin{pmatrix} I & 0 \end{pmatrix}$, for $I$ the identity matrix of the correct dimension, then $MX = X_1$.

So
$$ X_1 \sim MVN(M\mu, M \Sigma M^T) $$

by property 2. Compute the mean and variance to check the rest: $M\mu = \mu_1$ and $M \Sigma M^T = \Sigma_{11}$.

Proof of (2): If $X = AZ + \mu$ then
$$ MX + \nu = (MA) Z + (M\mu + \nu) , $$

which is MVN with mean $M\mu + \nu$ and variance $(MA)(MA)^T = M \Sigma M^T$.

Proof of (3): If
$$ \Sigma = \begin{pmatrix} \Sigma_{11} & 0 \\ 0 & \Sigma_{22} \end{pmatrix} $$

then
$$ M_X(u) = \exp\left( u_1^T \mu_1 + u_2^T \mu_2 + u_1^T \Sigma_{11} u_1 / 2 + u_2^T \Sigma_{22} u_2 / 2 \right) = M_{X_1}(u_1) M_{X_2}(u_2) . $$

Proof of (4): first case: assume $\Sigma_{22}$ has an inverse.

Define
$$ W = X_1 - \Sigma_{12} \Sigma_{22}^{-1} X_2 . $$

Then
$$ \mathrm{Cov}(W, X_2) = \Sigma_{12} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{22} = 0 . $$

Thus $W$ is $MVN(\mu_W, \Sigma_W)$, independent of $X_2$ by property 3, where
$$ \mu_W = \mu_1 - \Sigma_{12} \Sigma_{22}^{-1} \mu_2 \quad\text{and}\quad \Sigma_W = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} . $$

Now the joint density of $W$ and $X_2$ factors:
$$ f_{W, X_2}(w, x_2) = f_W(w) f_{X_2}(x_2) . $$

By change of variables the joint density of $(X_1, X_2)$ is
$$ c \, f_W\left( x_1 - \Sigma_{12} \Sigma_{22}^{-1} x_2 \right) f_{X_2}(x_2) $$

where $c$ is the constant Jacobian of the linear transformation from $(W, X_2)$ to $(X_1, X_2)$ (in fact $c = 1$).

Thus the conditional density of $X_1$ given $X_2 = x_2$ is
$$ f_{X_1 | X_2}(x_1 \mid x_2) = f_W\left( x_1 - \Sigma_{12} \Sigma_{22}^{-1} x_2 \right) . $$

As a function of $x_1$ this density has the form of the advertised multivariate normal density.

Specialization to the bivariate case:

Write
$$ \Sigma = \begin{pmatrix} \sigma_1^2 & \rho \sigma_1 \sigma_2 \\ \rho \sigma_1 \sigma_2 & \sigma_2^2 \end{pmatrix} $$

where we define
$$ \rho = \frac{\mathrm{Cov}(X_1, X_2)}{\sigma_1 \sigma_2} . $$

Note that
$$ \Sigma_{12} \Sigma_{22}^{-1} = \frac{\rho \sigma_1 \sigma_2}{\sigma_2^2} = \rho \, \frac{\sigma_1}{\sigma_2} . $$

Then
$$ W = X_1 - \rho \, \frac{\sigma_1}{\sigma_2} \, X_2 $$

is independent of $X_2$. The marginal distribution of $W$ is $N(\mu_W, \sigma_W^2)$ where
$$ \mu_W = \mu_1 - \rho \, \frac{\sigma_1}{\sigma_2} \, \mu_2 \quad\text{and}\quad \sigma_W^2 = \sigma_1^2 - (\rho \sigma_1 \sigma_2) \, \sigma_2^{-2} \, (\rho \sigma_1 \sigma_2) . $$

This simplifies to
$$ \sigma_W^2 = \sigma_1^2 (1 - \rho^2) . $$

Notice that, since a variance is non-negative, it follows that
$$ -1 \le \rho \le 1 . $$

More generally: for any rvs $X_1$ and $X_2$ (normal or not) and any real $a$:
$$ \mathrm{Var}(X_1 - a X_2) = \sigma_1^2 - 2 a \rho \sigma_1 \sigma_2 + a^2 \sigma_2^2 \ge 0 . $$

The RHS is minimized at
$$ a = \rho \, \frac{\sigma_1}{\sigma_2} . $$

The minimum value is
$$ \sigma_1^2 (1 - \rho^2) , $$

where
$$ \rho = \frac{\mathrm{Cov}(X_1, X_2)}{\sigma_1 \sigma_2} $$

defines the correlation between $X_1$ and $X_2$.
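The minimization above is a one-variable quadratic, so it is easy to check by brute force; the values of $\sigma_1$, $\sigma_2$, $\rho$ here are arbitrary illustrative choices.

```python
# illustrative parameter values
s1, s2, rho = 2.0, 1.5, 0.6

def var_diff(a):
    """Var(X1 - a*X2) = s1^2 - 2*a*rho*s1*s2 + a^2*s2^2."""
    return s1 ** 2 - 2 * a * rho * s1 * s2 + a ** 2 * s2 ** 2

a_star = rho * s1 / s2               # claimed minimizer
min_val = s1 ** 2 * (1 - rho ** 2)   # claimed minimum value

# brute-force scan over a grid of candidate slopes a in [-3, 3]
grid = [i / 1000 for i in range(-3000, 3001)]
best = min(grid, key=var_diff)
```

The grid minimizer agrees with $a = \rho \sigma_1 / \sigma_2 = 0.8$, and no grid point beats the claimed minimum $\sigma_1^2(1 - \rho^2) = 2.56$.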

Multiple Correlation

Now suppose $X_1$ is a scalar but $X_2$ is a vector.

Defn: The multiple correlation between $X_1$ and $X_2$ is
$$ \rho_M = \max_a \mathrm{Corr}(X_1, a^T X_2) $$

over all $a$.

Thus: maximize the squared correlation
$$ \frac{\left( a^T \Sigma_{21} \right)^2}{\Sigma_{11} \, a^T \Sigma_{22} \, a} . $$

Put $b = \Sigma_{22}^{1/2} a$. For $\Sigma_{22}$ invertible the problem is equivalent to maximizing
$$ \frac{b^T Q b}{b^T b} $$

where
$$ Q = \frac{\Sigma_{22}^{-1/2} \Sigma_{21} \Sigma_{12} \Sigma_{22}^{-1/2}}{\Sigma_{11}} . $$

Solution: find the largest eigenvalue of $Q$.

Note
$$ Q = v v^T $$

where
$$ v = \frac{\Sigma_{22}^{-1/2} \Sigma_{21}}{\sqrt{\Sigma_{11}}} $$

is a vector. Set
$$ Q b = v v^T b = \lambda b $$

and multiply by $v^T$ to get
$$ (v^T v)(v^T b) = \lambda \, v^T b . $$

If $v^T b \neq 0$ then we see $\lambda = v^T v$, so the largest eigenvalue is $v^T v$.

Summary: the maximum squared correlation is
$$ \rho_M^2 = v^T v = \frac{\Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}}{\Sigma_{11}} . $$

Achieved when the eigenvector is $b = v$, so that
$$ a = \Sigma_{22}^{-1/2} b \propto \Sigma_{22}^{-1} \Sigma_{21} . $$

Notice: since $\rho_M^2$ is the squared correlation between two scalars ($X_1$ and $a^T X_2$) we have
$$ 0 \le \rho_M^2 \le 1 . $$

It equals 1 iff $X_1$ is a linear combination of the components of $X_2$.
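A small numerical illustration of the closed form (the partitioned covariance entries are arbitrary illustrative choices, with $X_2$ two-dimensional): the direction $a = \Sigma_{22}^{-1} \Sigma_{21}$ should attain the squared correlation $\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21} / \Sigma_{11}$, and other directions should do no better.

```python
# partitioned covariance: Var(X1)=s11 (scalar), Cov(X1,X2)=s12, Var(X2)=S22
s11 = 2.0
s12 = [0.8, -0.5]
S22 = [[1.0, 0.3], [0.3, 1.5]]

det = S22[0][0] * S22[1][1] - S22[0][1] * S22[1][0]
S22inv = [[S22[1][1] / det, -S22[0][1] / det],
          [-S22[1][0] / det, S22[0][0] / det]]

def sq_corr(a):
    """Squared correlation of X1 with a^T X2: (a.s12)^2 / (s11 * a^T S22 a)."""
    num = (a[0] * s12[0] + a[1] * s12[1]) ** 2
    quad = sum(a[i] * S22[i][j] * a[j] for i in range(2) for j in range(2))
    return num / (s11 * quad)

# closed form: optimum at a = S22^{-1} s12, value s12 S22^{-1} s12 / s11
a_best = [S22inv[0][0] * s12[0] + S22inv[0][1] * s12[1],
          S22inv[1][0] * s12[0] + S22inv[1][1] * s12[1]]
rho2 = (s12[0] * a_best[0] + s12[1] * a_best[1]) / s11
```

Plugging in the coordinate directions $a = (1,0)$ and $a = (0,1)$ gives smaller squared correlations than `rho2`, as the theory predicts.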

Correlation matrices, partial correlations:

The correlation between two scalars $X$ and $Y$ is
$$ \rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X) \, \mathrm{Var}(Y)}} . $$

If $X$ has variance matrix $\Sigma$ then the correlation matrix of $X$ is $R$ with entries
$$ R_{ij} = \frac{\Sigma_{ij}}{\sqrt{\Sigma_{ii} \Sigma_{jj}}} . $$

If $(X_1, X_2)$ is MVN with the usual partitioned variance covariance matrix, then the conditional variance of $X_1$ given $X_2$ is
$$ \Sigma_{11|2} = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} . $$

From this define the partial correlation matrix $R_{11|2}$ with entries
$$ (R_{11|2})_{ij} = \frac{(\Sigma_{11|2})_{ij}}{\sqrt{(\Sigma_{11|2})_{ii} (\Sigma_{11|2})_{jj}}} . $$

Note: these are used even when $X_1$, $X_2$ are NOT MVN.
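A worked numerical example of the partial correlation (all covariance entries are arbitrary illustrative choices, with $X_1$ two-dimensional and $X_2$ scalar): form $\Sigma_{11|2}$ and standardize it.

```python
import math

# Var(X1) is 2x2, Cov(X1, X2) is 2x1, Var(X2) is scalar
S11 = [[1.0, 0.5], [0.5, 2.0]]
S12 = [0.4, 0.6]
S22 = 1.0

# conditional variance Sigma_{11|2} = S11 - S12 S22^{-1} S21
C = [[S11[i][j] - S12[i] * S12[j] / S22 for j in range(2)] for i in range(2)]

# partial correlation: standardize the conditional covariance
r_partial = C[0][1] / math.sqrt(C[0][0] * C[1][1])
# ordinary correlation, for comparison
r_plain = S11[0][1] / math.sqrt(S11[0][0] * S11[1][1])
```

Here conditioning on $X_2$ reduces the correlation between the two components of $X_1$ (`r_partial` is smaller than `r_plain`), since part of their covariance is explained by their common dependence on $X_2$.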

Richard Lockhart
2002-09-24