
STAT 801: Mathematical Statistics

Distribution Theory

Basic Problem: Start with assumptions about the density $ f$ or the CDF of a random vector $ X=(X_1,\ldots,X_p)$. Define $ Y=g(X_1,\ldots,X_p)$ to be some function of $ X$ (usually some statistic of interest). How can we compute the distribution, CDF, or density of $ Y$?

Univariate Techniques

Method 1: compute the CDF by integration and differentiate to find $ f_Y$.

Example: $ U \sim$   Uniform$ [0,1]$ and $ Y=-\log U$.

\begin{align*}
F_Y(y) &= P(Y \le y) = P(-\log U \le y) \\
&= P(\log U \ge -y) = P(U \ge e^{-y}) \\
&= \left\{ \begin{array}{ll}
1- e^{-y} & y > 0 \\
0 & y \le 0 \, .
\end{array}\right.
\end{align*}

so $ Y$ has the standard exponential distribution.
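
This is easy to check by simulation. Here is a minimal sketch (not part of the original notes; it assumes numpy and an arbitrary seed) comparing the empirical CDF of $-\log U$ with $1-e^{-y}$:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(801)          # arbitrary seed
u = rng.uniform(size=100_000)

for y in (0.1, 0.5, 1.0, 2.0):
    empirical = np.mean(-np.log(u) <= y)  # P(Y <= y) estimated from the sample
    exact = 1 - np.exp(-y)                # the CDF derived above
    print(f"y={y:4.1f}  empirical={empirical:.4f}  exact={exact:.4f}")
\end{verbatim}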

Example: $ Z \sim N(0,1)$, i.e.

$\displaystyle f_Z(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$

and $ Y=Z^2$. Then

\begin{align*}
F_Y(y) &= P(Z^2 \le y) \\
&= \left\{ \begin{array}{ll} 0 & y < 0 \\ P(-\sqrt{y} \le Z \le \sqrt{y}) & y \ge 0 \, . \end{array} \right.
\end{align*}

Now differentiate

$\displaystyle P(-\sqrt{y} \le Z \le \sqrt{y}) = F_Z(\sqrt{y}) -F_Z(-\sqrt{y})$

to get

$\displaystyle f_Y(y) = \left\{ \begin{array}{ll}
0 & y < 0 \\
\frac{d}{dy}\left[ F_Z(\sqrt{y}) - F_Z(-\sqrt{y})\right] & y > 0 \\
\mbox{undefined} & y=0 \, .
\end{array}\right.$

Then
\begin{align*}
\frac{d}{dy} F_Z(\sqrt{y}) &= f_Z(\sqrt{y})\frac{d}{dy}\sqrt{y} \\
&= \frac{1}{\sqrt{2\pi}} \exp\left(-\left(\sqrt{y}\right)^2/2\right)\frac{1}{2} y^{-1/2} \\
&= \frac{1}{2\sqrt{2\pi y}} e^{-y/2} \,.
\end{align*}

(A similar formula holds for the other term.) Thus

$\displaystyle f_Y(y) = \left\{ \begin{array}{ll}
\frac{1}{\sqrt{2\pi y}} e^{-y/2} & y>0 \\
0 & y < 0 \\
\mbox{undefined} & y=0 \, .
\end{array}\right.$

We will find indicator notation useful:

$\displaystyle 1(y>0) = \left\{ \begin{array}{ll}
1 & y>0 \\
0 & y \le 0
\end{array}\right.$

which we use to write

$\displaystyle f_Y(y) = \frac{1}{\sqrt{2\pi y}} e^{-y/2}\, 1(y>0)$

(the change in the definition at $ y=0$ is unimportant).
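
As a sanity check (again a sketch assuming numpy, not from the notes), bin a large sample of $Z^2$ and compare bin frequencies with the density just derived:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(801)                  # arbitrary seed
y = rng.standard_normal(1_000_000) ** 2

edges = np.linspace(0.1, 4.0, 14)
counts, _ = np.histogram(y, bins=edges)
mids = 0.5 * (edges[:-1] + edges[1:])
f_mid = np.exp(-mids / 2) / np.sqrt(2 * np.pi * mids)
# Bin frequencies should be close to density * bin width.
print(np.c_[counts / len(y), f_mid * np.diff(edges)])
\end{verbatim}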

Notice: I never evaluated $ F_Y$ before differentiating it. In fact $ F_Y$ and $ F_Z$ are integrals I can't do in closed form, but I can differentiate them anyway. Remember the fundamental theorem of calculus:

$\displaystyle \frac{d}{dx} \int_a^x f(y) \, dy = f(x)$

at any $ x$ where $ f$ is continuous.

Summary: for $ Y=g(X)$ with $ X$ and $ Y$ each real valued

\begin{align*}
P(Y \le y) &= P(g(X) \le y) \\
&= P(X \in g^{-1}(-\infty,y]) \, .
\end{align*}

Take $ d/dy$ to compute the density

$\displaystyle f_Y(y) = \frac{d}{dy}\int_{\{x:g(x) \le y\}} f_X(x) \, dx \, .$

Often we can differentiate without doing the integral.
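
The point can be illustrated numerically (a sketch assuming scipy; not from the notes): build $F_Y$ from the normal CDF, which has no closed form, and difference it:

\begin{verbatim}
import numpy as np
from scipy.stats import norm

def F_Y(y):
    # F_Y(y) = P(Z^2 <= y) = Phi(sqrt(y)) - Phi(-sqrt(y))
    return norm.cdf(np.sqrt(y)) - norm.cdf(-np.sqrt(y))

y, h = 1.0, 1e-5
f_numeric = (F_Y(y + h) - F_Y(y - h)) / (2 * h)   # central difference
f_exact = np.exp(-y / 2) / np.sqrt(2 * np.pi * y)
print(f_numeric, f_exact)                         # both about 0.2420
\end{verbatim}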

Method 2: Change of variables.

Assume $ g$ is one to one. I do the case where $ g$ is increasing and differentiable. Interpretation of the density (based on density $ = F^\prime$):

\begin{align*}
f_Y(y) &= \lim_{\delta y \to 0} \frac{P(y \le Y \le y+\delta y)}{\delta y} \\
&= \lim_{\delta y \to 0} \frac{F_Y(y+\delta y)-F_Y(y)}{\delta y}
\end{align*}

and

$\displaystyle f_X(x) = \lim_{\delta x \to 0} \frac{P(x \le X \le x+\delta x)}{\delta x} \, .$

Now assume $ y=g(x)$. Define $ \delta y$ by $ y+\delta y = g(x+\delta x)$. Then

$\displaystyle P( y \le Y \le g(x+\delta x) ) = P( x \le X \le x+\delta x) \, .$

Get

$\displaystyle \frac{P( y \le Y \le y+\delta y)}{\delta y}
=
\frac{P( x \le X \le x+\delta x)/\delta x}{\{g(x+\delta x)-y\}/\delta x} \, .$

Take limit to get

$\displaystyle f_Y(y) = f_X(x)/g^\prime(x)$

or

$\displaystyle f_Y(g(x))g^\prime(x) = f_X(x) \, .$

Alternative view:

Each probability is the integral of a density.

The first is the integral of the density of $ Y$ over the small interval from $ y=g(x)$ to $ y=g(x+\delta x)$. The interval is narrow, so $ f_Y$ is nearly constant and

$\displaystyle P( y \le Y \le g(x+\delta x) ) \approx f_Y(y)(g(x+\delta x) - g(x)) \, .$

Since $ g$ has a derivative, the difference is

$\displaystyle g(x+\delta x) - g(x) \approx \delta x \, g^\prime(x)$

and we get

$\displaystyle P( y \le Y \le g(x+\delta x) ) \approx f_Y(y) g^\prime(x) \delta x\, .$

Same idea applied to $ P( x \le X \le x+\delta x)$ gives

$\displaystyle P( x \le X \le x+\delta x) \approx f_X(x) \delta x$

so that

$\displaystyle f_Y(y) g^\prime(x) \delta x \approx f_X(x) \delta x$

or, cancelling the $ \delta x$ in the limit

$\displaystyle f_Y(y) g^\prime(x) = f_X(x)\, .$

If you remember $ y=g(x)$ then you get

$\displaystyle f_X(x) = f_Y(g(x)) g^\prime(x)\, .$

Or solve $ y=g(x)$ to get $ x$ in terms of $ y$, that is, $ x=g^{-1}(y)$ and then

$\displaystyle f_Y(y) = f_X(g^{-1}(y)) / g^\prime(g^{-1}(y))\, .$

This is just the change of variables formula for doing integrals.

Remark: For $ g$ decreasing, $ g^\prime < 0$, but then the interval $ (g(x), g(x+\delta x))$ is really $ (g(x+\delta x),g(x))$, so that $ g(x) - g(x+\delta x) \approx -g^\prime(x) \delta x$. In both cases this amounts to the formula

$\displaystyle f_X(x) = f_Y(g(x))\vert g^\prime(x)\vert \, .$

Mnemonic:

$\displaystyle f_Y(y)\, dy = f_X(x)\, dx \, .$
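
The formula is short enough to code directly. A minimal sketch (assuming scipy; the example transformation $Y=e^Z$ and the helper name are mine, not from the notes):

\begin{verbatim}
import numpy as np
from scipy.stats import lognorm, norm

def density_of_g(f_X, g_inv, g_prime, y):
    """f_Y(y) = f_X(g^{-1}(y)) / |g'(g^{-1}(y))| for one-to-one g."""
    x = g_inv(y)
    return f_X(x) / np.abs(g_prime(x))

# Example: Y = exp(Z) with Z standard normal, so Y is lognormal.
y = 2.0
f_y = density_of_g(norm.pdf, np.log, np.exp, y)   # g(x) = e^x, g'(x) = e^x
print(f_y, lognorm.pdf(y, s=1))                   # both about 0.1569
\end{verbatim}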

Example: $ X\sim$ Weibull(shape $ \alpha$, scale $ \beta$), i.e.,

$\displaystyle f_X(x)= \frac{\alpha}{\beta} \left(\frac{x}{\beta}\right)^{\alpha-1}
\exp\left\{ -(x/\beta)^\alpha\right\} 1(x>0)\, .$

Let $ Y=\log X$ or $ g(x) = \log(x)$.

Solve $ y=\log x$: $ x=\exp(y)$ or $ g^{-1}(y) = e^y$.

Then $ g^\prime(x) = 1/x$ and $ 1/g^\prime(g^{-1}(y)) = 1/(1/e^y) =e^y$.

Hence

$\displaystyle f_Y(y) = \frac{\alpha}{\beta} \left(\frac{e^y}{\beta}\right)^{\alpha-1}
\exp\left\{ -(e^y/\beta)^\alpha\right\} 1(e^y>0)\, e^y\, .$

For any $ y$, $ e^y > 0 $, so the indicator is identically 1. So

$\displaystyle f_Y(y) = \frac{\alpha}{\beta^\alpha}
\exp\left\{\alpha y -e^{\alpha y}/\beta^\alpha\right\} \, .$

Define $ \phi = \log\beta$ and $ \theta = 1/\alpha$; then,

$\displaystyle f_Y(y) = \frac{1}{\theta}
\exp\left\{\frac{y-\phi}{\theta} -\exp\left\{\frac{y-\phi}{\theta}\right\}\right\} \, .$

This is the Extreme Value density with location parameter $ \phi$ and scale parameter $ \theta$. (Note: several different distributions are called Extreme Value.)
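
A simulation sketch of this example (assuming numpy and scipy; the parameter values and seed are arbitrary, not from the notes):

\begin{verbatim}
import numpy as np
from scipy.stats import weibull_min

alpha, beta = 2.0, 3.0                      # arbitrary shape and scale
phi, theta = np.log(beta), 1 / alpha

rng = np.random.default_rng(801)
y = np.log(weibull_min.rvs(alpha, scale=beta, size=500_000, random_state=rng))

def f_Y(t):
    z = (t - phi) / theta
    return np.exp(z - np.exp(z)) / theta    # extreme value density derived above

edges = np.linspace(-1.0, 2.5, 8)
counts, _ = np.histogram(y, bins=edges)
mids = 0.5 * (edges[:-1] + edges[1:])
# Bin frequencies should be close to density * bin width.
print(np.c_[counts / len(y), f_Y(mids) * np.diff(edges)])
\end{verbatim}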

Marginalization

Simplest multivariate problem:

$\displaystyle X=(X_1,\ldots,X_p), \qquad Y=X_1$

(or in general $ Y$ is any $ X_j$).

Theorem 1   If $ X$ has density $ f(x_1,\ldots,x_p)$ and $ q<p$ then $ Y=(X_1,\ldots,X_q)$ has density

$\displaystyle f_Y(x_1,\ldots,x_q)
=
\int_{-\infty}^\infty \cdots \int_{-\infty}^\infty
f(x_1,\ldots,x_p) \, dx_{q+1} \cdots dx_p\, .$

We call $ f_{X_1,\ldots,X_q}$ the marginal density of $ X_1,\ldots,X_q$ and $ f_X$ the joint density of $ X$, but they are both just densities; ``marginal'' merely distinguishes it from the joint density of $ X$.

Example: The function

$\displaystyle f(x_1,x_2) = Kx_1x_2\, 1(x_1> 0,x_2 >0,x_1+x_2 < 1)$

is a density provided

$\displaystyle P(X\in R^2) = \int_{-\infty}^\infty \int_{-\infty}^\infty f(x_1,x_2)\, dx_1\, dx_2 = 1 \, .$

The integral is

\begin{align*}
K \int_0^1 \int_0^{1-x_1} x_1 x_2 \, dx_2\, dx_1
&= K \int_0^1 x_1 \frac{(1-x_1)^2}{2} \, dx_1 \\
&= \frac{K}{2}\left(\frac{1}{2} - \frac{2}{3} + \frac{1}{4}\right) = \frac{K}{24}
\end{align*}

so $ K=24$. The marginal density of $ X_1$ is

\begin{align*}
f_{X_1}(x_1) &= \int_{-\infty}^\infty 24 x_1 x_2\, 1(x_1> 0, x_2 >0, x_1+x_2 < 1)\, dx_2 \\
&= 24 \int_0^{1-x_1} x_1 x_2 \, dx_2 \; 1(0 < x_1< 1) \\
&= 12 x_1(1-x_1)^2\, 1(0 < x_1 < 1) \, .
\end{align*}

This is a Beta$ (2,3)$ density.
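
Both the constant and the marginal can be checked numerically; a sketch assuming scipy (not from the notes):

\begin{verbatim}
from scipy.integrate import dblquad, quad
from scipy.stats import beta

# Double integral of 24*x1*x2 over the triangle should be 1.
total, _ = dblquad(lambda x2, x1: 24 * x1 * x2,
                   0, 1, lambda x1: 0, lambda x1: 1 - x1)
print(total)                                       # about 1.0

# The marginal at a point should match the Beta(2,3) density.
x1 = 0.3
marg, _ = quad(lambda x2: 24 * x1 * x2, 0, 1 - x1)
print(marg, 12 * x1 * (1 - x1) ** 2, beta.pdf(x1, 2, 3))   # all 1.764
\end{verbatim}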

The general problem has $ Y=(Y_1,\ldots,Y_q)$ with $ Y_i = g_i(X_1,\ldots,X_p)$.

Case 1: $ q>p$. Then $ Y$ will not have a density for ``smooth'' $ g$; it will have a singular or discrete distribution. This problem is rarely of real interest. (But, e.g., residuals have a singular distribution.)

Case 2: $ q=p$. We use a change of variables formula which generalizes the one derived above for the case $ p=q=1$. (See below.)

Case 3: $ q<p$. Pad out $ Y$: add $ p-q$ more (carefully chosen) variables $ Y_{q+1},\ldots,Y_p$. That is, find functions $ g_{q+1}, \ldots,g_p$, define $ Y_i = g_i(X_1,\ldots,X_p)$ for $ q<i \le p$, and set $ Z=(Y_1,\ldots,Y_p)$. Choose the $ g_i$ so that we can use change of variables on $ g=(g_1,\ldots,g_p)$ to compute $ f_Z$. Then find $ f_Y$ by integration:

$\displaystyle f_Y(y_1,\ldots,y_q)
=
\int_{-\infty}^\infty \cdots \int_{-\infty}^\infty
f_Z(y_1,\ldots,y_q,z_{q+1},\ldots,z_p)\, dz_{q+1} \cdots dz_p \, .$
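
A hypothetical illustration of this recipe (not from the notes; the names f_X and f_Y1 are mine): for independent standard exponentials take $Y_1=X_1+X_2$ and pad with $Y_2=X_2$. The inverse map is $x_1=y_1-y_2$, $x_2=y_2$ with Jacobian 1, so $f_Z(y_1,y_2)=f_X(y_1-y_2,y_2)$, and integrating out $y_2$ should give the Gamma(2,1) density $y_1e^{-y_1}$. A sketch assuming scipy:

\begin{verbatim}
import numpy as np
from scipy.integrate import quad

def f_X(x1, x2):                       # joint density of (X1, X2)
    return np.exp(-x1 - x2) * (x1 > 0) * (x2 > 0)

def f_Y1(y1):                          # integrate out the padding variable
    val, _ = quad(lambda y2: f_X(y1 - y2, y2), 0, y1)
    return val

y1 = 1.5
print(f_Y1(y1), y1 * np.exp(-y1))      # both about 0.3347
\end{verbatim}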

Change of Variables

Suppose $ Y=g(X) \in R^p$ with $ X\in R^p$ having density $ f_X$. Assume $ g$ is a one to one (``injective'') map, i.e., $ g(x_1) = g(x_2)$ if and only if $ x_1 = x_2$. To find $ f_Y$:

Step 1: Solve for $ x$ in terms of $ y$: $ x=g^{-1}(y)$.

Step 2: Use the basic equation:

$\displaystyle f_Y(y)\, dy =f_X(x)\, dx$

and rewrite it in the form

$\displaystyle f_Y(y) = f_X(g^{-1}(y)) \frac{dx}{dy} \, .$

Interpretation of the derivative $ \frac{dx}{dy}$ when $ p>1$:

$\displaystyle \frac{dx}{dy} = \left\vert \mbox{det}\left(\frac{\partial x_i}{\partial y_j}\right)\right\vert$

which is the so-called Jacobian.

An equivalent formula inverts the matrix:

$\displaystyle f_Y(y) = \frac{f_X(g^{-1}(y))}{ \left\vert\frac{dy}{dx}\right\vert}$

This notation means

$\displaystyle \left\vert\frac{dy}{dx}\right\vert =
\left\vert \mbox{det} \left[ \begin{array}{ccc}
\frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_p} \\
\vdots & \ddots & \vdots \\
\frac{\partial y_p}{\partial x_1} & \cdots & \frac{\partial y_p}{\partial x_p}
\end{array} \right]\right\vert$

but with $ x$ replaced by the corresponding value of $ y$, that is, replace $ x$ by $ g^{-1}(y)$.

Example: The density

$\displaystyle f_X(x_1,x_2) = \frac{1}{2\pi} \exp\left\{ -\frac{x_1^2+x_2^2}{2}\right\}$

is the standard bivariate normal density. Let $ Y=(Y_1,Y_2)$ where $ Y_1=\sqrt{X_1^2+X_2^2}$ and $ Y_2$, with $ 0 \le Y_2< 2\pi$, is the angle from the positive $ x$ axis to the ray from the origin to the point $ (X_1,X_2)$. That is, $ Y$ is $ X$ in polar co-ordinates.

Solve for $ x$ in terms of $ y$:

\begin{align*}
X_1 &= Y_1 \cos(Y_2) \\
X_2 &= Y_1 \sin(Y_2)
\end{align*}

so that
\begin{align*}
g(x_1,x_2) &= (g_1(x_1,x_2),g_2(x_1,x_2)) \\
&= (\sqrt{x_1^2 + x_2^2}, \mbox{argument}(x_1,x_2)) \\
g^{-1}(y_1,y_2) &= (g^{-1}_1(y_1,y_2),g^{-1}_2(y_1,y_2)) \\
&= (y_1\cos(y_2), y_1\sin(y_2)) \\
\left\vert\frac{dx}{dy}\right\vert &= \left\vert \mbox{det}\left( \begin{array}{cc}
\cos(y_2) & -y_1\sin(y_2) \\
\sin(y_2) & y_1 \cos(y_2)
\end{array}\right) \right\vert \\
&= y_1 \, .
\end{align*}

It follows that

$\displaystyle f_Y(y_1,y_2) = \frac{1}{2\pi}\exp\left\{-\frac{y_1^2}{2}\right\}y_1
\, 1(0 \le y_1 < \infty)\,
1(0 \le y_2 < 2\pi ) \, .$
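
Before computing marginals, a quick numeric check on the Jacobian bookkeeping (a sketch assuming scipy, not from the notes): the joint density should integrate to 1.

\begin{verbatim}
import numpy as np
from scipy.integrate import dblquad

f = lambda y2, y1: y1 * np.exp(-y1 ** 2 / 2) / (2 * np.pi)
total, _ = dblquad(f, 0, np.inf, lambda y1: 0, lambda y1: 2 * np.pi)
print(total)                                       # about 1.0
\end{verbatim}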

Next: marginal densities of $ Y_1$, $ Y_2$?

Factor $ f_Y$ as $ f_Y(y_1,y_2) = h_1(y_1)h_2(y_2)$ where

$\displaystyle h_1(y_1) = y_1e^{-y_1^2/2}\, 1(0 \le y_1 < \infty)$

and

$\displaystyle h_2(y_2) = 1(0 \le y_2 < 2\pi )/ (2\pi) \, .$

Then

\begin{align*}
f_{Y_1}(y_1) &= \int_{-\infty}^\infty h_1(y_1)h_2(y_2) \, dy_2 \\
&= h_1(y_1) \int_{-\infty}^\infty h_2(y_2) \, dy_2
\end{align*}

so the marginal density of $ Y_1$ is a multiple of $ h_1$. The multiplier makes $ \int f_{Y_1} =1$, but in this case

$\displaystyle \int_{-\infty}^\infty h_2(y_2) \, dy_2 = \int_0^{2\pi} (2\pi)^{-1} dy_2 = 1$

so that

$\displaystyle f_{Y_1}(y_1) = y_1e^{-y_1^2/2}\, 1(0 \le y_1 < \infty) \, .$

(This is the Rayleigh distribution, a special case of the Weibull.) Similarly

$\displaystyle f_{Y_2}(y_2) = 1(0 \le y_2 < 2\pi )/ (2\pi)$

which is the Uniform$(0,2\pi)$ density.
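
A simulation sketch of these marginals (assuming numpy, with an arbitrary seed; not from the notes): the radius should follow the Rayleigh CDF $1-e^{-t^2/2}$ and the angle should be uniform.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(801)
x1, x2 = rng.standard_normal((2, 500_000))
r = np.hypot(x1, x2)                    # Y_1, the radius
angle = np.arctan2(x2, x1) % (2 * np.pi)  # Y_2, the angle in [0, 2*pi)

# Compare empirical CDFs with the exact ones at a few points.
for t in (0.5, 1.0, 2.0):
    print(np.mean(r <= t), 1 - np.exp(-t ** 2 / 2))   # Rayleigh CDF
for t in (np.pi / 2, np.pi):
    print(np.mean(angle <= t), t / (2 * np.pi))       # uniform CDF
\end{verbatim}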

Exercise: $ W=Y_1^2/2$ has standard exponential distribution.

Recall: by definition $ U=Y_1^2$ has a $ \chi^2$ distribution on 2 degrees of freedom.

Exercise: find $ \chi^2_2$ density.

Note: we show below that factorization of the density is equivalent to independence.


Richard Lockhart
2001-01-05