
STAT 801: Mathematical Statistics

Distribution Theory

Basic Problem: Start with assumptions about the density $ f$ or the CDF of a random vector $ X=(X_1,\ldots,X_p)$. Define $ Y=g(X_1,\ldots,X_p)$ to be some function of $ X$ (usually some statistic of interest). How can we compute the distribution, CDF, or density of $ Y$?

Univariate Techniques

Method 1: compute the CDF by integration and differentiate to find $ f_Y$.

Example: $ U \sim$   Uniform$ [0,1]$ and $ Y=-\log U$.

\begin{align*}
F_Y(y) &= P(Y \le y) = P(-\log U \le y) \\
&= P(\log U \ge -y) = P(U \ge e^{-y}) \\
&= \left\{ \begin{array}{ll}
1- e^{-y} & y > 0 \\
0 & y \le 0 \, .
\end{array}\right.
\end{align*}

so $ Y$ has the standard exponential distribution.
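
This is easy to check by simulation. Here is a minimal sketch (not part of the original notes; it assumes numpy and an arbitrary seed) comparing the empirical CDF of $-\log U$ with $1-e^{-y}$:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(801)          # arbitrary seed
u = rng.uniform(size=100_000)

for y in (0.1, 0.5, 1.0, 2.0):
    empirical = np.mean(-np.log(u) <= y)  # P(Y <= y) estimated from the sample
    exact = 1 - np.exp(-y)                # the CDF derived above
    print(f"y={y:4.1f}  empirical={empirical:.4f}  exact={exact:.4f}")
\end{verbatim}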

Example: $ Z \sim N(0,1)$, i.e.

$\displaystyle f_Z(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$

and $ Y=Z^2$. Then

\begin{align*}
F_Y(y) &= P(Z^2 \le y) \\
&= \left\{ \begin{array}{ll} 0 & y < 0 \\ P(-\sqrt{y} \le Z \le \sqrt{y}) & y \ge 0 \, . \end{array} \right.
\end{align*}

Now differentiate

$\displaystyle P(-\sqrt{y} \le Z \le \sqrt{y}) = F_Z(\sqrt{y}) -F_Z(-\sqrt{y})$

to get

$\displaystyle f_Y(y) = \left\{ \begin{array}{ll}
0 & y < 0 \\
\frac{d}{dy}\left[ F_Z(\sqrt{y}) - F_Z(-\sqrt{y})\right] & y > 0 \\
\mbox{undefined} & y=0 \, .
\end{array}\right.$

Then
\begin{align*}
\frac{d}{dy} F_Z(\sqrt{y}) &= f_Z(\sqrt{y})\frac{d}{dy}\sqrt{y} \\
&= \frac{1}{\sqrt{2\pi}} \exp\left(-\left(\sqrt{y}\right)^2/2\right)\frac{1}{2} y^{-1/2} \\
&= \frac{1}{2\sqrt{2\pi y}} e^{-y/2} \,.
\end{align*}

(A similar formula holds for the other term.) Thus

$\displaystyle f_Y(y) = \left\{ \begin{array}{ll}
\frac{1}{\sqrt{2\pi y}} e^{-y/2} & y>0 \\
0 & y < 0 \\
\mbox{undefined} & y=0 \, .
\end{array}\right.$

We will find indicator notation useful:

$\displaystyle 1(y>0) = \left\{ \begin{array}{ll}
1 & y>0 \\
0 & y \le 0
\end{array}\right.$

which we use to write

$\displaystyle f_Y(y) = \frac{1}{\sqrt{2\pi y}} e^{-y/2}\, 1(y>0)$

(the change in the definition at $ y=0$ is unimportant).
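
As a sanity check (again a sketch assuming numpy, not from the notes), bin a large sample of $Z^2$ and compare bin frequencies with the density just derived:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(801)                  # arbitrary seed
y = rng.standard_normal(1_000_000) ** 2

edges = np.linspace(0.1, 4.0, 14)
counts, _ = np.histogram(y, bins=edges)
mids = 0.5 * (edges[:-1] + edges[1:])
f_mid = np.exp(-mids / 2) / np.sqrt(2 * np.pi * mids)
# Bin frequencies should be close to density * bin width.
print(np.c_[counts / len(y), f_mid * np.diff(edges)])
\end{verbatim}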

Notice: I never evaluated $ F_Y$ before differentiating it. In fact $ F_Y$ and $ F_Z$ are integrals I can't do in closed form, but I can differentiate them anyway. Remember the fundamental theorem of calculus:

$\displaystyle \frac{d}{dx} \int_a^x f(y) \, dy = f(x)$

at any $ x$ where $ f$ is continuous.

Summary: for $ Y=g(X)$ with $ X$ and $ Y$ each real valued

\begin{align*}
P(Y \le y) &= P(g(X) \le y) \\
&= P(X \in g^{-1}(-\infty,y]) \, .
\end{align*}

Take $ d/dy$ to compute the density

$\displaystyle f_Y(y) = \frac{d}{dy}\int_{\{x:g(x) \le y\}} f_X(x) \, dx \, .$

Often we can differentiate without doing the integral.
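
The point can be illustrated numerically (a sketch assuming scipy; not from the notes): build $F_Y$ from the normal CDF, which has no closed form, and difference it:

\begin{verbatim}
import numpy as np
from scipy.stats import norm

def F_Y(y):
    # F_Y(y) = P(Z^2 <= y) = Phi(sqrt(y)) - Phi(-sqrt(y))
    return norm.cdf(np.sqrt(y)) - norm.cdf(-np.sqrt(y))

y, h = 1.0, 1e-5
f_numeric = (F_Y(y + h) - F_Y(y - h)) / (2 * h)   # central difference
f_exact = np.exp(-y / 2) / np.sqrt(2 * np.pi * y)
print(f_numeric, f_exact)                         # both about 0.2420
\end{verbatim}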

Method 2: Change of variables.

Assume $ g$ is one to one. I do the case where $ g$ is increasing and differentiable. Interpretation of the density (based on density $ = F^\prime$):

\begin{align*}
f_Y(y) &= \lim_{\delta y \to 0} \frac{P(y \le Y \le y+\delta y)}{\delta y} \\
&= \lim_{\delta y \to 0} \frac{F_Y(y+\delta y)-F_Y(y)}{\delta y}
\end{align*}

and

$\displaystyle f_X(x) = \lim_{\delta x \to 0} \frac{P(x \le X \le x+\delta x)}{\delta x} \, .$

Now assume $ y=g(x)$. Define $ \delta y$ by $ y+\delta y = g(x+\delta x)$. Then

$\displaystyle P( y \le Y \le g(x+\delta x) ) = P( x \le X \le x+\delta x) \, .$

Get

$\displaystyle \frac{P( y \le Y \le y+\delta y)}{\delta y}
=
\frac{P( x \le X \le x+\delta x)/\delta x}{\{g(x+\delta x)-y\}/\delta x} \, .$

Take limit to get

$\displaystyle f_Y(y) = f_X(x)/g^\prime(x)$

or

$\displaystyle f_Y(g(x))g^\prime(x) = f_X(x) \, .$

Alternative view:

Each probability is the integral of a density.

The first is the integral of the density of $ Y$ over the small interval from $ y=g(x)$ to $ y=g(x+\delta x)$. The interval is narrow, so $ f_Y$ is nearly constant and

$\displaystyle P( y \le Y \le g(x+\delta x) ) \approx f_Y(y)(g(x+\delta x) - g(x)) \, .$

Since $ g$ has a derivative, the difference is

$\displaystyle g(x+\delta x) - g(x) \approx \delta x \, g^\prime(x)$

and we get

$\displaystyle P( y \le Y \le g(x+\delta x) ) \approx f_Y(y) g^\prime(x) \delta x\, .$

Same idea applied to $ P( x \le X \le x+\delta x)$ gives

$\displaystyle P( x \le X \le x+\delta x) \approx f_X(x) \delta x$

so that

$\displaystyle f_Y(y) g^\prime(x) \delta x \approx f_X(x) \delta x$

or, cancelling the $ \delta x$ in the limit

$\displaystyle f_Y(y) g^\prime(x) = f_X(x)\, .$

If you remember $ y=g(x)$ then you get

$\displaystyle f_X(x) = f_Y(g(x)) g^\prime(x)\, .$

Or solve $ y=g(x)$ to get $ x$ in terms of $ y$, that is, $ x=g^{-1}(y)$ and then

$\displaystyle f_Y(y) = f_X(g^{-1}(y)) / g^\prime(g^{-1}(y))\, .$

This is just the change of variables formula for doing integrals.

Remark: For $ g$ decreasing, $ g^\prime < 0$, but then the interval $ (g(x), g(x+\delta x))$ is really $ (g(x+\delta x),g(x))$, so that $ g(x) - g(x+\delta x) \approx -g^\prime(x) \delta x$. In both cases this amounts to the formula

$\displaystyle f_X(x) = f_Y(g(x))\vert g^\prime(x)\vert \, .$

Mnemonic:

$\displaystyle f_Y(y)\, dy = f_X(x)\, dx \, .$
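
The formula is short enough to code directly. A minimal sketch (assuming scipy; the example transformation $Y=e^Z$ and the helper name are mine, not from the notes):

\begin{verbatim}
import numpy as np
from scipy.stats import lognorm, norm

def density_of_g(f_X, g_inv, g_prime, y):
    """f_Y(y) = f_X(g^{-1}(y)) / |g'(g^{-1}(y))| for one-to-one g."""
    x = g_inv(y)
    return f_X(x) / np.abs(g_prime(x))

# Example: Y = exp(Z) with Z standard normal, so Y is lognormal.
y = 2.0
f_y = density_of_g(norm.pdf, np.log, np.exp, y)   # g(x) = e^x, g'(x) = e^x
print(f_y, lognorm.pdf(y, s=1))                   # both about 0.1569
\end{verbatim}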

Example: $ X\sim$ Weibull(shape $ \alpha$, scale $ \beta$), i.e.,

$\displaystyle f_X(x)= \frac{\alpha}{\beta} \left(\frac{x}{\beta}\right)^{\alpha-1}
\exp\left\{ -(x/\beta)^\alpha\right\} 1(x>0)\, .$

Let $ Y=\log X$ or $ g(x) = \log(x)$.

Solve $ y=\log x$: $ x=\exp(y)$ or $ g^{-1}(y) = e^y$.

Then $ g^\prime(x) = 1/x$ and $ 1/g^\prime(g^{-1}(y)) = 1/(1/e^y) =e^y$.

Hence

$\displaystyle f_Y(y) = \frac{\alpha}{\beta} \left(\frac{e^y}{\beta}\right)^{\alpha-1}
\exp\left\{ -(e^y/\beta)^\alpha\right\} 1(e^y>0)\, e^y\, .$

For any $ y$, $ e^y > 0 $, so the indicator is identically 1. So

$\displaystyle f_Y(y) = \frac{\alpha}{\beta^\alpha}
\exp\left\{\alpha y -e^{\alpha y}/\beta^\alpha\right\} \, .$

Define $ \phi = \log\beta$ and $ \theta = 1/\alpha$; then,

$\displaystyle f_Y(y) = \frac{1}{\theta}
\exp\left\{\frac{y-\phi}{\theta} -\exp\left\{\frac{y-\phi}{\theta}\right\}\right\} \, .$

This is the Extreme Value density with location parameter $ \phi$ and scale parameter $ \theta$. (Note: several different distributions are called Extreme Value.)
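
A simulation sketch of this example (assuming numpy and scipy; the parameter values and seed are arbitrary, not from the notes):

\begin{verbatim}
import numpy as np
from scipy.stats import weibull_min

alpha, beta = 2.0, 3.0                      # arbitrary shape and scale
phi, theta = np.log(beta), 1 / alpha

rng = np.random.default_rng(801)
y = np.log(weibull_min.rvs(alpha, scale=beta, size=500_000, random_state=rng))

def f_Y(t):
    z = (t - phi) / theta
    return np.exp(z - np.exp(z)) / theta    # extreme value density derived above

edges = np.linspace(-1.0, 2.5, 8)
counts, _ = np.histogram(y, bins=edges)
mids = 0.5 * (edges[:-1] + edges[1:])
# Bin frequencies should be close to density * bin width.
print(np.c_[counts / len(y), f_Y(mids) * np.diff(edges)])
\end{verbatim}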

Marginalization

Simplest multivariate problem:

$\displaystyle X=(X_1,\ldots,X_p), \qquad Y=X_1$

(or in general $ Y$ is any $ X_j$).

Theorem 1   If $ X$ has density $ f(x_1,\ldots,x_p)$ and $ q<p$ then $ Y=(X_1,\ldots,X_q)$ has density

$\displaystyle f_Y(x_1,\ldots,x_q)
=
\int_{-\infty}^\infty \cdots \int_{-\infty}^\infty
f(x_1,\ldots,x_p) \, dx_{q+1} \cdots dx_p\, .$

We call $ f_{X_1,\ldots,X_q}$ the marginal density of $ X_1,\ldots,X_q$ and $ f_X$ the joint density of $ X$, but they are both just densities; ``marginal'' merely distinguishes it from the joint density of $ X$.

Example: The function

$\displaystyle f(x_1,x_2) = Kx_1x_2\, 1(x_1> 0,x_2 >0,x_1+x_2 < 1)$

is a density provided

$\displaystyle P(X\in R^2) = \int_{-\infty}^\infty \int_{-\infty}^\infty f(x_1,x_2)\, dx_1\, dx_2 = 1 \, .$

The integral is

\begin{align*}
K \int_0^1 \int_0^{1-x_1} x_1 x_2 \, dx_2\, dx_1
&= K \int_0^1 x_1 \frac{(1-x_1)^2}{2} \, dx_1 \\
&= \frac{K}{2}\left(\frac{1}{2} - \frac{2}{3} + \frac{1}{4}\right) = \frac{K}{24}
\end{align*}

so $ K=24$. The marginal density of $ X_1$ is

\begin{align*}
f_{X_1}(x_1) &= \int_{-\infty}^\infty 24 x_1 x_2\, 1(x_1> 0, x_2 >0, x_1+x_2 < 1)\, dx_2 \\
&= 24 \int_0^{1-x_1} x_1 x_2 \, dx_2 \; 1(0 < x_1< 1) \\
&= 12 x_1(1-x_1)^2\, 1(0 < x_1 < 1) \, .
\end{align*}

This is a Beta$ (2,3)$ density.
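
Both the constant and the marginal can be checked numerically; a sketch assuming scipy (not from the notes):

\begin{verbatim}
from scipy.integrate import dblquad, quad
from scipy.stats import beta

# Double integral of 24*x1*x2 over the triangle should be 1.
total, _ = dblquad(lambda x2, x1: 24 * x1 * x2,
                   0, 1, lambda x1: 0, lambda x1: 1 - x1)
print(total)                                       # about 1.0

# The marginal at a point should match the Beta(2,3) density.
x1 = 0.3
marg, _ = quad(lambda x2: 24 * x1 * x2, 0, 1 - x1)
print(marg, 12 * x1 * (1 - x1) ** 2, beta.pdf(x1, 2, 3))   # all 1.764
\end{verbatim}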

The general problem has $ Y=(Y_1,\ldots,Y_q)$ with $ Y_i = g_i(X_1,\ldots,X_p)$.

Case 1: $ q>p$. Then $ Y$ will not have a density for ``smooth'' $ g$; it will have a singular or discrete distribution. This problem is rarely of real interest. (But, e.g., residuals have a singular distribution.)

Case 2: $ q=p$. We use a change of variables formula which generalizes the one derived above for the case $ p=q=1$. (See below.)

Case 3: $ q<p$. Pad out $ Y$: add $ p-q$ more (carefully chosen) variables $ Y_{q+1},\ldots,Y_p$. That is, find functions $ g_{q+1}, \ldots,g_p$, define $ Y_i = g_i(X_1,\ldots,X_p)$ for $ q<i \le p$, and set $ Z=(Y_1,\ldots,Y_p)$. Choose the $ g_i$ so that we can use change of variables on $ g=(g_1,\ldots,g_p)$ to compute $ f_Z$. Then find $ f_Y$ by integration:

$\displaystyle f_Y(y_1,\ldots,y_q)
=
\int_{-\infty}^\infty \cdots \int_{-\infty}^\infty
f_Z(y_1,\ldots,y_q,z_{q+1},\ldots,z_p)\, dz_{q+1} \cdots dz_p \, .$
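
A hypothetical illustration of this recipe (not from the notes; the names f_X and f_Y1 are mine): for independent standard exponentials take $Y_1=X_1+X_2$ and pad with $Y_2=X_2$. The inverse map is $x_1=y_1-y_2$, $x_2=y_2$ with Jacobian 1, so $f_Z(y_1,y_2)=f_X(y_1-y_2,y_2)$, and integrating out $y_2$ should give the Gamma(2,1) density $y_1e^{-y_1}$. A sketch assuming scipy:

\begin{verbatim}
import numpy as np
from scipy.integrate import quad

def f_X(x1, x2):                       # joint density of (X1, X2)
    return np.exp(-x1 - x2) * (x1 > 0) * (x2 > 0)

def f_Y1(y1):                          # integrate out the padding variable
    val, _ = quad(lambda y2: f_X(y1 - y2, y2), 0, y1)
    return val

y1 = 1.5
print(f_Y1(y1), y1 * np.exp(-y1))      # both about 0.3347
\end{verbatim}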

Change of Variables

Suppose $ Y=g(X) \in R^p$ with $ X\in R^p$ having density $ f_X$. Assume $ g$ is a one to one (``injective'') map, i.e., $ g(x_1) = g(x_2)$ if and only if $ x_1 = x_2$. To find $ f_Y$:

Step 1: Solve for $ x$ in terms of $ y$: $ x=g^{-1}(y)$.

Step 2: Use the basic equation:

$\displaystyle f_Y(y)\, dy =f_X(x)\, dx$

and rewrite it in the form

$\displaystyle f_Y(y) = f_X(g^{-1}(y)) \frac{dx}{dy} \, .$

Interpretation of the derivative $ \frac{dx}{dy}$ when $ p>1$:

$\displaystyle \frac{dx}{dy} = \left\vert \mbox{det}\left(\frac{\partial x_i}{\partial y_j}\right)\right\vert$

which is the so-called Jacobian.

An equivalent formula inverts the matrix:

$\displaystyle f_Y(y) = \frac{f_X(g^{-1}(y))}{ \left\vert\frac{dy}{dx}\right\vert}$

This notation means

$\displaystyle \left\vert\frac{dy}{dx}\right\vert =
\left\vert \mbox{det} \left[ \begin{array}{ccc}
\frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_p} \\
\vdots & \ddots & \vdots \\
\frac{\partial y_p}{\partial x_1} & \cdots & \frac{\partial y_p}{\partial x_p}
\end{array} \right]\right\vert$

but with $ x$ replaced by the corresponding value of $ y$, that is, replace $ x$ by $ g^{-1}(y)$.

Example: The density

$\displaystyle f_X(x_1,x_2) = \frac{1}{2\pi} \exp\left\{ -\frac{x_1^2+x_2^2}{2}\right\}$

is the standard bivariate normal density. Let $ Y=(Y_1,Y_2)$ where $ Y_1=\sqrt{X_1^2+X_2^2}$ and $ Y_2$, with $ 0 \le Y_2< 2\pi$, is the angle from the positive $ x$ axis to the ray from the origin to the point $ (X_1,X_2)$. That is, $ Y$ is $ X$ in polar co-ordinates.

Solve for $ x$ in terms of $ y$:

\begin{align*}
X_1 &= Y_1 \cos(Y_2) \\
X_2 &= Y_1 \sin(Y_2)
\end{align*}

so that
\begin{align*}
g(x_1,x_2) &= (g_1(x_1,x_2),g_2(x_1,x_2)) \\
&= (\sqrt{x_1^2 + x_2^2}, \mbox{argument}(x_1,x_2)) \\
g^{-1}(y_1,y_2) &= (g^{-1}_1(y_1,y_2),g^{-1}_2(y_1,y_2)) \\
&= (y_1\cos(y_2), y_1\sin(y_2)) \\
\left\vert\frac{dx}{dy}\right\vert &= \left\vert \mbox{det}\left( \begin{array}{cc}
\cos(y_2) & -y_1\sin(y_2) \\
\sin(y_2) & y_1 \cos(y_2)
\end{array}\right) \right\vert \\
&= y_1 \, .
\end{align*}

It follows that

$\displaystyle f_Y(y_1,y_2) = \frac{1}{2\pi}\exp\left\{-\frac{y_1^2}{2}\right\}y_1
\, 1(0 \le y_1 < \infty)\,
1(0 \le y_2 < 2\pi ) \, .$
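
Before computing marginals, a quick numeric check on the Jacobian bookkeeping (a sketch assuming scipy, not from the notes): the joint density should integrate to 1.

\begin{verbatim}
import numpy as np
from scipy.integrate import dblquad

f = lambda y2, y1: y1 * np.exp(-y1 ** 2 / 2) / (2 * np.pi)
total, _ = dblquad(f, 0, np.inf, lambda y1: 0, lambda y1: 2 * np.pi)
print(total)                                       # about 1.0
\end{verbatim}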

Next: marginal densities of $ Y_1$, $ Y_2$?

Factor $ f_Y$ as $ f_Y(y_1,y_2) = h_1(y_1)h_2(y_2)$ where

$\displaystyle h_1(y_1) = y_1e^{-y_1^2/2}\, 1(0 \le y_1 < \infty)$

and

$\displaystyle h_2(y_2) = 1(0 \le y_2 < 2\pi )/ (2\pi) \, .$

Then

\begin{align*}
f_{Y_1}(y_1) &= \int_{-\infty}^\infty h_1(y_1)h_2(y_2) \, dy_2 \\
&= h_1(y_1) \int_{-\infty}^\infty h_2(y_2) \, dy_2
\end{align*}

so the marginal density of $ Y_1$ is a multiple of $ h_1$. The multiplier makes $ \int f_{Y_1} =1$, but in this case

$\displaystyle \int_{-\infty}^\infty h_2(y_2) \, dy_2 = \int_0^{2\pi} (2\pi)^{-1} dy_2 = 1$

so that

$\displaystyle f_{Y_1}(y_1) = y_1e^{-y_1^2/2}\, 1(0 \le y_1 < \infty) \, .$

(This is the Rayleigh distribution, a special case of the Weibull.) Similarly

$\displaystyle f_{Y_2}(y_2) = 1(0 \le y_2 < 2\pi )/ (2\pi)$

which is the Uniform$(0,2\pi)$ density.
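
A simulation sketch of these marginals (assuming numpy, with an arbitrary seed; not from the notes): the radius should follow the Rayleigh CDF $1-e^{-t^2/2}$ and the angle should be uniform.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(801)
x1, x2 = rng.standard_normal((2, 500_000))
r = np.hypot(x1, x2)                    # Y_1, the radius
angle = np.arctan2(x2, x1) % (2 * np.pi)  # Y_2, the angle in [0, 2*pi)

# Compare empirical CDFs with the exact ones at a few points.
for t in (0.5, 1.0, 2.0):
    print(np.mean(r <= t), 1 - np.exp(-t ** 2 / 2))   # Rayleigh CDF
for t in (np.pi / 2, np.pi):
    print(np.mean(angle <= t), t / (2 * np.pi))       # uniform CDF
\end{verbatim}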

Exercise: $ W=Y_1^2/2$ has standard exponential distribution.

Recall: by definition $ U=Y_1^2$ has a $ \chi^2$ distribution on 2 degrees of freedom.

Exercise: find $ \chi^2_2$ density.

Note: we show below that factorization of the density is equivalent to independence.


Richard Lockhart
2001-01-05