STAT 450: Statistical Theory

Expectation, moments

Two elementary definitions of expected values:

Defn: If $ X$ has density $ f$ then

$\displaystyle E(g(X)) = \int g(x)f(x)\, dx \,.
$

Defn: If $ X$ has discrete density $ f$ then

$\displaystyle E(g(X)) = \sum_x g(x)f(x) \,.
$
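
Numerical illustration (a minimal Python sketch, assuming $ X\sim N(0,1)$ for the continuous case and $ X\sim$ Poisson(3) for the discrete case, with $ g(x)=x^2$; both answers are known in closed form):

import numpy as np
from scipy import integrate, stats

g = lambda x: x**2

# Continuous case: X ~ N(0,1), so E(g(X)) = integral of g(x) f(x) dx = 1
f = stats.norm(0, 1).pdf
val, _ = integrate.quad(lambda x: g(x) * f(x), -np.inf, np.inf)
print(val)

# Discrete case: X ~ Poisson(3), so E(g(X)) = sum over x of g(x) f(x) = 3 + 3**2 = 12
x = np.arange(0, 200)                  # truncate the infinite sum far in the tail
print(np.sum(g(x) * stats.poisson(3.0).pmf(x)))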

FACT: If $ Y=g(X)$ for a smooth, strictly increasing $ g$ then

$\displaystyle E(Y)$ $\displaystyle = \int y f_Y(y) \, dy$    
  $\displaystyle = \int g(x) f_Y(g(x)) g^\prime(x) \, dx$    
  $\displaystyle = \int g(x) f_X(x) \, dx = E(g(X))$    

by the change of variables formula for integration (substitute $ y=g(x)$) together with the change of variables formula for densities, $ f_Y(g(x)) g^\prime(x) = f_X(x)$. This is reassuring: otherwise we might get two different values for $ E(e^X)$, one computed from the density of $ X$ and one from the density of $ Y=e^X$.
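
Numerical check (a minimal sketch, taking $ X\sim N(0,1)$ so that $ Y=e^X$ is lognormal and $ E(e^X)=e^{1/2}$):

import numpy as np
from scipy import integrate, stats

# E(e^X) computed from the density of X ~ N(0,1)
lhs, _ = integrate.quad(lambda x: np.exp(x) * stats.norm.pdf(x), -np.inf, np.inf)

# E(Y) computed from the density of Y = e^X, which is lognormal with s = 1
rhs, _ = integrate.quad(lambda y: y * stats.lognorm.pdf(y, s=1.0), 0, np.inf)

print(lhs, rhs, np.exp(0.5))   # all three are about 1.6487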

In general there are random variables which are neither absolutely continuous nor discrete; see STAT 801 for the general definition of $ E$.

Defn: We call $ X$ integrable if

$\displaystyle E(\vert X\vert) < \infty \, .
$
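
Example: a standard Cauchy random variable is not integrable. A quick simulation sketch shows that its running averages never settle down, unlike those of an integrable variable:

import numpy as np

rng = np.random.default_rng(0)
sizes = [10**2, 10**4, 10**6]

# Standard Cauchy: E(|X|) is infinite, so running averages never stabilize
cauchy = rng.standard_cauchy(max(sizes))
print([cauchy[:m].mean() for m in sizes])

# Standard normal: integrable, so running averages settle near E(X) = 0
normal = rng.standard_normal(max(sizes))
print([normal[:m].mean() for m in sizes])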

Facts: $ E$ is a linear, monotone, positive operator:

  1. Linear: $ E(aX+bY) = aE(X)+bE(Y)$ provided $ X$ and $ Y$ are integrable.

  2. Positive: $ P(X \ge 0) = 1$ implies $ E(X) \ge 0$.

  3. Monotone: $ P(X \ge Y)=1$ and $ X$, $ Y$ integrable implies $ E(X) \ge E(Y)$.

Major technical theorems:

Monotone Convergence: If $ 0 \le X_1 \le X_2 \le \cdots$ and $ X= \lim X_n$ (the limit automatically exists, though it may be infinite) then

$\displaystyle E(X) = \lim_{n\to \infty} E(X_n) \, .
$

Dominated Convergence: If $ \vert X_n\vert \le Y_n$, if there is a random variable $ X$ such that $ X_n \to X$ (technical details of this notion of convergence come later in the course), and if there is a random variable $ Y$ such that $ Y_n \to Y$ with $ E(Y_n) \to E(Y) < \infty$, then

$\displaystyle E(X_n) \to E(X) \, .
$

Often used with all $ Y_n$ the same rv $ Y$.
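
Numerical illustration of Monotone Convergence (a sketch with $ X\sim$ Exponential(1) and $ X_n = \min(X,n)$, so $ 0 \le X_1 \le X_2 \le \cdots$ and $ X_n \uparrow X$):

import numpy as np
from scipy import integrate

# X ~ Exponential(1), X_n = min(X, n): 0 <= X_1 <= X_2 <= ... and X_n increases to X
f = lambda x: np.exp(-x)              # Exponential(1) density on (0, infinity)
for n in [1, 2, 4, 8]:
    e_xn, _ = integrate.quad(lambda x: min(x, n) * f(x), 0, np.inf)
    print(n, e_xn)                    # increases towards E(X) = 1

Since $ \vert X_n\vert \le X$ with $ E(X) < \infty$, the same sequence also illustrates Dominated Convergence.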

Theorem: With this definition of $ E$, if $ X$ has density $ f(x)$ (even on $ R^p$, say) and $ Y=g(X)$, then

$\displaystyle E(Y) = \int g(x) f(x) dx \, .
$

(Could be a multiple integral.) If $ X$ has pmf $ f$ then

$\displaystyle E(Y) =\sum_x g(x) f(x) \, .
$

The first conclusion works, for example, even if $ X$ has a density but $ Y$ doesn't.

Defn: The $ r^{\rm th}$ moment (about the origin) of a real rv $ X$ is $ \mu_r^\prime=E(X^r)$ (provided it exists). We generally use $ \mu$ for $ E(X)$.

Defn: The $ r^{\rm th}$ central moment is

$\displaystyle \mu_r = E[(X-\mu)^r] \, .
$

We call $ \sigma^2 = \mu_2$ the variance.
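
Numerical check of these definitions (a sketch using $ X\sim$ Exponential(1), for which $ \mu=1$, $ \mu_2=1$ and $ \mu_3=2$):

import numpy as np
from scipy import integrate

# Central moments of X ~ Exponential(1): mu = 1, mu_2 = 1, mu_3 = 2
f = lambda x: np.exp(-x)                                  # density of Exponential(1)
mu, _ = integrate.quad(lambda x: x * f(x), 0, np.inf)     # mu = E(X) = 1
for r in [2, 3]:
    m_r, _ = integrate.quad(lambda x: (x - mu)**r * f(x), 0, np.inf)
    print(r, m_r)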

Defn: For an $ R^p$ valued random vector $ X$

$\displaystyle \mu_X = E(X)
$

is the vector whose $ i^{\rm th}$ entry is $ E(X_i)$ (provided all entries exist).

Defn: The ( $ p \times p$) variance covariance matrix of $ X$ is

$\displaystyle Var(X) = E\left[ (X-\mu)(X-\mu)^t \right]
$

which exists provided each component $ X_i$ has a finite second moment.
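
Illustration (a sketch computing $ \mu_X$ and $ {\rm Var}(X)$ directly from a hypothetical joint pmf on four points in $ R^2$):

import numpy as np

# A hypothetical joint pmf for a 2-dimensional discrete X on four support points
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
probs = np.array([0.4, 0.1, 0.1, 0.4])

mu = (probs[:, None] * points).sum(axis=0)      # entrywise means E(X_i)
centred = points - mu
var = (probs[:, None, None] * centred[:, :, None] * centred[:, None, :]).sum(axis=0)
print(mu)    # [0.5 0.5]
print(var)   # [[0.25 0.15], [0.15 0.25]]: the matrix E[(X - mu)(X - mu)^t]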

Moments and probabilities of rare events are closely connected as will be seen in a number of important probability theorems.

Example: Markov's inequality

$\displaystyle P(\vert X-\mu\vert \ge t )$ $\displaystyle = E[1(\vert X-\mu\vert \ge t)]$    
  $\displaystyle \le E\left[\frac{\vert X-\mu\vert^r}{t^r}1(\vert X-\mu\vert \ge t)\right]$    
  $\displaystyle \le \frac{E[\vert X-\mu\vert^r]}{t^r}$    

Intuition: if moments are small then large deviations from average are unlikely.

Special Case: Chebyshev's inequality

$\displaystyle P(\vert X-\mu\vert \ge t ) \le \frac{{\rm Var}(X)}{t^2} \, .
$
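
Numerical check (a sketch taking $ X\sim N(0,1)$, where the tail probability is known exactly):

from scipy import stats

mu, sigma = 0.0, 1.0
for t in [1.0, 2.0, 3.0]:
    exact = 2 * stats.norm.sf(t, loc=mu, scale=sigma)   # P(|X - mu| >= t) for X ~ N(mu, sigma^2)
    bound = sigma**2 / t**2                             # the Chebyshev bound
    print(t, exact, bound)                              # exact tail is below the bound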

Example moments: If $ Z$ is standard normal then

$\displaystyle E(Z)$ $\displaystyle = \int_{-\infty}^\infty z e^{-z^2/2} dz /\sqrt{2\pi}$    
  $\displaystyle = \left.\frac{-e^{-z^2/2}}{\sqrt{2\pi}}\right\vert _{-\infty}^\infty$    
  $\displaystyle = 0$    

and (integrating by parts)

$\displaystyle E(Z^r)$ $\displaystyle = \int_{-\infty}^\infty z^r e^{-z^2/2} dz /\sqrt{2\pi}$    
  $\displaystyle = \left.\frac{-z^{r-1}e^{-z^2/2}}{\sqrt{2\pi}}\right\vert _{-\infty}^\infty + (r-1) \int_{-\infty}^\infty z^{r-2} e^{-z^2/2} dz /\sqrt{2\pi}$    

so that, since the boundary term vanishes,

$\displaystyle \mu_r = (r-1)\mu_{r-2}
$

for $ r \ge 2$. Remembering that $ \mu_1=0$ and

$\displaystyle \mu_0 = \int_{-\infty}^\infty z^0 e^{-z^2/2} dz /\sqrt{2\pi}=1
$

we find that

$\displaystyle \mu_r = \left\{ \begin{array}{ll}
0 & \mbox{$r$ odd}
\\
(r-1)(r-3)\cdots 1 & \mbox{$r$ even} \, .
\end{array}\right.
$
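
Numerical check of these moments (a sketch computing $ E(Z^r)$ by quadrature for $ r=1,\ldots,6$; the values should be $ 0,1,0,3,0,15$):

import numpy as np
from scipy import integrate, stats

# Moments of the standard normal: 0, 1, 0, 3, 0, 15 for r = 1, ..., 6
for r in range(1, 7):
    m, _ = integrate.quad(lambda z: z**r * stats.norm.pdf(z), -np.inf, np.inf)
    print(r, m)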

If now $ X\sim N(\mu,\sigma^2)$, that is, $ X$ has the same distribution as $ \sigma Z + \mu$, then $ E(X) = \sigma E(Z) + \mu = \mu$ and

$\displaystyle \mu_r(X) = E[(X-\mu)^r] = \sigma^r E(Z^r) \, .
$

In particular, we see that our choice of notation $ N(\mu,\sigma^2)$ for the distribution of $ \sigma Z + \mu$ is justified; taking $ r=2$ gives $ {\rm Var}(X) = \sigma^2 E(Z^2) = \sigma^2$, so $ \sigma^2$ is indeed the variance.

Similarly, for $ X\sim MVN(\mu,\Sigma)$ we have $ X=AZ+\mu$ with $ Z\sim MVN(0,I)$ and $ AA^t = \Sigma$, so

$\displaystyle E(X) = \mu
$

and

$\displaystyle {\rm Var}(X)$ $\displaystyle = E\left\{(X-\mu)(X-\mu)^t\right\}$    
  $\displaystyle = E\left\{ AZ (AZ)^t\right\}$    
  $\displaystyle = A E(ZZ^t) A^t$    
  $\displaystyle = AIA^t = \Sigma \, .$    

Note the use of the easy calculations $ E(Z)=0$ and

$\displaystyle {\rm Var}(Z) = E(ZZ^t) =I \, .
$
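
Simulation check (a sketch with a hypothetical $ 2\times 2$ matrix $ A$; the sample variance-covariance matrix of $ AZ+\mu$ should be close to $ AA^t=\Sigma$):

import numpy as np

rng = np.random.default_rng(1)
A = np.array([[2.0, 0.0], [1.0, 1.0]])     # a hypothetical A with Sigma = A A^t
mu = np.array([1.0, -1.0])
Sigma = A @ A.T

Z = rng.standard_normal((100_000, 2))      # rows are independent MVN(0, I) vectors
X = Z @ A.T + mu                           # each row is A z + mu

print(Sigma)                               # [[4. 2.], [2. 2.]]
print(np.cov(X, rowvar=False))             # sample variance-covariance, close to Sigma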

Moments and independence

Theorem: If $ X_1,\ldots,X_p$ are independent and each $ X_i$ is integrable then $ X=X_1\cdots X_p$ is integrable and

$\displaystyle E(X_1\cdots X_p) = E(X_1) \cdots E(X_p) \, .
$
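
Monte Carlo check (a sketch with independent $ X_1\sim$ Exponential(1) and $ X_2\sim$ Uniform(0,1), so $ E(X_1)E(X_2)=1/2$):

import numpy as np

rng = np.random.default_rng(2)
x1 = rng.exponential(1.0, size=1_000_000)    # independent draws with E(X_1) = 1
x2 = rng.uniform(0.0, 1.0, size=1_000_000)   # independent of x1, E(X_2) = 1/2

print((x1 * x2).mean())           # approximately 0.5 = E(X_1) E(X_2)
print(x1.mean() * x2.mean())      # approximately 0.5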

Richard Lockhart
2002-09-16