
STAT 870 Lecture 5

The Strong Law of Large Numbers

Standard approaches to probability theorems:

Events in Set Notation

The event that $Y_n$ converges to 0 is

\begin{displaymath}A \equiv \{\omega: \lim_{n\to\infty} Y_n(\omega) = 0\}
\end{displaymath}

Not explicitly written in terms of simple events involving only finitely many of the $Y_n$.

Recall the basic definition of limit: $y_n$ converges to 0 if $\forall\epsilon>0$ $\exists N$ such that $\forall n\ge N$ we have $\vert y_n\vert\le \epsilon$.

Convert the definition in A into set theory notation:

We get

\begin{displaymath}A = \bigcap_{\epsilon > 0} \bigcup_{N=1}^\infty \bigcap_{n=N}^\infty
\{\omega: \vert Y_n(\omega)\vert \le \epsilon\}
\end{displaymath}

Not obvious A is event because intersection over $\epsilon > 0$ is uncountable.

However, the intersection can be reduced to a countable one. Let

\begin{displaymath}B_{\epsilon} \equiv \bigcup_{N=1}^\infty \bigcap_{n=N}^\infty
\{\omega: \vert Y_n(\omega)\vert \le \epsilon\}
\end{displaymath}

Notice that

\begin{displaymath}\epsilon^\prime < \epsilon \implies B_{\epsilon^\prime} \subset
B_\epsilon
\end{displaymath}

This means that

\begin{displaymath}\bigcap_{\epsilon > 0} B_\epsilon = \bigcap_{m=1}^\infty B_{1/m}
\end{displaymath}

A is countable intersection of countable unions of countable intersections of events, so A is an event.
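
To make the quantifier manipulation concrete, here is a minimal Python sketch (the names and the finite horizon are illustrative assumptions; the true event A involves all n, so this only checks a finite-horizon approximation of membership):

\begin{verbatim}
import numpy as np

def in_A_approx(y, max_m=10):
    """Finite-horizon check mirroring
    A = (intersection over eps=1/m) (union over N) (intersection over n>=N)
        of {|Y_n| <= eps}, with every index truncated at len(y)."""
    n_max = len(y)
    return all(                                  # intersection over eps = 1/m
        any(                                     # union over N
            all(abs(y[n]) <= 1.0 / m             # intersection over n >= N
                for n in range(N, n_max))
            for N in range(n_max))
        for m in range(1, max_m + 1))

# Y_n = sample mean of n standard normals: should pass the check for moderate max_m.
rng = np.random.default_rng(0)
x = rng.standard_normal(5000)
y = np.cumsum(x) / np.arange(1, 5001)
print(in_A_approx(y))
\end{verbatim}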


Strong Law with 4 moments

Theorem 1   Suppose that $X_n; n\ge 1$ is a sequence of independent identically distributed random variables with ${\rm E}(X_n) = 0$. Then

\begin{displaymath}n^{-1}\sum_1^n X_k \to 0
\end{displaymath}

almost surely.
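
A minimal simulation sketch of the conclusion (the exponential distribution, sample size and seed are arbitrary choices, not part of the theorem): the running means $n^{-1}\sum_1^n X_k$ shrink toward 0.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
n = 10 ** 6

# X_k iid with mean 0: exponential(1) shifted by -1.
x = rng.exponential(1.0, size=n) - 1.0

# running sample means n^{-1} (X_1 + ... + X_n)
means = np.cumsum(x) / np.arange(1, n + 1)
for k in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6):
    print(k, means[k - 1])
\end{verbatim}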

Must prove P(A)=1 where

\begin{displaymath}A = \bigcap_{r=1}^\infty \bigcup_{N=1}^\infty \bigcap_{n=N}^\infty B_{n,r}
\end{displaymath}


\begin{displaymath}B_{n,r}=\{ \vert(X_1+\cdots+X_n)/n\vert \le 1/r\}
\end{displaymath}

Let

\begin{displaymath}C_r = \bigcup_{N=1}^\infty \bigcap_{n=N}^\infty B_{n,r}
\end{displaymath}

and notice that

\begin{displaymath}C_1 \supset C_2 \supset \cdots
\end{displaymath}

so that

\begin{displaymath}P( C_1) \ge P(C_2) \ge \cdots \ge \lim P(C_r) = P(A).
\end{displaymath}

Must prove $P(C_r)=1$ for each r.

Remark: Usually ``for every $\epsilon > 0$'' produces an intersection over a decreasing family of events, because the condition in the definition is harder to satisfy for smaller $\epsilon$. Similarly, unions in definitions tend to be over increasing families.

Now each Cr is an increasing union. Let

\begin{displaymath}C_{N,r} = \bigcap_{n=N}^\infty B_{n,r}
\end{displaymath}

and notice that

\begin{displaymath}C_{1,r} \subset C_{2,r} \subset \cdots
\end{displaymath}

We need to prove that

\begin{displaymath}P(C_r) = \lim_N P(C_{N,r}) = 1
\end{displaymath}

or equivalently that

\begin{displaymath}\lim_NP(C_{N,r}^c) = 0
\end{displaymath}

Proof with 4 moments using Borel-Cantelli Lemma.

Suppose An is a sequence of events. The event

\begin{displaymath}B\equiv \{A_n \text{ infinitely often}\} = \{ A_n \text{ i.o.}\}
\end{displaymath}

is the set of $\omega$ belonging to infinitely many of the $A_n$. We have seen above that

\begin{displaymath}B = \bigcap_{N=1}^\infty \bigcup_{n=N}^\infty A_n
\end{displaymath}

Lemma [Borel-Cantelli]: If $ \sum P(A_n) < \infty$ then $ P(A_n \text{ i.o.}) = 0 $.

To prove the lemma we need to prove that

\begin{displaymath}\lim_{N\to\infty}P(\bigcup_{n=N}^\infty A_n ) = 0
\end{displaymath}

But (this next inequality is quite widely useful; it applies to an arbitrary sequence of events $A_n$)

\begin{displaymath}P(\bigcup_{n=N}^\infty A_n ) \le \sum_{n=N}^\infty P(A_n)
\end{displaymath}

Fact: a convergent series has tail sums which converge to 0, so

\begin{displaymath}\lim_{N\to\infty} \sum_{n=N}^\infty P(A_n) =0
\end{displaymath}

which proves the lemma.
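
A simulation sketch of what the lemma asserts, using independent events $A_n$ with $P(A_n) = n^{-2}$ so that $\sum P(A_n) < \infty$ (independence is not needed for the lemma; it just makes the simulation easy):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(2)
n_max = 10 ** 5

# Independent events A_n with P(A_n) = 1/n^2; the sum of the probabilities is pi^2/6.
p = 1.0 / np.arange(1, n_max + 1) ** 2
occurs = rng.random(n_max) < p

# Borel-Cantelli: only finitely many A_n occur, so the last occurrence is early.
idx = np.flatnonzero(occurs) + 1
print("how many A_n occurred:", idx.size)
print("largest such n:", idx.max() if idx.size else None)
\end{verbatim}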

Apply the lemma: notice that the complement of

\begin{displaymath}C_r = \bigcup_{N=1}^\infty \bigcap_{n=N}^\infty B_{n,r}
\end{displaymath}

is just

\begin{displaymath}C_r^c =\bigcap_{N=1}^\infty \bigcup_{n=N}^\infty B_{n,r}^c
\end{displaymath}

In words, the opposite of ``$\exists N$ such that $B_{n,r}$ happens for every $n \ge N$'' is ``$\forall N$ there is at least one $n \ge N$ for which $B_{n,r}$ does not happen'':

\begin{displaymath}C_r^c = \{ B_{n,r}^c\text{ i.o.}\}
\end{displaymath}

Summary: if we prove

\begin{displaymath}\sum_{n=1}^\infty P(B_{n,r}^c) < \infty
\end{displaymath}

then we will conclude $P(C_r^c) = 0$ so $P(C_r) = 1$.

Next ingredient: Markov's inequality
\begin{align*}P(B_{n,r}^c) & = P(\vert X_1+\cdots+X_n\vert > n/r)
\\
& \le \frac{{\rm E}\left[(X_1+\cdots+X_n)^4\right]}{(n/r)^4}
\end{align*}
Expand out, collect terms:
\begin{multline*}{\rm E}\left[(X_1+\cdots+X_n)^4\right]
\\
= n{\rm E}(X_1^4) + 3n(n-1)\left[{\rm E}(X_1^2)\right]^2
\end{multline*}
(every cross term in which some index appears only once has expectation 0 because the $X_i$ are independent with mean 0). This produces the bound (using $n(n-1) \le n^2$)

\begin{displaymath}P(B_{n,r}^c) \le \frac{r^4{\rm E}(X_1^4)}{n^3} + \frac{3r^4\left[{\rm E}(X_1^2)\right]^2}{n^2}
\end{displaymath}

Since

\begin{displaymath}\sum_n n^{-p} < \infty
\end{displaymath}

for any p>1, the infinite sums converge, proving the SLLN for random variables for which

\begin{displaymath}{\rm E}(X_1^4) < \infty \quad\text{and}\quad {\rm E}(X_1^2) < \infty
\end{displaymath}

In fact $X_1^2 \le 1+X_1^4$ proves that ${\rm E}(X_1^4) < \infty$ implies ${\rm E}(X_1^2) < \infty $.
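
A sanity-check sketch of the fourth-moment bound for standard normal X's, where ${\rm E}(X_1^4)=3$ and ${\rm E}(X_1^2)=1$ (the particular n, r and number of replications are arbitrary):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
n, r, reps = 20, 3, 10 ** 5

# Empirical estimate of P(B_{n,r}^c) = P(|X_1 + ... + X_n|/n > 1/r).
sums = rng.standard_normal((reps, n)).sum(axis=1)
empirical = np.mean(np.abs(sums) / n > 1.0 / r)

# Bound: r^4 E(X^4)/n^3 + 3 r^4 [E(X^2)]^2/n^2 with E(X^4) = 3, E(X^2) = 1.
bound = r ** 4 * 3.0 / n ** 3 + 3.0 * r ** 4 * 1.0 / n ** 2
print("empirical:", empirical, "  bound:", bound)
\end{verbatim}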

Proof with 2 finite moments

Assume ${\rm E}(X_1^2)={\rm var}(X_1) < \infty$.

Use Kronecker's Lemma to find an event inside A with probability 1.

Lemma 1 (Kronecker)   Suppose

\begin{displaymath}0 < b_1 < b_2 < \cdots, \qquad b_n \to \infty
\end{displaymath}

and that $x_n$ is a sequence of real numbers. If

\begin{displaymath}s_n = \sum_{i=1}^n \frac{x_i}{b_i}
\end{displaymath}

is a convergent sequence then

\begin{displaymath}\lim_{n\to\infty} \frac{x_1+\cdots +x_n}{b_n} = 0
\end{displaymath}

The lemma (applied with $b_n = n$ and $x_n = X_n$) implies that

\begin{displaymath}A^* = \{ \sum_1^n X_k/k \text{ is a convergent sequence}\} \subset A
\end{displaymath}

so that we need only prove that P(A*)=1.
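
A simulation sketch of this reduction for standard normal X's (an arbitrary choice with finite variance): the partial sums $\sum_1^n X_k/k$ settle down to a limit, while the sample means, which Kronecker's Lemma controls, drift to 0.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(4)
n_max = 10 ** 6
x = rng.standard_normal(n_max)
k = np.arange(1, n_max + 1)

series = np.cumsum(x / k)   # S_n = sum_{k<=n} X_k/k : converges almost surely
means = np.cumsum(x) / k    # (X_1 + ... + X_n)/n    : tends to 0 by Kronecker

for n in (10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6):
    print(n, series[n - 1], means[n - 1])
\end{verbatim}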

We compute $P(\text{partial sums are Cauchy})$, so we study

\begin{displaymath}\bigcap_{r} \bigcup_{N=1}^\infty \bigcap_{n=N}^\infty
\bigcap_{m=n}^\infty \{\vert S_n-S_m\vert \le 1/r\}
\end{displaymath}

where $S_n = \sum_1^n X_i/i$.

Key ingredient: lower bound on

\begin{displaymath}P\left(\bigcap_{m=n}^\infty \{\vert S_n-S_m\vert \le \epsilon\}\right)
\end{displaymath}

or upper bound for
\begin{multline*}P(\bigcup_{m=n}^\infty \{\vert S_m-S_n\vert>\epsilon\})
\\
= \lim_{N\to\infty}
P(\bigcup_{m=n}^N\{\vert S_m-S_n\vert>\epsilon\})
\end{multline*}

Now $S_m-S_n$ is a sum of m-n independent rvs. We need a bound on

\begin{displaymath}P\left(\bigcup_1^n \{ \vert \sum_1^k Z_i\vert > \epsilon\}\right)
\end{displaymath}

where the $Z_i$ are independent mean 0 random variables.

Theorem [Kolmogorov]: Suppose $Z_1,\ldots,Z_n$ are independent random variables with finite variances and mean 0. Put $S_k = \sum_1^k Z_i$. Then

\begin{displaymath}P(\exists k, 1 \le k \le n : \vert S_k\vert > \epsilon) \le
\frac{{\rm Var}(S_n)}{\epsilon^2}
\end{displaymath}

The proof uses the idea of a stopping time. Define T to be the least value of $k\in\{1,\ldots,n\}$ for which $\vert S_k\vert> \epsilon$ if such a k exists, and T=n if no such k exists. Notice that the event T=k is determined by the variables $Z_1,\ldots,Z_k$; that is, there is a function $f_k(Z_1,\ldots,Z_k)$ such that

\begin{displaymath}1(T=k) =f_k(Z_1,\ldots,Z_k)
\end{displaymath}

Now
\begin{align*}P(\exists k, 1 \le k \le n : \vert S_k\vert > \epsilon)
& = P(\vert S_T\vert > \epsilon)
\\
& \le \frac{{\rm E}(S_T^2)}{\epsilon^2}
\end{align*}
by Chebyshev's inequality.

Next we compare ${\rm E}(S_n^2)$ to ${\rm E}(S_T^2)$. We have
\begin{align*}{\rm E}(S_n^2) & = {\rm E}\left[(S_n-S_T+S_T)^2\right]
\\
& = {\rm E}\left[S_T^2\right] + {\rm E}\left[(S_n-S_T)^2\right]
\\
& \qquad + 2{\rm E}\left[S_T(S_n-S_T)\right]
\end{align*}
The first two terms are non-negative, so if

\begin{displaymath}{\rm E}\left[S_T(S_n-S_T)\right] =0
\end{displaymath}

then ${\rm E}(S_T^2) \le {\rm E}(S_n^2) = {\rm Var}(S_n)$ and the theorem will be proved.

But
\begin{align*}S_T(S_n-S_T) & = \sum_1^{n-1} S_k(S_n-S_k)1(T=k)
\\
& = \sum_1^{n-1} S_k(S_n-S_k)f_k(Z_1,\ldots,Z_k)
\end{align*}
(The term $k=n$ contributes 0 because $S_n-S_T=0$ when T=n.) Notice that $S_n-S_k$ and $S_kf_k(Z_1,\ldots,Z_k)$ are independent so
\begin{align*}{\rm E}\left[S_T(S_n-S_T)\right] & = \sum_1^{n-1} {\rm E}(S_n-S_k)
\\
& \qquad \times{\rm E}\left[S_k1(T=k)\right]
\\
& = 0
\end{align*}
since each ${\rm E}(S_n-S_k) = 0$.
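
A small simulation sketch of the two facts just used: the average of $S_T(S_n-S_T)$ over many replications is near 0, and ${\rm E}(S_T^2)$ does not exceed ${\rm E}(S_n^2)$ (normal Z's and the values n=10, $\epsilon=2$ are arbitrary):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(6)
reps, n, eps = 10 ** 5, 10, 2.0

# Partial sums S_1, ..., S_n of iid N(0,1) Z's, one path per row.
s = np.cumsum(rng.standard_normal((reps, n)), axis=1)

# Stopping time T: first k with |S_k| > eps, else n (stored as a 0-based index).
exceed = np.abs(s) > eps
t = np.where(exceed.any(axis=1), exceed.argmax(axis=1), n - 1)
s_t = s[np.arange(reps), t]
s_n = s[:, -1]

print("E[S_T (S_n - S_T)] ~", np.mean(s_t * (s_n - s_t)))          # near 0
print("E[S_T^2] ~", np.mean(s_t ** 2), " E[S_n^2] ~", np.mean(s_n ** 2))
\end{verbatim}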

Remark: In fact all I needed to prove Kolmogorov's inequality was to know that $S_n-S_k$ was uncorrelated with any function of $S_1,\ldots,S_k$ (or equivalently uncorrelated with any function of $Z_1,\ldots,Z_k$).

A sequence $S_1,S_2,\ldots$ is a martingale if, for each $k \ge 1$,

\begin{displaymath}{\rm E}[S_{k+1}\vert S_1,\ldots,S_k] = S_k
\end{displaymath}

(I have yet to define conditional expectation properly but another definition is that

\begin{displaymath}{\rm E}[ (S_{k+1}-S_k)f_k(S_1,\ldots,S_k)] = 0
\end{displaymath}

for every bounded Borel function fk.)
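
A quick sketch checking this second characterization for the simplest martingale, the random walk $S_k = Z_1+\cdots+Z_k$ with iid mean-0 Z's; the bounded function $f_k = \sin(S_k)$ is an arbitrary choice:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(7)
reps, k = 10 ** 6, 5

# Random-walk martingale: S_j = Z_1 + ... + Z_j with Z's iid N(0,1).
z = rng.standard_normal((reps, k + 1))
s_k = z[:, :k].sum(axis=1)    # S_k
increment = z[:, k]           # S_{k+1} - S_k = Z_{k+1}

# E[(S_{k+1} - S_k) f_k(S_1, ..., S_k)] should be 0 for bounded f_k.
print(np.mean(increment * np.sin(s_k)))   # near 0 up to Monte Carlo error
\end{verbatim}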

Theorem [Kolmogorov]: Suppose $S_1,\ldots,S_n$ is a martingale with finite variances and mean 0. Then

\begin{displaymath}P(\exists k, 1 \le k \le n : \vert S_k\vert > \epsilon) \le
\frac{{\rm Var}(S_n)}{\epsilon^2}\, . \end{displaymath}

Back to strong law. Fix $\epsilon > 0$ and integer m. The events

\begin{displaymath}B_{m,N} = \bigcap_{n=m}^N \{\vert S_n-S_m\vert \le \epsilon\}
\end{displaymath}

decrease, that is, $B_{m,m} \supset B_{m,m+1} \supset \cdots$. So
\begin{align*}P(B_{m,\infty}) &\equiv P( \bigcap_{n=m}^\infty \{\vert S_n-S_m\vert \le \epsilon\})
\\
&
= \lim_N P(B_{m,N})
\end{align*}

Now
\begin{align*}P(B_{m,N}) & = 1-P(B_{m,N}^c)
\\
& \ge 1-\frac{{\rm Var}(S_N-S_m)}{\epsilon^2}
\\
& \ge 1-\frac{{\rm Var}(X_1)}{\epsilon^2} \sum_{i=m}^\infty \frac{1}{i^2}
\end{align*}
using Kolmogorov's inequality for the sums $S_n-S_m$ of the independent variables $X_i/i$, $m < i \le N$, together with ${\rm Var}(S_N-S_m) = {\rm Var}(X_1)\sum_{i=m+1}^N i^{-2}$.
This shows

\begin{displaymath}P(B_{m,\infty}) \ge 1-\frac{{\rm Var}(X_1)}{\epsilon^2} \sum_{i=m}^\infty
\frac{1}{i^2}
\end{displaymath}

Since $\sum_1^\infty i^{-2}<\infty$ the tail sum $\sum_{i=m}^\infty i^{-2}$ goes to 0 as m goes to $\infty$, so the right hand side of this inequality goes to 1. In other words

\begin{displaymath}\lim_{m\to\infty} P(\bigcap_{n=m}^\infty \{\vert S_n-S_m\vert \le \epsilon\}) = 1
\end{displaymath}

This proves that, for each $\epsilon = 1/r$, $P\left(\bigcup_{m} \bigcap_{n=m}^\infty \{\vert S_n-S_m\vert \le \epsilon\}\right) = 1$. Intersecting over r shows that the partial sums $S_n$ are almost surely Cauchy, hence convergent, so $P(A^*)=1$; by Kronecker's Lemma $P(A)=1$, proving the strong law when ${\rm Var}(X_1) < \infty$.


Richard Lockhart
2000-09-20