
STAT 870 Lecture 5

The Strong Law of Large Numbers

Standard approaches to probability theorems:

Events in Set Notation

The event that $Y_n$ converges to 0 is

\begin{displaymath}A \equiv \{\omega: \lim_{n\to\infty} Y_n(\omega) = 0\}
\end{displaymath}

Not explicitly written in terms of simple events involving only finitely many of the $Y_n$.

Recall the basic definition of limit: $y_n$ converges to 0 if $\forall\epsilon>0$ $\exists N$ such that $\forall n\ge N$ we have $\vert y_n\vert\le \epsilon$.

Convert the definition in A into set theory notation:

We get

\begin{displaymath}A = \bigcap_{\epsilon > 0} \bigcup_{N=1}^\infty \bigcap_{n=N}^\infty
\{\omega: \vert Y_n(\omega)\vert \le \epsilon\}
\end{displaymath}

Not obvious A is event because intersection over $\epsilon > 0$ is uncountable.

However, the intersection can be reduced to a countable one. Let

\begin{displaymath}B_{\epsilon} \equiv \bigcup_{N=1}^\infty \bigcap_{n=N}^\infty
\{\omega: \vert Y_n(\omega)\vert \le \epsilon\}
\end{displaymath}

Notice that

\begin{displaymath}\epsilon^\prime < \epsilon \implies B_{\epsilon^\prime} \subset
B_\epsilon
\end{displaymath}

This means that

\begin{displaymath}\bigcap_{\epsilon > 0} B_\epsilon = \bigcap_{m=1}^\infty B_{1/m}
\end{displaymath}

A is countable intersection of countable unions of countable intersections of events, so A is an event.
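
To make the quantifier manipulation concrete, here is a minimal Python sketch (the names and the finite horizon are illustrative assumptions; the true event A involves all n, so this only checks a finite-horizon approximation of membership):

\begin{verbatim}
import numpy as np

def in_A_approx(y, max_m=10):
    """Finite-horizon check mirroring
    A = (intersection over eps=1/m) (union over N) (intersection over n>=N)
        of {|Y_n| <= eps}, with every index truncated at len(y)."""
    n_max = len(y)
    return all(                                  # intersection over eps = 1/m
        any(                                     # union over N
            all(abs(y[n]) <= 1.0 / m             # intersection over n >= N
                for n in range(N, n_max))
            for N in range(n_max))
        for m in range(1, max_m + 1))

# Y_n = sample mean of n standard normals: should pass the check for moderate max_m.
rng = np.random.default_rng(0)
x = rng.standard_normal(5000)
y = np.cumsum(x) / np.arange(1, 5001)
print(in_A_approx(y))
\end{verbatim}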


Strong Law with 4 moments

Theorem 1   Suppose that $X_n; n\ge 1$ is a sequence of independent identically distributed random variables with ${\rm E}(X_n) = 0$. Then

\begin{displaymath}n^{-1}\sum_1^n X_k \to 0
\end{displaymath}

almost surely.
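
A minimal simulation sketch of the conclusion (the exponential distribution, sample size and seed are arbitrary choices, not part of the theorem): the running means $n^{-1}\sum_1^n X_k$ shrink toward 0.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
n = 10 ** 6

# X_k iid with mean 0: exponential(1) shifted by -1.
x = rng.exponential(1.0, size=n) - 1.0

# running sample means n^{-1} (X_1 + ... + X_n)
means = np.cumsum(x) / np.arange(1, n + 1)
for k in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6):
    print(k, means[k - 1])
\end{verbatim}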

Must prove P(A)=1 where

\begin{displaymath}A = \bigcap_{r=1}^\infty \bigcup_{N=1}^\infty \bigcap_{n=N}^\infty B_{n,r}
\end{displaymath}


\begin{displaymath}B_{n,r}=\{ \vert(X_1+\cdots+X_n)/n\vert \le 1/r\}
\end{displaymath}

Let

\begin{displaymath}C_r = \bigcup_{N=1}^\infty \bigcap_{n=N}^\infty B_{n,r}
\end{displaymath}

and notice that

\begin{displaymath}C_1 \supset C_2 \supset \cdots
\end{displaymath}

so that

\begin{displaymath}P( C_1) \ge P(C_2) \ge \cdots \ge \lim P(C_r) = P(A).
\end{displaymath}

Must prove $P(C_r)=1$ for each r.

Remark: Usually ``for every $\epsilon > 0$'' produces an intersection over a decreasing family of events, because the condition in the definition is harder to satisfy for smaller $\epsilon$. Similarly, unions in definitions tend to be over increasing families.

Now each Cr is an increasing union. Let

\begin{displaymath}C_{N,r} = \bigcap_{n=N}^\infty B_{n,r}
\end{displaymath}

and notice that

\begin{displaymath}C_{1,r} \subset C_{2,r} \subset \cdots
\end{displaymath}

We need to prove that

\begin{displaymath}P(C_r) = \lim_N P(C_{N,r}) = 1
\end{displaymath}

or equivalently that

\begin{displaymath}\lim_NP(C_{N,r}^c) = 0
\end{displaymath}

Proof with 4 moments using Borel-Cantelli Lemma.

Suppose An is a sequence of events. The event

\begin{displaymath}B\equiv \{A_n \text{ infinitely often}\} = \{ A_n \text{ i.o.}\}
\end{displaymath}

is the set of $\omega$ belonging to infinitely many of the $A_n$. We have seen above that

\begin{displaymath}B = \bigcap_{N=1}^\infty \bigcup_{n=N}^\infty A_n
\end{displaymath}

Lemma [Borel-Cantelli]: If $ \sum P(A_n) < \infty$ then $ P(A_n \text{ i.o.}) = 0 $.

To prove the lemma we need to prove that

\begin{displaymath}\lim_{N\to\infty}P(\bigcup_{n=N}^\infty A_n ) = 0
\end{displaymath}

But (this next inequality is quite widely useful; it applies to an arbitrary sequence of events $A_n$)

\begin{displaymath}P(\bigcup_{n=N}^\infty A_n ) \le \sum_{n=N}^\infty P(A_n)
\end{displaymath}

Fact: a convergent series has tail sums which converge to 0, so

\begin{displaymath}\lim_{N\to\infty} \sum_{n=N}^\infty P(A_n) =0
\end{displaymath}

which proves the lemma.
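
A simulation sketch of what the lemma asserts, using independent events $A_n$ with $P(A_n) = n^{-2}$ so that $\sum P(A_n) < \infty$ (independence is not needed for the lemma; it just makes the simulation easy):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(2)
n_max = 10 ** 5

# Independent events A_n with P(A_n) = 1/n^2; the sum of the probabilities is pi^2/6.
p = 1.0 / np.arange(1, n_max + 1) ** 2
occurs = rng.random(n_max) < p

# Borel-Cantelli: only finitely many A_n occur, so the last occurrence is early.
idx = np.flatnonzero(occurs) + 1
print("how many A_n occurred:", idx.size)
print("largest such n:", idx.max() if idx.size else None)
\end{verbatim}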

Apply the lemma: notice that the complement of

\begin{displaymath}C_r = \bigcup_{N=1}^\infty \bigcap_{n=N}^\infty B_{n,r}
\end{displaymath}

is just

\begin{displaymath}C_r^c =\bigcap_{N=1}^\infty \bigcup_{n=N}^\infty B_{n,r}^c
\end{displaymath}

In words, the opposite of ``$\exists N$ such that $B_{n,r}$ happens for every $n \ge N$'' is ``$\forall N$ there is at least one $n \ge N$ for which $B_{n,r}$ does not happen'':

\begin{displaymath}C_r^c = \{ B_{n,r}^c\text{ i.o.}\}
\end{displaymath}

Summary: if we prove

\begin{displaymath}\sum_{n=1}^\infty P(B_{n,r}^c) < \infty
\end{displaymath}

then we will conclude $P(C_r^c) = 0$ so $P(C_r) = 1$.

Next ingredient: Markov's inequality
\begin{align*}P(B_{n,r}^c) & = P(\vert X_1+\cdots+X_n\vert > n/r)
\\
& \le \frac{{\rm E}\left[(X_1+\cdots+X_n)^4\right]}{(n/r)^4}
\end{align*}
Expand out, collect terms:
\begin{multline*}{\rm E}\left[(X_1+\cdots+X_n)^4\right]
\\
= n{\rm E}(X_1^4) + 3n(n-1)\left[{\rm E}(X_1^2)\right]^2
\end{multline*}
(every cross term in which some index appears only once has expectation 0 because the $X_i$ are independent with mean 0). This produces the bound (using $n(n-1) \le n^2$)

\begin{displaymath}P(B_{n,r}^c) \le \frac{r^4{\rm E}(X_1^4)}{n^3} + \frac{3r^4\left[{\rm E}(X_1^2)\right]^2}{n^2}
\end{displaymath}

Since

\begin{displaymath}\sum_n n^{-p} < \infty
\end{displaymath}

for any p>1, the infinite sums converge, proving the SLLN for random variables for which

\begin{displaymath}{\rm E}(X_1^4) < \infty \quad\text{and}\quad {\rm E}(X_1^2) < \infty
\end{displaymath}

In fact $X_1^2 \le 1+X_1^4$ proves that ${\rm E}(X_1^4) < \infty$ implies ${\rm E}(X_1^2) < \infty $.
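
A sanity-check sketch of the fourth-moment bound for standard normal X's, where ${\rm E}(X_1^4)=3$ and ${\rm E}(X_1^2)=1$ (the particular n, r and number of replications are arbitrary):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
n, r, reps = 20, 3, 10 ** 5

# Empirical estimate of P(B_{n,r}^c) = P(|X_1 + ... + X_n|/n > 1/r).
sums = rng.standard_normal((reps, n)).sum(axis=1)
empirical = np.mean(np.abs(sums) / n > 1.0 / r)

# Bound: r^4 E(X^4)/n^3 + 3 r^4 [E(X^2)]^2/n^2 with E(X^4) = 3, E(X^2) = 1.
bound = r ** 4 * 3.0 / n ** 3 + 3.0 * r ** 4 * 1.0 / n ** 2
print("empirical:", empirical, "  bound:", bound)
\end{verbatim}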

Proof with 2 finite moments

Assume ${\rm E}(X_1^2)={\rm var}(X_1) < \infty$.

Use Kronecker's Lemma to find an event inside A with probability 1.

Lemma 1 (Kronecker)   Suppose

\begin{displaymath}0 < b_1 < b_2 < \cdots, \qquad b_n \to \infty
\end{displaymath}

and that $x_n$ is a sequence of real numbers. If

\begin{displaymath}s_n = \sum_{i=1}^n \frac{x_i}{b_i}
\end{displaymath}

is a convergent sequence then

\begin{displaymath}\lim_{n\to\infty} \frac{x_1+\cdots +x_n}{b_n} = 0
\end{displaymath}

The lemma (applied with $b_n = n$ and $x_n = X_n$) implies that

\begin{displaymath}A^* = \{ \sum_1^n X_k/k \text{ is a convergent sequence}\} \subset A
\end{displaymath}

so that we need only prove that P(A*)=1.
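
A simulation sketch of this reduction for standard normal X's (an arbitrary choice with finite variance): the partial sums $\sum_1^n X_k/k$ settle down to a limit, while the sample means, which Kronecker's Lemma controls, drift to 0.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(4)
n_max = 10 ** 6
x = rng.standard_normal(n_max)
k = np.arange(1, n_max + 1)

series = np.cumsum(x / k)   # S_n = sum_{k<=n} X_k/k : converges almost surely
means = np.cumsum(x) / k    # (X_1 + ... + X_n)/n    : tends to 0 by Kronecker

for n in (10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6):
    print(n, series[n - 1], means[n - 1])
\end{verbatim}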

We compute $P(\text{partial sums are Cauchy})$, so we study

\begin{displaymath}\bigcap_{r} \bigcup_{N=1}^\infty \bigcap_{n=N}^\infty
\bigcap_{m=n}^\infty \{\vert S_n-S_m\vert \le 1/r\}
\end{displaymath}

where $S_n = \sum_1^n X_i/i$.

Key ingredient: lower bound on

\begin{displaymath}P\left(\bigcap_{m=n}^\infty \{\vert S_n-S_m\vert \le \epsilon\}\right)
\end{displaymath}

or upper bound for
\begin{multline*}P(\bigcup_{m=n}^\infty \{\vert S_m-S_n\vert>\epsilon\})
\\
= \lim_{N\to\infty}
P(\bigcup_{m=n}^N\{\vert S_m-S_n\vert>\epsilon\})
\end{multline*}

Now $S_m-S_n$ is a sum of m-n independent rvs. We need a bound on

\begin{displaymath}P\left(\bigcup_1^n \{ \vert \sum_1^k Z_i\vert > \epsilon\}\right)
\end{displaymath}

where the $Z_i$ are independent mean 0 random variables.

Theorem [Kolmogorov]: Suppose $Z_1,\ldots,Z_n$ are independent random variables with finite variances and mean 0. Put $S_k = \sum_1^k Z_i$. Then

\begin{displaymath}P(\exists k, 1 \le k \le n : \vert S_k\vert > \epsilon) \le
\frac{{\rm Var}(S_n)}{\epsilon^2}
\end{displaymath}

The proof uses the idea of a stopping time. Define T to be the least value of $k\in\{1,\ldots,n\}$ for which $\vert S_k\vert> \epsilon$ if such a k exists, and T=n if no such k exists. Notice that the event T=k is determined by the variables $Z_1,\ldots,Z_k$; that is, there is a function $f_k(Z_1,\ldots,Z_k)$ such that

\begin{displaymath}1(T=k) =f_k(Z_1,\ldots,Z_k)
\end{displaymath}

Now
\begin{align*}P(\exists k, 1 \le k \le n : \vert S_k\vert > \epsilon)
& = P(\vert S_T\vert > \epsilon)
\\
& \le \frac{{\rm E}(S_T^2)}{\epsilon^2}
\end{align*}
by Chebyshev's inequality.

Next we compare ${\rm E}(S_n^2)$ to ${\rm E}(S_T^2)$. We have
\begin{align*}{\rm E}(S_n^2) & = {\rm E}\left[(S_n-S_T+S_T)^2\right]
\\
& = {\rm E}\left[S_T^2\right] + {\rm E}\left[(S_n-S_T)^2\right]
\\
& \qquad + 2{\rm E}\left[S_T(S_n-S_T)\right]
\end{align*}
The first two terms are non-negative, so if

\begin{displaymath}{\rm E}\left[S_T(S_n-S_T)\right] =0
\end{displaymath}

then ${\rm E}(S_T^2) \le {\rm E}(S_n^2) = {\rm Var}(S_n)$ and the theorem will be proved.

But
\begin{align*}S_T(S_n-S_T) & = \sum_1^{n-1} S_k(S_n-S_k)1(T=k)
\\
& = \sum_1^{n-1} S_k(S_n-S_k)f_k(Z_1,\ldots,Z_k)
\end{align*}
(The term $k=n$ contributes 0 because $S_n-S_T=0$ when T=n.) Notice that $S_n-S_k$ and $S_kf_k(Z_1,\ldots,Z_k)$ are independent so
\begin{align*}{\rm E}\left[S_T(S_n-S_T)\right] & = \sum_1^{n-1} {\rm E}(S_n-S_k)
\\
& \qquad \times{\rm E}\left[S_k1(T=k)\right]
\\
& = 0
\end{align*}
since each ${\rm E}(S_n-S_k) = 0$.
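
A small simulation sketch of the two facts just used: the average of $S_T(S_n-S_T)$ over many replications is near 0, and ${\rm E}(S_T^2)$ does not exceed ${\rm E}(S_n^2)$ (normal Z's and the values n=10, $\epsilon=2$ are arbitrary):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(6)
reps, n, eps = 10 ** 5, 10, 2.0

# Partial sums S_1, ..., S_n of iid N(0,1) Z's, one path per row.
s = np.cumsum(rng.standard_normal((reps, n)), axis=1)

# Stopping time T: first k with |S_k| > eps, else n (stored as a 0-based index).
exceed = np.abs(s) > eps
t = np.where(exceed.any(axis=1), exceed.argmax(axis=1), n - 1)
s_t = s[np.arange(reps), t]
s_n = s[:, -1]

print("E[S_T (S_n - S_T)] ~", np.mean(s_t * (s_n - s_t)))          # near 0
print("E[S_T^2] ~", np.mean(s_t ** 2), " E[S_n^2] ~", np.mean(s_n ** 2))
\end{verbatim}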

Remark: In fact all I needed to prove Kolmogorov's inequality was to know that $S_n-S_k$ was uncorrelated with any function of $S_1,\ldots,S_k$ (or equivalently uncorrelated with any function of $Z_1,\ldots,Z_k$).

A sequence $S_1,S_2,\ldots$ is a martingale if, for each $k \ge 1$,

\begin{displaymath}{\rm E}[S_{k+1}\vert S_1,\ldots,S_k] = S_k
\end{displaymath}

(I have yet to define conditional expectation properly but another definition is that

\begin{displaymath}{\rm E}[ (S_{k+1}-S_k)f_k(S_1,\ldots,S_k)] = 0
\end{displaymath}

for every bounded Borel function fk.)
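
A quick sketch checking this second characterization for the simplest martingale, the random walk $S_k = Z_1+\cdots+Z_k$ with iid mean-0 Z's; the bounded function $f_k = \sin(S_k)$ is an arbitrary choice:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(7)
reps, k = 10 ** 6, 5

# Random-walk martingale: S_j = Z_1 + ... + Z_j with Z's iid N(0,1).
z = rng.standard_normal((reps, k + 1))
s_k = z[:, :k].sum(axis=1)    # S_k
increment = z[:, k]           # S_{k+1} - S_k = Z_{k+1}

# E[(S_{k+1} - S_k) f_k(S_1, ..., S_k)] should be 0 for bounded f_k.
print(np.mean(increment * np.sin(s_k)))   # near 0 up to Monte Carlo error
\end{verbatim}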

Theorem [Kolmogorov]: Suppose $S_1,\ldots,S_n$ is a martingale with finite variances and mean 0. Then

\begin{displaymath}P(\exists k, 1 \le k \le n : \vert S_k\vert > \epsilon) \le
\frac{{\rm Var}(S_n)}{\epsilon^2}\, . \end{displaymath}

Back to strong law. Fix $\epsilon > 0$ and integer m. The events

\begin{displaymath}B_{m,N} = \bigcap_{n=m}^N \{\vert S_n-S_m\vert \le \epsilon\}
\end{displaymath}

decrease, that is, $B_{m,m} \supset B_{m,m+1} \supset \cdots$. So
\begin{align*}P(B_{m,\infty}) &\equiv P( \bigcap_{n=m}^\infty \{\vert S_n-S_m\vert \le \epsilon\})
\\
&
= \lim_N P(B_{m,N})
\end{align*}

Now
\begin{align*}P(B_{m,N}) & = 1-P(B_{m,N}^c)
\\
& \ge 1-\frac{{\rm Var}(S_N-S_m)}{\epsilon^2}
\\
& \ge 1-\frac{{\rm Var}(X_1)}{\epsilon^2} \sum_{i=m}^\infty \frac{1}{i^2}
\end{align*}
using Kolmogorov's inequality for the sums $S_n-S_m$ of the independent variables $X_i/i$, $m < i \le N$, together with ${\rm Var}(S_N-S_m) = {\rm Var}(X_1)\sum_{i=m+1}^N i^{-2}$.
This shows

\begin{displaymath}P(B_{m,\infty}) \ge 1-\frac{{\rm Var}(X_1)}{\epsilon^2} \sum_{i=m}^\infty
\frac{1}{i^2}
\end{displaymath}

Since $\sum_1^\infty i^{-2}<\infty$ the tail sum $\sum_{i=m}^\infty i^{-2}$ goes to 0 as m goes to $\infty$, so the right hand side of this inequality goes to 1. In other words

\begin{displaymath}\lim_{m\to\infty} P(\bigcap_{n=m}^\infty \{\vert S_n-S_m\vert \le \epsilon\}) = 1
\end{displaymath}

This proves that, for each $\epsilon = 1/r$, $P\left(\bigcup_{m} \bigcap_{n=m}^\infty \{\vert S_n-S_m\vert \le \epsilon\}\right) = 1$. Intersecting over r shows that the partial sums $S_n$ are almost surely Cauchy, hence convergent, so $P(A^*)=1$; by Kronecker's Lemma $P(A)=1$, proving the strong law when ${\rm Var}(X_1) < \infty$.


Richard Lockhart
2000-09-20