Undergraduate version of the central limit theorem:
if $X_1,\ldots,X_n$ are iid from a population with mean $\mu$ and standard deviation $\sigma$ then $\bar{X}$ has approximately a $N(\mu,\sigma^2/n)$ distribution.
Also a Binomial$(n,p)$ random variable has approximately a $N(np,\,np(1-p))$ distribution.
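As a quick numerical check of the Binomial approximation (the numbers here are illustrative, not from the original notes): for $X \sim \text{Binomial}(100, 1/2)$ we have $np = 50$ and $\sqrt{np(1-p)} = 5$, so with a continuity correction
\[
P(X \le 55) \approx \Phi\!\left(\frac{55.5 - 50}{5}\right) = \Phi(1.1) \approx 0.864,
\]
which is close to the exact Binomial probability.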
Precise meaning of statements like ``$X$ and $Y$ have approximately the same distribution''?
Desired meaning: $X$ and $Y$ have nearly the same cdf.
But care needed.
Q1) If
is a large number is the
distribution close to the distribution of
?
Q2) Is
close to the
distribution?
Q3) Is
close to
distribution?
Q4) If
is the distribution
of
close to that of
?
Answers depend on how close ``close'' needs to be, so it's a matter of definition.
In practice the usual sort of approximation we want to make is to say that some random variable $X$, say, has nearly some continuous distribution, like $N(0,1)$.
So: want to know that probabilities like $P(X > x)$ are nearly $P(N(0,1) > x)$.
Real difficulty: case of discrete random variables or infinite dimensions: not done in this course.
Mathematicians' meaning of close:
Either they can provide an upper bound on the distance between the two things or they are talking about taking a limit.
In this course we take limits.
Definition: A sequence of random variables $X_n$ converges in distribution to a random variable $X$ if
\[
E\{g(X_n)\} \to E\{g(X)\}
\]
for every bounded continuous function $g$.
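For real-valued random variables this is equivalent to the familiar cdf formulation (a standard fact, recorded here for reference):
\[
P(X_n \le x) \to P(X \le x) \quad \text{at every } x \text{ at which the cdf of } X \text{ is continuous.}
\]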
Now let's go back to the questions I asked:
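One simple calculation of the kind these answers turn on (an illustration supplied here, not quoted from the questions above): if $Z_n \sim N(0,1/n)$ then $Z_n$ converges in distribution to the constant $0$, since for $x \ne 0$
\[
P(Z_n \le x) = \Phi(\sqrt{n}\,x) \to \begin{cases} 0 & x < 0 \\ 1 & x > 0, \end{cases}
\]
which matches the cdf of the constant $0$ everywhere except at $x = 0$, the one point where that cdf is discontinuous and where convergence is not required.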
Summary: to derive approximate distributions:
Show that a sequence of rvs $X_n$ converges in distribution to some $X$.
The limit distribution (i.e. distribution of $X$) should be non-trivial, like say $N(0,1)$.
Don't say: $\bar{X}_n$ is approximately $N(\mu,\sigma^2/n)$.
Do say: $n^{1/2}(\bar{X}_n-\mu)/\sigma$ converges to $N(0,1)$ in distribution.
The Central Limit Theorem
If $X_1, X_2, \ldots$ are iid with mean 0 and variance 1 then $n^{1/2}\bar{X}$ converges in distribution to $N(0,1)$. That is,
\[
P(n^{1/2}\bar{X} \le x) \to \Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-y^2/2}\,dy .
\]
Proof: As before, the argument runs through characteristic functions.
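A sketch of that calculation (the displayed steps here are reconstructed, not taken verbatim from the original): since $E(X_1) = 0$ and $E(X_1^2) = 1$,
\[
\phi_{n^{1/2}\bar{X}}(t) = \left[\phi_{X_1}\!\left(t/\sqrt{n}\right)\right]^n
= \left[1 - \frac{t^2}{2n} + o(1/n)\right]^n \to e^{-t^2/2},
\]
and $e^{-t^2/2}$ is the characteristic function of $N(0,1)$, so the continuity theorem for characteristic functions gives the result.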
Edgeworth expansions
In fact if $\gamma = E(X_1^3)$ then
\begin{multline*}
\log(\phi(t)) \approx
\\ [-t^2/2 -i\gamma t^3/6 +\cdots]
\\
-[\cdots]^2/2 +\cdots
\end{multline*}
Now apply this calculation to $n^{1/2}\bar{X}$:
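A reconstruction of that step (the original displayed formula is missing):
\[
\log \phi_{n^{1/2}\bar{X}}(t) = n \log \phi(t/\sqrt{n})
\approx -\frac{t^2}{2} - \frac{i\gamma t^3}{6\sqrt{n}} + \cdots,
\]
so
\[
\phi_{n^{1/2}\bar{X}}(t) \approx e^{-t^2/2}\left(1 - \frac{i\gamma t^3}{6\sqrt{n}} + \cdots\right);
\]
the leading factor is the $N(0,1)$ characteristic function and the first correction term is of order $n^{-1/2}$.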
Remarks:
Multivariate convergence in distribution
Definition: $X_n \in \mathbb{R}^p$ converges in distribution to $X \in \mathbb{R}^p$ if
\[
E\{g(X_n)\} \to E\{g(X)\}
\]
for every bounded continuous real-valued function $g$ on $\mathbb{R}^p$.
This is equivalent to either of
Cramér-Wold Device: $a^T X_n$ converges in distribution to $a^T X$ for each $a \in \mathbb{R}^p$
or
Convergence of characteristic functions:
\[
E\left(e^{i a^T X_n}\right) \to E\left(e^{i a^T X}\right) \quad \text{for each } a \in \mathbb{R}^p .
\]
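To show how the Cramér-Wold device is typically used (a standard argument, added here as an illustration): suppose $Y_1, Y_2, \ldots$ are iid random vectors in $\mathbb{R}^p$ with mean $\mu$ and covariance matrix $\Sigma$. For any fixed $a$, the variables $a^T Y_i$ are iid with mean $a^T\mu$ and variance $a^T\Sigma a$, so the univariate CLT gives
\[
n^{1/2} a^T(\bar{Y} - \mu) \text{ converging in distribution to } N(0, a^T\Sigma a),
\]
which is the distribution of $a^T Z$ for $Z \sim MVN(0,\Sigma)$. Since this holds for every $a$, the device gives convergence in distribution of $n^{1/2}(\bar{Y} - \mu)$ to $MVN(0,\Sigma)$.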
Extensions of the CLT
Slutsky's Theorem: If $X_n$ converges in distribution to $X$ and $Y_n$ converges in distribution (or in probability) to $c$, a constant, then $X_n + Y_n$ converges in distribution to $X + c$. More generally, if $f(x,y)$ is continuous then $f(X_n, Y_n)$ converges in distribution to $f(X, c)$.
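A typical application (a standard example, supplied here rather than copied from the original): if $X_1,\ldots,X_n$ are iid with mean $\mu$ and variance $\sigma^2 > 0$, and $s$ is the sample standard deviation, then
\[
\frac{n^{1/2}(\bar{X}-\mu)}{s} = f\!\left(\frac{n^{1/2}(\bar{X}-\mu)}{\sigma}, \frac{s}{\sigma}\right)
\quad\text{with } f(x,y) = x/y
\]
converges in distribution to $N(0,1)$: the first argument converges in distribution to $N(0,1)$ by the CLT, $s/\sigma$ converges in probability to the constant 1, and $f$ is continuous wherever $y \ne 0$.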
Warning: the hypothesis that the limit of $Y_n$ be constant is essential.
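A standard counterexample (added here for illustration): take $Z \sim N(0,1)$ and put $X_n = Z$ and $Y_n = -Z$ for every $n$. Then $X_n$ and $Y_n$ each converge in distribution to $N(0,1)$, but
\[
X_n + Y_n = 0 \quad \text{for all } n,
\]
which is not the $N(0,2)$ limit one would get from adding independent limits; when the limit of $Y_n$ is not constant, the joint behaviour of $(X_n, Y_n)$ matters.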
Definition: We say $Y_n$ converges to $Y$ in probability if for each $\epsilon > 0$
\[
P(|Y_n - Y| > \epsilon) \to 0 .
\]
The fact is that for $Y$ constant, convergence in distribution and in
probability are the same. In general convergence in probability implies
convergence in distribution. Both of these are weaker than almost sure
convergence:
Definition: We say $Y_n$ converges to $Y$ almost surely if
\[
P\left(\lim_{n\to\infty} Y_n = Y\right) = 1 .
\]
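The relationships just described can be summarized as (a reference line added here, not part of the original):
\[
Y_n \to Y \text{ almost surely} \;\Longrightarrow\; Y_n \to Y \text{ in probability} \;\Longrightarrow\; Y_n \to Y \text{ in distribution},
\]
with the last implication reversible when $Y$ is a constant.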
The delta method: Suppose $a_n(Y_n - y)$ converges in distribution to a random variable $X$ for some constants $a_n \to \infty$ and $y$ (so in particular $Y_n$ converges to $y$ in probability). If $f$ is differentiable at $y$ then
\[
a_n\{f(Y_n) - f(y)\} \text{ converges in distribution to } f'(y)X .
\]
In the vector case, with $Y_n, y \in \mathbb{R}^p$ and $f: \mathbb{R}^p \to \mathbb{R}^q$, $f'(y)$ is the $q \times p$ matrix of first derivatives of the components of $f$.
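The heuristic behind the result (a one-line Taylor argument added here; the rigorous proof controls the remainder): since $Y_n \to y$ and $f$ is differentiable at $y$,
\[
f(Y_n) \approx f(y) + f'(y)(Y_n - y),
\qquad\text{so}\qquad
a_n\{f(Y_n) - f(y)\} \approx f'(y)\, a_n(Y_n - y),
\]
and the right-hand side converges in distribution to $f'(y)X$.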
Example: Suppose $X_1,\ldots,X_n$ are a sample from a population with
mean $\mu$, variance $\sigma^2$, and third and fourth central moments $\mu_3$ and $\mu_4$. Then the limiting distribution of the sample variance, written here as $s^2 = \overline{X^2} - \bar{X}^2$ (the $n$-denominator form; the $n-1$ form differs by an asymptotically negligible factor), can be derived with the delta method.
Take $Y_i = (X_i^2, X_i)^T$, so that $\bar{Y} = (\overline{X^2}, \bar{X})^T$ and $E(Y_1) = (\mu^2+\sigma^2, \mu)^T$.
Then $n^{1/2}\{\bar{Y} - E(Y_1)\}$ converges to $MVN(0,\Sigma)$, where $\Sigma$ is the covariance matrix of $Y_1$.
Take $a_n = n^{1/2}$ and $y = E(Y_1)$. Then the hypotheses of the delta method hold.
Define $g(a,b) = a - b^2$.
Then $s^2 = g(\bar{Y})$ and $\sigma^2 = g(E(Y_1))$. The gradient of $g$ has components $(1, -2b)$. This leads to
\begin{multline*}
n^{1/2}(s^2-\sigma^2) \approx
\\
n^{1/2}[1, -2\mu]
\left[\begin{array}{c}
\overline{X^2} - (\mu^2 + \sigma^2)
\\
\bar{X} -\mu
\end{array}\right]
\end{multline*}
Remark: In this sort of problem it is best to learn to recognize that the
sample variance is unaffected by subtracting $\mu$ from each $X_i$. Thus
there is no loss in assuming $\mu = 0$, which simplifies $\Sigma$ and the gradient $[1, -2\mu]$.
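Carrying that simplification through (the following formulas are reconstructed here, not quoted from the original): with $\mu = 0$,
\[
\Sigma = \left[\begin{array}{cc}
\mu_4 - \sigma^4 & \mu_3 \\
\mu_3 & \sigma^2
\end{array}\right],
\qquad
[1, -2\mu] = [1, 0],
\]
so the asymptotic variance is $[1,0]\,\Sigma\,[1,0]^T = \mu_4 - \sigma^4$ and $n^{1/2}(s^2 - \sigma^2)$ converges in distribution to $N(0, \mu_4 - \sigma^4)$.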
Special case: if the observations are $N(\mu,\sigma^2)$ then $\mu_3 = 0$ and $\mu_4 = 3\sigma^4$. Our calculation has asymptotic variance $\mu_4 - \sigma^4 = 2\sigma^4$, so $n^{1/2}(s^2-\sigma^2)$ converges in distribution to $N(0, 2\sigma^4)$.