Reading for Today's Lecture:
Goals of Today's Lecture:
Today's notes
Suppose $T$ is an unbiased estimate of $\theta$. Then
\[
\operatorname{Var}_\theta(T) \ge \frac{1}{I(\theta)},
\]
the Cramér-Rao lower bound (CRLB).
Summary of Implications
What can we do to find UMVUEs when the CRLB is a strict inequality?
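A standard fact about the bound, recalled here for orientation (not spelled out above): for an unbiased estimate $T$ of $\theta$, equality in the CRLB at a given $\theta$ can hold only if the score is linear in $T$, that is,
\[
U(\theta) = I(\theta)\{T-\theta\} \quad\text{with probability } 1 .
\]
In most models no statistic makes the score linear, and then the bound is strict for every unbiased estimate.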
Example: Suppose X has a Binomial(n,p) distribution.
The score function is
\[
U(p) = \frac{X}{p} - \frac{n-X}{1-p} = \frac{X-np}{p(1-p)} .
\]
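As a quick check (standard Binomial facts, not spelled out above), the Fisher information and the variance of the unbiased estimate $X/n$ are
\[
I(p) = \operatorname{Var}_p\{U(p)\} = \frac{np(1-p)}{p^2(1-p)^2} = \frac{n}{p(1-p)},
\qquad
\operatorname{Var}_p\!\left(\frac{X}{n}\right) = \frac{p(1-p)}{n} = \frac{1}{I(p)},
\]
so $X/n$ attains the bound in this example.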
Different tactic: Suppose $T(X)$ is some unbiased (for $p$) function of $X$. Then we have, for every $p\in[0,1]$,
\[
\sum_{x=0}^{n} T(x)\binom{n}{x} p^{x}(1-p)^{n-x} = p .
\]
The left side is a polynomial in $p$ of degree at most $n$; matching coefficients forces $T(x)=x/n$, so $X/n$ is the only unbiased estimate of $p$ that is a function of $X$.
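For instance, with $n=1$ (a small check, not part of the argument above) the identity reads
\[
T(0)(1-p) + T(1)p = T(0) + \{T(1)-T(0)\}p = p \quad\text{for all } p,
\]
which forces $T(0)=0$ and $T(1)=1$, i.e. $T(X)=X=X/n$.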
A Binomial random variable is a sum of $n$ iid Bernoulli($p$) rvs. If $Y_1,\ldots,Y_n$ are iid Bernoulli($p$) then $X = Y_1+\cdots+Y_n$ is Binomial($n,p$). Could we do better than $X/n$ by trying $T(Y_1,\ldots,Y_n)$ for some other function $T$?
Try $n=2$. There are 4 possible values for $(Y_1,Y_2)$. If $h(Y_1,Y_2) = T(Y_1,Y_2) - (Y_1+Y_2)/2$ then unbiasedness of $T$ means that for every $p$
\[
E_p\{h(Y_1,Y_2)\} = h(0,0)(1-p)^2 + [h(0,1)+h(1,0)]\,p(1-p) + h(1,1)p^2 = 0 .
\]
Putting $p=0$ and $p=1$ gives $h(0,0)=h(1,1)=0$, and then $h(0,1)+h(1,0)=0$. Since $E_p(h)=0$,
\[
\operatorname{Cov}_p\Big(\frac{Y_1+Y_2}{2},\,h\Big)
= E_p\Big(\frac{Y_1+Y_2}{2}\,h\Big)
= \frac{1}{2}[h(0,1)+h(1,0)]\,p(1-p) + h(1,1)p^2 .
\]
We have already shown that the sum in [ ] is 0 (and that $h(1,1)=0$)! So the covariance vanishes and
\[
\operatorname{Var}_p(T) = \operatorname{Var}_p\Big(\frac{Y_1+Y_2}{2}\Big) + \operatorname{Var}_p(h) \ge \operatorname{Var}_p\Big(\frac{Y_1+Y_2}{2}\Big),
\]
with equality only when $h\equiv 0$.
This long, algebraically involved method of proving that $\bar{Y}=X/n$ is the UMVUE of $p$ is one special case of a general tactic.
To get more insight I begin by rewriting $E_p\{T(Y_1,\ldots,Y_n)\}$:
\[
E_p\{T(Y_1,\ldots,Y_n)\}
= \sum_{y_1,\ldots,y_n} T(y_1,\ldots,y_n)\, p^{\sum y_i}(1-p)^{n-\sum y_i}
= \sum_{x=0}^{n}\left[\frac{\sum_{y:\,\sum y_i = x} T(y_1,\ldots,y_n)}{\binom{n}{x}}\right]\binom{n}{x} p^{x}(1-p)^{n-x} .
\]
Notice the large fraction in the formula is the average value of $T$ over values of $y$ when $\sum y_i$ is held fixed at $x$. Notice that the weights in this average do not depend on $p$. Notice that this average is actually
\[
E\{T(Y_1,\ldots,Y_n)\mid X = x\} .
\]
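In the $n=2$ case worked above (an illustration, using the same notation), this says
\[
E\{T(Y_1,Y_2)\mid X = 1\} = \frac{T(0,1)+T(1,0)}{2},
\]
a simple average of $T$ over the two configurations with one success, with weights free of $p$.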
Notice these conditional probabilities do not depend on $p$. In a sequence of Bernoulli trials, if I tell you that 5 of 17 were heads and the rest tails, the actual trial numbers of the 5 heads are a random subset of the 17 trials; all $\binom{17}{5}$ possible subsets have the same chance, and this chance does not depend on $p$.
Notice: with data $Y_1,\ldots,Y_n$ the log likelihood is
\[
\ell(p) = X\log(p) + (n-X)\log(1-p),
\]
where $X=\sum Y_i$; it depends on the data only through $X$.
In the binomial situation the conditional distribution of the data given $X$ is the same for all values of $p$; we say this conditional distribution is free of $p$.
Defn: A statistic $T(X)$ is sufficient for the model $\{P_\theta;\ \theta\in\Theta\}$ if the conditional distribution of the data $X$ given $T=t$ is free of $\theta$.
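In symbols (a restatement of the definition for the discrete case): $T$ is sufficient if
\[
P_\theta\big(X = x \mid T(X) = t\big)
\]
does not depend on $\theta$, for every $x$ and every possible value $t$.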
Intuition: Data tell us about $\theta$ if different values of $\theta$ give different distributions to $X$. If two different values of $\theta$ correspond to the same density or cdf for $X$ we cannot distinguish these two values of $\theta$ by examining $X$. Extension of this notion: if two values of $\theta$ give the same conditional distribution of $X$ given $T$, then observing $X$ in addition to $T$ doesn't improve our ability to distinguish the two values.
Mathematically Precise version of this intuition: Suppose $T(X)$ is a sufficient statistic and $S(X)$ is any estimate or confidence interval or .... If you only know the value of $T$ then:

First, generate a new data set $X^*$ (by computer simulation, say) from the conditional distribution of $X$ given $T$; this distribution is known because it is free of $\theta$. Second, compute $S(X^*)$. Then $S(X^*)$ has the same distribution as $S(X)$, so it is just as good as $S(X)$.

You can carry out the first step only if the statistic $T$ is sufficient; otherwise you need to know the true value of $\theta$ to generate $X^*$.
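A sketch of why this works (using the tower property; notation as above): for any bounded function $g$,
\[
E_\theta\{g(S(X^*))\}
= E_\theta\big[E\{g(S(X^*))\mid T\}\big]
= E_\theta\big[E\{g(S(X))\mid T\}\big]
= E_\theta\{g(S(X))\},
\]
because, given $T$, $X^*$ is drawn from exactly the conditional law of $X$ given $T$. So $S(X^*)$ and $S(X)$ have the same distribution under every $\theta$.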
Example 1: $Y_1,\ldots,Y_n$ iid Bernoulli($p$). Given $\sum Y_i = y$, the indexes of the $y$ successes have the same chance of being any one of the $\binom{n}{y}$ possible subsets of $\{1,\ldots,n\}$. This chance does not depend on $p$, so $T=\sum Y_i$ is a sufficient statistic.
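Spelled out (the same computation as in the 17-trial illustration, now for general $n$): for any $y_1,\ldots,y_n\in\{0,1\}$ with $\sum y_i = y$,
\[
P_p\Big(Y_1=y_1,\ldots,Y_n=y_n \,\Big|\, \textstyle\sum Y_i = y\Big)
= \frac{p^{y}(1-p)^{n-y}}{\binom{n}{y}p^{y}(1-p)^{n-y}}
= \frac{1}{\binom{n}{y}},
\]
which is free of $p$.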
Example 2: $X_1,\ldots,X_n$ iid $N(\mu,1)$. The joint distribution of $X_1,\ldots,X_n,\bar{X}$ is multivariate normal. All entries of the mean vector are $\mu$. The variance-covariance matrix can be partitioned as
\[
\begin{pmatrix}
I_{n\times n} & \mathbf{1}/n \\
\mathbf{1}^{\top}/n & 1/n
\end{pmatrix},
\]
where $\mathbf{1}$ is a column of $n$ ones. You can now compute the conditional means and variances of the $X_i$ given $\bar{X}=\bar{x}$ and use the fact that the conditional law is multivariate normal to prove that the conditional distribution of the data given $\bar{X}=\bar{x}$ is multivariate normal with mean vector all of whose entries are $\bar{x}$ and variance-covariance matrix given by
\[
I_{n\times n} - \mathbf{1}\mathbf{1}^{\top}/n .
\]
Since this does not depend on $\mu$ we find that $\bar{X}$ is sufficient.
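A sketch of that computation (the standard multivariate-normal conditioning formulas applied to the blocks above, with $X=(X_1,\ldots,X_n)$, $\Sigma_{12}=\mathbf{1}/n$ and $\Sigma_{22}=1/n$):
\[
E(X\mid \bar{X}=\bar{x}) = \mu\mathbf{1} + \Sigma_{12}\Sigma_{22}^{-1}(\bar{x}-\mu) = \bar{x}\,\mathbf{1},
\qquad
\operatorname{Var}(X\mid \bar{X}=\bar{x}) = I - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21} = I - \mathbf{1}\mathbf{1}^{\top}/n ,
\]
with the unknown $\mu$ cancelling out of the conditional mean.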
WARNING: Whether or not a statistic is sufficient depends on the density function and on the parameter space $\Theta$.
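For instance (an illustration of the warning, not worked out above): if $X_1,\ldots,X_n$ are iid $N(\mu,\sigma^2)$ with $\sigma$ known, $\bar{X}$ is sufficient, but when $\sigma$ is also unknown the conditional law of the data given $\bar{X}$,
\[
N\big(\bar{x}\,\mathbf{1},\ \sigma^2(I - \mathbf{1}\mathbf{1}^{\top}/n)\big),
\]
still depends on $\sigma$, so $\bar{X}$ alone is no longer sufficient; one needs the pair $\big(\bar{X},\sum(X_i-\bar{X})^2\big)$.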
Theorem: [Rao-Blackwell] Suppose $S(X)$ is a sufficient statistic for the model $\{P_\theta;\ \theta\in\Theta\}$. If $T$ is an estimate of $\phi(\theta)$ then:

$E(T\mid S)$ is a statistic (it does not depend on $\theta$, because $S$ is sufficient); it has the same bias as $T$ (in particular, if $T$ is unbiased so is $E(T\mid S)$); and
\[
\operatorname{Var}_\theta\{E(T\mid S)\} \le \operatorname{Var}_\theta(T),
\]
with equality only if $T$ is already a function of $S$. The same inequality holds for the mean squared error.
Proof: Review of conditional distributions: the abstract definition of conditional expectation is:

Defn: $E(Y\mid X)$ is any function of $X$, say $g(X)$, such that
\[
E\{R(X)g(X)\} = E\{R(X)Y\}
\]
for every bounded function $R$.

Fact: If $(X,Y)$ has joint density $f_{X,Y}(x,y)$ and conditional density $f(y\mid x)$ then
\[
g(x) = \int y\, f(y\mid x)\,dy
\]
satisfies the definition.

Proof:
\[
E\{R(X)g(X)\}
= \int R(x)\left\{\int y\, f(y\mid x)\,dy\right\} f_X(x)\,dx
= \int\!\!\int R(x)\,y\, f_{X,Y}(x,y)\,dy\,dx
= E\{R(X)Y\}.
\]
Think of $E(Y\mid X)$ as the average of $Y$ holding $X$ fixed. It behaves like an ordinary expected value, but functions of $X$ alone act like constants.
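With these facts in hand, a sketch of the key variance step in Rao-Blackwell (notation as in the theorem above): because $S$ is sufficient, $E(T\mid S)$ is a genuine statistic, its mean satisfies $E_\theta\{E(T\mid S)\} = E_\theta(T)$, so the bias is unchanged, and
\[
\operatorname{Var}_\theta(T)
= \operatorname{Var}_\theta\{E(T\mid S)\} + E_\theta\{\operatorname{Var}(T\mid S)\}
\ge \operatorname{Var}_\theta\{E(T\mid S)\}.
\]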