Goals for today:
Definition: In the general problem of testing $H_0: \theta \in \Theta_0$ against $H_1: \theta \in \Theta_1$, the level of a test function $\phi$ is
\[
\alpha(\phi) = \sup_{\theta \in \Theta_0} E_\theta[\phi(X)].
\]
Application of the NP lemma: In the $N(\mu, 1)$ model consider testing $H_0: \mu \le 0$ (or $H_0: \mu = 0$) against $H_1: \mu > 0$. The UMP level $\alpha$ test of $H_0$ against $H_1$ is
\[
\phi(X_1, \ldots, X_n) = 1\{ n^{1/2} \bar{X} > z_\alpha \},
\]
where $z_\alpha$ is the upper $\alpha$ point of the standard normal distribution.
Proof: I showed there is a function $g$, which is increasing, and for which
\[
\frac{f_{\mu_1}(X)}{f_0(X)} = g(\bar{X})
\]
for each $\mu_1 > 0$. The NP test of $\mu = 0$ against $\mu = \mu_1$ therefore rejects for large values of $\bar{X}$; since this test does not depend on the particular $\mu_1 > 0$, it is uniformly most powerful.
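This test is easy to carry out numerically. Here is a minimal Python sketch (the function name, data, and the level $\alpha = 0.05$ are illustrative), using the standard library's NormalDist for the normal quantile:

```python
import math
from statistics import NormalDist, fmean

def one_sided_z_test(x, alpha=0.05):
    """UMP level-alpha test of H0: mu <= 0 against H1: mu > 0
    for x1, ..., xn iid N(mu, 1): reject when sqrt(n) * xbar > z_alpha."""
    n = len(x)
    z = math.sqrt(n) * fmean(x)
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper-alpha normal quantile
    return z > z_alpha
```

For instance, a sample of 100 observations with $\bar{X} = 0.5$ gives $n^{1/2}\bar{X} = 5 > z_{0.05} \approx 1.645$, so the test rejects.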
Definition: The family $\{f_\theta : \theta \in \Theta\}$ has monotone likelihood ratio with respect to a statistic $T(X)$ if for each $\theta_1 > \theta_2$ the likelihood ratio $f_{\theta_1}(x)/f_{\theta_2}(x)$ is a monotone increasing function of $T(x)$.
Theorem: For a monotone likelihood ratio family the Uniformly Most Powerful level $\alpha$ test of $H_0: \theta \le \theta_0$ (or of $H_0: \theta = \theta_0$) against the alternative $H_1: \theta > \theta_0$ is
\[
\phi(x) =
\begin{cases}
1 & T(x) > t_\alpha \\
\gamma & T(x) = t_\alpha \\
0 & T(x) < t_\alpha,
\end{cases}
\]
where $t_\alpha$ and $\gamma \in [0, 1]$ are chosen so that $E_{\theta_0}[\phi(X)] = \alpha$.
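The Binomial$(n, \theta)$ family has monotone likelihood ratio in $T(X) = X$, so the theorem applies; because $X$ is discrete, randomization on the boundary is needed to hit the level exactly. A Python sketch (function names illustrative) computes the cutoff $t_\alpha$ and the randomization probability $\gamma$:

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def ump_binomial_test(n, theta0, alpha):
    """Cutoff t and randomization gamma for the UMP level-alpha test of
    theta <= theta0 against theta > theta0 when X ~ Binomial(n, theta):
    reject if X > t; reject with probability gamma if X == t."""
    pmf = [binom_pmf(k, n, theta0) for k in range(n + 1)]
    t = 0
    while sum(pmf[t + 1:]) > alpha:  # smallest t with P(X > t) <= alpha
        t += 1
    gamma = (alpha - sum(pmf[t + 1:])) / pmf[t]
    return t, gamma

t, gamma = ump_binomial_test(10, 0.5, 0.05)
# exact level: P(X > t) + gamma * P(X = t)
level = sum(binom_pmf(k, 10, 0.5) for k in range(t + 1, 11)) + gamma * binom_pmf(t, 10, 0.5)
```

For $n = 10$, $\theta_0 = 1/2$, $\alpha = 0.05$ this gives $t_\alpha = 8$ and $\gamma \approx 0.893$, and the level is exactly $0.05$.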
A typical family where this will work is a one parameter exponential
family. In almost any other problem the method doesn't work and there
is no uniformly most powerful test. For instance, to test $H_0: \mu = 0$ against the two sided alternative $H_1: \mu \ne 0$ there is no UMP level $\alpha$ test. If there were, its power at each $\mu > 0$ would have to be as high as that of the one sided level $\alpha$ test, and so its rejection region would have to be the same as that test's, rejecting for large positive values of $n^{1/2} \bar{X}$. But it also has to have power as good as the one sided test for the alternatives $\mu < 0$, and so would have to reject for large negative values of $n^{1/2} \bar{X}$. This would make its level too large.
The favourite test is the usual 2 sided test which rejects for large values of $|n^{1/2} \bar{X}|$, with the critical value chosen appropriately. This test maximizes the power subject to two constraints: first, that the level be $\alpha$; and second, that the power be minimized at $\mu = 0$. This second condition is really the requirement that the power on the alternative be larger than it is on the null.
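A quick Python sketch of the power function of this two sided test (sample size and level chosen for illustration) shows that it equals $\alpha$ at $\mu = 0$, is symmetric, and exceeds $\alpha$ away from 0:

```python
from math import sqrt
from statistics import NormalDist

norm = NormalDist()
n, alpha = 25, 0.05
c = norm.inv_cdf(1 - alpha / 2)  # two-sided critical value z_{alpha/2}

def power(mu):
    """Power of the test rejecting when |sqrt(n) * xbar| > c for N(mu, 1) data;
    sqrt(n) * xbar ~ N(sqrt(n) * mu, 1)."""
    return 1 - norm.cdf(c - sqrt(n) * mu) + norm.cdf(-c - sqrt(n) * mu)
```

Evaluating `power` at 0 returns $\alpha$ exactly, while `power(0.3)` and `power(-0.3)` are both well above $\alpha$.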
Definition: A test $\phi$ of $H_0: \theta \in \Theta_0$ against $H_1: \theta \in \Theta_1$ is unbiased level $\alpha$ if it has level $\alpha$ and, for every $\theta \in \Theta_1$, we have
\[
\pi_\phi(\theta) = E_\theta[\phi(X)] \ge \alpha.
\]
When testing a point null hypothesis like $H_0: \mu = \mu_0$ this requires that the power function be minimized at $\mu_0$, which will mean that if $\pi$ is differentiable then
\[
\pi'(\mu_0) = 0.
\]
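For the two sided z test of $\mu_0 = 0$ in the $N(\mu, 1)$ problem (rejecting when $|n^{1/2}\bar{X}| > z_{\alpha/2}$) this condition can be checked directly:
\begin{align*}
\pi(\mu) &= 1 - \Phi(z_{\alpha/2} - n^{1/2}\mu) + \Phi(-z_{\alpha/2} - n^{1/2}\mu), \\
\pi'(\mu) &= n^{1/2}\left[ \varphi(z_{\alpha/2} - n^{1/2}\mu) - \varphi(-z_{\alpha/2} - n^{1/2}\mu) \right],
\end{align*}
so $\pi'(0) = n^{1/2}[\varphi(z_{\alpha/2}) - \varphi(z_{\alpha/2})] = 0$ by symmetry of the standard normal density $\varphi$.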
In the $N(\mu, 1)$ problem there is a version of the Neyman Pearson lemma which proves that the Uniformly Most Powerful Unbiased test rejects for large values of $|n^{1/2} \bar{X}|$.
A test $\phi^*$ is a Uniformly Most Powerful Unbiased level $\alpha$ test if it is unbiased level $\alpha$ and if, for every unbiased level $\alpha$ test $\phi$ and every $\theta \in \Theta_1$,
\[
\pi_{\phi^*}(\theta) \ge \pi_\phi(\theta).
\]
Conclusion: The two sided z test which rejects if $|Z| > z_{\alpha/2}$, where $Z = n^{1/2} \bar{X}$, is the UMPU level $\alpha$ test.
What good can be said about the t-test? It's UMPU.
Suppose $X_1, \ldots, X_n$ are iid $N(\mu, \sigma^2)$ and that we want to test $\mu = \mu_0$, or $\mu \le \mu_0$, against $\mu > \mu_0$.
Notice that the parameter space is two dimensional and that the boundary between the null and alternatives is
\[
\{ (\mu_0, \sigma) : \sigma > 0 \}.
\]
If a test has $\pi(\mu, \sigma) \le \alpha$ for all $\mu \le \mu_0$ and all $\sigma$, and $\pi(\mu, \sigma) \ge \alpha$ for all $\mu > \mu_0$ and all $\sigma$, then we must have $\pi(\mu_0, \sigma) = \alpha$ for all $\sigma$, because the power function of any test must be continuous.
It is possible to use these facts and ideas of sufficiency and completeness to prove that the t test is UMPU for the one and two sided problems.
For general composite hypotheses optimality theory is not usually successful in producing an optimal test. Instead we look for heuristics to guide our choices. The simplest approach is to consider the likelihood ratio
\[
\frac{\sup_{\theta \in \Theta_1} f_\theta(X)}{\sup_{\theta \in \Theta_0} f_\theta(X)}
\]
and to reject the null hypothesis when this ratio is large.
Example 1: In the $N(\mu, 1)$ problem suppose we want to test $\mu \le 0$ against $\mu > 0$. (Remember there is a UMP level $\alpha$ test.) The log likelihood function is
\[
\ell(\mu) = -\frac{n (\bar{X} - \mu)^2}{2}
\]
up to additive constants not involving $\mu$.
Example 2: In the $N(\mu, 1)$ problem suppose we make the null $\mu = 0$. Then the value of $\hat{\mu}_0$ is simply 0, while the maximum of the log-likelihood over the alternative $\mu \ne 0$ occurs at the global mle $\hat{\mu} = \bar{X}$. This gives the log-likelihood ratio statistic
\[
\Lambda = 2[\ell(\bar{X}) - \ell(0)] = n \bar{X}^2.
\]
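This identity is easy to check numerically. The following Python sketch (the data are arbitrary illustrative values) evaluates the $N(\mu, 1)$ log-likelihood at $\hat{\mu} = \bar{X}$ and at 0:

```python
import math

def loglik(mu, x):
    """N(mu, 1) log-likelihood."""
    return sum(-0.5 * (xi - mu) ** 2 - 0.5 * math.log(2 * math.pi) for xi in x)

x = [0.3, -0.1, 0.8, 0.4, 0.2]  # illustrative data
n, xbar = len(x), sum(x) / len(x)
lam = 2 * (loglik(xbar, x) - loglik(0.0, x))  # agrees with n * xbar**2
```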
Example 3: For the $N(\mu, \sigma^2)$ problem testing $\mu = \mu_0$ against $\mu \ne \mu_0$ we must find two estimates of $(\mu, \sigma^2)$. The maximum of the likelihood over the alternative occurs at the global mle $(\bar{X}, \hat{\sigma}^2)$, where $\hat{\sigma}^2 = \sum (X_i - \bar{X})^2 / n$. We find
\[
\ell(\hat{\mu}, \hat{\sigma}) = -\frac{n}{2} \log \hat{\sigma}^2 - \frac{n}{2}
\]
up to additive constants. We also need to maximize the log-likelihood over the null hypothesis; with $\mu = \mu_0$ fixed the maximum occurs at $\hat{\sigma}_0^2 = \sum (X_i - \mu_0)^2 / n$. Recall that
\[
\hat{\sigma}_0^2 = \hat{\sigma}^2 + (\bar{X} - \mu_0)^2,
\]
so that
\[
\Lambda = n \log\left( \frac{\hat{\sigma}_0^2}{\hat{\sigma}^2} \right) = n \log\left( 1 + \frac{t^2}{n - 1} \right),
\]
where $t = n^{1/2}(\bar{X} - \mu_0)/s$ is the usual $t$ statistic. Notice that if $n$ is large we have $\Lambda \approx t^2$.
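Again the algebra can be verified numerically. In this Python sketch (data and $\mu_0$ are illustrative) the two expressions for $\Lambda$ agree, as does the decomposition of $\hat{\sigma}_0^2$:

```python
import math

x = [1.2, 0.7, 1.9, 1.1, 0.4, 1.6]  # illustrative data
mu0 = 1.0
n = len(x)
xbar = sum(x) / n
sigma2_hat = sum((xi - xbar) ** 2 for xi in x) / n   # global mle of sigma^2
sigma2_null = sum((xi - mu0) ** 2 for xi in x) / n   # mle of sigma^2 when mu = mu0
lam = n * math.log(sigma2_null / sigma2_hat)

s = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))  # sample sd
t = math.sqrt(n) * (xbar - mu0) / s                          # usual t statistic
```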
This is a general phenomenon when the null hypothesis being tested is of the form $\gamma = \gamma_0$, fixing part of the parameter vector.
Here is the general theory. Suppose that the vector of $p + q$ parameters $\theta$ can be partitioned into $\theta = (\gamma, \phi)$, with $\gamma$ a vector of $p$ parameters and $\phi$ a vector of $q$ parameters. To test $\gamma = \gamma_0$ we find two mles of $\theta$. First the global mle
\[
\hat{\theta} = (\hat{\gamma}, \hat{\phi})
\]
maximizes the likelihood over all of $\Theta$; this is the same as the maximum over the alternative, because typically the probability that $\hat{\gamma}$ is exactly $\gamma_0$ is 0. Now we maximize the likelihood over the null hypothesis, that is, we find
\[
\hat{\theta}_0 = (\gamma_0, \hat{\phi}_0)
\]
to maximize $\ell(\gamma_0, \phi)$ over $\phi$.
Now suppose that the true value of $\theta$ is $\theta_0 = (\gamma_0, \phi_0)$ (so that the null hypothesis is true). The score function is a vector of length $p + q$ and can be partitioned as $U = (U_\gamma, U_\phi)$. The Fisher information matrix can be partitioned as
\[
\mathcal{I} =
\begin{pmatrix}
\mathcal{I}_{\gamma\gamma} & \mathcal{I}_{\gamma\phi} \\
\mathcal{I}_{\phi\gamma} & \mathcal{I}_{\phi\phi}
\end{pmatrix}.
\]
According to our large sample theory for the mle we have
\[
\hat{\theta} - \theta_0 \approx \mathcal{I}^{-1} U.
\]
Theorem: The log-likelihood ratio statistic
\[
\Lambda = 2[\ell(\hat{\theta}) - \ell(\hat{\theta}_0)]
\]
converges in distribution to a $\chi^2_p$ random variable when the null hypothesis is true.
Aside:
Theorem: Suppose that $X \sim MVN(0, \Sigma)$ with $\Sigma$ non-singular and $M$ is a symmetric matrix. If $M \Sigma M = M$ then $X^t M X$ has a $\chi^2$ distribution with degrees of freedom $\nu = \operatorname{trace}(M \Sigma)$.
Proof: We have $X = AZ$ where $A A^t = \Sigma$ and $Z$ is standard multivariate normal. So
\[
X^t M X = Z^t A^t M A Z.
\]
Let $Q = A^t M A$. Since $A A^t = \Sigma$ we have $Q Q = A^t M \Sigma M A$, so the condition $M \Sigma M = M$ in the theorem is actually $Q Q = Q$.
The matrix $Q$ is symmetric and so can be written in the form $Q = P D P^t$, where $D$ is a diagonal matrix containing the eigenvalues $\lambda_i$ of $Q$ and $P$ is an orthogonal matrix whose columns are the corresponding orthonormal eigenvectors. It follows that we can rewrite
\[
Z^t Q Z = (P^t Z)^t D (P^t Z) = \sum_i \lambda_i W_i^2,
\]
where $W = P^t Z$ is again standard multivariate normal. We have established that the general distribution of any quadratic form $X^t M X$ is a linear combination of $\chi^2_1$ variables.
Now go back to the condition $Q Q = Q$. If $\lambda$ is an eigenvalue of $Q$ and $v \ne 0$ is a corresponding eigenvector, then
\[
Q Q v = \lambda Q v = \lambda^2 v
\]
but also
\[
Q Q v = Q v = \lambda v.
\]
Thus $\lambda^2 = \lambda$. It follows that either $\lambda = 0$ or $\lambda = 1$. This means that the weights in the linear combination are all 1 or 0 and that $X^t M X$ has a $\chi^2$ distribution with degrees of freedom, $\nu$, equal to the number of $\lambda_i$ which are equal to 1. This is the same as the sum of the $\lambda_i$, so
\[
\nu = \operatorname{trace}(D) = \operatorname{trace}(Q) = \operatorname{trace}(A^t M A) = \operatorname{trace}(M A A^t) = \operatorname{trace}(M \Sigma).
\]
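A concrete check of the condition $M \Sigma M = M$ and of the trace formula, in the simple special case $M = \Sigma^{-1}$ (so that $X^t M X$ is the Mahalanobis quadratic form and $\nu$ equals the dimension of $X$), can be sketched in Python with a $2 \times 2$ example:

```python
def matmul(A, B):
    """2x2 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

Sigma = [[2.0, 1.0], [1.0, 2.0]]          # a non-singular covariance matrix
M = [[2 / 3, -1 / 3], [-1 / 3, 2 / 3]]    # Sigma^{-1} (det(Sigma) = 3)

MSM = matmul(matmul(M, Sigma), M)                    # should reproduce M
nu = sum(matmul(M, Sigma)[i][i] for i in range(2))   # trace(M Sigma): the df
```

Here `MSM` equals `M` and `nu` equals 2, so $X^t \Sigma^{-1} X \sim \chi^2_2$, as the theorem asserts.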
In the application $\Sigma$ is the Fisher information $\mathcal{I}$, the variance of the score vector $U$, and
\[
M = \mathcal{I}^{-1} -
\begin{pmatrix}
0 & 0 \\
0 & \mathcal{I}_{\phi\phi}^{-1}
\end{pmatrix},
\]
where