STAT 450

Lecture 28

Last time: Exponential Families, Lehmann-Scheffé Theorem

If $X_1,\ldots,X_n$ are iid with density

\begin{displaymath}f(x_j,\theta) = h(x_j) \exp\{\sum_{i=1}^p a_i(\theta)S_i(x_j)+c(\theta)\}
\end{displaymath}

then the joint density of the data is

\begin{displaymath}\prod_j h(x_j) \exp\{\sum_{i=1}^p a_i(\theta) \sum_j S_i(x_j)+nc(\theta)\}
\end{displaymath}

If the range of the function $(a_1(\theta),\ldots,a_p(\theta))$ (as $\theta$ varies over $\Theta$) contains a (hyper-) rectangle in $R^p$ then the statistic

\begin{displaymath}(\sum_j S_1(X_j), \ldots, \sum_j S_p(X_j))
\end{displaymath}

is complete and sufficient.
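
For example, the $N(\mu,\sigma^2)$ density can be written in this form:

\begin{displaymath}f(x;\mu,\sigma^2) = \frac{1}{\sqrt{2\pi}}\exp\left\{\frac{\mu}{\sigma^2}x - \frac{1}{2\sigma^2}x^2 - \frac{\mu^2}{2\sigma^2} - \log\sigma\right\}
\end{displaymath}

so that $p=2$, $S_1(x)=x$, $S_2(x)=x^2$, $a_1(\theta)=\mu/\sigma^2$ and $a_2(\theta)=-1/(2\sigma^2)$. As $(\mu,\sigma^2)$ ranges over $R\times(0,\infty)$ the point $(a_1,a_2)$ ranges over the half plane $R\times(-\infty,0)$, which certainly contains a rectangle, so $(\sum_j X_j, \sum_j X_j^2)$ is complete and sufficient.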

The Lehmann-Scheffé Theorem

Theorem: If S is a complete sufficient statistic for some model and h(S) is an unbiased estimate of some parameter $\phi(\theta)$ then h(S) is the UMVUE of $\phi(\theta)$.

Example: In the $N(\mu,\sigma^2)$ example $(\bar{X},s^2)$ is complete and sufficient, so the UMVUEs of $\mu$, $\sigma^2$ and $\sigma^4$ are $\bar{X}$, $s^2$ and $(n-1)s^4/(n+1)$.
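
The $\sigma^4$ claim can be checked directly: since $(n-1)s^2/\sigma^2 \sim \chi^2_{n-1}$ and $E\{(\chi^2_{n-1})^2\} = (n-1)(n+1)$ we get

\begin{displaymath}E(s^4) = \frac{\sigma^4(n-1)(n+1)}{(n-1)^2} = \frac{n+1}{n-1}\sigma^4
\end{displaymath}

so $(n-1)s^4/(n+1)$ is unbiased for $\sigma^4$ and, being a function of the complete sufficient statistic, is the UMVUE.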

Criticism of Unbiasedness

1.
The UMVUE can be inadmissible for squared error loss, meaning that there is a (biased, of course) estimate whose MSE is smaller for every parameter value. An example is the UMVUE of $\phi=p(1-p)$, which is $\hat\phi =n\hat{p}(1-\hat{p})/(n-1)$. The MSE of

\begin{displaymath}\tilde{\phi} = \min(\hat\phi,1/4)
\end{displaymath}

is smaller than that of $\hat\phi$. Another example is provided by estimation of $\sigma^2$ in the $N(\mu,\sigma^2)$ problem; see the homework. (A small simulation, sketched after this list, illustrates the Binomial comparison.)

2.
There are examples where unbiased estimation is impossible. The log odds in a Binomial model is $\phi=\log(p/(1-p))$. The expectation of any function $g(X)$ of a Binomial$(n,p)$ count is $\sum_{k=0}^n g(k)\binom{n}{k}p^k(1-p)^{n-k}$, which is a polynomial in p of degree at most n. Since $\phi$ is not a polynomial function of p, there is no unbiased estimate of $\phi$.

3.
The UMVUE of $\sigma$ is not the square root of the UMVUE of $\sigma^2$. This method of estimation does not have the parameterization equivariance that maximum likelihood does.

4.
Unbiasedness is irrelevant (unless you plan to average together many estimators). The property is an average over possible values of the estimate, in which positive errors are allowed to cancel negative errors. An exception to this criticism arises when you plan to average a number of estimators to get a single estimator: then it is a problem if all the estimators have the same bias. In Assignment 5 you have the one-way layout example, in which the MLE of the residual variance averages together many biased estimates and so is very badly biased. That assignment shows that the solution is not really to insist on unbiasedness but to consider an alternative to averaging for putting the individual estimates together.
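
Here is a small simulation sketch (in Python; the sample size, parameter grid and replication count are made up for illustration) comparing the MSE of $\hat\phi$ and $\tilde\phi$ in the Binomial example of point 1; the truncated estimator should come out with smaller MSE at every p, if only slightly for p near 0 or 1.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
n, nrep = 10, 200000     # hypothetical sample size and number of replications

for p in [0.1, 0.3, 0.5, 0.7, 0.9]:
    phi = p * (1 - p)                            # target parameter
    x = rng.binomial(n, p, size=nrep)            # Binomial counts
    phat = x / n
    phi_hat = n * phat * (1 - phat) / (n - 1)    # UMVUE of p(1-p)
    phi_tilde = np.minimum(phi_hat, 0.25)        # truncated (biased) competitor
    mse_hat = np.mean((phi_hat - phi) ** 2)
    mse_tilde = np.mean((phi_tilde - phi) ** 2)
    print("p=%.1f  MSE(UMVUE)=%.5f  MSE(truncated)=%.5f"
          % (p, mse_hat, mse_tilde))
\end{verbatim}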

Minimal Sufficiency

In any model the statistic $S(X)\equiv X$ is sufficient. In any iid model the vector of order statistics $X_{(1)}, \ldots, X_{(n)}$ is sufficient. In the $N(\mu,1)$ model we have three possible sufficient statistics:

1.
$S_1 = (X_1,\ldots,X_n)$.

2.
$S_2 = (X_{(1)}, \ldots, X_{(n)})$.

3.
$S_3 = \bar{X}$.

Notice that I can calculate $S_3$ from the values of $S_1$ or $S_2$ but not vice versa, and that I can calculate $S_2$ from $S_1$ but not vice versa. It turns out that $\bar{X}$ is a minimal sufficient statistic, meaning that it is a function of any other sufficient statistic. (You can't collapse the data set any more without losing information about $\mu$.)

To recognize minimal sufficient statistics you look at the likelihood function:

Fact: If you fix some particular $\theta^*$ then the log likelihood ratio function

\begin{displaymath}\ell(\theta)-\ell(\theta^*)
\end{displaymath}

is minimal sufficient. WARNING: the statistic here is the whole function $\theta \mapsto \ell(\theta)-\ell(\theta^*)$, not its value at any single $\theta$.

The subtraction of $\ell(\theta^*)$ gets rid of those irrelevant constants in the log-likelihood. For instance in the $N(\mu,1)$ example we have

\begin{displaymath}\ell(\mu) = -n\log(2\pi)/2 - \sum X_i^2/2 + \mu\sum X_i -n\mu^2/2
\end{displaymath}

This depends on $\sum X_i^2$ which is not needed for the sufficient statistic. Take $\mu^*=0$ and get

\begin{displaymath}\ell(\mu) -\ell(\mu^*) = \mu\sum X_i -n\mu^2/2
\end{displaymath}

This function of $\mu$ is minimal sufficient. Notice that from $\sum X_i$ you can compute this minimal sufficient statistic and vice versa. Thus $\sum X_i$ is also minimal sufficient.
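
A quick numerical check (a sketch in Python with made-up data): two samples with the same value of $\sum X_i$ but different individual values, and hence different $\sum X_i^2$, give exactly the same function $\ell(\mu)-\ell(0)$.

\begin{verbatim}
import numpy as np

# two made-up samples with the same sum (6.0) but different values
x1 = np.array([0.5, 1.5, 2.0, 2.0])
x2 = np.array([3.0, 3.0, 0.0, 0.0])

def loglik(mu, x):
    # N(mu,1) log likelihood: -n log(2 pi)/2 - sum x_i^2/2 + mu sum x_i - n mu^2/2
    n = len(x)
    return -n * np.log(2 * np.pi) / 2 - (x ** 2).sum() / 2 \
           + mu * x.sum() - n * mu ** 2 / 2

mus = np.linspace(-2, 2, 9)
r1 = loglik(mus, x1) - loglik(0.0, x1)
r2 = loglik(mus, x2) - loglik(0.0, x2)
print(np.allclose(r1, r2))   # True: the ratio depends on the data only through sum(x)
\end{verbatim}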

FACT: A complete sufficient statistic is also minimal sufficient.

Hypothesis Testing

Hypothesis testing is a statistical problem where you must choose, on the basis of data X, between two alternatives. We formalize this as the problem of choosing between two hypotheses: $H_0: \theta\in \Theta_0$ or $H_1: \theta\in\Theta_1$ where $\Theta_0$ and $\Theta_1$ are a partition of the parameter space $\Theta$ of the model $\{P_\theta; \theta\in \Theta\}$. That is, $\Theta_0 \cup \Theta_1 = \Theta$ and $\Theta_0 \cap\Theta_1=\emptyset$.

A rule for making the required choice can be described in two ways:

1.
In terms of the set

\begin{displaymath}C=\{X: \mbox{we choose $\Theta_1$ if we observe $X$}\}
\end{displaymath}

called the rejection or critical region of the test.

2.
In terms of a function $\phi(x)$ which is equal to 1 for those x for which we choose $\Theta_1$ and 0 for those x for which we choose $\Theta_0$.

For technical reasons which will come up soon I prefer to use the second description. However, each $\phi$ corresponds to a unique rejection region $R_\phi=\{x:\phi(x)=1\}$.

The Neyman-Pearson approach to hypothesis testing, which we consider first, treats the two hypotheses asymmetrically. The hypothesis $H_0$ is referred to as the null hypothesis (because traditionally it has been the hypothesis that some treatment has no effect).

Definition: The power function of a test $\phi$ (or the corresponding critical region $R_\phi$) is

\begin{displaymath}\pi(\theta) = P_\theta(X\in R_\phi) = E_\theta(\phi(X))
\end{displaymath}

We are interested here in optimality theory, that is, the problem of finding the best $\phi$. A good $\phi$ will evidently have $\pi(\theta)$ small for $\theta\in\Theta_0$ and large for $\theta\in\Theta_1$. There is generally a trade-off between these two goals, however, and it can be resolved in many ways.
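
As a concrete (hypothetical) illustration, take the $N(\theta,1)$ model and the test which rejects when $\sqrt{n}\bar{X} > 1.645$; since $\sqrt{n}\bar{X}\sim N(\sqrt{n}\theta,1)$, the power function is $\pi(\theta) = 1-\Phi(1.645-\sqrt{n}\theta)$. A short Python sketch (scipy assumed available; the sample size and cutoff are made up):

\begin{verbatim}
import numpy as np
from scipy.stats import norm

n, c = 25, 1.645    # hypothetical sample size and cutoff (the 0.95 normal point)

def power(theta):
    # pi(theta) = P_theta(sqrt(n) * Xbar > c), where sqrt(n) * Xbar ~ N(sqrt(n) * theta, 1)
    return 1 - norm.cdf(c - np.sqrt(n) * theta)

for theta in [0.0, 0.1, 0.2, 0.3, 0.5]:
    print("theta=%.1f  power=%.3f" % (theta, power(theta)))
\end{verbatim}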

Simple versus Simple testing

Finding a best test is easiest when the hypotheses are very precise.

Definition: A hypothesis $H_i$ is simple if $\Theta_i$ contains only a single value $\theta_i$.

The simple versus simple testing problem arises when we test $\theta=\theta_0$ against $\theta=\theta_1$ so that $\Theta$ has only two points in it. This problem is of importance as a technical tool, not because it is a realistic situation.

Suppose that the model specifies that if $\theta=\theta_0$ then the density of X is $f_0(x)$ and if $\theta=\theta_1$ then the density of X is $f_1(x)$. How should we choose $\phi$? To answer the question we begin by studying the problem of minimizing the total error probability.

We define a Type I error as the error made when $\theta=\theta_0$ but we choose $H_1$, that is, $X\in R_\phi$. The other kind of error, made when $\theta=\theta_1$ but we choose $H_0$, is called a Type II error. We define the level of a simple versus simple test to be

\begin{displaymath}\alpha = P_{\theta_0}(\mbox{We make a Type I error})
\end{displaymath}

or

\begin{displaymath}\alpha = P_{\theta_0}(X\in R_\phi) = E_{\theta_0}(\phi(X))
\end{displaymath}

The other error probability is denoted $\beta$ and defined as

\begin{displaymath}\beta= P_{\theta_1}(X\not\in R_\phi) = E_{\theta_1}(1-\phi(X))
\end{displaymath}

Suppose we want to minimize $\alpha+\beta$, the total error probability. We want to minimize

\begin{displaymath}E_{\theta_0}(\phi(X))+E_{\theta_1}(1-\phi(X))
=
\int[ \phi(x) f_0(x) +(1-\phi(x))f_1(x)] dx
\end{displaymath}

The problem is to choose, for each x, either the value 0 or the value 1, in such a way as to minimize the integral. But for each x the quantity

\begin{displaymath}\phi(x) f_0(x) +(1-\phi(x))f_1(x)
\end{displaymath}

can be chosen either to be $f_0(x)$ or $f_1(x)$. To make it small we take $\phi(x) = 1$ if $f_1(x) > f_0(x)$ and $\phi(x) = 0$ if $f_1(x) < f_0(x)$. It makes no difference what we do for those x for which $f_1(x)=f_0(x)$. Notice that we can divide both sides of these inequalities by $f_0(x)$ to rephrase the condition in terms of the likelihood ratio $f_1(x)/f_0(x)$.
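
For example, if X is a single observation and $f_0$, $f_1$ are the $N(0,1)$ and $N(1,1)$ densities then

\begin{displaymath}\frac{f_1(x)}{f_0(x)} = \exp\{x-1/2\}
\end{displaymath}

which exceeds 1 exactly when $x > 1/2$, so the test minimizing $\alpha+\beta$ rejects when $X > 1/2$; its error probabilities are $\alpha = 1-\Phi(1/2)$ and $\beta = \Phi(-1/2)$, each about 0.31.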

Theorem: For each fixed $\lambda$ the quantity $\lambda\alpha+\beta$ is minimized by any $\phi$ which has

\begin{displaymath}\phi(x) =\left\{\begin{array}{ll}
1 & \frac{f_1(x)}{f_0(x)} > \lambda
\\
0 & \frac{f_1(x)}{f_0(x)} < \lambda
\end{array}\right.
\end{displaymath}
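
Continuing the $N(0,1)$ versus $N(1,1)$ illustration, this test rejects when $x > 1/2 + \log\lambda$, and varying $\lambda$ traces out the trade-off between $\alpha$ and $\beta$. A small Python sketch (the grid of $\lambda$ values is arbitrary):

\begin{verbatim}
import numpy as np
from scipy.stats import norm

# Likelihood ratio test of N(0,1) against N(1,1) from one observation:
# f1(x)/f0(x) = exp(x - 1/2) > lambda  is the same as  x > 1/2 + log(lambda)
for lam in [0.25, 0.5, 1.0, 2.0, 4.0]:
    cut = 0.5 + np.log(lam)
    alpha = 1 - norm.cdf(cut)    # P_0(reject H_0)
    beta = norm.cdf(cut - 1)     # P_1(fail to reject H_0)
    print("lambda=%4.2f  alpha=%.3f  beta=%.3f" % (lam, alpha, beta))
\end{verbatim}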





Richard Lockhart
1999-11-20