STAT 450

Lecture 29

Goals for today:

Hypothesis Testing

Hypothesis testing is a statistical problem in which you must choose, on the basis of data $X$, between two alternatives. We formalize this as the problem of choosing between two hypotheses: $H_0: \theta\in \Theta_0$ or $H_1: \theta\in\Theta_1$, where $\Theta_0$ and $\Theta_1$ partition the parameter space $\Theta$ of the model $\{P_\theta;\ \theta\in \Theta\}$. That is, $\Theta_0 \cup \Theta_1 = \Theta$ and $\Theta_0 \cap\Theta_1=\emptyset$.

A rule for making the required choice can be described in two ways:

1.
In terms of the set

\begin{displaymath}C=\{X: \mbox{we choose $\Theta_1$ if we observe $X$}\}\end{displaymath}

called the rejection or critical region of the test.

2.
In terms of a function $\phi(x)$ which is equal to 1 for those x for which we choose $\Theta_1$ and 0 for those x for which we choose $\Theta_0$.

For technical reasons which will come up soon I prefer to use the second description. However, each $\phi$ corresponds to a unique rejection region $R_\phi=\{x:\phi(x)=1\}$.

The Neyman-Pearson approach to hypothesis testing, which we consider first, treats the two hypotheses asymmetrically. The hypothesis $H_0$ is referred to as the null hypothesis (because traditionally it has been the hypothesis that some treatment has no effect).

Definition: The power function of a test $\phi$ (or of the corresponding critical region $R_\phi$) is

\begin{displaymath}\pi(\theta) = P_\theta(X\in R_\phi) = E_\theta(\phi(X))\end{displaymath}

We are interested here in optimality theory, that is, in the problem of finding the best $\phi$. A good $\phi$ will evidently have $\pi(\theta)$ small for $\theta\in\Theta_0$ and large for $\theta\in\Theta_1$. There is generally a trade-off between the two, however, and it can be made in many ways.
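For instance, if $X\sim N(\theta,1)$ and the test rejects when $X>c$, then

\begin{displaymath}\pi(\theta)=P_\theta(X>c)=1-\Phi(c-\theta)\end{displaymath}

which increases in $\theta$, from near 0 when $\theta$ is well below $c$ to near 1 when $\theta$ is well above $c$; such a test is sensible when $\Theta_0$ consists of small values of $\theta$ and $\Theta_1$ of large ones.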

Simple versus Simple testing

Finding a best test is easiest when the hypotheses are very precise.

Definition: A hypothesis $H_i$ is simple if $\Theta_i$ contains only a single value $\theta_i$.

The simple versus simple testing problem arises when we test $\theta=\theta_0$ against $\theta=\theta_1$, so that $\Theta$ has only two points in it. This problem is important as a technical tool, not because it is a realistic situation.

Suppose that the model specifies that if $\theta=\theta_0$ then the density of $X$ is $f_0(x)$ and if $\theta=\theta_1$ then the density of $X$ is $f_1(x)$. How should we choose $\phi$? To answer the question we begin by studying the problem of minimizing the total error probability.

We define a Type I error as the error made when $\theta=\theta_0$ but we choose $H_1$, that is, $X\in R_\phi$. The other kind of error, made when $\theta=\theta_1$ but we choose $H_0$, is called a Type II error. We define the level of a simple versus simple test to be

\begin{displaymath}\alpha = P_{\theta_0}(\mbox{we make a Type I error})\end{displaymath}

or

\begin{displaymath}\alpha = P_{\theta_0}(X\in R_\phi) = E_{\theta_0}(\phi(X))\end{displaymath}

The other error probability is denoted $\beta$ and defined as

\begin{displaymath}\beta= P_{\theta_1}(X\not\in R_\phi) = E_{\theta_1}(1-\phi(X))\end{displaymath}

Suppose we want to minimize $\alpha+\beta$, the total error probability. We want to minimize

\begin{displaymath}E_{\theta_0}(\phi(X))+E_{\theta_1}(1-\phi(X))
= \int\left[ \phi(x) f_0(x) +(1-\phi(x))f_1(x)\right] dx
\end{displaymath}

The problem is to choose, for each x, either the value 0 or the value 1, in such a way as to minimize the integral. But for each x the quantity

\begin{displaymath}\phi(x) f_0(x) +(1-\phi(x))f_1(x)\end{displaymath}

can be chosen to be either $f_0(x)$ (by taking $\phi(x)=1$) or $f_1(x)$ (by taking $\phi(x)=0$). To make it small we take $\phi(x) = 1$ if $f_1(x)> f_0(x)$ and $\phi(x) = 0$ if $f_1(x) < f_0(x)$. It makes no difference what we do for those $x$ for which $f_1(x)=f_0(x)$. Notice that we can divide both sides of these inequalities by $f_0(x)$ to rephrase the condition in terms of the likelihood ratio $f_1(x)/f_0(x)$.
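For example, if $f_0$ is the $N(\mu_0,1)$ density and $f_1$ is the $N(\mu_1,1)$ density with $\mu_1>\mu_0$, then

\begin{displaymath}\frac{f_1(x)}{f_0(x)} = \exp\left\{(\mu_1-\mu_0)x - \frac{\mu_1^2-\mu_0^2}{2}\right\}\end{displaymath}

which exceeds 1 exactly when $x > (\mu_0+\mu_1)/2$. So the rule minimizing $\alpha+\beta$ rejects when the observation is closer to $\mu_1$ than to $\mu_0$.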

Theorem: For each fixed $\lambda$ the quantity $\beta+\lambda\alpha$ is minimized by any $\phi$ which has

\begin{displaymath}\phi(x) =\left\{\begin{array}{ll}
1 & \frac{f_1(x)}{f_0(x)} > \lambda
\\
0 & \frac{f_1(x)}{f_0(x)} < \lambda
\end{array}\right.
\end{displaymath}
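The proof is the same pointwise argument as before: writing

\begin{displaymath}\beta+\lambda\alpha = E_{\theta_1}(1-\phi(X)) + \lambda E_{\theta_0}(\phi(X))
= \int f_1(x)\, dx + \int \phi(x)\left[\lambda f_0(x) - f_1(x)\right] dx
\end{displaymath}

we see that the first integral does not involve $\phi$, and the second is made as small as possible by taking $\phi(x)=1$ wherever $\lambda f_0(x)-f_1(x)<0$, that is, wherever $f_1(x)/f_0(x)>\lambda$, and $\phi(x)=0$ wherever $f_1(x)/f_0(x)<\lambda$.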

Neyman and Pearson suggested that in practice the two kinds of errors might well have unequal consequences. They suggested that rather than minimize any quantity of the form above you pick the more serious kind of error, label it Type I, and require your rule to hold the probability $\alpha$ of a Type I error to be no more than some prespecified level $\alpha_0$. (This value $\alpha_0$ is typically 0.05 these days, chiefly for historical reasons.)

The Neyman-Pearson approach is then to minimize $\beta$ subject to the constraint $\alpha \le \alpha_0$. Usually this is really equivalent to the constraint $\alpha=\alpha_0$ (because if you had $\alpha<\alpha_0$ you could make $R$ larger, keeping $\alpha \le \alpha_0$ but making $\beta$ smaller). For discrete models, however, this may not be possible.

Example: Suppose $X$ is Binomial$(n,p)$ and either $p=p_0=1/2$ or $p=p_1=3/4$. If $R$ is any critical region (so $R$ is a subset of $\{0,1,\ldots,n\}$) then

\begin{displaymath}P_{1/2}(X\in R) = \frac{k}{2^n}\end{displaymath}

for some integer $k$. If we want $\alpha_0=0.05$ with, say, $n=5$ we have to recognize that the possible values of $\alpha$ are 0, $1/32=0.03125$, $2/32=0.0625$ and so on. For $\alpha_0=0.05$ we must use one of three rejection regions: $R_1$, which is the empty set, $R_2$, which is the set $\{x=0\}$, or $R_3$, which is the set $\{x=5\}$. These three regions have $\alpha$ equal to 0, 0.03125 and 0.03125 respectively and $\beta$ equal to 1, $1-(1/4)^5$ and $1-(3/4)^5$ respectively, so that $R_3$ minimizes $\beta$ subject to $\alpha\le 0.05$. If we raise $\alpha_0$ slightly to 0.0625 then the possible rejection regions are $R_1$, $R_2$, $R_3$ and a fourth region $R_4=R_2\cup R_3$. The first three have the same $\alpha$ and $\beta$ as before while $R_4$ has $\alpha=\alpha_0=0.0625$ and $\beta=1-(3/4)^5-(1/4)^5$. Thus $R_4$ is optimal! The trouble is that this region says that if all the trials are failures we should choose $p=3/4$ rather than $p=1/2$, even though the latter makes 5 failures much more likely than the former.
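These error probabilities are easy to check numerically. For instance, the short Python sketch below (my choice of tool; it uses scipy.stats) enumerates $\alpha$ and $\beta$ for the four candidate regions.

# Check the error probabilities for n = 5, p0 = 1/2, p1 = 3/4.
from scipy.stats import binom

n, p0, p1 = 5, 0.5, 0.75
regions = {"R1": [], "R2": [0], "R3": [5], "R4": [0, 5]}

for name, R in regions.items():
    alpha = sum(binom.pmf(k, n, p0) for k in R)        # P_{p0}(X in R)
    beta = 1.0 - sum(binom.pmf(k, n, p1) for k in R)   # P_{p1}(X not in R)
    print(f"{name}: alpha = {alpha:.5f}, beta = {beta:.5f}")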

The problem in the example is one of discreteness. Here is how we get around it. First we expand the set of possible values of $\phi$ to include numbers between 0 and 1; a value of $\phi(x)$ strictly between 0 and 1 represents the chance that we choose $H_1$ given that we observe $x$. The idea is that we actually toss a (biased) coin to decide! This tactic will show us the kinds of rejection regions which are sensible. In practice we then restrict our attention to levels $\alpha_0$ for which the best $\phi$ is always either 0 or 1. In the binomial example we will insist that the value of $\alpha_0$ be either 0 or $P_{\theta_0} ( X\ge 5)$ or $P_{\theta_0} ( X\ge 4)$ or ...

Definition: A hypothesis test is a function $\phi(x)$ whose values are always in $[0,1]$. If we observe $X=x$ then we choose $H_1$ with conditional probability $\phi(x)$. In this case we have

\begin{displaymath}\pi(\theta) = E_\theta(\phi(X))\end{displaymath}

\begin{displaymath}\alpha = E_{\theta_0}(\phi(X))\end{displaymath}

and

\begin{displaymath}\beta = E_{\theta_1}(1-\phi(X))\end{displaymath}
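For example, in the binomial example above a test of level exactly $\alpha_0=0.05$ can be obtained by taking $\phi(5)=1$, $\phi(4)=\gamma$ and $\phi(x)=0$ for $x\le 3$; then

\begin{displaymath}\alpha = P_{1/2}(X=5) + \gamma P_{1/2}(X=4) = \frac{1}{32} + \gamma\,\frac{5}{32} = 0.05\end{displaymath}

which gives $\gamma = 0.12$: reject outright when $X=5$ and reject with probability 0.12 when $X=4$.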


Richard Lockhart
1999-11-22