Stat650/403
Completely Randomized Design
In this section we consider an experiment with two treatment
levels. The N units in the study are randomly assigned to the
treatments so that n1 of the units are assigned treatment 1 and the
other n2 units are assigned treatment 2.
Some basic ideas:
- The assignment for one of the groups, such as treatment 1,
is a simple random sample of n1 units from the N study
units. The remaining n2 units are assigned treatment 2.
- Thus, every possible collection of n1 units has
equal probability of being the group assigned to treatment 1.
- The randomization in the assignment overcomes
biases and effects of unobserved variables.
- Replication of the same treatment to more than one
unit results in increased precision for estimating treatment effects
and enables the calculation of hypothesis tests and confidence
intervals.
- It is important that the experiment be set up so that the response
of a unit to a treatment depends only on the treatment assigned to that
unit, not on the combination of treatments assigned to other
units. This is an assumption that underlies almost every
experiment.
- The null hypothesis of no treatment effect determines
a corresponding sampling distribution of a test statistic.
- A hypothesis testing procedure should have the
property that, if the null hypothesis is true, we should have a low
probability of rejecting it. This probability is called the significance
level.
- The probability under the null hypothesis of obtaining a
value of the test statistic as high or higher than the one we observe
is called the p-value.
- A confidence interval for treatment effect can be
constructed from a hypothesis test procedure by testing every
possible size of treatment effect. Those values that are not
rejected are put in the confidence interval, while those that are
rejected are excluded.
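The random assignment described in the first bullet can be sketched in a
few lines of R. This is a minimal illustration (the values N = 8 and
n1 = 4 are chosen to match the tomato example later in these notes):

```r
# Completely randomized assignment of N units to two treatment groups.
N <- 8                     # total number of units in the study
n1 <- 4                    # size of the treatment-1 group
s1 <- sort(sample(N, n1))  # a simple random sample of n1 units gets treatment 1
s2 <- setdiff(1:N, s1)     # the remaining n2 = N - n1 units get treatment 2
s1                         # row numbers of the units assigned treatment 1
s2                         # row numbers of the units assigned treatment 2
```

Every one of the choose(8,4) = 70 possible groups of 4 units is equally
likely to be the group `sample` returns.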
Learning goals:
- Be able to assign treatments in a completely randomized
design with two levels of one factor.
- Understand the concept of the reference sampling
distribution that would be true if the null hypothesis is true.
- Be able to compute (with software, not by hand) the p-value
for the randomization test of no treatment effect.
- Be able to compute (with software, not by hand) the p-value
for the randomization test of any specified treatment effect.
- Understand how a statistical test can be used to construct
a confidence interval with a specified level of confidence.
Randomization Test
There is a good description and discussion of randomization or
permutation tests in Chapter 18 of Moore et al. (2003). The
permutation/randomization test part starts with Section 18.6, at about
the 50th page of the chapter.
Reference:
D. S. Moore, G. P. McCabe, W. M. Duckworth, and S. L. Sclove (2003).
The Practice of Business Statistics: Using Data for Decisions.
W. H. Freeman.
Example: Tomato experiment; Testing different hypotheses
about effect of treatment with the randomization test
tomato <- read.table(file=url("http://www.stat.sfu.ca/~thompson/stat403-650/data/tomato.txt"), header=T)
# the above is all one line; it reads the tomato data straight from the
# web site and puts it in a data frame called "tomato"
tomato # this prints out the data
tr <- tomato$tr # give a simpler name to the treatment variable within the data frame
y <- tomato$y # alternatively, one could simply type "attach(tomato)"
N <- 8 # the number of units in the experiment
n1 <- 4 # the number of units in the control group
s1 <- 1:n1 # the units in the control group, corresponding to row numbers in the data
diffobs <- mean(y[-s1])-mean(y[s1]) # the observed difference in the treatment sample means
diff <- numeric(0) # tell R that diff is a vector, or column, of numbers
# now make a loop to carry out 4000 random assignments of the units to
# control and fertilizer, and compute the difference in sample means for
# the two groups, as if treatment had no effect
for(k in 1:4000){s1 <- sample(N,n1); diff[k] <- mean(y[-s1])-mean(y[s1])}
hist(diff) # make a histogram of the sampling distribution the test
# statistic "diff" has when there is no treatment effect
points(diffobs, 0, cex=4, pch=21, bg="blue") # draw the observed value of the test statistic
abline(v=diffobs, col="blue") # draw a line at that value
# make a function to calculate the p-value:
pvalue <- function(){length(diff[diff>=diffobs])/length(diff)}
# call that function
pvalue()
s1 <- 1:n1 # reset s1 to the control group rows (the loop above overwrote it)
effect0 <- 10 # try a new null hypothesis: the treatment effect is exactly 10 kg
y0 <- c(y[s1],y[-s1]-effect0) # the response each unit would have if all units
# received the control treatment, under the null hypothesis of treatment effect = 10
y0 # prints the value each unit would give if it gets the control treatment
y0 + effect0 # prints the response of each unit if it gets the fertilizer, under H0
# put this in a loop and compute the sampling distribution of the test
statistic
# under the new null hypothesis that the treatment effect is 10:
for(k in 1:4000){s1 <- sample(N,n1); diff[k] <- mean(y0[-s1]+effect0)-mean(y0[s1])}
hist(diff)
points(diffobs, 0, cex=4, pch=21, bg="blue")
abline(v=diffobs, col="blue")
points(effect0, 0, cex=4, pch=24)
pvalue()
The p-value was about 0.045, and the histogram of the reference
distribution shows its nonnormal irregularities.

Confidence Interval from Randomization Test
Example: Tomato experiment; confidence intervals based on the
randomization test
The function below puts the above commands together.
###################################################################
## randomization test and confidence interval
## effect0 is the hypothesized treatment effect; its default is 0.
## b is the number of simulation runs; its default is 4000
## xmin and xmax are the limits for the histogram horizontal axis
###################################################################
randtest <- function(tr, y, effect0 = 0, b = 4000, xmin = -30, xmax = 60)
{
  N <- length(y)
  s1 <- (1:N)[tr == 1]
  n1 <- length(s1)
  diffobs <- mean(y[-s1]) - mean(y[s1])
  y0 <- y
  y0[s1] <- y[s1]
  y0[-s1] <- y[-s1] - effect0
  diff <- numeric(b)
  for(k in 1:b)
  {
    s1 <- sample(N, n1)
    diff[k] <- mean(y0[-s1] + effect0) - mean(y0[s1])
  }
  hist(diff, xlim = c(xmin, xmax), col = "grey")
  points(diffobs, 0, cex = 4, pch = 21, bg = "blue")
  abline(v = diffobs, lwd = 2, col = "blue")
  pvalueup <- length(diff[diff >= diffobs]) / length(diff)
  pvaluedown <- length(diff[diff <= diffobs]) / length(diff)
  pvalue <- min(pvalueup, pvaluedown)
  points(effect0, 0, cex = 4, pch = 24, bg = "white")
  if(pvalue <= .05)
    points(effect0, 0, cex = 4, pch = 24, bg = "red")
  print(c(effect0, pvalue))
}
#########################################################
The following commands call the function repeatedly, for a sequence of
hypothesized effects.
effects <- -10:40 # makes the variable "effects" equal the sequence -10, -9, ..., 39, 40
effects # prints them out
steps <- length(effects) # there should be 51 numbers in the sequence
for (i in 1:steps) randtest(tr,y,effect0=effects[i]) # calls the function randtest for each effects value
# from the resulting printout we can see that the lower limit of the
# confidence interval is between 0 and 1
# we can refine the calculation by using a finer sequence between 0 and 1
effects <- seq(0,1,.05)
steps <- length(effects)
for (i in 1:steps) randtest(tr,y,effect0=effects[i])
# and for the upper limit, a finer sequence (in steps of .05) between 29 and 30
effects <- seq(29,30,.05)
steps <- length(effects)
for (i in 1:steps) randtest(tr,y,effect0=effects[i])
The 90 percent confidence interval for the treatment effect consists of
all the hypothesized effect values that we do not reject. For each
test, we reject if the one-tailed p-value in the more extreme direction
is less than .05. Since we would also reject for any observed value in
the .05 region of the other tail, we have probability .10 of rejecting
any hypothesized effect value if it were true. This gives the
1 - .10 = .90 confidence level.
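The grid search above can also be automated. The sketch below is not
part of the original commands: `randtest.p` is a hypothetical variant of
randtest that returns the smaller one-tailed p-value instead of plotting
and printing, and the responses in `y` are illustrative numbers, not the
tomato data.

```r
# Sketch: invert the randomization test to approximate a 90% CI.
randtest.p <- function(tr, y, effect0 = 0, b = 4000)
{
  N <- length(y)
  s1 <- (1:N)[tr == 1]
  n1 <- length(s1)
  diffobs <- mean(y[-s1]) - mean(y[s1])
  y0 <- y
  y0[-s1] <- y[-s1] - effect0   # responses as if every unit got the control
  diff <- numeric(b)
  for (k in 1:b) {
    s1 <- sample(N, n1)
    diff[k] <- mean(y0[-s1] + effect0) - mean(y0[s1])
  }
  min(mean(diff >= diffobs), mean(diff <= diffobs))
}
# illustrative data (NOT the tomato data): 8 units, first 4 are controls
set.seed(1)
tr <- c(1, 1, 1, 1, 2, 2, 2, 2)
y  <- c(12, 15, 11, 14, 25, 30, 22, 33)
effects <- seq(-10, 40, by = 1)
pvals <- sapply(effects, function(e) randtest.p(tr, y, effect0 = e))
range(effects[pvals > .05])   # approximate 90% confidence limits
```

The same two-stage refinement as above (a coarse grid, then a finer grid
near each limit) still applies; this version just collects the p-values
so the non-rejected effects can be extracted in one step.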