|
Stat650/403/890
Factorial Treatment Structure
Reading: Read the section on two-way anova (Section
16.2, pages 179-185) in Julian Faraway's Practical
Regression and ANOVA Using R. For supplemental
reading see pages 1-35 Chapter
9 on two-factor experiments in Carl Schwarz' notes.
Some main points: If there are more than one
treatment factors we may be interested not just in whether the
treatment combinations have effects but in interpretation of the
possible effects into main effects and interaction effects.
We will consider the completely randomized design in which the
units are assigned at random to the different treatment
combination. In a full factorial design, each possible
treatment combination is assigned to at least one unit.
Learning goals: Recognize factorial
treatment structures when you see them. Know how to analyze
a two-factor completely randomized experimental design.
Become familiar with useful graphical and diagnostic methods for
making sense of such experiments. Understand the
assumptions in the model commonly assumed for the data in these
experiments. Understand the overall ideas of multiple
comparison methods for these experiments. Understand the
motivation for transormations of the data, and know how to apply
the log transformation.
Example: Poison and poison treatment
The data for the rat poison and treatment experiment can
be found as the data set "poisson" in the R library
"boot" or as "poisson.data" in the library
"BHH2". If you want to work on these data
in parallel with Faraway, the following commands may be useful.
library(boot) attach(poisons)
Look
at the data with simple boxplots and then interaction
plots:
boxplot(time~poison) boxplot(time~treat) interaction.plot(treat,poison,time) interaction.plot(poison,treat,time)
Alternatively,
make a plot of survival time by treatment with colours for the
different poisons. "treat" is coded A, B, C, D,
and as.numeric(treat) changes that to 1, 2, 3, 4.
plot(as.numeric(treat),time,bg=as.numeric(poison),pch=21,cex=2)
Fit
a linear model with main and interaction effects, and name the
resulting linear model output object as, say, "rats".
Then look at the analysis of variance table.
rats
<- lm(time ~ poison*treat) anova(rats)
Examine the
distribution of the residuals to see if it looks normally
distributed, and then plot the residuals above the fitted
values to see if the variance tends to be larger where the
mean survival time is larger. The normal quantile-quantile
plot (qqnorm) should give a straight line if the residuals are
normal.
hist(rats$res) qqnorm(rats$res) plot(rats$fitted,
rats$res, xlab="fitted",
ylab="residuals")
Since the variance
indeed does not look constant, try a transformation, taking
logarithms of the data.
rats
<- lm(log(time) ~
poison*treat) anova(rats) hist(log(time)) qqnorm(rats$res) plot(rats$fitted,
rats$res, xlab="fitted", ylab="residuals",main="log
response")
Here is one more transformation to
try, using the reciprocal of survival time, which could be
interpreted as death rate.
rats
<- lm(1/time ~
poison*treat) anova(rats) summary(rats) hist(1/time) qqnorm(rats$res) plot(rats$fitted,
rats$res, xlab="fitted",
ylab="residuals",main="1/response")
interaction.plot(treat,poison,1/time) interaction.plot(poison,treat,1/time)
Here
is the model assuming no interaction:
rats
~ lm(1/time ~ poison + treat) anova(rats) summary(rats)
Here
are the commands again, this time with the output tables:
>
rats <- lm(1/time ~ poison * treat) >
anova(rats) Analysis of Variance Table
Response:
1/time
Df Sum Sq Mean Sq F value Pr(>F) poison
2 34.877 17.439 72.6347 2.310e-13 *** treat
3 20.414 6.805 28.3431 1.376e-09 *** poison:treat
6 1.571 0.262 1.0904
0.3867 Residuals 36 8.643
0.240 --- Signif. codes: 0 ‘***’ 0.001 ‘**’
0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The
interaction was not significant. The model fitted without
interaction looks like this:
>
rats <- lm(1/time ~ poison + treat) >
anova(rats) Analysis of Variance Table
Response:
1/time
Df Sum Sq Mean Sq F value Pr(>F) poison
2 34.877 17.439 71.708 2.865e-14 *** treat
3 20.414 6.805 27.982 4.192e-10 *** Residuals
42 10.214 0.243 --- Signif. codes: 0
‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’
1 > summary(rats)
Call: lm(formula = 1/time ~
poison + treat)
Residuals:
Min 1Q
Median 3Q
Max -0.82757 -0.37619 0.02116 0.27568
1.18153
Coefficients:
Estimate Std. Error t value Pr(>|t|) (Intercept)
2.6977 0.1744 15.473 <
2e-16 *** poison2
0.4686 0.1744 2.688
0.01026 * poison3
1.9964 0.1744 11.451 1.69e-14
*** treatB -1.6574
0.2013 -8.233 2.66e-10 *** treatC
-0.5721 0.2013 -2.842 0.00689
** treatD -1.3583
0.2013 -6.747 3.35e-08 *** --- Signif. codes: 0
‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’
1
Residual standard error: 0.4931 on 42 degrees of
freedom Multiple R-Squared: 0.8441,
Adjusted R-squared: 0.8255 F-statistic: 45.47 on 5 and 42 DF,
p-value: 6.974e-16
For confidence intervals
on the difference between treatment effects, the standard
error for the difference is given in the summary table as .201.
For an interval with .95 confidence, the upper .025 point in the
t distribution with 36 degrees of freedom (the df for residuals
from the anova table) is obtained with either of
>
qt(.975,42) [1] 2.018082 > qt(.025,42,lower.tail=F) [1]
2.018082
Thus the half width of the confidence
interval for the difference between any two treatment main
effects is
> 2.018082 * .2013 [1] 0.4062399
The
confidence interval will be the difference between the two
treatment effect coefficients plus or minus .406. Any pair
of treatment effect coefficient having a difference greater than
.406 will have a confidence interval not including zero, so that
the null hypothesis of no difference between those two treatment
effects would be rejected at the .05 level.
The
conservative Bonferonni multiple comparison procedure for
comparing all six pairs of treatment effects would divide the
alpha value by the number of comparisons giving:
>
.05/6 [1] 0.008333333
Putting half of that in each tail
produces a t value of
> qt(0.008333333/2, 42,
lower.tail=F) [1] 2.769016
and an interval half width
of
> 2.769016 * .2013 [1] 0.5574029
Thus, any
difference greater than .557 in magnitude would be significant
from the multiple comparison point of view. Notice this
does not change the conclusion on which of the poison treatments
are significantly different.
Example: Vitamin C dose and type
In this experiment 60 guinea pigs were randomly assigned to
treatment combinations with different doses and delivery methods
of Vitamin C. The response is the length of odontoblasts
(teeth) in each of 10 guinea pigs at each of three dose levels of
Vitamin C (0.5, 1, and 2 mg) with each of two delivery methods
(orange juice or ascorbic acid). Thus there are six
different treatment combinations, and 10 guinea pigs are assigned
to each.
Look at the data
set:
?ToothGrowth ToothGrowth
Fit
a model with main effects and interaction:
toothmodel
<-
lm(len~supp*dose,data=ToothGrowth)
alternatively: attach(ToothGrowth) toothmodel
<-
lm(len~supp*dose)
anova(toothmodel)
interaction.plot(dose,supp,len) interaction.plot(supp,dose,len)
|