Stat650/403


Completely Randomized Design


Model-based methods

 
Some main points:

  • If the response variables in the population are independent, normally distributed random variables with equal variances and with means depending only on treatment level, then the pooled t-statistic has exactly a t distribution with n1 + n2 - 2 degrees of freedom. 
  • If the above assumptions hold but the variances (and sample sizes) are unequal, then the above statistic no longer has a t distribution, but the alternate (Welch) t statistic has approximately a t distribution, with a formula for approximate degrees of freedom.
  • If the data are not normally distributed, the t statistic still has approximately a t distribution in large samples, by the central limit theorem. 
  • The independence assumption is very important. 
  • The completely randomized design helps ensure that the assumptions for the t-test are at least approximately met. 

Learning objectives:

  1. Know how to apply a two-sample t test for treatment effect for a completely randomized design with two treatments.
  2. Know how to obtain t-statistic based confidence intervals. 
  3. Understand the assumptions under which the t test and confidence interval procedures work exactly or approximately.
  4. Have a working knowledge of when to use one version or the other of the two-sample t test. 
  5. Understand how the completely randomized design helps to make the t tests and confidence interval procedures work correctly. 
  6. Understand the conceptual difference between assuming a model for the population values and basing the analysis solely on the assignment probabilities of the experimental design used. 


Two-sample t Test and Confidence Interval


It is common in the analysis of experiments to assume the following statistical model for the response variables. For the $j$th unit receiving the $i$th treatment, $y_{ij}$ is a random variable having a normal distribution. The mean of $y_{ij}$ is $\mu_i$, so that the expected value of the response might depend on the treatment but does not depend on the specific unit in this model. The variance of $y_{ij}$ is assumed to be some constant $\sigma^2$, so that the variance is the same for each treatment group. The standard deviation of $y_{ij}$ is then $\sigma$. Further, the responses $y_{ij}$ are assumed to be independent, so that the response of one unit does not depend on the response of another unit, either in the same group or in a different group.
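To make this model concrete, responses can be simulated from it in R. The sketch below uses hypothetical values of $\mu_1$, $\mu_2$, and $\sigma$ (chosen only for illustration), with four units per treatment.

mu1 <- 27; mu2 <- 42; sigma <- 9           # hypothetical parameter values
ysim1 <- rnorm(4, mean = mu1, sd = sigma)  # independent N(mu1, sigma^2) responses, treatment 1
ysim2 <- rnorm(4, mean = mu2, sd = sigma)  # independent N(mu2, sigma^2) responses, treatment 2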

Let $\bar y_i$ denote the sample mean of the responses of the units assigned to treatment $i$, and let $s^2_i$ denote the sample variance for the same group. With the assumption of equal variances, the common variance $\sigma^2$ can be estimated unbiasedly with the pooled sample variance


\begin{displaymath}s^2_p = \frac{(n_1 -1)s^2_1 + (n_2-1)s^2_2}{n_1+n_2-2} \end{displaymath}

Under these assumptions the quantity

\begin{displaymath}t = \frac{(\bar y_1 - \bar y_2)-(\mu_1-\mu_2)}{s_p \sqrt{1/n_1 +1/n_2}} \end{displaymath}

has a $t$ distribution with $n_1+n_2-2$ degrees of freedom. This then serves as the reference distribution for testing a hypothesis about any specified mean treatment effect $\mu_1-\mu_2$ and for constructing a confidence interval for mean treatment effect.

To test the null hypothesis of no treatment effect

\begin{displaymath}H_0: \mu_1=\mu_2\end{displaymath}

against the two-sided alternative that there is some treatment effect in either direction
\begin{displaymath}H_A: \mu_1\ne \mu_2\end{displaymath}

the two-sample t statistic
\begin{displaymath}t = \frac{(\bar y_1 - \bar y_2)}{s_p \sqrt{1/n_1 +
1/n_2}} \end{displaymath}

is computed and compared to the $t$ distribution with $n_1+n_2-2$ degrees of freedom. If the observed value of $t$ from the experimental data is greater than the upper $\alpha/2$ point of the $t$ distribution or less than the lower $\alpha/2$ point the null hypothesis is rejected in favor of the two-sided alternative.
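In R, these cutoff points come from the t quantile function qt. A minimal sketch, using $\alpha = .10$ and the 6 degrees of freedom of the example below:

alpha <- 0.10
qt(1 - alpha/2, df = 6)  # upper alpha/2 point, about 1.94; reject if |t| exceeds this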

The test is said to be an $\alpha$-level test because with that procedure, the probability of rejecting the null hypothesis if it is actually true is $\alpha$. Because we want that probability to be small, it is conventional to choose a small value of $\alpha$ such as .01, .05, or .10.

For a one-sided alternative, such as $H_A:\mu_1 > \mu_2$, the null hypothesis is rejected if the observed value of $t$ is greater than the upper $\alpha$ point of the $t$ distribution, that is, if the observed value of the $t$ statistic is far enough in the direction of the alternative.

Rather than make the arbitrary specification of a level $\alpha$, it is common simply to report the p-value, which is the probability of obtaining a value of the test statistic as extreme as or more extreme than the one observed, in the direction of the alternative. This probability is again computed using the reference $t$ distribution, which describes the distribution of the test statistic when the null hypothesis is true.
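In R such p-values come from the t distribution function pt. For an observed statistic tobs with df degrees of freedom (illustrative names), the two-sided p-value is

2 * pt(abs(tobs), df = df, lower.tail = FALSE)  # two-sided p-value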

Confidence intervals based on a normal distribution assumption or approximation typically have the form

(estimate) $\pm$ $t$ $\times$ (standard error of estimate)

where $t$ is the upper $\alpha/2$ point of the reference $t$ distribution. The standard error of the estimate is the square root of the estimated variance of the estimate.

The point estimate of treatment effect is $\bar y_1 - \bar y_2$. The confidence interval gives an interval estimate of the same treatment effect.

Since the two sample means are independent, the standard deviation of this estimated effect is $\sqrt{\sigma^2/n_1 +\sigma^2/n_2}=\sigma\sqrt{1/n_1+1/n_2}$. The standard error of the estimate, which estimates this standard deviation, is thus $s_p\sqrt{1/n_1 + 1/n_2}$.

A $(1-\alpha)$ confidence interval for treatment effect thus has the form


\begin{displaymath}
(\bar y_1 -\bar y_2) \pm t s_p \sqrt{1/n_1 + 1/n_2}
\end{displaymath}

where $t$ is the upper $\alpha/2$ point of the $t$ distribution with $n_1+n_2-2$ degrees of freedom.
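This interval can also be computed by hand in R. The sketch below uses the quantities diffobs, spooled, n1, and n2 as computed in the tomato example later in this section; it should reproduce the 90 percent interval (2.311503, 27.673497) reported by the pooled t.test there.

tcrit <- qt(0.95, df = n1 + n2 - 2)  # upper .05 point for a 90 percent interval
diffobs + c(-1, 1) * tcrit * spooled * sqrt(1/n1 + 1/n2)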




There are two forms of the two-sample t statistic: the pooled form above, which assumes equal variances, and a more general (Welch) form that does not assume equal variances for the different treatment groups. 

Example: Tomato experiment; two-sample t test

Because of the way the t.test function in R works, it is convenient to form two separate response variables, one for each treatment level. 

y1 <- y[1:4]  # puts the responses to the control into a variable called y1
y2 <- y[5:8]  # puts the responses to the fertilizer into a variable called y2
cbind(y1,y2) #prints out the two variables in column format
        y1    y2
[1,] 16.75 31.63
[2,] 37.35 46.14
[3,] 29.40 36.55
[4,] 24.32 53.47

First calculate the t test by hand, so to speak. Here s1, defined earlier in these notes, holds the indices of the units assigned to the control. Without the assumption of equal variances, this is

> diffobs <- mean(y[-s1])-mean(y[s1]) # the observed difference between treatment means
> stderror <- sqrt(var(y[s1])/length(y[s1])+var(y[-s1])/length(y[-s1])) # unpooled s.e.
> tobs <- diffobs/stderror # observed value of the unpooled t statistic
> tobs # print out its value
[1] 2.297385
> pt(tobs, df=5.916, lower.tail=F)  # one-sided test of treatment effect
[1] 0.03096608
> 2 * pt(tobs, df=5.916, lower.tail=F)  # two-sided test
[1] 0.06193215
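The approximate degrees of freedom 5.916 used above come from the Welch-Satterthwaite formula; a sketch of the computation:

v1 <- var(y[s1])/length(y[s1])    # estimated variance of the control mean
v2 <- var(y[-s1])/length(y[-s1])  # estimated variance of the fertilizer mean
(v1 + v2)^2 / (v1^2/(length(y[s1]) - 1) + v2^2/(length(y[-s1]) - 1))  # about 5.916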

Now do the same for the pooled t statistic (n1 and n2, the two sample sizes, are each 4 here):

> degreesfree <- n1 + n2 -2  # degrees of freedom for the pooled t statistic
> degreesfree # print it out
[1] 6
> var(y1) # the sample variance for the control group
[1] 75.03977
> var(y2)
[1] 95.30963
> ssqpooled <- ((n1-1)* var(y1) + (n2-1) * var(y2))/degreesfree # pooled sample variance
> spooled <- sqrt(ssqpooled) # pooled sample standard deviation
> tpooled <- diffobs/(spooled * sqrt(1/n1 + 1/n2)) # pooled t statistic
> ssqpooled # print them out
[1] 85.1747
> spooled
[1] 9.229014
> tpooled
[1] 2.297385
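Note that the pooled t statistic equals the unpooled one here (2.297385): when $n_1=n_2$, $s^2_p(1/n_1+1/n_2)$ reduces to $s^2_1/n_1+s^2_2/n_2$, so the two statistics differ only in their reference degrees of freedom. The two-sided p-value for the pooled test can be computed directly, and should match the p-value reported by t.test below.

2 * pt(tpooled, df = degreesfree, lower.tail = F)  # two-sided p-value, about 0.0613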

Next do each two sample t test using the R function t.test:

> t.test(y2,y1,conf.level=.90)  # two sample t test not assuming equal variances

        Welch Two Sample t-test

data:  y2 and y1
t = 2.2974, df = 5.916, p-value = 0.06193
alternative hypothesis: true difference in means is not equal to 0
90 percent confidence interval:
  2.279276 27.705724
sample estimates:
mean of x mean of y
  41.9475   26.9550

The pooled t test, assuming equal variances: 

> t.test(y2,y1,var.equal=T,conf.level=.90) # pooled t test

        Two Sample t-test

data:  y2 and y1
t = 2.2974, df = 6, p-value = 0.06132
alternative hypothesis: true difference in means is not equal to 0
90 percent confidence interval:
  2.311503 27.673497
sample estimates:
mean of x mean of y
  41.9475   26.9550

Now specify a one-sided test, and note the smaller p-value: the two-sided Welch t test above has twice this p-value. 

> t.test(y2,y1,alternative="greater") # y2 is put first only because we find it
                         # easier to interpret if a beneficial fertilizer gives a positive effect

        Welch Two Sample t-test

data:  y2 and y1
t = 2.2974, df = 5.916, p-value = 0.03097
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 2.279276      Inf
sample estimates:
mean of x mean of y
  41.9475   26.9550

The actual exact randomization distribution of the t statistic is shown in the figure below, compared to the smooth curve of the assumed reference t distribution under the null hypothesis of no treatment effect. The R commands used to obtain this randomization distribution are contained in a separate function. The reference t distribution was drawn after that with the command below.

curve(dt(x,df=5.916),-4,4,100,add=T)  # curve plots a function; dt gives the density of the t distribution



[Figure: randomization and theoretical distribution of the t statistic]
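The original randomization function is not reproduced on this page. The sketch below is one plausible reconstruction (an assumption, not the original course code): it enumerates all choose(8,4) = 70 possible assignments of four units to the control group and computes the unpooled t statistic for each.

tstat <- function(s1, y) {  # unpooled t statistic for control indices s1
  (mean(y[-s1]) - mean(y[s1])) /
    sqrt(var(y[s1])/length(s1) + var(y[-s1])/(length(y) - length(s1)))
}
assignments <- combn(length(y), 4)            # all 70 possible control groups
tdist <- apply(assignments, 2, tstat, y = y)  # t statistic for each assignment
hist(tdist, freq = FALSE, breaks = 20)        # the randomization distribution
curve(dt(x, df = 5.916), -4, 4, 100, add = TRUE)  # overlay the reference t density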


See also the example with the plant growth experiment data. 




steve thompson 2006-03-03