Stat650/403
Completely Randomized Design
Model-based methods
Some main points:
- If the response variables in the population are independent, normally distributed random variables with equal variances and means depending on treatment level, then the pooled t-statistic has exactly a t distribution with n1 + n2 - 2 degrees of freedom.
- If the above assumptions hold but the variances (and sample
sizes) are unequal, then the above statistic no longer has a t
distribution, but the alternate (Welch) t statistic has approximately a
t distribution, with a formula for approximate degrees of freedom.
- If the data are not normally distributed, the t statistic still has an approximate t distribution, by the central limit theorem.
- The independence assumption is very important.
- The completely randomized design ensures that the
assumptions for the t-test are at least approximately met.
Learning objectives:
- Know how to apply a two-sample t test for treatment effect
for a completely randomized design with two treatments.
- Know how to obtain t-statistic based confidence
intervals.
- Understand the assumptions under which the t test and
confidence interval procedures work exactly or approximately.
- Have a working knowledge of when to use one version or the
other of the two-sample t test.
- Understand how the completely randomized design helps to
make the t tests and confidence interval procedures work
correctly.
- Understand the conceptual difference between assuming a
model for the population values, as opposed to basing the
analysis solely on the assignment probabilities of the experimental
design used.
Two-sample t Test and Confidence Interval
It is common in the analysis of experiments to assume the following statistical model for the response variables. For the $j$th unit receiving the $i$th treatment, the response $Y_{ij}$ is a random variable having a normal distribution. The mean of $Y_{ij}$ is $\mu_i$, so that the expected value of the response might depend on the treatment but does not depend on the specific unit in this model. The variance of $Y_{ij}$ is assumed to be some constant $\sigma^2$, so that the variance is the same for each treatment group. The standard deviation of $Y_{ij}$ is then $\sigma$. Further, the responses are assumed to be independent, so that the response of one unit does not depend on the response of another unit, either in the same group or in a different group.
Let $s_1^2$ denote the sample variance for the responses of the $n_1$ units assigned to treatment 1, and let $s_2^2$ denote the sample variance for the $n_2$ units assigned to treatment 2. With the assumption of equal variances, the common variance $\sigma^2$ can be estimated unbiasedly with the pooled sample variance
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}.$$
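As a small illustration, the pooled variance is easy to compute directly in R (a sketch, not part of the course code; the tomato example below computes the same quantity from actual data):

# Pooled sample variance from the two groups' sample variances and sizes
pooledvar <- function(s1sq, s2sq, n1, n2) {
  ((n1 - 1) * s1sq + (n2 - 1) * s2sq) / (n1 + n2 - 2)
}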
Under these assumptions the quantity
$$t = \frac{(\bar{Y}_2 - \bar{Y}_1) - (\mu_2 - \mu_1)}{s_p \sqrt{1/n_1 + 1/n_2}}$$
has a $t$ distribution with $n_1 + n_2 - 2$ degrees of freedom. This then serves as the reference distribution for testing a hypothesis about any specified mean treatment effect and for constructing a confidence interval for the mean treatment effect.
To test the null hypothesis of no treatment effect, $H_0\!: \mu_2 - \mu_1 = 0$, against the two-sided alternative that there is some treatment effect in either direction, $H_1\!: \mu_2 - \mu_1 \ne 0$, the two-sample t statistic
$$t = \frac{\bar{Y}_2 - \bar{Y}_1}{s_p \sqrt{1/n_1 + 1/n_2}}$$
is computed and compared to the $t$ distribution with $n_1 + n_2 - 2$ degrees of freedom. If the observed value of $t$ from the experimental data is greater than the upper $\alpha/2$ point of the distribution or less than the lower $\alpha/2$ point, the null hypothesis is rejected in favor of the two-sided alternative.
The test is said to be an $\alpha$-level test because with that procedure, the probability of rejecting the null hypothesis if it is actually true is $\alpha$. Because we want that probability to be small, it is conventional to choose a small value of $\alpha$ such as .01, .05, or .10.
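For example, with $\alpha = .05$ and $n_1 = n_2 = 4$, the cutoff points can be found with R's qt function (a quick sketch using illustrative sample sizes):

alpha <- 0.05
n1 <- 4; n2 <- 4
qt(1 - alpha/2, df = n1 + n2 - 2)  # upper alpha/2 point, about 2.447
qt(alpha/2, df = n1 + n2 - 2)      # lower alpha/2 point, about -2.447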
For a one-sided alternative, such as $H_1\!: \mu_2 - \mu_1 > 0$, the null hypothesis is rejected if the observed value of $t$ is greater than the upper $\alpha$ point of the distribution, that is, if the observed value of the statistic is far enough in the direction of the alternative.
Rather than make the arbitrary specification of a level $\alpha$, it is common to simply report the p-value, which is the probability of obtaining a value of the test statistic as extreme as or more extreme than the one observed, in the direction of the alternative. This probability is again computed using the reference $t$ distribution that describes the distribution of the test statistic when the null hypothesis is true.
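In R, such a p-value comes from the pt function; for a two-sided alternative, a sketch (the tomato example below does this with the actual data):

# Two-sided p-value from the reference t distribution; tobs and dfree are
# placeholders for the observed statistic and its degrees of freedom
twosided_p <- function(tobs, dfree) 2 * pt(abs(tobs), df = dfree, lower.tail = FALSE)
twosided_p(2.297385, 5.916)  # reproduces the 0.0619 seen in the example below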
Confidence intervals based on a normal distribution assumption or approximation typically have the form
$$\text{(estimate)} \pm t_{\alpha/2} \times \text{(standard error of estimate)},$$
where $t_{\alpha/2}$ is the upper $\alpha/2$ point of the reference distribution. The standard error of the estimate is the square root of the estimated variance of the estimate.
The point estimate of treatment effect is $\bar{Y}_2 - \bar{Y}_1$. The confidence interval gives an interval estimate of the same treatment effect. The standard deviation of this estimated effect, since the two sample means are independent, is $\sigma \sqrt{1/n_1 + 1/n_2}$. The standard error of the estimate, which estimates this standard deviation, is thus $s_p \sqrt{1/n_1 + 1/n_2}$. A $100(1 - \alpha)\%$ confidence interval for the treatment effect thus has the form
$$(\bar{Y}_2 - \bar{Y}_1) \pm t_{\alpha/2}\, s_p \sqrt{1/n_1 + 1/n_2},$$
where $t_{\alpha/2}$ is the upper $\alpha/2$ point of the $t$ distribution with $n_1 + n_2 - 2$ degrees of freedom.
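A sketch of this interval in R (a hypothetical helper, not from the course materials, using the pooled standard error):

# 100(1 - alpha)% confidence interval for mu2 - mu1, pooled-variance form
tconfint <- function(ybar1, ybar2, spooled, n1, n2, alpha = 0.10) {
  tcrit <- qt(1 - alpha/2, df = n1 + n2 - 2)  # upper alpha/2 point
  (ybar2 - ybar1) + c(-1, 1) * tcrit * spooled * sqrt(1/n1 + 1/n2)
}
tconfint(26.9550, 41.9475, 9.229014, 4, 4)  # matches the 90% CI from t.test below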
Read the discussion here on the two forms of the two-sample t statistic, with the more general (Welch) form not assuming equal variances for different treatment groups.
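The Welch statistic divides the difference in means by the unpooled standard error, and its approximate degrees of freedom come from the Welch-Satterthwaite formula; a sketch in R:

# Welch-Satterthwaite approximate degrees of freedom (a sketch)
welchdf <- function(s1sq, s2sq, n1, n2) {
  v1 <- s1sq / n1
  v2 <- s2sq / n2
  (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
}
welchdf(75.03977, 95.30963, 4, 4)  # about 5.916, the df used in the example below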
Example: Tomato experiment; two-sample t test
Because of the way the t.test function in R works, it is convenient to
form two separate response variables, one for each treatment
level.
y1 <- y[1:4]  # puts the responses to the control into a variable called y1
y2 <- y[5:8]  # puts the responses to the fertilizer into a variable called y2
cbind(y1, y2) # prints out the two variables in column format
        y1    y2
[1,] 16.75 31.63
[2,] 37.35 46.14
[3,] 29.40 36.55
[4,] 24.32 53.47
First calculate the t test by hand, so to speak. Here s1 holds the indices of the units assigned to the control in the earlier randomization, so y[s1] and y[-s1] are the control and fertilizer responses. Without the assumption of equal variances, this is
diffobs <- mean(y[-s1]) - mean(y[s1])  # the observed difference between treatment means
stderror <- sqrt(var(y[s1])/length(y[s1]) + var(y[-s1])/length(y[-s1]))  # unpooled s.e.
tobs <- diffobs/stderror  # observed value of the unpooled t statistic
> tobs # print out its value
[1] 2.297385
> pt(tobs, df=5.916, lower.tail=F)  # one-sided test of treatment effect
[1] 0.03096608
> 2 * pt(tobs, df=5.916, lower.tail=F)  # two-sided test
[1] 0.06193215
Now do the same for the pooled t-statistic:
> degreesfree <- n1 + n2 - 2  # degrees of freedom for the pooled t statistic
> degreesfree # print it out
[1] 6
> var(y1) # the sample variance for the control group
[1] 75.03977
> var(y2) # the sample variance for the fertilizer group
[1] 95.30963
> ssqpooled <- ((n1-1)*var(y1) + (n2-1)*var(y2))/degreesfree  # pooled sample variance
> spooled <- sqrt(ssqpooled) # pooled sample standard deviation
> tpooled <- diffobs/(spooled * sqrt(1/n1 + 1/n2))  # pooled t statistic
> ssqpooled # print them out
[1] 85.1747
> spooled
[1] 9.229014
> tpooled
[1] 2.297385
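The p-values for the pooled test follow from the same pt calls as before; these should match the t.test output below:
> pt(tpooled, df=degreesfree, lower.tail=F)      # one-sided p-value
> 2 * pt(tpooled, df=degreesfree, lower.tail=F)  # two-sided p-value, about 0.0613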
Next do each two-sample t test using the R function t.test:
> t.test(y2, y1, conf.level=.90)  # two-sample t test not assuming equal variances
Welch Two Sample t-test
data: y2 and y1
t = 2.2974, df = 5.916, p-value = 0.06193
alternative hypothesis: true difference in means is not equal to 0
90 percent confidence interval:
2.279276 27.705724
sample estimates:
mean of x mean of y 
  41.9475   26.9550 
The pooled t test, assuming equal variances:
> t.test(y2,y1,var.equal=T,conf.level=.90) # pooled t test
Two Sample t-test
data: y2 and y1
t = 2.2974, df = 6, p-value = 0.06132
alternative hypothesis: true difference in means is not equal to 0
90 percent confidence interval:
2.311503 27.673497
sample estimates:
mean of x mean of y 
  41.9475   26.9550 
Now specify a one-sided test. Note the smaller p-value: the two-sided t-test by the same (Welch) method above has twice the p-value.
> t.test(y2, y1, alternative="greater")  # y2 is put first only because we find it easier
#   to interpret if a beneficial fertilizer gives a positive effect
Welch Two Sample t-test
data: y2 and y1
t = 2.2974, df = 5.916, p-value = 0.03097
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
2.279276 Inf
sample estimates:
mean of x mean of y 
  41.9475   26.9550 
The exact randomization distribution of the t statistic is shown here and below. This is compared to the smooth curve of the assumed reference t distribution under the null hypothesis of no treatment effect. The R commands used to obtain this randomization distribution are contained in a function here. The reference t distribution was drawn after that with the command below.
curve(dt(x, df=5.916), -4, 4, 100, add=T)  # curve plots a function; dt finds the density of the t distribution
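The linked function is not reproduced here, but a minimal sketch of how such a randomization distribution could be computed (hypothetical code, not the actual course function) is:

# Hypothetical sketch: exact randomization distribution of the unpooled
# t statistic for the 8 tomato responses y, with 4 units assigned to control
tstat <- function(y, s1) {
  diffobs <- mean(y[-s1]) - mean(y[s1])
  stderror <- sqrt(var(y[s1])/length(s1) + var(y[-s1])/(length(y) - length(s1)))
  diffobs / stderror
}
assignments <- combn(8, 4)                        # all choose(8,4) = 70 possible control groups
tdist <- apply(assignments, 2, function(s1) tstat(y, s1))
hist(tdist, freq = FALSE)                         # the randomization distribution
curve(dt(x, df = 5.916), -4, 4, 100, add = TRUE)  # reference t density overlaid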

See also the example with the plant growth experiment data.