Computational Methods

The TTEST Procedure

The t Statistic

The form of the t statistic used varies with the type of test being performed.

To compare an individual mean with a sample of size n to a value m, use
$t = \frac{\bar{x}-m} {\displaystyle s/{\sqrt{n}} }$
where $\bar{x}$ is the sample mean of the observations and s² is the sample variance of the observations.
To compare n paired differences to a value m, use
$t = \frac{\bar{d}-m} {\displaystyle {s_d}/{\sqrt{n}} }$
where $\bar{d}$ is the sample mean of the paired differences and s²_d is the sample variance of the paired differences.
To compare means from two independent samples with n₁ and n₂ observations to a value m, use
$t = \frac{(\bar{x}_1-\bar{x}_2)-m} {\displaystyle s \sqrt{ \frac{1}{n_1} + \frac{1}{n_2} } }$
where s² is the pooled variance
s² = [((n₁-1)s₁²+(n₂-1)s₂²)/(n₁+n₂-2)]
and s₁² and s₂² are the sample variances of the two groups. The use of this t statistic depends on the assumption that $\sigma_1^2=\sigma_2^2$ , where $\sigma_1^2$ and $\sigma_2^2$ are the population variances of the two groups.
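The three statistics above can be sketched in a few lines of Python. This is an illustration of the formulas only, not the PROC TTEST implementation; the function names and the sample data are hypothetical, and `statistics.variance` is used because it divides by n-1, matching the sample variance s².

```python
import math
from statistics import mean, variance


def one_sample_t(x, m=0.0):
    """t statistic comparing the mean of x to the value m."""
    n = len(x)
    return (mean(x) - m) / math.sqrt(variance(x) / n)


def paired_t(x, y, m=0.0):
    """t statistic comparing the mean paired difference x - y to m."""
    d = [a - b for a, b in zip(x, y)]
    return one_sample_t(d, m)


def pooled_t(x1, x2, m=0.0):
    """Two-sample t statistic using the pooled variance (assumes equal variances)."""
    n1, n2 = len(x1), len(x2)
    s2 = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)
    return (mean(x1) - mean(x2) - m) / math.sqrt(s2 * (1 / n1 + 1 / n2))


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
a = [1, 2, 3, 4, 5]
b = [2, 4, 6, 8, 10]
t_one = one_sample_t(a, m=2)   # (3 - 2) / sqrt(2.5 / 5)
t_pair = paired_t(a, b)        # mean difference -3, variance 2.5
t_pool = pooled_t(a, b)        # pooled variance 6.25
```

The paired case reuses the one-sample function because the paired test is a one-sample test on the differences.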

The Folded Form F Statistic

The folded form of the F statistic, F', tests the hypothesis that the variances are equal, where

F' = [(max(s₁²,s₂²))/(min(s₁²,s₂²))]

A test of F' is a two-tailed F test because you do not specify which variance you expect to be larger. The p-value gives the probability of a greater F value under the null hypothesis that $\sigma_1^2=\sigma_2^2$ .
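As a small sketch, the folded F statistic can be computed from two samples as follows. The function name and data are hypothetical. The function also returns the numerator and denominator degrees of freedom, taking the numerator degrees of freedom from the group with the larger variance; the p-value is not computed here because it needs an F distribution function.

```python
from statistics import variance


def folded_f(x1, x2):
    """Return (F', numerator df, denominator df) for the folded F statistic."""
    v1, v2 = variance(x1), variance(x2)
    n1, n2 = len(x1), len(x2)
    if v1 >= v2:
        return v1 / v2, n1 - 1, n2 - 1
    # The larger variance always goes in the numerator, so F' >= 1.
    return v2 / v1, n2 - 1, n1 - 1


# Hypothetical data: sample variances are 2.5 and 10.
f_stat, df_num, df_den = folded_f([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```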

The Approximate t Statistic

Under the assumption of unequal variances, the approximate t statistic is computed as

$t' = \frac{\bar{x}_1-\bar{x}_2}{\sqrt{w_1+w_2}}$

where

w₁ = [(s₁²)/(n₁)], w₂ = [(s₂²)/(n₂)]
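A minimal Python sketch of the approximate t statistic, using the weights w₁ and w₂ defined above. The function name and the data are hypothetical.

```python
import math
from statistics import mean, variance


def approximate_t(x1, x2):
    """Approximate t statistic for two samples with unequal variances."""
    w1 = variance(x1) / len(x1)   # s1^2 / n1
    w2 = variance(x2) / len(x2)   # s2^2 / n2
    return (mean(x1) - mean(x2)) / math.sqrt(w1 + w2)


# Hypothetical data: means 3 and 7, sample variances 2.5 and 14.
t_approx = approximate_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10, 12])
```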

The Cochran and Cox Approximation

The Cochran and Cox (1950) approximation of the probability level of the approximate t statistic is the value of p such that

t' = [(w₁t₁+w₂t₂)/(w₁+w₂)]

where t₁ and t₂ are the critical values of the t distribution corresponding to a significance level of p and sample sizes of n₁ and n₂, respectively. The number of degrees of freedom is undefined when $n_1 \ne n_2$. In general, the Cochran and Cox test tends to be conservative (Lee and Gurland 1975).
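The weighted average in the Cochran and Cox formula can be illustrated at a fixed significance level. The sketch below is not a full p-value solver: it assumes that t₁ and t₂ are read from a t table with n₁-1 and n₂-1 degrees of freedom, and it returns the critical value against which |t'| would be compared. The function name and the summary statistics are hypothetical.

```python
def cochran_cox_critical(s1_sq, n1, s2_sq, n2, t1, t2):
    """Weighted average of two t critical values, with weights w_i = s_i^2 / n_i."""
    w1, w2 = s1_sq / n1, s2_sq / n2
    return (w1 * t1 + w2 * t2) / (w1 + w2)


# Hypothetical summary statistics: s1^2 = 2.5 with n1 = 5, s2^2 = 10 with n2 = 10.
# Two-sided 5% table values: t(0.975, 4 df) = 2.776445, t(0.975, 9 df) = 2.262157.
crit = cochran_cox_critical(2.5, 5, 10.0, 10, t1=2.776445, t2=2.262157)
```

The result always lies between t₁ and t₂. Finding the probability level itself would mean searching for the p at which this weighted value equals the observed |t'|.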

Satterthwaite's Approximation

The formula for Satterthwaite's (1946) approximation for the degrees of freedom for the approximate t statistic is:

df = [( (w₁+w₂)² )/( ( [(w₁²)/(n₁-1)]+[(w₂²)/(n₂-1)] ) )]

Refer to Steel and Torrie (1980) or Freund, Littell, and Spector (1986) for more information.
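Satterthwaite's formula is simple to evaluate from summary statistics. The following Python sketch uses a hypothetical function name and hypothetical inputs.

```python
def satterthwaite_df(s1_sq, n1, s2_sq, n2):
    """Approximate degrees of freedom for the approximate t statistic."""
    w1, w2 = s1_sq / n1, s2_sq / n2
    return (w1 + w2) ** 2 / (w1 ** 2 / (n1 - 1) + w2 ** 2 / (n2 - 1))


# Hypothetical inputs: s1^2 = 2.5, s2^2 = 10, and n1 = n2 = 5.
df = satterthwaite_df(2.5, 5, 10.0, 5)   # 6.25 / 1.0625
```

The result is generally not an integer. It lies between min(n₁, n₂) - 1 and n₁ + n₂ - 2, so strongly unequal variances pull it toward the degrees of freedom of the group with the larger weight.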

Confidence Interval Estimation

The form of the confidence interval varies with the statistic for which it is computed. In the following confidence intervals involving means, $t_{1-\frac{\alpha}2,n-1}$ is the $100(1-\frac{\alpha}2)\%$ quantile of the t distribution with n-1 degrees of freedom. The confidence interval for

an individual mean from a sample of size n compared to a value m is given by
$(\bar{x} - m) \pm t_{1-\frac{\alpha}2,n-1} { \frac{s}{\sqrt{n}} }$
where $\bar{x}$ is the sample mean of the observations and s² is the sample variance of the observations
paired differences with a sample of size n differences compared to a value m is given by
$(\bar{d} - m) \pm t_{1-\frac{\alpha}2,n-1} { \frac{s_d}{\sqrt{n}} }$
where $\bar{d}$ and s²_d are the sample mean and sample variance of the paired differences, respectively
the difference of two means from independent samples with n₁ and n₂ observations compared to a value m is given by
$((\bar{x}_1 - \bar{x}_2) - m) \pm t_{1-\frac{\alpha}2,n_1+n_2-2} s \sqrt{ \frac{1}{n_1} + \frac{1}{n_2} }$
where s² is the pooled variance
s² = [((n₁-1)s₁²+(n₂-1)s₂²)/(n₁+n₂-2)]
and where s₁² and s₂² are the sample variances of the two groups. The use of this confidence interval depends on the assumption that $\sigma_1^2=\sigma_2^2$ , where $\sigma_1^2$ and $\sigma_2^2$ are the population variances of the two groups.
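As a worked sketch of the first interval, the Python function below computes the confidence limits for an individual mean. The t quantile is passed in as a table value rather than computed, which is a simplification; the function name and the data are hypothetical.

```python
import math
from statistics import mean, variance


def mean_ci(x, t_crit, m=0.0):
    """Confidence limits for (mean of x) - m, given the t quantile t_crit."""
    n = len(x)
    half_width = t_crit * math.sqrt(variance(x) / n)
    center = mean(x) - m
    return center - half_width, center + half_width


# Hypothetical data with n = 5, so the quantile has 4 degrees of freedom.
# Table value for a 95% interval: t(0.975, 4 df) = 2.776445.
low, high = mean_ci([1, 2, 3, 4, 5], t_crit=2.776445)
```

The paired and two-sample intervals follow the same pattern, with the standard error replaced by $s_d/\sqrt{n}$ or by the pooled standard error, and with n₁+n₂-2 degrees of freedom in the two-sample case.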

The distribution of the estimated standard deviation of a mean is not symmetric, so alternative methods of estimating confidence intervals for the standard deviation are possible. PROC TTEST computes two estimates. For both methods, the data are assumed to have a normal distribution with mean $\mu$ and variance $\sigma^2$ , both unknown. The methods are as follows:

The default method, an equal-tails confidence interval, puts an equal amount of area ( $\frac{\alpha}2$ ) in each tail of the chi-square distribution. An equal-tails test of $H_0\colon\sigma=\sigma_0$ has acceptance region
$\{\chi_{\frac{\alpha}2,n-1}^2 \leq \frac{(n-1)S^2}{\sigma_0^2} \leq \chi_{1-\frac{\alpha}2,n-1}^2 \}$
which can be algebraically manipulated to give the following $100(1-\alpha)\%$ confidence interval for $\sigma^2$ :
$(\frac{(n-1)S^2}{\chi_{1-\frac{\alpha}2,n-1}^2}, \frac{(n-1)S^2}{\chi_{\frac{\alpha}2,n-1}^2})$

In order to obtain a confidence interval for $\sigma$ , the square root of each side is taken, leading to the following $100(1-\alpha)\%$ confidence interval:
$(\sqrt{\frac{(n-1)S^2}{\chi_{1-\frac{\alpha}2,n-1}^2}}, \sqrt{\frac{(n-1)S^2}{\chi_{\frac{\alpha}2,n-1}^2}})$
The second method yields a confidence interval derived from the uniformly most powerful unbiased test of $H_0\colon\sigma=\sigma_0$ (Lehmann 1986). This test has acceptance region
$\{c_1 \leq \frac{(n-1)S^2}{\sigma_0^2} \leq c_2 \}$
where the critical values c₁ and c₂ satisfy
$\int_{c_1}^{c_2}f_{n-1}(y)dy=1-\alpha$
and
$\int_{c_1}^{c_2}yf_{n-1}(y)dy=(n-1)(1-\alpha)$
where $f_{n-1}(y)$ is the density of the chi-square distribution with n-1 degrees of freedom, which is the null distribution of $\frac{(n-1)S^2}{\sigma_0^2}$ . This acceptance region can be algebraically manipulated to arrive at
$P\{\frac{(n-1)S^2}{c_2} \leq \sigma^2 \leq \frac{(n-1)S^2}{c_1} \}=1-\alpha$
where c₁ and c₂ satisfy the preceding two integral equations. To find the area in each tail of the chi-square distribution to which these two critical values correspond, solve $c_1 = \chi_{\alpha_1,n-1}^2$ and $c_2=\chi_{1-\alpha_2,n-1}^2$ for $\alpha_1$ and $\alpha_2$ ; the resulting $\alpha_1$ and $\alpha_2$ sum to $\alpha$ . Hence, a $100(1-\alpha)\%$ confidence interval for $\sigma^2$ is given by
$(\frac{(n-1)S^2}{\chi_{1-\alpha_2,n-1}^2}, \frac{(n-1)S^2}{\chi_{\alpha_1,n-1}^2})$
In order to obtain a $100(1-\alpha)\%$ confidence interval for $\sigma$ , the square root is taken of both terms, yielding
$(\sqrt{\frac{(n-1)S^2}{\chi_{1-\alpha_2,n-1}^2}}, \sqrt{\frac{(n-1)S^2}{\chi_{\alpha_1,n-1}^2}})$
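Both intervals for $\sigma$ can be sketched numerically with the Python standard library alone. This is an illustration under stated assumptions, not the algorithm PROC TTEST uses. The chi-square distribution is taken to have n-1 degrees of freedom, the distribution function is evaluated with a power series, and quantiles are found by bisection. For the second method, the sketch relies on the fact that the two integral conditions together are equivalent to coverage $1-\alpha$ plus $c_1^{\nu/2}e^{-c_1/2}=c_2^{\nu/2}e^{-c_2/2}$ with $\nu=n-1$. All function names are hypothetical.

```python
import math


def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0.0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def chi2_quantile(p, df):
    """Invert chi2_cdf by bisection, for p strictly between 0 and 1."""
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def equal_tails_sd_ci(s, n, alpha=0.05):
    """Equal-tails confidence interval for sigma."""
    df = n - 1
    ss = df * s * s
    return (math.sqrt(ss / chi2_quantile(1 - alpha / 2, df)),
            math.sqrt(ss / chi2_quantile(alpha / 2, df)))


def umpu_critical_values(n, alpha=0.05):
    """Critical values c1 < c2 of the UMPU test, found by nested bisection."""
    df = n - 1

    def log_h(c):
        # log of c * f_df(c), up to an additive constant; it peaks at c = df
        return 0.5 * df * math.log(c) - 0.5 * c

    def matching_c2(c1):
        # the point above df where log_h takes the same value as at c1
        target = log_h(c1)
        lo, hi = float(df), 2.0 * df
        while log_h(hi) > target:
            hi *= 2.0
        for _ in range(100):
            mid = 0.5 * (lo + hi)
            if log_h(mid) > target:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    lo, hi = 0.0, float(df)
    for _ in range(100):
        c1 = 0.5 * (lo + hi)
        coverage = chi2_cdf(matching_c2(c1), df) - chi2_cdf(c1, df)
        if coverage > 1 - alpha:
            lo = c1   # region too wide: raise c1
        else:
            hi = c1   # region too narrow: lower c1
    c1 = 0.5 * (lo + hi)
    return c1, matching_c2(c1)


def umpu_sd_ci(s, n, alpha=0.05):
    """Confidence interval for sigma derived from the UMPU test."""
    c1, c2 = umpu_critical_values(n, alpha)
    ss = (n - 1) * s * s
    return math.sqrt(ss / c2), math.sqrt(ss / c1)
```

For example, `equal_tails_sd_ci(1.0, 10)` and `umpu_sd_ci(1.0, 10)` give the two 95% intervals for a sample of size 10 with S = 1. The tail areas of the second method are `chi2_cdf(c1, n - 1)` and `1 - chi2_cdf(c2, n - 1)`, which sum to $\alpha$ but are generally unequal.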
