The NPAR1WAY Procedure

Simple Linear Rank Tests for Two-Sample Data

Statistics of the form

S  =  \sum_{j=1}^n c_j a(R_j)

are called simple linear rank statistics, where

R_j
is the rank of observation j

a(R_j)
is the score based on that rank

c_j
is an indicator variable denoting the class to which the jth observation belongs

n
is the total number of observations

For two-sample data (where the observations are classified into two levels), PROC NPAR1WAY calculates simple linear rank statistics for the scores that you specify. The "Scores for Linear Rank and One-Way ANOVA Tests" section describes the available scores, which you can use to test for differences in location and differences in scale.

To compute S, PROC NPAR1WAY sums the scores of the observations in the smaller of the two samples. If both samples have the same number of observations, PROC NPAR1WAY sums those scores for the sample that appears first in the input data set.
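The summation rule above can be sketched in Python. This is only an illustration and not the procedure's own code: the function names `midranks`, `reference_class`, and `linear_rank_statistic` are hypothetical, and the default score a(R_j) = R_j (Wilcoxon scores, with tied values sharing their average rank) is an assumption chosen for the example.

```python
from collections import Counter


def midranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def reference_class(classes):
    """Smaller class; with equal sizes, the class that appears first."""
    counts = Counter(classes)
    first_seen = list(dict.fromkeys(classes))  # classes in input order
    # min() returns the first minimal element, so ties keep input order.
    return min(first_seen, key=lambda c: counts[c])


def linear_rank_statistic(values, classes, score=lambda r: r):
    """Sum the scores a(R_j) over the observations in the reference class."""
    ref = reference_class(classes)
    ranks = midranks(values)
    return sum(score(r) for r, c in zip(ranks, classes) if c == ref)


values = [1.2, 3.4, 2.2, 5.1, 4.0]
classes = ["A", "B", "A", "B", "B"]
S = linear_rank_statistic(values, classes)  # class A is smaller: ranks 1 + 2
```

Passing a different `score` function changes the score type without changing the summation rule.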

For each score that you specify, PROC NPAR1WAY computes an asymptotic test of the null hypothesis of no difference between the two classification levels. Exact tests are also available for these two-sample linear rank statistics. PROC NPAR1WAY computes exact tests for each score type that you specify in the EXACT statement. See the "Exact Tests" section for details on exact tests.

To compute an asymptotic test for a linear rank sum statistic, PROC NPAR1WAY uses a standardized test statistic z, which has an asymptotic standard normal distribution under the null hypothesis. The standardized test statistic is computed as

z = \frac{S - E_0(S)}{\sqrt{Var_0(S)}}

where E_0(S) is the expected value of S under the null hypothesis, and Var_0(S) is the variance of S under the null hypothesis. As shown in Randles and Wolfe (1979),

E_0(S) = \frac{n_1}{n} \sum_{j=1}^n a(R_j)

where n_1 is the number of observations in the first (smaller) class level, and

Var_0(S) = \frac{1}{n-1} \, \frac{n_1 n_2}{n} \sum_{j=1}^n \left( a(R_j) - \bar{a} \right)^2

where n_2 = n - n_1 is the number of observations in the other class level, and
\bar{a} = \frac{1}{n} \sum_{j=1}^n a(R_j)
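As a rough numerical check of the formulas above, the following Python sketch computes S, E_0(S), Var_0(S), and the uncorrected z from a list of scores and class-membership flags. The function name `standardized_statistic` and its argument layout are hypothetical, chosen only for this illustration. No continuity correction is applied here.

```python
import math


def standardized_statistic(scores, in_first):
    """Return (S, E0, Var0, z) for scores a(R_j) and first-class flags."""
    n = len(scores)
    n1 = sum(in_first)          # observations in the first (smaller) class
    n2 = n - n1                 # observations in the other class
    s = sum(a for a, c in zip(scores, in_first) if c)
    a_bar = sum(scores) / n
    e0 = n1 * sum(scores) / n
    var0 = (n1 * n2) / (n * (n - 1)) * sum((a - a_bar) ** 2 for a in scores)
    z = (s - e0) / math.sqrt(var0)
    return s, e0, var0, z


# Five observations with scores 1..5; the first class holds scores 1 and 3.
scores = [1, 2, 3, 4, 5]
in_first = [True, False, True, False, False]
s, e0, var0, z = standardized_statistic(scores, in_first)
# s = 4, e0 = 6, var0 = 3, so z = -2 / sqrt(3)
```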

PROC NPAR1WAY computes one-sided and two-sided asymptotic p-values for each two-sample linear rank test. When the test statistic z is greater than its null hypothesis expected value of zero, PROC NPAR1WAY computes the right-sided p-value, which is the probability of a larger value of the statistic occurring under the null hypothesis. When the test statistic is less than or equal to zero, PROC NPAR1WAY computes the left-sided p-value, which is the probability of a smaller value of the statistic occurring under the null hypothesis. The one-sided p-value P_1 can be expressed as

P_1 = \begin{cases} {\rm Prob}(Z > z) & {\rm if}\ z > 0 \\ {\rm Prob}(Z < z) & {\rm if}\ z \leq 0 \end{cases}

where Z has a standard normal distribution. The two-sided p-value P_2 is computed as

P_2 = {\rm Prob}(|Z| > |z|)
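The two p-value definitions can be evaluated with the standard normal tail probability. The Python sketch below is illustrative only; `normal_sf` and `asymptotic_p_values` are hypothetical helper names, and the tail probability is obtained from the standard library's `math.erfc`.

```python
import math


def normal_sf(x):
    """Upper-tail probability Prob(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def asymptotic_p_values(z):
    """Return (one-sided, two-sided) p-values for a standardized statistic z."""
    if z > 0:
        p1 = normal_sf(z)       # right-sided: Prob(Z > z)
    else:
        p1 = normal_sf(-z)      # left-sided: Prob(Z < z) = Prob(Z > -z)
    p2 = 2 * normal_sf(abs(z))  # Prob(|Z| > |z|)
    return p1, p2


p1, p2 = asymptotic_p_values(-2 / math.sqrt(3))
```

Because the standard normal distribution is symmetric, the two-sided p-value is always twice the one-sided p-value.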

For Wilcoxon scores and Siegel-Tukey scores, PROC NPAR1WAY incorporates a continuity correction when computing the standardized test statistic z, unless you specify CORRECT=NO. The correction moves the numerator S - E_0(S) toward zero by 0.5: PROC NPAR1WAY subtracts 0.5 if the numerator is greater than zero and adds 0.5 if it is less than zero. Some sources recommend a continuity correction for nonparametric tests that use a continuous distribution to approximate a discrete distribution; see Sheskin (1997). If you specify CORRECT=NO, PROC NPAR1WAY does not use a continuity correction for any test.
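A minimal Python sketch of the continuity correction follows. The function `corrected_z` and its `correct` flag are hypothetical; the flag only mimics the effect of CORRECT=NO. Leaving a numerator of exactly zero unchanged is an assumption of this sketch, since the text describes only the positive and negative cases.

```python
import math


def corrected_z(s, e0, var0, correct=True):
    """Standardized statistic with an optional 0.5 continuity correction."""
    numerator = s - e0
    if correct:
        if numerator > 0:
            numerator -= 0.5    # shrink a positive numerator toward zero
        elif numerator < 0:
            numerator += 0.5    # shrink a negative numerator toward zero
        # A numerator of exactly zero is left as is (assumption).
    return numerator / math.sqrt(var0)


z_corrected = corrected_z(4, 6, 3)                    # (-2 + 0.5) / sqrt(3)
z_uncorrected = corrected_z(4, 6, 3, correct=False)   # -2 / sqrt(3)
```

The correction always reduces the magnitude of z, so the corrected test is slightly more conservative than the uncorrected one.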


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.