The MEANS Procedure

The computational details for confidence limits, hypothesis test statistics, and quantile statistics follow.

Confidence Limits

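A two-sided $100(1-\alpha)$% confidence interval for the mean has upper and lower limits

$$\bar{x} \pm t_{(1-\alpha/2;\,n-1)} \frac{s}{\sqrt{n}}$$

where $s$ is $\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2}$ and $t_{(1-\alpha/2;\,n-1)}$ is the $(1-\alpha/2)$ critical value of the Student's **t** statistic with $n-1$ degrees of freedom.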
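
As a rough illustration of the two-sided limits for the mean (mean plus or minus the critical value times $s/\sqrt{n}$), the following Python sketch computes them for a small sample. The function name `mean_confidence_limits` and the sample data are hypothetical, and the critical value is passed in from a **t** table rather than computed, because the Python standard library has no Student's **t** quantile function.

```python
import math
import statistics


def mean_confidence_limits(values, t_crit):
    """Return (lower, upper) two-sided confidence limits for the mean.

    t_crit is assumed to be the (1 - alpha/2) critical value of the
    Student's t distribution with len(values) - 1 degrees of freedom,
    looked up by the caller.
    """
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)          # sample standard deviation, divisor n - 1
    half_width = t_crit * s / math.sqrt(n)
    return mean - half_width, mean + half_width


# Assumed table value: t critical value for 0.975 with 4 degrees of freedom.
T_CRIT_975_DF4 = 2.776445

lower, upper = mean_confidence_limits([1, 2, 3, 4, 5], T_CRIT_975_DF4)
print(round(lower, 4), round(upper, 4))
```

For a one-sided limit, the same function could be called with the $(1-\alpha)$ critical value, keeping only the bound of interest.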

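A one-sided $100(1-\alpha)$% confidence interval is computed as

$$\bar{x} + t_{(1-\alpha;\,n-1)} \frac{s}{\sqrt{n}} \quad \text{(upper)}$$

$$\bar{x} - t_{(1-\alpha;\,n-1)} \frac{s}{\sqrt{n}} \quad \text{(lower)}$$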

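A two-sided $100(1-\alpha)$% confidence interval for the standard deviation has lower and upper limits

$$s\sqrt{\frac{n-1}{\chi^2_{(1-\alpha/2;\,n-1)}}}, \qquad s\sqrt{\frac{n-1}{\chi^2_{(\alpha/2;\,n-1)}}}$$

where $\chi^2_{(1-\alpha/2;\,n-1)}$ and $\chi^2_{(\alpha/2;\,n-1)}$ are the $(1-\alpha/2)$ and $\alpha/2$ critical values of the chi-square statistic with $n-1$ degrees of freedom. A one-sided $100(1-\alpha)$% confidence interval is computed by replacing $\alpha/2$ with $\alpha$.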
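
The standard-deviation limits can be sketched the same way: scale $s$ by $\sqrt{(n-1)/\chi^2}$, using the larger chi-square critical value for the lower limit and the smaller one for the upper limit. In this Python sketch the function name `stddev_confidence_limits` and the data are hypothetical, and the two chi-square critical values are assumed to come from a table.

```python
import math
import statistics


def stddev_confidence_limits(values, chi2_upper, chi2_lower):
    """Return (lower, upper) two-sided confidence limits for the standard deviation.

    chi2_upper: assumed (1 - alpha/2) chi-square critical value, n - 1 df.
    chi2_lower: assumed (alpha/2) chi-square critical value, n - 1 df.
    """
    df = len(values) - 1
    s = statistics.stdev(values)
    # Dividing by the larger critical value gives the smaller (lower) limit.
    return s * math.sqrt(df / chi2_upper), s * math.sqrt(df / chi2_lower)


# Assumed table values for alpha = 0.05 and 4 degrees of freedom.
CHI2_975_DF4 = 11.14329
CHI2_025_DF4 = 0.484419

sd_lower, sd_upper = stddev_confidence_limits(
    [1, 2, 3, 4, 5], CHI2_975_DF4, CHI2_025_DF4
)
# Limits for the variance are the squares of the standard-deviation limits.
var_lower, var_upper = sd_lower ** 2, sd_upper ** 2
print(round(sd_lower, 4), round(sd_upper, 4))
```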

A $100(1-\alpha)$% confidence interval for the variance has upper and lower limits that are equal to the squares of the corresponding upper and lower limits for the standard deviation.

When you use the WEIGHT statement or WEIGHT= in a VAR statement and the default value of VARDEF=, which is DF, the $100(1-\alpha)$% confidence interval for the weighted mean has upper and lower limits

$$\bar{y}_w \pm t_{(1-\alpha/2)} \frac{s_w}{\sqrt{\sum_{i=1}^{n} w_i}}$$

where $\bar{y}_w$ is the weighted mean, $s_w$ is the weighted standard deviation, $w_i$ is the weight for the $i$th observation, and $t_{(1-\alpha/2)}$ is the $(1-\alpha/2)$ critical value for the Student's **t** distribution with $n-1$ degrees of freedom.

Student's t Test

The **t** statistic is calculated as

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

where $\bar{x}$ is the sample mean, $n$ is the number of nonmissing values for a variable, and $s$ is the sample standard deviation. Under the null hypothesis, the population mean equals $\mu_0$. When the data values are approximately normally distributed, the probability under the null hypothesis of a **t** statistic as extreme as, or more extreme than, the observed value (the **p**-value) is obtained from the **t** distribution with $n-1$ degrees of freedom. For large $n$, the **t** statistic is asymptotically equivalent to a **z** test.
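
A minimal Python sketch of the unweighted statistic, assuming the usual one-sample form (sample mean minus the hypothesized mean, divided by $s/\sqrt{n}$), is shown here. The function name `t_statistic` and the data are hypothetical. The **p**-value is not computed, because it requires the **t** distribution function, which the Python standard library does not provide.

```python
import math
import statistics


def t_statistic(values, mu0=0.0):
    """Return (t, df) for a one-sample Student's t test of mean == mu0."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)          # sample standard deviation, divisor n - 1
    t = (mean - mu0) / (s / math.sqrt(n))
    return t, n - 1                       # degrees of freedom are n - 1


t, df = t_statistic([1, 2, 3, 4, 5], mu0=2.0)
print(round(t, 4), df)
```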

When you use the WEIGHT statement or WEIGHT= in a VAR statement and the default value of VARDEF=, which is DF, the Student's **t** statistic is calculated as

$$t_w = \frac{\bar{y}_w - \mu_0}{s_w \Big/ \sqrt{\sum_{i=1}^{n} w_i}}$$

where $\bar{y}_w$ is the weighted mean, $s_w$ is the weighted standard deviation, and $w_i$ is the weight for the $i$th observation. The $t_w$ statistic is treated as having a Student's **t** distribution with $n-1$ degrees of freedom. If you specify the EXCLNPWGT option in the PROC statement, $n$ is the number of nonmissing observations when the value of the WEIGHT variable is positive. By default, $n$ is the number of nonmissing observations for the WEIGHT variable.
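
The weighted calculation can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure's implementation: all weights are assumed positive and nonmissing (so the EXCLNPWGT distinction does not arise), and the weighted variance is assumed to use the VARDEF=DF divisor $n-1$. The function name `weighted_t_statistic` and the data are hypothetical.

```python
import math


def weighted_t_statistic(values, weights, mu0=0.0):
    """Return (t_w, df) for a weighted one-sample t statistic.

    Assumes every weight is positive and that the weighted variance is
    sum(w_i * (y_i - ybar_w)**2) / (n - 1), i.e. the VARDEF=DF divisor.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    n = len(values)
    sum_w = sum(weights)
    ybar_w = sum(w * y for w, y in zip(weights, values)) / sum_w
    ss_w = sum(w * (y - ybar_w) ** 2 for w, y in zip(weights, values))
    s_w = math.sqrt(ss_w / (n - 1))       # weighted standard deviation
    t_w = (ybar_w - mu0) / (s_w / math.sqrt(sum_w))
    return t_w, n - 1


t_w, df_w = weighted_t_statistic([1, 2, 3, 4], [1, 2, 2, 1], mu0=2.0)
print(round(t_w, 4), df_w)
```

The same `s_w / math.sqrt(sum_w)` term is the standard error that multiplies the **t** critical value in the confidence limits for the weighted mean.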

Quantiles

The QMETHOD= option offers two ways of handling the input data when computing quantiles:

- OS: reads all data into memory and sorts it by unique value.
- P2: accumulates all data into a fixed sample size that is used to approximate the quantile.

The QMETHOD=P2 technique is based on the piecewise-parabolic (P²) algorithm invented by Jain and Chlamtac (1985). P² is a one-pass algorithm to determine quantiles for a large data set. It requires a fixed amount of memory for each variable for each level within the type. However, simulation studies indicate that reliable estimates of some quantiles (P1, P5, P95, P99) may not be possible for some data sets, such as those with heavy-tailed or skewed distributions.

If the number of observations is less than the QMARKERS= value, QMETHOD=P2 produces the same results as QMETHOD=OS when QNTLDEF=5. To compute weighted quantiles, you must use QMETHOD=OS.
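
To make the order-statistics approach concrete, the Python sketch below sorts the data and applies what is assumed here to be the QNTLDEF=5 rule (empirical distribution function with averaging): write $np = j + g$ with integer part $j$ and fractional part $g$; the quantile is the average of $x_j$ and $x_{j+1}$ when $g = 0$, and $x_{j+1}$ otherwise. That rule is not stated in this section, so treat it as an assumption. The function name `percentile_def5` is hypothetical, the sketch handles unweighted data only, and it does not implement P².

```python
def percentile_def5(values, pct):
    """Return the pct-th percentile (pct an integer from 0 to 100).

    Sorts the data, then applies the assumed QNTLDEF=5 rule. Integer
    arithmetic is used for n * pct / 100 to avoid floating-point error
    when deciding whether the fractional part is zero.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= pct <= 100:
        raise ValueError("pct must be between 0 and 100")
    xs = sorted(values)
    n = len(xs)
    j, rem = divmod(n * pct, 100)         # n*p = j + g, with g == 0 iff rem == 0
    if rem == 0:
        # Average x_j and x_{j+1} (1-based), clamping at the ends of the data.
        lo = xs[max(j, 1) - 1]
        hi = xs[min(j, n - 1)]
        return (lo + hi) / 2
    return xs[j]                          # x_{j+1} in 1-based indexing


print(percentile_def5([1, 2, 3, 4], 50))
print(percentile_def5([1, 2, 3, 4, 5], 50))
```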


Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.