The UNIVARIATE Procedure

Example 5: Examining the Data Distribution and Saving Percentiles


Procedure features:
    PROC UNIVARIATE statement options:
        ALPHA=
        CIBASIC
        CIPCTLNORMAL
        MU0=
        NORMAL
        PLOTS
        PLOTSIZE=
    OUTPUT statement
Other features:
    PRINT procedure
Data set: SCORE

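This example examines the distribution of final exam scores with tables and plots, tests whether the location of the scores differs from 80, requests confidence limits and tests for normality, and saves selected percentiles to an output data set.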


Program

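/* Suppress the date, start page numbering at 1, and set the page dimensions. */
options nodate pageno=1 linesize=64 pagesize=58;

/* MU0=80 sets the null-hypothesis value for the tests for location.
   ALPHA=.1 with CIBASIC(TYPE=LOWER) requests one-sided 90 percent lower
   confidence limits for the basic statistics; CIPCTLNORMAL requests
   confidence limits for the percentiles under a normality assumption.
   NORMAL adds tests for normality; PLOTS and PLOTSIZE=26 request
   plots about 26 rows tall. */
proc univariate data=score mu0=80 alpha=.1 cibasic(type=lower)
                cipctlnormal normal plots plotsize=26;
   /* Analyze the final exam score. */
   var final;
   /* Save the median and the 98th, 50th, 20th, and 70th percentiles.
      PCTLNAME= supplies only three suffixes (Pctl_Top, Pctl_Mid,
      Pctl_Low), so the 70th percentile is named Pctl_70. */
   output out=pctscore median=Median pctlpts=98 50 20 70
          pctlpre=Pctl_ pctlname=Top Mid Low;
   title 'Examining the Distribution of Final Exam Scores';
run;

/* Print the output data set without the observation number column. */
proc print data=pctscore noobs;
   title1 'Quantile Statistics for Final Exam Scores';
   title2 'Output Data Set from PROC UNIVARIATE';
run;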


Output
The estimated mean test score is 82.4, with a standard deviation of 8.6. The one-sided 90 percent lower confidence limit for the mean is 79.
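
The lower confidence limit can be approximated by hand from the summary statistics. The following DATA step is a minimal sketch, not the procedure's internal code: it assumes the usual normal-theory formula (mean minus the t quantile times the standard error) and a sample size of n = 12, which is an assumption because this example does not state the number of observations in SCORE. The mean and standard deviation are the rounded values quoted above, so the result is only approximate.

```sas
/* Sketch: one-sided lower confidence limit for the mean,      */
/* computed from rounded summary statistics.                   */
data _null_;
   n     = 12;    /* ASSUMED sample size; not given in this example */
   mean  = 82.4;  /* estimated mean from the output                 */
   std   = 8.6;   /* estimated standard deviation from the output   */
   alpha = 0.1;   /* matches ALPHA=.1 in the PROC UNIVARIATE step   */

   /* Upper (1 - alpha) quantile of Student's t with n - 1 df */
   t_crit = tinv(1 - alpha, n - 1);

   /* One-sided lower limit: mean minus t quantile times standard error */
   lower = mean - t_crit * std / sqrt(n);

   put 'Approximate lower confidence limit: ' lower 6.2;
run;
```

Under the assumed sample size, the value written to the log should be close to the reported limit of 79. A different n changes both the t quantile and the standard error.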

The Tests for Location table includes three hypothesis tests: Student's t test, the sign test, and the signed rank test. To decide whether the Student's t statistic is appropriate, you must determine whether the data are approximately normally distributed.

PROC UNIVARIATE calculates the Shapiro-Wilk W statistic because the sample size is below 2000. All p-values from the tests for normality are greater than 0.15, which provides insufficient evidence to reject the hypothesis of normality. The normal probability plot also supports the assumption that the data are normal. Therefore, the t statistic appears appropriate. The p-value of 0.35 for the t test provides insufficient evidence to reject the null hypothesis that the mean test score is 80.
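
As a rough check on the t test, the statistic and its two-sided p-value can be recomputed from the summary statistics. This sketch again assumes a sample size of n = 12 (not stated in this example) and uses the rounded mean and standard deviation from the output, so it can only approximate the values that PROC UNIVARIATE reports.

```sas
/* Sketch: Student's t test of H0: mean = 80 from summary statistics. */
data _null_;
   n    = 12;    /* ASSUMED sample size; not given in this example */
   mean = 82.4;  /* estimated mean from the output                 */
   std  = 8.6;   /* estimated standard deviation from the output   */
   mu0  = 80;    /* matches MU0=80 in the PROC UNIVARIATE step     */

   /* t statistic: distance from mu0 in standard-error units */
   t = (mean - mu0) / (std / sqrt(n));

   /* Two-sided p-value from the t distribution with n - 1 df */
   p = 2 * (1 - probt(abs(t), n - 1));

   put 't statistic: ' t 6.3 '  two-sided p-value: ' p 6.4;
run;
```

With the assumed n, the p-value lands near the reported 0.35, which is consistent with failing to reject the null hypothesis at the 0.1 level.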

However, because the box plot is asymmetric and the small sample size results in low power, the sign test may be a more appropriate test of location. The p-value of 0.75 for the sign test provides insufficient evidence to reject the null hypothesis that the median test score is 80.

The three plots display the data distribution. The PLOTSIZE= option enlarges the plots so that you can easily see if the data are approximately normal.

The PCTSCORE data set contains one observation. The value of Median equals the value of Pctl_Mid, because the median is the 50th percentile.
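
One way to confirm what the OUTPUT statement created is to list the variables in PCTSCORE and compare the two median values directly. The steps below are a sketch that assumes the program above has already run in the same session. Because PCTLNAME= supplies three suffixes for four percentiles, the variable for the 70th percentile is expected to take its name from the prefix and the percentile value (Pctl_70).

```sas
/* List the variables in the output data set in creation order. */
proc contents data=pctscore varnum;
run;

/* Compare the MEDIAN= statistic with the 50th percentile from PCTLPTS=. */
data _null_;
   set pctscore;
   if Median = Pctl_Mid then
      put 'Median and Pctl_Mid agree: ' Median=;
   else
      put 'Values differ: ' Median= Pctl_Mid=;
run;
```

The comparison writes its message to the SAS log, not to the procedure output.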


Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.