The UNIVARIATE Procedure

OUTPUT Statement


Saves statistics and BY variables in an output data set.

Tip: You can save percentiles that are not automatically computed.
Tip: You can use multiple OUTPUT statements to create several OUT= data sets.
Main discussion: Output Data Set
Featured in: Examining the Data Distribution and Saving Percentiles, Creating an Output Data Set with Multiple Analysis Variables, and Creating Schematic Plots and an Output Data Set with BY Groups


OUTPUT <OUT=SAS-data-set> statistic-keyword-1=name(s)
<...statistic-keyword-n=name(s)> <percentiles-specification> ;


Options

OUT=SAS-data-set
identifies the output data set. If SAS-data-set does not exist, PROC UNIVARIATE creates it. If you omit OUT=, the data set is named DATAn, where n is the smallest integer that makes the name unique.
Default: DATAn

statistic-keyword=name(s)
specifies a statistic to store in the OUT= data set and names the new variable that will contain the statistic. The available statistical keywords are

Descriptive statistic keywords

   CSS        CV           KURTOSIS
   MAX        MEAN         N
   MIN        MODE         RANGE
   NMISS      NOBS         STDMEAN
   SKEWNESS   STD          USS
   SUM        SUMWGT       VAR

Quantile statistic keywords

   MEDIAN     P1           P5
   P10        P90          P95
   P99        Q1           Q3
   QRANGE

Robust statistic keywords

   GINI       MAD          QN
   SN         STD_GINI     STD_MAD
   STD_QN     STD_QRANGE   STD_SN

Hypothesis testing keywords

   NORMAL     PROBN        MSIGN
   PROBM      SIGNRANK     PROBS
   T          PROBT

See SAS Elementary Statistics Procedures and Statistical Computations for the keyword definitions and statistical formulas.

To store the same statistic for several analysis variables, specify a list of names. The order of the names corresponds to the order of the analysis variables in the VAR statement. PROC UNIVARIATE uses the first name to create a variable that contains the statistic for the first analysis variable, the next name to create a variable that contains the statistic for the second analysis variable, and so on. If you do not want to output statistics for all the analysis variables, specify fewer names than the number of analysis variables.
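
As an illustration of the name-list rule, the following sketch stores the mean and median for two analysis variables but the standard deviation for the first one only. The data set SCORE, its values, and every output variable name are made up for this example.

```sas
/* Hypothetical input data: SCORE with two numeric variables */
data score;
   input Test1 Test2;
   datalines;
72 81
85 90
64 77
91 88
78 69
;
run;

proc univariate data=score noprint;
   var Test1 Test2;
   /* MEAN= and MEDIAN= list two names: the first is for Test1, the second for Test2. */
   /* STD= lists one name, so only the Test1 statistic is stored.                     */
   output out=stats mean=Test1_Mean Test2_Mean
                    std=Test1_Std
                    median=Test1_Med Test2_Med;
run;

proc print data=stats;
run;
```

Because no BY statement is used, STATS should contain a single observation with one variable for each name that is listed.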

percentiles-specification
specifies one or more percentiles to store in the OUT= data set and names the new variables that contain the percentiles. The form of percentiles-specification is
PCTLPTS=percentile(s) PCTLPRE=prefix-name(s) <PCTLNAME=suffix-name(s)>

PCTLPTS=percentile(s)
specifies one or more percentiles to compute. You can specify a range of percentiles with the expression start TO stop BY increment, where start is the starting number, stop is the ending number, and increment is the step between consecutive percentiles.
Range: any decimal numbers between 0 and 100, inclusive
Example: To compute the 50th, 95th, 97.5th, and 100th percentiles, submit the statement
output pctlpre=P_ pctlpts=50,95 to 100 by 2.5;

PCTLPRE=prefix-name(s)
specifies one or more prefixes to create the variable names for the variables that contain the PCTLPTS= percentiles. To save the same percentiles for more than one analysis variable, specify a list of prefixes. The order of the prefixes corresponds to the order of the analysis variables in the VAR statement.
Interaction: PROC UNIVARIATE creates a variable name by combining the PCTLPRE= value with suffix-name. If you omit PCTLNAME= or specify too few suffix names, PROC UNIVARIATE uses the PCTLPTS= value as the suffix instead.

PCTLNAME=suffix-name(s)
specifies one or more suffixes to create the names for the variables that contain the PCTLPTS= percentiles. PROC UNIVARIATE creates a variable name by combining the PCTLPRE= value and suffix-name. Because the suffix names are associated with the percentiles that are requested, list the suffix names in the same order as the PCTLPTS= percentiles.

Requirement: You must specify PCTLPRE= to supply prefix names for the variables that contain the PCTLPTS= percentiles.
Interaction: If you specify fewer PCTLNAME= values than percentiles, or if you omit PCTLNAME=, PROC UNIVARIATE uses the percentile itself as the suffix for the name of the variable that contains that percentile. For an integer percentile, the suffix is the percentile value. For a noninteger percentile, PROC UNIVARIATE truncates the value to two decimal places and replaces the decimal point with an underscore.
Interaction: If either the prefix and suffix name combination or the prefix and percentile name combination is longer than 32 characters, PROC UNIVARIATE truncates the prefix name so that the variable name is 32 characters.
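
The naming rules above can be tried out with a short sketch such as the following. The data set SCORE, its values, and the output data set names are assumptions made for illustration. The first OUTPUT statement omits PCTLNAME=, and the second supplies fewer suffixes than percentiles.

```sas
/* Hypothetical input data: SCORE with one numeric variable, Test1 */
data score;
   input Test1 @@;
   datalines;
52 61 64 70 72 75 78 81 85 88 90 93 97
;
run;

proc univariate data=score noprint;
   var Test1;
   /* No PCTLNAME=: each percentile value is used as the suffix */
   output out=pctls1 pctlpre=P_ pctlpts=50, 95 to 100 by 2.5;
   /* One suffix for two percentiles: the second falls back to its value */
   output out=pctls2 pctlpre=P_ pctlpts=50 97.5 pctlname=Median;
run;

/* List the variable names that were created in each data set */
proc contents data=pctls1 short;
run;
proc contents data=pctls2 short;
run;
```

By the rules described above, PCTLS1 should contain P_50, P_95, P_97_5, and P_100, and PCTLS2 should contain P_Median and P_97_5.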


Saving Percentiles Not Automatically Computed
You can use PCTLPTS= to output percentiles that are not in the list of quantile statistic keywords. PROC UNIVARIATE computes the requested percentiles with the method that you specify in the PCTLDEF= option of the PROC UNIVARIATE statement. You must use PCTLPRE=, and optionally PCTLNAME=, to specify variable names for the percentiles. For example, the following statements create an output data set named PCTLS that contains the 20th and 40th percentiles of the analysis variables Test1 and Test2:

proc univariate data=score;
   var Test1 Test2;
   output out=pctls pctlpts=20 40 pctlpre=Test1_ Test2_ 
              pctlname=P20 P40;
run;

PROC UNIVARIATE saves the 20th and 40th percentiles for Test1 and Test2 in the variables Test1_P20, Test2_P20, Test1_P40, and Test2_P40.


Using the BY Statement with the OUTPUT Statement
When you use a BY statement, the number of observations in the OUT= data set corresponds to the number of BY groups. Otherwise, the OUT= data set contains only one observation.
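
A minimal sketch of this behavior follows. The data set SCORE, the BY variable Section, and the output variable names are hypothetical.

```sas
/* Hypothetical input data: two sections, three scores each */
data score;
   input Section $ Test1;
   datalines;
A 72
A 85
A 91
B 64
B 78
B 88
;
run;

/* BY-group processing expects the data to be sorted by the BY variable */
proc sort data=score;
   by Section;
run;

proc univariate data=score noprint;
   by Section;
   var Test1;
   output out=bystats n=N_Test1 mean=Mean_Test1 median=Med_Test1;
run;

proc print data=bystats;
run;
```

With two BY groups, BYSTATS should contain two observations, one for each value of Section, and each observation should hold the BY variable along with the requested statistics.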



Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.