HISTOGRAM Statement

Output Data Sets

You can create two output data sets with the HISTOGRAM statement: the OUTFIT= data set and the OUTHISTOGRAM= data set. These data sets are described in the following sections.

OUTFIT= Data Sets

The OUTFIT= data set contains the parameters of fitted density curves, information on chi-square and empirical distribution function (EDF) goodness-of-fit tests, specification limit information, and capability indices based on the fitted distribution. Because you can specify multiple HISTOGRAM statements with the CAPABILITY procedure, you can create several OUTFIT= data sets. For each variable plotted with the HISTOGRAM statement, the OUTFIT= data set contains one observation for each fitted distribution requested in the HISTOGRAM statement. If you use a BY statement, the OUTFIT= data set contains several observations for each BY group (one observation for each combination of variable and fitted density). ID variables are not saved in the OUTFIT= data set.
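
As a minimal sketch of how an OUTFIT= data set might be requested, the following program fits normal and lognormal curves to a single variable and saves the results. The data set names (work.parts, work.fitstats), the variable name width, the simulated data, and the specification limits are hypothetical and chosen only for illustration.

```sas
/* Hypothetical example data: 200 simulated part widths. */
data work.parts;
   do i = 1 to 200;
      width = 10 + 0.2 * rannor(2718);   /* mean 10, standard deviation 0.2 */
      output;
   end;
   drop i;
run;

/* Fit two distributions to one variable and save the fit results. */
proc capability data=work.parts noprint;
   spec lsl=9.5 usl=10.5 target=10;      /* illustrative specification limits */
   histogram width / normal lognormal outfit=work.fitstats;
run;

/* List a few of the saved parameters and chi-square test results. */
proc print data=work.fitstats;
   var _VAR_ _CURVE_ _LOCATN_ _SCALE_ _SHAPE1_ _CHISQ_ _DF_ _PCHISQ_;
run;
```

Because one variable and two fitted distributions are requested, work.fitstats should contain two observations, one for each curve.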

The OUTFIT= data set contains the variables listed in Table 4.18.

Table 4.18: Variables in the OUTFIT= Data Set
Variable  Description
_ADASQ_   Anderson-Darling EDF goodness-of-fit statistic
_ADP_     p-value for Anderson-Darling EDF goodness-of-fit test
_CHISQ_   chi-square goodness-of-fit statistic
_CP_      generalized capability index Cp based on the fitted curve
_CPK_     generalized capability index Cpk based on the fitted curve
_CPL_     generalized capability index CPL based on the fitted curve
_CPM_     generalized capability index Cpm based on the fitted curve
_CPU_     generalized capability index CPU based on the fitted curve
_CURVE_   name of fitted distribution (abbreviated to 8 characters)
_CVMWSQ_  Cramér-von Mises EDF goodness-of-fit statistic
_CVMP_    p-value for Cramér-von Mises EDF goodness-of-fit test
_DF_      degrees of freedom for chi-square goodness-of-fit test
_ESTGTR_  estimated percent of population greater than upper specification limit
_ESTLSS_  estimated percent of population less than lower specification limit
_ESTSTD_  estimated standard deviation
_EXPECT_  estimated mean
_K_       generalized capability index K based on the fitted curve
_KSD_     Kolmogorov-Smirnov EDF goodness-of-fit statistic
_KSP_     p-value for Kolmogorov-Smirnov EDF goodness-of-fit test
_LOCATN_  location parameter for fitted distribution. For the normal distribution, this is either the value of μ specified with the MU= option or the sample mean. For all other distributions, this is either the value specified with the THRESHOLD= option or zero.
_LSL_     lower specification limit
_MIDPT1_  midpoint of first interval used to calculate the value of the chi-square statistic. This is the leftmost interval that contains at least one value of the variable.
_MIDPTN_  midpoint of last interval used to calculate the value of the chi-square statistic. This is the rightmost interval that contains at least one value of the variable.
_OBSGTR_  observed percent of data greater than upper specification limit
_OBSLSS_  observed percent of data less than lower specification limit
_PCHISQ_  p-value for chi-square goodness-of-fit test
_SCALE_   value of scale parameter for fitted distribution. For the normal distribution, this is either the value of σ specified with the SIGMA= option or the sample standard deviation. For all other distributions, this is either the value specified with the SCALE= option or the value estimated by the procedure.
_SHAPE1_  value of shape parameter for fitted distribution. For distributions without a shape parameter (normal and exponential distributions), _SHAPE1_ is set to missing. For the gamma, lognormal, and Weibull distributions, the value of _SHAPE1_ is either the value specified with the SHAPE= option or the value estimated by the procedure. For the beta distribution, _SHAPE1_ is either the value of α specified with the ALPHA= option or the value estimated by the procedure.
_SHAPE2_  value of second shape parameter for fitted distribution. For the beta distribution, _SHAPE2_ is either the value of β specified with the BETA= option or the value estimated by the procedure. For all other distributions, _SHAPE2_ is set to missing.
_TARGET_  target value
_USL_     upper specification limit
_VAR_     variable name
_WIDTH_   width of histogram interval
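
The specification-related variables listed above can be used to compare what the fitted curve predicts with what was observed. The following sketch assumes a hypothetical input data set work.parts with a variable width, illustrative specification limits, and two derived variables (diff_lss, diff_gtr) that are not part of the OUTFIT= data set.

```sas
/* Hypothetical example data: 200 simulated part widths. */
data work.parts;
   do i = 1 to 200;
      width = 10 + 0.2 * rannor(2718);
      output;
   end;
   drop i;
run;

proc capability data=work.parts noprint;
   spec lsl=9.6 usl=10.4 target=10;      /* illustrative specification limits */
   histogram width / normal outfit=work.fitstats;
run;

/* Keep the specification-related variables and compute the gap between
   the estimated (fitted-curve) and observed out-of-specification percents. */
data work.speccheck;
   set work.fitstats;
   diff_lss = _ESTLSS_ - _OBSLSS_;       /* hypothetical derived variable */
   diff_gtr = _ESTGTR_ - _OBSGTR_;       /* hypothetical derived variable */
   keep _VAR_ _CURVE_ _LSL_ _USL_ _TARGET_
        _ESTLSS_ _OBSLSS_ diff_lss
        _ESTGTR_ _OBSGTR_ diff_gtr
        _CP_ _CPK_;
run;

proc print data=work.speccheck;
run;
```

Large differences between the estimated and observed percents may suggest that the fitted distribution describes the tails of the data poorly, which is worth checking before relying on the curve-based capability indices.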

OUTHISTOGRAM= Data Sets

The OUTHISTOGRAM= data set contains information about histogram intervals. Because you can specify multiple HISTOGRAM statements with the CAPABILITY procedure, you can create multiple OUTHISTOGRAM= data sets.

The data set contains a group of observations for each variable plotted with the HISTOGRAM statement. The group contains one observation for each interval of the histogram, beginning with the leftmost interval that contains a value of the variable and ending with the rightmost interval that contains a value of the variable. These intervals do not necessarily coincide with the intervals displayed in the histogram, because the histogram can be padded with empty intervals at either end. If you superimpose one or more fitted curves on the histogram, the OUTHISTOGRAM= data set contains multiple groups of observations for each variable (one group for each curve). If you use a BY statement, the OUTHISTOGRAM= data set contains groups of observations for each BY group. ID variables are not saved in the OUTHISTOGRAM= data set.
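
The behavior described above can be explored with a short program. This is a sketch only: the data set names (work.parts, work.histints), the variable width, the simulated data, and the MIDPOINTS= list are hypothetical. The midpoint list is deliberately wider than the range of the data so that the displayed histogram has empty intervals at both ends.

```sas
/* Hypothetical example data: 200 simulated part widths near 10. */
data work.parts;
   do i = 1 to 200;
      width = 10 + 0.2 * rannor(2718);
      output;
   end;
   drop i;
run;

/* Request midpoints well beyond the data range and save the intervals. */
proc capability data=work.parts noprint;
   histogram width / normal
                     midpoints=9 to 11 by 0.1
                     outhistogram=work.histints;
run;

/* One observation per saved interval for the fitted normal curve. */
proc print data=work.histints;
   var _VAR_ _CURVE_ _MIDPT_ _OBSPCT_ _EXPPCT_;
run;
```

If the simulated values fall roughly between 9.4 and 10.6, the outermost requested intervals should be absent from work.histints, since only intervals from the leftmost to the rightmost one containing data are saved.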

The OUTHISTOGRAM= data set contains the variables listed in Table 4.19.

Table 4.19: Variables in the OUTHISTOGRAM= Data Set
Variable  Description
_CURVE_   name of fitted distribution (if requested in HISTOGRAM statement)
_EXPPCT_  estimated percent of population in histogram interval determined from optional fitted distribution
_MIDPT_   midpoint of histogram interval
_OBSPCT_  percent of variable values in histogram interval
_VAR_     variable name


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.