HISTOGRAM Statement

Dictionary of Options

The following entries provide detailed descriptions of options for the HISTOGRAM statement.

ALPHA=value
specifies the shape parameter \alpha for fitted curves requested with the BETA and GAMMA options. Enclose the ALPHA= option in parentheses after the BETA or GAMMA option. If you do not specify a value for \alpha, the procedure calculates a maximum likelihood estimate. See Example 4.1. You can specify A= as an alias for ALPHA= if you use it as a beta-option. You can specify SHAPE= as an alias for ALPHA= if you use it as a gamma-option.

ALPHADELTA=value
specifies the change in successive estimates of \hat{\alpha} at which iteration terminates in the Newton-Raphson approximation of the maximum likelihood estimate of \alpha for curves requested by the GAMMA option. Enclose the ALPHADELTA= option in parentheses after the GAMMA option. Iteration continues until the change in \alpha is less than the value specified or until the number of iterations exceeds the value of the MAXITER= option. The default value is 0.00001.

ALPHAINITIAL=value
specifies the initial value for \hat{\alpha} in the Newton-Raphson approximation of the maximum likelihood estimate of \alpha for fitted gamma distributions requested with the GAMMA option. Enclose the ALPHAINITIAL= option in parentheses after the GAMMA option. The default value is Thom's approximation of the estimate of \alpha. Refer to Johnson et al. (1994).
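The gamma-options that control the Newton-Raphson iteration can be combined in a single request. The following statements are a minimal sketch: the starting value, tolerance, and iteration limit are illustrative assumptions, and the example assumes, as elsewhere in this section, that the most recently created data set contains the numeric variable length.

```sas
proc capability;
   /* Assumed values: start the iteration at alpha-hat = 2 and stop when   */
   /* successive estimates differ by less than 0.0001 or after 50 steps.   */
   histogram length / gamma(alphainitial=2 alphadelta=0.0001 maxiter=50);
run;
```

A looser ALPHADELTA= value trades precision in the estimate for fewer iterations; raising MAXITER= may help if the default limit is reached before the tolerance is met.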

ANNOTATE=SAS-data-set
ANNO=SAS-data-set
[Graphics]
specifies an input data set containing annotate variables as described in SAS/GRAPH Software: Reference. See Example 4.7. The ANNOTATE= data set you specify in the HISTOGRAM statement is used for all plots created by the statement. You can also specify an ANNOTATE= data set in the PROC CAPABILITY statement to enhance all plots created by the procedure; for more information, see "ANNOTATE= Data Sets".

BETA<(beta-options )>
displays a fitted beta density curve on the histogram. The curve equation is
p(x) = \begin{cases} \frac{(x-\theta)^{\alpha-1}(\sigma+\theta-x)^{\beta-1}}{B(\alpha,\beta)\sigma^{(\alpha+\beta-1)}} h \times 100\% & \text{for } \theta \lt x \lt \theta + \sigma \\ 0 & \text{for } x \leq \theta \text{ or } x \geq \theta + \sigma \end{cases}

where B(\alpha,\beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)} and

   \theta = lower threshold parameter (lower endpoint parameter)
   \sigma = scale parameter (\sigma \gt 0)
   \alpha = shape parameter (\alpha \gt 0)
   \beta = shape parameter (\beta \gt 0)
   h = width of histogram interval

The beta distribution is bounded below by the parameter \theta and above by the value \theta + \sigma. You can specify \theta and \sigma using the THETA= and SIGMA= beta-options. The following statements fit a beta distribution bounded between 50 and 75, using maximum likelihood estimates for \alpha and \beta:
   proc capability;
      histogram length / beta(theta=50 sigma=25);
   run;

In general, the default values for THETA= and SIGMA= are 0 and 1, respectively. You can specify THETA=EST and SIGMA=EST to request maximum likelihood estimates for \theta and \sigma.

The beta distribution has two shape parameters, \alpha and \beta. If these parameters are known, you can specify their values with the ALPHA= and BETA= beta-options. If you do not specify values, the procedure calculates maximum likelihood estimates for \alpha and \beta.

The BETA option can appear only once in a HISTOGRAM statement. Table 4.2 and Table 4.3 list options you can specify with the BETA option. See Example 4.1. Also see "Formulas for Fitted Curves".

BETA=value
B=value
specifies the second shape parameter \beta for beta density curves requested with the BETA option. Enclose the BETA= option in parentheses after the BETA option. If you do not specify a value for \beta, the procedure calculates a maximum likelihood estimate. See Example 4.1.

C=value
specifies the shape parameter c for Weibull density curves requested with the WEIBULL option. Enclose the C= option in parentheses after the WEIBULL option. If you do not specify a value for c, the procedure calculates a maximum likelihood estimate. See Example 4.2. You can specify the SHAPE= option as an alias for the C= option.

C=value-list | MISE
specifies the standardized bandwidth parameter c for kernel density estimates requested with the KERNEL option. Enclose the C= option in parentheses after the KERNEL option. You can specify up to five values to request multiple estimates. You can also specify the C=MISE option, which produces the estimate with a bandwidth that minimizes the approximate mean integrated square error (MISE). For example, the following statements compute three density estimates:
   proc capability;
      histogram length / kernel(c=0.5 1.0 mise);
   run;

The first two estimates have standardized bandwidths of 0.5 and 1.0, respectively, and the third has a bandwidth that minimizes the approximate MISE.

You can also use the C= option with the K= option, which specifies the kernel function, to compute multiple estimates. If you specify more kernel functions than bandwidths, the last bandwidth in the list is repeated for the remaining estimates. Likewise, if you specify more bandwidths than kernel functions, the last kernel function is repeated for the remaining estimates. For example, the following statements compute three density estimates:
   proc capability;
      histogram length / kernel(c=1 2 3 k=normal quadratic);
   run;

The first uses a normal kernel and a bandwidth of 1, the second uses a quadratic kernel and a bandwidth of 2, and the third uses a quadratic kernel and a bandwidth of 3. See Example 4.5.

If you do not specify a value for c, the bandwidth that minimizes the approximate MISE is used for all the estimates.

CAXIS=color
CAXES=color
[Graphics]
specifies the color used for the axes and tick marks. This option overrides any COLOR= specifications in an AXIS statement. The default is the first color in the device color list.

CBARLINE=color
[Graphics]
specifies the color of the outline of histogram bars. This option overrides the C= option in the SYMBOL1 statement. The default is the first color in the device color list.

CDELTA=value
specifies the change in successive estimates of c at which iterations terminate in the Newton-Raphson approximation of the maximum likelihood estimate of c for fitted Weibull curves requested by the WEIBULL option. Enclose the CDELTA= option in parentheses after the WEIBULL option. Iteration continues until the change in c between consecutive steps is less than the value specified or until the number of iterations exceeds the value of the MAXITER= option. The default value is 0.00001. For examples, see the entry for the WEIBULL option.

CFILL=color
[Graphics]
specifies a color used to fill the bars of the histogram (or the area under a fitted curve if you also specify the FILL option). See the entries for the FILL and PFILL= options for additional details. See Figure 4.5 and Output 4.1.1. Refer to SAS/GRAPH Software: Reference for a list of colors. By default, bars and curve areas are not filled.

CFRAME=color
CFR=color
[Graphics]
specifies the color for the area enclosed by the axes and frame. The area is not filled by default.

CHREF=color
CH=color
[Graphics]
specifies the color for horizontal axis reference lines requested by the HREF= option. The default is the first color in the device color list.

CINITIAL=value
specifies the initial value for \hat{c} in the Newton-Raphson approximation of the maximum likelihood estimate of c for Weibull curves requested with the WEIBULL option. Enclose the CINITIAL= option in parentheses after the WEIBULL option. The default value is 1.8 (refer to Johnson et al. 1994).
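The Weibull iteration controls mirror those for the gamma curve. The following statements are a sketch under assumed values (the starting value 2, tolerance 0.0001, and limit of 50 iterations are illustrative only), again assuming that the most recently created data set contains the numeric variable length.

```sas
proc capability;
   /* Assumed values: start the iteration at c-hat = 2 instead of the      */
   /* default 1.8, and stop when the change in c drops below 0.0001 or     */
   /* after 50 iterations.                                                 */
   histogram length / weibull(cinitial=2 cdelta=0.0001 maxiter=50);
run;
```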

COLOR=color
[Graphics]
specifies the color of the density curve. Enclose the COLOR= option in parentheses after the distribution option or the KERNEL option. See Example 4.1. If you use the COLOR= option with the KERNEL option, you can specify a list of up to five colors in parentheses for multiple kernel density estimates. If there are more estimates than colors, the last color specified is used for the remaining estimates.
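As a sketch of the color list described above, the following statements request three kernel density estimates and assign one color to each. The particular bandwidths and colors are illustrative assumptions, and the example assumes that the most recently created data set contains the numeric variable length.

```sas
proc capability;
   /* Three bandwidths produce three estimates; the parenthesized COLOR=   */
   /* list supplies one color per estimate, in order.                      */
   histogram length / kernel(c=0.5 1.0 mise color=(red green blue));
run;
```

If only two colors were listed, the second color would be reused for the third estimate, following the rule stated in this entry.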

CTEXT=color
[Graphics]
specifies the color for tick mark values and axis labels. The default is the color specified for the CTEXT= option in the GOPTIONS statement. In the absence of a GOPTIONS statement, the default color is the first color in the device color list.

CURVELEGEND=name | NONE
specifies the name of a LEGEND statement describing the legend for specification limits and fitted curves. Specifying CURVELEGEND=NONE suppresses the legend for fitted curves; this is equivalent to specifying the NOCURVELEGEND option.

CVREF=color
CV=color
[Graphics]
specifies the color for lines requested with the VREF= option. The default is the first color in the device color list.

DELTA=value
specifies the first shape parameter \delta for Johnson SB and Johnson SU density curves requested with the SB and SU options. Enclose the DELTA= option in parentheses after the SB or SU option. If you do not specify a value for \delta, the procedure calculates an estimate.

DESCRIPTION='string'
DES='string'
[Graphics]
specifies a description, up to 40 characters, that appears in the PROC GREPLAY master menu. The default is the variable name.

EXPONENTIAL<(exponential-options )>
EXP<(exponential-options )>
displays a fitted exponential density curve on the histogram. The curve equation is

p(x) = \begin{cases} \frac{h \times 100\%}{\sigma} \exp\left(-\left(\frac{x - \theta}{\sigma}\right)\right) & \text{for } x \geq \theta \\ 0 & \text{for } x \lt \theta \end{cases}

where

   \theta = threshold parameter
   \sigma = scale parameter (\sigma \gt 0)
   h = width of histogram interval


The parameter \theta must be less than or equal to the minimum data value. You can specify \theta with the THETA= exponential-option. The default value for \theta is zero. If you specify THETA=EST, a maximum likelihood estimate is computed for \theta. You can specify \sigma with the SIGMA= exponential-option. By default, a maximum likelihood estimate is computed for \sigma. For example, the following statements fit an exponential curve with \theta=10 and with a maximum likelihood estimate for \sigma:
   proc capability;
      histogram / exponential(theta=10 l=2 color=red);
   run;

The curve is red and has a line type of 2. The EXPONENTIAL option can appear only once in a HISTOGRAM statement. Table 4.2 and Table 4.4 list options you can specify with the EXPONENTIAL option. See "Formulas for Fitted Curves".

FILL
[Graphics]
fills areas under a parametric density curve or kernel density estimate with colors and patterns. Enclose the FILL option in parentheses after a curve option or the KERNEL option, as in the following statements:
   proc capability;
      histogram length / normal(fill) cfill=green pfill=solid;
   run;
Depending on the area to be filled (outside or between the specification limits), you can specify the color and pattern with options in the SPEC statement and HISTOGRAM statement, as summarized in the following table:

Area Under Curve                       Statement    Option
between specification limits           HISTOGRAM    CFILL=color
                                       HISTOGRAM    PFILL=pattern
left of lower specification limit      SPEC         CLEFT=color
                                       SPEC         PLEFT=pattern
right of upper specification limit     SPEC         CRIGHT=color
                                       SPEC         PRIGHT=pattern

If you do not display specification limits, the CFILL= and PFILL= options specify the color and pattern for the entire area under the curve. Solid fills are used by default if patterns are not specified. You can specify the FILL option with only one fitted curve. For an example, see Output 4.1.1. Refer to SAS/GRAPH Software: Reference for a list of available patterns and colors. If you do not specify the FILL option but specify the options in the preceding table, the colors and patterns are applied to the corresponding areas under the histogram.
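The split between SPEC statement options and HISTOGRAM statement options can be sketched as follows. The specification limits 10.05 and 10.25 and the colors are illustrative assumptions; the LSL= and USL= options of the SPEC statement are not described in this section, and the example assumes that the most recently created data set contains the numeric variable length.

```sas
proc capability;
   /* Assumed limits: areas outside the limits are colored through SPEC.   */
   spec lsl=10.05 usl=10.25 cleft=red cright=red;
   /* The area between the limits is colored through HISTOGRAM options.    */
   histogram length / normal(fill) cfill=green pfill=solid;
run;
```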

FITINTERVAL=value
specifies the value of z for the method of percentiles when this method is used to fit a Johnson SB or Johnson SU distribution. The FITINTERVAL= option is specified in parentheses after the SB or SU option. The default value of z is 0.524.

FITMETHOD=PERCENTILE|MLE|MOMENTS
specifies the method used to estimate the parameters of a Johnson SB or Johnson SU distribution. The FITMETHOD= option is specified in parentheses after the SB or SU option. By default, the method of percentiles is used.

FITTOLERANCE=value
specifies the tolerance value for the ratio criterion when the method of percentiles is used to fit a Johnson SB or Johnson SU distribution. The FITTOLERANCE= option is specified in parentheses after the SB or SU option. The default value is 0.01.
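The three fitting options for Johnson curves can appear together inside the parentheses. The following statements are a minimal sketch: the FITINTERVAL= and FITTOLERANCE= values are illustrative assumptions rather than recommendations, and the example assumes that the most recently created data set contains the numeric variable length.

```sas
proc capability;
   /* Method of percentiles (the default, shown explicitly) with an        */
   /* assumed z of 0.6 and an assumed ratio tolerance of 0.05.             */
   histogram length / sb(theta=est sigma=est
                         fitmethod=percentile
                         fitinterval=0.6 fittolerance=0.05);
run;
```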

FONT=font
[Graphics]
specifies a software font for reference line and axis labels. You can also specify fonts for axis labels in an AXIS statement. The FONT= font takes precedence over the FTEXT= font specified in the GOPTIONS statement. Hardware characters are used by default.

FORCEHIST
forces the creation of a histogram if there is only one unique observation. By default, a histogram is not created if the standard deviation of the data is zero.

GAMMA<(gamma-options)>
displays a fitted gamma density curve on the histogram. The curve equation is
p(x) = \begin{cases} \frac{h \times 100\%}{\Gamma(\alpha)\sigma} \left(\frac{x - \theta}{\sigma}\right)^{\alpha-1} \exp\left(-\left(\frac{x - \theta}{\sigma}\right)\right) & \text{for } x \gt \theta \\ 0 & \text{for } x \leq \theta \end{cases}

where

   \theta = threshold parameter
   \sigma = scale parameter (\sigma \gt 0)
   \alpha = shape parameter (\alpha \gt 0)
   h = width of histogram interval
The parameter \theta for the gamma distribution must be less than the minimum data value. You can specify \theta with the THETA= gamma-option. The default value for \theta is 0. If you specify THETA=EST, a maximum likelihood estimate is computed for \theta. In addition, the gamma distribution has a shape parameter \alpha and a scale parameter \sigma. You can specify these parameters with the ALPHA= and SIGMA= gamma-options. By default, maximum likelihood estimates are computed for \alpha and \sigma. For example, the following statements fit a gamma curve with \theta=4 and with maximum likelihood estimates for \alpha and \sigma:
   proc capability;
      histogram length / gamma(theta=4);
   run;
Note that the maximum likelihood estimate of \alpha is calculated iteratively using the Newton-Raphson approximation. The ALPHADELTA=, ALPHAINITIAL=, and MAXITER= gamma-options control the approximation.

The GAMMA option can appear only once in a HISTOGRAM statement. Table 4.2 and Table 4.5 list the options you can specify with the GAMMA option. See Example 4.2 and "Formulas for Fitted Curves".

GAMMA=value
specifies the second shape parameter \gamma for Johnson SB and Johnson SU density curves requested with the SB and SU options. Enclose the GAMMA= option in parentheses after the SB or SU option. If you do not specify a value for \gamma, the procedure calculates an estimate.

HANGING
HANG
requests a hanging histogram, as illustrated in Figure 4.6.

Figure 4.6: Hanging Histogram

You can use the HANGING option with only one fitted density curve. A hanging histogram aligns the tops of the histogram bars (displayed as lines) with the fitted curve. The lines are positioned at the midpoints of the histogram bins. A hanging histogram is a goodness-of-fit diagnostic in the sense that the closer the lines are to the horizontal axis, the better the fit. Hanging histograms are discussed by Tukey (1977), Wainer (1974), and Velleman and Hoaglin (1981).
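A hanging histogram request might look like the following sketch, which pairs the HANGING option with a single fitted normal curve, as this entry requires. It assumes that the most recently created data set contains the numeric variable length.

```sas
proc capability;
   /* Exactly one fitted curve (NORMAL) is requested, so HANGING applies.  */
   histogram length / normal hanging;
run;
```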

HAXIS=name
[Graphics]
specifies the name of an AXIS statement describing the horizontal axis. You can specify the MIDPTAXIS= option as an alias for the HAXIS= option. See the entry for the MIDPOINTS= option for a syntax example.

HMINOR=n
HM=n
[Graphics]
specifies the number of minor tick marks between each major tick mark on the horizontal axis. Minor tick marks are not labeled. The default is 0.

HREF=value-list
draws reference lines perpendicular to the horizontal axis at the values specified. See Output 4.1.1. Also see the CHREF=, HREFCHAR=, and LHREF= options.

HREFCHAR='character'
[Line Printer]
specifies the character used to form the lines requested by the HREF= option. The default is the vertical bar (|).

HREFLABELS='label1' ... 'labeln'
HREFLABEL='label1' ... 'labeln'
HREFLAB='label1' ... 'labeln'
specifies labels for the lines requested by the HREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can have up to 16 characters. See Output 4.1.1.
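The reference line options are typically used together. The following statements are a sketch with assumed reference values, labels, color, and line type, for a graphics device; they assume that the most recently created data set contains the numeric variable length.

```sas
proc capability;
   /* Two reference lines, so exactly two labels are supplied.             */
   /* CHREF= and LHREF= set the color and line type of both lines.         */
   histogram length / href=10.05 10.25
                      hreflabels='Lower' 'Upper'
                      chref=blue lhref=2;
run;
```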

INDICES
requests capability indices based on the fitted distribution. Enclose the keyword INDICES in parentheses after the distribution keyword. See "Indices Using Fitted Curves" for computational details and see Output 4.4.2.
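Because capability indices are defined relative to specification limits, a request for INDICES is normally accompanied by a SPEC statement. The following statements are a sketch: the limits are illustrative assumptions, the LSL= and USL= options of the SPEC statement are not described in this section, and the most recently created data set is assumed to contain the numeric variable length.

```sas
proc capability;
   /* Assumed specification limits for illustration only.                  */
   spec lsl=10.05 usl=10.25;
   /* INDICES goes inside the parentheses of the distribution keyword.     */
   histogram length / lognormal(indices);
run;
```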

K=NORMAL | QUADRATIC | TRIANGULAR
specifies the kernel function (normal, quadratic, or triangular) used to compute a kernel density estimate. Enclose the K= option in parentheses after the KERNEL option, as in the following statements:

   proc capability;
      histogram length / kernel(k=quadratic);
   run;


You can specify kernel functions for up to five estimates. You can also use the K= option together with the C= option, which specifies standardized bandwidths. If you specify more kernel functions than bandwidths, the last bandwidth in the list is repeated for the remaining estimates. Likewise, if you specify more bandwidths than kernel functions, the last kernel function is repeated for the remaining estimates. For example, the following statements compute three estimates with bandwidths of 0.5, 1.0, and 1.5:

   proc capability;
      histogram length / kernel(c=0.5 1.0 1.5 k=normal quadratic);
   run;


The first estimate uses a normal kernel, and the last two estimates use a quadratic kernel. By default, a normal kernel is used.

KERNEL<( kernel-options )>
superimposes up to five kernel density estimates on the histogram. You can specify the kernel-options described in the following table:

FILL       specifies that the area under the curve is to be filled
COLOR=     specifies the color of the curve
L=         specifies the line style for the curve
W=         specifies the width of the curve
K=         specifies the type of kernel function
C=         specifies the smoothing parameter
SYMBOL=    specifies the character used to plot the kernel density curve if the histogram is produced on a line printer

You can request multiple kernel density estimates on the same histogram by specifying a list of values for either the C= or K= option. For more information, see the entries for these options. Also see Output 3.1.1 and "Kernel Density Estimates". By default, kernel density estimates are computed using the AMISE method.

L=linetype
specifies the line type used for fitted density curves. If used with the KERNEL option, you can specify a list of up to five line types for multiple kernel density estimates. See the entries for the C= and K= options for details on specifying multiple kernel density estimates. The default is 1, which produces a solid line.

LEGEND=name | NONE
[Graphics]
specifies the name of a LEGEND statement describing the legend for specification limit reference lines and fitted curves. Specifying LEGEND=NONE suppresses all legend information and is equivalent to specifying the NOLEGEND option.

LHREF=linetype
LH=linetype
[Graphics]
specifies the line type for lines requested with the HREF= option. See Output 4.1.1. The default is 2, which produces a dashed line.

LOGNORMAL<(lognormal-options)>
displays a fitted lognormal density curve on the histogram. The curve equation is

p(x) = \begin{cases} \frac{h \times 100\%}{\sigma\sqrt{2\pi}(x - \theta)} \exp\left(-\frac{(\log(x-\theta)-\zeta)^2}{2\sigma^2}\right) & \text{for } x \gt \theta \\ 0 & \text{for } x \leq \theta \end{cases}

where

   \theta = threshold parameter
   \zeta = scale parameter
   \sigma = shape parameter (\sigma \gt 0)
   h = width of histogram interval

Note that the lognormal distribution is also referred to as the SL distribution in the Johnson system of distributions.

The parameter \theta for the lognormal distribution must be less than the minimum data value. You can specify \theta with the THETA= lognormal-option. The default value for \theta is zero. If you specify THETA=EST, a maximum likelihood estimate is computed for \theta. You can specify the parameters \sigma and \zeta with the SIGMA= and ZETA= lognormal-options. By default, maximum likelihood estimates are computed for \sigma and \zeta. For example, the following statements fit a lognormal distribution function with a default value of \theta=0 and with maximum likelihood estimates for \sigma and \zeta:
   proc capability;
      histogram length / lognormal;
   run;
The LOGNORMAL option can appear only once in a HISTOGRAM statement. Table 4.2 and Table 4.6 list options that you can specify with the LOGNORMAL option. See Example 4.2 and "Formulas for Fitted Curves".

LVREF=linetype
LV=linetype
[Graphics]
specifies the line type for lines requested with the VREF= option. The default is 2, which produces a dashed line.

MAXITER=n
specifies the maximum number of iterations in the Newton-Raphson approximation of the maximum likelihood estimate of \alpha for fitted gamma curves requested with the GAMMA option and c for fitted Weibull curves requested with the WEIBULL option. Enclose the MAXITER= option in parentheses after the GAMMA or WEIBULL option. The default is 20.

MIDPERCENTS
requests a table listing the midpoints and percent of observations in each histogram interval. If you specify the MIDPERCENTS option in parentheses after a density estimate option, a table listing the midpoints, observed percent of observations, and the estimated percent of the population in each interval (estimated from the fitted distribution) is printed. The following statements create the table shown in Figure 4.7:
   proc capability;
      histogram length / gamma(theta=3 midpercents);
   run;


 
The CAPABILITY Procedure
Fitted Gamma Distribution for length

Histogram Bin Percents for Gamma Distribution

Bin Midpoint    Observed Percent    Estimated Percent
10.02           12.000              11.480
10.08           32.000              26.182
10.14           28.000              31.354
10.20           18.000              19.916
10.26            6.000               6.766
10.32            4.000               1.238

Figure 4.7: Table of Observed and Expected Percentages

MIDPOINTS=value-list
lists midpoints for the histogram intervals. The midpoints must be listed in increasing order and must be evenly spaced. The difference between consecutive midpoints is used as the width of the histogram bars. The same value-list is used for all variables. See Output 4.2.1.

If you specify the MIDPOINTS= option, the range of the midpoints, extended at each end by half of the bar width, must cover the range of the data as well as any specification limits. For example, if you specify
   midpoints=2 to 10 by 0.5
then all of the observations and specification limits must fall between 1.75 and 10.25 (otherwise, a default list of midpoints is used).

By default, the number of midpoints is determined using the algorithm described in Terrell and Scott (1985). The default midpoints are primarily applicable to continuous data that are approximately normally distributed.

If you display the histogram with a graphic device and use the MIDPOINTS= and HAXIS= options, you can use the ORDER= option in the AXIS statement you specified with the HAXIS= option. However, for the tick mark labels to coincide with the histogram interval midpoints, the range of the ORDER= list must encompass the range of the MIDPOINTS= list, as illustrated in the following statements:
   proc capability;
      histogram length / midpoints=20 to 80 by 10
                         haxis=axis1;
      axis1 length=6 in order=(10 20 30 40 50 60 70 80 90);
   run;


MIDPTAXIS=name
[Graphics]
is an alias for the HAXIS= option described earlier in this section.

MU=value
specifies the parameter \mu for normal density curves requested with the NORMAL option. Enclose the MU= option in parentheses after the NORMAL option. The default value is the sample mean.

NAME='string'
[Graphics]
specifies a name for the plot, up to eight characters, that appears in the PROC GREPLAY master menu. The default is 'CAPABILI'.

NOBARS
suppresses drawing of histogram bars. This option is useful when you want to display fitted curves only.

NOCURVELEGEND
NOCURVEL
suppresses the portion of the legend for fitted curves. If you use the INSET statement to display information about the fitted curve on the histogram, you can use the NOCURVELEGEND option to prevent the information about the fitted curve from being repeated in a legend at the bottom of the histogram. See Output 5.1.1.

NOFRAME
suppresses the frame around the subplot area.

NOLEGEND
suppresses legends for specification limits, fitted curves, distribution lines, and hidden observations. See Example 4.6. Specifying the NOLEGEND option is equivalent to specifying LEGEND=NONE.

NOPLOT
suppresses the creation of a plot. Use the NOPLOT option when you want only to print summary statistics for a fitted density or create either an OUTFIT= or an OUTHISTOGRAM= data set. See Example 4.4.
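A typical use of NOPLOT is to capture fit results in data sets without drawing anything. The following statements are a sketch: the data set names fitparms and bins are hypothetical, and the example assumes that the most recently created data set before the step contains the numeric variable length.

```sas
proc capability;
   /* No plot is drawn; parameter estimates go to the hypothetical data    */
   /* set fitparms and interval percents to the hypothetical data set bins. */
   histogram length / normal noplot
                      outfit=fitparms outhistogram=bins;
run;

/* Inspect the saved parameter estimates. */
proc print data=fitparms;
run;
```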

NOPRINT
suppresses printed output summarizing the fitted curve. Enclose the NOPRINT option in parentheses following the distribution option. See "Customizing a Histogram" for an example.

NORMAL<(normal-options)>
displays a fitted normal density curve on the histogram. The curve equation is
p(x) = \frac{h \times 100\%}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right) \quad \text{for } -\infty \lt x \lt \infty

where

   \mu = mean
   \sigma = standard deviation (\sigma \gt 0)
   h = width of histogram interval

Note that the normal distribution is also referred to as the SN distribution in the Johnson system of distributions.

You can specify values for \mu and \sigma with the MU= and SIGMA= normal-options, as shown in the following statements:

   proc capability;
      histogram length / normal(mu=14 sigma=0.05);
   run;


By default, the sample mean and sample standard deviation are used for \mu and \sigma. The NORMAL option can appear only once in a HISTOGRAM statement. Table 4.2 and Table 4.7 list options that you can specify with the NORMAL option. See Figure 4.4 and "Formulas for Fitted Curves".

NOSPECLEGEND
NOSPECL
suppresses the portion of the legend for specification limit reference lines. See Figure 4.5.

OUTFIT=SAS-data-set
creates a SAS data set that contains parameter estimates for fitted curves and related goodness-of-fit information. See "Output Data Sets".

OUTHISTOGRAM=SAS-data-set
OUTHIST=SAS-data-set
creates a SAS data set that contains information about histogram intervals. Specifically, the data set contains the midpoints of the histogram intervals, the observed percent of observations in each interval, and the estimated percent of observations in each interval (estimated from each of the specified fitted curves). See "Output Data Sets".

PCTAXIS=name|value-list
[Graphics]
is an alias for the VAXIS= option.

PERCENTS=value-list
PERCENT=value-list
specifies a list of percents for which quantiles calculated from the data and quantiles estimated from the fitted curve are tabulated. The percents must be between 0 and 100. Enclose the PERCENTS= option in parentheses after the curve option. The default percents are 1, 5, 10, 25, 50, 75, 90, 95, and 99. For example, the following statements create the table shown in Figure 4.8:
   proc capability;
      histogram length / lognormal(percents=1 3 5 95 97 99);
   run;


 
The CAPABILITY Procedure
Fitted Lognormal Distribution for length

Quantiles for Lognormal Distribution

Percent    Observed Quantile    Estimated Quantile
 1.0       10.0180               9.95696
 3.0       10.0180               9.98937
 5.0       10.0310              10.00658
95.0       10.2780              10.24963
97.0       10.2930              10.26729
99.0       10.3220              10.30071

Figure 4.8: Estimated and Observed Quantiles for the Lognormal Curve

PFILL=pattern
specifies a pattern used to fill the bars of the histograms (or the areas under a fitted curve if you also specify the FILL option). See the entries for the CFILL= and FILL options for additional details. Refer to SAS/GRAPH Software: Reference for a list of pattern values. By default, the bars and curve areas are not filled.

RTINCLUDE
includes the right endpoint of each histogram interval in that interval. By default, the left endpoint is included in the histogram interval.

SB<(SB-options )>
displays a fitted Johnson SB density curve on the histogram. The curve equation is
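p(x) = \begin{cases} \frac{\delta h \times 100\%}{\sigma\sqrt{2\pi}} \left[\left(\frac{x - \theta}{\sigma}\right)\left(1 - \frac{x - \theta}{\sigma}\right)\right]^{-1} \exp\left[-\frac{1}{2}\left(\gamma + \delta \log\left(\frac{x - \theta}{\theta + \sigma - x}\right)\right)^2\right] & \text{for } \theta \lt x \lt \theta + \sigma \\ 0 & \text{for } x \leq \theta \text{ or } x \geq \theta + \sigma \end{cases}

where

   \theta = threshold parameter (-\infty \lt \theta \lt \infty)
   \sigma = scale parameter (\sigma \gt 0)
   \delta = shape parameter (\delta \gt 0)
   \gamma = shape parameter (-\infty \lt \gamma \lt \infty)
   h = width of histogram interval
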
The SB distribution is bounded below by the parameter \theta and above by the value \theta + \sigma. The parameter \theta must be less than the minimum data value. You can specify \theta with the THETA= SB-option, or you can request that \theta be estimated with the THETA=EST SB-option. The default value for \theta is zero. The sum \theta + \sigma must be greater than the maximum data value. The default value for \sigma is one. You can specify \sigma with the SIGMA= SB-option, or you can request that \sigma be estimated with the SIGMA=EST SB-option. You can specify \delta with the DELTA= SB-option, and you can specify \gamma with the GAMMA= SB-option. Note that the SB-options are given in parentheses after the SB option.

By default, the method of percentiles is used to estimate the parameters of the SB distribution. Alternatively, you can request the method of moments or the method of maximum likelihood with the FITMETHOD = MOMENTS or FITMETHOD = MLE options, respectively. Consider the following example:
   proc capability;
      histogram length / sb;
      histogram length / sb( theta=est sigma=est );
      histogram length / sb( theta=0.5 sigma=8.4 
                             delta=0.8 gamma=-0.6 );
   run;
The first HISTOGRAM statement fits an SB distribution with default values of \theta=0 and \sigma=1 and with percentile-based estimates for \delta and \gamma. The second HISTOGRAM statement estimates all four parameters with the method of percentiles. The third HISTOGRAM statement displays an SB curve with specified values for all four parameters.

The SB option can appear only once in a HISTOGRAM statement. Table 4.2 and Table 4.8 list options you can specify with the SB option.

SCALE=value
is an alias for the SIGMA= option for curves requested by the BETA, EXPONENTIAL, GAMMA, SB, SU, and WEIBULL options and an alias for the ZETA= option for curves requested by the LOGNORMAL option. See Example 4.1.

SHAPE=value
is an alias for the ALPHA= option for curves requested with the GAMMA option, an alias for the SIGMA= option for curves requested with the LOGNORMAL option, and an alias for the C= option for curves requested with the WEIBULL option.
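To illustrate the aliases, the two HISTOGRAM statements in the following sketch request the same gamma curve, once with the SHAPE= and SCALE= aliases and once with ALPHA= and SIGMA=. The parameter values are illustrative assumptions, and the most recently created data set is assumed to contain the numeric variable length.

```sas
proc capability;
   /* Alias form: SHAPE= stands for ALPHA=, SCALE= stands for SIGMA=.      */
   histogram length / gamma(shape=2 scale=0.5);
   /* Equivalent request written with the primary option names.            */
   histogram length / gamma(alpha=2 sigma=0.5);
run;
```

Note that the meaning of SHAPE= depends on the distribution keyword: the same alias maps to SIGMA= for LOGNORMAL and to C= for WEIBULL.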

SIGMA=value|EST
specifies the parameter \sigma for curves requested with the BETA, EXPONENTIAL, GAMMA, LOGNORMAL, NORMAL, SB, SU, and WEIBULL options. Enclose the SIGMA= option in parentheses after the distribution option. The following table summarizes the use of the SIGMA= option:

Distribution Keyword    SIGMA= Specifies          Default Value                  Alias
BETA                    scale parameter \sigma    1                              SCALE=
EXPONENTIAL             scale parameter \sigma    maximum likelihood estimate    SCALE=
GAMMA                   scale parameter \sigma    maximum likelihood estimate    SCALE=
LOGNORMAL               shape parameter \sigma    maximum likelihood estimate    SHAPE=
NORMAL                  scale parameter \sigma    standard deviation
SB                      scale parameter \sigma    1                              SCALE=
SU                      scale parameter \sigma    percentile-based estimate
WEIBULL                 scale parameter \sigma    maximum likelihood estimate    SCALE=


With the BETA distribution option, you can specify SIGMA=EST to request a maximum likelihood estimate for \sigma. For syntax examples, see the entries for the BETA and NORMAL options.

SPECLEGEND=name | NONE
specifies the name of a LEGEND statement describing the legend for specification limits and fitted curves. Specifying SPECLEGEND=NONE, which suppresses the portion of the legend for specification limit reference lines, is equivalent to specifying the NOSPECLEGEND option.

SU<(SU-options )>
displays a fitted Johnson SU density curve on the histogram. The curve equation is
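p(x) = \left\{ \begin{array}{ll}
 \frac{\delta h \times 100\%}{\sigma \sqrt{2\pi}}
 \frac{1}{\sqrt{1 + \left( \frac{x - \theta}{\sigma} \right)^2}}
 \exp\left[ -\frac{1}{2} \left( \gamma + \delta \sinh^{-1}\left( \frac{x - \theta}{\sigma} \right) \right)^2 \right]
 & \mbox{for } x \gt \theta \\
 0 & \mbox{for } x \leq \theta
 \end{array} \right.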


where 
 

		 \theta = location parameter (-\infty \lt \theta \lt \infty)
		 \sigma = scale parameter (\sigma \gt 0)
		 \delta = shape parameter (\delta \gt 0)
		 \gamma = shape parameter (-\infty \lt \gamma \lt \infty)
		 h = width of histogram interval
You can specify the parameters with the THETA=, SIGMA=, DELTA=, and GAMMA= SU-options, which are enclosed in parentheses after the SU option. If you do not specify these parameters, they are estimated.

By default, the method of percentiles is used to estimate the parameters of the SU distribution. Alternatively, you can request the method of moments or the method of maximum likelihood with the FITMETHOD=MOMENTS or FITMETHOD=MLE options, respectively. Consider the following example:
   proc capability;
      histogram length / su;      
      histogram length / su( theta=0.5 sigma=8.4 
                             delta=0.8 gamma=-0.6 );
   run;
The first HISTOGRAM statement estimates all four parameters with the method of percentiles. The second HISTOGRAM statement displays an SU curve with specified values for all four parameters.

The SU option can appear only once in a HISTOGRAM statement. Table 4.2 and Table 4.9 list options you can specify with the SU option.
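
As a sketch of the alternative estimation methods mentioned above, the following statements request an SU fit by the method of moments and then by maximum likelihood. The variable length is assumed, as in the preceding example:
   proc capability;
      histogram length / su(fitmethod=moments);
      histogram length / su(fitmethod=mle);
   run;
Each fit is placed in its own HISTOGRAM statement because the SU option can appear only once per statement.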

SYMBOL='character'
[Line Printer]
specifies the character used to plot the density curve or kernel density curve if the histogram is produced on a line printer. Enclose the SYMBOL= option in parentheses after the distribution option or the KERNEL option. The default character is the first letter of the distribution keyword or `1' for the first kernel density estimate, `2' for the second kernel density estimate, and so on. If you use the SYMBOL= option with the KERNEL option, you can specify a list of up to five characters in parentheses for multiple kernel density estimates. If there are more estimates than characters, the last character specified is used for the remaining estimates.
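
A minimal sketch of the SYMBOL= option follows. It assumes that line printer output is requested with the LINEPRINTER option in the PROC CAPABILITY statement and that the analysis variable is named length:
   proc capability lineprinter;
      histogram length / normal(symbol='*');
   run;
Here the fitted normal curve would be drawn with asterisks instead of the default character `N'.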

THETA=value|EST
specifies the lower threshold parameter \theta for curves requested with the BETA, EXPONENTIAL, GAMMA, LOGNORMAL, SB, and WEIBULL options, and the location parameter \theta for curves requested with the SU option. Enclose the THETA= option in parentheses after the curve option. See Example 4.1. The default value is zero. If you specify THETA=EST, an estimate is computed for \theta.

THRESHOLD=value
is an alias for the THETA= option. See the preceding entry for the THETA= option.
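
The following sketch shows the three ways of handling the threshold: the default of zero, an estimated value, and a value supplied through the THRESHOLD= alias. The variable length and the value 10 are hypothetical, and a specified threshold is assumed to lie below the smallest data value:
   proc capability;
      histogram length / lognormal;
      histogram length / lognormal(theta=est);
      histogram length / lognormal(threshold=10);
   run;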

VAXIS=name|value-list
[Graphics]
specifies the name of an AXIS statement describing the vertical axis. Alternatively, you can specify a value-list for the vertical axis. The PCTAXIS= option is an alias for the VAXIS= option. See Example 4.1.
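
As an illustration of the first form, the following sketch defines an AXIS statement and passes its name to the VAXIS= option. The label text and the variable length are assumptions:
   axis1 label=('Percent of Parts');

   proc capability;
      histogram length / vaxis=axis1;
   run;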

VMINOR=n
VM=n
[Graphics]
specifies the number of minor tick marks between each major tick mark on the vertical axis. Minor tick marks are not labeled. The default is zero.

VREF=value-list
draws reference lines perpendicular to the vertical axis at the values specified. Also see the CVREF=, LVREF=, and VREFCHAR= options.

VREFCHAR='character'
[Line Printer]
specifies the character used to form the lines requested by the VREF= option for a line printer. The default is a hyphen (-).

VREFLABELS='label1' ... 'labeln'
VREFLABEL='label1' ... 'labeln'
VREFLAB='label1' ... 'labeln'
specifies labels for the lines requested by the VREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can have up to 16 characters.
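
The VREF= and VREFLABELS= options are typically used together. The following sketch draws two horizontal reference lines and labels each one. The positions 10 and 20 and the variable length are hypothetical, and the positions are interpreted in the units of the vertical axis:
   proc capability;
      histogram length / vref=10 20
                         vreflabels='10 Percent' '20 Percent';
   run;
Two labels are supplied because the number of labels must equal the number of lines, and each label stays within the 16-character limit.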

VSCALE=COUNT | PERCENT | PROPORTION
specifies the scale of the vertical axis. The value COUNT scales the data in units of the number of observations per data unit. The value PERCENT scales the data in units of percent of observations per data unit. The value PROPORTION scales the data in units of proportion of observations per data unit. See Figure 4.5 for an illustration of VSCALE=COUNT. The default is PERCENT.
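
For instance, a sketch that switches the vertical axis from the default percent scale to counts, assuming a variable named length:
   proc capability;
      histogram length / vscale=count;
   run;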

W=n
[Graphics]
specifies the width in pixels of the fitted curve or the kernel density estimate curve. Enclose the W= option in parentheses after the distribution option or the KERNEL option (with the KERNEL option, you can specify a list of up to five W= values). For example, the following statements display a normal curve with a width of 3:
   proc capability;
      histogram length / normal(w=3);
   run;
The default is 1.

WEIBULL<(Weibull-options)>
displays a fitted Weibull density curve on the histogram. The curve equation is
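p(x) = \left\{ \begin{array}{ll}
 \frac{c h \times 100\%}{\sigma}
 \left( \frac{x - \theta}{\sigma} \right)^{c - 1}
 \exp\left( -\left( \frac{x - \theta}{\sigma} \right)^c \right)
 & \mbox{for } x \gt \theta \\
 0 & \mbox{for } x \leq \theta
 \end{array} \right.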

where 
 
		 \theta = threshold parameter
		 \sigma = scale parameter (\sigma \gt 0)
		 c = shape parameter (c \gt 0)
		 h = width of histogram interval
The parameter \theta must be less than the minimum data value. You can specify \theta with the THETA= Weibull-option. The default value for \theta is zero. If you specify THETA=EST, a maximum likelihood estimate is computed for \theta. You can specify \sigma and c with the SIGMA= and C= Weibull-options. By default, maximum likelihood estimates are computed for c and \sigma. For example, the following statements fit a Weibull distribution with \theta=15 and with maximum likelihood estimates for \sigma and c:

   proc capability;
      histogram length / weibull(theta=15);
   run;


Note that the maximum likelihood estimate of c is calculated iteratively using the Newton-Raphson approximation. The CDELTA=, CINITIAL=, and MAXITER= Weibull-options control the approximation.

The WEIBULL option can appear only once in a HISTOGRAM statement. Table 4.2 and Table 4.10 list the options that you can specify with the WEIBULL option. See Example 4.2 and "Formulas for Fitted Curves".
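
The options that control the Newton-Raphson approximation can be combined with the parameter options in the same parenthesized list. The following sketch uses hypothetical values for the convergence criterion, the starting value of c, and the iteration limit:
   proc capability;
      histogram length / weibull(theta=15 cdelta=0.0001
                                 cinitial=1.5 maxiter=40);
   run;
Adjusting these options may help when the default settings do not converge for a particular data set. The values shown are illustrative, not recommendations.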

ZETA=value
specifies a value for the scale parameter \zeta for lognormal density curves requested with the LOGNORMAL option. Enclose the ZETA= option in parentheses after the LOGNORMAL option. By default, the procedure calculates a maximum likelihood estimate for \zeta. You can specify the SCALE= option as an alias for the ZETA= option.
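
As a short sketch, the following two statements request the same lognormal curve, first with ZETA= and then with its SCALE= alias. The value 1.5 and the variable length are assumptions:
   proc capability;
      histogram length / lognormal(zeta=1.5);
      histogram length / lognormal(scale=1.5);
   run;
In both statements the shape parameter \sigma is still estimated by maximum likelihood, because it is not specified.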


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.