The UNIVARIATE Procedure

PROBPLOT Statement


Creates a probability plot by using high-resolution graphs, which compare ordered variable values with the percentiles of a specified theoretical distribution.

Alias: PROB
Default: Normal probability plot
Restriction: You can specify only one theoretical distribution in each PROBPLOT statement.
Tip: You can use multiple PROBPLOT statements.
Main discussion:
Featured in: Quantile-Quantile and Probability Plots


PROBPLOT <variable(s)> </ option(s)>;

To do this: Use this option:
Request a distribution

Specify beta probability plot with required shape parameters α, β BETA(beta-suboptions)

Specify exponential probability plot EXPONENTIAL(exponential-suboptions)

Specify gamma probability plot with a required shape parameter α GAMMA(gamma-suboptions)

Specify lognormal probability plot with a required shape parameter σ LOGNORMAL(lognormal-suboptions)

Specify normal probability plot NORMAL(normal-suboptions)

Specify three-parameter Weibull probability plot with a required shape parameter c WEIBULL(Weibull-suboptions)

Specify two-parameter Weibull probability plot WEIBULL2(Weibull2-suboptions)
Distribution suboptions

Specify shape parameter α for the beta or gamma distribution ALPHA=

Specify shape parameter β for the beta distribution BETA=

Specify shape parameter c for the Weibull distribution or c₀ for distribution reference line of the Weibull2 distribution C=

Specify μ₀ for distribution reference line for the normal distribution MU=

Specify σ₀ for distribution reference line for the beta, exponential, gamma, normal, Weibull, or Weibull2 distribution or the required shape parameter σ for the lognormal option SIGMA=

Specify slope of distribution reference line for the lognormal or Weibull2 distribution SLOPE=

Specify θ₀ for distribution reference line for the beta, exponential, gamma, lognormal, or Weibull distribution, or the lower known threshold θ₀ for the Weibull2 distribution THETA=

Specify ζ₀ for distribution reference line for the lognormal distribution ZETA=
Control appearance of distribution reference line

Specify color of distribution reference line COLOR=

Specify line type of distribution reference line L=

Specify width of distribution reference line W=
Control general plot layout

Create a grid GRID

Specify reference lines perpendicular to the horizontal axis HREF=

Specify labels for HREF lines HREFLABELS=

Specify a line style for grid lines LGRID=

Adjust sample size when computing percentiles NADJ=

Suppress frame around plotting area NOFRAME

Request minor tick marks for percentile axis PCTLMINOR

Specify tick mark labels for percentile axis PCTLORDER=

Adjust ranks when computing percentiles RANKADJ=

Display plot in square format SQUARE

Specify reference lines perpendicular to the vertical axis VREF=

Specify labels for VREF lines VREFLABELS=
Enhance the probability plot

Specify annotate data set ANNOTATE=

Specify color for axis CAXIS=

Specify color for frame CFRAME=

Specify color for HREF= lines CHREF=

Specify color for text CTEXT=

Specify color for VREF= lines CVREF=

Specify description for plot in graphics catalog DESCRIPTION=

Specify software font for text FONT=

Specify number of horizontal minor tick marks HMINOR=

Specify line style for HREF= lines LHREF=

Specify line style for VREF= lines LVREF=

Specify name for plot in graphics catalog NAME=

Specify number of vertical minor tick marks VMINOR=
Enhance the comparative probability plot

Apply annotation requested in ANNOTATE= data set to key cell only ANNOKEY

Specify color for filling frame for row labels CFRAMESIDE=

Specify color for filling frame for column labels CFRAMETOP=

Specify distance between tiles INTERTILE=

Specify number of columns in comparative probability plot NCOLS=

Specify number of rows in comparative probability plot NROWS=


Arguments

variable(s)
identifies one or more variables that the procedure uses to create probability plots.
Default: If you omit variable(s) in the PROBPLOT statement then the procedure creates a probability plot for each variable that you list in the VAR statement, or for each numeric variable in the DATA= data set if you omit a VAR statement.
Requirement: If you specify a VAR statement, use a subset of the variable(s) that you list in the VAR statement. Otherwise, variable(s) are any numeric variables in the DATA= data set.
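
The interaction between variable(s) and the VAR statement can be sketched as follows. The data set name measures and the variable width are borrowed from the SLOPE= example in this section; the variable gap and all data values are invented for illustration only.

```sas
/* Hypothetical data: GAP and every value below are made up for this sketch. */
data measures;
   input width gap @@;
   datalines;
9.8 20.1  10.1 19.7  10.4 20.6  9.6 19.9  10.0 20.3
10.3 20.0  9.9 19.5  10.2 20.8  9.7 20.2  10.5 19.8
;
run;

proc univariate data=measures;
   var width gap;     /* analysis variables */
   probplot width;    /* a subset of the VAR list; no distribution option, so a normal plot */
run;
```

If the PROBPLOT statement named no variable here, the description above implies one plot each for width and gap.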


Options

ALPHA=value(s)|EST
specifies the required shape parameter α (α > 0) for probability plots when you request the BETA or GAMMA option. The PROBPLOT statement creates a plot for each value that you specify.
Requirement: Enclose this suboption in parentheses following the BETA or GAMMA option.
Tip: To compute a maximum likelihood estimate for α, specify ALPHA=EST.

ANNOKEY
applies the annotation requested with the ANNOTATE= option to the key cell only. By default, PROC UNIVARIATE applies annotation to all of the cells.
Requirement: This option is ignored unless you specify the CLASS statement.
Tip: Use the KEYLEVEL= option in the CLASS statement to specify the key cell.
See also: the KEYLEVEL= option

ANNOTATE=SAS-data-set
specifies an input data set that contains annotate variables as described in SAS/GRAPH Software: Reference.
Alias: ANNO=
Tip: The ANNOTATE= data set that you specify in the PROBPLOT statement is used by all plots that this statement creates. You can also specify an ANNOTATE= data set in the PROC UNIVARIATE statement to enhance all the graphics displays that the procedure creates.
See also: the ANNOTATE= option in the PROC UNIVARIATE statement

BETA(ALPHA=value(s)|EST BETA=value(s)|EST <beta-suboptions>)
displays a beta probability plot for each combination of the required shape parameters α and β.
Requirement: You must specify the shape parameters with the ALPHA= and BETA= suboptions.
Interaction: To create a plot that is based on maximum likelihood estimates for α and β, specify ALPHA=EST and BETA=EST.
Tip: To obtain graphical estimates of α and β, specify lists of values in the ALPHA= and BETA= suboptions. Then select the combination of α and β that most nearly linearizes the point pattern.

To assess the point pattern, add a diagonal distribution reference line that corresponds to the lower threshold parameter θ₀ and the scale parameter σ₀ with the THETA= and SIGMA= suboptions. Alternatively, you can add a line that corresponds to estimated values of θ₀ and σ₀ with THETA=EST and SIGMA=EST.

Agreement between the reference line and the point pattern indicates that the beta distribution with parameters α, β, θ₀, and σ₀ is a good fit.

Main discussion: Beta Distribution
See also: the ALPHA= suboption and BETA= suboption
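
A minimal sketch of the two approaches described for the BETA option. The data set ratios and its numeric variable fraction (values between 0 and 1) are hypothetical names, assumed to exist already; the parameter values are arbitrary.

```sas
/* Assumes a data set RATIOS with a numeric variable FRACTION in (0,1); both are hypothetical. */
proc univariate data=ratios;
   /* Graphical estimation: one plot for each of the four alpha/beta combinations */
   probplot fraction / beta(alpha=2 3 beta=1 2);
   /* Maximum likelihood shape estimates, with a reference line for threshold 0 and scale 1 */
   probplot fraction / beta(alpha=est beta=est theta=0 sigma=1);
run;
```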

BETA=value(s)|EST
specifies the shape parameter β for probability plots when you request the BETA distribution option. PROC UNIVARIATE creates a plot for each value that you specify.
Alias: B=
Requirement: Enclose this suboption in parentheses after the BETA option.
Tip: To compute a maximum likelihood estimate for β, specify BETA=EST.

C=value(s)|EST
specifies the shape parameter c for probability plots when you request the WEIBULL option or WEIBULL2 option. C= is a required suboption in the WEIBULL option.
Requirement: Enclose this suboption in parentheses after the WEIBULL option or WEIBULL2 option.
Interaction: To request a distribution reference line in the WEIBULL2 option, you must specify both the C= and SIGMA= suboptions.
Tip: To compute a maximum likelihood estimate for c, specify C=EST.

CAXIS=color
specifies the color for the axes.
Alias: CAXES=
Default: the first color in the device color list
Interaction: This option overrides any COLOR= specification.

CFRAME=color
specifies the color for the area that is enclosed by the axes and frame.
Default: the area is not filled

CFRAMESIDE=color
specifies the color to fill the frame area for the row labels that display along the left side of the comparative probability plot. This color also fills the frame area for the label of the corresponding class variable (if you associate a label with the variable).
Default: These areas are not filled.
Requirement: This option is ignored unless you specify the CLASS statement.

CFRAMETOP=color
specifies the color to fill the frame area for the column labels that display across the top of the comparative probability plot. This color also fills the frame area for the label of the corresponding class variable (if you associate a label with the variable).
Default: These areas are not filled.
Requirement: This option does not apply unless you specify the CLASS statement.

CHREF=color
specifies the color for horizontal axis reference lines when you specify the HREF= option.
Default: the first color in the device color list

COLOR=color
specifies the color of the diagonal distribution reference line.
Default: the first color in the device color list
Requirement: You must enclose this suboption in parentheses after a distribution option keyword.

CTEXT=color
specifies the color for tick mark values and axis labels.
Default: the color that you specify for the CTEXT= option in the GOPTIONS statement. If you omit the GOPTIONS statement, the default is the first color in the device color list.

CVREF=color
specifies the color for the reference lines that you request with the VREF= option.
Alias: CV=
Default: the first color in the device color list

DESCRIPTION='string'
specifies a description, up to 40 characters long, that appears in the PROC GREPLAY master menu.
Alias: DES=
Default: the variable name

EXPONENTIAL<(exponential-suboptions)>
displays an exponential probability plot.
Alias: EXP
Tip: To assess the point pattern, add a diagonal distribution reference line that corresponds to θ₀ and σ₀ with the THETA= and SIGMA= suboptions. Alternatively, you can add a line that corresponds to estimated values of the threshold parameter θ₀ and the scale parameter σ₀ with the THETA=EST and SIGMA=EST suboptions.

Agreement between the reference line and the point pattern indicates that the exponential distribution with parameters θ₀ and σ₀ is a good fit.

Main discussion: Exponential Distribution
See also: the SIGMA= suboption and the THETA= suboption
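
As an illustration only, the two ways of requesting the reference line might look like this. The data set failures and the variable hours are hypothetical and assumed to exist; the scale value 100 is arbitrary.

```sas
/* Assumes a data set FAILURES with a positive numeric variable HOURS; both are hypothetical. */
proc univariate data=failures;
   /* Reference line from estimated threshold and scale */
   probplot hours / exponential(theta=est sigma=est);
   /* Same plot through the EXP alias, with user-supplied threshold 0 and scale 100 */
   probplot hours / exp(theta=0 sigma=100);
run;
```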

FONT=font
specifies a software font for the reference line labels and the axis labels.
Default: hardware characters
Interaction: FONT=font takes precedence over the FTEXT=font that you specify in the GOPTIONS statement.

GAMMA(ALPHA=value(s)|EST <gamma-suboptions>)
displays a gamma probability plot for each value of the required shape parameter α.
Requirement: You must specify the shape parameter with the ALPHA= suboption.
Interaction: To create a plot that is based on a maximum likelihood estimate for α, specify ALPHA=EST.
Tip: To obtain a graphical estimate of α, specify a list of values in the ALPHA= suboption. Then select the value that most nearly linearizes the point pattern.

To assess the point pattern, add a diagonal distribution reference line that corresponds to the threshold parameter θ₀ and the scale parameter σ₀ with the THETA= and SIGMA= suboptions. Alternatively, you can add a line that corresponds to estimated values of θ₀ and σ₀ with THETA=EST and SIGMA=EST.

Agreement between the reference line and the point pattern indicates that the gamma distribution with parameters α, θ₀, and σ₀ is a good fit.

Main discussion: Gamma Distribution
See also: the ALPHA= suboption, SIGMA= suboption, and THETA= suboption
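
A short sketch of the GAMMA option, first with a list of candidate shape values and then with estimates. The data set failures and variable hours are hypothetical and assumed to exist; the shape values are arbitrary.

```sas
/* Assumes a data set FAILURES with a positive numeric variable HOURS; both are hypothetical. */
proc univariate data=failures;
   /* One plot per shape value; pick the value that gives the straightest point pattern */
   probplot hours / gamma(alpha=1 2 4);
   /* Estimated shape, with a reference line from estimated threshold and scale */
   probplot hours / gamma(alpha=est theta=est sigma=est);
run;
```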

GRID
displays a grid, drawing reference lines that are perpendicular to the percentile axis at major tick marks.
Default: 1

HMINOR=n
specifies the number of minor tick marks between each major tick mark on the horizontal axis. PROC UNIVARIATE does not label minor tick marks.
Alias: HM=
Default: 0

HREF=value(s)
draws reference lines that are perpendicular to the horizontal axis at the values you specify.
See also: CHREF= option

HREFLABELS='label1' ... 'labeln'
specifies labels for the reference lines that you request with the HREF= option.
Alias: HREFLABEL= and HREFLAB=
Restriction: The number of labels must equal the number of reference lines. Labels can have up to 16 characters.
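
The HREF=, HREFLABELS=, CHREF=, and LHREF= options can be combined as in the following sketch. It assumes that the horizontal axis of the plot is the percentile axis, so that the HREF= values are percentiles; the source does not state this explicitly. The data set measures and variable width are taken from the SLOPE= example and are assumed to exist.

```sas
/* Assumption: HREF= values are read on the percentile axis. */
proc univariate data=measures;
   probplot width / href=25 50 75
                    hreflabels='Q1' 'Median' 'Q3'   /* one label per reference line */
                    chref=blue
                    lhref=3;
run;
```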

HREFLABPOS=n
specifies the vertical position of HREFLABELS= labels, where n is one of the following:
1 positions the labels along the top of the plot
2 staggers the labels from top to bottom of the plot
3 positions the labels along the bottom of the plot
Default: 1

INTERTILE=value
specifies the distance in horizontal percentage screen units between the framed areas, which are called tiles.
Default: The tiles are contiguous.
Requirement: This option is ignored unless you specify the CLASS statement.

L=linetype
specifies the line type for a diagonal distribution reference line.
Default: 1, which produces a solid line
Requirement: You must enclose this suboption in parentheses after a distribution option.

LGRID=linetype
specifies the line type for the grid that you request with the GRID option.
Default: 1, which produces solid lines

LHREF=linetype
specifies the line type for the reference lines that you request with the HREF= option.
Alias: LH=
Default: 2, which produces a dashed line

LOGNORMAL(SIGMA=value(s)|EST <lognormal-suboptions>)
displays a lognormal probability plot for each value of the required shape parameter σ.
Alias: LNORM
Requirement: You must specify the shape parameter with the SIGMA= suboption.
Interaction: To compute a maximum likelihood estimate for σ, specify SIGMA=EST.
Tip: To obtain a graphical estimate of σ, specify a list of values for the SIGMA= suboption, and select the value that most nearly linearizes the point pattern.

To assess the point pattern, add a diagonal distribution reference line that corresponds to the threshold parameter θ₀ and the scale parameter ζ₀ with the THETA= and ZETA= suboptions. Alternatively, you can add a line that corresponds to estimated values of θ₀ and ζ₀ with THETA=EST and ZETA=EST.

Agreement between the reference line and the point pattern indicates that the lognormal distribution with parameters σ, θ₀, and ζ₀ is a good fit.

Main discussion: Lognormal Distribution
See also: the SIGMA= suboption, SLOPE= suboption, THETA= suboption, and ZETA= suboption
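
A hedged sketch of the LOGNORMAL option, using the measures data set and width variable from the SLOPE= example (assumed to exist); the shape values in the list are arbitrary.

```sas
proc univariate data=measures;
   /* Graphical estimation: one plot for each shape value */
   probplot width / lognormal(sigma=0.5 1 2);
   /* Estimated shape, with a reference line from estimated threshold and scale */
   probplot width / lnorm(sigma=est theta=est zeta=est);
run;
```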

LVREF=linetype
specifies the line type for the reference lines that you request with the VREF= option.
Default: 2, which produces a dashed line

MU=value|EST
specifies the mean μ₀ for a normal probability plot requested with the NORMAL option.
Default: the sample mean
Requirement: You must enclose this suboption in parentheses after the NORMAL option.
Tip: Specify the MU= and SIGMA= suboptions together to request a distribution reference line. Specify MU=EST to request a distribution reference line with μ₀ equal to the sample mean.
Featured in: Displaying a Reference Line on a Normal Probability Plot

NADJ=value
specifies the adjustment value that is added to the sample size in the calculation of theoretical percentiles. For additional information, see Chambers et al. (1983).
Default: 1/4, as recommended by Blom (1958)

NAME='string'
specifies a name for the plot, up to eight characters long, that appears in the PROC GREPLAY master menu.
Default: UNIVAR

NCOLS=n
specifies the number of columns in the comparative probability plot.
Alias: NCOL=
Default: NCOLS=1, if you specify only one class variable, and NCOLS=2, if you specify two class variables.
Requirement: This option is ignored unless you specify the CLASS statement.
Interaction: If you specify two class variables, you can use the NCOLS= option with the NROWS= option.

NOFRAME
suppresses the frame around the area that is bounded by the axes.

NORMAL<(normal-suboptions)>
displays a normal probability plot. This is the default if you omit a distribution option.
Tip: To assess the point pattern, add a diagonal distribution reference line that corresponds to μ₀ and σ₀ with the MU= and SIGMA= suboptions. Alternatively, you can add a line that corresponds to estimated values of μ₀ and σ₀ with MU=EST and SIGMA=EST; the estimates of the mean μ₀ and the standard deviation σ₀ are the sample mean and sample standard deviation.

Agreement between the reference line and the point pattern indicates that the normal distribution with parameters μ₀ and σ₀ is a good fit.

Main discussion: Normal Distribution
See also: the MU= suboption and SIGMA= suboption
Featured in: Displaying a Reference Line on a Normal Probability Plot
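
For illustration, a normal probability plot with a reference line based on the sample mean and standard deviation might be requested as follows. The data set measures and variable width come from the SLOPE= example and are assumed to exist; the color, line type, and width values are arbitrary choices.

```sas
proc univariate data=measures;
   probplot width / normal(mu=est sigma=est   /* line from sample mean and standard deviation */
                           color=red l=2 w=2) /* appearance suboptions go inside the parentheses */
                    square;                   /* general options go outside */
run;
```

The placement matters: COLOR=, L=, and W= are suboptions of the distribution, while SQUARE belongs to the statement as a whole.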

NROWS=n
specifies the number of rows in the comparative probability plot.
Alias: NROW=
Default: 2
Requirement: This option is ignored unless you specify the CLASS statement.
Interaction: If you specify two class variables, you can use the NCOLS= option with the NROWS= option.
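
A sketch of a comparative probability plot that sets the layout explicitly. It assumes that measures also contains two classification variables, here called supplier and shift; both names, and the choice of a 2-by-3 layout, are hypothetical.

```sas
/* Assumes MEASURES contains WIDTH plus hypothetical class variables SUPPLIER and SHIFT. */
proc univariate data=measures;
   class supplier shift;
   probplot width / normal(mu=est sigma=est)
                    nrows=2 ncols=3   /* layout of the cells */
                    intertile=1       /* space between tiles */
                    cframetop=ligr cframeside=ligr;
run;
```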

PCTLMINOR
requests minor tick marks for the percentile axis.
Featured in: Displaying a Reference Line on a Normal Probability Plot

PCTLORDER=value(s)
specifies the tick marks that are labeled on the theoretical percentile axis.
Default: 1, 5, 10, 25, 50, 75, 90, 95, and 99
Range: 0 ≤ value ≤ 100
Restriction: The values that you specify must be in increasing order and cover the plotted percentile range. Otherwise, PROC UNIVARIATE uses the default.
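
A minimal sketch of a customized percentile axis, using the measures data set and width variable from the SLOPE= example (assumed to exist). The tick values are arbitrary; they are listed in increasing order and are meant to span the plotted range, as the restriction requires.

```sas
proc univariate data=measures;
   probplot width / normal
                    pctlorder=1 10 25 50 75 90 99  /* labeled ticks, in increasing order */
                    pctlminor                      /* minor ticks on the percentile axis */
                    grid lgrid=2;                  /* grid lines with line type 2 */
run;
```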

RANKADJ=value
specifies the adjustment value that PROC UNIVARIATE adds to the ranks in the calculation of theoretical percentiles. For additional information, see Chambers et al. (1983).
Default: -3/8, as recommended by Blom (1958)
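
Reading the RANKADJ= and NADJ= descriptions together suggests that the theoretical percentile for the i-th ordered value out of n is evaluated at (i + RANKADJ)/(n + NADJ), which is Blom's (i - 3/8)/(n + 1/4) under the defaults. That formula is an inference from the option descriptions, not a statement in this section. Under that assumption, the following sketch switches to the plotting position (i - 0.5)/n:

```sas
proc univariate data=measures;
   /* Assumed plotting position: (i + rankadj)/(n + nadj) = (i - 0.5)/n */
   probplot width / normal rankadj=-0.5 nadj=0;
run;
```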

SCALE=value
is an alias for the SIGMA= option when you request probability plots with the BETA, EXPONENTIAL, GAMMA, and WEIBULL options and for the ZETA= option when you request the LOGNORMAL option.
See also: the SIGMA= suboption and ZETA= suboption

SHAPE=value(s)|EST
is an alias for the ALPHA= option when you request gamma plots with the GAMMA option, for the SIGMA= option when you request lognormal plots with the LOGNORMAL option, and for the C= option when you request Weibull plots with the WEIBULL and WEIBULL2 options.
See also: the ALPHA= suboption, SIGMA= suboption, and C= suboption

SIGMA=value(s)|EST
specifies the parameter σ, where σ > 0. The interpretation and use of the SIGMA= option depend on which distribution you specify, as shown in Uses of the SIGMA= Suboption.

Uses of the SIGMA= Suboption
Distribution Option    Uses of the SIGMA= Option
BETA, EXPONENTIAL, GAMMA, WEIBULL    THETA=θ₀ and SIGMA=σ₀ request a distribution reference line that corresponds to θ₀ and σ₀.
LOGNORMAL    SIGMA=σ₁ ... σₙ requests n probability plots with shape parameters σ₁ ... σₙ. The SIGMA= option is required.
NORMAL    MU=μ₀ and SIGMA=σ₀ request a distribution reference line that corresponds to μ₀ and σ₀. SIGMA=EST requests a line with σ₀ equal to the sample standard deviation.
WEIBULL2    SIGMA=σ₀ and C=c₀ request a distribution reference line that corresponds to σ₀ and c₀.

Requirement: You must enclose this suboption in parentheses after the distribution option.
Tip: To compute a maximum likelihood estimate for σ, specify SIGMA=EST.
Featured in: Displaying a Reference Line on a Normal Probability Plot

SLOPE=value|EST
specifies the slope for a distribution reference line when you request the LOGNORMAL option or WEIBULL2 option.
Requirement: You must enclose this suboption in parentheses after the distribution option.
Tip: When you use the LOGNORMAL option and SLOPE= to request the line, you must also specify a threshold parameter value θ₀ with the THETA= suboption. SLOPE= is an alternative to the ZETA= suboption for specifying ζ₀, because the slope is equal to exp(ζ₀).

When you use the WEIBULL2 option and SLOPE= option to request the line, you must also specify a scale parameter value σ₀ with the SIGMA= suboption. SLOPE= is an alternative to the C= suboption for specifying c₀, because the slope is equal to 1/c₀.

For example, the first and second PROBPLOT statements produce the same probability plot, as do the third and fourth PROBPLOT statements:

proc univariate data=measures;
   /* lognormal: slope = exp(zeta) = exp(0) = 1 */
   probplot width /lognormal(sigma=2 theta=0 zeta=0);
   probplot width /lognormal(sigma=2 theta=0 slope=1);
   /* Weibull2: slope = 1/c = 1/.25 = 4 */
   probplot width /weibull2(sigma=2 theta=0 c=.25);
   probplot width /weibull2(sigma=2 theta=0 slope=4);
run;
Main discussion: Three-Parameter Weibull Distribution

SQUARE
displays the probability plot in a square frame.
Default: rectangular frame

THETA=value|EST
specifies the lower threshold parameter θ for probability plots when you request the BETA, EXPONENTIAL, GAMMA, LOGNORMAL, WEIBULL, or WEIBULL2 option.
Default: 0
Requirement: You must enclose this suboption in parentheses after the distribution option.
Interaction: When you use the WEIBULL2 option, the THETA= suboption specifies the known lower threshold θ₀, which by default is 0.

When you use the THETA= suboption with another distribution option, THETA= specifies θ₀ for a distribution reference line. To compute a maximum likelihood estimate for θ₀, specify THETA=EST. To request the line, you must also specify a scale parameter.

THRESHOLD=value
is an alias for the THETA= option. See the THETA= suboption.

VMINOR=n
specifies the number of minor tick marks between each major tick mark on the vertical axis. PROBPLOT does not label minor tick marks.
Alias: VM=
Default: 0

VREF=value(s)
draws reference lines that are perpendicular to the vertical axis at the value(s) that you specify.
See also: CVREF= option and LVREF= option

VREFLABELS='label1' ... 'labeln'
specifies labels for the reference lines that you request with the VREF= option.
Alias: VREFLABEL= and VREFLAB=
Restriction: The number of labels must equal the number of reference lines. Labels can have up to 16 characters.

W=n
specifies the width in pixels for a diagonal distribution line.
Default: 1
Requirement: You must enclose this suboption in parentheses after the distribution option.

WEIBULL(C=value(s)|EST <Weibull-suboptions>)
creates a three-parameter Weibull probability plot for each value of the required shape parameter c.
Alias: WEIB
Requirement: You must specify the shape parameter with the C= suboption.
Interaction: To create a plot that is based on a maximum likelihood estimate for c, specify C=EST.
Tip: To obtain a graphical estimate of c, specify a list of values in the C= suboption. Then select the value that most nearly linearizes the point pattern.

To assess the point pattern, add a diagonal distribution reference line that corresponds to θ₀ and σ₀ with the THETA= and SIGMA= suboptions. Alternatively, you can add a line that corresponds to estimated values of θ₀ and σ₀ with THETA=EST and SIGMA=EST.

Agreement between the reference line and the point pattern indicates that the Weibull distribution with parameters c, θ₀, and σ₀ is a good fit.

Main discussion: Three-Parameter Weibull Distribution
See also: the C= suboption, SIGMA= suboption, and THETA= suboption
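
A brief sketch of the WEIBULL option, first with candidate shape values and then with estimates. The data set failures and variable hours are hypothetical and assumed to exist; the shape values are arbitrary.

```sas
/* Assumes a data set FAILURES with a positive numeric variable HOURS; both are hypothetical. */
proc univariate data=failures;
   /* One plot per shape value; C= is required for the three-parameter plot */
   probplot hours / weibull(c=1 1.5 2);
   /* Estimated shape, with a reference line from estimated threshold and scale */
   probplot hours / weib(c=est theta=est sigma=est);
run;
```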

WEIBULL2<(Weibull2-suboptions)>
creates a two-parameter Weibull probability plot. Use this distribution when your data have a known lower threshold θ₀, which by default is 0. To specify the threshold value θ₀, use the THETA= suboption.
Alias: W2
Tip: An advantage of the two-parameter Weibull plot over the three-parameter Weibull plot is that the parameters c and σ can be estimated from the slope and intercept of the point pattern. A disadvantage is that the two-parameter Weibull distribution applies only in situations where the threshold parameter is known.
Tip: To obtain a graphical estimate of c, specify a list of values for the C= suboption. Then select the value that most nearly linearizes the point pattern.

To assess the point pattern, add a diagonal distribution reference line that corresponds to σ₀ and c₀ with the SIGMA= and C= suboptions. Alternatively, you can add a distribution reference line that corresponds to estimated values of σ₀ and c₀ with SIGMA=EST and C=EST.

Agreement between the reference line and the point pattern indicates that the Weibull2 distribution with parameters c₀, θ₀, and σ₀ is a good fit.

Main discussion: Two-Parameter Weibull Distribution
See also: the C= suboption, SIGMA= suboption, SLOPE= suboption, and THETA= suboption
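
A sketch of the WEIBULL2 option with a known threshold of 0. The data set failures and variable hours are hypothetical and assumed to exist; the scale and slope values are arbitrary.

```sas
/* Assumes a data set FAILURES with a positive numeric variable HOURS; both are hypothetical. */
proc univariate data=failures;
   /* Known threshold 0; reference line from estimated scale and shape */
   probplot hours / weibull2(theta=0 sigma=est c=est);
   /* W2 alias; SLOPE=0.5 stands in for C=2 because the slope equals 1/c */
   probplot hours / w2(sigma=100 slope=0.5);
run;
```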

ZETA=value|EST
specifies a value for the scale parameter ζ for the lognormal probability plots when you request the LOGNORMAL option.
Requirement: You must enclose this suboption in parentheses after the LOGNORMAL option.
Interaction: To request a distribution reference line with intercept θ₀ and slope exp(ζ₀), specify THETA=θ₀ and ZETA=ζ₀.



Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.