The UNIVARIATE Procedure

HISTOGRAM Statement


Creates histograms using high-resolution graphics and optionally superimposes parametric and nonparametric density curve estimates.

Alias: HIST
Tip: You can use multiple HISTOGRAM statements.
Featured in: Fitting Density Curves and Creating a Two-Way Comparative Histogram


HISTOGRAM <variable(s)> </ option(s)>;

To do this Use this option
Create output data set with information on histogram intervals OUTHISTOGRAM=
Request estimated density curve

Fit beta density with threshold parameter θ, scale parameter σ, and shape parameters α and β BETA(beta-suboptions)

Fit exponential density with threshold parameter θ and scale parameter σ EXPONENTIAL(exponential-suboptions)

Fit gamma density with threshold parameter θ, scale parameter σ, and shape parameter α GAMMA(gamma-suboptions)

Fit nonparametric kernel density estimates KERNEL(kernel-suboptions)

Fit lognormal density with threshold parameter θ, scale parameter ζ, and shape parameter σ LOGNORMAL(lognormal-suboptions)

Fit normal density with mean μ and standard deviation σ NORMAL(normal-suboptions)

Fit Weibull density with threshold parameter θ, scale parameter σ, and shape parameter c WEIBULL(Weibull-suboptions)
Parametric density curve suboptions

Specify shape parameter α for fitted beta or gamma curve ALPHA=

Specify second shape parameter β for fitted beta curve BETA=

Specify shape parameter c for fitted Weibull curve C=

Specify the mean μ for fitted normal curve MU=

Specify scale parameter σ for the fitted beta, exponential, gamma, and Weibull curves; standard deviation σ for the fitted normal curve; or shape parameter σ for the fitted lognormal curve SIGMA=

Specify threshold parameter θ for fitted beta curve, exponential curve, gamma curve, lognormal curve, and Weibull curve THETA=

Specify scale parameter ζ for fitted lognormal curve ZETA=
Nonparametric density curve suboptions

Specify standardized bandwidth parameter c for fitted kernel density estimates C=

Specify type of kernel density curve K=
Control appearance of fitted density curves

Specify color of fitted curve COLOR=

Fill area under fitted curve FILL

Specify line type of fitted curve L=

Display table of midpoints and percentages for histogram intervals MIDPERCENTS

Suppress the table summarizing the fitted curve NOPRINT

List percentages for calculated and estimated quantiles PERCENTS=

Specify width of fitted density curve W=
Control general histogram layout

Specify width for the bars BARWIDTH=

Force creation of a histogram FORCEHIST

Create a grid GRID

Specify offset for horizontal axis HOFFSET=

Specify reference lines perpendicular to the horizontal axis HREF=

Specify labels for HREF= lines HREFLABELS=

Specify vertical position of labels for HREF= lines HREFLABPOS=

Specify a line style for grid lines LGRID=

Specify midpoints for histogram intervals MIDPOINTS=

Suppress histogram bars NOBARS

Suppress frame around plotting area NOFRAME

Suppress label for horizontal axis NOHLABEL

Suppress plot NOPLOT

Suppress label for vertical axis NOVLABEL

Suppress tick marks and tick mark labels for vertical axis NOVTICK

Include right endpoint in interval RTINCLUDE

Turn and vertically string out characters in labels for vertical axis TURNVLABELS

Specify tick mark values for vertical axis VAXIS=

Specify label for vertical axis VAXISLABEL=

Specify length of offset at upper end of vertical axis VOFFSET=

Specify reference lines perpendicular to the vertical axis VREF=

Specify labels for VREF= lines VREFLABELS=

Specify horizontal position of labels for VREF= lines VREFLABPOS=

Specify scale for vertical axis VSCALE=

Specify line thickness for axes and frame WAXIS=

Specify line thickness for grid WGRID=
Enhance the graph

Specify annotate data set ANNOTATE=

Specify color for axis CAXIS=

Specify color of outlines of histogram bars CBARLINE=

Specify color for filling under curve CFILL=

Specify color for frame CFRAME=

Specify color for grid lines CGRID=

Specify color for HREF= lines CHREF=

Specify color for text CTEXT=

Specify color for VREF= lines CVREF=

Specify description for plot in graphics catalog DESCRIPTION=

Specify software font for text FONT=

Specify height of text used outside framed areas HEIGHT=

Specify number of horizontal minor tick marks HMINOR=

Specify software font for text inside framed areas INFONT=

Specify height of text inside framed areas INHEIGHT=

Specify line style for HREF= lines LHREF=

Specify line style for VREF= lines LVREF=

Specify name for plot in graphics catalog NAME=

Specify pattern for filling under curve PFILL=

Specify number of vertical minor tick marks VMINOR=

Specify line thickness for bar outlines WBARLINE=
Enhance comparative histograms

Apply annotation requested in ANNOTATE= data set to key cell only ANNOKEY

Specify color for filling frame for row labels CFRAMESIDE=

Specify color for filling frame for column labels CFRAMETOP=

Specify color for proportion of frequency bar CPROP=

Specify distance between tiles INTERTILE=

Specify maximum number of bins to display MAXNBIN=

Limit the number of bins that display to within a specified number of standard deviations above and below mean of data in key cell MAXSIGMAS=

Specify number of columns in comparative histogram NCOLS=

Specify number of rows in comparative histogram NROWS=


Arguments

variable(s)
identifies one or more analysis variables that the procedure uses to create histograms.
Default: If you omit variable(s) in the HISTOGRAM statement, then the procedure creates a histogram for each variable that you list in the VAR statement, or for each numeric variable in the DATA= data set if you omit a VAR statement.
Requirement: If you specify a VAR statement, use a subset of the variable(s) that you list in the VAR statement. Otherwise, variable(s) are any numeric variables in the DATA= data set.
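As a sketch of how the default and the requirement interact, the following statements assume a hypothetical data set PARTS with two numeric variables, LEN and WIDTH (the data set, the variables, and the seed are illustrative and do not come from this chapter). The first step requests a histogram for a subset of the VAR list; the second omits variable(s) and should therefore produce one histogram per VAR variable:

```sas
/* Hypothetical data: two numeric analysis variables */
data parts;
   do i = 1 to 100;
      len   = 10 + 0.5*rannor(12345);
      width =  4 + 0.2*rannor(12345);
      output;
   end;
   drop i;
run;

/* Histogram for LEN only: LEN is a subset of the VAR list */
proc univariate data=parts;
   var len width;
   histogram len;
run;

/* No variable(s): one histogram for each variable in the VAR statement */
proc univariate data=parts;
   var len width;
   histogram;
run;
```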


Options

ALPHA=value
specifies the shape parameter α for fitted density curves when you request the BETA or GAMMA option.
Alias: A= if you use it as a beta-suboption; SHAPE= if you use it as a gamma-suboption
Default: a maximum likelihood estimate
Requirement: Enclose this suboption in parentheses after the BETA option or GAMMA option.

ANNOKEY
applies the annotation requested with the ANNOTATE= option to the key cell only. By default, PROC UNIVARIATE applies annotation to all of the cells.
Requirement: This option is ignored unless you specify the CLASS statement.
Tip: Use the KEYLEVEL= option in the CLASS statement to specify the key cell.
See also: the KEYLEVEL= option

ANNOTATE=SAS-data-set
specifies an input data set that contains annotate variables as described in SAS/GRAPH Software: Reference.
Alias: ANNO=
Tip: You can also specify an ANNOTATE= data set in the PROC UNIVARIATE statement to enhance all the graphic displays that the procedure creates.
See also: ANNOTATE= in the PROC UNIVARIATE statement

BARWIDTH=value
specifies the width of the histogram bars in screen percent units.

BETA<(beta-suboptions)>
displays a fitted beta density curve on the histogram.
Restriction: The BETA option can occur only once in a HISTOGRAM statement.
Interaction: The beta distribution is bounded below by the parameter θ and above by the value θ + σ. Use the THETA= and SIGMA= suboptions to specify these parameters. The default values for THETA= and SIGMA= are 0 and 1, respectively. You can specify THETA=EST and SIGMA=EST to request maximum likelihood estimates for θ and σ.

Note: Three- and four-parameter maximum likelihood estimation may not always converge.

Interaction: The beta distribution has two shape parameters, α and β. If these parameters are known, you can specify their values with the ALPHA= and BETA= suboptions. By default, PROC UNIVARIATE computes maximum likelihood estimates for α and β.
Main discussion: See Beta Distribution
See also: the ALPHA= suboption, BETA= suboption, SIGMA= suboption, and THETA= suboption
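The following sketch illustrates the two interactions. It assumes a hypothetical data set RATIOS whose variable RATIO lies between 0 and 1 (simulated here as a ratio of gamma variates; the data set, variable, and seed are illustrative). Each BETA option appears in its own HISTOGRAM statement because BETA can occur only once per statement:

```sas
/* Hypothetical data: values strictly between 0 and 1 */
data ratios;
   do i = 1 to 200;
      a = rangam(2718, 2);
      b = rangam(2718, 3);
      ratio = a / (a + b);
      output;
   end;
   keep ratio;
run;

proc univariate data=ratios;
   var ratio;
   /* Defaults: THETA=0 and SIGMA=1; shape parameters estimated */
   histogram ratio / beta;
   /* Known shape parameters supplied instead of estimates */
   histogram ratio / beta(alpha=2 beta=3);
   /* Four-parameter fit; estimation may not always converge */
   histogram ratio / beta(theta=est sigma=est);
run;
```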

BETA=value
specifies the second shape parameter β for the fitted beta density curve when you request the BETA option.
Alias: B=
Default: a maximum likelihood estimate
Requirement: Enclose this suboption in parentheses after the BETA option.

C=value
specifies the shape parameter c for the fitted Weibull density curve when you request the WEIBULL option.
Default: a maximum likelihood estimate
Requirement: Enclose this suboption in parentheses after the WEIBULL option.

C=value(s)|MISE
specifies the standardized bandwidth parameter c for kernel density estimates when you request the KERNEL option.
Default: the bandwidth that minimizes the approximate MISE.
Restriction: You can specify up to five values to request multiple estimates.
Requirement: Enclose this suboption in parentheses after the KERNEL option.
Interaction: You can also use the C= suboption with the K= suboption, which specifies the kernel function, to compute multiple estimates. If you specify more kernel functions than bandwidths, PROC UNIVARIATE repeats the last bandwidth in the list for the remaining estimates. Likewise, if you specify more bandwidths than kernel functions, then PROC UNIVARIATE repeats the last kernel function for the remaining estimates. For example, the following statements compute three density estimates:
proc univariate;
   var length;
   histogram length / kernel(c=1 2 3 k=normal quadratic);
run;
The first uses a normal kernel and a bandwidth of 1, the second uses a quadratic kernel and a bandwidth of 2, and the third uses a quadratic kernel and a bandwidth of 3.
Tip: To estimate a bandwidth that minimizes the approximate mean integrated square error (MISE) use the C=MISE suboption. For example, the following statements compute three density estimates:
proc univariate;
   var length;
   histogram length / kernel(c=0.5 1.0 mise);
run;
The first two estimates have standardized bandwidths of 0.5 and 1.0, respectively, and the third has a bandwidth that minimizes the approximate MISE.

CAXIS=color
specifies the color for the axes and tick marks.
Alias: CAXES= and CA=
Default: the first color in the device color list

CBARLINE=color
specifies the color for the outline of the histogram bars.
Default: the first color in the device color list
Featured in: Fitting Density Curves

CFILL=color
specifies the color to fill the bars of the histogram (or the area under a fitted density curve if you also specify the FILL option).
See also: FILL option and PFILL= option
Featured in: Fitting Density Curves and Creating a Two-Way Comparative Histogram

CFRAME=color
specifies the color for the area that is enclosed by the axes and frame.
Alias: CRF=
Default: The area is not filled.

CFRAMESIDE=color
specifies the color to fill the frame area for the row labels that display along the left side of the comparative histogram. This color also fills the frame area for the label of the corresponding class variable (if you associate a label with the variable).
Default: These areas are not filled.
Requirement: This option is ignored unless you specify the CLASS statement.

CFRAMETOP=color
specifies the color to fill the frame area for the column labels that display across the top of the comparative histogram. This color also fills the frame area for the label of the corresponding class variable (if you associate a label with the variable).
Default: These areas are not filled.
Requirement: This option is ignored unless you specify the CLASS statement.

CGRID=color
specifies the color for grid lines when a grid displays on the histogram.
Default: the first color in the device color list
Interaction: This option automatically invokes the GRID option.

CHREF=color
specifies the color for horizontal axis reference lines when you specify the HREF= option.
Default: the first color in the device color list

COLOR=color
specifies the color of the density curve.
Requirement: You must enclose this suboption in parentheses after the density curve option or the KERNEL option.
Interaction: You can specify as a KERNEL suboption a list of up to five colors in parentheses for multiple kernel density estimates. If there are more estimates than colors, the remaining estimates use the last color that you specify.

CPROP=color | EMPTY
specifies the color for a horizontal bar whose length (relative to the width of the tile) indicates the proportion of the total frequency that is represented by the corresponding cell in a comparative histogram.
Default: bars do not display
Requirement: This option is ignored unless you specify the CLASS statement.
Tip: Use the keyword EMPTY to display empty bars.

CTEXT=color
specifies the color for tick mark values and axis labels.
Alias: CT=
Default: The color that you specify for the CTEXT= option in the GOPTIONS statement. If you omit the GOPTIONS statement, the default is the first color in the device color list.

CVREF=color
specifies the color for the reference lines that you request with the VREF= option.
Alias: CV=
Default: the first color in the device color list

DESCRIPTION='string'
specifies a description, up to 40 characters long, that appears in the PROC GREPLAY master menu.
Alias: DES=
Default: the variable name

EXPONENTIAL<(exponential-suboptions)>
displays a fitted exponential density curve on the histogram.
Alias: EXP
Restriction: The EXPONENTIAL option can occur only once in a HISTOGRAM statement.
Interaction: The parameter θ must be less than or equal to the minimum data value. Use the THETA= suboption to specify θ. The default value for θ is zero. Specify THETA=EST to request the maximum likelihood estimate for θ.
Interaction: Use the SIGMA= suboption to specify σ. By default, PROC UNIVARIATE computes a maximum likelihood estimate for σ. For example, the following statements fit an exponential curve with θ=10 and with a maximum likelihood estimate for σ:
proc univariate; 
   var length;
   histogram / exponential(theta=10 l=2 color=red);
run;
Main discussion: See Exponential Distribution
See also: the SIGMA= suboption and THETA= suboption
Featured in: Fitting Density Curves

FILL
fills areas under the fitted density curve or the kernel density estimate with colors and patterns.
Restriction: The FILL suboption can occur with only one fitted curve.
Requirement: Enclose the FILL suboption in parentheses after a density curve option or the KERNEL option.
Interaction: The CFILL= and PFILL= options specify the color and pattern for the area under the curve.
See also: For a list of available colors and patterns, see SAS/GRAPH Software: Reference
Featured in: Fitting Density Curves

FONT=font
specifies a software font for the axis labels.
Default: hardware characters
Interaction: The FONT= font takes precedence over the FTEXT= font that you specify in the GOPTIONS statement.

FORCEHIST
forces PROC UNIVARIATE to create a histogram when there is only one unique observation. By default, if the standard deviation of the data is zero then PROC UNIVARIATE does not create a histogram.

GAMMA<(gamma-suboptions)>
displays a fitted gamma density curve on the histogram.
Restriction: The GAMMA option can occur only once in a HISTOGRAM statement.
Interaction: The parameter θ must be less than the minimum data value. Use the THETA= suboption to specify θ. The default value for θ is zero. Specify THETA=EST to request the maximum likelihood estimate for θ.
Interaction: Use the ALPHA= and SIGMA= suboptions to specify the shape parameter α and the scale parameter σ. By default, PROC UNIVARIATE computes maximum likelihood estimates for α and σ. For example, the following statements fit a gamma curve with θ=4 and with maximum likelihood estimates for α and σ:
proc univariate;
   var length;
   histogram length / gamma(theta=4);
run;
PROC UNIVARIATE calculates the maximum likelihood estimate of α iteratively by using the Newton-Raphson approximation.
Main discussion: See Gamma Distribution
See also: the SIGMA= suboption, ALPHA= suboption, and THETA= suboption

GRID
displays a grid on the histogram. Grid lines are horizontal lines that are positioned at major tick marks on the vertical axis.
See also: the CGRID= option

HEIGHT=value
specifies the height in percentage screen units of text for axis labels, tick mark labels, and legends. This option takes precedence over the HTEXT= option in the GOPTIONS statement.

HMINOR=n
specifies the number of minor tick marks between each major tick mark on the horizontal axis. PROC UNIVARIATE does not label minor tick marks.
Alias: HM=
Default: 0

HOFFSET=value
specifies the offset in percentage screen units at both ends of the horizontal axis.
Tip: Use HOFFSET=0 to eliminate the default offset.

HREF=value(s)
draws reference lines that are perpendicular to the horizontal axis at the values that you specify.
See also: CHREF= option and LHREF= option

HREFLABELS='label1' ... 'labeln'
specifies labels for the reference lines that you request with the HREF= option.
Alias: HREFLABEL= and HREFLAB=
Restriction: The number of labels must equal the number of reference lines. Labels can have up to 16 characters.

HREFLABPOS=n
specifies the vertical position of HREFLABELS= labels, where n is
1 positions the labels along the top of the histogram
2 staggers the labels from top to bottom
3 positions the labels along the bottom.
Default: 1
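The reference-line options are easiest to read together. The sketch below assumes a hypothetical data set PARTS with a numeric variable LEN (both illustrative, as are the limits 9.5 and 10.5). It draws two reference lines, labels each one, staggers the labels from top to bottom, and sets the line color and line type:

```sas
/* Hypothetical data centered near 10 */
data parts;
   do i = 1 to 100;
      len = 10 + 0.5*rannor(12345);
      output;
   end;
   drop i;
run;

proc univariate data=parts;
   var len;
   /* Two HREF= values, so two labels (each at most 16 characters) */
   histogram len / href=9.5 10.5
                   hreflabels='Lower limit' 'Upper limit'
                   hreflabpos=2
                   chref=red
                   lhref=2;
run;
```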

INFONT=font
specifies a software font to use for text inside the framed areas of the histogram. The INFONT= option takes precedence over the FTEXT= option in the GOPTIONS statement.
See also: For a list of fonts, see SAS/GRAPH Software: Reference.

INHEIGHT=value
specifies the height, in percentage screen units of text, to use inside the framed areas of the histogram.
Default: The height that you specify with the HEIGHT= option. If you do not specify the HEIGHT= option, the default height is the height that you specify with the HTEXT= option in the GOPTIONS statement.

INTERTILE=value
specifies the distance in horizontal percentage screen units between the framed areas, which are called tiles.
Default: 0.75 percentage screen units
Requirement: This option is ignored unless you specify the CLASS statement.
Featured in: Creating a Two-Way Comparative Histogram

K=NORMAL | QUADRATIC | TRIANGULAR
specifies the kernel function (normal, quadratic, or triangular) that PROC UNIVARIATE uses to compute a kernel density estimate.
Default: normal kernel
Restriction: You can specify up to five values to request multiple estimates.
Requirement: You must enclose this suboption in parentheses after the KERNEL option.
Interaction: You can also use the K= suboption with the C= suboption, which specifies standardized bandwidths. If you specify more kernel functions than bandwidths, PROC UNIVARIATE repeats the last bandwidth in the list for the remaining estimates. Likewise, if you specify more bandwidths than kernel functions, PROC UNIVARIATE repeats the last kernel function for the remaining estimates. For example, the following statements compute three estimates with bandwidths of 0.5, 1.0, and 1.5:
proc univariate;
   var length;
   histogram length / kernel(c=0.5 1.0 1.5 k=normal quadratic);
run;
The first estimate uses a normal kernel, and the last two estimates use a quadratic kernel.

KERNEL<(kernel-suboptions)>
superimposes up to five kernel density estimates on the histogram. By default, PROC UNIVARIATE uses the AMISE method to compute kernel density estimates.
Tip: To request multiple kernel density estimates on the same histogram, specify a list of values for either the C= suboption or K= suboption.
Main discussion: Kernel Density Estimates
See also: C= suboption and K= suboption

L=linetype
specifies the line type for a fitted density curve or kernel density estimate curve.
Default: 1, which produces a solid line.
Requirement: You must enclose the L= suboption in parentheses after a density curve option or the KERNEL option.
Interaction: If you use the L= suboption with the KERNEL option, you can specify a single line type or a list of line types.
See also: For a list of available line types, see SAS/GRAPH Software: Reference
Featured in: Fitting Density Curves

LGRID=linetype
specifies the line type for the grid when a grid displays on the histogram.
Default: 1, which produces a solid line
Interaction: This option automatically invokes the GRID option.

LHREF=linetype
specifies the line type for the reference lines that you request with the HREF= option.
Alias: LH=
Default: 2, which produces a dashed line

LOGNORMAL<(lognormal-suboptions)>
displays a fitted lognormal density curve on the histogram.
Restriction: The LOGNORMAL option can occur only once in a HISTOGRAM statement.
Interaction: The parameter θ must be less than the minimum data value. Use the THETA= suboption to specify θ. The default value for θ is zero. Specify THETA=EST to request the maximum likelihood estimate for θ.
Interaction: Use the SIGMA= and ZETA= suboptions to specify σ and ζ. By default, PROC UNIVARIATE computes maximum likelihood estimates for σ and ζ. For example, the following statements fit a lognormal curve with the default value θ=0 and with maximum likelihood estimates for σ and ζ:
proc univariate;
   var length;
   histogram length / lognormal;
run;
Main discussion: See Lognormal Distribution
See also: the ZETA= suboption, SIGMA= suboption, and THETA= suboption

LVREF=linetype
specifies the line type for the reference lines that you request with the VREF= option.
Alias: LV=
Default: 2, which produces a dashed line

MAXNBIN=n
specifies the maximum number of bins in the comparative histogram that display. This option is useful when the scales or ranges of the data distributions differ greatly from cell to cell.

By default, PROC UNIVARIATE determines the bin size and midpoints for the key cell, and then extends the midpoint list to accommodate the data ranges for the remaining cells. However, if the cell scales differ considerably, the resulting number of bins may be so great that each cell histogram is scaled into a narrow region. By using MAXNBIN= to limit the number of bins, you can narrow the window about the data distribution in the key cell.
Requirement: This option is ignored unless you specify the CLASS statement.
Tip: MAXNBIN= provides an alternative to the MAXSIGMAS= option.

MAXSIGMAS=value
limits the bins that display in the comparative histogram to those within value standard deviations above and below the mean of the data in the key cell, where the standard deviation is also computed from the data in the key cell. This option is useful when the scales or ranges of the data distributions differ greatly from cell to cell.

By default, PROC UNIVARIATE determines the bin size and midpoints for the key cell, and then extends the midpoint list to accommodate the data ranges for the remaining cells. However, if the cell scales differ considerably, the resulting number of bins may be so great that each cell histogram is scaled into a narrow region. By using MAXSIGMAS= to limit the number of bins, you can narrow the window that surrounds the data distribution in the key cell.
Requirement: This option is ignored unless you specify the CLASS statement.
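As a sketch, suppose a hypothetical data set SUPPLIERS in which supplier A is tightly distributed and supplier B is widely spread (data set, variables, and seed are illustrative). The KEYLEVEL= syntax shown on the CLASS statement is an assumption based on the CLASS statement documentation; check it against your release. MAXSIGMAS=3 should keep the display within three key-cell standard deviations of the key-cell mean:

```sas
/* Hypothetical data: cell B has ten times the spread of cell A */
data suppliers;
   do supplier = 'A', 'B';
      do i = 1 to 100;
         if supplier = 'A' then len = 10 + 0.3*rannor(4321);
         else                   len = 10 + 3.0*rannor(4321);
         output;
      end;
   end;
   drop i;
run;

proc univariate data=suppliers noprint;
   class supplier / keylevel='A';   /* assumed syntax for the key cell */
   var len;
   histogram len / maxsigmas=3;
run;
```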

MIDPERCENTS
requests a table that lists the midpoints and percentage of observations in each histogram interval.
Interaction: If you specify MIDPERCENTS in parentheses after a density estimate option, PROC UNIVARIATE displays a table that lists the midpoints, the observed percentage of observations, and the estimated percentage of the population in each interval (estimated from the fitted distribution).

MIDPOINTS=value(s)|KEY|UNIFORM
specifies how to determine the midpoints for the histogram intervals, where

value(s)
determines the width of the histogram bars as the difference between consecutive midpoints. PROC UNIVARIATE uses the same value(s) for all variables.
Range: The range of midpoints, extended at each end by half of the bar width, must cover the range of the data. For example, if you specify
midpoints=2 to 10 by 0.5 
then all of the observations should fall between 1.75 and 10.25.
Requirement: You must use evenly spaced midpoints which you list in increasing order.

KEY
determines the midpoints for the data in the key cell. The initial number of midpoints is based on the number of observations in the key cell by using the method of Terrell and Scott (1985). PROC UNIVARIATE extends the midpoint list for the key cell in either direction as necessary until it spans the data in the remaining cells.
Requirement: This option is ignored unless you specify the CLASS statement.

UNIFORM
determines the midpoints by using all the observations as if there were no cells. In other words, the number of midpoints is based on the total sample size by using the method of Terrell and Scott (1985).
Requirement: This option does not apply unless you specify the CLASS statement.

Default: If you use a CLASS statement, MIDPOINTS=KEY; however, if the key cell is empty then MIDPOINTS=UNIFORM. Otherwise, PROC UNIVARIATE computes the midpoints by using an algorithm (Terrell and Scott, 1985) that is primarily applicable to continuous data that are approximately normally distributed.
Featured in: Fitting Density Curves and Creating a Two-Way Comparative Histogram
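The range requirement for a midpoint list can be checked by hand: with MIDPOINTS=2 to 10 by 0.5 the bar width is 0.5, so the bars span 2 - 0.25 = 1.75 through 10 + 0.25 = 10.25. The sketch below assumes a hypothetical data set PARTS whose variable LEN is simulated to fall between 2 and 10 (names and seed are illustrative), so every observation lies inside that span:

```sas
/* Hypothetical data: uniform values strictly between 2 and 10 */
data parts;
   do i = 1 to 100;
      len = 2 + 8*ranuni(12345);
      output;
   end;
   drop i;
run;

proc univariate data=parts;
   var len;
   /* Evenly spaced, increasing midpoints; bars cover 1.75 to 10.25 */
   histogram len / midpoints=2 to 10 by 0.5;
run;
```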

MU=value
specifies the mean parameter μ for normal density curves.
Default: the sample mean
Requirement: You must enclose this suboption in parentheses after the NORMAL option.

NAME='string'
specifies a name for the plot, up to eight characters long, that appears in the PROC GREPLAY master menu.
Default: UNIVAR

NCOLS=n
specifies the number of columns in the comparative histogram.
Alias: NCOL=
Default: NCOLS=1, if you specify only one class variable, and NCOLS=2, if you specify two class variables.
Requirement: This option is ignored unless you specify the CLASS statement.
Interaction: If you specify two class variables, you can use the NCOLS= option with the NROWS= option.
Featured in: Creating a Two-Way Comparative Histogram

NOBARS
suppresses drawing of histogram bars.
Tip: Use this option to display only the fitted curves.

NOFRAME
suppresses the frame that surrounds the subplot area.

NOHLABEL
suppresses the label for the horizontal axis.
Tip: Use this option to reduce clutter.

NOPLOT
suppresses the creation of a plot.
Alias: NOCHART
Tip: Use NOPLOT when you want to display only descriptive statistics for a fitted density or create an OUTHISTOGRAM= data set.

NOPRINT
suppresses the table of statistics that summarizes the fitted density curve.
Requirement: Enclose this option in the parentheses that follow the density curve option.
Featured in: Fitting Density Curves

NORMAL<(normal-suboptions)>
displays a fitted normal density curve on the histogram.
Restriction: The NORMAL option can occur only once in a HISTOGRAM statement.
Interaction: Use the MU= and SIGMA= suboptions to specify μ and σ. By default, PROC UNIVARIATE uses the sample mean and sample standard deviation for μ and σ.
Main discussion: See Normal Distribution
See also: the MU= suboption and the SIGMA= suboption
Featured in: Fitting Density Curves
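For instance, the following sketch fixes the mean and standard deviation of the normal curve instead of using the sample estimates, and sets the color, line type, and width of the curve. The data set PARTS, the variable LEN, and the parameter values are hypothetical:

```sas
/* Hypothetical data: roughly normal with mean 10, std dev 0.5 */
data parts;
   do i = 1 to 100;
      len = 10 + 0.5*rannor(12345);
      output;
   end;
   drop i;
run;

proc univariate data=parts;
   var len;
   /* MU= and SIGMA= override the sample mean and standard deviation */
   histogram len / normal(mu=10 sigma=0.5 color=red l=2 w=2);
run;
```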

NOVLABEL
suppresses the label for the vertical axis.

NOVTICK
suppresses the tick marks and tick mark labels for the vertical axis.
Interaction: This option automatically invokes the NOVLABEL option.

NROWS=n
specifies the number of rows in the comparative histogram.
Alias: NROW=
Default: 2
Requirement: This option is ignored unless you specify the CLASS statement.
Interaction: If you specify two class variables, you can use the NCOLS= option with the NROWS= option.
Featured in: Creating a Two-Way Comparative Histogram
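A two-way comparative histogram might be laid out as in the following sketch, which assumes a hypothetical data set MACHINES with class variables SUPPLIER and SHIFT and analysis variable LEN (all illustrative). With two class variables, NROWS= and NCOLS= can be used together:

```sas
/* Hypothetical data: 2 suppliers x 2 shifts, 50 parts per cell */
data machines;
   do supplier = 'A', 'B';
      do shift = 1 to 2;
         do i = 1 to 50;
            len = 10 + 0.3*rannor(4321) + 0.2*(supplier = 'B');
            output;
         end;
      end;
   end;
   drop i;
run;

proc univariate data=machines noprint;
   class supplier shift;
   var len;
   /* 2 x 2 grid of tiles, slightly wider gap between tiles */
   histogram len / nrows=2 ncols=2 intertile=1;
run;
```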

OUTHISTOGRAM=SAS-data-set
creates a SAS data set that contains information about histogram intervals. Specifically, the data set contains the midpoints of the histogram intervals, the observed percentage of observations in each interval, and the estimated percentage of observations in each interval (estimated from each of the specified fitted curves).
Alias: OUTHIST=
See also: OUTHISTOGRAM= Data Set
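As a sketch, the following statements save the interval information without drawing a plot and then list the saved data set. The data set names PARTS and BINS and the variable LEN are hypothetical:

```sas
/* Hypothetical data */
data parts;
   do i = 1 to 100;
      len = 10 + 0.5*rannor(12345);
      output;
   end;
   drop i;
run;

proc univariate data=parts noprint;
   var len;
   /* NOPLOT suppresses the graph; BINS receives the interval data */
   histogram len / normal noplot outhistogram=bins;
run;

/* Inspect midpoints and observed/estimated percentages */
proc print data=bins;
run;
```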

PERCENTS=value(s)
specifies a list of percentages that PROC UNIVARIATE uses to calculate quantiles from the data and to estimate quantiles from the fitted density curve.
Alias: PERCENT=
Default: 1, 5, 10, 25, 50, 75, 90, 95, and 99 percent
Range: between 0 and 100
Requirement: You must enclose this suboption in parentheses after the curve option.
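The sketch below requests quantiles at three percentages instead of the default list, and also asks for the interval table with MIDPERCENTS inside the same parentheses. The data set PARTS and variable LEN are hypothetical, and the space-separated form of the PERCENTS= list is an assumption modeled on the C= and K= lists shown for the KERNEL option:

```sas
/* Hypothetical data */
data parts;
   do i = 1 to 100;
      len = 10 + 0.5*rannor(12345);
      output;
   end;
   drop i;
run;

proc univariate data=parts;
   var len;
   /* Quantiles at 5, 50, and 95 percent; table of interval percentages */
   histogram len / normal(percents=5 50 95 midpercents);
run;
```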

PFILL=pattern
specifies a pattern to fill the bars of the histograms (or the areas that are under a fitted density curve if you also specify the FILL option).
Default: The bars and curve areas are not filled.
See also: CFILL= option and FILL option
See also: SAS/GRAPH Software: Reference

RTINCLUDE
includes the right endpoint of each histogram interval in that interval. By default, PROC UNIVARIATE includes the left endpoint in the histogram interval.

SCALE=value
is an alias for the SIGMA= suboption when you request density curves with the BETA, EXPONENTIAL, GAMMA, and WEIBULL options and an alias for the ZETA= suboption when you request density curves with the LOGNORMAL option.
See also: SIGMA= suboption and ZETA= suboption

SHAPE=value
is an alias for the ALPHA= suboption when you request gamma curves with the GAMMA option, the SIGMA= suboption when you request lognormal curves with the LOGNORMAL option, and the C= suboption when you request Weibull curves with the WEIBULL option.
See also: ALPHA= suboption, SIGMA= suboption, and C= suboption

SIGMA=value|EST
specifies the parameter σ for the fitted density curve when you request the BETA, EXPONENTIAL, GAMMA, LOGNORMAL, NORMAL, and WEIBULL options. See Uses of the SIGMA suboption for a summary of how to use the SIGMA= suboption.
Default: see Uses of the SIGMA suboption
Requirement: You must enclose this suboption in parentheses after the density curve option.
Tip: As a BETA suboption, you can specify SIGMA=EST to request a maximum likelihood estimate for σ.

Uses of the SIGMA suboption
Distribution Keyword   SIGMA= Specifies     Default Value                 Alias
BETA                   scale parameter σ    1                             SCALE=
EXPONENTIAL            scale parameter σ    maximum likelihood estimate   SCALE=
GAMMA                  scale parameter σ    maximum likelihood estimate   SCALE=
WEIBULL                scale parameter σ    maximum likelihood estimate   SCALE=
LOGNORMAL              shape parameter σ    maximum likelihood estimate   SHAPE=
NORMAL                 scale parameter σ    sample standard deviation     (none)

THETA=value|EST
specifies the lower threshold parameter θ for the fitted density curve when you request the BETA, EXPONENTIAL, GAMMA, LOGNORMAL, and WEIBULL options.
Default: 0
Requirement: You must enclose this suboption in parentheses after the curve option.
Tip: To compute a maximum likelihood estimate for θ, specify THETA=EST.

THRESHOLD=value
is an alias for the THETA= suboption. See the THETA= suboption.

TURNVLABELS
specifies that PROC UNIVARIATE turn the characters in the vertical axis labels so that they display vertically. This happens by default when you use a hardware font.
Alias: TURNVLABEL

VAXIS=value(s)
specifies tick mark values for the vertical axis.
Requirement: Use evenly spaced values which you list in increasing order. The first value must be zero and the last value must be greater than or equal to the height of the largest bar. You must scale the values in the same units as the bars.
See also: the VSCALE= option
Featured in: Creating a Two-Way Comparative Histogram

VAXISLABEL='label'
specifies a label for the vertical axis.
Requirement: Labels can have up to 40 characters.
Featured in: Creating a Two-Way Comparative Histogram

VMINOR=n
specifies the number of minor tick marks between each major tick mark on the vertical axis. PROC UNIVARIATE does not label minor tick marks.
Alias: VM=
Default: 0

VOFFSET=value
specifies the offset in percentage screen units at the upper end of the vertical axis.

VREF=value(s)
draws reference lines that are perpendicular to the vertical axis at the value(s) that you specify.
See also: CVREF= option and LVREF= option

VREFLABELS='label1' ... 'labeln'
specifies labels for the reference lines that you request with the VREF= option.
Alias: VREFLABEL= and VREFLAB=
Restriction: The number of labels must equal the number of reference lines. Labels can have up to 16 characters.

VREFLABPOS=n
specifies the horizontal position of VREFLABELS= labels, where n is
1 positions the labels at the left of the histogram.
2 positions the labels at the right of the histogram.
Default: 1

VSCALE=scale
specifies the scale of the vertical axis, where scale is

COUNT
scales the data in units of the number of observations per data unit.

PERCENT
scales the data in units of percentage of observations per data unit.

PROPORTION
scales the data in units of proportion of observations per data unit.

Default: PERCENT
Featured in: Creating a Two-Way Comparative Histogram
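Because VAXIS= values must be in the same units as the bars, VSCALE= and VAXIS= are usually chosen together. The following sketch assumes a hypothetical data set PARTS with 100 observations of LEN (names and label text are illustrative), so no bar can exceed a count of 100 and the last tick value is safely at or above the tallest bar:

```sas
/* Hypothetical data: exactly 100 observations */
data parts;
   do i = 1 to 100;
      len = 10 + 0.5*rannor(12345);
      output;
   end;
   drop i;
run;

proc univariate data=parts noprint;
   var len;
   /* Counts on the vertical axis; ticks start at 0, evenly spaced */
   histogram len / vscale=count
                   vaxis=0 to 100 by 20
                   vaxislabel='Number of parts';
run;
```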

W=n
specifies the width in pixels of the fitted density curve or the kernel density estimate curve.
Default: 1
Requirement: You must enclose this suboption in parentheses after the density curve option or the KERNEL option.
Interaction: As a KERNEL suboption, you can specify a list of up to five W= values.

WAXIS=n
specifies the line thickness (in pixels) for the axes and frame.
Default: 1

WBARLINE=n
specifies the line thickness for the histogram bar outlines.
Default: 1

WEIBULL<(Weibull-suboptions)>
displays a fitted Weibull density curve on the histogram.
Restriction: The WEIBULL option can occur only once in a HISTOGRAM statement.
Interaction: The parameter θ must be less than the minimum data value. Use the THETA= suboption to specify θ. The default value for θ is zero. Specify THETA=EST to request the maximum likelihood estimate for θ.
Interaction: Use the C= and SIGMA= suboptions to specify the shape parameter c and the scale parameter σ. By default, PROC UNIVARIATE computes maximum likelihood estimates for c and σ. For example, the following statements fit a Weibull curve with θ=4 and with maximum likelihood estimates for c and σ:
proc univariate;
   var length;
   histogram length / weibull(theta=4);
run;
PROC UNIVARIATE calculates the maximum likelihood estimate of c iteratively by using the Newton-Raphson approximation.
Main discussion: See Weibull Distribution
See also: the C= suboption, SIGMA= suboption, and THETA= suboption

WGRID=n
specifies the line thickness for the grid.

ZETA=value
specifies a value for the scale parameter ζ for the lognormal density curve when you request the LOGNORMAL option.
Default: a maximum likelihood estimate
Requirement: You must enclose this suboption in parentheses after the LOGNORMAL option.
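As a sketch, the following statements supply both lognormal parameters rather than estimating them, leaving the threshold at its default of zero. The data set TIMES, the variable LEN, and the parameter values are hypothetical; the data are simulated as the exponential of a normal variate with mean 1 and standard deviation 0.5:

```sas
/* Hypothetical data: lognormal with zeta=1, sigma=0.5, theta=0 */
data times;
   do i = 1 to 200;
      len = exp(1 + 0.5*rannor(98765));
      output;
   end;
   drop i;
run;

proc univariate data=times;
   var len;
   /* ZETA= is the scale parameter; SIGMA= is the shape parameter */
   histogram len / lognormal(zeta=1 sigma=0.5);
run;
```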


Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.