The UNIVARIATE Procedure

INSET Statement


Places a box or table of summary statistics, called an inset, directly in the high-resolution graph.

Requirement: The INSET statement must follow the HISTOGRAM, PROBPLOT, or QQPLOT statement that creates the plot that you want to augment. The inset appears in all the graphs that the preceding plot statement produces.
Tip: You can use multiple INSET statements.
Featured in: Displaying a Reference Line on a Normal Probability Plot and Creating a Two-Way Comparative Histogram


INSET <keyword(s) DATA=SAS-data-set> </ option(s)>;


Arguments

keyword(s)
specifies one or more keywords that identify the information to display in the inset. PROC UNIVARIATE displays the information in the order that you request the keywords.

You can specify statistical keywords, primary keywords, and secondary keywords. The available statistical keywords are

Available Statistical Keywords in the INSET statement

Descriptive statistic keywords
CSS          CV           KURTOSIS
MAX          MEAN         N
MIN          MODE         RANGE
NMISS        NOBS         STDMEAN
SKEWNESS     STD          USS
SUM          SUMWGT       VAR

Quantile statistic keywords
MEDIAN       P1           P5
P10          P90          P95
P99          Q1           Q3
QRANGE

Robust statistic keywords
GINI         MAD          QN
SN           STD_GINI     STD_MAD
STD_QN       STD_QRANGE   STD_SN

Hypothesis testing keywords
MSIGN        PROBM        PROBT
NORMALTEST   PROBN        SIGNRANK
PNORMAL      PROBS        T

A primary keyword allows you to specify secondary keywords in parentheses immediately after the primary keyword. Primary keywords are BETA, EXPONENTIAL, GAMMA, LOGNORMAL, NORMAL, WEIBULL, WEIBULL2, KERNEL, and KERNELn. If you specify a primary keyword but omit the secondary keywords, the inset displays a colored line and the distribution name as a key for the density curve. For a list of the secondary keywords, see Available Secondary Keywords.
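As a minimal sketch of the difference between a bare primary keyword and a primary keyword with secondary keywords, the following statements fit a normal curve and create two insets. The SCORE data set built here is a hypothetical stand-in with illustrative values; only the INSET syntax comes from the description above.

```sas
/* Hypothetical stand-in for the SCORE data set (illustrative values) */
data score;
   input final @@;
   datalines;
52 58 61 64 67 70 71 73 75 78 80 83 85 88 92
;
run;

proc univariate data=score noprint;
   histogram final / normal;
   /* Primary keyword alone: the inset acts as a key for the fitted curve */
   inset normal / pos=nw;
   /* Secondary keywords in parentheses: the inset lists the parameters */
   inset normal(mu sigma) / pos=ne;
run;
```

The NORMAL option in the HISTOGRAM statement is what makes the NORMAL primary keyword available to the INSET statements.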

By default, PROC UNIVARIATE identifies inset statistics with appropriate labels and prints numeric values using appropriate formats. To customize the label, specify the keyword followed by an equal sign (=) and the desired label in quotes. To customize the format, specify a numeric format in parentheses after the keyword. Labels can have up to 24 characters. If you specify both a label and a format for a statistic, the label must appear before the format. For example,

inset n='Sample Size' std='Std Dev' (5.2);
requests customized labels for two statistics and displays the standard deviation with a field width of 5 and two decimal places.
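For context, the INSET statement above can be placed in a complete step along the following lines. This is a sketch under assumptions: the SCORE data set and its values are hypothetical, and the extra MEAN keyword is added only to show a label and a format on the same statistic.

```sas
/* Hypothetical stand-in for the SCORE data set (illustrative values) */
data score;
   input final @@;
   datalines;
52 58 61 64 67 70 71 73 75 78 80 83 85 88 92
;
run;

proc univariate data=score noprint;
   histogram final;
   /* The label comes first, then the format in parentheses */
   inset n    = 'Sample Size'
         mean = 'Average' (5.1)
         std  = 'Std Dev' (5.2);
run;
```

Each label stays within the 24-character limit, and the statistics appear in the inset in the order in which the keywords are listed.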

Available Secondary Keywords

Keyword      Alias        Description

For BETA primary keyword
ALPHA        SHAPE1       first shape parameter α
BETA         SHAPE2       second shape parameter β
SIGMA        SCALE        scale parameter σ
THETA        THRESHOLD    lower threshold parameter θ

For EXPONENTIAL primary keyword
SIGMA        SCALE        scale parameter σ
THETA        THRESHOLD    threshold parameter θ

For GAMMA primary keyword
ALPHA        SHAPE        shape parameter α
SIGMA        SCALE        scale parameter σ
THETA        THRESHOLD    threshold parameter θ

For LOGNORMAL primary keyword
SIGMA        SHAPE        shape parameter σ
THETA        THRESHOLD    threshold parameter θ
ZETA         SCALE        scale parameter ζ

For NORMAL primary keyword
MU           MEAN         mean parameter μ
SIGMA        STD          scale parameter σ

For WEIBULL primary keyword
C            SHAPE        shape parameter c
SIGMA        SCALE        scale parameter σ
THETA        THRESHOLD    threshold parameter θ

For WEIBULL2 primary keyword
C            SHAPE        shape parameter c
SIGMA        SCALE        scale parameter σ
THETA        THRESHOLD    known lower threshold parameter θ₀

For any parametric distribution primary keyword*
AD                        Anderson-Darling EDF test statistic
ADPVAL                    Anderson-Darling EDF test p-value
CVM                       Cramer-von Mises EDF test statistic
CVMPVAL                   Cramer-von Mises EDF test p-value
KSD                       Kolmogorov-Smirnov EDF test statistic
KSDPVAL                   Kolmogorov-Smirnov EDF test p-value

For KERNEL or KERNELn primary keyword*
TYPE                      kernel type: normal, quadratic, or triangular
BANDWIDTH    BWIDTH       bandwidth λ for the density estimate
C                         standardized bandwidth c for the density estimate: c = (λ/Q) n^(1/5), where n = sample size, λ = bandwidth, and Q = interquartile range
AMISE                     approximate mean integrated square error (MISE) for the kernel density
* Available with only the HISTOGRAM statement and a BETA, EXPONENTIAL, LOGNORMAL, NORMAL, or WEIBULL distribution.

Requirement: Some inset statistics are not available unless you request a plot statement and options that calculate these statistics. For example:
proc univariate data=score;
    histogram final / normal;
    inset mean std normal(ad adpval);
run;
The MEAN and STD keywords display the sample mean and standard deviation of FINAL. The NORMAL keyword with the secondary keywords AD and ADPVAL displays the Anderson-Darling goodness-of-fit test statistic and p-value. The statistics that are specified with the NORMAL keyword are available only because the NORMAL option is requested in the HISTOGRAM statement.

The KERNEL or KERNELn keyword is available only if you request a kernel density estimate in a HISTOGRAM statement. The WEIBULL2 keyword is available only if you request a two-parameter Weibull distribution in the PROBPLOT or QQPLOT statement.

Tip: To specify the same format for all the statistics in the INSET statement, use the FORMAT= option.
Tip: To create a completely customized inset, use a DATA= data set. The data set contains the label and the value that you want to display in the inset.
Tip: If you specify multiple kernel density estimates, you can request inset statistics for all the estimates with the KERNEL keyword. Alternatively, you can display inset statistics for individual curves with the KERNELn keyword, where n is the curve number between 1 and 5.
Featured in: Displaying a Reference Line on a Normal Probability Plot and Creating a Two-Way Comparative Histogram
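The KERNEL and KERNELn keywords can be illustrated with a short sketch. It assumes a hypothetical SCORE data set with illustrative values, and it requests two kernel density estimates by listing two standardized bandwidths in the C= kernel option of the HISTOGRAM statement; the bandwidth values themselves are arbitrary.

```sas
/* Hypothetical stand-in for the SCORE data set (illustrative values) */
data score;
   input final @@;
   datalines;
52 58 61 64 67 70 71 73 75 78 80 83 85 88 92
;
run;

proc univariate data=score noprint;
   /* Two kernel density estimates: curve 1 (c=0.5) and curve 2 (c=1.0) */
   histogram final / kernel(c=0.5 1.0);
   /* KERNEL requests the statistics for all the estimates */
   inset kernel(type c) / pos=nw;
   /* KERNEL2 requests statistics for the second curve only */
   inset kernel2(bandwidth amise) / pos=ne;
run;
```

Without the KERNEL option in the HISTOGRAM statement, neither keyword would be available to the INSET statements.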

DATA=SAS-data-set
requests that PROC UNIVARIATE display customized statistics from a SAS data set in the inset table. The data set must contain two variables:
_LABEL_ a character variable whose values provide labels for inset entries.
_VALUE_ a variable that is either character or numeric and whose values provide values for inset entries.
The label and value from each observation in the data set occupy one line in the inset. The position of the DATA= keyword in the keyword list determines the position of its lines in the inset.
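A DATA= data set can be built with an ordinary DATA step. The following sketch is an assumption-laden example: the data set names SCORE and INSET_VALUES, the labels, and the values are all hypothetical. It uses a numeric _VALUE_ variable and places the DATA= keyword after N and MEAN, so the customized lines follow those two statistics in the inset.

```sas
/* Hypothetical stand-in for the SCORE data set (illustrative values) */
data score;
   input final @@;
   datalines;
52 58 61 64 67 70 71 73 75 78 80 83 85 88 92
;
run;

/* Hypothetical inset entries: one observation per inset line */
data inset_values;
   length _label_ $ 24;
   _label_ = 'Passing Score'; _value_ = 60;  output;
   _label_ = 'Maximum Score'; _value_ = 100; output;
run;

proc univariate data=score noprint;
   histogram final;
   /* DATA= lines appear after N and MEAN, matching the keyword order */
   inset n mean data=inset_values / header='Course Summary' pos=ne;
run;
```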


Options
The Inset illustrates the meaning of terms that are used in this section.

The Inset

[IMAGE]

CFILL=color | BLANK
specifies the color of the background which, if you omit the CFILLH= option, includes the header background.

Default: The background is empty, which causes items that overlap the inset (such as curves, histogram bars, or specification limits) to show through the inset.
Tip: Specify a value for CFILL= so that items that overlap no longer show through the inset. Use CFILL=BLANK to leave the background uncolored.

CFILLH=color
specifies the color of the header background.
Default: the CFILL= color

CFRAME=color
specifies the color of the frame.
Default: the same color as the axis of the plot

CHEADER=color
specifies the color of the header text.
Default: the CTEXT= color

CSHADOW=color
specifies the color of the drop shadow.
Default: A drop shadow is not displayed.

CTEXT=color
specifies the color of the text.
Default: the same color as the other text on the plot

DATA
specifies that PROC UNIVARIATE interprets the coordinates in the POSITION= option as axis data units when it positions the inset.
Requirement: The DATA option is available only when you specify POSITION=(x,y). You must place DATA immediately after the coordinates (x,y).
Main Discussion: Positioning the Inset Using Coordinates
See also: POSITION= option

FONT=font
specifies the font of the text.
Default: If you locate the inset in the interior of the plot, the font is SIMPLEX. If you locate the inset in the exterior of the plot, the font is the same as the other text on the plot.
Featured in: Creating a Two-Way Comparative Histogram

FORMAT=format
specifies a format for all the values in the inset.
Interaction: If you specify a format for a particular statistic, then this format overrides FORMAT=format.
See also: For more information about SAS formats, see SAS Language Reference: Dictionary
Featured in: Displaying a Reference Line on a Normal Probability Plot

HEADER=string
specifies the header text where string cannot exceed 40 characters.
Default: No header line appears in the inset.
Interaction: If all the keywords that you list in the INSET statement are secondary keywords that correspond to a fitted curve on a histogram, PROC UNIVARIATE displays a default header that indicates the distribution and identifies the curve.
Featured in: Displaying a Reference Line on a Normal Probability Plot

HEIGHT=value
specifies the height of the text.
Featured in: Creating a Two-Way Comparative Histogram

NOFRAME
suppresses the frame drawn around the text.
Featured in: Creating a Two-Way Comparative Histogram

POSITION=position
determines the position of the inset. The position is a compass point keyword, a margin keyword, or a pair of coordinates (x,y).
Alias: POS=
Default: NW, which positions the inset in the upper left (northwest) corner of the display.
Requirement: You must specify coordinates in axis percentage units or axis data units.
Main discussion: Positioning the Inset Using Compass Point, Positioning the Inset in the Margins, and Positioning the Inset Using Coordinates
Featured in: Displaying a Reference Line on a Normal Probability Plot and Creating a Two-Way Comparative Histogram

REFPOINT=BR | BL | TR | TL
specifies the reference point for an inset that PROC UNIVARIATE positions by a pair of coordinates with the POSITION= option. The REFPOINT= option specifies which corner of the inset frame that you want to position at coordinates (x,y). The reference points are
BL bottom left
BR bottom right
TL top left
TR top right
Default: BL
Requirement: You must use REFPOINT= with POSITION=(x,y) coordinates.
Featured in: Displaying a Reference Line on a Normal Probability Plot
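Several of the appearance options can be combined in one INSET statement. The following is a sketch only: the SCORE data set is a hypothetical stand-in with illustrative values, and the colors are arbitrary choices. It also shows the FORMAT= interaction, where the format given for MEAN overrides FORMAT=5.1 for that statistic.

```sas
/* Hypothetical stand-in for the SCORE data set (illustrative values) */
data score;
   input final @@;
   datalines;
52 58 61 64 67 70 71 73 75 78 80 83 85 88 92
;
run;

proc univariate data=score noprint;
   histogram final / cfill=gray;
   /* MEAN uses 6.2; N and STD use the FORMAT= value 5.1 */
   inset n mean (6.2) std / format   = 5.1
                            header   = 'Summary Statistics'
                            cfill    = white
                            cfillh   = gray
                            cframe   = black
                            cheader  = black
                            ctext    = black
                            cshadow  = gray
                            position = ne;
run;
```

Because CFILL= is specified, histogram bars that overlap the inset no longer show through it.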


Positioning the Inset Using Compass Point
To position the inset by using a compass point position, use the keyword N, NE, E, SE, S, SW, W, or NW in the POSITION= option. The default position of the inset is NW.

The following statements produce a histogram to show the position of the inset for the eight compass points:

proc univariate data=score noprint;
   histogram final / cfill=gray midpoints=45 to 95 by 10 barwidth=5;
   inset n     / cfill=blank header='Position = NW' pos=nw;
   inset mean  / cfill=blank header='Position = N ' pos=n ;
   inset sum   / cfill=blank header='Position = NE' pos=ne;
   inset max   / cfill=blank header='Position = E ' pos=e ;
   inset min   / cfill=blank header='Position = SE' pos=se;
   inset nobs  / cfill=blank header='Position = S ' pos=s ;
   inset range / cfill=blank header='Position = SW' pos=sw;
   inset mode  / cfill=blank header='Position = W ' pos=w ;
   label final='Final Examination Score';
   title 'Test Scores for a College Course';
run;

[IMAGE]


Positioning the Inset in the Margins
To position the inset in one of the four margins that surround the plot area, use the margin keywords LM, RM, TM, or BM in the POSITION= option. Locating the Inset in the Margins shows the location of the inset in each margin.

Locating the Inset in the Margins

[IMAGE]

Margin positions are recommended if you list a large number of statistics in the INSET statement. If you attempt to display a lengthy inset in the interior of the plot, it is most likely that the inset will collide with the data display.
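As a sketch of a lengthy inset placed in a margin, the following statements list nine statistics and use the RM keyword. The SCORE data set and its values are hypothetical, and the choice of statistics is arbitrary.

```sas
/* Hypothetical stand-in for the SCORE data set (illustrative values) */
data score;
   input final @@;
   datalines;
52 58 61 64 67 70 71 73 75 78 80 83 85 88 92
;
run;

proc univariate data=score noprint;
   histogram final / cfill=gray;
   /* A long list of statistics goes in the right margin (RM) */
   inset n mean std min q1 median q3 max range
         / position=rm header='Summary';
run;
```

Replacing RM with LM, TM, or BM moves the same inset to the left, top, or bottom margin.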


Positioning the Inset Using Coordinates
To position the inset with coordinates, use POSITION=(x,y). You specify the coordinates in axis data units or in axis percentage units (the default).

data unit
If you specify the DATA option immediately following the coordinates, PROC UNIVARIATE positions the inset by using axis data units. For example, the following statements place the bottom left corner of the inset at 12.5 on the horizontal axis and 10 on the vertical axis:
proc univariate data=score;
   histogram final / midpoints=45 to 95 by 10 barwidth=5
                     cfill=gray ;
   inset n / header   = 'Position=(12.5,10)'
             position = (12.5,10) data;
run;

[IMAGE]

By default, the specified coordinates determine the position of the bottom left corner of the inset. To change this reference point, use the REFPOINT= option (see the next example).

axis percent unit
If you omit the DATA option, PROC UNIVARIATE positions the inset by using axis percentage units. The coordinates in axis percentage units must be between 0 and 100. The coordinates of the bottom left corner of the display are (0,0), while the upper right corner is (100,100). For example, the following statements create a histogram and use coordinates in axis percentage units to position the two insets:
proc univariate data=score;
   histogram final / midpoints=45 to 95 by 10 barwidth=5
                     cfill=gray;
   inset min / position = (5,25)
               header   = 'Position=(5,25)'
               refpoint = tl;
   inset max / position = (95,95)
               header   = 'Position=(95,95)'
               refpoint = tr;
run;
The REFPOINT= option determines which corner of the inset to place at the coordinates that are specified with the POSITION= option. The first inset uses REFPOINT=TL, so that the top left corner of the inset is positioned 5% of the way across the horizontal axis and 25% of the way up the vertical axis. The second inset uses REFPOINT=TR, so that the top right corner of the inset is positioned 95% of the way across the horizontal axis and 95% of the way up the vertical axis.

[IMAGE]
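The DATA option and the REFPOINT= option can also be used together. The following sketch assumes a hypothetical SCORE data set with illustrative values, and it assumes that the point (95,5) lies within the range of both axes for that data; with other data, the coordinates would need to be adjusted.

```sas
/* Hypothetical stand-in for the SCORE data set (illustrative values) */
data score;
   input final @@;
   datalines;
52 58 61 64 67 70 71 73 75 78 80 83 85 88 92
;
run;

proc univariate data=score noprint;
   histogram final / midpoints=45 to 95 by 10 barwidth=5
                     cfill=gray;
   /* DATA follows the coordinates, so (95,5) is in axis data units. */
   /* REFPOINT=BR places the bottom right corner of the inset there. */
   inset n / header   = 'Position=(95,5)'
             position = (95,5) data
             refpoint = br;
run;
```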



Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.