The MEANS Procedure

CLASS Statement


Specifies the variables whose values define the subgroup combinations for the analysis.

Tip: You can use multiple CLASS statements.
Tip: Some CLASS statement options are also available in the PROC MEANS statement. There they affect all class variables rather than just the one(s) that you specify in a CLASS statement.
See also: For information about how the CLASS statement groups formatted values, see Formatted Values.
Featured in: Computing Descriptive Statistics with Class Variables, Using a CLASSDATA= Data Set with Class Variables, Using Multi-label Value Formats with Class Variables, Using Preloaded Formats with Class Variables, and Computing Output Statistics with Missing Class Variable Values


CLASS variable(s) </ options>;


Required Arguments

variable(s)
specifies one or more variables that the procedure uses to group the data. Variables in a CLASS statement are referred to as class variables. Class variables are numeric or character. Class variables can have continuous values, but they typically have a few discrete values that define levels of the variable. You do not have to sort the data by class variables.
Interaction: Use the TYPES statement and the WAYS statement to control which class variables PROC MEANS uses to group the data.
Tip: To reduce the number of class variable levels, use a FORMAT statement to combine variable values. When a format combines several internal values into one formatted value, PROC MEANS outputs the lowest internal value.
See also: Using Class Variables
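
As a minimal sketch of the points above, the following step groups by two class variables and uses a format to collapse one of them into fewer levels. It assumes that the SASHELP.CLASS sample data set that ships with SAS is available; the format name AGEGRP. and its labels are hypothetical and chosen only for illustration.

```sas
/* Assumption: SASHELP.CLASS (variables Sex, Age, Height, Weight) is available. */
proc format;
   /* Hypothetical format that combines the ages into two levels */
   value agegrp low-12  = '12 and under'
                13-high = '13 and over';
run;

proc means data=sashelp.class n mean maxdec=1;
   class sex age;        /* class variables define the subgroups; no sort needed */
   var height weight;    /* analysis variables */
   format age agegrp.;   /* Age is grouped on its formatted values */
run;
```

Removing the FORMAT statement should produce one level for each distinct value of Age instead of two.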


Options

ASCENDING
specifies to sort the class variable levels in ascending order.
Alias: ASCEND
Interaction: PROC MEANS issues a warning message if you specify both ASCENDING and DESCENDING and ignores both options.
Featured in: Computing Output Statistics with Missing Class Variable Values

DESCENDING
specifies to sort the class variable levels in descending order.
Alias: DESCEND
Interaction: PROC MEANS issues a warning message if you specify both ASCENDING and DESCENDING and ignores both options.
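
A short sketch of the DESCENDING option, again assuming that the SASHELP.CLASS sample data set is available. With the default ORDER=UNFORMATTED, the levels of Age should be listed from the highest value to the lowest.

```sas
/* Assumption: SASHELP.CLASS is available. */
proc means data=sashelp.class n mean;
   class age / descending;   /* list the levels of Age from high to low */
   var height;
run;
```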

EXCLUSIVE
excludes from the analysis all combinations of the class variables that are not found in the preloaded range of user-defined formats.
Requirement: You must specify PRELOADFMT to preload the class variable formats.
Featured in: Using Preloaded Formats with Class Variables

GROUPINTERNAL
specifies not to apply formats to the class variables when PROC MEANS groups the values to create combinations of class variables.
Interaction: If you specify the PRELOADFMT option, PROC MEANS ignores this option and uses the formatted values.
Tip: This option saves computer resources when the numeric class variables contain discrete values.
See also: Computer Resources

MISSING
considers missing values as valid values for the class variable levels. Special missing values that represent numeric values (the letters A through Z and the underscore (_) character) are each considered as a separate value.
Default: If you omit MISSING, PROC MEANS excludes the observations with a missing class variable value from the analysis.
See also: SAS Language Reference: Concepts for a discussion of missing values with special meanings.
Featured in: Computing Output Statistics with Missing Class Variable Values

MLF
enables PROC MEANS to use the primary and secondary format labels for a given range or overlapping ranges to create subgroup combinations when a multilabel format is assigned to a class variable.
Requirement: You must use PROC FORMAT and the MULTILABEL option in the VALUE statement to create a multilabel format.
Interaction: If you use the OUTPUT statement with MLF, the class variable contains a character string that corresponds to the formatted value. Because the formatted value becomes the internal value, the length of this variable is the number of characters in the longest format label.
Interaction: Using MLF with ORDER=FREQ may not produce the order that you expect for the formatted values.
Tip: If you omit MLF, PROC MEANS uses the primary format labels, which corresponds to using the first external format value, to determine the subgroup combinations.
See also: The MULTILABEL option in the VALUE statement of the FORMAT procedure.
Featured in: Using Multi-label Value Formats with Class Variables

Note: When the formatted values overlap, one internal class variable value maps to more than one class variable subgroup combination. Therefore, the sum of the N statistics for all subgroups is greater than the number of observations in the data set (the overall N statistic).
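
The following sketch illustrates MLF with overlapping ranges. It assumes the SASHELP.CLASS sample data set, and that every value of Age in it falls between 11 and 16; the format name AGEMLF. and its labels are hypothetical.

```sas
/* Assumption: SASHELP.CLASS is available and its Age values lie in 11-16. */
proc format;
   /* MULTILABEL permits overlapping ranges in one format */
   value agemlf (multilabel)
      11-12 = '11 to 12'
      13-14 = '13 to 14'
      15-16 = '15 to 16'
      11-16 = 'All ages';   /* overlaps every range above */
run;

proc means data=sashelp.class n mean;
   class age / mlf;         /* use primary and secondary labels */
   var height;
   format age agemlf.;
run;
```

Under these assumptions each observation contributes to two subgroups (its two-year range and 'All ages'), so the subgroup N values should add up to more than the number of observations. If you drop MLF from the CLASS statement, only the primary labels should be used.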

ORDER=DATA | FORMATTED | FREQ | UNFORMATTED
specifies the order to group the levels of the class variables in the output, where

DATA
orders values according to their order in the input data set.
Interaction: If you use PRELOADFMT, the order for the values of each class variable matches the order that PROC FORMAT uses to store the values of the associated user-defined format. If you use the CLASSDATA= option in the PROC statement, PROC MEANS uses the order of the unique values of each class variable in the CLASSDATA= data set to order the output levels. If you use both options, PROC MEANS first uses the user-defined formats to order the output. If you omit EXCLUSIVE in the PROC statement, PROC MEANS then appends the remaining unique values of the class variables from the input data set, in the order in which they are encountered, after the user-defined format values and the CLASSDATA= values.
Tip: By default, PROC FORMAT stores a format definition in sorted order. Use the NOTSORTED option to store the values or ranges of a user-defined format in the order that you define them.
Featured in: Computing Output Statistics with Missing Class Variable Values

FORMATTED
orders values by their ascending formatted values. This order depends on your operating environment.
Alias: FMT | EXTERNAL
Featured in: Using Multi-label Value Formats with Class Variables

FREQ
orders values by descending frequency count so that levels with the most observations are listed first.
Interaction: For multiway combinations of the class variables, PROC MEANS determines the order of a level from the individual class variable frequencies.
Interaction: Use the ASCENDING option to order values by ascending frequency count.
Featured in: Using Multi-label Value Formats with Class Variables

UNFORMATTED
orders values by their unformatted values, which yields the same order as PROC SORT. This order depends on your operating environment. This sort sequence is particularly useful for displaying dates chronologically.
Alias: UNFMT | INTERNAL

Default: UNFORMATTED
Tip: By default, all orders except FREQ are ascending. For descending orders, use the DESCENDING option.
See also: Ordering the Class Values
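
A sketch of ORDER=FREQ with and without ASCENDING, assuming that the SASHELP.CLASS sample data set is available:

```sas
/* Assumption: SASHELP.CLASS is available. */

/* Levels of Age with the most observations are listed first */
proc means data=sashelp.class n mean;
   class age / order=freq;
   var height;
run;

/* Adding ASCENDING reverses this: least frequent levels first */
proc means data=sashelp.class n mean;
   class age / order=freq ascending;
   var height;
run;
```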

PRELOADFMT
specifies that all formats are preloaded for the class variables.
Requirement: PRELOADFMT has no effect unless you specify COMPLETETYPES, EXCLUSIVE, or ORDER=DATA and you assign formats to the class variables.
Interaction: To limit PROC MEANS output to the combinations of formatted class variable values present in the input data set, use the EXCLUSIVE option in the CLASS statement.
Interaction: To include all ranges and values of the user-defined formats in the output, even when the frequency is zero, use COMPLETETYPES in the PROC statement.
Featured in: Using Preloaded Formats with Class Variables
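
The two steps below sketch how PRELOADFMT might be combined with COMPLETETYPES and with EXCLUSIVE. They assume the SASHELP.CLASS sample data set, in which Sex takes the values 'F' and 'M'; the format names $SEXFMT. and $FONLY. are hypothetical.

```sas
/* Assumption: SASHELP.CLASS is available; Sex is 'F' or 'M'. */
proc format;
   value $sexfmt 'F' = 'Female'
                 'M' = 'Male'
                 'U' = 'Unknown';   /* value not expected in the data */
   value $fonly  'F' = 'Female';    /* covers only one value */
run;

/* COMPLETETYPES + PRELOADFMT: every format label should appear,
   including 'Unknown' with a frequency of zero */
proc means data=sashelp.class completetypes n mean;
   class sex / preloadfmt;
   var height;
   format sex $sexfmt.;
run;

/* PRELOADFMT + EXCLUSIVE: values outside the preloaded format
   ('M' here) should be left out of the analysis */
proc means data=sashelp.class n mean;
   class sex / preloadfmt exclusive;
   var height;
   format sex $fonly.;
run;
```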


Comparison of the BY and CLASS Statements
Using the BY statement is similar to using the CLASS statement and the NWAY option in that PROC MEANS summarizes each BY group as an independent subset of the input data. Therefore, no overall summarization of the input data is available. However, unlike the CLASS statement, the BY statement requires that you previously sort the data by the BY variables.

When you use the NWAY option, PROC MEANS may encounter insufficient memory for the summarization of all the class variables. In that case, you can move some class variables to the BY statement. For maximum benefit, move to the BY statement the class variables that are already sorted or that have the greatest number of unique values.

You can use the CLASS and BY statements together to analyze the data by the levels of class variables within BY groups. See Using the BY Statement with Class Variables.
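
A minimal sketch of using BY and CLASS together, assuming the SASHELP.CLASS sample data set; the output data set name WORK.CLASS_SORTED is hypothetical.

```sas
/* Assumption: SASHELP.CLASS is available. */

/* The BY statement needs the data sorted by the BY variable */
proc sort data=sashelp.class out=work.class_sorted;
   by sex;
run;

/* Each value of Sex is summarized as an independent subset;
   within it, Age acts as a class variable */
proc means data=work.class_sorted n mean;
   by sex;
   class age;
   var height;
run;
```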


How PROC MEANS Handles Missing Values for Class Variables
By default, if an observation contains a missing value for any class variable, PROC MEANS excludes that observation from the analysis. If you specify the MISSING option in the PROC statement, the procedure considers missing values as valid levels for the combination of class variables.

Specifying the MISSING option in the CLASS statement allows you to control the acceptance of missing values for individual class variables.
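
The sketch below uses two CLASS statements so that MISSING applies to only one class variable. The data set WORK.SURVEY and its values are hypothetical and exist only to supply missing class values.

```sas
/* Hypothetical data: a period marks a missing value */
data work.survey;
   input region $ grade score;
   datalines;
East 1 80
East . 75
.    2 90
West 2 85
;
run;

proc means data=work.survey n mean;
   class region;            /* default: missing Region excludes the observation */
   class grade / missing;   /* missing Grade is kept as a valid level */
   var score;
run;
```

Under these assumptions, the observation with a missing Region should be excluded, while the observation with a missing Grade should form its own level.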


Computer Resources
The total number of unique class values that PROC MEANS allows depends on the amount of computer memory that is available. See Computational Resources for more information.

The GROUPINTERNAL option can improve computer performance because the grouping process is based on the internal values of the class variables. If a numeric class variable is not assigned a format and you do not specify GROUPINTERNAL, PROC MEANS uses the default format to format numeric values as character strings. Then PROC MEANS groups these numeric variables by their character values, which takes additional time and computer memory.
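
A short sketch of GROUPINTERNAL on an unformatted numeric class variable, assuming the SASHELP.CLASS sample data set. The statistics should match those from the same step without the option; only the grouping method differs.

```sas
/* Assumption: SASHELP.CLASS is available; Age is numeric with
   discrete values and has no format assigned. */
proc means data=sashelp.class n mean;
   class age / groupinternal;   /* group on internal numeric values */
   var height;
run;
```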


Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.