The LOGISTIC Procedure

CLASS Statement

CLASS variable <(v-options)> <variable <(v-options)>... >
                          < / v-options >;
The CLASS statement names the classification variables to be used in the analysis. The CLASS statement must precede the MODEL statement. You can specify various v-options for each variable by enclosing them in parentheses after the variable name. You can also specify global v-options for the CLASS statement by placing them after a slash (/). Global v-options are applied to all the variables specified in the CLASS statement. However, individual CLASS variable v-options override the global v-options.

CPREFIX= n
specifies that, at most, the first n characters of a CLASS variable name be used in creating names for the corresponding dummy variables. The default is 32 - min( 32, max(2,f)), where f is the formatted length of the CLASS variable.
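
The default formula is easy to check with a few formatted lengths. The following is a small Python sketch, not SAS code; the function name default_cprefix is hypothetical and simply evaluates the expression given above.

```python
def default_cprefix(formatted_length: int) -> int:
    """Evaluate the documented default: 32 - min(32, max(2, f))."""
    return 32 - min(32, max(2, formatted_length))


# A few formatted lengths f and the resulting default for CPREFIX=.
for f in (1, 8, 40):
    print(f, default_cprefix(f))
```

Under this formula a formatted length of 8 leaves 24 characters for the variable-name prefix, lengths below 2 are treated as 2, and lengths of 32 or more leave no characters at all.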

DESCENDING
DESC
reverses the sorting order of the classification variable.

LPREFIX= n
specifies that, at most, the first n characters of a CLASS variable label be used in creating labels for the corresponding dummy variables.

ORDER=DATA | FORMATTED | FREQ | INTERNAL
specifies the sorting order for the levels of classification variables. This ordering determines which parameters in the model correspond to each level in the data, so the ORDER= option may be useful when you use the CONTRAST statement. When ORDER=FORMATTED (the default) for numeric variables for which you have supplied no explicit format (that is, for which there is no corresponding FORMAT statement in the current PROC LOGISTIC run or in the DATA step that created the data set), the levels are ordered by their internal (numeric) value. Note that this represents a change from previous releases for how class levels are ordered. In releases previous to Version 8, numeric class levels with no explicit format were ordered by their BEST12. formatted values, and in order to revert to the previous ordering you can specify this format explicitly for the affected classification variables. The change was implemented because the former default behavior for ORDER=FORMATTED often resulted in levels not being ordered numerically and usually required the user to intervene with an explicit format or ORDER=INTERNAL to get the more natural ordering. The following table shows how PROC LOGISTIC interprets values of the ORDER= option.

Value of ORDER=   Levels Sorted By
DATA              order of appearance in the input data set
FORMATTED         external formatted value, except for numeric
                  variables with no explicit format, which are
                  sorted by their unformatted (internal) value
FREQ              descending frequency count; levels with the
                  most observations come first in the order
INTERNAL          unformatted value


By default, ORDER=FORMATTED. For FORMATTED and INTERNAL, the sort order is machine dependent. For more information on sorting order, see the chapter on the SORT procedure in the SAS Procedures Guide and the discussion of BY-group processing in SAS Language Reference: Concepts.
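
The four orderings can be illustrated outside SAS. The following Python sketch is not how PROC LOGISTIC is implemented; the function order_levels, the sample data, and the format mapping are all hypothetical. It assumes the format is one-to-one, that ties under FREQ keep their order of appearance, and it uses Python's own string ordering, whereas the SAS sort order is machine dependent.

```python
from collections import Counter


def order_levels(values, order="FORMATTED", fmt=None):
    """Return the distinct levels of `values` in the order implied by ORDER=.

    `fmt` maps an internal value to its formatted string; fmt=None stands
    for a variable with no explicit format.
    """
    if order not in ("DATA", "FORMATTED", "FREQ", "INTERNAL"):
        raise ValueError(f"unknown ORDER= value: {order}")
    distinct = list(dict.fromkeys(values))  # distinct levels, first appearance first
    if order == "DATA":
        return distinct
    if order == "FREQ":
        counts = Counter(values)
        # sorted() is stable, so tied levels keep their order of appearance;
        # that tie-breaking rule is an assumption of this sketch.
        return sorted(distinct, key=lambda v: -counts[v])
    if order == "FORMATTED" and fmt is not None:
        return sorted(distinct, key=fmt)
    # INTERNAL, or FORMATTED with no explicit format: sort by internal value.
    return sorted(distinct)


data = [5, 1, 7, 5, 2, 5, 7]
labels = {1: "D", 2: "C", 5: "B", 7: "A"}  # hypothetical one-to-one format
print(order_levels(data, "DATA"))                   # [5, 1, 7, 2]
print(order_levels(data, "FREQ"))                   # [5, 7, 1, 2]
print(order_levels(data, "INTERNAL"))               # [1, 2, 5, 7]
print(order_levels(data, "FORMATTED"))              # [1, 2, 5, 7]
print(order_levels(data, "FORMATTED", labels.get))  # [7, 5, 2, 1]
```

Note that FORMATTED without a format gives the same result as INTERNAL, which is the Version 8 behavior described above.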

PARAM=keyword
specifies the parameterization method for the classification variable or variables. Design matrix columns are created from CLASS variables according to the following coding schemes. The default is PARAM=EFFECT. If PARAM=ORTHPOLY or PARAM=POLY, and the CLASS levels are numeric, then the ORDER= option in the CLASS statement is ignored, and the internal, unformatted values are used.
EFFECT
specifies effect coding
GLM
specifies less than full rank, reference cell coding; this option can only be used as a global option
ORTHPOLY
specifies orthogonal polynomial coding
POLYNOMIAL  |  POLY
specifies polynomial coding
REFERENCE  |  REF
specifies reference cell coding

The EFFECT, POLYNOMIAL, REFERENCE, and ORTHPOLY parameterizations are full rank. For the EFFECT and REFERENCE parameterizations, the REF= option in the CLASS statement determines the reference level.

Consider a model with one CLASS variable A with four levels, 1, 2, 5, and 7. Details of the possible choices for the PARAM= option follow.

EFFECT
Three columns are created to indicate group membership of the nonreference levels. For the reference level, all three dummy variables have a value of -1. For instance, if the reference level is 7 (REF=7), the design matrix columns for A are as follows.



Effect Coding
A    Design Matrix
1     1   0   0
2     0   1   0
5     0   0   1
7    -1  -1  -1


Parameter estimates of CLASS main effects using the effect coding scheme estimate the difference in the effect of each nonreference level compared to the average effect over all 4 levels.
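
To make the interpretation concrete, the following Python sketch (not SAS code) builds the effect-coded rows for A and checks that an intercept equal to the average effect, plus parameters equal to each nonreference level's deviation from that average, reproduces every level's effect. The function effect_coding and the per-level effects in mu are hypothetical values chosen for illustration.

```python
def effect_coding(levels, ref):
    """Rows of the effect-coded design matrix, keyed by level."""
    nonref = [lv for lv in levels if lv != ref]
    rows = {}
    for lv in levels:
        if lv == ref:
            rows[lv] = [-1] * len(nonref)  # reference level: all -1
        else:
            rows[lv] = [1 if lv == other else 0 for other in nonref]
    return rows


design = effect_coding([1, 2, 5, 7], ref=7)

# Hypothetical per-level effects (for example, log odds); illustrative only.
mu = {1: 0.5, 2: 1.0, 5: -0.5, 7: 2.0}
intercept = sum(mu.values()) / len(mu)            # average over all 4 levels
betas = [mu[lv] - intercept for lv in (1, 2, 5)]  # deviations from the average

# intercept + row . betas recovers each level's effect, including level 7.
for lv, row in design.items():
    fitted = intercept + sum(x * b for x, b in zip(row, betas))
    assert abs(fitted - mu[lv]) < 1e-12
```

The reference level needs no parameter of its own: its row of -1 values makes its effect the average minus the sum of the other deviations.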

GLM
As in PROC GLM, four columns are created to indicate group membership. The design matrix columns for A are as follows.



GLM Coding
A    Design Matrix
1    1  0  0  0
2    0  1  0  0
5    0  0  1  0
7    0  0  0  1


Parameter estimates of CLASS main effects using the GLM coding scheme estimate the difference in the effects of each level compared to the last level.
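
The following Python sketch (not SAS code; the function name glm_coding is hypothetical) builds the four indicator columns and shows why this coding is less than full rank: every row sums to 1, so the four columns add up to the intercept column.

```python
def glm_coding(levels):
    """One indicator column per level, keyed by level."""
    return {lv: [1 if lv == other else 0 for other in levels] for lv in levels}


design = glm_coding([1, 2, 5, 7])

# Each row contains exactly one 1, so the columns sum to a column of ones,
# which duplicates the intercept and makes the design linearly dependent.
row_sums = [sum(row) for row in design.values()]
print(row_sums)  # [1, 1, 1, 1]
```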

ORTHPOLY
The columns are obtained by applying the Gram-Schmidt orthogonalization to the columns for PARAM=POLY. The design matrix columns for A are as follows.



Orthogonal Polynomial Coding
A    Design Matrix
1    -1.153   0.907  -0.921
2    -0.734  -0.540   1.473
5     0.524  -1.370  -0.921
7     1.363   1.004   0.368
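
The values in the table above can be reproduced with a short Gram-Schmidt pass. The following Python sketch is not SAS code, and the function orthpoly_coding is hypothetical. It rests on two assumptions that the text does not state: the intercept column of ones is orthogonalized first and then dropped, and each column is scaled so that its sum of squares equals the number of levels.

```python
import math


def orthpoly_coding(levels):
    """Orthogonal polynomial columns for numeric levels, keyed by level."""
    n = len(levels)
    # Columns 1, x, x^2, ..., x^(n-1); the leading column of ones is the intercept.
    raw = [[float(x) ** p for x in levels] for p in range(n)]
    basis = []
    for col in raw:
        v = list(col)
        for q in basis:
            # Remove the component of v along each earlier column.
            coef = sum(a * b for a, b in zip(v, q)) / sum(b * b for b in q)
            v = [a - coef * b for a, b in zip(v, q)]
        # Assumed scaling: each column's sum of squares equals n.
        scale = math.sqrt(n / sum(a * a for a in v))
        basis.append([a * scale for a in v])
    columns = basis[1:]  # drop the intercept column
    return {lv: [col[i] for col in columns] for i, lv in enumerate(levels)}


for level, row in orthpoly_coding([1, 2, 5, 7]).items():
    print(level, [round(value, 3) for value in row])
```

Rounded to three decimals, the rows agree with the table for levels 1, 2, 5, and 7, which suggests that these two assumptions match the scaling used there.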


POLYNOMIAL
POLY
Three columns are created. The first represents the linear term (x), the second represents the quadratic term (x^2), and the third represents the cubic term (x^3), where x is the level value. If the CLASS levels are not numeric, they are translated into 1, 2, 3, ... according to their sorting order. The design matrix columns for A are as follows.



Polynomial Coding
A    Design Matrix
1    1   1    1
2    2   4    8
5    5  25  125
7    7  49  343
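
The polynomial columns, including the translation of nonnumeric levels into 1, 2, 3, ..., can be sketched as follows. This is illustrative Python, not SAS code; the function poly_coding is hypothetical, it expects the levels already in their sorting order, and the generalization to k - 1 columns for k levels is an assumption extrapolated from the four-level example.

```python
def poly_coding(ordered_levels):
    """Polynomial columns x, x^2, ..., x^(k-1) for k levels, keyed by level."""
    numeric = all(isinstance(lv, (int, float)) for lv in ordered_levels)
    # Nonnumeric levels are replaced by their position 1, 2, 3, ... in the order.
    xs = ordered_levels if numeric else range(1, len(ordered_levels) + 1)
    degree = len(ordered_levels) - 1
    return {
        lv: [x ** p for p in range(1, degree + 1)]
        for lv, x in zip(ordered_levels, xs)
    }


print(poly_coding([1, 2, 5, 7]))
# {1: [1, 1, 1], 2: [2, 4, 8], 5: [5, 25, 125], 7: [7, 49, 343]}
print(poly_coding(["high", "low", "medium"]))
# {'high': [1, 1], 'low': [2, 4], 'medium': [3, 9]}
```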


REFERENCE
REF
Three columns are created to indicate group membership of the nonreference levels. For the reference level, all three dummy variables have a value of 0. For instance, if the reference level is 7 (REF=7), the design matrix columns for A are as follows.



Reference Coding
A    Design Matrix
1    1  0  0
2    0  1  0
5    0  0  1
7    0  0  0


Parameter estimates of CLASS main effects using the reference coding scheme estimate the difference in the effect of each nonreference level compared to the effect of the reference level.
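
The same kind of check works for reference coding. In the following Python sketch (not SAS code) the intercept carries the reference level's effect and each parameter is a nonreference level's difference from it. The function reference_coding and the per-level effects in mu are hypothetical values chosen for illustration.

```python
def reference_coding(levels, ref):
    """Rows of the reference-coded design matrix, keyed by level."""
    nonref = [lv for lv in levels if lv != ref]
    # The reference level matches none of the columns, so its row is all 0.
    return {lv: [1 if lv == other else 0 for other in nonref] for lv in levels}


design = reference_coding([1, 2, 5, 7], ref=7)

# Hypothetical per-level effects (for example, log odds); illustrative only.
mu = {1: 0.5, 2: 1.0, 5: -0.5, 7: 2.0}
intercept = mu[7]                                 # effect of the reference level
betas = [mu[lv] - intercept for lv in (1, 2, 5)]  # differences from the reference

# intercept + row . betas recovers each level's effect.
for lv, row in design.items():
    fitted = intercept + sum(x * b for x, b in zip(row, betas))
    assert abs(fitted - mu[lv]) < 1e-12
```

With the same effects, effect coding would place the average of the four levels in the intercept instead, so the two schemes describe the same fit with different parameters.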

REF='level' | keyword
specifies the reference level for PARAM=EFFECT or PARAM=REFERENCE. For an individual (but not a global) variable REF= option, you can specify the level of the variable to use as the reference level. For a global or individual variable REF= option, you can use one of the following keywords. The default is REF=LAST.
FIRST
designates the first ordered level as reference
LAST
designates the last ordered level as reference
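
Because FIRST and LAST refer to the ordered levels, the reference level they select depends on the sorting order in effect. The following Python sketch (not SAS code) illustrates the selection; the function resolve_reference and its parameters are hypothetical, and the explicit level is passed separately from the keyword to mirror the quoted 'level' form.

```python
def resolve_reference(ordered_levels, keyword="LAST", level=None):
    """Pick the reference level from levels already placed in sorted order."""
    if level is not None:
        # Corresponds to REF='level' on an individual variable.
        if level not in ordered_levels:
            raise ValueError(f"{level!r} is not a level of the variable")
        return level
    if keyword == "FIRST":
        return ordered_levels[0]
    if keyword == "LAST":
        return ordered_levels[-1]
    raise ValueError(f"unknown REF= keyword: {keyword}")


print(resolve_reference([1, 2, 5, 7]))                   # 7 (default REF=LAST)
print(resolve_reference([1, 2, 5, 7], keyword="FIRST"))  # 1
print(resolve_reference([1, 2, 5, 7], level=5))          # 5
print(resolve_reference([5, 7, 1, 2]))                   # 2 (a different ordering)
```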


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.