FACTORS Statement
- FACTORS factor-description < , ... , factor-description > < / options > ;
where a factor-description is
- factor-name < $ > < levels >
and factor-descriptions are separated from each other
by a comma. The $ is required for character-valued
factors. The value of levels provides the number of
levels of the factor identified by a given
factor-name. For only one factor, levels is
optional; for two or more factors, it is required.
The FACTORS statement identifies factors that distinguish
response functions from others in the same population. It
also specifies how those factors are incorporated into the
model. You can use the FACTORS statement whenever there is more than one
response function per population and the keyword
_RESPONSE_ is specified in the MODEL statement. You can
specify the name, type, and number of levels of each factor
and the identification of each level.
The FACTORS statement is most useful when the response
functions and their covariance matrix are read directly from
the input data set. In this case, PROC CATMOD reads the
response functions as though they are from one population
(this poses no problem in the multiple-population case
because the appropriately constructed covariance matrix is
also read directly). Thus, you can use the FACTORS statement
to partition the variation among the response functions
into appropriate sources, even when the functions actually
represent separate populations.
The format of the FACTORS statement is identical to that of
the REPEATED statement. In fact, repeated measurement
factors are simply special cases of factors in which some of
the response functions correspond to multiple dependent
variables that are measurements on the same experimental (or
sampling) units.
You cannot specify the FACTORS statement for an analysis
that also contains the REPEATED or LOGLIN statement, because
all three statements specify the same information: how to partition
the variation among the response functions within a
population.
In the FACTORS statement,
- factor-name
- names a factor that corresponds to two or more
response functions. This name must be a valid SAS variable
name, and it should not be the same as the name of a variable
that already exists in the data set being analyzed.
- $
- indicates that the factor is character-valued. If the $ is
omitted, then PROC CATMOD assumes that the factor is
numeric. The type of the factor is relevant only when you
use the PROFILE= option or when the _RESPONSE_= option
(described later in this section) specifies nested-by-value
effects.
- levels
- specifies the number of levels of the corresponding factor.
If there is only one such factor, and the number is omitted,
then PROC CATMOD assumes that the number of levels is equal
to the number of response functions per population (q).
Unless you specify the PROFILE= option, the number q must
either be equal to or be a multiple of the product of the
number of levels of all the factors.
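The rule relating q to the factor levels can be checked with simple arithmetic. The following is a minimal sketch in Python, not SAS code; the function name levels_are_compatible is hypothetical and is used only to illustrate the rule stated above.

```python
from math import prod

def levels_are_compatible(q, levels):
    """Return True if q equals, or is a multiple of, the product of the levels.

    q      -- number of response functions per population
    levels -- number of levels declared for each factor, in statement order
    """
    return q % prod(levels) == 0

# Two factors with 2 levels each and 4 response functions: allowed.
print(levels_are_compatible(4, [2, 2]))
# The same factors with only 3 response functions: not allowed
# unless the PROFILE= option is specified.
print(levels_are_compatible(3, [2, 2]))
```

The second case is the situation in which the PROFILE= option is needed, because the design is not a full factorial with respect to the factors.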
You can specify the following options in the FACTORS
statement after a slash.
- PROFILE=(matrix)
- specifies the values assumed by the factors for each
response function. There should be one column for each
factor, and the values in a given column (character or
numeric) should match the type of the corresponding factor.
Character values are restricted to 16 characters or fewer.
If there are q response functions per population, then the
matrix must have i rows, where q must either be equal
to or be a multiple of i. Adjacent rows of the matrix
should be separated by a comma.
The values in the PROFILE matrix are useful for specifying
models in those situations where the study design is not a
full factorial with respect to the factors. They can also
be used to specify nested-by-value effects in the
_RESPONSE_= option. If you specify character values in
both places (the PROFILE= option and the _RESPONSE_=
option), then the values must match with respect to whether
or not they are enclosed in quotes (that is, enclosed in
quotes in both places or in neither place).
For an example of using the PROFILE= option,
see Example 22.10.
- _RESPONSE_=effects
- specifies design effects. The variables named in the
effects must be factor-names that appear in the
FACTORS statement. If the _RESPONSE_= option is omitted,
then PROC CATMOD builds a full factorial _RESPONSE_ effect
with respect to the factors.
- TITLE='title'
- displays the title at the top of certain pages of
output that correspond to the current FACTORS statement.
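The constraints on the PROFILE= matrix described above (one column per factor, values that match the factor type, character values of at most 16 characters, and q equal to or a multiple of the number of rows) can be summarized as a small consistency check. This is only an illustrative sketch in Python, not part of PROC CATMOD; the function check_profile and the type labels "char" and "num" are assumptions introduced for the example.

```python
MAX_CHAR_LEN = 16  # character values are limited to 16 characters

def check_profile(profile, factor_types, q):
    """Return a list of problems; an empty list means the matrix is consistent.

    profile      -- list of rows, one tuple of factor values per row
    factor_types -- "char" or "num" for each factor, in statement order
    q            -- number of response functions per population
    """
    if not profile:
        return ["profile has no rows"]
    problems = []
    if q % len(profile) != 0:
        problems.append(f"q={q} is not a multiple of {len(profile)} rows")
    for r, row in enumerate(profile, start=1):
        if len(row) != len(factor_types):
            problems.append(f"row {r}: expected {len(factor_types)} columns")
            continue
        for value, ftype in zip(row, factor_types):
            is_text = isinstance(value, str)
            if ftype == "char" and not is_text:
                problems.append(f"row {r}: {value!r} should be character")
            elif ftype == "num" and is_text:
                problems.append(f"row {r}: {value!r} should be numeric")
            elif is_text and len(value) > MAX_CHAR_LEN:
                problems.append(f"row {r}: {value!r} exceeds 16 characters")
    return problems

# A three-row profile for two character-valued factors (age, sex).
rows = [("under 30", "female"), ("30 & over", "male"), ("30 & over", "female")]
print(check_profile(rows, ["char", "char"], q=3))
```

The check does not cover the quoting rule for character values, which concerns how the values are written in the statement rather than the values themselves.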
For an example of how the FACTORS statement is useful,
consider the case where the response functions and their
covariance matrix are read directly from the input data set.
The TYPE=EST data set might be created in the following
manner:
data direct(type=est);
   input b1-b4 _type_ $ _name_ $8.;
   datalines;
0.590463 0.384720 0.273269 0.136458 parms .
0.001690 0.000911 0.000474 0.000432 cov b1
0.000911 0.001823 0.000031 0.000102 cov b2
0.000474 0.000031 0.001056 0.000477 cov b3
0.000432 0.000102 0.000477 0.000396 cov b4
;
Suppose the response functions correspond to four
populations that represent the cross-classification of age
(two groups) by sex. You can use the FACTORS statement
to identify these two factors and to name the effects in the
model. The statements required to fit a main-effects model
to these data are
proc catmod data=direct;
   response read b1-b4;
   model _f_=_response_;
   factors age 2, sex 2 / _response_=age sex;
run;
If you want to specify some nested-by-value effects,
you can change the FACTORS statement to
factors age $ 2, sex $ 2 /
        _response_=age sex(age='under 30') sex(age='30 & over')
        profile=('under 30'  male,
                 'under 30'  female,
                 '30 & over' male,
                 '30 & over' female);
If, by design or by chance, the study contains no male subjects
under 30 years of age, then there are only three response
functions, and you can specify a main-effects model as
proc catmod data=direct;
   response read b2-b4;
   model _f_=_response_;
   factors age $ 2, sex $ 2 / _response_=age sex
           profile=('under 30'  female,
                    '30 & over' male,
                    '30 & over' female);
run;
When you specify two or more factors and omit the PROFILE=
option, PROC CATMOD presumes that the response functions are
ordered so that the levels of the rightmost factor change
most rapidly. For the first example in this section (age 2,
sex 2, with response functions b1-b4), the order implied
by the FACTORS statement is as follows.
Response Function | Dependent Variable | Age | Sex |
1 | b1 | 1 | 1 |
2 | b2 | 1 | 2 |
3 | b3 | 2 | 1 |
4 | b4 | 2 | 2 |
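The ordering rule (levels of the rightmost factor change most rapidly) is the same order in which a Cartesian product is enumerated. The following Python sketch reproduces the level assignments in the table above; it is an illustration only, and the function name implied_order is hypothetical.

```python
from itertools import product

def implied_order(levels):
    """List the level combinations in the order implied when PROFILE= is omitted.

    levels -- number of levels of each factor, in FACTORS statement order.
    itertools.product varies its last argument fastest, which matches the
    rule that the rightmost factor changes most rapidly.
    """
    return list(product(*(range(1, n + 1) for n in levels)))

# Two factors, age (2 levels) and sex (2 levels), as in the table above.
for i, (age, sex) in enumerate(implied_order([2, 2]), start=1):
    print(i, f"b{i}", age, sex)
```

With three or more factors the same rule applies: the last factor listed cycles through all of its levels before the factor to its left advances.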
For additional examples of how to use the FACTORS statement,
see the section "Repeated Measures Analysis". All of the examples in that section
are applicable, with the REPEATED statement replaced by the
FACTORS statement.
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.