PROC FACTOR Statement

PROC FACTOR < options > ;

Table 26.1: Options Available in the PROC FACTOR Statement

Task		Option
Data sets		DATA=
		OUT=
		OUTSTAT=
		TARGET=
Extract factors and communalities		HEYWOOD
		METHOD=
		PRIORS=
		RANDOM=
		ULTRAHEYWOOD
Analyze data		COVARIANCE
		NOINT
		VARDEF=
		WEIGHT
Specify number of factors		MINEIGEN=
		NFACTORS=
		PROPORTION=
Specify numerical properties		CONVERGE=
		MAXITER=
		SINGULAR=
Specify rotation method		GAMMA=
		HKPOWER=
		NORM=
		POWER=
		PREROTATE=
		ROTATE=
Control displayed output		ALL
		CORR
		EIGENVECTORS
		MSA
		NOPRINT
		NPLOT=
		PLOT
		PREPLOT
		PRINT
		REORDER
		RESIDUALS
		SCORE
		SCREE
		SIMPLE
Exclude the correlation matrix from the OUTSTAT= data set		NOCORR
Miscellaneous		NOBS=

ALL

displays all optional output except plots. When the input data set is TYPE=CORR, TYPE=UCORR, TYPE=COV, TYPE=UCOV, or TYPE=FACTOR, simple statistics, correlations, and MSA are not displayed.

CONVERGE=p

CONV=p

specifies the convergence criterion for the METHOD=PRINIT, METHOD=ULS, METHOD=ALPHA, or METHOD=ML option. Iteration stops when the maximum change in the communalities is less than the value of the CONVERGE= option. The default value is 0.001. Negative values are not allowed.

CORR

displays the correlation matrix or partial correlation matrix.

COVARIANCE

COV

requests factoring of the covariance matrix instead of the correlation matrix. The COV option can be used only with the METHOD=PRINCIPAL, METHOD=PRINIT, METHOD=ULS, or METHOD=IMAGE option.

DATA=SAS-data-set

specifies the input data set, which can be an ordinary SAS data set or a specially structured SAS data set as described in the section "Input Data Set". If the DATA= option is omitted, the most recently created SAS data set is used.

EIGENVECTORS

displays the eigenvectors. PROC FACTOR chooses the solution that makes the sum of the elements of each eigenvector nonnegative. If the sum of the elements is equal to zero, then the sign depends on how the number is rounded off.
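
The sign convention described above can be illustrated with a minimal Python sketch. The function name `orient_eigenvector` is hypothetical and is not part of SAS. The sketch leaves a vector whose elements sum to exactly zero unchanged, and it does not attempt to reproduce the rounding-dependent behavior that PROC FACTOR exhibits in that case.

```python
def orient_eigenvector(vector):
    """Return the vector or its negation, whichever has a nonnegative element sum."""
    # An eigenvector is determined only up to sign, so either choice is valid;
    # this picks the representative whose elements sum to a nonnegative value.
    if sum(vector) < 0:
        return [-x for x in vector]
    return list(vector)


flipped = orient_eigenvector([-0.6, -0.8])   # element sum is negative, so it is negated
kept = orient_eigenvector([0.6, -0.2])       # element sum is positive, so it is kept
```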

GAMMA=p

specifies the orthomax weight used with the option ROTATE=ORTHOMAX or PREROTATE=ORTHOMAX. There is no restriction on valid values.

HEYWOOD

HEY

sets to 1 any communality greater than 1, allowing iterations to proceed.

HKPOWER=p

HKP=p

specifies the power of the square roots of the eigenvalues used to rescale the eigenvectors for Harris-Kaiser (ROTATE=HK) rotation. Values between 0.0 and 1.0 are reasonable. The default value is 0.0, yielding the independent cluster solution (each variable tends to have a large loading on only one factor). A value of 1.0 is equivalent to a varimax rotation. You can also specify the HKPOWER= option with the ROTATE=QUARTIMAX, ROTATE=VARIMAX, ROTATE=EQUAMAX, or ROTATE=ORTHOMAX option, in which case the Harris-Kaiser rotation uses the specified orthogonal rotation method.

MAXITER=n

specifies the maximum number of iterations. You can use the MAXITER= option with the PRINIT, ULS, ALPHA, or ML methods. The default is 30.
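
The interplay between the CONVERGE= criterion and the MAXITER= limit can be sketched in Python as follows. This is an illustration under stated assumptions, not the algorithm that PROC FACTOR uses: the function `iterate_communalities` and its `update` argument are hypothetical, and `update` stands in for one pass of whichever iterative extraction method is in effect.

```python
def iterate_communalities(update, start, converge=0.001, maxiter=30):
    """Iterate until the largest communality change is below `converge`.

    `update` is a hypothetical callable that maps the current communalities
    to the next estimates. Returns (communalities, iterations_used, converged).
    """
    current = list(start)
    for iteration in range(1, maxiter + 1):
        new = list(update(current))
        # Convergence is judged on the maximum absolute change, as in CONVERGE=.
        max_change = max(abs(a - b) for a, b in zip(new, current))
        current = new
        if max_change < converge:
            return current, iteration, True
    # The iteration limit (as in MAXITER=) was reached without convergence.
    return current, maxiter, False


# Toy update rule that contracts every value toward 0.5 (for illustration only).
def toy_update(h):
    return [0.5 * x + 0.25 for x in h]


values, used, converged = iterate_communalities(toy_update, [1.0, 0.0])
```

With the defaults shown (0.001 and 30), the toy rule stops once the largest change falls below the criterion; a smaller `maxiter` would stop it earlier and report that it did not converge.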

METHOD=name

M=name

specifies the method for extracting factors. The default is METHOD=PRINCIPAL unless the DATA= data set is TYPE=FACTOR, in which case the default is METHOD=PATTERN. Valid values for name are as follows:

ALPHA | A: produces alpha factor analysis.
HARRIS | H: yields Harris component analysis of $\mathbf{S}^{-1}\mathbf{R}\mathbf{S}^{-1}$ (Harris 1962), a noniterative approximation to canonical component analysis.
IMAGE | I: yields principal component analysis of the image covariance matrix, not Kaiser's (1963, 1970) or Kaiser and Rice's (1974) image analysis. A nonsingular correlation matrix is required.
ML | M: performs maximum-likelihood factor analysis with an algorithm due, except for minor details, to Fuller (1987). The option METHOD=ML requires a nonsingular correlation matrix.
PATTERN: reads a factor pattern from a TYPE=FACTOR, TYPE=CORR, TYPE=UCORR, TYPE=COV, or TYPE=UCOV data set. If you create a TYPE=FACTOR data set in a DATA step, only observations containing the factor pattern (_TYPE_='PATTERN') and, if the factors are correlated, the interfactor correlations (_TYPE_='FCORR') are required.
PRINCIPAL | PRIN | P: yields principal component analysis if no PRIORS option or statement is used or if you specify PRIORS=ONE; if you specify a PRIORS statement or a PRIORS= value other than PRIORS=ONE, a principal factor analysis is performed.
PRINIT: yields iterated principal factor analysis.
SCORE: reads scoring coefficients (_TYPE_='SCORE') from a TYPE=FACTOR, TYPE=CORR, TYPE=UCORR, TYPE=COV, or TYPE=UCOV data set. The data set must also contain either a correlation or a covariance matrix. Scoring coefficients are also displayed if you specify the OUT= option.
ULS | U: produces unweighted least squares factor analysis.

MINEIGEN=p

MIN=p

specifies the smallest eigenvalue for which a factor is retained. If you specify two or more of the MINEIGEN=, NFACTORS=, and PROPORTION= options, the number of factors retained is the minimum number satisfying any of the criteria. The MINEIGEN= option cannot be used with either the METHOD=PATTERN or the METHOD=SCORE option. Negative values are not allowed. The default is 0 unless you omit both the NFACTORS= and the PROPORTION= options and one of the following conditions holds:

If you specify the METHOD=ALPHA or METHOD=HARRIS option, then MINEIGEN=1.
If you specify the METHOD=IMAGE option, then
$\mathrm{MINEIGEN} = \dfrac{\text{total image variance}}{\text{number of variables}}$
For any other METHOD= specification, if prior communality estimates of 1.0 are used, then
$\mathrm{MINEIGEN} = \dfrac{\text{total weighted variance}}{\text{number of variables}}$
When an unweighted correlation matrix is factored, this value is 1.
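
The rule that the number of factors retained is the minimum number satisfying any of the MINEIGEN=, NFACTORS=, and PROPORTION= criteria can be sketched in Python. The function `factors_retained` is hypothetical, and the sketch rests on assumptions that the text above does not confirm: a factor passes the eigenvalue test when its eigenvalue is greater than or equal to the threshold, the common variance equals the sum of the eigenvalues supplied, and the proportion is already expressed as a fraction.

```python
def factors_retained(eigenvalues, mineigen=None, nfactors=None, proportion=None):
    """Count the factors kept when several retention criteria are combined.

    Assumes `eigenvalues` is sorted from largest to smallest and that
    `proportion`, if given, is a fraction between 0 and 1.
    """
    counts = []
    if mineigen is not None:
        # Assumption: an eigenvalue equal to the threshold still qualifies.
        counts.append(sum(1 for ev in eigenvalues if ev >= mineigen))
    if nfactors is not None:
        counts.append(min(nfactors, len(eigenvalues)))
    if proportion is not None:
        total = sum(eigenvalues)  # assumed to equal the common variance
        running = 0.0
        needed = len(eigenvalues)
        for count, ev in enumerate(eigenvalues, start=1):
            running += ev
            if running / total >= proportion:
                needed = count
                break
        counts.append(needed)
    # The smallest count among the specified criteria wins; with no
    # criterion given, this sketch keeps every factor.
    return min(counts) if counts else len(eigenvalues)


eigenvalues = [2.5, 1.25, 0.75, 0.25, 0.25]
kept = factors_retained(eigenvalues, mineigen=1.0, nfactors=4, proportion=0.75)
```

In this example the eigenvalue threshold and the proportion each allow two factors and the factor limit allows four, so two factors are kept.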

MSA

produces the partial correlations between each pair of variables controlling for all other variables (the negative anti-image correlations) and Kaiser's measure of sampling adequacy (Kaiser 1970; Kaiser and Rice 1974; Cerny and Kaiser 1977).

NFACTORS=n

NFACT=n

N=n

specifies the maximum number of factors to be extracted and determines the amount of memory to be allocated for factor matrices. The default is the number of variables. Specifying a number that is small relative to the number of variables can substantially decrease the amount of memory required to run PROC FACTOR, especially with oblique rotations. If you specify two or more of the NFACTORS=, MINEIGEN=, and PROPORTION= options, the number of factors retained is the minimum number satisfying any of the criteria. If you specify the option NFACTORS=0, eigenvalues are computed, but no factors are extracted. If you specify the option NFACTORS=-1, neither eigenvalues nor factors are computed. You can use the NFACTORS= option with the METHOD=PATTERN or METHOD=SCORE option to specify a smaller number of factors than are present in the data set.

NOBS=n

specifies the number of observations. If the DATA= input data set is a raw data set, n defaults to the number of observations in the raw data set. The NOBS= option overrides this default. If the DATA= input data set contains a covariance, correlation, or scalar product matrix, the number of observations can be specified either by using the NOBS= option in the PROC FACTOR statement or by including a _TYPE_='N' observation in the DATA= input data set.

NOCORR

prevents the correlation matrix from being transferred to the OUTSTAT= data set when you specify the METHOD=PATTERN option. The NOCORR option greatly reduces memory requirements when there are many variables but few factors. The NOCORR option has no effect if the correlation matrix is required for other requested output, for example, if the scores or the residual correlations are displayed (by using the SCORE, RESIDUALS, or ALL option).

NOINT

omits the intercept from the analysis; covariances or correlations are not corrected for the mean.

NOPRINT

suppresses the display of all output. Note that this option temporarily disables the Output Delivery System (ODS). For more information, see Chapter 15, "Using the Output Delivery System."

NORM=COV | KAISER | NONE | RAW | WEIGHT

specifies the method for normalizing the rows of the factor pattern for rotation. If you specify the option NORM=KAISER, Kaiser's normalization is used $(\sum_j p^2_{ij} = 1)$. If you specify the option NORM=WEIGHT, the rows are weighted by the Cureton-Mulaik technique (Cureton and Mulaik 1975). If you specify the option NORM=COV, the rows of the pattern matrix are rescaled to represent covariances instead of correlations. If you specify the option NORM=NONE or NORM=RAW, normalization is not performed. The default is NORM=KAISER.
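
The Kaiser normalization condition, in which the squared elements of each row of the pattern sum to 1, can be illustrated with a short Python sketch. The function `kaiser_normalize` is hypothetical; it shows only the row scaling and makes no claim about how PROC FACTOR treats the rows before or after rotation.

```python
import math


def kaiser_normalize(pattern):
    """Scale each row of a factor pattern so that its squared elements sum to 1."""
    normalized = []
    for row in pattern:
        length = math.sqrt(sum(p * p for p in row))
        if length > 0:
            normalized.append([p / length for p in row])
        else:
            # A row of zeros cannot be scaled; this sketch leaves it unchanged.
            normalized.append(list(row))
    return normalized


rows = kaiser_normalize([[0.3, 0.4], [0.5, 0.5]])
```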

NPLOT=n

specifies the number of factors to be plotted. The default is to plot all factors. The smallest allowable value is 2. If you specify the option NPLOT=n, all pairs of the first n factors are plotted, producing a total of n(n - 1)/2 plots.
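
As a quick check of the plot count, the following Python sketch enumerates the factor pairs that NPLOT=n implies. The function `factor_plot_pairs` is hypothetical and only counts pairs; it produces no plots.

```python
from itertools import combinations


def factor_plot_pairs(nplot):
    """List every pair among the first `nplot` factors, numbered from 1."""
    if nplot < 2:
        raise ValueError("at least 2 factors are needed to form a pair")
    return list(combinations(range(1, nplot + 1), 2))


pairs = factor_plot_pairs(4)
# len(pairs) equals 4 * (4 - 1) / 2
```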

OUT=SAS-data-set

creates a data set containing all the data from the DATA= data set plus variables called Factor1, Factor2, and so on, containing estimated factor scores. The DATA= data set must contain multivariate data, not correlations or covariances. You must also specify the NFACTORS= option to determine the number of factor score variables. If you want to create a permanent SAS data set, you must specify a two-level name. Refer to "SAS Files" in SAS Language Reference: Concepts for more information on permanent data sets.

OUTSTAT=SAS-data-set

specifies an output data set containing most of the results of the analysis. The output data set is described in detail in the section "Output Data Sets". If you want to create a permanent SAS data set, you must specify a two-level name. Refer to "SAS Files" in SAS Language Reference: Concepts for more information on permanent data sets.

PLOT

plots the factor pattern after rotation.

POWER=n

specifies the power to be used in computing the target pattern for the option ROTATE=PROMAX. Valid values must be integers $\geq 1$. The default value is 3.

PREPLOT

plots the factor pattern before rotation.

PREROTATE=name

PRE=name

specifies the prerotation method for the option ROTATE=PROMAX. Any rotation method other than PROMAX or PROCRUSTES can be used. The default is PREROTATE=VARIMAX. If a previously rotated pattern is read using the option METHOD=PATTERN, you should specify the PREROTATE=NONE option.

PRINT

displays the input factor pattern or scoring coefficients and related statistics. In oblique cases, the reference and factor structures are computed and displayed. The PRINT option is effective only with the option METHOD=PATTERN or METHOD=SCORE.

PRIORS=name

specifies a method for computing prior communality estimates. You can specify numeric values for the prior communality estimates by using the PRIORS statement. Valid values for name are as follows:

ASMC | A: sets the prior communality estimates proportional to the squared multiple correlations but adjusted so that their sum is equal to that of the maximum absolute correlations (Cureton 1968).
INPUT | I: reads the prior communality estimates from the first observation with either _TYPE_='PRIORS' or _TYPE_='COMMUNAL' in the DATA= data set (which must be TYPE=FACTOR).
MAX | M: sets the prior communality estimate for each variable to its maximum absolute correlation with any other variable.
ONE | O: sets all prior communalities to 1.0.
RANDOM | R: sets the prior communality estimates to pseudo-random numbers uniformly distributed between 0 and 1.
SMC | S: sets the prior communality estimate for each variable to its squared multiple correlation with all other variables.

The default prior communality estimates are as follows.

METHOD=		PRIORS=
PRINCIPAL		ONE
PRINIT		ONE
ALPHA		SMC
ULS		SMC
ML		SMC
HARRIS		(not applicable)
IMAGE		(not applicable)
PATTERN		(not applicable)
SCORE		(not applicable)

By default, the options METHOD=PRINIT, METHOD=ULS, METHOD=ALPHA, and METHOD=ML stop iterating and set the number of factors to 0 if an estimated communality exceeds 1. The options HEYWOOD and ULTRAHEYWOOD allow processing to continue.
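
The three ways of handling a communality that exceeds 1 (the default, HEYWOOD, and ULTRAHEYWOOD) can be contrasted in a small Python sketch. The function `handle_communalities` and its `mode` argument are hypothetical; the sketch models only the decision described above, not the surrounding extraction algorithm.

```python
def handle_communalities(communalities, mode="default"):
    """Apply one of three policies to communalities that exceed 1.

    Returns (communalities, keep_iterating).
    """
    if mode not in ("default", "heywood", "ultraheywood"):
        raise ValueError(f"unknown mode: {mode!r}")
    exceeds = any(c > 1.0 for c in communalities)
    if not exceeds or mode == "ultraheywood":
        # ULTRAHEYWOOD lets values above 1 stand and continues.
        return list(communalities), True
    if mode == "heywood":
        # HEYWOOD sets any communality greater than 1 to 1 and continues.
        return [min(c, 1.0) for c in communalities], True
    # Default: iteration stops when an estimate exceeds 1.
    return list(communalities), False


clamped, keep_going = handle_communalities([0.8, 1.2], mode="heywood")
```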

PROPORTION=p

PERCENT=p

P=p

specifies the proportion of common variance to be accounted for by the retained factors using the prior communality estimates. If the value is greater than one, it is interpreted as a percentage and divided by 100. The options PROPORTION=0.75 and PERCENT=75 are equivalent. The default value is 1.0 or 100%. You cannot specify the PROPORTION= option with the METHOD=PATTERN or METHOD=SCORE option. If you specify two or more of the PROPORTION=, NFACTORS=, and MINEIGEN= options, the number of factors retained is the minimum number satisfying any of the criteria.
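
The interpretation of values greater than one as percentages can be sketched in Python. The function `normalize_proportion` is hypothetical, and it performs no validation because the text above does not state which values are rejected.

```python
def normalize_proportion(value):
    """Return the proportion as a fraction, treating values above 1 as percentages."""
    return value / 100.0 if value > 1 else float(value)


same = normalize_proportion(0.75) == normalize_proportion(75)  # equivalent specifications
```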

RANDOM=n

specifies a positive integer as a starting value for the pseudo-random number generator for use with the option PRIORS=RANDOM. If you do not specify the RANDOM= option, the time of day is used to initialize the pseudo-random number sequence. Valid values must be integers $\geq 1$.

REORDER

causes the rows (variables) of various factor matrices to be reordered on the output. Variables with their highest absolute loading (reference structure loading for oblique rotations) on the first factor are displayed first, from largest to smallest loading, followed by variables with their highest absolute loading on the second factor, and so on. The order of the variables in the output data set is not affected. The factors are not reordered.
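
The ordering described above can be sketched in Python for the orthogonal case. The function `reorder_variables` is hypothetical. It assumes that "largest to smallest loading" refers to absolute values, which the text does not state explicitly, and it ignores the reference structure used for oblique rotations.

```python
def reorder_variables(loadings):
    """Order variable names by the factor of their highest absolute loading.

    `loadings` maps each variable name to its list of loadings, one per factor.
    Within a factor, variables are ordered by decreasing absolute loading
    (an assumption of this sketch).
    """
    def sort_key(item):
        _, row = item
        abs_row = [abs(v) for v in row]
        top_factor = max(range(len(row)), key=abs_row.__getitem__)
        return (top_factor, -abs_row[top_factor])

    return [name for name, _ in sorted(loadings.items(), key=sort_key)]


order = reorder_variables({
    "a": [0.2, 0.9],
    "b": [0.8, 0.1],
    "c": [0.95, 0.3],
    "d": [0.1, 0.7],
})
```

Only the display order of the variables changes; the columns (factors) stay in their original positions.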

RESIDUALS

RES

displays the residual correlation matrix and the associated partial correlation matrix. The diagonal elements of the residual correlation matrix are the unique variances.

ROTATE=name

R=name

specifies the rotation method. The default is ROTATE=NONE. The following orthogonal rotation methods are available in the FACTOR procedure: EQUAMAX, ORTHOMAX, QUARTIMAX, PARSIMAX, and VARIMAX.

After the initial factor extraction, the common factors are uncorrelated with each other. If the factors are rotated by an orthogonal transformation, the rotated factors are also uncorrelated. If the factors are rotated by an oblique transformation, the rotated factors become correlated. Oblique rotations often produce more useful patterns than do orthogonal rotations. However, a consequence of correlated factors is that there is no single unambiguous measure of the importance of a factor in explaining a variable. Thus, for oblique rotations, the pattern matrix does not provide all the necessary information for interpreting the factors; you must also examine the factor structure and the reference structure. Refer to Harman (1976) and Mulaik (1972) for further information. Valid values for name are as follows:

EQUAMAX | E

specifies orthogonal equamax rotation. This corresponds to the specification ROTATE=ORTHOMAX with GAMMA=number of factors/2.

HK

specifies Harris-Kaiser case II orthoblique rotation. You can use the HKPOWER= option to set the power of the square roots of the eigenvalues by which the eigenvectors are scaled.

NONE | N

specifies that no rotation be performed.

ORTHOMAX

specifies general orthomax rotation with the weight specified by the GAMMA= option.

PARSIMAX

specifies orthogonal parsimax rotation. This corresponds to the specification ROTATE=ORTHOMAX with

$\mathrm{GAMMA} = \dfrac{nvar \times (nfact - 1)}{nvar + nfact - 2}$

where nvar is the number of variables, and nfact is the number of factors.

PROCRUSTES

specifies oblique Procrustes rotation with target pattern provided by the TARGET= data set. The unrestricted least squares method is used with factors scaled to unit variance after rotation.

PROMAX | P

specifies oblique promax rotation. The PREROTATE= and POWER= options can be used with the option ROTATE=PROMAX.

QUARTIMAX | Q

specifies orthogonal quartimax rotation. This corresponds to the specification ROTATE=ORTHOMAX with GAMMA=0.

VARIMAX | V

specifies orthogonal varimax rotation. This corresponds to the specification ROTATE=ORTHOMAX with GAMMA=1.
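
The orthomax weights that correspond to the named orthogonal rotations above can be collected in one Python sketch. The function `orthomax_gamma` is hypothetical and simply evaluates the GAMMA= values stated in the descriptions of QUARTIMAX, VARIMAX, EQUAMAX, and PARSIMAX.

```python
def orthomax_gamma(rotation, nvar, nfact):
    """Return the orthomax weight equivalent to a named orthogonal rotation.

    nvar is the number of variables and nfact is the number of factors.
    """
    rotation = rotation.upper()
    if rotation == "QUARTIMAX":
        return 0.0
    if rotation == "VARIMAX":
        return 1.0
    if rotation == "EQUAMAX":
        return nfact / 2
    if rotation == "PARSIMAX":
        return nvar * (nfact - 1) / (nvar + nfact - 2)
    raise ValueError(f"no fixed orthomax weight for rotation {rotation!r}")


gamma = orthomax_gamma("parsimax", nvar=10, nfact=3)  # 10 * 2 / 11
```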

SCORE

displays the factor scoring coefficients. The squared multiple correlation of each factor with the variables is also displayed except in the case of unrotated principal components.

SCREE

displays a scree plot of the eigenvalues (Cattell 1966, 1978; Cattell and Vogelman 1977; Horn and Engstrom 1979).

SIMPLE

displays means, standard deviations, and the number of observations.

SINGULAR=p

SING=p

specifies the singularity criterion, where $0 < p < 1$. The default value is 1E-8.

TARGET=SAS-data-set

specifies an input data set containing the target pattern for Procrustes rotation (see the description of the ROTATE= option). The TARGET= data set must contain variables with the same names as those being factored. Each observation in the TARGET= data set becomes one column of the target factor pattern. Missing values are treated as zeros. The _NAME_ and _TYPE_ variables are not required and are ignored if present.
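
The layout rule for the TARGET= data set, in which each observation becomes one column of the target pattern and missing values become zeros, can be sketched in Python. The function `target_pattern` is hypothetical, and the sketch represents observations as dictionaries with `None` standing in for a SAS missing value.

```python
def target_pattern(observations, variables):
    """Build a target pattern with one row per variable and one column per observation."""
    pattern = []
    for var in variables:
        # Indexing with obs[var] raises KeyError if a factored variable is
        # absent, mirroring the requirement that the names must match.
        row = [0.0 if obs[var] is None else float(obs[var]) for obs in observations]
        pattern.append(row)
    return pattern


observations = [
    {"x1": 1, "x2": None, "x3": 1},   # becomes column 1 of the target
    {"x1": None, "x2": 1, "x3": 0},   # becomes column 2 of the target
]
pattern = target_pattern(observations, ["x1", "x2", "x3"])
```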

ULTRAHEYWOOD

ULTRA

allows communalities to exceed 1. The ULTRAHEYWOOD option can cause convergence problems because communalities can become extremely large, and ill-conditioned Hessians may occur.

VARDEF=DF | N | WDF | WEIGHT | WGT

specifies the divisor used in the calculation of variances and covariances. The default value is VARDEF=DF. The values and associated divisors are displayed in the following table, where $n$ is the number of observations, $w_j$ is the weight of the $j$th observation, $i = 0$ if the NOINT option is used and $i = 1$ otherwise, and $k$ is the number of partial variables specified in the PARTIAL statement.

Value	Description	Divisor
DF	degrees of freedom	$n - k - i$
N	number of observations	$n - k$
WDF	sum of weights DF	$\sum_j w_j - k - i$
WEIGHT | WGT	sum of weights	$\sum_j w_j - k$
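
The divisors in the table can be evaluated with a short Python sketch. The function `vardef_divisor` and its argument names are hypothetical; the formulas are taken directly from the table, with the intercept indicator set to 0 when NOINT is in effect and 1 otherwise.

```python
def vardef_divisor(vardef, n, k=0, noint=False, weights=None):
    """Return the variance divisor for a VARDEF= value.

    n is the number of observations, k the number of partial variables,
    and weights the observation weights (needed for WDF, WEIGHT, and WGT).
    """
    intercept = 0 if noint else 1
    vardef = vardef.upper()
    if vardef == "DF":
        return n - k - intercept
    if vardef == "N":
        return n - k
    if vardef in ("WDF", "WEIGHT", "WGT"):
        if weights is None:
            raise ValueError("weights are required for this VARDEF= value")
        total = sum(weights)
        return total - k - intercept if vardef == "WDF" else total - k
    raise ValueError(f"unknown VARDEF= value: {vardef!r}")


divisor = vardef_divisor("DF", n=100, k=2)  # 100 - 2 - 1
```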

WEIGHT

factors a weighted correlation or covariance matrix. The WEIGHT option can be used only with the METHOD=PRINCIPAL, METHOD=PRINIT, METHOD=ULS, or METHOD=IMAGE option. The input data set must be of type CORR, UCORR, COV, UCOV, or FACTOR, and the variable weights are obtained from an observation with _TYPE_='WEIGHT'.