The CANCORR Procedure

PROC CANCORR Statement

PROC CANCORR < options > ;

The PROC CANCORR statement starts the CANCORR procedure and optionally identifies input and output data sets, specifies the analyses performed, and controls displayed output. Table 20.1 summarizes the options.
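
As a minimal sketch of the statement in context, the following step requests simple statistics, input correlations, and the redundancy analysis in addition to the default output. It assumes that the SASHELP.CARS sample data set is available in your installation; the choice of VAR and WITH variables is purely illustrative.

```sas
/* Assumption: SASHELP.CARS exists in this SAS installation. */
/* The variable groupings below are illustrative only.       */
proc cancorr data=sashelp.cars simple corr redundancy;
   var Horsepower EngineSize Weight;   /* first set of variables  */
   with MPG_City MPG_Highway;          /* second set of variables */
run;
```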

Table 20.1: PROC CANCORR Statement Options
Task / Option   Description

Specify computational details
  EDF=          specify error degrees of freedom if input observations are regression residuals
  NOINT         omit intercept from canonical correlation and regression models
  RDF=          specify regression degrees of freedom if input observations are regression residuals
  SINGULAR=     specify the singularity criterion

Specify input and output data sets
  DATA=         specify input data set name
  OUT=          specify output data set name
  OUTSTAT=      specify output data set name containing various statistics

Specify labeling options
  VNAME=        specify a name to refer to VAR statement variables
  VPREFIX=      specify a prefix for naming VAR statement canonical variables
  WNAME=        specify a name to refer to WITH statement variables
  WPREFIX=      specify a prefix for naming WITH statement canonical variables

Control amount of output
  ALL           produce simple statistics, input variable correlations, and canonical redundancy analysis
  CORR          produce input variable correlations
  NCAN=         specify number of canonical variables for which full output is desired
  NOPRINT       suppress all displayed output
  REDUNDANCY    produce canonical redundancy analysis
  SHORT         suppress default output from canonical analysis
  SIMPLE        produce means and standard deviations

Request regression analyses
  VDEP          request multiple regression analyses with the VAR variables as dependents and the WITH variables as regressors
  VREG          request multiple regression analyses with the VAR variables as regressors and the WITH variables as dependents
  WDEP          same as VREG
  WREG          same as VDEP

Specify regression statistics
  ALL           produce all regression statistics and include these statistics in the OUTSTAT= data set
  B             produce raw regression coefficients
  CLB           produce 95% confidence interval limits for the regression coefficients
  CORRB         produce correlations among regression coefficients
  INT           request statistics for the intercept when you specify the B, CLB, SEB, T, or PROBT option
  PCORR         display partial correlations between regressors and dependents
  PROBT         display probability levels for t statistics
  SEB           display standard errors of regression coefficients
  SMC           display squared multiple correlations and F tests
  SPCORR        display semipartial correlations between regressors and dependents
  SQPCORR       display squared partial correlations between regressors and dependents
  SQSPCORR      display squared semipartial correlations between regressors and dependents
  STB           display standardized regression coefficients
  T             display t statistics for regression coefficients

The following list explains the options that can be used in the PROC CANCORR statement, in alphabetical order:

ALL
displays simple statistics, correlations among the input variables, the confidence limits for the regression coefficients, and the canonical redundancy analysis. If you specify the VDEP or WDEP option, the ALL option displays all related regression statistics (unless the NOPRINT option is specified) and includes these statistics in the OUTSTAT= data set.

B
produces raw regression coefficients from the regression analyses.

CLB
produces the 95% confidence limits for the regression coefficients from the regression analyses.

CORR
C
produces correlations among the original variables. If you include a PARTIAL statement, the CORR option produces a correlation matrix for all variables in the analysis, the regression statistics (R-square, RMSE), the standardized regression coefficients for both the VAR and WITH variables as predicted from the PARTIAL statement variables, and partial correlation matrices.

CORRB
produces correlations among the regression coefficient estimates.

DATA=SAS-data-set
names the SAS data set to be analyzed by PROC CANCORR. It can be an ordinary SAS data set or a TYPE=CORR, COV, FACTOR, SSCP, UCORR, or UCOV data set. By default, the procedure uses the most recently created SAS data set.

EDF=error-df
specifies the error degrees of freedom if the input observations are residuals from a regression analysis. The effective number of observations is the EDF= value plus one. If you have 100 observations, then specifying EDF=99 has the same effect as omitting the EDF= option.

INT
requests that statistics for the intercept be included when B, CLB, SEB, T, or PROBT is specified for the regression analyses.

NCAN=number
specifies the number of canonical variables for which full output is desired. The number must be less than or equal to the number of canonical variables in the analysis.

The value of the NCAN= option specifies the number of canonical variables for which canonical coefficients and canonical redundancy statistics are displayed, and the number of variables shown in the canonical structure matrices. The NCAN= option does not affect the number of displayed canonical correlations.

If an OUTSTAT= data set is requested, the NCAN= option controls the number of canonical variables for which statistics are output. If an OUT= data set is requested, the NCAN= option controls the number of canonical variables for which scores are output.
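
The following sketch illustrates how NCAN= interacts with the output data sets. It assumes the SASHELP.CARS sample data set is available; the data set names work.scores and work.stats are hypothetical. With three VAR variables and two WITH variables the analysis has two canonical variables per set, so NCAN=1 is a valid request, and only the first pair of scores (V1 and W1 under the default prefixes) should appear in the OUT= data set.

```sas
/* Assumption: SASHELP.CARS exists; work.scores and work.stats */
/* are hypothetical names chosen for this illustration.        */
proc cancorr data=sashelp.cars ncan=1
             out=work.scores outstat=work.stats;
   var Horsepower EngineSize Weight;
   with MPG_City MPG_Highway;
run;

/* Inspect the first few scores written to the OUT= data set. */
proc print data=work.scores(obs=5);
   var Make Model V1 W1;
run;
```

Both canonical correlations are still displayed, because NCAN= does not limit the table of canonical correlations.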

NOINT
omits the intercept from the canonical correlation and regression models. Standard deviations, variances, covariances, and correlations are not corrected for the mean. If you use a TYPE=SSCP data set as input to the CANCORR procedure and list the variable Intercept in the VAR or WITH statement, the procedure runs as if you also specified the NOINT option. If you use NOINT and also create an OUTSTAT= data set, the data set is TYPE=UCORR.

NOPRINT
suppresses the display of all output. Note that this option temporarily disables the Output Delivery System (ODS). For more information, see Chapter 15, "Using the Output Delivery System."

OUT=SAS-data-set
creates an output SAS data set to contain all the original data plus scores on the canonical variables. If you want to create a permanent SAS data set, you must specify a two-level name. The OUT= option cannot be used when the DATA= data set is TYPE=CORR, COV, FACTOR, SSCP, UCORR, or UCOV. For details on OUT= data sets, see the section "Output Data Sets". Refer to SAS Language Reference: Concepts for more information on permanent SAS data sets.

OUTSTAT=SAS-data-set
creates an output SAS data set containing various statistics, including the canonical correlations and coefficients and the multiple regression statistics you request. If you want to create a permanent SAS data set, you must specify a two-level name. For details on OUTSTAT= data sets, see the section "Output Data Sets". Refer to SAS Language Reference: Concepts for more information on permanent SAS data sets.

PCORR
produces partial correlations between regressors and dependent variables, removing from each dependent variable and regressor the effects of all other regressors.

PROBT
produces probability levels for the t statistics in the regression analyses.

RDF=regression-df
specifies the regression degrees of freedom if the input observations are residuals from a regression analysis. The effective number of observations is the actual number minus the RDF= value. The degrees of freedom for the intercept should not be included in the RDF= option.
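
As a sketch of the situation that RDF= (and EDF=) is meant for, the following steps first compute residuals with PROC REG and then analyze them with PROC CANCORR. The example assumes the SASHELP.CARS sample data set is available; the data set work.resid and the residual variable names are hypothetical. Two regressors are removed in the first step, so RDF=2 is specified, and the intercept is not counted.

```sas
/* Assumption: SASHELP.CARS exists; work.resid and the r* */
/* variable names are hypothetical.                       */

/* Step 1: remove the effects of Weight and Wheelbase. */
proc reg data=sashelp.cars noprint;
   model Horsepower EngineSize MPG_City MPG_Highway = Weight Wheelbase;
   output out=work.resid
          r=rHorsepower rEngineSize rMPG_City rMPG_Highway;
run;
quit;

/* Step 2: analyze the residuals. RDF=2 accounts for the two */
/* regressors used in step 1 (the intercept is excluded).    */
proc cancorr data=work.resid rdf=2;
   var rHorsepower rEngineSize;
   with rMPG_City rMPG_Highway;
run;
```

With n observations used in the analysis, specifying EDF=n-3 instead of RDF=2 should give the same effective number of observations, since n - 2 = (n - 3) + 1.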

REDUNDANCY
RED
produces canonical redundancy statistics.

SEB
produces standard errors of the regression coefficients.

SHORT
suppresses all default output from the canonical analysis except the tables of canonical correlations and multivariate statistics.

SIMPLE
S
produces means and standard deviations.

SINGULAR=p
SING=p
specifies the singularity criterion, where 0 < p < 1. If a variable in the PARTIAL statement has an R-square as large as 1-p (where p is the value of the SINGULAR= option) when predicted from the variables listed before it in the statement, the variable is assigned a standardized regression coefficient of 0, and a linear dependency warning message is written to the SAS log. By default, SINGULAR=1E-8.
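
A brief sketch of the option together with a PARTIAL statement follows. It assumes the SASHELP.CARS sample data set is available; the value 1E-6 is an arbitrary illustration, not a recommendation.

```sas
/* Assumption: SASHELP.CARS exists. SINGULAR=1E-6 is an     */
/* arbitrary value that is looser than the default of 1E-8. */
proc cancorr data=sashelp.cars singular=1e-6;
   var Horsepower EngineSize;
   with MPG_City MPG_Highway;
   partial Weight Wheelbase;   /* variables to partial out */
run;
```

Under this setting, a PARTIAL variable would be flagged as linearly dependent if its R-square from the preceding PARTIAL variables reached 1 - 1E-6.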

SMC
produces squared multiple correlations and F tests for the regression analyses.

SPCORR
produces semipartial correlations between regressors and dependent variables, removing from each regressor the effects of all other regressors.

SQPCORR
produces squared partial correlations between regressors and dependent variables, removing from each dependent variable and regressor the effects of all other regressors.

SQSPCORR
produces squared semipartial correlations between regressors and dependent variables, removing from each regressor the effects of all other regressors.

STB
produces standardized regression coefficients.

T
produces t statistics for the regression coefficients.

VDEP
WREG
requests multiple regression analyses with the VAR variables as dependent variables and the WITH variables as regressors.
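
The following sketch combines VDEP with several of the regression statistics options described in this section. It assumes the SASHELP.CARS sample data set is available, and the selection of statistics is illustrative only.

```sas
/* Assumption: SASHELP.CARS exists.                           */
/* VDEP regresses each VAR variable on the WITH variables.    */
/* INT adds intercept rows to the B, SEB, T, and PROBT tables.*/
/* SHORT trims the default canonical analysis output.         */
proc cancorr data=sashelp.cars vdep int b seb t probt smc short;
   var Horsepower EngineSize Weight;   /* dependent variables */
   with MPG_City MPG_Highway;          /* regressors          */
run;
```

Replacing VDEP with WDEP would reverse the roles, treating the WITH variables as dependent variables and the VAR variables as regressors.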

VNAME='label'
VN='label'
specifies a character constant to refer to variables from the VAR statement on the output. Enclose the constant in single quotes. If you omit the VNAME= option, these variables are referred to as the VAR Variables. The number of characters in the label should not exceed the label length defined by the VALIDVARNAME= system option. For more information on the VALIDVARNAME= system option, refer to SAS Language Reference: Dictionary.

VPREFIX=name
VP=name
specifies a prefix for naming canonical variables from the VAR statement. By default, these canonical variables are given the names V1, V2, and so on. If you specify VPREFIX=ABC, the names are ABC1, ABC2, and so forth. The number of characters in the prefix plus the number of digits required to designate the variables should not exceed the name length defined by the VALIDVARNAME= system option. For more information on the VALIDVARNAME= system option, refer to SAS Language Reference: Dictionary.

WDEP
VREG
requests multiple regression analyses with the WITH variables as dependent variables and the VAR variables as regressors.

WNAME='label'
WN='label'
specifies a character constant to refer to variables in the WITH statement on the output. Enclose the constant in quotes. If you omit the WNAME= option, these variables are referred to as the WITH Variables. The number of characters in the label should not exceed the label length defined by the VALIDVARNAME= system option. For more information on the VALIDVARNAME= system option, refer to SAS Language Reference: Dictionary.

WPREFIX=name
WP=name
specifies a prefix for naming canonical variables from the WITH statement. By default, these canonical variables are given the names W1, W2, and so on. If you specify WPREFIX=XYZ, then the names are XYZ1, XYZ2, and so forth. The number of characters in the prefix plus the number of digits required to designate the variables should not exceed the name length defined by the VALIDVARNAME= system option. For more information on the VALIDVARNAME= system option, refer to SAS Language Reference: Dictionary.
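
The four labeling options can be sketched together as follows. The example assumes the SASHELP.CARS sample data set is available; the labels and prefixes are hypothetical choices made for illustration.

```sas
/* Assumption: SASHELP.CARS exists. The labels and prefixes */
/* below are hypothetical.                                  */
proc cancorr data=sashelp.cars
             vname='Engine Measures' wname='Fuel Economy'
             vprefix=Engine wprefix=Economy;
   var Horsepower EngineSize Weight;
   with MPG_City MPG_Highway;
run;
```

With these settings the canonical variables would be named Engine1, Engine2, Economy1, and Economy2 instead of V1, V2, W1, and W2, and the output would refer to the two sets as the Engine Measures and the Fuel Economy.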

Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.