The REG Procedure

PROC REG Statement

PROC REG < options > ;
The PROC REG statement is required. If you want to fit a model to the data, you must also use a MODEL statement. If you want to use only the PROC REG options, you do not need a MODEL statement, but you must use a VAR statement. If you do not use a MODEL statement, then the COVOUT and OUTEST= options are not available.
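
The two cases described above can be sketched as follows. This is a minimal illustration, not an example from this chapter: it assumes that the SASHELP.CLASS sample data set (with the variables Weight, Height, and Age) is available in your installation.

```sas
/* Assumption: SASHELP.CLASS is available; variable names are from that data set. */

/* Case 1: fit a model, so a MODEL statement is required. */
proc reg data=sashelp.class;
   model Weight = Height Age;
run;
quit;

/* Case 2: use only PROC REG statement options.            */
/* No MODEL statement is given, so a VAR statement is required. */
proc reg data=sashelp.class simple corr;
   var Weight Height Age;
run;
quit;
```

In the second step, the SIMPLE and CORR options apply to the variables listed in the VAR statement.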

Table 55.1 lists the options you can use with the PROC REG statement. Note that any option specified in the PROC REG statement applies to all MODEL statements.

Table 55.1: PROC REG Statement Options

Option         Description

Data Set Options
DATA=          names a data set to use for the regression
OUTEST=        outputs a data set that contains parameter estimates and other model fit summary statistics
OUTSSCP=       outputs a data set that contains sums of squares and crossproducts
COVOUT         outputs the covariance matrix for parameter estimates to the OUTEST= data set
EDF            outputs the number of regressors, the error degrees of freedom, and the model R² to the OUTEST= data set
OUTSTB         outputs standardized parameter estimates to the OUTEST= data set. Use only with the RIDGE= or PCOMIT= option.
OUTSEB         outputs standard errors of the parameter estimates to the OUTEST= data set
OUTVIF         outputs the variance inflation factors to the OUTEST= data set. Use only with the RIDGE= or PCOMIT= option.
PCOMIT=        performs incomplete principal component analysis and outputs estimates to the OUTEST= data set
PRESS          outputs the PRESS statistic to the OUTEST= data set
RIDGE=         performs ridge regression analysis and outputs estimates to the OUTEST= data set
RSQUARE        same effect as the EDF option
TABLEOUT       outputs standard errors, confidence limits, and associated test statistics of the parameter estimates to the OUTEST= data set

High Resolution Graphics Options
ANNOTATE=      specifies an annotation data set
GOUT=          specifies the graphics catalog in which graphics output is saved

Display Options
CORR           displays the correlation matrix for variables listed in MODEL and VAR statements
SIMPLE         displays simple statistics for each variable listed in MODEL and VAR statements
USSCP          displays the uncorrected sums of squares and crossproducts matrix
ALL            displays all statistics (CORR, SIMPLE, and USSCP)
NOPRINT        suppresses output
LINEPRINTER    creates requested plots as line printer plots

Other Options
ALPHA=         sets the significance level for confidence and prediction intervals and tests
SINGULAR=      sets the criterion for checking for singularity


Following are explanations of the options that you can specify in the PROC REG statement (in alphabetical order).

ALL
requests the display of many tables. Using the ALL option in the PROC REG statement is equivalent to specifying ALL in every MODEL statement. The ALL option also implies the CORR, SIMPLE, and USSCP options.

ALPHA=number
sets the significance level used for the construction of confidence intervals. The value must be between 0 and 1; the default value of 0.05 results in 95% intervals. This option affects the PROC REG option TABLEOUT; the MODEL options CLB, CLI, and CLM; the OUTPUT statement keywords LCL, LCLM, UCL, and UCLM; the PLOT statement keywords LCL., LCLM., UCL., and UCLM.; and the PLOT statement options CONF and PRED.
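
As a brief sketch of how ALPHA= interacts with a MODEL option, the following step requests 90% confidence limits for the parameter estimates. The SASHELP.CLASS data set and its variable names are assumptions made for illustration.

```sas
/* Assumption: SASHELP.CLASS is available.                        */
/* ALPHA=0.10 in the PROC REG statement gives 100(1-0.10)% = 90%  */
/* limits; the CLB option in the MODEL statement displays them.   */
proc reg data=sashelp.class alpha=0.10;
   model Weight = Height Age / clb;
run;
quit;
```

Without the ALPHA= option, the same step would produce 95% limits, since the default value is 0.05.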

ANNOTATE=SAS-data-set
ANNO=SAS-data-set
specifies an input data set containing annotate variables, as described in SAS/GRAPH Software: Reference. You can use this data set to add features to plots. Features provided in this data set are applied to all plots produced in the current run of PROC REG. To add features to individual plots, use the ANNOTATE= option in the PLOT statement. This option cannot be used if the LINEPRINTER option is specified.

CORR
displays the correlation matrix for all variables listed in the MODEL or VAR statement.

COVOUT
outputs the covariance matrices for the parameter estimates to the OUTEST= data set. This option is valid only if the OUTEST= option is also specified. See the "OUTEST= Data Set" section.

DATA=SAS-data-set
names the SAS data set to be used by PROC REG. The data set can be an ordinary SAS data set or a TYPE=CORR, TYPE=COV, or TYPE=SSCP data set. If one of these special TYPE= data sets is used, the OUTPUT, PAINT, PLOT, and REWEIGHT statements and some options in the MODEL and PRINT statements are not available. See Appendix A, "Special SAS Data Sets," for more information on TYPE= data sets. If the DATA= option is not specified, PROC REG uses the most recently created SAS data set.
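
A minimal sketch of reading a TYPE=CORR data set follows. It assumes that SASHELP.CLASS is available and uses the OUTP= option of PROC CORR to create the TYPE=CORR data set; the data set name WORK.CLASSCORR is hypothetical.

```sas
/* Assumption: SASHELP.CLASS is available; WORK.CLASSCORR is a   */
/* hypothetical name for the intermediate TYPE=CORR data set.    */
proc corr data=sashelp.class noprint outp=work.classcorr;
   var Weight Height Age;
run;

/* Fit the regression from the correlation data set.             */
/* Statements such as OUTPUT and PLOT are not available here     */
/* because the original observations are not in the input.       */
proc reg data=work.classcorr;
   model Weight = Height Age;
run;
quit;
```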

EDF
outputs the number of regressors in the model excluding and including the intercept, the error degrees of freedom, and the model R² to the OUTEST= data set.

GOUT=graphics-catalog
specifies the graphics catalog in which graphics output is saved. The default graphics-catalog is WORK.GSEG. The GOUT= option cannot be used if the LINEPRINTER option is specified.

LINEPRINTER | LP
creates requested plots as line printer plots. If you do not specify this option, requested plots are created on a high resolution graphics device. This option is required if plots are requested and you do not have SAS/GRAPH software.

NOPRINT
suppresses the normal display of results. Using this option in the PROC REG statement is equivalent to specifying NOPRINT in each MODEL statement. Note that this option temporarily disables the Output Delivery System (ODS); see Chapter 15, "Using the Output Delivery System," for more information.

OUTEST=SAS-data-set
requests that parameter estimates and optional model fit summary statistics be output to this data set. See the "OUTEST= Data Set" section for details. If you want to create a permanent SAS data set, you must specify a two-level name (refer to the section "SAS Files" in SAS Language Reference: Concepts for more information on permanent SAS data sets).
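
The following sketch writes parameter estimates and their covariance matrix to an OUTEST= data set and then prints it. The SASHELP.CLASS input data set and the output name WORK.EST are assumptions for illustration; WORK.EST is a temporary data set.

```sas
/* Assumption: SASHELP.CLASS is available; WORK.EST is a         */
/* hypothetical output data set name.                            */
/* COVOUT adds the covariance matrix of the estimates, and       */
/* NOPRINT suppresses the displayed results.                     */
proc reg data=sashelp.class outest=work.est covout noprint;
   model Weight = Height Age;
run;
quit;

proc print data=work.est;
run;
```

To keep the estimates permanently, you would replace WORK.EST with a two-level name whose first level is a libref that you have assigned.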

OUTSEB
outputs the standard errors of the parameter estimates to the OUTEST= data set. The value SEB for the variable _TYPE_ identifies the standard errors. If the RIDGE= or PCOMIT= option is specified, additional observations are included and identified by the values RIDGESEB and IPCSEB, respectively, for the variable _TYPE_. The standard errors for ridge regression estimates and IPC estimates are limited in their usefulness because these estimates are biased. This option is available for all model selection methods except RSQUARE, ADJRSQ, and CP.

OUTSSCP=SAS-data-set
requests that the sums of squares and crossproducts matrix be output to this TYPE=SSCP data set. See the "OUTSSCP= Data Sets" section for details. If you want to create a permanent SAS data set, you must specify a two-level name (refer to the section "SAS Files" in SAS Language Reference: Concepts for more information on permanent SAS data sets).
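
As a sketch, the first step below saves the sums of squares and crossproducts matrix, and the second step fits a model from it. The SASHELP.CLASS input data set and the name WORK.SSCP are assumptions for illustration.

```sas
/* Assumption: SASHELP.CLASS is available; WORK.SSCP is a        */
/* hypothetical output data set name.                            */
/* Only PROC REG options are used, so a VAR statement is given.  */
proc reg data=sashelp.class outsscp=work.sscp;
   var Weight Height Age;
run;
quit;

/* The TYPE=SSCP data set can then serve as DATA= input.         */
proc reg data=work.sscp;
   model Weight = Height Age;
run;
quit;
```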

OUTSTB
outputs the standardized parameter estimates as well as the usual estimates to the OUTEST= data set when the RIDGE= or PCOMIT= option is specified. The values RIDGESTB and IPCSTB for the variable _TYPE_ identify ridge regression estimates and IPC estimates, respectively.

OUTVIF
outputs the variance inflation factors (VIF) to the OUTEST= data set when the RIDGE= or PCOMIT= option is specified. The factors are the diagonal elements of the inverse of the correlation matrix of regressors as adjusted by ridge regression or IPC analysis. These observations are identified in the output data set by the values RIDGEVIF and IPCVIF for the variable _TYPE_.

PCOMIT=list
requests an incomplete principal components (IPC) analysis for each value m in the list. The procedure computes parameter estimates using all but the last m principal components. Each value of m produces a set of IPC estimates, which are output to the OUTEST= data set. The values of m are saved by the variable _PCOMIT_, and the value of the variable _TYPE_ is set to IPC to identify the estimates. Only nonnegative integers can be specified with the PCOMIT= option.

If you specify the PCOMIT= option, RESTRICT statements are ignored.
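
A minimal sketch of an IPC analysis follows, assuming that SASHELP.CLASS is available; the output name WORK.IPC is hypothetical. With two regressors, PCOMIT=1 computes estimates that omit the last principal component.

```sas
/* Assumption: SASHELP.CLASS is available; WORK.IPC is a         */
/* hypothetical output data set name.                            */
/* OUTVIF adds the variance inflation factors for the IPC fit.   */
proc reg data=sashelp.class outest=work.ipc outvif pcomit=1;
   model Weight = Height Age;
run;
quit;

/* Rows with _TYPE_ equal to IPC and IPCVIF hold the IPC results; */
/* the variable _PCOMIT_ records the value of m.                  */
proc print data=work.ipc;
run;
```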

PRESS
outputs the PRESS statistic to the OUTEST= data set. The values of this statistic are saved in the variable _PRESS_. This option is available for all model selection methods except RSQUARE, ADJRSQ, and CP.

RIDGE=list
requests a ridge regression analysis and specifies the values of the ridge constant k (see the "Computations for Ridge Regression and IPC Analysis" section). Each value of k produces a set of ridge regression estimates that are placed in the OUTEST= data set. The values of k are saved by the variable _RIDGE_, and the value of the variable _TYPE_ is set to RIDGE to identify the estimates.

Only nonnegative numbers can be specified with the RIDGE= option. Example 55.10 illustrates this option.

If you specify the RIDGE= option, RESTRICT statements are ignored.
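
The following sketch requests ridge estimates for several values of the ridge constant k. The SASHELP.CLASS data set, the output name WORK.RIDGE, and the particular grid of k values are assumptions chosen only for illustration.

```sas
/* Assumption: SASHELP.CLASS is available; WORK.RIDGE is a       */
/* hypothetical output data set name, and the grid of k values   */
/* is arbitrary.                                                 */
proc reg data=sashelp.class outest=work.ridge outvif outstb
         ridge=0 to 0.1 by 0.02;
   model Weight = Height Age;
run;
quit;

/* _RIDGE_ holds each k; _TYPE_ distinguishes RIDGE, RIDGEVIF,   */
/* and RIDGESTB rows.                                            */
proc print data=work.ridge;
run;
```

Comparing the RIDGE and RIDGEVIF rows across values of _RIDGE_ is one way to see how the estimates and the variance inflation factors change as k increases.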

RSQUARE
has the same effect as the EDF option.

SIMPLE
displays the sum, mean, variance, standard deviation, and uncorrected sum of squares for each variable used in PROC REG.

SINGULAR=n
tunes the mechanism used to check for singularities. The default value is machine dependent but is approximately 1E-7 on most machines. This option is rarely needed. Singularity checking is described in the "Computational Methods" section.

TABLEOUT
outputs the standard errors and 100(1-α)% confidence limits for the parameter estimates, the t statistics for testing if the estimates are zero, and the associated p-values to the OUTEST= data set. The _TYPE_ variable values STDERR, LnB, UnB, T, and PVALUE, where n = 100(1-α), identify these rows in the OUTEST= data set. The α-level can be set with the ALPHA= option in the PROC REG or MODEL statement. The OUTEST= option must be specified in the PROC REG statement for this option to take effect.
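
As a sketch of the TABLEOUT option, the following step writes the additional rows to an OUTEST= data set. The SASHELP.CLASS data set and the output name WORK.EST are assumptions for illustration.

```sas
/* Assumption: SASHELP.CLASS is available; WORK.EST is a         */
/* hypothetical output data set name.                            */
/* TABLEOUT takes effect only because OUTEST= is also specified. */
proc reg data=sashelp.class outest=work.est tableout alpha=0.10 noprint;
   model Weight = Height Age;
run;
quit;

proc print data=work.est;
run;
```

By the naming rule given above, ALPHA=0.10 gives n = 90, so the confidence limit rows would be identified by the _TYPE_ values L90B and U90B.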

USSCP
displays the uncorrected sums-of-squares and crossproducts matrix for all variables used in the procedure.


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.