The REG Procedure

Output Data Sets

OUTEST= Data Set

The OUTEST= specification produces a TYPE=EST output SAS data set containing estimates and optional statistics from the regression models. For each BY group on each dependent variable occurring in each MODEL statement, PROC REG outputs an observation to the OUTEST= data set. The variables output to the data set are as follows:

If you specify the COVOUT option, the covariance matrix of the estimates is output after the estimates; the _TYPE_ variable is set to the value 'COV' and the names of the rows are identified by the 8-byte character variable, _NAME_.
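As a minimal sketch of the COVOUT option, the following step refits the quadratic model from the polynomial regression example used later in this section and writes the covariance matrix of the estimates to the OUTEST= data set. The data set name estcov is hypothetical and chosen only for illustration; the USPopulation data set and its variables are assumed to exist as in that example.

```sas
   /* Sketch: request the covariance matrix of the estimates.  */
   /* estcov is an illustrative (hypothetical) data set name.  */
   proc reg data=USPopulation outest=estcov covout;
      m2: model Population=Year YearSq / noprint;
   run;
   quit;

   /* Observations with _TYPE_='COV' hold the covariance       */
   /* matrix; _NAME_ identifies the row of each observation.   */
   proc print data=estcov;
      where _TYPE_='COV';
   run;
```

The WHERE statement restricts the listing to the covariance rows, so the parameter estimates (the _TYPE_='PARMS' observation) are not printed.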

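If you specify the TABLEOUT option, the following statistics, listed by _TYPE_, are added after the estimates: standard errors (STDERR), t statistics (T), p-values (PVALUE), and lower and upper confidence limits (for example, L90B and U90B when ALPHA=0.1, as in Figure 55.19).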

Specifying the option ADJRSQ, AIC, BIC, CP, EDF, GMSEP, JP, MSE, PC, RSQUARE, SBC, SP, or SSE in the PROC REG or MODEL statement automatically outputs these statistics and the model R-square for each model selected, regardless of the model selection method. Additional variables, in order of occurrence, are as follows:

The following example displays the OUTEST= data set. It uses the population data given in the section "Polynomial Regression". Figure 55.16 through Figure 55.18 show the regression equations and the resulting OUTEST= data set.

   proc reg data=USPopulation outest=est;
      m1: model Population=Year;
      m2: model Population=Year YearSq;
   proc print data=est;
   run;

 
The REG Procedure
Model: M1
Dependent Variable: Population

Analysis of Variance
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              1            66336         66336    201.87   <.0001
Error             17       5586.29253     328.60544
Corrected Total   18            71923
 
Root MSE 18.12748 R-Square 0.9223
Dependent Mean 69.76747 Adj R-Sq 0.9178
Coeff Var 25.98271    
 
Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1          -1958.36630        142.80455    -13.71     <.0001
Year         1              1.07879          0.07593     14.21     <.0001
Figure 55.16: Regression Output for Model M1

 
The REG Procedure
Model: M2
Dependent Variable: Population

Analysis of Variance
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2            71799         35900   4641.72   <.0001
Error             16        123.74557       7.73410
Corrected Total   18            71923
 
Root MSE 2.78102 R-Square 0.9983
Dependent Mean 69.76747 Adj R-Sq 0.9981
Coeff Var 3.98613    
 
Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1                20450        843.47533     24.25     <.0001
Year         1            -22.78061          0.89785    -25.37     <.0001
YearSq       1              0.00635       0.00023877     26.58     <.0001
Figure 55.17: Regression Output for Model M2

 
Obs _MODEL_ _TYPE_ _DEPVAR_ _RMSE_ Intercept Year Population YearSq
1 M1 PARMS Population 18.1275 -1958.37 1.0788 -1 .
2 M2 PARMS Population 2.7810 20450.43 -22.7806 -1 .006345585
Figure 55.18: OUTEST= Data Set

The following modification of the previous example uses the TABLEOUT and ALPHA= options to obtain additional information in the OUTEST= data set:

 
   proc reg data=USPopulation outest=est tableout alpha=0.1;
      m1: model Population=Year/noprint;
      m2: model Population=Year YearSq/noprint;
   proc print data=est;
   run;

Notice that the TABLEOUT option causes standard errors, t statistics, p-values, and confidence limits for the estimates to be added to the OUTEST= data set. Also note that the ALPHA= option is used to set the confidence level at 90%. The OUTEST= data set follows.

 
Obs _MODEL_ _TYPE_ _DEPVAR_ _RMSE_ Intercept Year Population YearSq
1 M1 PARMS Population 18.1275 -1958.37 1.0788 -1 .
2 M1 STDERR Population 18.1275 142.80 0.0759 . .
3 M1 T Population 18.1275 -13.71 14.2082 . .
4 M1 PVALUE Population 18.1275 0.00 0.0000 . .
5 M1 L90B Population 18.1275 -2206.79 0.9467 . .
6 M1 U90B Population 18.1275 -1709.94 1.2109 . .
7 M2 PARMS Population 2.7810 20450.43 -22.7806 -1 0.0063
8 M2 STDERR Population 2.7810 843.48 0.8978 . 0.0002
9 M2 T Population 2.7810 24.25 -25.3724 . 26.5762
10 M2 PVALUE Population 2.7810 0.00 0.0000 . 0.0000
11 M2 L90B Population 2.7810 18977.82 -24.3481 . 0.0059
12 M2 U90B Population 2.7810 21923.04 -21.2131 . 0.0068
Figure 55.19: The OUTEST= Data Set When TABLEOUT is Specified
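Because each statistic occupies its own observation, identified by _TYPE_, a subset of the table can be selected with a WHERE statement. The following is a minimal sketch that assumes the est data set from the preceding TABLEOUT example is still available; it lists only the 90% confidence limits.

```sas
   /* Sketch: list only the confidence-limit rows of the       */
   /* OUTEST= data set created with TABLEOUT and ALPHA=0.1.    */
   proc print data=est;
      where _TYPE_ in ('L90B', 'U90B');
      var _MODEL_ _TYPE_ Intercept Year YearSq;
   run;
```

The _TYPE_ values L90B and U90B depend on the ALPHA= setting, so the WHERE condition would need to change if a different confidence level were requested.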

A slightly different OUTEST= data set is created when you use the RSQUARE selection method. This example requests only the "best" model for each subset size but asks for a variety of model selection statistics, as well as the estimated regression coefficients. An OUTEST= data set is created and displayed. See Figure 55.20 and Figure 55.21 for results.

   proc reg data=fitness outest=est;
      model Oxygen=Age Weight RunTime RunPulse RestPulse MaxPulse
            / selection=rsquare mse jp gmsep cp aic bic sbc b best=1;
   proc print data=est;
   run;

 
The REG Procedure
Model: MODEL1
Dependent Variable: Oxygen
R-Square Selection Method

Number in                                       Estimated MSE                             Parameter Estimates
    Model  R-Square     C(p)      AIC      BIC  of Prediction    J(p)      MSE       SBC  Intercept       Age    Weight   RunTime  RunPulse  RestPulse  MaxPulse
        1    0.7434  13.6988  64.5341  65.4673         8.0546  8.0199  7.53384  67.40210   82.42177         .         .  -3.31056         .          .         .
        2    0.7642  12.3894  63.9050  64.8212         7.9478  7.8621  7.16842  68.20695   88.46229  -0.15037         .  -3.20395         .          .         .
        3    0.8111   6.9596  59.0373  61.3127         6.8583  6.7253  5.95669  64.77326  111.71806  -0.25640         .  -2.82538  -0.13091          .         .
        4    0.8368   4.8800  56.4995  60.3996         6.3984  6.2053  5.34346  63.66941   98.14789  -0.19773         .  -2.76758  -0.34811          .   0.27051
        5    0.8480   5.1063  56.2986  61.5667         6.4565  6.1782  5.17634  64.90250  102.20428  -0.21962  -0.07230  -2.68252  -0.37340          .   0.30491
        6    0.8487   7.0000  58.1616  64.0748         6.9870  6.5804  5.36825  68.19952  102.93448  -0.22697  -0.07418  -2.62865  -0.36963   -0.02153   0.30322

Figure 55.20: PROC REG Output for Physical Fitness Data: Best Models

 
Obs _MODEL_ _TYPE_ _DEPVAR_ _RMSE_ Intercept Age Weight RunTime RunPulse RestPulse MaxPulse Oxygen _IN_ _P_ _EDF_ _MSE_ _RSQ_ _CP_ _JP_ _GMSEP_ _AIC_ _BIC_ _SBC_
1 MODEL1 PARMS Oxygen 2.74478 82.422 . . -3.31056 . . . -1 1 2 29 7.53384 0.74338 13.6988 8.01990 8.05462 64.5341 65.4673 67.4021
2 MODEL1 PARMS Oxygen 2.67739 88.462 -0.15037 . -3.20395 . . . -1 2 3 28 7.16842 0.76425 12.3894 7.86214 7.94778 63.9050 64.8212 68.2069
3 MODEL1 PARMS Oxygen 2.44063 111.718 -0.25640 . -2.82538 -0.13091 . . -1 3 4 27 5.95669 0.81109 6.9596 6.72530 6.85833 59.0373 61.3127 64.7733
4 MODEL1 PARMS Oxygen 2.31159 98.148 -0.19773 . -2.76758 -0.34811 . 0.27051 -1 4 5 26 5.34346 0.83682 4.8800 6.20531 6.39837 56.4995 60.3996 63.6694
5 MODEL1 PARMS Oxygen 2.27516 102.204 -0.21962 -0.072302 -2.68252 -0.37340 . 0.30491 -1 5 6 25 5.17634 0.84800 5.1063 6.17821 6.45651 56.2986 61.5667 64.9025
6 MODEL1 PARMS Oxygen 2.31695 102.934 -0.22697 -0.074177 -2.62865 -0.36963 -0.021534 0.30322 -1 6 7 24 5.36825 0.84867 7.0000 6.58043 6.98700 58.1616 64.0748 68.1995
Figure 55.21: PROC PRINT Output for Physical Fitness Data: OUTEST= Data Set
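The statement earlier in this section, that the fit-statistic options add their variables to the OUTEST= data set regardless of the selection method, can be tried on a single model fit without SELECTION=. The following is a minimal sketch under that assumption; est2 is a hypothetical data set name, and the choice of regressors is arbitrary.

```sas
   /* Sketch: request AIC and SBC for one model, no selection. */
   /* est2 is an illustrative (hypothetical) data set name.    */
   proc reg data=fitness outest=est2;
      model Oxygen=Age RunTime RunPulse / aic sbc noprint;
   run;
   quit;

   /* Expect one PARMS observation; per the text above it      */
   /* should carry _AIC_ and _SBC_ alongside the estimates.    */
   proc print data=est2;
   run;
```

If the statistics variables do not appear, check the documentation of the MODEL statement options for the SAS release in use.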

OUTSSCP= Data Sets

The OUTSSCP= option produces a TYPE=SSCP output SAS data set containing sums of squares and crossproducts. A special row (observation) and column (variable) of the matrix called Intercept contain the number of observations and sums. Observations are identified by the character variable _NAME_. The data set contains all variables used in MODEL statements. You can specify additional variables that you want included in the crossproducts matrix with a VAR statement.

The SSCP data set is useful when a large number of observations are explored in many different runs. It can be saved and used for subsequent runs, which are much less expensive because PROC REG never reads the original data again. If you run PROC REG once to create only an SSCP data set, list all the variables that you might need in a VAR statement, or include them in a MODEL statement.

The following example uses the fitness data from Example 55.1 to produce an output data set with the OUTSSCP= option. The resulting output is shown in Figure 55.22.

   proc reg data=fitness outsscp=sscp;
      var Oxygen RunTime Age Weight RestPulse RunPulse MaxPulse;
   proc print data=sscp;
   run;

Since a model is not fit to the data and since the only request is to create the SSCP data set, a MODEL statement is not required in this example. However, since the MODEL statement is not used, the VAR statement is required.

 
Obs _TYPE_ _NAME_ Intercept Oxygen RunTime Age Weight RestPulse RunPulse MaxPulse
1 SSCP Intercept 31.00 1468.65 328.17 1478.00 2400.78 1657.00 5259.00 5387.00
2 SSCP Oxygen 1468.65 70429.86 15356.14 69767.75 113522.26 78015.41 248497.31 254866.75
3 SSCP RunTime 328.17 15356.14 3531.80 15687.24 25464.71 17684.05 55806.29 57113.72
4 SSCP Age 1478.00 69767.75 15687.24 71282.00 114158.90 78806.00 250194.00 256218.00
5 SSCP Weight 2400.78 113522.26 25464.71 114158.90 188008.20 128409.28 407745.67 417764.62
6 SSCP RestPulse 1657.00 78015.41 17684.05 78806.00 128409.28 90311.00 281928.00 288583.00
7 SSCP RunPulse 5259.00 248497.31 55806.29 250194.00 407745.67 281928.00 895317.00 916499.00
8 SSCP MaxPulse 5387.00 254866.75 57113.72 256218.00 417764.62 288583.00 916499.00 938641.00
9 N   31.00 31.00 31.00 31.00 31.00 31.00 31.00 31.00
Figure 55.22: SSCP Data Set Created with OUTSSCP= Option: REG Procedure
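A saved SSCP data set can then serve as the input to later runs. The following is a minimal sketch, assuming the sscp data set created above is available and retains its TYPE=SSCP attribute; the regressors chosen here are arbitrary.

```sas
   /* Sketch: fit a model from the saved SSCP data set         */
   /* instead of rereading the original fitness data.          */
   proc reg data=sscp;
      model Oxygen=RunTime Age Weight;
   run;
   quit;
```

Only variables that appear in the SSCP data set can be used in the MODEL statement. Because the individual observations are not available in this form, requests that depend on them, such as residuals or predicted values, are not expected to work with this input.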


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.