The GLM Procedure
title 'Regression in PROC GLM';
data iron;
   input fe loss @@;
   datalines;
0.01 127.6   0.48 124.0   0.71 110.8   0.95 103.9   1.19 101.5
0.01 130.1   0.48 122.0   1.44  92.3   0.71 113.1   1.96  83.7
0.01 128.0   1.44  91.4   1.96  86.2
;
The GPLOT procedure is used to request a scatter plot of the response variable versus the independent variable.
symbol1 c=blue;
proc gplot;
   plot loss*fe / vm=1;
run;
The plot in Figure 30.3 displays a strong negative relationship between iron content and corrosion resistance, but it is not clear whether there is curvature in this relationship.
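One way to judge the possible curvature by eye is to overlay a fitted quadratic curve on the scatter plot. The following is a minimal sketch that assumes a later SAS release in which PROC SGPLOT is available (it is not part of the release documented here) and that the data set iron created above is still in the WORK library.

```sas
/* Sketch only: assumes PROC SGPLOT is available in your SAS release. */
proc sgplot data=iron;
   scatter x=fe y=loss;                      /* raw observations          */
   reg x=fe y=loss / degree=2 nomarkers;     /* overlay a quadratic fit   */
run;
```

If the overlaid curve is nearly straight over the observed range of fe, that is informal evidence that the quadratic term contributes little; the formal test comes from the model fit.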
The following statements fit a quadratic regression model to the data. This enables you to estimate the linear relationship between iron content and corrosion resistance and test for the presence of a quadratic component. The intercept is automatically fit unless the NOINT option is specified.
proc glm;
   model loss=fe fe*fe;
run;
The CLASS statement is omitted because a regression line is being fitted. Unlike PROC REG, PROC GLM allows polynomial terms in the MODEL statement.
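To illustrate the contrast with PROC REG, the following sketch fits the same quadratic model there by first computing the squared term in a DATA step. The data set name iron2 and the variable name fesq are hypothetical names chosen for this illustration; they do not appear in the original example.

```sas
/* Sketch: the same quadratic fit in PROC REG.                   */
/* iron2 and fesq are hypothetical names used for illustration.  */
data iron2;
   set iron;
   fesq = fe*fe;          /* quadratic term computed in advance */
run;

proc reg data=iron2;
   model loss = fe fesq;  /* regressors must be existing variables */
run;
quit;
```

The parameter estimates should agree with those from the PROC GLM fit; the only difference is that the squared term has to exist as a variable before the procedure is invoked.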
The preliminary information in Figure 30.4 informs you that the GLM procedure has been invoked and states the number of observations in the data set. If the model involves classification variables, they are also listed here, along with their levels.
Figure 30.5 shows the overall ANOVA table and some simple statistics. The degrees of freedom can be used to check that the model is correct and that the data have been read correctly. The Model degrees of freedom for a regression is the number of parameters in the model minus 1. You are fitting a model with three parameters in this case,
$\text{loss} = \beta_0 + \beta_1 (\text{fe}) + \beta_2 (\text{fe})^2 + \text{error}$
so the Model degrees of freedom are 3 - 1 = 2.
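As a quick check on the remaining rows of the ANOVA table, the degrees of freedom can be worked out by hand. This assumes that all 13 observations in the data set are used in the analysis (none are missing) and that the quadratic model has 2 Model degrees of freedom.

```latex
\begin{aligned}
\text{Corrected Total df} &= 13 - 1 = 12,\\
\text{Error df} &= \text{Corrected Total df} - \text{Model df} = 12 - 2 = 10.
\end{aligned}
```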
The R-square value indicates that the model accounts for 97% of the variation in LOSS. The coefficient of variation (C.V.), the Root MSE (the square root of the Mean Square for Error), and the mean of the dependent variable are also listed.
The overall F test is significant (F=164.68, p<0.0001), indicating that the model as a whole accounts for a significant amount of the variation in LOSS. Thus, it is appropriate to proceed to testing the effects.
Figure 30.6 contains tests of effects and parameter estimates. The latter are displayed by default when the model contains only continuous variables.
The t tests provided are equivalent to the Type III F tests. The quadratic term is not significant (F=0.28, p=0.6107; t=0.53, p=0.6107) and thus can be removed from the model; the linear term is significant (F=35.64, p=0.0001; t=-5.97, p=0.0001). This suggests that there is indeed a straight-line relationship between loss and fe.
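The equivalence can be checked from the reported values: for an effect with one degree of freedom, the Type III F statistic is the square of the corresponding t statistic, so the two tests have the same p-value. Using the rounded values quoted above:

```latex
t^2 = F: \qquad (0.53)^2 \approx 0.28, \qquad (-5.97)^2 \approx 35.64
```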
Fitting the model without the quadratic term provides more accurate estimates for the intercept $\beta_0$ and the slope $\beta_1$. PROC GLM allows only one MODEL statement per invocation of the procedure, so the PROC GLM statement must be issued again. The statements used to fit the linear model are
proc glm;
   model loss=fe;
run;
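Figure 30.7 displays the output produced by these statements. The linear term is still significant (F=352.27, p<0.0001). The estimated model is now
$\text{loss} = 129.79 - 24.02 \times \text{fe}$
where the coefficients are the least squares estimates of the intercept and slope for the data listed above, rounded to two decimal places.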
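To inspect the fitted values and residuals from the linear model, the OUTPUT statement of PROC GLM can write them to a data set. The following is a minimal sketch; the data set name ironfit and the variable names yhat and resid are hypothetical names chosen for this illustration.

```sas
/* Sketch: save predicted values and residuals from the linear fit. */
/* ironfit, yhat, and resid are hypothetical names.                 */
proc glm data=iron;
   model loss = fe;
   output out=ironfit predicted=yhat residual=resid;
run;
quit;

proc print data=ironfit;
   var fe loss yhat resid;
run;
```

Naming the input data set with DATA= avoids relying on the most recently created data set, and the QUIT statement ends the procedure, which otherwise remains active after the RUN statement.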
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.