The LOESS Procedure
Investigators studied the exhaust emissions of a single-cylinder engine (Brinkman 1981). The SAS data set Gas contains the results. The dependent variable, NOx, measures the concentration, in micrograms per joule, of nitric oxide and nitrogen dioxide normalized by the amount of work of the engine. The independent variable, E, is a measure of the richness of the air and fuel mixture.
data Gas;
   input NOx E;
   format NOx f3.1;
   format E f3.1;
   datalines;
4.818 0.831
2.849 1.045
3.275 1.021
4.691 0.97
4.255 0.825
5.064 0.891
2.118 0.71
4.602 0.801
2.286 1.074
0.97 1.148
3.965 1
5.344 0.928
3.834 0.767
1.99 0.701
5.199 0.807
5.283 0.902
3.752 0.997
0.537 1.224
1.64 1.089
5.055 0.973
4.937 0.98
1.561 0.665
;
The following PROC GPLOT statements produce the simple scatter plot of these data, displayed in Output 38.1.1.
symbol1 color=black value=dot;
proc gplot data=Gas;
   plot NOx*E;
run;
Output 38.1.1: Scatter Plot of Gas Data
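The following statements fit two loess models to these data, one with smoothing parameter 0.6 and one with smoothing parameter 1.0. The DEGREE=2 option requests local quadratic fitting, and the DIRECT option requests direct fitting at every data point. The ODS OUTPUT statement saves the "Output Statistics" and "Fit Summary" tables in the data sets GasFit and Summary.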
proc loess data=Gas;
   ods output OutputStatistics = GasFit
              FitSummary       = Summary;
   model NOx = E / degree=2 direct smooth = 0.6 1.0
                   alpha=.01 all details;
run;
The "Fit Summary" table for smoothing parameter 0.6, shown in Output 38.1.2, records the fitting parameters specified and some overall fit statistics.
Output 38.1.2: Fit Summary Table

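The equivalent number of parameters and residual standard error in the "Fit Summary" table are defined by

\[
\text{Equivalent Number of Parameters} = \operatorname{Trace}\left(L^{T}L\right),
\qquad
\text{Residual Standard Error} = \sqrt{\frac{\mathrm{RSS}}{\delta_1}}
\]

where $L$ is the smoothing matrix that maps the observed values of the dependent variable to the fitted values, $\mathrm{RSS}$ is the residual sum of squares, and $\delta_1 = \operatorname{Trace}\left((I-L)^{T}(I-L)\right)$ is the quantity labeled Delta1 in the table.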
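As a rough check on the definition of the residual standard error, the value can be recomputed from the captured fit summary. The following is a minimal sketch, assuming the Summary data set created by the ODS OUTPUT statement above contains the rows labeled 'Residual Sum of Squares' and 'Delta1' in the variable Label1, with the numeric values in nValue1. The column name ResidualSE is introduced here only for illustration.

```sas
/* Sketch: recompute sqrt(RSS/Delta1) for each smoothing parameter.  */
/* Assumes Summary has variables SmoothingParameter, Label1, nValue1 */
proc sql;
   select r.SmoothingParameter,
          sqrt(r.nValue1 / d.nValue1) as ResidualSE
   from Summary(where=(Label1='Residual Sum of Squares')) as r,
        Summary(where=(Label1='Delta1')) as d
   where r.SmoothingParameter = d.SmoothingParameter;
quit;
```

The values printed should agree with the Residual Standard Error entries in the corresponding "Fit Summary" tables, up to rounding.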
The "Output Statistics" table for smoothing parameter 0.6 is shown in Output 38.1.3. Because the ALL option is specified in the MODEL statement, this table includes all the relevant optional columns. Furthermore, because the ALPHA=0.01 option is specified in the MODEL statement, the confidence limits in this table are 99% limits.
Output 38.1.3: Output Statistics Table
Plots of the data points and fitted models with 99% confidence bands are shown in Output 38.1.4.
proc sort data=GasFit;
   by SmoothingParameter E;
run;
symbol1 color=black value=dot;
symbol2 color=black interpol=spline value=none;
symbol3 color=green interpol=spline value=none;
symbol4 color=green interpol=spline value=none;
%let opts=vaxis=axis1 hm=3 vm=3 overlay;
goptions nodisplay hsize=3.75;
axis1 label=(angle=90 rotate=0);
proc gplot data=GasFit;
   by SmoothingParameter;
   plot (DepVar Pred LowerCL UpperCL)*E / &opts name='fitGas';
run; quit;
goptions display hsize=0 hpos=0;
proc greplay nofs tc=sashelp.templt template=h2;
   igout gseg;
   treplay 1:fitGas 2:fitGas1;
run; quit;
Output 38.1.4: Loess Fits with 99% Confidence Bands for Gas Data
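For readers working in a release that includes the statistical graphics procedures, a similar display can be sketched with PROC SGPLOT instead of PROC GPLOT and PROC GREPLAY. This is an alternative under stated assumptions, not the method used to produce the output above: it assumes SAS 9.2 or later and that GasFit is already sorted by SmoothingParameter and E, as done by the PROC SORT step. It produces one graph per smoothing parameter rather than a side-by-side panel.

```sas
/* Sketch: fitted curve with confidence band, one graph per smoothing value */
proc sgplot data=GasFit;
   by SmoothingParameter;
   /* Shaded region between the lower and upper confidence limits */
   band x=E lower=LowerCL upper=UpperCL / transparency=0.5
        legendlabel='99% confidence band';
   /* Observed values and the loess fit */
   scatter x=E y=DepVar;
   series  x=E y=Pred;
run;
```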
It is evident from the preceding figure that the better fit is obtained with smoothing parameter 0.6. Scatter plots of the fit residuals confirm this observation. In the following statements, PROC LOESS is used again, this time to smooth the Residual variable in the GasFit data set against E. The smoothed residuals are overlaid on the residual scatter plots.
proc loess data=GasFit;
   by SmoothingParameter;
   ods output OutputStatistics=residout;
   model Residual=E;
run;
axis1 label = (angle=90 rotate=0)
      order = (-1 to 1 by 0.5);
goptions nodisplay hsize=3.75;
proc gplot data=residout;
   by SmoothingParameter;
   plot DepVar*E Pred*E / &opts vref=0 lv=2 vm=1
        name='resGas';
run; quit;
goptions display hsize=0 hpos=0;
proc greplay nofs tc=sashelp.templt template=h2;
   igout gseg;
   treplay 1:resGas 2:resGas1;
run; quit;
Output 38.1.5: Scatter Plots of Loess Fit Residuals
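As a numerical complement to the residual plots, the residuals can also be summarized for each smoothing parameter. The following is a minimal sketch, assuming the Residual variable is present in GasFit (it is one of the optional columns requested by the ALL option). The USS statistic is the uncorrected sum of squares, so for this variable it should correspond to the Residual Sum of Squares reported in each "Fit Summary" table.

```sas
/* Sketch: summary statistics of the loess residuals by smoothing parameter */
proc means data=GasFit n mean std min max uss;
   class SmoothingParameter;
   var Residual;
run;
```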
The residual plots show that with smoothing parameter 1, the loess model exhibits a lack of fit. Analysis of variance can be used to compare the model with smoothing parameter 1, which serves as the null model, to the model with smoothing parameter 0.6.
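The statistic

\[
F = \frac{\left(\mathrm{RSS}^{(n)} - \mathrm{RSS}\right) / \left(\delta_1^{(n)} - \delta_1\right)}{\mathrm{RSS} / \delta_1}
\]

has a distribution that is approximated by an F distribution with

\[
\nu = \frac{\left(\delta_1^{(n)} - \delta_1\right)^2}{\delta_2^{(n)} - \delta_2}
\]

numerator degrees of freedom and $\rho$ denominator degrees of freedom. Here $\mathrm{RSS}$, $\delta_1$, and $\delta_2$ are the "Residual Sum of Squares", "Delta1", and "Delta2" entries of the "Fit Summary" table for the model with smoothing parameter 0.6, the superscript $(n)$ denotes the same quantities for the null model, and $\rho$ is the "Lookup Degrees of Freedom" of the model with smoothing parameter 0.6.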
The "Fit Summary" tables contain the information needed to carry out such an analysis. These tables have been captured in the output data set named Summary using an ODS OUTPUT statement. The following statements extract the relevant information from this data set and carry out the analysis of variance:
/* Split the fit summary rows into the null model (h0) and the alternative (h1) */
data h0 h1;
   set Summary(keep=SmoothingParameter Label1 nValue1
               where=(Label1 in ('Residual Sum of Squares',
                                 'Delta1',
                                 'Delta2',
                                 'Lookup Degrees of Freedom')));
   if SmoothingParameter = 1 then output h0;
   else output h1;
run;
/* Turn the four rows of each data set into the columns Col1-Col4 */
proc transpose data=h0(drop=SmoothingParameter Label1)
               out=h0;
run;
data h0(drop=_NAME_);
   set h0;
   rename Col1 = RSSNull
          Col2 = delta1Null
          Col3 = delta2Null;
run;
proc transpose data=h1(drop=SmoothingParameter Label1)
               out=h1;
run;
data h1(drop=_NAME_);
   set h1;
   rename Col1 = RSS
          Col2 = delta1
          Col3 = delta2
          Col4 = rho;
run;
/* Form the F statistic and its approximate p-value */
data ftest;
   merge h0 h1;
   nu = (delta1Null - delta1)**2 / (delta2Null - delta2);
   Numerator = (RSSNull - RSS)/(delta1Null - delta1);
   Denominator = RSS/delta1;
   FValue = Numerator / Denominator;
   PValue = 1 - ProbF(FValue, nu, rho);
   label nu     = 'Num DF'
         rho    = 'Den DF'
         FValue = 'F Value'
         PValue = 'Pr > F';
run;
proc print data=ftest label;
   var nu rho Numerator Denominator FValue PValue;
   format nu rho FValue 7.2 PValue 6.4;
run;
The results are shown in Output 38.1.6.
Output 38.1.6: Test ANOVA for LOESS MODELS of Gas Data
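When the p-value is very small, computing it as 1 - ProbF(...) can lose precision, because the subtraction cancels leading digits. One possible variation, assuming a release that provides the SDF function, is to compute the upper-tail probability directly. The data set name ftest_sdf and the variable PValueSDF are introduced here only for illustration.

```sas
/* Sketch: upper-tail F probability computed directly with the SDF function */
data ftest_sdf;
   set ftest;
   /* SDF('F', x, ndf, ddf) returns Pr(F > x) */
   PValueSDF = sdf('F', FValue, nu, rho);
run;
proc print data=ftest_sdf label;
   var FValue PValue PValueSDF;
   format PValue PValueSDF pvalue6.4;
run;
```

The two p-value columns should agree to the displayed precision unless the tail probability is extremely small.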
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.