Statistical Inference

The LOESS Procedure

If you denote the $i$th measurement of the response by $y_i$ and the corresponding measurement of the predictors by $x_i$, then

$y_i=g(x_i) + \epsilon_i$

where $g$ is the regression function and the $\epsilon_i$ are independent random errors with mean zero. If the errors are normally distributed with constant variance, then you can obtain confidence intervals for the predictions from PROC LOESS. You can also obtain confidence limits when the $\epsilon_i$ are heteroscedastic but $a_i \epsilon_i$ has constant variance, where the $a_i$ are a priori weights that you specify in the WEIGHT statement of PROC LOESS. When the error distribution is symmetric but not normal, you can do inference by using iterative reweighting.

Formulas for doing statistical inference under the preceding conditions can be found in Cleveland and Grosse (1991) and Cleveland, Grosse, and Shyu (1992). The main result of their analysis is that a standardized residual for a loess model follows a t distribution with $\rho$ degrees of freedom, where $\rho$ is called the "lookup degrees of freedom." The value of $\rho$ is a function of the smoothing matrix $L$, which defines the linear relationship between the fitted and observed values of the dependent variable in a loess model; that is, the vector of fitted values is $L$ times the vector of observed values.
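
To make the dependence of $\rho$ on $L$ concrete, the following Python sketch computes it from a given smoothing matrix. It is an illustration, not the PROC LOESS implementation, and it assumes the definition commonly given in the cited references, which this section does not state: $\delta_1 = \mathrm{trace}((I-L)^T(I-L))$, $\delta_2 = \mathrm{trace}(((I-L)^T(I-L))^2)$, and $\rho = \delta_1^2/\delta_2$. The function name `lookup_df` and the toy matrix `L_mean` are hypothetical.

```python
def lookup_df(L):
    """Return (delta1, delta2, rho) for an n-by-n smoothing matrix L.

    L is a list of n rows, each a list of n floats. The formula
    rho = delta1**2 / delta2 is an assumption taken from the loess
    literature, not something defined in this section.
    """
    n = len(L)
    # R = I - L maps the observed responses to the residuals.
    R = [[(1.0 if i == j else 0.0) - L[i][j] for j in range(n)]
         for i in range(n)]
    # M = R^T R is symmetric by construction.
    M = [[sum(R[k][i] * R[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    # delta1 = trace(M)
    delta1 = sum(M[i][i] for i in range(n))
    # delta2 = trace(M^2); for symmetric M this is the sum of squared entries.
    delta2 = sum(M[i][j] ** 2 for i in range(n) for j in range(n))
    return delta1, delta2, delta1 ** 2 / delta2


# Toy smoother: every fitted value is the sample mean, so L = (1/n) * ones.
n = 5
L_mean = [[1.0 / n] * n for _ in range(n)]
delta1, delta2, rho = lookup_df(L_mean)
```

For this toy smoother, $I - L$ is a projection of rank $n - 1$, so $\rho$ reduces to the familiar $n - 1$ residual degrees of freedom. Forming the $n \times n$ product is what makes the computation costly for large data sets.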

The determination of $\rho$ is computationally expensive and is not done by default. It is computed if you specify the DFMETHOD=EXACT option in the MODEL statement. It is also computed if you specify any of the options CLM, STD, or T in the MODEL statement.

If you specify the CLM option in the MODEL statement, confidence limits are added to the OutputStatistics table. By default, 95% limits are computed, but you can change this by using the ALPHA= option in the MODEL statement.
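
For orientation, limits of this kind have the usual t-interval form. The expression below is a sketch for the unweighted, constant-variance case and is not stated in this section. The notation is assumed: $\hat{g}(x_i)$ is the fitted value at $x_i$, $s$ is the residual standard error, $l_{ij}$ are the entries of row $i$ of the smoothing matrix $L$, and $t_{\rho,\,1-\alpha/2}$ is the $1-\alpha/2$ quantile of a t distribution with $\rho$ degrees of freedom. Consult the cited references for the authoritative formulas.

$\hat{g}(x_i) \pm t_{\rho,\,1-\alpha/2}\; s \sqrt{\sum_{j} l_{ij}^2}$

The square-root factor arises because each fitted value is a linear combination $\sum_j l_{ij} y_j$ of independent responses with common variance, so its variance is that common variance times $\sum_j l_{ij}^2$.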
