The LOESS Procedure

kd Trees and Blending

PROC LOESS uses a kd tree to divide the box (also called the initial cell or bucket) enclosing all the predictor data points into rectangular cells. The vertices of these cells are the points at which local least squares fitting is done.

Starting from the initial cell, the direction of the longest cell edge is selected as the split direction. The median of this coordinate of the data in the cell is the split value. The data in the starting cell are partitioned into two child cells. The left child consists of all data from the parent cell whose coordinate in the split direction is less than the split value; the right child consists of the remaining data. This procedure is repeated for each child cell that has more than a prespecified number of points, called the bucket size of the kd tree.
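
The splitting rule described above can be sketched in Python. This is a minimal illustration and not the PROC LOESS implementation: the function name build_kd_cells, the tuple layout of a cell, the use of the split value as the shared cell boundary, and the handling of ties (a cell is left unsplit when every point would fall in one child) are all assumptions made for this example.

```python
from statistics import median

def build_kd_cells(points, bucket_size):
    """Return the leaf cells of a kd tree as (lower, upper, points) tuples.

    points is a list of equal-length coordinate tuples. A cell is split
    while it holds more than bucket_size points (hypothetical helper).
    """
    dims = len(points[0])
    # Initial cell: the box enclosing all the predictor data points.
    lower = [min(p[d] for p in points) for d in range(dims)]
    upper = [max(p[d] for p in points) for d in range(dims)]
    leaves = []
    stack = [(lower, upper, list(points))]
    while stack:
        lo, hi, pts = stack.pop()
        if len(pts) <= bucket_size:
            leaves.append((lo, hi, pts))
            continue
        # Split direction: the longest edge of the current cell.
        split_dim = max(range(dims), key=lambda d: hi[d] - lo[d])
        # Split value: the median of that coordinate over the cell's data.
        split_val = median(p[split_dim] for p in pts)
        left = [p for p in pts if p[split_dim] < split_val]
        right = [p for p in pts if p[split_dim] >= split_val]
        if not left:
            # Assumption: heavy ties at the median make the cell unsplittable.
            leaves.append((lo, hi, pts))
            continue
        left_hi = list(hi)
        left_hi[split_dim] = split_val
        right_lo = list(lo)
        right_lo[split_dim] = split_val
        stack.append((lo, left_hi, left))
        stack.append((right_lo, hi, right))
    return leaves
```

For example, calling build_kd_cells on eight points with bucket_size=2 keeps splitting until no cell holds more than two points.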

The value of the bucket size used by PROC LOESS can be specified using the BUCKET= option in the MODEL statement. If the BUCKET= option is not specified, the default value used is

$$\left\lfloor \frac{n s}{5} \right\rfloor$$
where n is the number of observations and s is the smoothing parameter. Note that if fitting is being done for a range of smoothing parameters, the bucket size may change for each smoothing parameter.
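
The default can be checked with a few lines of Python. The function name default_bucket_size is hypothetical, and the sample values of n and s are chosen only for illustration; the loop shows how the bucket size changes across a range of smoothing parameters.

```python
import math

def default_bucket_size(n, s):
    """Default bucket size: floor(n * s / 5) for n observations and smoothing parameter s."""
    return math.floor(n * s / 5)

# With n = 100 observations, each smoothing parameter gives its own bucket size.
for s in (0.25, 0.5, 1.0):
    print(s, default_bucket_size(100, s))
# prints:
# 0.25 5
# 0.5 10
# 1.0 20
```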

PROC LOESS performs its local fitting at the vertices of all the cells of the kd tree. The fitted value at an original data point (or at any other point within the original data cell) is obtained by blending the fitted values at the vertices of the kd tree cell that contains that point. Currently, PROC LOESS uses linear interpolation from the enclosing kd tree cell vertex values. Future releases of PROC LOESS will incorporate higher-order blending methods.
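
The blending step can be illustrated with a short Python sketch of linear interpolation within one rectangular cell. With a single predictor this is ordinary linear interpolation between the two cell endpoints. For more than one predictor, the sketch assumes multilinear interpolation over the 2^d cell vertices, which is one natural reading of the description above and not a confirmed detail of PROC LOESS. The function name blend and the vertex_values layout are hypothetical, and the cell is assumed to have nonzero width in every direction.

```python
from itertools import product

def blend(point, lower, upper, vertex_values):
    """Interpolate fitted vertex values at a point inside one cell.

    vertex_values maps a tuple of 0/1 flags, one per coordinate
    (0 = lower edge, 1 = upper edge), to the fitted value at that vertex.
    """
    # Relative position of the point along each edge, in [0, 1].
    t = [(x - lo) / (hi - lo) for x, lo, hi in zip(point, lower, upper)]
    total = 0.0
    for corner in product((0, 1), repeat=len(point)):
        weight = 1.0
        for flag, frac in zip(corner, t):
            weight *= frac if flag else 1.0 - frac
        total += weight * vertex_values[corner]
    return total

# One predictor: halfway between vertex fits of 10 and 20.
print(blend((2.5,), (2.0,), (3.0,), {(0,): 10.0, (1,): 20.0}))  # prints 15.0
```

At a vertex itself, the weights reduce to 1 for that vertex and 0 for the others, so the blended value equals the local fit there.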

While the details of the kd tree and the fitted values at the vertices of the kd tree are implementation details that seldom need to be examined, PROC LOESS does provide options for their display. Each kd tree subdivision of the data used by PROC LOESS is placed in a kdTree table. The predicted values at the vertices of each kd tree are placed in a PredAtVertices table. These tables can be optionally displayed or placed in output data sets (as described in the section "Output Data Sets"), or both.


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.