The LOESS Procedure

MODEL Statement

     

MODEL dependents=independent variables < / options > ;
The MODEL statement names the dependent variables and the independent variables. Variables specified in the MODEL statement must be numeric variables in the data set being analyzed.
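As a minimal sketch of the syntax above, the following step fits one dependent variable on two regressors. The data set name work.sample and the variable names y, x1, and x2 are hypothetical placeholders, not names taken from this chapter.

```sas
/* Hypothetical data set and variables; all three variables must be numeric */
proc loess data=work.sample;
   /* dependent on the left of "=", independents on the right, options after "/" */
   model y = x1 x2 / degree=2;
run;
```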

Table 38.1 lists the options available in the MODEL statement.

Table 38.1: MODEL Statement Options

Option         Description

Fitting Parameters
DIRECT         specifies direct fitting at every data point
SMOOTH=        specifies the list of smoothing values
DEGREE=        specifies the degree of local polynomials (1 or 2)
DROPSQUARE=    specifies the variables whose squares are to be dropped from local quadratic polynomials
BUCKET=        specifies the number of points in kd tree buckets
ITERATIONS=    specifies the number of reweighting iterations
DFMETHOD=      specifies the method of computing lookup degrees of freedom

Residuals and Confidence Limits
ALL            requests the following options: CLM, RESIDUAL, SCALEDINDEP, STD, and T
CLM            displays 100(1-α)% confidence limits for the mean predicted value
RESIDUAL       displays residual statistics
STD            displays estimated prediction standard deviation
T              displays t statistics

Display Options
DETAILS        specifies which tables are to be displayed

Other Options
ALPHA=         sets the significance level for confidence intervals
SCALE=         specifies the method used to scale the regressor variables
SCALEDINDEP    displays scaled independent variable coordinates


The following options are available in the MODEL statement after a slash (/).

ALL
requests all these options: CLM, RESIDUAL, SCALEDINDEP, STD, and T.

ALPHA=number
sets the significance level used for the construction of confidence intervals for the current MODEL statement. The value must be between 0 and 1; the default value of 0.05 results in 95% intervals.

BUCKET=number
specifies the maximum number of points in the leaf nodes of the kd tree. The default value used is s*n/5, where s is a smoothing parameter specified using the SMOOTH= option and n is the number of observations being used in the current BY group. The BUCKET= option is ignored if the DIRECT option is specified.
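To make the default concrete: with a smoothing parameter of 0.5 and 200 observations in the BY group, the formula above gives 0.5*200/5 = 20 points per leaf. The sketch below overrides that default with a smaller bucket; the data set work.sample, its assumed 200 observations, and the variable names are hypothetical.

```sas
/* Hypothetical data set assumed to hold 200 observations.            */
/* With smooth=0.5 the default bucket size would be 0.5*200/5 = 20;   */
/* bucket=10 asks for at most 10 points in each kd tree leaf instead. */
proc loess data=work.sample;
   model y = x1 x2 / smooth=0.5 bucket=10;
run;
```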

CLM
requests that 100(1-α)% confidence limits on the mean predicted value be added to the "Output Statistics" table. By default, 95% limits are computed; the ALPHA= option in the MODEL statement can be used to change the α-level. The use of this option implicitly selects the model option DFMETHOD=EXACT if the DFMETHOD= option has not been explicitly used.
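For instance, combining CLM with ALPHA=0.1 gives 100(1-0.1)% = 90% limits. This is only a sketch; the data set and variable names are hypothetical, and the smoothing value is an arbitrary illustration.

```sas
/* Hypothetical data set and variables.                               */
/* clm requests confidence limits; alpha=0.1 makes them 90% limits.   */
/* DFMETHOD= is not given, so CLM implicitly selects DFMETHOD=EXACT.  */
proc loess data=work.sample;
   model y = x / smooth=0.3 clm alpha=0.1;
run;
```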

DEGREE= 1 | 2
sets the degree of the local polynomials to use for each local regression. The valid values are 1 for local linear fitting or 2 for local quadratic fitting, with 1 being the default.

DETAILS < ( tables ) >
selects which tables to display, where tables is one or more of kdTree (or TREE), PredAtVertices (or FITPOINTS), and OutputStatistics (or STATOUT). A specification of kdTree outputs the kd tree structure, PredAtVertices outputs fitted values and coordinates of the kd tree vertices where the local least squares fitting is done, and OutputStatistics outputs the predicted values and other requested statistics at the points in the input data set. The kdTree and PredAtVertices specifications are ignored if the DIRECT option is specified in the MODEL statement. Specifying the option DETAILS with no qualifying list outputs all tables.
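A sketch of the table list, assuming the table names inside the parentheses are separated by blanks; the data set and variable names are hypothetical.

```sas
/* Hypothetical data set and variables.                             */
/* Request the kd tree structure and the "Output Statistics" table, */
/* but not the fitted values at the kd tree vertices.               */
proc loess data=work.sample;
   model y = x1 x2 / smooth=0.5 details(kdTree OutputStatistics);
run;
```

Writing details with no parenthesized list would request all three tables.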

DFMETHOD= NONE | EXACT
specifies the method used to calculate the "lookup" degrees of freedom used in performing statistical inference. The default is DFMETHOD=NONE. Approximate methods for computing the "lookup" degrees of freedom are not currently supported. The use of any of the MODEL statement options ALL, CLM, or T, or of the CLM option in any SCORE statement, implicitly selects the DFMETHOD=EXACT option.

DIRECT
specifies that local least squares fits are to be done at every point in the input data set. When the DIRECT option is not specified, a computationally faster method is used. This faster method performs local fitting at the vertices of a kd tree decomposition of the predictor space, followed by blending of the local polynomials to obtain a regression surface.
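The two fitting strategies can be compared by running the same model with and without DIRECT, as in this sketch. The data set and variable names are hypothetical.

```sas
/* Hypothetical data set and variables. */

/* Default: fit at kd tree vertices, then blend the local polynomials */
proc loess data=work.sample;
   model y = x1 x2 / smooth=0.5;
run;

/* DIRECT: a local least squares fit at every input point;  */
/* BUCKET= and the kd tree tables are ignored in this case. */
proc loess data=work.sample;
   model y = x1 x2 / smooth=0.5 direct;
run;
```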

DROPSQUARE=(variables)
specifies the quadratic monomials to exclude from the local quadratic fits. This option is ignored unless the DEGREE=2 option has been specified. For example,

   model z=x y / degree=2 dropsquare=(y);

uses the monomials 1, x, y, x², and xy in performing the local fitting; the squared term y² is dropped.

ITERATIONS=number
specifies the number of iterative reweighting steps to be done. Such iterations are appropriate when there are outliers in the data or when the error distribution is a symmetric long-tailed distribution. The default number of iterations is 1.
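When outliers are suspected, a larger iteration count can be requested as sketched below. The value 4 is purely illustrative, and the data set and variable names are hypothetical.

```sas
/* Hypothetical data set and variables.                           */
/* iterations=4 is an arbitrary illustration, not a recommended   */
/* setting; the default is iterations=1.                          */
proc loess data=work.sample;
   model y = x / smooth=0.3 iterations=4;
run;
```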

RESIDUAL | R
specifies that residuals are to be included in the "Output Statistics" table.

SCALE= NONE | SD < (number) >
specifies the method used to scale the regressors. The default is NONE, in which case no scaling is applied. A specification of SD(number) indicates that a trimmed standard deviation is to be used as the measure of scale, where number is the trimming fraction. A specification of SD with no qualification defaults to a 10% trimmed standard deviation.

SCALEDINDEP
specifies that scaled regressor coordinates be included in the output tables. This option is ignored if the SCALE= model option is not used or if SCALE=NONE is specified.
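SCALE= and SCALEDINDEP are typically used together, as in the following sketch. It assumes the trimming fraction is written as a proportion, so that SD(0.2) would mean a 20% trimmed standard deviation; the data set and variable names are hypothetical.

```sas
/* Hypothetical data set and variables.                              */
/* Assumption: the trimming fraction is a proportion (0.2 = 20%).    */
/* scaledindep adds the scaled regressor coordinates to the output;  */
/* it would be ignored without scale=sd.                             */
proc loess data=work.sample;
   model y = x1 x2 / smooth=0.5 scale=sd(0.2) scaledindep
                     details(OutputStatistics);
run;
```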

SMOOTH=value-list
specifies a list of positive smoothing parameter values. A separate fit is obtained for each smoothing value specified.
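A sketch of a value list, assuming the values are separated by blanks; each value produces its own fit. The data set and variable names are hypothetical.

```sas
/* Hypothetical data set and variables.                      */
/* Three smoothing values, so three separate fits of y on x. */
proc loess data=work.sample;
   model y = x / smooth=0.2 0.4 0.6;
run;
```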

STD
specifies that standardized errors are to be included in the "Output Statistics" table.

T
specifies that t statistics are to be included in the "Output Statistics" table.
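According to the description of ALL above, the two MODEL statements in this sketch request the same set of statistics. The data set and variable names are hypothetical.

```sas
/* Hypothetical data set and variables. */

/* Each statistic requested individually */
proc loess data=work.sample;
   model y = x / smooth=0.3 clm residual scaledindep std t;
run;

/* The same request written with ALL. SCALEDINDEP is ignored */
/* in both steps because SCALE= is not specified.            */
proc loess data=work.sample;
   model y = x / smooth=0.3 all;
run;
```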


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.