Example 64.1: Partial Spline Model Fit

The TPSPLINE Procedure

This example analyzes the data set Measure, which was introduced in the "Getting Started" section. That analysis suggested that the final estimated surface can be represented by a quadratic function of one or both of the independent variables. This example shows how to use PROC TPSPLINE to fit a partial spline model to the Measure data, using the following model:

f(x₁, x₂) = 1 + x₁ + x₁² + h(x₂)

The model has a parametric component (associated with the x₁ variable) and a nonparametric component (associated with the x₂ variable). The following statements create the squared term x1sq, build a prediction grid, and fit the partial spline model:

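   /* Add the squared term used by the parametric part of the model */
   data Measure;
      set Measure;
      x1sq = x1*x1;
   run;

   /* Build a grid over [-1,1] x [-1,1] on which to score the fit */
   data pred;
      do x1=-1 to 1 by 0.1;
         do x2=-1 to 1 by 0.1;
            x1sq = x1*x1;
            output;
         end;
      end;
   run;

   /* x1 and x1sq are regression variables; x2, in parentheses, is the smoothing variable */
   proc tpspline data=Measure;
      model y = x1 x1sq (x2);
      score data = pred
            out  = predy;
   run;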

Output 64.1.1 displays the results from these statements.

Output 64.1.1: Output from PROC TPSPLINE

The TPSPLINE Procedure

Dependent Variable: y

Summary of Input Data Set
Number of Non-Missing Observations	50
Number of Missing Observations	0
Unique Smoothing Design Points	5

Summary of Final Model
Number of Regression Variables	2
Number of Smoothing Variables	1
Order of Derivative in the Penalty	2
Dimension of Polynomial Space	4

Summary Statistics of Final Estimation
log10(n*Lambda)	-2.237410
Smoothing Penalty	205.346097
Residual SS	8.582131
Tr(I-A)	43.153394
Model DF	6.846606
Standard Deviation	0.445954

As Output 64.1.1 shows, there are five unique design points for the smoothing variable x2 and two regression variables in the model (x1, x1sq). The dimension of the null space (polynomial space) is 4. The standard deviation of the estimate is much larger than the one from the model with both x1 and x2 as smoothing variables (0.445954 compared to 0.098421). One possible explanation is that the smoothing variable has too few unique design points to support an accurate estimate of h(x2).
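
The summary statistics in Output 64.1.1 can be cross-checked by hand. The following DATA step is a minimal sketch, assuming the usual smoothing spline relations Model DF = n - Tr(I-A) and Standard Deviation = sqrt(Residual SS / Tr(I-A)). The variable names (n, rss, tr_ia, model_df, sd) are chosen here for illustration and are not produced by the procedure. Because the inputs are the rounded values printed in the output, the results should agree with the reported Model DF and Standard Deviation only up to rounding.

```sas
   /* Illustrative check of the printed summary statistics.       */
   /* Assumes: Model DF = n - Tr(I-A), SD = sqrt(RSS / Tr(I-A)).   */
   data _null_;
      n     = 50;          /* Number of Non-Missing Observations */
      rss   = 8.582131;    /* Residual SS                        */
      tr_ia = 43.153394;   /* Tr(I-A)                            */

      model_df = n - tr_ia;        /* compare with Model DF           */
      sd       = sqrt(rss/tr_ia);  /* compare with Standard Deviation */

      put model_df= 10.6 sd= 10.6; /* write both values to the SAS log */
   run;
```

The same relations explain why the standard deviation depends on both the residual sum of squares and the effective degrees of freedom of the fit, not on the residual sum of squares alone.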

The following statements produce a surface plot for the partial spline model:

   title 'Plot of Fitted Surface on a Fine Grid';

   proc g3d data=predy;
      plot x2*x1=p_y/grid
                     zmin=9
                     zmax=21
                     zticknum=4;
   run;

The surface displayed in Output 64.1.2 is similar to the one estimated by using the full nonparametric model (displayed in Figure 64.5).

Output 64.1.2: Plot of TPSPLINE Fit from the Partial Spline Model
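
To compare the two surfaces directly, the full nonparametric model can be scored on the same grid. The following statements are a sketch only: they rely on the procedure's default choice of smoothing parameter, which may differ from the settings that produced Figure 64.5, and the output data set name predfull is hypothetical. The predicted variable is assumed to be named P_y, as in the preceding PROC G3D step.

```sas
   /* Sketch: fit x1 and x2 jointly as smoothing variables and    */
   /* score the fit on the grid in pred (predfull is hypothetical) */
   proc tpspline data=Measure;
      model y = (x1 x2);
      score data = pred
            out  = predfull;
   run;

   title 'Full Nonparametric Fit on the Same Grid';

   /* Same axis settings as the partial spline plot, for comparison */
   proc g3d data=predfull;
      plot x2*x1=P_y/grid
                     zmin=9
                     zmax=21
                     zticknum=4;
   run;
```

Using the same zmin, zmax, and zticknum values in both plots keeps the vertical scales identical, so differences between the two surfaces are not an artifact of the axis range.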
