The STATESPACE Procedure 
The STATESPACE procedure is designed to automatically select the best state space model for forecasting the series. You can specify your own model if you wish, and you can use the output from PROC STATESPACE to help you identify a state space model. However, the easiest way to use PROC STATESPACE is to let it choose the model.
The series shown in Figure 18.1 are nonstationary. In order to forecast X and Y with a state space model, you must difference them (or use some other detrending method). If you fail to difference when needed and try to use PROC STATESPACE with nonstationary data, an inappropriate state space model may be selected, and the model estimation may fail to converge.
The following statements identify and fit a state space model for the first differences of X and Y, and forecast X and Y 10 periods ahead:
   proc statespace data=in out=out lead=10;
      var x(1) y(1);
      id t;
   run;
The DATA= option specifies the input data set and the OUT= option specifies the output data set for the forecasts. The LEAD= option specifies forecasting 10 observations past the end of the input data. The VAR statement specifies the variables to forecast and specifies differencing. The notation X(1) Y(1) specifies that the state space model analyzes the first differences of X and Y.
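To see what the X(1) Y(1) notation amounts to, the first differences can be computed by hand in a DATA step with the DIF function and then summarized. This is only a sketch: the data set name DIFFS and the variable names DX and DY are hypothetical, and the standard deviations from PROC MEANS may differ slightly from those printed by PROC STATESPACE if the two procedures use different divisors.

```sas
/* Sketch: first differences computed by hand.               */
/* DIFFS, DX, and DY are hypothetical names for illustration. */
data diffs;
   set in;
   dx = dif(x);   /* X(t) - X(t-1); missing for the first observation */
   dy = dif(y);   /* Y(t) - Y(t-1); missing for the first observation */
run;

/* Number of nonmissing values, mean, and standard deviation */
/* of the differenced series.                                 */
proc means data=diffs n mean std;
   var dx dy;
run;
```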
Descriptive statistics are printed first, giving the number of nonmissing observations after differencing, and the sample means and standard deviations of the differenced series. The sample means are subtracted before the series are modeled (unless the NOCENTER option is specified), and the sample means are added back when the forecasts are produced.
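If the differenced series are already known to have zero mean, the centering step can be turned off with the NOCENTER option mentioned above. A minimal sketch, reusing the data set and variable names from the earlier example:

```sas
/* Sketch: same model request, but without subtracting the   */
/* sample means of the differenced series before modeling.   */
proc statespace data=in out=out lead=10 nocenter;
   var x(1) y(1);
   id t;
run;
```

With NOCENTER, no mean is added back when the forecasts are produced, so the option is appropriate only when a zero mean is a reasonable assumption for the differenced data.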
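Let X_{t} and Y_{t} be the observed values of X and Y, and let x_{t} and y_{t} be the values of X and Y after differencing and subtracting the mean difference. The series x_{t} and y_{t} modeled by the STATESPACE procedure are
   x_{t} = (1 - B) X_{t} - \mu_{x}
   y_{t} = (1 - B) Y_{t} - \mu_{y}
where B represents the backshift operator, and \mu_{x} and \mu_{y} stand for the sample means of the differenced series reported in the descriptive statistics.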
After the descriptive statistics, PROC STATESPACE prints the Akaike information criterion (AIC) values for the autoregressive models fit to the series. The smallest AIC value, in this case 5.517 at lag 2, determines the number of autocovariance matrices analyzed in the canonical correlation phase.
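The range of autoregressive orders examined in this step can be limited with the ARMAX= option of the PROC STATESPACE statement. The following is a sketch only; the value 5 is an arbitrary illustration, not a recommendation for these data:

```sas
/* Sketch: consider preliminary autoregressive models up to  */
/* order 5. ARMAX=5 is an illustrative value.                */
proc statespace data=in out=out lead=10 armax=5;
   var x(1) y(1);
   id t;
run;
```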
A schematic representation of the autocorrelations is printed next. This indicates which elements of the autocorrelation matrices at different lags are significantly greater or less than 0.
The second page of the STATESPACE printed output is shown in Figure 18.3.

Figure 18.3 shows a schematic representation of the partial autocorrelations, similar to the autocorrelations shown in Figure 18.2. The selection of a second order autoregressive model by the AIC statistic looks reasonable in this case because the partial autocorrelations for lags greater than 2 are not significant.
Next, the Yule-Walker estimates for the selected autoregressive model are printed. This output shows the coefficient matrices of the vector autoregressive model at each lag.
Once the state vector is selected, the state space model is estimated by approximate maximum likelihood. Information from the canonical correlation analysis and from the preliminary autoregression is used to form preliminary estimates of the state space model parameters. These preliminary estimates are used as starting values for the iterative estimation process.
The form of the state vector and the preliminary estimates are printed next, as shown in Figure 18.4.

Figure 18.4 first prints the state vector as X[T;T] Y[T;T] X[T+1;T]. This notation indicates that the state vector is
   (x_{t|t}, y_{t|t}, x_{t+1|t})'
The notation x_{t+1|t} indicates the conditional expectation or prediction of x_{t+1} based on the information available at time t, and x_{t|t} and y_{t|t} are x_{t} and y_{t}, respectively.
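The same state vector can be requested explicitly rather than selected automatically. A minimal sketch using the FORM statement, which gives the number of times each variable appears in the state vector (here X twice and Y once, matching X[T;T] Y[T;T] X[T+1;T]):

```sas
/* Sketch: specify the form of the state vector directly.    */
/* X enters twice (x[t|t] and x[t+1|t]); Y enters once.      */
proc statespace data=in out=out lead=10;
   var x(1) y(1);
   id t;
   form x 2 y 1;
run;
```

Specifying the form this way bypasses the automatic selection of the state vector, so it is most useful once the automatic run has suggested a candidate model.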
The remainder of Figure 18.4 shows the preliminary estimates of the transition matrix F, the input matrix G, and the covariance matrix of the innovations, \Sigma_{ee}.

The estimated state space model is shown in Figure 18.5.
The next page of the STATESPACE output lists the estimates of the free parameters in the F and G matrices with standard errors and t statistics, as shown in Figure 18.6.

If you encounter convergence problems, you should recheck the stationarity of the data and ensure that the specified differencing orders are correct. Attempting to fit state space models to nonstationary data is a common cause of convergence failure. You can also use the MAXIT= option to increase the number of iterations allowed, or experiment with the convergence tolerance options DETTOL= and PARMTOL=.
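As an illustration of these options, the following sketch raises the iteration limit and tightens the parameter tolerance. The specific values are assumptions chosen for the example, not recommended settings:

```sas
/* Sketch: allow more iterations and a tighter convergence   */
/* tolerance. The values 100 and 0.0001 are illustrative.    */
proc statespace data=in out=out lead=10
                maxit=100 parmtol=0.0001;
   var x(1) y(1);
   id t;
run;
```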
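The following statements print the output data set. The WHERE statement restricts the listing to observations with T greater than 190:
   proc print data=out;
      id t;
      where t > 190;
   run;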
The PROC PRINT output is shown in Figure 18.7.

The OUT= data set produced by PROC STATESPACE contains the VAR and ID statement variables. In addition, for each VAR statement variable, the OUT= data set contains the variables FORi, RESi, and STDi. These variables contain the predicted values, residuals, and forecast standard errors for the ith variable in the VAR statement list. In this case, X is listed first in the VAR statement, so FOR1 contains the forecasts of X, while FOR2 contains the forecasts of Y.
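The FORi and STDi variables can be combined to form approximate forecast limits. The sketch below makes several assumptions: the input series ends at T=200 (suggested by the reference line at 200.5 in the plot that follows), forecast errors are roughly normal so that 1.96 standard errors give approximate 95% limits, and the names LIMITS, LOWERi, and UPPERi are hypothetical.

```sas
/* Sketch: approximate 95% limits for the forecasts of X and Y. */
/* Assumes the forecast period is T > 200 and that forecast     */
/* errors are approximately normal. LIMITS, LOWER1, UPPER1,     */
/* LOWER2, and UPPER2 are hypothetical names.                   */
data limits;
   set out;
   where t > 200;
   lower1 = for1 - 1.96*std1;   /* limits for X */
   upper1 = for1 + 1.96*std1;
   lower2 = for2 - 1.96*std2;   /* limits for Y */
   upper2 = for2 + 1.96*std2;
   keep t for1 lower1 upper1 for2 lower2 upper2;
run;

proc print data=limits;
   id t;
run;
```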
The following statements plot the forecasts and actuals for the series.
   proc gplot data=out;
      plot for1*t=1 for2*t=1 x*t=2 y*t=2 / overlay href=200.5;
      symbol1 v=circle i=join;
      symbol2 v=star i=none;
      where t > 150;
   run;
The forecast plot is shown in Figure 18.8. The last 50 observations are also plotted to provide context, and a reference line is drawn between the historical and forecast periods. The actual values are plotted with asterisks.
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.