
Forecasting: an introduction

Given data $X_0,\ldots,X_{T-1}$
our goal will be to guess, or
forecast, *X*_{T} or more generally *X*_{T+r}. There are a variety
of *ad hoc* methods as well as a variety of statistically
derived methods. I illustrate the *ad hoc* methods with the
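exponentially weighted moving average (EWMA). In this case
we simply take

$$\hat X_T = c\sum_{j=0}^{T-1}\lambda^j X_{T-1-j}$$

where $0<\lambda<1$ is a smoothing constant and $c=(1-\lambda)/(1-\lambda^T)$ makes the weights sum to 1.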
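A minimal sketch of an EWMA forecast, assuming weights proportional to $\lambda^j$ on the observation $j$ steps back and normalized to sum to 1. The function name `ewma_forecast` and the argument name `lam` are illustrative, not taken from any package.

```python
def ewma_forecast(x, lam):
    """One-step EWMA forecast of the next value from x[0], ..., x[T-1].

    Assumed form: weight lam**j on x[T-1-j], normalized so that the
    weights sum to 1; lam in (0, 1) is a user-chosen smoothing constant.
    """
    if not x:
        raise ValueError("need at least one observation")
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    weights = [lam ** j for j in range(len(x))]   # weight for x[T-1-j]
    total = sum(weights)
    recent_first = reversed(x)                    # x[T-1], x[T-2], ...
    return sum(w * v for w, v in zip(weights, recent_first)) / total
```

Values of `lam` near 0 make the forecast track the most recent observation; values near 1 make it close to the plain average of the data.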

Statistically based methods concentrate on some measure of the size of
the forecast error $\hat X_T - X_T$;
the **mean squared prediction error**
$E[(\hat X_T - X_T)^2]$ is
the most common.

In general $\hat X_T$
must be some function
$f(X_0,\ldots,X_{T-1})$ of the data.
The mean squared prediction error can be seen, by conditioning
on the data, to be minimized by

$$\hat X_T = E(X_T \mid X_0,\ldots,X_{T-1}).$$
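
The conditioning argument can be sketched as follows, assuming finite second moments and writing $D$ for the data $X_0,\ldots,X_{T-1}$ (the symbol $D$ is introduced here only for brevity). For any function $f(D)$,

$$E\left[(X_T - f(D))^2 \mid D\right] = \mathrm{Var}(X_T \mid D) + \left(E(X_T \mid D) - f(D)\right)^2 .$$

The first term does not involve $f$ and the second is zero exactly when $f(D) = E(X_T \mid D)$. Taking expectations over $D$ shows that no other function of the data has smaller mean squared prediction error.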

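For most distributions of the *X*'s this conditional expectation is hard to compute. For Gaussian series (taking the mean to be 0) it is linear in the data:

$$E(X_T \mid X_0,\ldots,X_{T-1}) = \sum_{t=0}^{T-1} A_t X_t$$

where the coefficient vector $A=(A_0,\ldots,A_{T-1})^T$ solves the $T\times T$ linear system $\mathrm{Var}(\mathbf{X})\,A = \mathrm{Cov}(\mathbf{X}, X_T)$ with $\mathbf{X}=(X_0,\ldots,X_{T-1})^T$.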

When *T* is large the computation of these forecasts is difficult
in general. There are some shortcuts, however.

**Forecasting AR(***p***) processes**

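When the process is an AR the computation of the conditional
expectation is easier. Writing the AR(*p*) as $X_t = \sum_{i=1}^p a_i X_{t-i} + \epsilon_t$,

$$\hat X_T = E(X_T \mid X_0,\ldots,X_{T-1}) = \sum_{i=1}^p a_i X_{T-i}.$$

For $r > 0$

$$\hat X_{T+r} = \sum_{i=1}^p a_i \hat X_{T+r-i}, \qquad \hat X_t = X_t \text{ for } t < T.$$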

Notice that the forecast into the future uses current values
where these are available and forecasts already calculated
for the other *X*'s.
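
The recursion just described can be sketched in a few lines. This assumes a mean-zero AR(*p*) with known coefficients; the function name `ar_forecast` and its arguments are hypothetical.

```python
def ar_forecast(x, a, steps):
    """Forecast X_T, ..., X_{T+steps-1} for a mean-zero AR(p).

    x : observed values X_0, ..., X_{T-1} (at least p of them)
    a : coefficients [a_1, ..., a_p] in X_t = sum_i a_i X_{t-i} + eps_t,
        treated as known (an assumption; in practice they are estimated)
    """
    p = len(a)
    if len(x) < p:
        raise ValueError("need at least p observations")
    history = list(x)            # observed values, then forecasts appended
    for _ in range(steps):
        # a_1 multiplies the most recent value, a_2 the one before, etc.
        nxt = sum(a[i] * history[-1 - i] for i in range(p))
        history.append(nxt)
    return history[len(x):]
```

For example, `ar_forecast([1.0, 2.0], [0.5], 3)` uses the observed value 2.0 for the first forecast and its own earlier forecasts after that.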

**Forecasting ARMA(***p*,*q***) processes**

An ARMA(*p*,*q*) can be inverted to be an infinite order AR
process. We could then use the method just given for the
AR except that now the formula actually mentions values of *X*_{t} for *t* < 0. In practice we simply truncate the series and
ignore the missing terms in the forecast, assuming that the
coefficients of these omitted terms are very small. Remember
each term is built up out of a geometric series for
$(I-\alpha B)^{-1}$
with
$|\alpha| < 1$.

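A more direct method goes like this. Writing the ARMA(*p*,*q*) as $X_t = \sum_{i=1}^p a_i X_{t-i} + \epsilon_t - \sum_{i=1}^q b_i \epsilon_{t-i}$,

$$\hat X_{T+r} = E(X_{T+r} \mid \cdot) = \sum_{i=1}^p a_i E(X_{T+r-i} \mid \cdot) + E(\epsilon_{T+r} \mid \cdot) - \sum_{i=1}^q b_i E(\epsilon_{T+r-i} \mid \cdot)$$

where now the conditioning "$\mid\cdot$" is on the observed data $X_0,\ldots,X_{T-1}$.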

Whenever the time index on an epsilon is *T* or more the conditional
expectations are 0. For *T*+*r*-*i* < *T* we need to guess the value
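of
$\epsilon_{T+r-i}$.
The same recursion, $X_t = \sum_{i=1}^p a_i X_{t-i} + \epsilon_t - \sum_{i=1}^q b_i \epsilon_{t-i}$, can be re-arranged to
help compute
$E(\epsilon_t \mid \cdot)$
for
$0 \le t \le T-1$,
at least
approximately:

$$\hat\epsilon_t = X_t - \sum_{i=1}^p a_i X_{t-i} + \sum_{i=1}^q b_i \hat\epsilon_{t-i}.$$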

This recursion refers back to earlier times, so you have to get it started. Generally we start the recursion by putting

$$\hat\epsilon_t = 0 \quad\text{and}\quad X_t = 0$$

for negative *t*.

As we discussed in the section on estimation these computed
estimates of the epsilon's can be improved by backcasting the
values of *X*_{t}
for negative *t* and then forecasting and
backcasting, etc.
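
A sketch of the direct method without the backcasting refinement. It assumes the sign convention $X_t = \sum_i a_i X_{t-i} + \epsilon_t - \sum_i b_i \epsilon_{t-i}$, known coefficients, mean zero, and the crude start-up in which every term with a negative time index is set to 0. The name `arma_forecast` is hypothetical.

```python
def arma_forecast(x, a, b, steps):
    """Forecast a mean-zero ARMA(p, q), assumed to be written as
    X_t = sum_i a_i X_{t-i} + eps_t - sum_i b_i eps_{t-i}.

    x : observed X_0, ..., X_{T-1};  a, b : known coefficient lists.
    Start-up assumption: X_t and eps_t are taken to be 0 for t < 0.
    """
    p, q, T = len(a), len(b), len(x)
    eps = []                                   # estimated eps_0..eps_{T-1}
    for t in range(T):
        ar = sum(a[i] * x[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(b[i] * eps[t - 1 - i] for i in range(q) if t - 1 - i >= 0)
        eps.append(x[t] - ar + ma)             # re-arranged ARMA equation
    xs = list(x)                               # data, then forecasts
    for r in range(steps):
        t = T + r
        ar = sum(a[i] * xs[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        # eps with index >= T are forecast as 0; earlier ones use estimates
        ma = sum(b[i] * eps[t - 1 - i]
                 for i in range(q) if 0 <= t - 1 - i < T)
        xs.append(ar - ma)
    return xs[T:]
```

With `b = []` this reduces to the pure AR recursion. With `a = []` the forecasts are 0 once the horizon exceeds *q* steps, as expected for a moving average.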

**Forecasting ARIMA(***p*,*d*,*q***) series**

If
*Z*=(*I*-*B*)^{d} *X* and *X* is ARIMA(*p*,*d*,*q*) then we
compute *Z*, forecast *Z* and reconstruct *X* by
undoing the differencing. For *d*=1 for example we
just have

$$\hat X_{T+r} = \hat X_{T+r-1} + \hat Z_{T+r}, \qquad \hat X_{T-1} = X_{T-1}.$$
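
For *d*=1, undoing the differencing amounts to a running sum started at the last observed value. A small sketch, with the hypothetical name `undifference_forecasts`, assuming the forecasts of *Z* have already been computed by some other means:

```python
def undifference_forecasts(last_x, z_forecasts):
    """Turn forecasts of Z_t = X_t - X_{t-1} into forecasts of X.

    last_x      : the last observed value X_{T-1}
    z_forecasts : forecasts of Z_T, Z_{T+1}, ... (assumed given)
    """
    out, level = [], last_x
    for z in z_forecasts:
        level = level + z     # X-hat_{T+r} = X-hat_{T+r-1} + Z-hat_{T+r}
        out.append(level)
    return out
```

For *d* > 1 the same function could be applied *d* times, each time starting from the last observed value of the appropriately differenced series.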

**Forecast standard errors**

You should remind yourself that the computations of conditional
expectations we have just made used the fact that the *a*'s and
*b*'s are constants - the true parameter values. In fact we
then replace the parameter values with estimates. The quality of
our forecasts will be summarized by the forecast standard error:

$$\sqrt{E\left[(X_{T+r}-\hat X_{T+r})^2\right]}.$$

We will compute this ignoring the estimation of the parameters and then discuss how much that might have cost us.

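If $\hat X_{T+r} = E(X_{T+r} \mid X_0,\ldots,X_{T-1})$ then $E(X_{T+r}-\hat X_{T+r}) = 0$, so that our forecast standard error is just the standard deviation of the forecast error $X_{T+r}-\hat X_{T+r}$.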

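Consider first the case of an AR(1), $X_t = aX_{t-1}+\epsilon_t$ with $\mathrm{Var}(\epsilon_t)=\sigma^2$, and one step ahead forecasting:

$$X_T - \hat X_T = X_T - aX_{T-1} = \epsilon_T.$$

The variance of this forecast error is $\sigma^2$ so that the forecast standard error is just $\sigma$.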

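For forecasts further ahead in time we have

$$\hat X_{T+r} = a\hat X_{T+r-1}$$

and

$$X_{T+r} = aX_{T+r-1} + \epsilon_{T+r}.$$

Subtracting we see that

$$X_{T+r}-\hat X_{T+r} = a(X_{T+r-1}-\hat X_{T+r-1}) + \epsilon_{T+r}, \qquad \mathrm{Var}(X_{T+r}-\hat X_{T+r}) = a^2\,\mathrm{Var}(X_{T+r-1}-\hat X_{T+r-1}) + \sigma^2$$

so that we may calculate forecast standard errors recursively. As $r\to\infty$ we can check that the forecast variance converges to

$$\sigma^2/(1-a^2)$$

which is simply the variance of individual *X*'s.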
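
The recursive calculation can be sketched as below, assuming a stationary AR(1) with known coefficient `a` (so that |a| < 1) and known innovation standard deviation `sigma`. The function name is illustrative only.

```python
import math

def ar1_forecast_se(a, sigma, max_r):
    """Forecast standard errors for X_{T+r}, r = 0, ..., max_r, in an AR(1).

    Uses v_r = a**2 * v_{r-1} + sigma**2 with v_0 = sigma**2; r = 0 is the
    one-step-ahead forecast because the data end at X_{T-1}.
    Parameter estimation error is ignored, as in the text.
    """
    v = sigma ** 2
    ses = [math.sqrt(v)]
    for _ in range(max_r):
        v = a * a * v + sigma ** 2
        ses.append(math.sqrt(v))
    return ses
```

For large `max_r` the last entries settle down near `sigma / math.sqrt(1 - a**2)`, the standard deviation of an individual observation.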

Turn now to a general ARMA(*p*,*q*). Rewrite the process as the infinite
order AR

$$X_t = \sum_{s=1}^\infty c_s X_{t-s} + \epsilon_t$$

to see that again, ignoring the truncation of the infinite sum in the forecast, we have

$$X_T - \hat X_T = \epsilon_T$$

so that the one step ahead forecast standard error is again $\sigma$.

Parallel to the AR(1) argument we see that

$$X_{T+r}-\hat X_{T+r} = \sum_{s=1}^{r} c_s (X_{T+r-s}-\hat X_{T+r-s}) + \epsilon_{T+r}.$$

The errors on the right hand side are not independent of one another so that computation of the variance requires either computation of the covariances or recognition of the fact that the right hand side is a linear combination of $\epsilon_T,\ldots,\epsilon_{T+r}$.

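A simpler approach is to write the process as an infinite order MA:

$$X_t = \sum_{s=0}^\infty \psi_s \epsilon_{t-s}$$

for suitable coefficients $\psi_s$ with $\psi_0=1$. Ignoring truncation again, the forecast is

$$\hat X_{T+r} = \sum_{s=r+1}^\infty \psi_s \epsilon_{T+r-s}$$

and the forecast error is just

$$X_{T+r}-\hat X_{T+r} = \sum_{s=0}^{r} \psi_s \epsilon_{T+r-s}$$

so that the forecast standard error is

$$\sigma\sqrt{\sum_{s=0}^{r}\psi_s^2}.$$

Again as $r\to\infty$ this converges to $\sigma\sqrt{\sum_{s=0}^\infty \psi_s^2}$, the standard deviation of an individual *X*.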
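
A sketch of the infinite order MA route. It assumes the sign convention $X_t = \sum_i a_i X_{t-i} + \epsilon_t - \sum_i b_i \epsilon_{t-i}$, under which matching coefficients gives $\psi_0 = 1$ and $\psi_s = \sum_{i=1}^{\min(s,p)} a_i\psi_{s-i} - b_s$ (with $b_s = 0$ for $s > q$). The function names are hypothetical and the parameters are treated as known.

```python
import math

def psi_weights(a, b, n):
    """Return psi_0, ..., psi_n for the ARMA assumed to be written as
    X_t = sum_i a_i X_{t-i} + eps_t - sum_i b_i eps_{t-i}."""
    psi = [1.0]
    for s in range(1, n + 1):
        val = -b[s - 1] if s <= len(b) else 0.0
        val += sum(a[i - 1] * psi[s - i]
                   for i in range(1, min(s, len(a)) + 1))
        psi.append(val)
    return psi

def arma_forecast_se(a, b, sigma, max_r):
    """Standard errors sigma * sqrt(psi_0^2 + ... + psi_r^2), r = 0..max_r."""
    psi = psi_weights(a, b, max_r)
    ses, total = [], 0.0
    for r in range(max_r + 1):
        total += psi[r] ** 2
        ses.append(sigma * math.sqrt(total))
    return ses
```

For an MA(*q*) the standard errors stop growing after *q* steps, since all later weights are 0.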

Finally consider forecasting the ARIMA(*p*,*d*,*q*) process
(*I*-*B*)^{d} *X*= *W* where *W* is ARMA(*p*,*q*).
The forecast errors in *X* can clearly be written as a linear combination of
forecast errors for *W* permitting the forecast error in *X* to be written as
a linear combination of the underlying errors
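$\epsilon_t$.
As an example consider
first the ARIMA(0,1,0) process
$X_t = X_{t-1} + \epsilon_t$.
The forecast of
$\epsilon_{T+r}$
is just 0 and so the forecast of *X*_{T+r} is just

$$\hat X_{T+r} = \hat X_{T+r-1} = \cdots = X_{T-1}.$$

The forecast error is

$$X_{T+r}-\hat X_{T+r} = \epsilon_T + \cdots + \epsilon_{T+r}$$

whose standard deviation is $\sigma\sqrt{r+1}$. Notice that the forecast standard error grows to infinity as $r\to\infty$. For a general ARIMA(*p*,1,*q*) we have

$$\hat X_{T+r} = \hat X_{T+r-1} + \hat W_{T+r}$$

and

$$X_{T+r}-\hat X_{T+r} = \sum_{j=0}^{r} (W_{T+j}-\hat W_{T+j})$$

which can be combined with the expression above for the forecast error for an ARMA(*p*,*q*) to compute standard errors.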
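
One way to carry out that combination for *d*=1, sketched under the assumption that the infinite order MA weights $\psi_0=1,\psi_1,\ldots$ of *W* are already available. Summing the errors for *W* and collecting terms, the coefficient of $\epsilon_{T+k}$ in the forecast error for $X_{T+r}$ is the partial sum $\psi_0+\cdots+\psi_{r-k}$, so the variance is $\sigma^2$ times the sum of squared partial sums. The function name is hypothetical.

```python
import math

def integrated_forecast_se(psi, sigma):
    """Forecast standard errors for X_{T+r}, r = 0, ..., len(psi)-1, when
    (I - B) X = W and W has MA(infinity) weights psi[0] = 1, psi[1], ...

    Var = sigma^2 * sum_{m=0}^{r} (psi_0 + ... + psi_m)^2.
    Parameter estimation error is ignored, as in the text.
    """
    ses, partial, total = [], 0.0, 0.0
    for w in psi:
        partial += w             # psi_0 + ... + psi_m
        total += partial ** 2
        ses.append(sigma * math.sqrt(total))
    return ses
```

For the random walk, where the weights are 1, 0, 0, ..., every partial sum is 1 and the result reduces to $\sigma\sqrt{r+1}$.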

**Software**

The S-Plus function *arima.forecast* can do the forecasting.

**Comments**

I have ignored the effects of parameter estimation throughout. In ordinary least squares
when we predict the *Y* corresponding to a new *x* we get a forecast standard error
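of

$$\sqrt{\mathrm{Var}(Y - x\hat\beta)} = \sqrt{\mathrm{Var}\left(\epsilon + x(\beta-\hat\beta)\right)}$$

which is

$$\sigma\sqrt{1 + x(X^TX)^{-1}x^T}.$$

The procedure used here corresponds to ignoring the term $x(X^TX)^{-1}x^T$, which is the contribution of estimating $\beta$.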
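
To get a feel for the size of the ignored term, here is a sketch for the special case of a straight-line fit with an intercept, where $x(X^TX)^{-1}x^T$ reduces to $1/n + (x_0-\bar x)^2/\sum_i (x_i-\bar x)^2$. The function name and arguments are illustrative, and $\sigma$ is treated as known.

```python
import math

def ols_prediction_se(xs, x0, sigma):
    """Prediction standard error at a new point x0 for a straight-line
    least squares fit (intercept plus slope) to design points xs.

    Returns (se, extra) where extra = 1/n + (x0 - xbar)^2 / Sxx is the
    term that is dropped when parameter estimation is ignored.
    """
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    extra = 1.0 / n + (x0 - xbar) ** 2 / sxx
    return sigma * math.sqrt(1.0 + extra), extra
```

The extra term shrinks like 1/*n* for a new point near the centre of the design, which suggests why ignoring it costs little when the series is long.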

1999-10-13