Assignment 2 Solutions

Show that the autocorrelation function of the ARMA(1,1) process $X_t = \phi X_{t-1} + \epsilon_t + \theta \epsilon_{t-1}$ is

$$\rho(1) = \frac{(1+\phi\theta)(\phi+\theta)}{1+2\phi\theta+\theta^2}$$

and

$$\rho(h) = \phi^{h-1}\rho(1) \quad \text{for } h \ge 2.$$

Plot the autocorrelation functions for the ARMA(1,1) process above, the AR(1) process with $X_t = \phi X_{t-1} + \epsilon_t$, and the MA(1) process $X_t = \epsilon_t + \theta \epsilon_{t-1}$ on the same plot when $\phi$ and $\theta$ take the assigned values. Compute and plot the partial autocorrelation functions up to lag 30. Comment on the usefulness of these plots in distinguishing the three models. Explain what goes wrong when $\theta$ is close to $-\phi$.
Solution: The most important part of this problem is that when $\theta = -\phi$ the autocorrelation is identically 0. This means that the choice $\theta = -\phi$ gives simply white noise. In general in the ARMA model

$$\phi(B) X_t = \theta(B) \epsilon_t$$

any common root of the polynomials $\phi$ and $\theta$ gives a common factor on both sides of the model equation which can effectively be cancelled. In other words, if $\phi(x) = \theta(x) = 0$ for some particular $x$ then we can write $\phi(B) = (1 - B/x)\phi^*(B)$ for a suitable $\phi^*$ and also $\theta(B) = (1 - B/x)\theta^*(B)$. In the model equation we can cancel the common factor $1 - B/x$ and reduce the model to an ARMA$(p-1, q-1)$.
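To see the cancellation in action, here is a minimal sketch in plain S code (it runs in S-Plus or R; the sample size and the value $\phi = 0.7$ are arbitrary choices of mine, not part of the assignment). It simulates an ARMA(1,1) with $\theta = -\phi$; the sample autocorrelation is flat apart from sampling noise:

n <- 500
phi <- 0.7
eps <- rnorm(n + 1)
x <- numeric(n)
xprev <- 0
for (t in 1:n) {
  x[t] <- phi * xprev + eps[t + 1] - phi * eps[t]   # X_t = phi X_{t-1} + e_t - phi e_{t-1}
  xprev <- x[t]
}
acf(x)   # sample ACF: indistinguishable from white noise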
A second important point is that the autocorrelation of an ARMA(1,1) decreases geometrically, just like that of an AR(1), but the geometric recursion $\rho(h) = \phi\,\rho(h-1)$ only takes hold from lag 2 on.
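The requested plot can be produced directly from these formulas. A minimal sketch in plain S code, using $\phi = 0.9$ and $\theta = 0.5$ as stand-in values (substitute the assigned ones):

phi <- 0.9
theta <- 0.5
h <- 1:30
rho1 <- (1 + phi * theta) * (phi + theta) / (1 + 2 * phi * theta + theta^2)
acf.arma <- rho1 * phi^(h - 1)                    # ARMA(1,1): geometric from lag 1 on
acf.ar <- phi^h                                   # AR(1): rho(h) = phi^h
acf.ma <- c(theta / (1 + theta^2), rep(0, 29))    # MA(1): 0 beyond lag 1
plot(h, acf.arma, type = "b", ylim = range(0, acf.arma, acf.ar), xlab = "lag", ylab = "autocorrelation")
lines(h, acf.ar, type = "b", pch = 2)
lines(h, acf.ma, type = "b", pch = 3)

Note that rho1, and with it the entire ARMA(1,1) autocorrelation function, collapses to 0 when $\theta = -\phi$. In R the exact partial autocorrelations up to lag 30 can be obtained with ARMAacf(ar = phi, ma = theta, lag.max = 30, pacf = TRUE).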
Typing

attach("/home/math/lockhart/teaching/courses/804/datasets")

will make a dataset influenza available. If you type

ls(pos = 2)

you will see the data sets for this question and the next two.
The data consist of monthly counts of influenza cases over a nine-and-a-half-year period. Fit an ARIMA model to the data.
Solution:
I began my analysis of this data set by plotting the data; see Figure 1.
I tried the square root transformation and the logarithmic transformation. The plot of the square roots and the corresponding monthly decomposition are in Figures 3 and 4 respectively. The population has probably been growing roughly exponentially over the interval in question. If influenza rates are a stationary time series then the series $\log X_t - \log P_t$, where $X_t$ is the monthly count and $P_t$ is the population, is stationary. Since $\log P_t$ is a linear function of time, $\log X_t$ will have a linear trend which we could remove by regression. Thus I studied the detrended series

$$Y_t = \log X_t - \hat{a} - \hat{b} t$$

with $\hat{a}$ and $\hat{b}$ estimated using lsfit, i.e. by ordinary least squares, where $t$ is measured in months and $t = 1$ corresponds to January 1965. A time series plot of the residuals is in Figure 6; the monthly decomposition of the residual series is in Figure 7.
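In code the detrending step might look like the following sketch in plain S (runs in S-Plus or R); flu is my assumed name for the influenza series:

y <- log(flu)          # log monthly counts
t <- seq(along = y)    # t = 1 is January 1965
fit <- lsfit(t, y)     # ordinary least squares fit of y = a + b*t
fit$coef               # fitted intercept and slope
r <- fit$residuals     # the detrended series Y_t, plotted in Figure 6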
There is still a strong seasonal component. Two ways suggest themselves to remove the effect: seasonal differencing and subtraction of monthly means. I used the latter. The table of monthly means is in Table 1. A plot of the detrended and deseasonalized series obtained by subtracting these monthly means is in Figure 8.
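Continuing the sketch, the monthly means of Table 1 and the deseasonalized series can be computed from the detrended series r as follows:

month <- ((seq(along = r) - 1) %% 12) + 1   # month index; the series starts in January
mbar <- tapply(r, month, mean)              # the monthly means of Table 1
z <- r - mbar[month]                        # detrended and deseasonalized series, Figure 8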
Call the resulting series $Z_t$. I now started fitting ARMA models to $Z_t$. Plots of the autocorrelation and partial autocorrelation functions are in Figures 9 and 10. They clearly suggest a simple AR(1) model since the partial autocorrelation is essentially 0 at lags over 1 month.
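In S the two plots can be produced with calls along these lines:

acf(z, lag.max = 30)                     # autocorrelations, Figure 9
acf(z, lag.max = 30, type = "partial")   # partial autocorrelations, Figure 10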
Table 1: Monthly means of the detrended logarithm

I fitted the resulting model using arima.mle. The estimated autoregression parameter $\hat\phi$ has a standard error of 0.076, while the residual standard error is 0.38. The series $Z_t$ has mean 0 by construction.
The resulting model fit must now be checked to see if further modelling is necessary. I used the function arima.diag to get diagnostic graphs and residuals. The basic diagnostic plot is in Figure 11.
The autocorrelation function and partial autocorrelation function of the residuals are consistent with white noise as are the values of the portmanteau test P-values given in the bottom frame of Figure 11.
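A sketch of the fitting and diagnostic calls, assuming the classic S-Plus signatures for arima.mle and arima.diag (in R the closest analogues are arima and tsdiag):

fit <- arima.mle(z, model = list(order = c(1, 0, 0)))   # AR(1) by maximum likelihood
arima.diag(fit)   # residual ACF and portmanteau P-values, as in Figure 11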
In summary we have been led to the model equation

$$\log X_t = \hat{a} + \hat{b} t + m_t + Z_t$$

where $m_t$ is the monthly mean given in Table 1 and

$$Z_t = \hat\phi Z_{t-1} + \epsilon_t.$$

The standard deviation of the white noise series $\epsilon_t$ is 0.38.
Solution: My personal favourite model is obtained by taking logs first and then, if $Y_t$ is the series of logarithms, fitting the model

$$Y_t = a + b t + W_t$$

where $W$ is an ARIMA$(1,0,0) \times (1,0,0)_4$ series whose fitted model is

$$(1 - 0.8B)(1 - 0.266B^4) W_t = \epsilon_t$$

with $\epsilon_t$ a white noise series with variance 0.0065.
The fitted values of $a$ and $b$ are -0.675 and 0.0415; SPlus does not provide standard errors for them. The standard errors of the AR coefficients are 0.069 (for the 0.8 value) and 0.11 (for the 0.266 value), so both are significant.
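A sketch of how such a fit might be reproduced in R (the open-source S dialect; S-Plus's arima.mle takes the seasonal part through a slightly different model argument). Here y stands for the log earnings series, and the period-4 seasonal AR term reflects my reading of the fitted model above:

t <- seq(along = y)   # time index, one unit per quarter
fit <- arima(y, order = c(1, 0, 0),
             seasonal = list(order = c(1, 0, 0), period = 4),
             xreg = t)   # fits Y_t = a + b*t + W_t in one step
fit$coef   # AR and seasonal AR coefficients plus intercept a and trend b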
Many students fitted a model with one or more differences taken at lag 4. I found it quite hard to get a really good fit this way. There is an important conceptual difference between fitting the linear trend as I have and taking differences. The differencing method supposes that earnings respond more or less directly to earlier values of earnings whereas I suppose the linear trend to arise as a result of an external driving force pushing the earnings up exponentially. Moreover the differenced models require increasing variance with time; my model is stationary about the linear trend.
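The variance point can be made precise with the simplest differenced model, a random walk: if $(1-B)Y_t = \epsilon_t$ with $\mathrm{Var}(\epsilon_t) = \sigma^2$, then

$$Y_t = Y_0 + \sum_{s=1}^{t} \epsilon_s \qquad \text{so that} \qquad \mathrm{Var}(Y_t \mid Y_0) = t\sigma^2,$$

which grows without bound, whereas a stationary series about a linear trend has constant variance.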
I accepted a wide variety of answers, grading on the basis of how convincing and well reasoned I thought your process was. Generally I prefer to see fitted models presented more or less as above, complete with clear formulas, parameter estimates, and standard errors.
Solution: This is simply an AR(1) and the diagnostics make this very clear. Part of the point here is that for data which genuinely follow an AR(1) model the model selection techniques work pretty well. Real data are not so clean.
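The pattern is easy to reproduce on freshly simulated data (plain S code; the parameter values are illustrative choices of mine):

n <- 200
phi <- 0.6
eps <- rnorm(n)
x <- numeric(n)
xprev <- 0
for (t in 1:n) {   # simulate X_t = phi X_{t-1} + e_t
  x[t] <- phi * xprev + eps[t]
  xprev <- x[t]
}
acf(x, type = "partial")   # cuts off sharply after lag 1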
DUE: Friday, 10 October.