STAT 804: 97-3

Assignment 3

  1. Suppose U is a Uniform[0, 2*pi] random variable and lambda is a fixed frequency. Define

        X_t = cos(lambda t + U).

    Show that X is weakly stationary. (In fact it is strongly stationary, so show that if you can.) Compute the autocorrelation function of X.




    Solution: The mean is E(X_t) = E[cos(lambda t)cos(U) - sin(lambda t)sin(U)] = 0 for every t, since E cos(U) = E sin(U) = 0. For the covariance write

        C(t, t+h) = E[cos(lambda t + U) cos(lambda(t+h) + U)].

    Expand each cosine and you get integrals involving cos^2(U), sin^2(U) and sin(U)cos(U). The latter integrates to 0 while each of the former has expectation 1/2. Thus

        C(t, t+h) = [cos(lambda t)cos(lambda(t+h)) + sin(lambda t)sin(lambda(t+h))]/2

    and again using cos(a - b) = cos(a)cos(b) + sin(a)sin(b) we get

        C(t, t+h) = cos(lambda h)/2,

    which depends only on h. Put h=0 to get C(0) = 1/2 and divide to see that rho(h) = cos(lambda h).

    This shows that X is wide sense stationary.

    If Z_1 and Z_2 are independent standard normals then

        Y_t = Z_1 cos(lambda t) + Z_2 sin(lambda t)

    has mean 0 and covariance

        Cov(Y_t, Y_{t+h}) = cos(lambda t)cos(lambda(t+h)) + sin(lambda t)sin(lambda(t+h)) = cos(lambda h)

    so that Y is also weakly stationary. Since Y is Gaussian it is also strongly stationary. Now write (Z_1, Z_2) in polar co-ordinates, with R and Theta being the polar co-ordinates, so that Z_1 = R cos(Theta) and Z_2 = R sin(Theta). Then you can check that

        Y_t = R cos(lambda t - Theta),

    where Theta is Uniform[0, 2*pi] and independent of R. Since cos(lambda t - Theta) is then a copy of the process X, and R is independent of Theta, conditioning on R in the strong stationarity of Y gives a proof that X is strongly stationary.
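    The calculation above can be checked numerically. The following sketch (Python rather than the S-Plus used elsewhere in these solutions; the frequency lambda = 0.7, the lag h = 3 and the sample size are arbitrary illustrative choices) simulates U and estimates C(0) and rho(h):

```python
import numpy as np

# Monte Carlo check that C(0) = 1/2 and rho(h) = cos(lambda*h)
rng = np.random.default_rng(0)
lam = 0.7
U = rng.uniform(0.0, 2.0 * np.pi, size=200_000)

def X(t):
    # one value of X_t = cos(lam*t + U) for each simulated U
    return np.cos(lam * t + U)

h, t = 3, 5
C0 = np.mean(X(t) * X(t))        # should be near C(0) = 1/2
Ch = np.mean(X(t) * X(t + h))    # should be near cos(lam*h)/2
rho = Ch / C0                    # should be near cos(lam*h)
```

    The choice t = 5 is immaterial: by the stationarity just proved, any other t gives the same answers up to simulation error.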

  2. Show that X of the previous question satisfies the AR(2) model

        X_t - phi X_{t-1} + X_{t-2} = 0

    for some value of phi. Show that the roots of the characteristic polynomial lie on the boundary of the unit circle in the complex plane. (Hint: show that exp(i lambda) is a root if phi is chosen correctly. Do not spend too much time on this question; the point is to illustrate that AR(2) models can be found whose behaviour is much like a sinusoid.)

    Solution: We have

        X_t - phi X_{t-1} + X_{t-2} = cos(lambda t + U) + cos(lambda(t-2) + U) - phi cos(lambda(t-1) + U)
                                    = [2 cos(lambda) - phi] cos(lambda(t-1) + U)

    which is 0 provided phi = 2 cos(lambda).

    The characteristic polynomial is

        z^2 - phi z + 1 = z^2 - 2 cos(lambda) z + 1

    whose roots are

        z = [phi +/- sqrt(phi^2 - 4)]/2 = cos(lambda) +/- i sin(lambda)

    which is exp(+/- i lambda), whose modulus is 1 for both choices of the sign.
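    Both facts are easy to verify numerically. A Python sketch (the frequency lambda = 1.1 and phase u = 0.3 are arbitrary choices, not from the assignment):

```python
import numpy as np

lam = 1.1                          # arbitrary frequency (assumption)
phi = 2.0 * np.cos(lam)            # the coefficient derived above

# the recursion x_t - phi*x_{t-1} + x_{t-2} = 0 holds for x_t = cos(lam*t + u)
u = 0.3
t = np.arange(12)
x = np.cos(lam * t + u)
resid = x[2:] - phi * x[1:-1] + x[:-2]   # should be numerically 0

# the roots of z^2 - phi*z + 1 have modulus exactly 1
roots = np.roots([1.0, -phi, 1.0])
moduli = np.abs(roots)
```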

  3. For the data set raindiff in the same directory as earlier datasets fit the model

        X_t - mu = phi (X_{t-1} - mu) + epsilon_t

    where the epsilon_t are iid N(0, sigma^2), in each of the following ways:

    1. By full maximum likelihood. (Maximize the joint density of the X's over the parameters mu, phi and sigma.)

    2. By estimating mu by the sample mean and then doing full maximum likelihood for the model

          Y_t = phi Y_{t-1} + epsilon_t

      where Y_t = X_t - Xbar.

    3. By estimating mu by the sample mean and then doing conditional maximum likelihood for the previous model.

    Remarks: I do not want you to use ar or other built-in S time-series functions to do these but you may want to use them to get starting values for estimates. It is probably easier to do the methods in reverse order, using the results as starting points for iterative solutions.

    Solution: The conditional log-likelihood in the 3rd part of this problem is

        ell(phi, sigma) = -(n-1) log(sigma) - sum_{t=2}^n (Y_t - phi Y_{t-1})^2 / (2 sigma^2).

    The resulting score has components

        d ell / d phi   = sum_{t=2}^n Y_{t-1}(Y_t - phi Y_{t-1}) / sigma^2
        d ell / d sigma = -(n-1)/sigma + sum_{t=2}^n (Y_t - phi Y_{t-1})^2 / sigma^3.

    The estimate of phi is

        phi-hat = sum_{t=2}^n Y_t Y_{t-1} / sum_{t=2}^n Y_{t-1}^2

    and that of sigma^2 is

        sigma-hat^2 = sum_{t=2}^n (Y_t - phi-hat Y_{t-1})^2 / (n-1).

    Thus the answers to part 3 are mu-hat = Xbar together with phi-hat and sigma-hat as above.
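    The closed-form part 3 estimates are easy to code directly. A Python sketch (the simulated series with phi = 0.5 and sigma = 1 is an illustrative stand-in for raindiff, which is not reproduced here):

```python
import numpy as np

def ar1_conditional_mle(x):
    # centre by the sample mean (the part 3 estimate of mu), then apply the
    # closed-form conditional MLEs derived above
    y = x - np.mean(x)
    phi = np.sum(y[1:] * y[:-1]) / np.sum(y[:-1] ** 2)
    resid = y[1:] - phi * y[:-1]
    sigma = np.sqrt(np.sum(resid ** 2) / (len(x) - 1))
    return np.mean(x), phi, sigma

# sanity check on simulated AR(1) data with phi = 0.5, sigma = 1 (assumed values)
rng = np.random.default_rng(1)
n = 4000
eps = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + eps[t]
mu_hat, phi_hat, sigma_hat = ar1_conditional_mle(x)
```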

    In part 2 the log-likelihood is augmented by another -log(sigma) term, by (1/2) log(1 - phi^2), and by the added term -(1 - phi^2) Y_1^2 / (2 sigma^2), coming from the stationary N(0, sigma^2/(1 - phi^2)) distribution of Y_1. Writing tau = sigma^2, the two components of the score are now

        d ell / d phi = -phi/(1 - phi^2) + phi Y_1^2 / tau + sum_{t=2}^n Y_{t-1}(Y_t - phi Y_{t-1}) / tau
        d ell / d tau = -n/(2 tau) + [ (1 - phi^2) Y_1^2 + sum_{t=2}^n (Y_t - phi Y_{t-1})^2 ] / (2 tau^2).

    Setting the second of these to 0 gives

        tau-hat(phi) = [ (1 - phi^2) Y_1^2 + sum_{t=2}^n (Y_t - phi Y_{t-1})^2 ] / n.

    To find phi-hat we plug this formula for tau-hat into the log likelihood and obtain the profile likelihood for phi; up to constants, minus the profile log likelihood is

        [ n log(tau-hat(phi)) - log(1 - phi^2) ] / 2,

    where now you must keep in mind that tau-hat depends on phi. I used the following S-Plus functions:

    tau0 <- function(x, phi)
    {
            n <- length(x)
            m <- outer(rep(1, length(phi)), x[-1]) - outer(phi, x[ - n])
            t1 <- m^2 %*% rep(1, n - 1)
            (t1 + (1 - phi^2) * x[1]^2)/n
    }
    ell0 <- function(phi)
    {
            n <- length(x0)
            (n * log(tau0(x0, phi)) - log(1 - phi^2))/2
    }
    I had created the data set x0 <- raindiff - mean(raindiff) ahead of time because this is a simple way to pass the data to the function ell0 which is to be minimized by nlmin.

    For part 2, then, we have mu-hat = Xbar, phi-hat minimizing ell0, and sigma-hat = sqrt(tau0(x0, phi-hat)).
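    The profile-likelihood computation translates directly. A Python sketch (a crude grid search stands in for nlmin; the simulated centred series with phi = 0.6, sigma = 1 is an illustrative stand-in for x0):

```python
import numpy as np

def tau0(y, phi):
    # profile MLE of tau = sigma^2 for fixed phi (exact, stationary-start likelihood)
    n = len(y)
    rss = np.sum((y[1:] - phi * y[:-1]) ** 2)
    return (rss + (1.0 - phi ** 2) * y[0] ** 2) / n

def ell0(phi, y):
    # minus twice the profile log-likelihood in phi, up to constants
    return len(y) * np.log(tau0(y, phi)) - np.log(1.0 - phi ** 2)

# simulate a centred AR(1) series with phi = 0.6, sigma = 1 (assumed values)
rng = np.random.default_rng(2)
n = 2000
eps = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.6 * y[t - 1] + eps[t]
y = y - y.mean()

# grid search over phi in (-1, 1), playing the role of nlmin
grid = np.linspace(-0.99, 0.99, 1981)
phi_hat = grid[np.argmin([ell0(p, y) for p in grid])]
sigma_hat = np.sqrt(tau0(y, phi_hat))
```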

    For part 1 the y's of part 2 are replaced by X_t - mu-hat(phi), and the mu component of the score function can be set equal to 0 to find

        mu-hat(phi) = [ (1 + phi) X_1 + sum_{t=2}^n (X_t - phi X_{t-1}) ] / [ (1 + phi) + (n - 1)(1 - phi) ].

    I used

    tau <- function(x, phi)
    {
            y <- t(outer(x, mu(x, phi), "-"))
            n <- length(x)
            m <- y[, -1] - phi * y[,  - n]
            t1 <- m^2 %*% rep(1, n - 1)
            (t1 + (1 - phi^2) * y[, 1]^2)/n
    }
    ell <- function(phi)
    {
            n <- length(y)
            (n * log(tau(y, phi)) - log(1 - phi^2))/2
    }
    Again the data set is y <- raindiff. Notice the use of the outer function to fill a matrix with the entries X_t - mu-hat(phi), one row for each value of phi.
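    The function mu referenced in tau above is not reproduced in this copy. A sketch of the formula it would compute (in Python; this is a reconstruction obtained by setting the mu component of the exact-likelihood score to zero, as described above, not the original S-Plus code):

```python
import numpy as np

def mu_profile(x, phi):
    # profile MLE of mu for fixed phi in the exact AR(1) likelihood:
    # (1+phi)(x_1 - mu) + sum_{t>=2} (x_t - phi*x_{t-1} - (1-phi)*mu) = 0
    n = len(x)
    num = (1.0 + phi) * x[0] + np.sum(x[1:] - phi * x[:-1])
    den = (1.0 + phi) + (n - 1) * (1.0 - phi)
    return num / den
```

    Note that at phi = 0 the formula reduces to the sample mean, and for a constant series it returns that constant, as it should.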

    For part 1 I get the resulting estimates mu-hat, phi-hat and sigma-hat.

    Note on S-Plus: I used the following function to call nlmin and to make the data set available to the functions ell and ell0:

    estim.ar1 <- function(x)
    {
            n <- length(x)
            m <- mean(x)
            assign("y", x, 1)
            assign("x0", x - m, 1)
            phi.hat.3 <- sum(x0[-1] * x0[ - n])/sum(x0[ - n]^2)
            sigma.hat.3 <- sqrt(sum((x0[-1] - phi.hat.3 * x0[ - n])^2)/(n - 1))
            phi.hat.2 <- nlmin(ell0, phi.hat.3)$x
            sigma.hat.2 <- sqrt(tau0(x0, phi.hat.2))
            phi.hat.1 <- nlmin(ell, phi.hat.2)$x
            sigma.hat.1 <- sqrt(tau(y, phi.hat.1))
            mu.hat.1 <- mu(y, phi.hat.1)
            matrix(c(m, phi.hat.3, sigma.hat.3, m, phi.hat.2, sigma.hat.2,
               mu.hat.1, phi.hat.1, sigma.hat.1), nrow = 3)
    }
    The function assign makes the variables y and x0 available to functions called by estim.ar1 without creating them permanently. This permits ell and ell0 to access y and x0 even though nlmin does not permit any argument but the optimization variable phi to be passed through the argument list of ell or ell0.

    Overall there is very little point in the more difficult computations; the three sets of estimates agree to within a small fraction of a standard error.

  4. Derive formulas for the L-step ahead forecast standard error for the AR(1) model X_t = phi X_{t-1} + epsilon_t and for the ARIMA(1,1,0) model X_t - X_{t-1} = phi (X_{t-1} - X_{t-2}) + epsilon_t. Compute the limits of the forecast standard errors for these two models as L tends to infinity.

    Solution: For the AR(1) model the error in the L-step ahead forecast X-hat_{T+L} = phi^L X_T is

        X_{T+L} - X-hat_{T+L} = sum_{j=0}^{L-1} phi^j epsilon_{T+L-j}

    so that the L-step ahead forecast standard error is

        sigma sqrt( (1 - phi^{2L}) / (1 - phi^2) )

    which converges to sigma / sqrt(1 - phi^2), the standard deviation of the stationary distribution, as L tends to infinity.
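    A quick numerical check of the closed form (Python sketch; phi = 0.8, sigma = 2 and L = 12 are arbitrary illustrative values):

```python
import numpy as np

phi, sigma, L = 0.8, 2.0, 12      # illustrative values (assumptions)

# direct sum of the error variance versus the closed form above
direct = sigma ** 2 * np.sum(phi ** (2 * np.arange(L)))
closed = sigma ** 2 * (1 - phi ** (2 * L)) / (1 - phi ** 2)
limit = sigma / np.sqrt(1 - phi ** 2)   # limiting SE as L -> infinity
```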

    For the ARIMA(1,1,0) model write W_t = X_t - X_{t-1}, which follows the AR(1) model of the first part. We have

        X_{T+L}     = X_T + sum_{r=1}^{L} W_{T+r}
        X-hat_{T+L} = X_T + sum_{r=1}^{L} W-hat_{T+r}.

    Subtracting and using the first part we see that the L-step ahead forecast error is

        sum_{r=1}^{L} (W_{T+r} - W-hat_{T+r}) = sum_{k=1}^{L} [(1 - phi^{L-k+1})/(1 - phi)] epsilon_{T+k}.

    I used Maple to compute the variance and get

        sigma^2 sum_{m=1}^{L} (1 - phi^m)^2 / (1 - phi)^2
          = sigma^2 [ L - 2 phi (1 - phi^L)/(1 - phi) + phi^2 (1 - phi^{2L})/(1 - phi^2) ] / (1 - phi)^2

    which is proportional to L for large L; the forecast standard error therefore grows like sigma sqrt(L)/(1 - phi) and goes to infinity.
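    The variance formula and its linear growth can be checked directly (Python sketch; phi = 0.6 and sigma = 1 are arbitrary illustrative values):

```python
import numpy as np

phi, sigma = 0.6, 1.0             # illustrative values (assumptions)

def arima110_fvar(L, phi, sigma):
    # L-step forecast error variance: sum of squared epsilon coefficients
    m = np.arange(1, L + 1)
    return sigma ** 2 * np.sum(((1 - phi ** m) / (1 - phi)) ** 2)

# the 1-step variance is just sigma^2, and for large L the variance
# approaches L * sigma^2 / (1 - phi)^2
ratio = arima110_fvar(5000, phi, sigma) / (5000 * sigma ** 2 / (1 - phi) ** 2)
```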

  5. Delete the last 4 values from the earnings data set and use the model you selected in the previous assignment to re-estimate the parameters and forecast the deleted values. Compare the actual errors with the forecasts and the forecast standard errors.

    Solution: This depends on the model you used. However, one point about your answers: when you undo the logarithm by exponentiating you should not just exponentiate the standard error. If the interval for the log process is F-hat +/- z SE then the interval for the original process runs from exp(F-hat - z SE) to exp(F-hat + z SE), and the forecast exp(F-hat) is not in the center of that interval. Now if log(Y) has a N(mu, sigma^2) distribution then the mean of Y is exp(mu + sigma^2/2) (use the formula for the mgf of a normal). This correction is sometimes used for the forecast itself. Technically the corrected forecast is mean unbiased but median biased: it will be an overestimate more than half the time.
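    The lognormal mean formula is easy to confirm by simulation (Python sketch; the log-scale forecast mean mu = 0.4 and SE s = 0.5 are arbitrary illustrative values):

```python
import numpy as np

rng = np.random.default_rng(3)
mu, s = 0.4, 0.5                   # illustrative log-scale mean and SE (assumptions)
z = rng.normal(mu, s, size=1_000_000)

mc_mean = np.mean(np.exp(z))       # Monte Carlo E[Y] where log Y ~ N(mu, s^2)
exact = np.exp(mu + s ** 2 / 2)    # lognormal mean from the normal mgf
median = np.exp(mu)                # exp(mu) is the median of Y, below the mean
```

    The gap between exp(mu) and exp(mu + s^2/2) is exactly the median-versus-mean point made above.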

  6. For the dataset faketrend in the usual directory remove a trend by ordinary least squares. Then fit the model you used with fake in the previous assignment to the residuals. Use the autocovariance of the fitted residuals to re-estimate the trend by generalized least squares. Does the fit change much?

    Solution: This was done well. The estimates change very little indeed. However, I gave most students only 9.5 because they didn't look at standard errors of the estimates to see if the change was appreciable compared to the SE. One other point. The generalized least squares estimate is, formally,

        beta-hat = (A' V^{-1} A)^{-1} A' V^{-1} X

    where A is the design matrix and V is the fitted variance covariance matrix. In S-Plus you should compute V^{-1} A using code like solve(V, A), since this is often much faster than inverting all of V. Similarly, having computed, say, R = A' V^{-1} A, you compute R^{-1} A' using solve(R, t(A)).
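    The same solve-instead-of-invert pattern in Python (a sketch: the intercept-plus-trend design and the AR(1)-type fitted covariance with correlation 0.5 are hypothetical stand-ins for the faketrend fit):

```python
import numpy as np

# hypothetical design (intercept + linear trend) and AR(1)-type fitted covariance
rng = np.random.default_rng(4)
n = 50
A = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
V = 0.5 ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
x = A @ np.array([1.0, 0.2]) + rng.normal(size=n)

W = np.linalg.solve(V, A)            # V^{-1} A without forming V^{-1}
R = A.T @ W                          # A' V^{-1} A
beta = np.linalg.solve(R, W.T @ x)   # (A' V^{-1} A)^{-1} A' V^{-1} x
```

    Each solve call factors its matrix once, which is cheaper and numerically better conditioned than forming the explicit inverses.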

DUE: Friday, 21 November.

Richard Lockhart
Wed Nov 26 09:12:44 PST 1997