
STAT 804: Notes on Lecture 3

Definition: If $\{\epsilon_t\}$ is a white noise series and $\mu$ and $b_0,\ldots,b_p$ are constants then

\begin{displaymath}X_t = \mu + b_0\epsilon_t + b_1 \epsilon_{t-1} + \cdots + b_p \epsilon_{t-p}
\end{displaymath}

is a moving average of order p; we write MA(p).
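
For concreteness, here is a minimal simulation sketch in Python/NumPy (the function name, the MA(2) coefficients and the noise standard deviation below are arbitrary illustrative choices, not part of the notes):

```python
import numpy as np

def simulate_ma(n, mu, b, sigma, seed=None):
    """Simulate n observations of X_t = mu + b[0]*eps_t + ... + b[p]*eps_{t-p}."""
    rng = np.random.default_rng(seed)
    p = len(b) - 1
    eps = rng.normal(0.0, sigma, size=n + p)   # white noise, with p extra warm-up values
    # X_t = mu + sum_j b[j] * eps_{t-j}; np.convolve computes exactly this moving sum
    return mu + np.convolve(eps, b)[p:p + n]

# an illustrative MA(2) with b_0 = 1, b_1 = 0.6, b_2 = -0.3 and sigma = 2
x = simulate_ma(500, mu=0.0, b=[1.0, 0.6, -0.3], sigma=2.0, seed=1)
```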

Question: From observations on X can we estimate the b's and $\sigma^2=\text{Var}(\epsilon_t)$? No: as written the model is not identifiable, in the sense defined next.

Definition: A model for data X is a family $\{P_\theta;
\theta\in\Theta\}$ of possible distributions for X.

Definition: A model is identifiable if $\theta_1 \neq \theta_2$ implies that $P_{\theta_1} \neq P_{\theta_2}$; that is different $\theta$'s give different distributions for the data.

When a model is unidentifiable there are different values of $\theta$ which make exactly the same predictions about the data, so the data do not permit you to distinguish between these $\theta$ values.

Example: Suppose $\epsilon$ is an iid $N(0,\sigma^2)$ series and that $X_t = b_0 \epsilon_t + b_1 \epsilon_{t-1} $. Then the series X has mean 0 and covariance

\begin{displaymath}C_X(h) = \begin{cases}
(b_0^2+b_1^2) \sigma^2 & h=0
\\
b_0 b_1 \sigma^2 & h=1
\\
0 & \text{otherwise}
\end{cases}\end{displaymath}

Now a normal distribution is specified by its mean and its variance, so two normal time series with mean 0 and the same covariance function have the same distribution. You can see that if you multiply the $\epsilon$'s by $a$ and divide both $b_0$ and $b_1$ by $a$ then the covariance function of $X$ is unchanged. Thus we cannot hope to estimate all three parameters $b_0$, $b_1$ and $\sigma$. We choose to set the parameter $b_0$ to be 1. Are the parameters $b_1$ and $\sigma$ then identifiable? We try to solve the equations

\begin{displaymath}C(0) = (1+b^2) \sigma^2
\end{displaymath}

and

\begin{displaymath}C(1) = b\sigma^2
\end{displaymath}

to see if the solution is unique. Divide the two equations to see

\begin{displaymath}\frac{C(1)}{C(0)} = \frac{b}{1+b^2}
\end{displaymath}

or

\begin{displaymath}b^2 - \frac{C(0)}{C(1)} b + 1 = 0
\end{displaymath}

which has the solutions

\begin{displaymath}\frac{ \frac{C(0)}{C(1)} \pm \sqrt{\left( \frac{C(0)}{C(1)}\right)^2 - 4}}{
2}
\end{displaymath}

You should notice two things:

1.
If

\begin{displaymath}\left\vert \frac{C(0)}{C(1)}\right\vert < 2
\end{displaymath}

there are no real solutions, since the discriminant is then negative; no MA(1) model can produce such a covariance function. Since $C(0) = \sqrt{\text{Var}(X_t)
\text{Var}(X_{t+1})}$ we can see that $C(1)/C(0)$ is the correlation between $X_t$ and $X_{t+1}$. We have proved that for an MA(1) process this correlation cannot be more than 1/2 in absolute value.

2.
If

\begin{displaymath}\left\vert \frac{C(0)}{C(1)}\right\vert > 2
\end{displaymath}

there are two distinct real solutions.

The two solutions multiply together to give the constant term 1 in the quadratic equation. If the two roots are distinct it follows that one of them is larger than 1 and the other smaller in absolute value. Let b and b* denote the two roots. Let $\alpha = C(1)/b$ and $\alpha^* =C(1)/b^*$. Let $\epsilon_t$ be iid $N(0,\alpha)$ and $\epsilon_t^*$ be iid $N(0,\alpha^*)$. Then

\begin{displaymath}X_t \equiv \epsilon_t + b \epsilon_{t-1}
\end{displaymath}

and

\begin{displaymath}X_t^* \equiv \epsilon_t^* + b^* \epsilon_{t-1}^*
\end{displaymath}

have identical means and covariance functions. Observing $X_t$ you cannot distinguish the first of these models from the second. We will fit MA(1) models by requiring our estimated b to satisfy $\vert\hat{b}\vert \le 1$.
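
Before turning to the reason for this convention, here is a minimal moment-matching sketch in Python (the helper name and the particular values of $b$ and $\sigma^2$ are illustrative assumptions): it solves the quadratic above for the root with $|b| \le 1$ and confirms that the other root, with the rescaled variance, reproduces exactly the same covariance function.

```python
import numpy as np

def ma1_from_acov(C0, C1):
    """Solve C0 = (1+b^2)*s2 and C1 = b*s2 for the root with |b| <= 1; return (b, s2)."""
    r = C0 / C1                           # the coefficient in b^2 - r*b + 1 = 0
    if r * r < 4:                         # |C0/C1| < 2: no MA(1) can produce these covariances
        raise ValueError("no real solution: the implied lag-1 correlation exceeds 1/2")
    roots = np.roots([1.0, -r, 1.0])      # the two roots multiply to 1
    b = roots[np.argmin(np.abs(roots))]   # pick the one with |b| <= 1
    return b, C1 / b

b, s2 = 0.4, 2.0
C0, C1 = (1 + b**2) * s2, b * s2
bstar, s2star = 1 / b, b**2 * s2          # the other root and its rescaled variance
assert np.isclose((1 + bstar**2) * s2star, C0) and np.isclose(bstar * s2star, C1)
print(ma1_from_acov(C0, C1))              # recovers (0.4, 2.0)
```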

Reason: We can manipulate the model equation for $X$ just as we did for an autoregressive process last time:
\begin{align*}\epsilon_t & = X_t - b \epsilon_{t-1}
\\
& = X_t - b(X_{t-1}-b\epsilon_{t-2})
\\
& = X_t - bX_{t-1} + b^2\epsilon_{t-2}
\\
& \qquad \vdots
\\
& = \sum_0^\infty (-b)^j X_{t-j}
\end{align*}
This manipulation makes sense if |b| < 1, since then the infinite series converges. In that case we can rearrange the equation to get

\begin{displaymath}X_t =\epsilon_t - \sum_1^\infty (-b)^j X_{t-j}
\end{displaymath}

which is an autoregressive process.
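
A quick numerical check of this inversion (a sketch only; the coefficient, series length and truncation point are arbitrary, and the truncation error is of order $|b|^{K+1}$):

```python
import numpy as np

rng = np.random.default_rng(0)
b, n, K = 0.5, 2000, 50                    # |b| < 1; series length; truncation of the infinite sum
eps = rng.normal(size=n)
X = eps.copy()
X[1:] += b * eps[:-1]                      # X_t = eps_t + b*eps_{t-1} (X_0 uses eps_{-1} = 0)

t = 1000
approx = sum((-b) ** j * X[t - j] for j in range(K + 1))   # truncated sum of (-b)^j X_{t-j}
print(approx, eps[t])                      # nearly equal because |b| < 1 makes the tail negligible
```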

If, on the other hand, |b| > 1 then we can write

\begin{displaymath}X_t = \frac{1}{b}(b\epsilon_t) + b\epsilon_{t-1}
\end{displaymath}

Let $\epsilon_t^* = b\epsilon_t$; $\epsilon^*$ is also white noise (with variance $b^2\sigma^2$), and $X_t = \frac{1}{b}\epsilon_t^* + \epsilon_{t-1}^*$. We find
\begin{align*}\epsilon_{t-1}^* & = X_t - \frac{1}{b} \epsilon_{t}^*
\\
& = X_t - \frac{1}{b}\left(X_{t+1} - \frac{1}{b} \epsilon_{t+1}^*\right)
\\
& \qquad \vdots
\\
& = \sum_0^\infty \left(-\frac{1}{b}\right)^j X_{t+j}
\end{align*}
which means

\begin{displaymath}X_t = \epsilon_{t-1}^* - \sum_1^\infty (-\frac{1}{b})^j X_{t+j}
\end{displaymath}

This represents the current value as depending on the future which seems physically far less natural than the other choice.

Definition: An MA(p) process is invertible if it can be written in the form

\begin{displaymath}X_t = \sum_1^\infty a_j X_{t-j}+\epsilon_t
\end{displaymath}

Definition: A process X is an autoregression of order p (written AR(p)) if

\begin{displaymath}X_t = \sum_1^p a_j X_{t-j}+\epsilon_t
\end{displaymath}

(so an invertible MA is an infinite order autoregression).

Definition: The backshift operator transforms a time series into another time series by shifting it back one time unit; if X is a time series then BX is the time series with

\begin{displaymath}(BX)_t = X_{t-1}\, .
\end{displaymath}

The identity operator I satisfies $IX=X$. We use $B^j$ for $j=1,2,\ldots$ to denote $B$ composed with itself $j$ times, so that

\begin{displaymath}(B^jX)_t = X_{t-j}
\end{displaymath}

For $j=0$ this gives $B^0=I$.
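
On a finite stretch of data the backshift is just re-indexing; a tiny sketch (padding the first $j$ values with NaN, since they have no predecessor, is a convention chosen here for illustration):

```python
import numpy as np

def backshift(x, j=1):
    """(B^j x)_t = x_{t-j} for j >= 1; the first j values are set to NaN."""
    out = np.full_like(x, np.nan, dtype=float)
    out[j:] = x[:-j]
    return out

x = np.arange(5.0)           # [0, 1, 2, 3, 4]
print(backshift(x))          # [nan, 0, 1, 2, 3]
print(backshift(x, 2))       # [nan, nan, 0, 1, 2]
```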

Now we use B to develop a formal method for studying the existence of a stationary solution for a given AR(p) and the invertibility of a given MA(p). An AR(1) process satisfies

\begin{displaymath}(I-a_1B)X = \epsilon
\end{displaymath}

If you think of $I-a_1B$ as some sort of infinite dimensional matrix then you get the formal identity

\begin{displaymath}X = (I-a_1B)^{-1}\epsilon
\end{displaymath}

So how will we define this inverse of an infinite matrix? We use the idea of a geometric series expansion.

If $a$ and $b$ are real numbers with |ab| < 1 then

\begin{displaymath}(1-ab)^{-1} =\frac{1}{1-ab} = \sum_{j=0}^\infty (ab)^j
\end{displaymath}

so we hope that $(I-a_1B)^{-1}$ can be defined by

\begin{displaymath}(I-a_1B)^{-1} = \sum_{j=0}^\infty a_1^j B^j
\end{displaymath}

This would mean

\begin{displaymath}X = \sum_{j=0}^\infty a_1^j B^j \epsilon
\end{displaymath}

or, looking at the formula for a particular $t$ and remembering the meaning of $B^j$, we get

\begin{displaymath}X_t = \sum_{j=0}^\infty a_1^j \epsilon_{t-j}
\end{displaymath}

This is the formula I had in lecture 2.
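
A sketch checking that the AR(1) recursion and this truncated geometric-series representation agree (the coefficient, series length and truncation point are arbitrary, and |a_1| < 1 is assumed):

```python
import numpy as np

rng = np.random.default_rng(2)
a1, n, K = 0.8, 3000, 200
eps = rng.normal(size=n)

# AR(1) by direct recursion: X_t = a1*X_{t-1} + eps_t, started at X_0 = 0
X = np.zeros(n)
for t in range(1, n):
    X[t] = a1 * X[t - 1] + eps[t]

# truncated MA(infinity) representation: X_t is approximately sum_{j=0}^{K} a1^j eps_{t-j}
t = 2000
approx = sum(a1 ** j * eps[t - j] for j in range(K + 1))
print(X[t], approx)          # they differ by exactly a1**(K+1) * X[t-K-1], which is tiny here
```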

Now consider a general AR(p) process:

\begin{displaymath}(I-\sum_1^p a_j B^j)X = \epsilon
\end{displaymath}

We will factor the operator applied to $X$. Let

\begin{displaymath}\phi(x) = 1- \sum_1^p a_j x^j
\end{displaymath}

Then $\phi$ is a polynomial of degree p. It thus has (by the fundamental theorem of algebra, a theorem of C. F. Gauss) p roots $1/b_1,\ldots,1/b_p$, possibly complex and counted with multiplicity. (None of the roots is 0 because the constant term in $\phi$ is 1.) This means we can factor $\phi$ as

\begin{displaymath}\phi(x) = \prod_1^p (1-b_j x)
\end{displaymath}

Now back to the definition of X:

\begin{displaymath}\prod_1^p(I-b_jB) X = \epsilon
\end{displaymath}

can be solved by inverting each term in the product (in any order -- the terms in the product commute) to get

\begin{displaymath}X = \prod_1^p(I-b_jB)^{-1}\epsilon
\end{displaymath}

The inverse of $I-b_jB$ will exist if the sum

\begin{displaymath}\sum_{k=0}^\infty b_j^k B^k
\end{displaymath}

converges; this requires |bj| < 1. Thus a stationary AR(p) solution of the equations exists if every root of the characteristic polynomial $\phi$ is larger than 1 in absolute value (actually the roots can be complex and I mean larger than 1 in modulus).
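
A sketch of this root check using numpy (the coefficient values are arbitrary illustrative choices):

```python
import numpy as np

def ar_is_stationary(a):
    """True if phi(x) = 1 - a[0] x - ... - a[p-1] x^p has all its roots outside the unit circle."""
    # np.roots expects coefficients from the highest power down: -a_p, ..., -a_1, 1
    coeffs = [-c for c in reversed(a)] + [1.0]
    return bool(np.all(np.abs(np.roots(coeffs)) > 1.0))

print(ar_is_stationary([0.5, 0.3]))   # True: a stationary AR(2) solution exists
print(ar_is_stationary([1.2]))        # False: the root 1/1.2 lies inside the unit circle
```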

Summary

Definition: A process X is an ARMA(p,q) (mixed autoregressive of order p and moving average of order q) if it satisfies

\begin{displaymath}\phi(B) X = \psi(B)\epsilon
\end{displaymath}

where $\epsilon$ is white noise and

\begin{displaymath}\phi(B) = I - \sum_1^p a_j B^j
\end{displaymath}

and

\begin{displaymath}\psi(B) = I - \sum_1^q b_j B^j
\end{displaymath}

The ideas we used above can be stretched to show that the process X is identifiable and invertible (can be written as an infinite order autoregression on the past) if the roots of $\psi(x)$ lie outside the unit circle. A stationary solution, which can be written as an infinite order causal (no future $\epsilon$s in the average) moving average, exists if all the roots of $\phi(x)$ lie outside the unit circle.
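
The same root check covers both polynomials of an ARMA(p,q); a brief sketch, with arbitrary coefficients and the sign convention of the definitions above:

```python
import numpy as np

def roots_outside_unit_circle(poly):
    """poly lists coefficients from the constant term up; True if every root has modulus > 1."""
    return bool(np.all(np.abs(np.roots(poly[::-1])) > 1.0))

a = [0.5, 0.3]                          # AR part:  phi(x) = 1 - 0.5 x - 0.3 x^2
b = [0.4]                               # MA part:  psi(x) = 1 - 0.4 x
phi = [1.0] + [-c for c in a]
psi = [1.0] + [-c for c in b]
print(roots_outside_unit_circle(phi))   # True: a stationary, causal solution exists
print(roots_outside_unit_circle(psi))   # True: the process is invertible
```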

Other Stationary Processes:

1.
Periodic processes. Suppose $Z_1$ and $Z_2$ are independent $N(0,\sigma^2)$ random variables and that $\omega$ is a constant. Then

\begin{displaymath}X_t = Z_1 \cos(\omega t) + Z_2 \sin(\omega t)
\end{displaymath}

has mean 0 and
\begin{align*}\text{Cov}(X_t,X_{t+h}) & = \sigma^2 \left[\cos(\omega t)\cos(\omega(t+h)) + \sin(\omega t)\sin(\omega(t+h))\right]
\\
& = \sigma^2 \cos(\omega h)
\end{align*}
Since X is Gaussian we find that X is second order and strictly stationary. In fact (see your homework) you can write

\begin{displaymath}X_t = R \sin(\omega t+\Phi)
\end{displaymath}

where R and $\Phi$ are suitable random variables so that the trajectory of X is just a sine wave.
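
A short simulation sketch of this process (the values of $\omega$, $\sigma$ and the number of replicates are arbitrary), comparing empirical covariances at a few lags with $\sigma^2\cos(\omega h)$:

```python
import numpy as np

rng = np.random.default_rng(3)
sigma, omega = 1.5, 0.7
nrep, tmax = 20000, 50                   # many independent trajectories over a few time points

t = np.arange(tmax)
Z1 = rng.normal(0.0, sigma, size=(nrep, 1))
Z2 = rng.normal(0.0, sigma, size=(nrep, 1))
X = Z1 * np.cos(omega * t) + Z2 * np.sin(omega * t)      # each row is one sample path

for h in (0, 1, 5):
    empirical = np.mean(X[:, 0] * X[:, h])                # Cov(X_0, X_h), since the mean is 0
    print(h, round(empirical, 3), round(sigma**2 * np.cos(omega * h), 3))
```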

2.
Poisson shot noise processes:

A Poisson process is a process $N(A)$ indexed by subsets $A$ of the real line with the property that each $N(A)$ has a Poisson distribution with parameter $\lambda\,\text{length}(A)$ and, if $A_1,\ldots,A_p$ are any non-overlapping subsets of the real line, then $N(A_1),\ldots,N(A_p)$ are independent. We often use $N(t)$ for $N([0,t])$.

To define a shot noise process we let $X(t)=1$ at those $t$ where there is a jump in $N$ and 0 elsewhere. The process $X$ is stationary. If we have some function $g$ defined on $[0,\infty)$ and decreasing sufficiently quickly to 0 (like, say, $g(x) = e^{-x}$) then the process

\begin{displaymath}Y(t) = \sum g(t-\tau) 1(X(\tau)=1) 1(\tau \le t)
\end{displaymath}

is stationary. It has a jump every time t passes a jump in the Poisson process and otherwise follows the trajectory of the sum of several copies of g (shifted around in time). We commonly write

\begin{displaymath}Y(t) = \int_{-\infty}^t g(t-\tau)\, dN(\tau)
\end{displaymath}
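
A sketch of simulating such a shot noise process with $g(x)=e^{-x}$ on a finite horizon (the rate, horizon and evaluation times are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)
lam, T = 2.0, 50.0                        # Poisson rate and time horizon

# Poisson process on [0, T]: a Poisson(lam*T) number of jumps, placed uniformly
npts = rng.poisson(lam * T)
taus = np.sort(rng.uniform(0.0, T, size=npts))

def Y(t, g=lambda u: np.exp(-u)):
    """Shot noise: the sum of g(t - tau) over the jump times tau <= t."""
    past = taus[taus <= t]
    return g(t - past).sum()

print([round(Y(t), 3) for t in (10.0, 10.5, 25.0)])
```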


Richard Lockhart
1999-09-21