STAT 804: Notes on Lecture 3

Definition: If $ \{\epsilon_t\}$ is a white noise series and $ \mu$ and $ b_0,\ldots,b_p$ are constants then

$\displaystyle X_t = \mu + b_0\epsilon_t + b_1 \epsilon_{t-1} + \cdots + b_p \epsilon_{t-p}
$

is a moving average of order $ p$; we write $ MA(p)$.
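For concreteness, here is a minimal simulation sketch in Python (using numpy; the coefficients $ b_0,b_1,b_2$, $ \mu$ and $ \sigma$ below are arbitrary illustrative choices, not values from the notes) showing how white noise is turned into an $ MA(2)$ series.

    import numpy as np

    rng = np.random.default_rng(0)
    n, sigma, mu = 1000, 1.0, 0.0
    b = np.array([1.0, 0.6, 0.3])             # b_0, b_1, b_2 (arbitrary illustrative values)
    p = len(b) - 1

    eps = rng.normal(0.0, sigma, size=n + p)  # white noise, with p extra start-up values
    # X_t = mu + b_0 eps_t + b_1 eps_{t-1} + ... + b_p eps_{t-p}
    X = mu + np.convolve(eps, b, mode="valid")
    print(X.mean(), X.var())                  # close to mu and (b_0^2 + ... + b_p^2) sigma^2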

Question: From observations on $ X$ can we estimate the $ b$'s and $ \sigma^2=$Var$ (\epsilon_t)$ accurately? NO.

Definition: A model for data $ X$ is a family $ \{P_\theta;
\theta\in\Theta\}$ of possible distributions for $ X$.

Definition: A model is identifiable if $ \theta_1 \neq \theta_2$ implies that $ P_{\theta_1} \neq P_{\theta_2}$; that is different $ \theta$'s give different distributions for the data.

When a model is unidentifiable there are different values of $ \theta$ which make exactly the same predictions about the data so the data do not permit you to distinguish between these $ \theta$ values.

Example: Suppose $ \epsilon$ is an iid $ N(0,\sigma^2)$ series and that $ X_t = b_0 \epsilon_t + b_1 \epsilon_{t-1} $. Then the series $ X$ has mean 0 and covariance

$\displaystyle C_X(h) = \begin{cases}
(b_0^2+b_1^2) \sigma^2 & h=0
\\
b_0 b_1 \sigma^2 & \vert h\vert = 1
\\
0 & \text{otherwise}
\end{cases}$
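As a quick numerical sanity check of this covariance function, the following Python sketch (the particular values of $ b_0$, $ b_1$ and $ \sigma$ are arbitrary choices for illustration) compares sample autocovariances of a long simulated series with the formula above.

    import numpy as np

    rng = np.random.default_rng(1)
    b0, b1, sigma, n = 1.0, 0.5, 2.0, 200_000
    eps = rng.normal(0.0, sigma, size=n + 1)
    X = b0 * eps[1:] + b1 * eps[:-1]          # X_t = b_0 eps_t + b_1 eps_{t-1}

    def sample_cov(x, h):
        xbar = x.mean()
        return np.mean((x[:len(x) - h] - xbar) * (x[h:] - xbar))

    print(sample_cov(X, 0), (b0**2 + b1**2) * sigma**2)  # C(0)
    print(sample_cov(X, 1), b0 * b1 * sigma**2)          # C(1)
    print(sample_cov(X, 2))                              # should be near 0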

Now a multivariate normal distribution is specified by its mean vector and its covariance matrix, so two Gaussian time series with mean 0 and the same covariance function have the same distribution. You can see that if you multiply the $ \epsilon$'s by $ a$ and divide both $ b_0$ and $ b_1$ by $ a$ then the covariance function of $ X$ is unchanged. Thus we cannot hope to estimate all three parameters $ b_0$, $ b_1$ and $ \sigma$; we choose to set $ b_0$ equal to 1. Are the remaining parameters $ b_1$ and $ \sigma$ identifiable? Writing $ b$ for $ b_1$, we try to solve the equations

$\displaystyle C(0) = (1+b^2) \sigma^2
$

and

$\displaystyle C(1) = b\sigma^2
$

to see if the solution is unique. Divide the two equations to see

$\displaystyle \frac{C(1)}{C(0)} = \frac{b}{1+b^2}
$

or

$\displaystyle b^2 - \frac{C(0)}{C(1)} b + 1 = 0
$

which has the solutions

$\displaystyle \frac{ \frac{C(0)}{C(1)} \pm \sqrt{\left( \frac{C(0)}{C(1)}\right)^2 - 4}}{
2}
$

You should notice two things:

  1. If

    $\displaystyle \left\vert \frac{C(0)}{C(1)}\right\vert < 2
$

    there are no real solutions. Since the process is stationary, $ C(0) = \sqrt{\text{Var}(X_t)\text{Var}(X_{t+1})}$, so $ C(1)/C(0)$ is the correlation between $ X_t$ and $ X_{t+1}$. We have proved that for an $ MA(1)$ process this correlation cannot exceed $ 1/2$ in absolute value.

  2. If

    $\displaystyle \left\vert \frac{C(0)}{C(1)}\right\vert > 2
$

    there are two solutions.

The two roots multiply together to give the constant term 1 in the quadratic equation, so if they are distinct then one of them is larger than 1 in absolute value and the other smaller. Let $ b$ and $ b^*$ denote the two roots and let $ \alpha = C(1)/b$ and $ \alpha^* =C(1)/b^*$ be the two corresponding values of $ \sigma^2$. Let $ \epsilon_t$ be iid $ N(0,\alpha)$ and $ \epsilon_t^*$ be iid $ N(0,\alpha^*)$. Then

$\displaystyle X_t \equiv \epsilon_t + b \epsilon_{t-1}
$

and

$\displaystyle X_t^* \equiv \epsilon_t^* + b^* \epsilon_{t-1}^*
$

have identical means and covariance functions. Observing $ X_t$ you cannot distinguish the first of these models from the second. We will fit $ MA(1)$ models by requiring our estimated $ b$ to have $ \vert\hat{b}\vert \le 1$.
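The non-uniqueness is easy to see numerically. The sketch below (Python; the generating values $ b=0.4$ and $ \sigma^2=1$ are arbitrary illustrative choices) solves the quadratic for both roots and checks that either root, paired with its own noise variance, reproduces exactly the same $ C(0)$ and $ C(1)$.

    import numpy as np

    # Start from an MA(1) with b = 0.4 and sigma^2 = 1 (arbitrary illustrative values)
    b_true, sig2_true = 0.4, 1.0
    C0 = (1 + b_true**2) * sig2_true
    C1 = b_true * sig2_true

    # Solve b^2 - (C(0)/C(1)) b + 1 = 0; the product of the two roots is 1
    r = C0 / C1
    b, b_star = np.roots([1.0, -r, 1.0])
    alpha, alpha_star = C1 / b, C1 / b_star   # the two candidate noise variances

    # Both parameter pairs give back the same covariance function
    print((1 + b**2) * alpha, (1 + b_star**2) * alpha_star, C0)
    print(b * alpha, b_star * alpha_star, C1)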

Reason: We can manipulate the model equation for $ X$ just as we did for an autoregressive process last time:

\begin{align*}
\epsilon_t &= X_t - b \epsilon_{t-1}\\
&= X_t - b(X_{t-1}-b\epsilon_{t-2})\\
&= X_t - bX_{t-1} + b^2\epsilon_{t-2}\\
&\;\;\vdots\\
&= \sum_{j=0}^\infty (-b)^j X_{t-j}
\end{align*}

This manipulation makes sense if $ \vert b\vert < 1$ (so that the series converges); in that case we can rearrange the equation to get

$\displaystyle X_t =\epsilon_t - \sum_1^\infty (-b)^j X_{t-j}
$

which is an autoregressive process.
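Here is a small numerical check of this inversion (Python; $ b=0.6$ and the truncation point $ K$ are arbitrary illustrative choices): for $ \vert b\vert<1$ the truncated sum $ \sum_{j=0}^{K-1}(-b)^j X_{t-j}$ recovers $ \epsilon_t$ up to an error of order $ b^K$.

    import numpy as np

    rng = np.random.default_rng(2)
    b, n, K = 0.6, 5000, 50                   # |b| < 1; K = truncation point of the infinite sum
    eps = rng.normal(size=n + 1)
    X = eps[1:] + b * eps[:-1]                # X_t = eps_t + b eps_{t-1}

    t = 1000                                  # an index far enough from the start of the record
    eps_hat = sum((-b) ** j * X[t - j] for j in range(K))
    print(eps_hat, eps[t + 1])                # eps[t + 1] plays the role of eps_t here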

If, on the other hand, $ \vert b\vert > 1$ then let $ \epsilon_t^* = b\epsilon_t$; $ \epsilon^*$ is also white noise and we can write

$\displaystyle X_t = \epsilon_{t-1}^* + \frac{1}{b} \epsilon_t^*
$

We find

\begin{align*}
\epsilon_{t-1}^* &= X_t - \frac{1}{b} \epsilon_{t}^*\\
&= X_t - \frac{1}{b}\left(X_{t+1}-\frac{1}{b}\epsilon_{t+1}^*\right)\\
&\;\;\vdots\\
&= \sum_{j=0}^\infty \left(-\frac{1}{b}\right)^j X_{t+j}
\end{align*}

which means

$\displaystyle X_t = \epsilon_{t-1}^* - \sum_1^\infty (-\frac{1}{b})^j X_{t+j}
$

This represents the current value as depending on the future, which seems physically far less natural than the other choice.

Definition: An $ MA(p)$ process is invertible if it can be written in the form

$\displaystyle X_t = \sum_1^\infty a_j X_{t-j}+\epsilon_t
$

Definition: A process $ X$ is an autoregression of order $ p$ (written $ AR(p)$) if

$\displaystyle X_t = \sum_1^p a_j X_{t-j}+\epsilon_t
$

(so an invertible $ MA$ is an infinite order autoregression).

Definition: The backshift operator transforms a time series into another time series by shifting it back one time unit; if $ X$ is a time series then $ BX$ is the time series with

$\displaystyle (BX)_t = X_{t-1}\, .
$

The identity operator $ I$ satisfies $ IX=X$. We use $ B^j$ for $ j=1,2,\ldots$ to denote $ B$ composed with itself $ j$ times so that

$\displaystyle (B^jX)_t = X_{t-j}
$

For $ j=0$ we define $ B^0=I$.
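If it helps to have something concrete, here is how $ B$ might act on a finite stretch of data (a Python sketch, purely illustrative; with a finite record the first $ j$ values have no predecessors, so they are returned as NaN).

    import numpy as np

    def backshift(x, j=1):
        """(B^j x)_t = x_{t-j}; entries with no predecessor become NaN."""
        x = np.asarray(x, dtype=float)
        if j == 0:                            # B^0 = I
            return x.copy()
        out = np.full_like(x, np.nan)
        out[j:] = x[:-j]
        return out

    x = np.array([1.0, 2.0, 3.0, 4.0])
    print(backshift(x))        # [nan  1.  2.  3.]
    print(backshift(x, 2))     # [nan nan  1.  2.]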

Now we use $ B$ to develop a formal method for studying the existence of a stationary solution of a given $ AR(p)$ equation and the invertibility of a given $ MA(p)$. An $ AR(1)$ process satisfies

$\displaystyle (I-a_1B)X = \epsilon
$

If you think of $ I-a_1B$ as some sort of infinite dimensional matrix then you get the formal identity

$\displaystyle X = (I-a_1B)^{-1}\epsilon
$

So how will we define this inverse of an infinite matrix? We use the idea of a geometric series expansion.

If $ a$ and $ b$ are real numbers with $ \vert ab\vert < 1$ then

$\displaystyle (1-ab)^{-1} =\frac{1}{1-ab} = \sum_{j=0}^\infty (ab)^j
$

so we hope that $ (I-a_1B)^{-1}$ can be defined by

$\displaystyle (I-a_1B)^{-1} = \sum_{j=0}^\infty a_1^j B^j
$

This would mean

$\displaystyle X = \sum_{j=0}^\infty a_1^j B^j \epsilon
$

or looking at the formula for a particular $ t$ and remembering the meaning of $ B^j$ we get

$\displaystyle X_t = \sum_{j=0}^\infty a_1^j \epsilon_{t-j}
$

This is the formula I had in lecture 2.
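As a check that the formal manipulation gives the right answer, the sketch below (Python; $ a_1=0.7$ and the truncation point are arbitrary illustrative choices) builds $ X_t$ from the truncated sum $ \sum_{j=0}^{K-1} a_1^j\epsilon_{t-j}$ and verifies that it satisfies $ X_t = a_1 X_{t-1}+\epsilon_t$ up to truncation error.

    import numpy as np

    rng = np.random.default_rng(3)
    a1, n, K = 0.7, 2000, 200                 # |a_1| < 1; K terms of the infinite sum
    eps = rng.normal(size=n)

    # X_t ~ sum_{j=0}^{K-1} a_1^j eps_{t-j}  (truncated moving-average representation)
    X = np.array([sum(a1 ** j * eps[t - j] for j in range(min(K, t + 1)))
                  for t in range(n)])

    t = 1500
    print(X[t], a1 * X[t - 1] + eps[t])       # the two sides agree up to a term of order a_1^K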

Now consider a general $ AR(p)$ process:

$\displaystyle (I-\sum_1^p a_j B^j)X = \epsilon
$

We will factor the operator applied to $ X$. Let

$\displaystyle \phi(x) = 1- \sum_1^p a_j x^j
$

Then $ \phi$ is a polynomial of degree $ p$. It thus has (by the fundamental theorem of algebra, a theorem of C. F. Gauss) $ p$ roots $ 1/b_1,\ldots,1/b_p$, counted with multiplicity and possibly complex. (None of the roots is 0 because the constant term in $ \phi$ is 1.) This means we can factor $ \phi$ as

$\displaystyle \phi(x) = \prod_1^p (1-b_j x)
$

Now back to the definition of $ X$:

$\displaystyle \prod_1^p(I-b_jB) X = \epsilon
$

can be solved by inverting each term in the product (in any order -- the terms in the product commute) to get

$\displaystyle X = \prod_1^p(I-b_jB)^{-1}\epsilon
$

The inverse of $ I-b_jB$ will exist if the sum

$\displaystyle \sum_{k=0}^\infty b_j^k B^k
$

converges; this requires $ \vert b_j\vert < 1$. Thus a stationary $ AR(p)$ solution of the equations exists if every root of the characteristic polynomial $ \phi$ is larger than 1 in absolute value (actually the roots can be complex and I mean larger than 1 in modulus).
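The root condition is easy to check numerically. The sketch below (Python; the $ AR(2)$ coefficients are arbitrary illustrative values) computes the roots of $ \phi$ and the corresponding $ b_j$'s.

    import numpy as np

    a = np.array([0.5, 0.3])                  # a_1, a_2 for an AR(2): arbitrary illustrative values

    # phi(x) = 1 - a_1 x - a_2 x^2; np.roots wants coefficients from the highest degree down
    phi_coeffs = np.concatenate(([1.0], -a))[::-1]
    roots = np.roots(phi_coeffs)              # these are 1/b_1, ..., 1/b_p
    b = 1.0 / roots

    print(roots, np.abs(roots))               # stationary solution exists iff all moduli exceed 1
    print(b, np.abs(b))                       # equivalently, all |b_j| < 1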

Summary

Definition: A process $ X$ is an $ ARMA(p,q)$ (mixed autoregressive of order $ p$ and moving average of order $ q$) if it satisfies

$\displaystyle \phi(B) X = \psi(B)\epsilon
$

where $ \epsilon$ is white noise and

$\displaystyle \phi(B) = I - \sum_1^p a_j B^j
$

and

$\displaystyle \psi(B) = I - \sum_1^q b_j B^j
$

The ideas we used above can be stretched to show that the process $ X$ is identifiable and invertible (can be written as an infinite order autoregression on the past) if the roots of $ \psi(x)$ lie outside the unit circle. A stationary solution, which can be written as an infinite order causal (no future $ \epsilon$s in the average) moving average, exists if all the roots of $ \phi(x)$ lie outside the unit circle.
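Putting the two conditions together, here is a sketch of a simple checker (Python; the coefficients in the example call are arbitrary illustrative values, and both polynomials follow the sign convention used above, $ 1-\sum c_j x^j$).

    import numpy as np

    def roots_outside_unit_circle(coeffs):
        """True if all roots of 1 - c_1 x - ... - c_k x^k lie outside the unit circle."""
        poly = np.concatenate(([1.0], -np.asarray(coeffs, dtype=float)))[::-1]
        return bool(np.all(np.abs(np.roots(poly)) > 1.0))

    a = [0.5, 0.3]   # AR coefficients (phi); arbitrary illustrative values
    b = [0.4]        # MA coefficients (psi); arbitrary illustrative values
    print("causal (stationary):", roots_outside_unit_circle(a))
    print("invertible:", roots_outside_unit_circle(b))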

Other Stationary Processes:

  1. Periodic processes. Suppose $ Z_1$ and $ Z_2$ are independent $ N(0,\sigma^2)$ random variables and that $ \omega$ is a constant. Then

    $\displaystyle X_t = Z_1 \cos(\omega t) + Z_2 \sin(\omega t)
$

    has mean 0 and

    \begin{align*}
    \text{Cov}(X_t,X_{t+h}) &= \sigma^2 \left[\cos(\omega t)\cos(\omega(t+h)) + \sin(\omega t)\sin(\omega(t+h))\right]\\
    &= \sigma^2 \cos(\omega h)
    \end{align*}

    Since $ X$ is Gaussian we find that $ X$ is both second order and strictly stationary. In fact (see your homework) you can write

    $\displaystyle X_t = R \sin(\omega t+\Phi)
$

    where $ R$ and $ \Phi$ are suitable random variables so that the trajectory of $ X$ is just a sine wave.

  2. Poisson shot noise processes:

    A Poisson process is a process $ N(A)$, indexed by subsets $ A$ of the real line, with the property that each $ N(A)$ has a Poisson distribution with parameter $ \lambda\,\text{length}(A)$ and that if $ A_1,\ldots,A_p$ are any non-overlapping subsets of $ R$ then $ N(A_1),\ldots,N(A_p)$ are independent. We often use $ N(t)$ for $ N([0,t])$.

    To define a shot noise process we let $ X(t) =1 $ at those $ t$ where there is a jump in $ N$ and 0 elsewhere. The process $ X$ is stationary. If we have some function $ g$ defined on $ [0,\infty)$ and decreasing sufficiently quickly to 0 (like say $ g(x) =e^{-x}$) then the process

    $\displaystyle Y(t) = \sum g(t-\tau) 1(X(\tau)=1) 1(\tau \le t)
$

    is stationary. It has a jump every time $ t$ passes a jump in the Poisson process and otherwise follows the trajectory of the sum of several copies of $ g$ (shifted around in time); a simulation sketch follows this list. We commonly write

    $\displaystyle Y(t) = \int_{-\infty}^t g(t-\tau)\, dN(\tau)
$
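Here is a simulation sketch of the shot noise process (Python; the rate $ \lambda$, the time horizon and $ g(x)=e^{-x}$ are arbitrary illustrative choices, and jumps before time 0 are ignored, which only affects the start of the record).

    import numpy as np

    rng = np.random.default_rng(4)
    lam, T = 2.0, 50.0                        # Poisson rate and time horizon (illustrative values)

    def g(x):
        return np.exp(-x)                     # the decaying shot shape

    # Jump times of a rate-lam Poisson process on [0, T]: Poisson count, then uniform positions
    n_jumps = rng.poisson(lam * T)
    taus = np.sort(rng.uniform(0.0, T, size=n_jumps))

    def Y(t):
        """Shot noise: sum of g(t - tau) over jump times tau <= t."""
        past = taus[taus <= t]
        return g(t - past).sum()

    grid = np.linspace(0.0, T, 2001)
    trajectory = np.array([Y(t) for t in grid])
    print(trajectory.mean())                  # roughly lam times the integral of g, here about 2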


Richard Lockhart
2001-09-17