
STAT 380 Lecture 8

Reading for Today's Lecture: Chapter 4 sections 1-3.

Goals of Today's Lecture:

Look at the matrix product ${\bf P}{\bf P}$:

\begin{displaymath}\left[\begin{array}{ll} \frac{3}{5} & \frac{2}{5}
\\
\frac{1}{5} & \frac{4}{5}
\end{array} \right]
\left[\begin{array}{ll} \frac{3}{5} & \frac{2}{5}
\\
\frac{1}{5} & \frac{4}{5}
\end{array} \right]
= \left[\begin{array}{ll} \frac{11}{25} & \frac{14}{25}
\\
\frac{7}{25} & \frac{18}{25}
\end{array} \right]
\end{displaymath}

Notice that the probability $P(X_2=W\vert X_0=W)$ is exactly the $(W,W)$ entry of ${\bf P}{\bf P}$.
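The product above can be checked with exact rational arithmetic; here is a sketch in Python (not part of the original notes), using the standard library `fractions` module:

```python
from fractions import Fraction as F

# Rainfall chain from the lecture; states ordered (Dry, Wet).
P = [[F(3, 5), F(2, 5)],
     [F(1, 5), F(4, 5)]]

def matmul(A, B):
    """Multiply two matrices of Fractions exactly."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

P2 = matmul(P, P)
# The (Wet, Wet) entry is P(X2 = W | X0 = W):
print(P2[1][1])  # -> 18/25
```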

General case. Define

\begin{displaymath}P^{(n)}_{i,j} = P(X_n=j\vert X_0=i)
\end{displaymath}

Then
\begin{align*}P(X_{m+n}=j &\vert X_m=i, X_{m-1}=i_{m-1},\ldots)
\\
& = P(X_{m+n}=j\vert X_m=i)
\\
& =P(X_n=j\vert X_0=i)
\\
& = P^{(n)}_{i,j}
\end{align*}

Proof of these assertions by induction on m,n.

Example for $n=2$. There are two steps to carry out:

First suppose U,V,X,Y are discrete variables. Assume
\begin{align*}P(Y=y\vert X=x,&U=u,V=v)
\\
& = P(Y=y\vert X=x)
\end{align*}
for any x,y,u,v. Then I claim
\begin{align*}P(Y=y &\vert X=x,U=u)
\\
&= P(Y=y\vert X=x)
\end{align*}
In words, if knowing both U and V doesn't change the conditional probability then knowing U alone doesn't change the conditional probability.

Proof of claim: let

\begin{displaymath}A=\{X=x,U=u\} \, .
\end{displaymath}

Then
\begin{align*}P(Y=y &\vert X=x,U=u)
\\
& = \frac{P(Y=y,A)}{P(A)}
\\
& = \frac{\sum_v P(Y=y,A,V=v)}{P(A)}
\\
& = \frac{\sum_v P(Y=y\vert X=x,U=u,V=v)P(A,V=v)}{P(A)}
\\
& = \frac{\sum_v P(Y=y\vert X=x)P(A,V=v)}{P(A)}
\\
& = \frac{ P(Y=y\vert X=x)P(A)}{P(A)}
\\
&= P(Y=y\vert X=x)
\end{align*}
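The claim can also be checked numerically. The sketch below (a hypothetical example, not from the notes) builds a joint pmf on four binary variables so that $P(Y=y\vert X=x,U=u,V=v)=f(y\vert x)$ holds by construction, then verifies that conditioning on $X$ and $U$ alone gives the same answer:

```python
from fractions import Fraction as F
from itertools import product

# Arbitrary pmf q for (X, U, V) on {0,1}^3 (weights sum to 1).
vals = [F(1,16), F(3,16), F(1,8), F(1,8), F(1,16), F(1,16), F(1,4), F(1,8)]
q = {xuv: w for xuv, w in zip(product([0, 1], repeat=3), vals)}

# Conditional law of Y given X only: f[x][y].
f = {0: {0: F(3,4), 1: F(1,4)}, 1: {0: F(1,3), 1: F(2,3)}}

# Joint pmf with P(Y=y | X=x, U=u, V=v) = f(y | x) by construction.
joint = {(x, u, v, y): q[(x, u, v)] * f[x][y]
         for (x, u, v) in q for y in [0, 1]}

def cond(y, x, u):
    """P(Y=y | X=x, U=u), marginalizing V out of the joint pmf."""
    num = sum(p for (x_, u_, v_, y_), p in joint.items()
              if (x_, u_, y_) == (x, u, y))
    den = sum(p for (x_, u_, v_, y_), p in joint.items()
              if (x_, u_) == (x, u))
    return num / den

# Knowing U as well as X does not change the conditional probability:
for x, u, y in product([0, 1], repeat=3):
    assert cond(y, x, u) == f[x][y]
```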

Second step: consider
\begin{align*}P(X_{n+2}& =k \vert X_n=i)
\\
= & \sum_j P(X_{n+2}=k,X_{n+1}=j\vert X_n=i)
\\
= & \sum_j P(X_{n+2}=k\vert X_{n+1}=j,X_n=i)
\\
& \qquad \times P(X_{n+1}=j\vert X_n=i)
\\
= & \sum_j P(X_{n+2}=k\vert X_{n+1}=j) P(X_{n+1}=j\vert X_n=i)
\\ =& \sum_j {\bf P}_{i,j}{\bf P}_{j,k}
\end{align*}
This shows that

\begin{displaymath}P(X_{n+2}=k\vert X_n=i) = ({\bf P}^2)_{i,k}
\end{displaymath}

where ${\bf P}^2$ means the matrix product ${\bf P}{\bf P}$. Notice both that the quantity does not depend on n and that we can compute it by taking a power of ${\bf P}$.

More general version

\begin{displaymath}P(X_{n+m}=k\vert X_n=j) = ({\bf P}^m)_{j,k}
\end{displaymath}

Since ${\bf P}^n{\bf P}^m={\bf P}^{n+m}$ we get the Chapman-Kolmogorov equations:
\begin{multline*}P(X_{n+m}=k\vert X_0=i) =
\\
\sum_j P(X_{n+m}=k\vert X_n=j)P(X_n=j\vert X_0=i)
\end{multline*}
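The Chapman-Kolmogorov identity ${\bf P}^{n+m}={\bf P}^n{\bf P}^m$ can be confirmed with exact arithmetic; a small Python sketch (assuming the rainfall matrix from the lecture):

```python
from fractions import Fraction as F

P = [[F(3, 5), F(2, 5)], [F(1, 5), F(4, 5)]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matpow(A, n):
    """Exact n-th matrix power by repeated multiplication."""
    out = [[F(1), F(0)], [F(0), F(1)]]  # 2x2 identity
    for _ in range(n):
        out = matmul(out, A)
    return out

# Chapman-Kolmogorov: P^(n+m) = P^n P^m, entrywise
# P^(n+m)[i][k] = sum_j P^(n)[i][j] * P^(m)[j][k].
assert matpow(P, 5) == matmul(matpow(P, 3), matpow(P, 2))
```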

Summary: A Markov Chain has stationary $n$-step transition probabilities, which are given by the $n$th power of the 1-step transition matrix.

Here is Maple output for the 1,2,4,8 and 16 step transition matrices for our rainfall example:

> p:= matrix(2,2,[[3/5,2/5],[1/5,4/5]]);
                      [3/5    2/5]
                 p := [          ]
                      [1/5    4/5]
> p2:=evalm(p*p):
> p4:=evalm(p2*p2):
> p8:=evalm(p4*p4):
> p16:=evalm(p8*p8):
This computes the powers (evalm understands matrix algebra).
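The same repeated-squaring computation can be sketched outside Maple; this hypothetical Python version uses floating point:

```python
# Float analogue of the Maple session above: repeated squaring gives
# the 2-, 4-, 8- and 16-step transition matrices of the rainfall chain.
p = [[0.6, 0.4], [0.2, 0.8]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

p2 = matmul(p, p)
p4 = matmul(p2, p2)
p8 = matmul(p4, p4)
p16 = matmul(p8, p8)
print(p16)  # rows are already close to [1/3, 2/3]
```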

Fact:

\begin{displaymath}\lim_{n\to\infty} {\bf P}^n = \left[
\begin{array}{cc} \frac{1}{3} & \frac{2}{3}
\\
\frac{1}{3} & \frac{2}{3}
\end{array}\right] \, .
\end{displaymath}

> evalf(evalm(p));
            [.6000000000    .4000000000]
            [                          ]
            [.2000000000    .8000000000]
> evalf(evalm(p2));
            [.4400000000    .5600000000]
            [                          ]
            [.2800000000    .7200000000]
> evalf(evalm(p4));
            [.3504000000    .6496000000]
            [                          ]
            [.3248000000    .6752000000]
> evalf(evalm(p8));
            [.3337702400    .6662297600]
            [                          ]
            [.3331148800    .6668851200]
> evalf(evalm(p16));
            [.3333336197    .6666663803]
            [                          ]
            [.3333331902    .6666668098]

Where did 1/3 and 2/3 come from?

Suppose we toss a coin with $P(H)=\alpha_D$ and start the chain in Dry if we get heads and in Wet if we get tails.

Then

\begin{displaymath}P(X_0=x) = \begin{cases}\alpha_D & x=\text{Dry}
\\
\alpha_W=1-\alpha_D & x=\text{Wet}
\end{cases}\end{displaymath}

and
\begin{align*}P(X_1=x) & = \sum_y P(X_1=x\vert X_0=y)P(X_0=y)
\\
& = \sum_y \alpha_y P_{y,x} \, .
\end{align*}
Notice that the last line is the matrix product of the row vector $\alpha$ with the matrix ${\bf P}$.
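This update $\alpha \mapsto \alpha{\bf P}$ is easy to sketch in code (a hypothetical example, starting the chain with a fair coin toss $\alpha=(1/2,1/2)$):

```python
from fractions import Fraction as F

P = [[F(3, 5), F(2, 5)], [F(1, 5), F(4, 5)]]

def vecmat(alpha, A):
    """Row vector times matrix: the distribution of X1 from that of X0."""
    return [sum(alpha[y] * A[y][x] for y in range(2)) for x in range(2)]

alpha = [F(1, 2), F(1, 2)]        # P(X0 = Dry) = P(X0 = Wet) = 1/2
print(vecmat(alpha, P))           # distribution of X1: 2/5 Dry, 3/5 Wet
```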

A special $\alpha$: if we put $\alpha_D=1/3$ and $\alpha_W=2/3$ then

\begin{displaymath}\left[ \frac{1}{3} \quad \frac{2}{3} \right]
\left[\begin{array}{ll} \frac{3}{5} & \frac{2}{5}
\\
\frac{1}{5} & \frac{4}{5}
\end{array} \right] = \left[ \frac{1}{3} \quad \frac{2}{3} \right] \, .
\end{displaymath}

In other words, if we start with $P(X_0=D)=1/3$ then $P(X_1=D)=1/3$, and analogously for W. This means that $X_0$ and $X_1$ have the same distribution.

A probability vector $\alpha$ is called an initial distribution for the chain if

\begin{displaymath}P(X_0=i) = \alpha_i \, .
\end{displaymath}

A Markov Chain is stationary if

\begin{displaymath}P(X_1=i) = P(X_0=i)
\end{displaymath}

for all i.

An initial distribution is called stationary if the chain is stationary. We find that $\alpha$ is a stationary initial distribution if and only if

\begin{displaymath}\alpha {\bf P}=\alpha \, .
\end{displaymath}
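For a two-state chain with ${\bf P}=\left[\begin{smallmatrix}1-a & a \\ b & 1-b\end{smallmatrix}\right]$, solving $\alpha{\bf P}=\alpha$ together with $\alpha_1+\alpha_2=1$ gives $\alpha = \left(b/(a+b),\ a/(a+b)\right)$. A sketch checking this for the rainfall example (where $a=2/5$, $b=1/5$):

```python
from fractions import Fraction as F

# Two-state chain: a = P(Dry -> Wet), b = P(Wet -> Dry).
a, b = F(2, 5), F(1, 5)           # rainfall example
alpha = [b / (a + b), a / (a + b)]
print(alpha)                      # alpha_D = 1/3, alpha_W = 2/3

# Check that alpha P = alpha:
P = [[1 - a, a], [b, 1 - b]]
assert [sum(alpha[i] * P[i][j] for i in range(2)) for j in range(2)] == alpha
```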

Suppose ${\bf P}^n$ converges to some matrix ${\bf P}^\infty$. Notice that

\begin{displaymath}\lim_{n\to\infty} {\bf P}^{n-1} = {\bf P}^\infty
\end{displaymath}

and
\begin{align*}{\bf P}^\infty & = \lim {\bf P}^n
\\
& = \left[\lim {\bf P}^{n-1}\right] {\bf P}
\\
& = {\bf P}^\infty {\bf P}\, .
\end{align*}

This proves that each row $\alpha$ of ${\bf P}^\infty$ satisfies

\begin{displaymath}\alpha = \alpha {\bf P}\, .
\end{displaymath}

Def'n: A row vector x is a left eigenvector of A with eigenvalue $\lambda$ if

\begin{displaymath}xA=\lambda x \, .
\end{displaymath}

So each row of ${\bf P}^\infty$ is a left eigenvector of ${\bf P}$ with eigenvalue 1.


Richard Lockhart
2000-10-02