
STAT 450

Lecture 18


Reading for Today's Lecture:

Goals of Today's Lecture:


Estimating Equations

An equation of the form

\begin{displaymath}U(X_1,\ldots,X_n;\theta)=0
\end{displaymath}

is called an estimating equation; we find an estimate of $\theta$ by solving the equation.

Examples:

1.
The likelihood equations:

\begin{displaymath}\frac{\partial\ell(\theta)}{\partial\theta} = U(X_1,\ldots,X_n;\theta)=0
\end{displaymath}

2.
The normal equations in a linear regression model:

\begin{displaymath}X^tX\beta = X^tY
\end{displaymath}

are of the form

\begin{displaymath}X^t(Y-X\beta) \equiv U(X,Y,\beta) = 0
\end{displaymath}

3.
The method of moments (later in this course); a tiny worked instance follows this list.

4.
Quasi-likelihood estimation as in STAT 402.
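
For concreteness, here is a tiny instance of example 3. If $X_1,\ldots,X_n$ are iid Poisson($\theta$) then $\text{E}_\theta(X_1) = \theta$, and matching the sample mean to its expectation gives the estimating equation

\begin{displaymath}U(X_1,\ldots,X_n;\theta) = \bar{X} - \theta = 0
\end{displaymath}

whose root is simply $\hat\theta=\bar{X}$. In this model the likelihood equation of example 1 is $\sum X_i/\theta - n = 0$, which is the same equation multiplied by $n/\theta$.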

Definition: An estimating equation is unbiased if

\begin{displaymath}\text{E}_\theta\left[U(X_1,\ldots,X_n;\theta)\right] = 0
\end{displaymath}

It is a fact that in regular models the likelihood equations are unbiased.
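
Here is a sketch of why, assuming the model is regular enough that we may differentiate under the integral sign; $f(x;\theta)$ denotes the joint density of the data, so that $U = \partial\log f(X;\theta)/\partial\theta$. Then
\begin{align*}\text{E}_\theta\left[U(X_1,\ldots,X_n;\theta)\right]
& = \int \frac{\partial f(x;\theta)/\partial\theta}{f(x;\theta)}\, f(x;\theta)\, dx
= \int \frac{\partial f(x;\theta)}{\partial\theta}\, dx
\\
& = \frac{\partial}{\partial\theta}\int f(x;\theta)\, dx
= \frac{\partial}{\partial\theta}\, 1 = 0
\end{align*}
The interchange of derivative and integral is exactly where the regularity conditions are used.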

It is a fact that unbiased estimating equations often lead to consistent estimates:

Theorem: Assume ``regularity conditions''. If

a)
$\hat\theta$ is a root of $U(X_1,\ldots,X_n;\theta)=0$ and

b)
$
\text{E}_\theta\left[U(X_1,\ldots,X_n;\theta)\right] = 0
$
then
1.
$\hat\theta$ is consistent.

2.
$\hat\theta-\theta \approx MVN(0,\Sigma)$ where

\begin{displaymath}\Sigma = A^{-1}B A^{-1}
\end{displaymath}

with

\begin{displaymath}B= \text{Var}_\theta (U(X_1,\ldots,X_n;\theta))
\end{displaymath}

and

\begin{displaymath}A = - \text{E}\left[\frac{\partial}{\partial\theta}
U(X_1,\ldots,X_n;\theta)\right]
\end{displaymath}
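
The sandwich form of $\Sigma$ can be guessed from a one-term Taylor expansion of $U$ about the true $\theta$; this is only a heuristic, with $\partial U/\partial\theta$ replaced by its expected value $-A$:
\begin{align*}0 = U(X_1,\ldots,X_n;\hat\theta)
& \approx U(X_1,\ldots,X_n;\theta) - A(\hat\theta-\theta)
\\
\hat\theta-\theta & \approx A^{-1} U(X_1,\ldots,X_n;\theta)
\end{align*}
Since $U(X_1,\ldots,X_n;\theta)$ has mean 0 and variance $B$, the right hand side has variance $A^{-1}B(A^{-1})^t$, which is the $A^{-1}BA^{-1}$ of the theorem whenever $A$ is symmetric, as it is for likelihood equations and in the example below.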

Example: $N(\mu,\sigma^2)$ sample: Suppose $X_1,\ldots,X_n$ are iid $N(\mu,\sigma^2)$. The score function is

\begin{displaymath}U(\mu,\sigma) = \left[\begin{array}{c}
\frac{\sum(X_i-\mu)}{\sigma^2} \\
\frac{\sum(X_i-\mu)^2}{\sigma^3} - \frac{n}{\sigma}
\end{array}\right]
\end{displaymath}
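
To see where the two components come from, write down the log likelihood (up to a constant not involving the parameters) and differentiate with respect to $\mu$ and then $\sigma$:
\begin{align*}\ell(\mu,\sigma) & = -n\log\sigma - \frac{\sum(X_i-\mu)^2}{2\sigma^2}
\\
\frac{\partial\ell}{\partial\mu} & = \frac{\sum(X_i-\mu)}{\sigma^2}
\qquad\qquad
\frac{\partial\ell}{\partial\sigma} = \frac{\sum(X_i-\mu)^2}{\sigma^3} - \frac{n}{\sigma}
\end{align*}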

I want to illustrate the fact that the likelihood equations are unbiased and to hint at how the first assertion is proved. To do so, I need notation for the true value of $\mu$ and $\sigma$. I use $\mu_0$ and $\sigma_0$ for these values and then let $\mu$ and $\sigma$ without the subscripts be the arguments of the random function U. Since $\text{E}(X_i) = \mu_0$ and
\begin{align*}\text{E}[(X_i-\mu)^2] & = \text{E}[(X_i-\mu_0+\mu_0-\mu)^2]
\\
& = \text{E}[(X_i-\mu_0)^2 + 2(X_i-\mu_0)(\mu_0-\mu) + (\mu_0-\mu)^2]
\\
& = \sigma_0^2 +(\mu_0-\mu)^2
\end{align*}
we can compute

\begin{displaymath}\text{E}_{\mu_0,\sigma_0}[U(X_1,\ldots,X_n;\mu,\sigma)]
=
\left[\begin{array}{c}
\frac{n(\mu_0-\mu)}{\sigma^2} \\
\frac{n(\sigma_0^2 +(\mu_0-\mu)^2)}{\sigma^3} - \frac{n}{\sigma}
\end{array}\right]
\end{displaymath}

I want you to see that if $\mu=\mu_0$ and $\sigma=\sigma_0$ then this expectation is 0 and vice-versa.
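
To spell the last claim out: set the two components of the expectation equal to 0 and solve. The first component gives

\begin{displaymath}\frac{n(\mu_0-\mu)}{\sigma^2} = 0 \Rightarrow \mu=\mu_0
\end{displaymath}

and then the second component gives

\begin{displaymath}\frac{n\sigma_0^2}{\sigma^3} - \frac{n}{\sigma} = 0 \Rightarrow \sigma^2=\sigma_0^2
\end{displaymath}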

Now imagine $\epsilon>0$ is tiny. If n is large then (probably) $\vert\bar{X} - \mu_0\vert < \epsilon$. This would guarantee that

\begin{displaymath}\frac{\sum(X_i-(\mu_0+\epsilon))}{\sigma^2} < 0\end{displaymath}

and

\begin{displaymath}\frac{\sum(X_i-(\mu_0-\epsilon))}{\sigma^2} > 0\end{displaymath}

Since the first component of U is a continuous function of $\mu$ and has opposite signs at $\mu_0\pm \epsilon$, the root of the equation must be within $\epsilon$ of the true answer. This is roughly how we prove consistency.

Now we will work out the other pieces of the theorem. I will compute B, A and $\Sigma$ in this example. In order to do so I will use a variety of rules for working with variances and covariances. Remember that if X is a random vector with $\text{E}(X) = \mu$ then

\begin{displaymath}\text{Var}(X) = \text{E}[(X-\mu)(X-\mu)^t ] = \text{E}[XX^t] - \mu\mu^t
\end{displaymath}

The ijth entry in this matrix is

\begin{displaymath}\text{Cov}(X_i,X_j) = \text{E}[(X_i-\mu_i)(X_j-\mu_j)] =
\text{E}[X_iX_j] - \mu_i\mu_j
\end{displaymath}

We need the following properties of Var and Cov; they are combined in a short worked check after the list. In what follows the $a$s and $b$s are constants while the $V$s and $W$s are random variables, not independent unless I say so.

1.
$\text{Var}(aV+b) = a^2 \text{Var}(V)$.

2.
If $W_1,\ldots, W_n$ are independent then

\begin{displaymath}\text{Var}(\sum a_i W_i) = \sum a_i^2 \text{Var}(W_i)
\end{displaymath}

3.
If V and W are independent then $\text{Cov}(V,W) =0$.

4.
$\text{Cov}(\sum a_i W_i, V) = \sum a_i \text{Cov}(W_i,V) $

5.
$\text{Cov}(V,\sum a_i W_i) = \sum a_i \text{Cov}(V,W_i) $

6.
$\text{Cov}(V,W)= \text{Cov}(W,V)$

7.
$\text{Cov}(V,V) = \text{Var}(V)$

8.
$\text{Cov}(W,a) = 0$
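
To see how these rules fit together (the same manipulation is used for the entries of $B$ below), rules 4, 5, 7 and 3 recover rule 2: for independent $W_1,\ldots,W_n$,
\begin{align*}\text{Var}(\sum_i a_i W_i)
& = \text{Cov}(\sum_i a_i W_i, \sum_j a_j W_j)
= \sum_i \sum_j a_i a_j \text{Cov}(W_i,W_j)
\\
& = \sum_i a_i^2 \text{Var}(W_i)
\end{align*}
since the cross terms vanish by independence (rule 3) and the diagonal terms are variances (rule 7).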

I now intend to compute $\text{Var}_{\theta_0}(U(\theta_0))$. To save typing effort I drop the subscripts on $\theta_0$ but I remind you that the calculation now only works when the argument of $U$ is the true value of the parameter $\theta$.

We have

\begin{displaymath}B = \left[ \begin{array}{cc}
\text{Var}(\sum(X_i-\mu)/\sigma^2) & \text{Cov}(\sum(X_i-\mu)/\sigma^2,\sum(X_i-\mu)^2/\sigma^3-n/\sigma) \\
\text{Cov}(\sum(X_i-\mu)/\sigma^2,\sum(X_i-\mu)^2/\sigma^3-n/\sigma) & \text{Var}(\sum(X_i-\mu)^2/\sigma^3-n/\sigma)
\end{array}\right]
\end{displaymath}

To compute the individual entries it is simpler to define $Z_i=(X_i-\mu)/\sigma$ and remember that $Z_i \sim N(0,1)$. Thus $\text{E}(Z_i) = 0$, $\text{E}(Z_i^2) = 1$, $\text{E}(Z_i^3) = 0$ and $\text{E}(Z_i^4) = 3$. Then

\begin{displaymath}\text{Var}(\sum(X_i-\mu)/\sigma^2) = \text{Var}(\sum Z_i) / \sigma^2
= n/\sigma^2
\end{displaymath}


\begin{displaymath}\text{Var}(\sum(X_i-\mu)^2/\sigma^3-n/\sigma) = \text{Var}(\sum
Z_i^2)/\sigma^2
\end{displaymath}

Since $\text{Var}(Z_i^2) = \text{E}(Z_i^4) - \left[
\text{E}(Z_i^2)\right]^2 = 3-1=2$ we find

\begin{displaymath}\text{Var}(\sum(X_i-\mu)^2/\sigma^3-n/\sigma) = 2n/\sigma^2
\end{displaymath}

Finally

\begin{displaymath}\text{Cov}(\sum(X_i-\mu)/\sigma^2,\sum(X_i-\mu)^2/\sigma^3-n/\sigma)
= \text{Cov}(\sum_i Z_i,\sum_j Z_j^2)/\sigma^2 =
\sum_{ij}\text{Cov}(Z_i,Z_j^2)/\sigma^2
\end{displaymath}

For $i \neq j$, $\text{Cov}(Z_i,Z_j^2) = 0$ by independence. For $i=j$

\begin{displaymath}\text{Cov}(Z_i,Z_i^2 ) = \text{E}(Z_i^3) -
\text{E}(Z_i)\text{E}(Z_i^2) =0
\end{displaymath}

Assemble these to discover

\begin{displaymath}B = \left[\begin{array}{cc}
\frac{n}{\sigma^2} & 0 \\ 0 & \frac{2n}{\sigma^2}
\end{array}\right]
\end{displaymath}

Next: compute $A=-\text{E}(\partial U/ \partial\theta)$. First
\begin{align*}\frac{\partial U(\mu,\sigma)}{\partial \theta}
& = \left[
\begin{array}{cc}
-\frac{n}{\sigma^2} & -\frac{2\sum(X_i-\mu)}{\sigma^3} \\
-\frac{2\sum(X_i-\mu)}{\sigma^3} & -\frac{3\sum(X_i-\mu)^2}{\sigma^4} + \frac{n}{\sigma^2}
\end{array}\right]
\end{align*}
Take expected values to get

\begin{displaymath}A = \left[\begin{array}{cc}
\frac{n}{\sigma^2} & 0 \\ 0 & \frac{2n}{\sigma^2}
\end{array}\right] = B
\end{displaymath}
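
The off-diagonal zeros come from $\text{E}[\sum(X_i-\mu)] = 0$, and the lower right entry is $3\text{E}[\sum(X_i-\mu)^2]/\sigma^4 - n/\sigma^2 = 3n\sigma^2/\sigma^4 - n/\sigma^2 = 2n/\sigma^2$. Notice that $A=B$; this is the usual information identity for likelihood equations in regular models. The theorem now gives the $\Sigma$ promised at the start of the calculation:

\begin{displaymath}\Sigma = A^{-1}BA^{-1} = B^{-1} =
\left[\begin{array}{cc}
\frac{\sigma^2}{n} & 0 \\ 0 & \frac{\sigma^2}{2n}
\end{array}\right]
\end{displaymath}

The upper left entry is the exact variance of $\hat\mu=\bar{X}$, and the lower right entry, $\sigma^2/(2n)$, is the approximate variance of $\hat\sigma$.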


Richard Lockhart
1999-10-22