STAT 350: Lecture 35
Estimating equations: an introduction via glm
Estimating Equations: refers to equations of the form
$$U(\theta) = 0$$
which are solved for $\theta$ to get estimates $\hat\theta$.

Examples:
$$\frac{\partial \ell(\theta)}{\partial \theta} = 0,$$
where $\ell$ is the log-likelihood;
$$\sum_{i=1}^n x_i \left( Y_i - x_i^T \beta \right) = 0$$
(the least squares equations for linear regression);
$$\sum_{i=1}^n \frac{\partial \mu_i(\beta)}{\partial \beta} \left( Y_i - \mu_i(\beta) \right) = 0$$
(non-linear least squares); and
$$\sum_{i=1}^n \frac{\partial \mu_i(\beta)}{\partial \beta} \, \frac{Y_i - \mu_i(\beta)}{\sigma_i^2(\beta)} = 0,$$
where, in a generalized linear model, the variance $\sigma_i^2$ is a known (except possibly for a multiplicative constant) function of the mean $\mu_i$.
Only the first of these equations can usually be solved analytically. In Lecture 34 I showed you an example of an iterative technique for solving such equations.
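To make the iteration concrete, here is a minimal Newton-Raphson sketch in Python (my illustration, not from the lecture); the data and the Poisson-mean example are made up.

```python
import numpy as np

def newton_solve(u, u_prime, theta0, tol=1e-8, max_iter=50):
    """Solve the estimating equation U(theta) = 0 by Newton-Raphson."""
    theta = theta0
    for _ in range(max_iter):
        step = u(theta) / u_prime(theta)
        theta -= step
        if abs(step) < tol:
            return theta
    raise RuntimeError("Newton-Raphson did not converge")

# Example: likelihood equation for a Poisson mean, U(mu) = sum(y)/mu - n
y = np.array([3, 1, 4, 1, 5])
u = lambda mu: y.sum() / mu - len(y)
u_prime = lambda mu: -y.sum() / mu**2
print(newton_solve(u, u_prime, theta0=1.0))  # converges to y.mean() = 2.8
```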
Theory of Generalized Linear Models
The likelihood function for a Poisson regression model is
$$L = \prod_{i=1}^n \frac{e^{-\mu_i} \mu_i^{Y_i}}{Y_i!}$$
and the log-likelihood is
$$\ell = \sum_{i=1}^n \left[ Y_i \log \mu_i - \mu_i - \log(Y_i!) \right].$$
A typical glm model is
$$\mu_i = e^{x_i^T \beta},$$
where the $x_i$ are covariate values for the $i$th observation (often including an intercept term just as in standard linear regression).
In this case the log-likelihood is
$$\ell(\beta) = \sum_{i=1}^n \left[ Y_i x_i^T \beta - e^{x_i^T \beta} - \log(Y_i!) \right],$$
which should be treated as a function of $\beta$ and maximized.
The derivative of this log-likelihood with respect to $\beta_j$ is
$$\frac{\partial \ell}{\partial \beta_j} = \sum_{i=1}^n x_{ij} \left( Y_i - e^{x_i^T \beta} \right) = \sum_{i=1}^n x_{ij} \left( Y_i - \mu_i \right).$$
If $\beta$ has $p$ components then setting these $p$ derivatives equal to 0 gives the likelihood equations.
For a Poisson model the variance is given by
$$\sigma_i^2 = \operatorname{Var}(Y_i) = \mu_i,$$
so, since $\partial \mu_i / \partial \beta_j = \mu_i x_{ij}$, the likelihood equations can be written as
$$\sum_{i=1}^n \frac{\partial \mu_i}{\partial \beta_j} \, \frac{Y_i - \mu_i}{\sigma_i^2} = 0,$$
which is the fourth equation above.
These equations are solved iteratively, as in non-linear regression, but with the iteration now involving weighted least squares. The resulting scheme is called iteratively reweighted least squares.
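Here is a minimal sketch of that scheme for the Poisson model with log link, written in Python/numpy (my illustration; the data and starting value are made up):

```python
import numpy as np

def irls_poisson(X, y, n_iter=25, tol=1e-10):
    """Fit a Poisson regression with log link by
    iteratively reweighted least squares (IRLS)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta                # linear predictor
        mu = np.exp(eta)              # fitted means
        w = mu                        # working weights (Poisson variance = mean)
        z = eta + (y - mu) / mu       # working response
        # weighted least squares step: solve (X'WX) beta = X'Wz
        WX = X * w[:, None]
        beta_new = np.linalg.solve(X.T @ WX, X.T @ (w * z))
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

# made-up example data
rng = np.random.default_rng(0)
x = rng.uniform(0, 2, size=100)
X = np.column_stack([np.ones(100), x])   # intercept plus one covariate
y = rng.poisson(np.exp(0.5 + 0.8 * x))
print(irls_poisson(X, y))                # estimates of (0.5, 0.8)
```

Each pass computes the working response $z_i = \eta_i + (Y_i - \mu_i)/\mu_i$ and weights $w_i = \mu_i$, then takes one weighted least squares step; for the Poisson model with log link this is exactly Fisher scoring.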
If the iterates $\hat\beta_k$ converge as $k \to \infty$ to something, say, $\hat\beta$, then since the steps $\hat\beta_{k+1} - \hat\beta_k$ tend to 0 and each step vanishes only where $U$ does, we learn that $\hat\beta$ must be a root of the equation
$$U(\beta) = \sum_{i=1}^n \frac{\partial \mu_i(\beta)}{\partial \beta} \, \frac{Y_i - \mu_i(\beta)}{\sigma_i^2(\beta)} = 0,$$
which is the last of our example estimating equations.
Distribution of Estimators
Distribution Theory is the subject of computing the distribution of statistics, estimators, and pivots. Examples in this course are the multivariate normal distribution, the theorems about the chi-squared distribution of quadratic forms, the theorems that $F$ statistics have $F$ distributions when the null hypothesis is true, and the theorems that show a $t$ pivot has a $t$ distribution.
Exact Distribution Theory: the name applied to exact results, such as those in the examples just given, which hold when the errors are assumed to have exactly normal distributions.
Asymptotic or Large Sample Distribution Theory: the same sort of conclusions, but only approximately true and assuming $n$ is large. Theorems of the form:
$$\lim_{n \to \infty} P\left( \frac{\hat\theta - \theta}{\hat\sigma} \le x \right) = P\left( N(0,1) \le x \right).$$
Sketch of the reasoning in a special case
POISSON EXAMPLE: p=1
Assume $Y_i$ has a Poisson distribution with mean
$$\mu_i = e^{x_i \beta},$$
where now $\beta$ is a scalar. The estimating equation (the likelihood equation) is
$$U(\beta) = \sum_{i=1}^n x_i \left( Y_i - e^{x_i \beta} \right) = 0.$$
It is now important to distinguish between a value of $\beta$ which we are trying out in the estimating equation and the true value of $\beta$, which I will call $\beta_0$. If we happen to try out the true value of $\beta$ in $U$ then we find
$$E\left[ U(\beta_0) \right] = \sum_{i=1}^n x_i \left( E[Y_i] - e^{x_i \beta_0} \right) = 0.$$
On the other hand, if we try out a value of $\beta$ other than the correct one we find
$$E\left[ U(\beta) \right] = \sum_{i=1}^n x_i \left( e^{x_i \beta_0} - e^{x_i \beta} \right) \neq 0.$$
But $U(\beta)$ is a sum of independent random variables, so by the law of large numbers (law of averages) it must be close to its expected value.
This means: if we stick in a value of $\beta$ far from the right value we will not get 0, while if we stick in a value of $\beta$ close to the right answer we will get something close to 0. This can sometimes be turned into the assertion: the glm estimate of $\beta$ is consistent, that is, it converges to the correct answer as the sample size goes to $\infty$.
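As a quick illustration (my own, not from the lecture), the following sketch simulates Poisson data at a made-up true value $\beta_0 = 0.8$ and evaluates $U(\beta)/n$ at the true value and at a wrong value; by the law of large numbers the first is near 0 and the second is not.

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta0 = 10_000, 0.8                  # made-up sample size and true value
x = rng.uniform(0, 2, size=n)
y = rng.poisson(np.exp(x * beta0))

def U(beta):
    """Estimating equation for the scalar Poisson model."""
    return np.sum(x * (y - np.exp(x * beta)))

print(U(beta0) / n)   # close to 0
print(U(1.2) / n)     # clearly away from 0
```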
The next theoretical step is another linearization. If $\hat\beta$ is the root of the equation, that is, $U(\hat\beta) = 0$, then
$$0 = U(\hat\beta) \approx U(\beta_0) + \left( \hat\beta - \beta_0 \right) U'(\beta_0).$$
This is a Taylor expansion. In our case the derivative $U'(\beta)$ is
$$U'(\beta) = -\sum_{i=1}^n x_i^2 e^{x_i \beta},$$
so that approximately
$$\hat\beta - \beta_0 \approx \frac{U(\beta_0)}{\sum_{i=1}^n x_i^2 e^{x_i \beta_0}} = \frac{\sum_{i=1}^n x_i \left( Y_i - \mu_i \right)}{\sum_{i=1}^n x_i^2 \mu_i}.$$
The right hand side of this formula has expected value 0 and variance
$$\frac{\sum_{i=1}^n x_i^2 \operatorname{Var}(Y_i)}{\left( \sum_{i=1}^n x_i^2 \mu_i \right)^2},$$
which simplifies, using $\operatorname{Var}(Y_i) = \mu_i$, to
$$\frac{1}{\sum_{i=1}^n x_i^2 \mu_i}.$$
This means that an approximate standard error of $\hat\beta$ is
$$\left( \sum_{i=1}^n x_i^2 \mu_i \right)^{-1/2}$$
and that an estimated approximate standard error is
$$\left( \sum_{i=1}^n x_i^2 \hat\mu_i \right)^{-1/2}, \qquad \hat\mu_i = e^{x_i \hat\beta}.$$
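In numpy this estimate is one line; a small sketch in the same made-up simulation style as above, with beta_hat taken from any of the solvers:

```python
import numpy as np

def poisson_se(x, beta_hat):
    """Estimated approximate standard error of beta_hat in the scalar
    Poisson model: 1/sqrt(sum_i x_i^2 * mu_hat_i), mu_hat_i = exp(x_i*beta_hat)."""
    mu_hat = np.exp(x * beta_hat)
    return 1.0 / np.sqrt(np.sum(x**2 * mu_hat))

# e.g. poisson_se(np.array([0.1, 0.5, 1.0]), beta_hat=0.8)
```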
Finally, since the formula shows that $\hat\beta - \beta_0$ is a sum of independent terms, the central limit theorem suggests that $\hat\beta$ has an approximate normal distribution and that
$$\left( \hat\beta - \beta_0 \right) \sqrt{\sum_{i=1}^n x_i^2 \hat\mu_i}$$
is an approximate pivot with approximately a $N(0,1)$ distribution.
You should be able to turn this assertion into a 95% (approximate) confidence interval for $\beta$.
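For reference, inverting the pivot in the usual way (a standard step, spelled out here rather than taken from the lecture) gives
$$\hat\beta \pm \frac{1.96}{\sqrt{\sum_{i=1}^n x_i^2 \hat\mu_i}}.$$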
Scope of these ideas
The ideas in the above calculation can be used in many contexts.
Further exploration of the ideas in this course