STAT 350: Lecture 33
Theory of Generalized Linear Models
If $Y$ has a Poisson distribution with parameter $\mu$ then
$$P(Y = k) = \frac{e^{-\mu}\mu^k}{k!}$$
for $k$ a non-negative integer. We can use the method of maximum likelihood to estimate $\mu$ if we have a sample $Y_1, \ldots, Y_n$ of independent Poisson random variables all with mean $\mu$. If we observe $Y_1 = k_1$, $Y_2 = k_2$, and so on then the likelihood function is
$$L(\mu) = \prod_{i=1}^n \frac{e^{-\mu}\mu^{k_i}}{k_i!}.$$
This function of $\mu$ can be maximized by maximizing its logarithm, the log likelihood function
$$\ell(\mu) = \sum_{i=1}^n \left\{ -\mu + k_i \log\mu - \log(k_i!) \right\}.$$
We set the derivative of the log likelihood with respect to $\mu$ equal to 0, getting
$$0 = \frac{d\ell}{d\mu} = \sum_{i=1}^n \left( \frac{k_i}{\mu} - 1 \right) = \frac{\sum_{i=1}^n k_i}{\mu} - n.$$
This is the likelihood equation and its solution $\hat\mu = \sum_{i=1}^n k_i / n = \bar{k}$, the sample mean, is the maximum likelihood estimate of $\mu$.
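A quick numerical check, sketched in S (the language of the Splus examples at the end of these notes), with made-up counts:

    # Verify that the sample mean maximizes the Poisson log likelihood.
    k <- c(3, 1, 4, 2, 5, 2)                 # hypothetical observed counts
    loglik <- function(mu) sum(-mu + k * log(mu) - lgamma(k + 1))
    optimize(loglik, interval = c(0.01, 20), maximum = TRUE)$maximum
    mean(k)                                  # agrees, up to numerical error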
In a regression problem all the $Y_i$ will have different means $\mu_i$. Our log-likelihood is now
$$\ell = \sum_{i=1}^n \left\{ -\mu_i + k_i \log\mu_i - \log(k_i!) \right\}.$$
If we treat all n of the $\mu_i$ as separate, unrelated parameters we can maximize the log likelihood by setting each of the n partial derivatives with respect to $\mu_k$ for k from 1 to n equal to 0. The kth of these n equations is just
$$0 = \frac{\partial \ell}{\partial \mu_k} = \frac{k_k}{\mu_k} - 1.$$
This leads to $\hat\mu_k = k_k$: each mean is estimated by the corresponding observed count. In glm jargon this model is the saturated model.
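A sketch in S of fitting this saturated model directly, again with hypothetical counts; the fitted means reproduce the data exactly:

    # Saturated Poisson model: one mean parameter per observation.
    k <- c(3, 1, 4, 2, 5, 2)
    fit <- glm(k ~ factor(1:length(k)) - 1, family = poisson)
    fitted(fit)                              # equals k: mu-hat_i = k_i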
A more useful model is one in which there are fewer parameters but more than 1. A typical glm model is
$$\mu_i = \exp(x_i^T \beta)$$
where the $x_i$ are covariate values for the ith observation (often including an intercept term just as in standard linear regression). In this case the log-likelihood is
$$\ell(\beta) = \sum_{i=1}^n \left\{ -\mu_i(\beta) + k_i \log\mu_i(\beta) - \log(k_i!) \right\}$$
which should be treated as a function of $\beta$ and maximized. Since $\partial\mu_i/\partial\beta_j = \mu_i x_{ij}$ for this model, the derivative of the log-likelihood with respect to $\beta_j$ is
$$\frac{\partial\ell}{\partial\beta_j} = \sum_{i=1}^n \left( \frac{k_i}{\mu_i} - 1 \right) \mu_i x_{ij} = \sum_{i=1}^n (k_i - \mu_i) x_{ij}.$$
If $\beta$ has p components then setting these p derivatives equal to 0 gives the likelihood equations.
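A sketch in S, using hypothetical data, which fits this Poisson log-linear model with glm and then checks that the likelihood equations hold at the fitted values:

    # Poisson regression with the (default) log link.
    x <- c(1, 2, 3, 4, 5, 6)                 # hypothetical covariate
    k <- c(2, 3, 3, 6, 8, 14)                # hypothetical counts
    fit <- glm(k ~ x, family = poisson)
    mu.hat <- fitted(fit)
    # Likelihood equations: sum of (k_i - mu_i) x_ij is 0 for each covariate.
    X <- cbind(1, x)                         # design matrix, intercept included
    t(X) %*% (k - mu.hat)                    # both entries are 0 up to rounding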
It is no longer possible to solve the likelihood equations analytically. We have, instead, to settle for numerical techniques. One common technique is called iteratively re-weighted least squares. For a Poisson variable with mean $\mu_i$ the variance is $\sigma_i^2 = \mu_i$. Ignore for a moment the fact that if we knew $\mu_i$ we would know $\sigma_i^2$ and consider fitting our model by least squares with the $\sigma_i^2$ known. We would minimize (see our discussion of weighted least squares)
$$\sum_{i=1}^n \frac{\left( k_i - \mu_i(\beta) \right)^2}{\sigma_i^2}$$
by taking the derivative with respect to $\beta_j$ and (again ignoring the fact that $\sigma_i^2$ depends on $\beta$) we would get
$$-2 \sum_{i=1}^n \frac{k_i - \mu_i}{\sigma_i^2} \, \frac{\partial\mu_i}{\partial\beta_j} = 0.$$
But the derivative of $\mu_i$ with respect to $\beta_j$ is $\mu_i x_{ij}$ and replacing $\sigma_i^2$ by $\mu_i$ we get the equation
$$\sum_{i=1}^n (k_i - \mu_i) x_{ij} = 0,$$
exactly as before. This motivates the following estimation scheme:

1. Begin with a guess for the standard deviations $\sigma_i$ (taking them all equal to 1 is simple).
2. Do (non-linear) least squares, weighting observation i by $1/\sigma_i^2$, to get an estimate $\hat\beta$.
3. Use $\hat\beta$ to compute fitted means $\hat\mu_i$ and update the variance estimates via $\hat\sigma_i^2 = \hat\mu_i$.
4. Return to step 2 and iterate until the estimates stop changing.
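A minimal sketch of this scheme in S for the Poisson log-linear model, using the same hypothetical data as above; it combines the re-weighting with the linearization described in the next section, and its answer should match that of glm:

    # Iteratively re-weighted least squares for the Poisson log-linear model.
    x <- c(1, 2, 3, 4, 5, 6)
    k <- c(2, 3, 3, 6, 8, 14)
    X <- cbind(1, x)                         # design matrix with intercept
    beta <- c(0, 0)                          # starting guess
    for (iter in 1:25) {
      eta <- as.vector(X %*% beta)           # linear predictor
      mu  <- exp(eta)                        # fitted means
      w   <- mu                              # weight: (dmu/deta)^2 / sigma_i^2 = mu_i
      z   <- eta + (k - mu) / mu             # linearized ("working") response
      beta <- as.vector(solve(t(X) %*% (w * X), t(X) %*% (w * z)))
    }
    beta                                     # matches coef(glm(k ~ x, family = poisson))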
Non-linear least squares
In a regression model of the form
$$Y_i = f_i(\beta) + \epsilon_i$$
we call the model linear if $f_i(\beta) = x_i^T \beta$ for some covariates $x_i$ and a parameter $\beta$. Often, however, we have non-linear models such as
$$f_i(\beta) = \beta_1 \left( 1 - e^{-x_i/\beta_2} \right)$$
or in general $f_i(\beta)$ for some function $f$ which need not be linear in $\beta$. If the errors are homoscedastic normal errors then the maximum likelihood estimates of the $\beta_j$ are least squares estimates, minimizing
$$\sum_{i=1}^n \left( Y_i - f_i(\beta) \right)^2.$$
The value $\hat\beta$ then solves the p equations
$$\sum_{i=1}^n \left( Y_i - f_i(\hat\beta) \right) \frac{\partial f_i(\hat\beta)}{\partial\beta_j} = 0, \qquad j = 1, \ldots, p.$$
These equations can usually not be solved analytically. In general the process of solving them numerically is iterative. One approach is:
Begin with an initial guess $\beta^{(0)}$. Approximate $f_i(\beta)$ near $\beta^{(0)}$ by the linear part of its Taylor expansion,
$$f_i(\beta) \approx f_i(\beta^{(0)}) + \left( \beta - \beta^{(0)} \right)^T \nabla f_i(\beta^{(0)}),$$
where $\nabla f_i(\beta)$ is the vector of partial derivatives of $f_i$ with respect to the entries in $\beta$. Then minimize
$$\sum_{i=1}^n \left( Y_i - f_i(\beta^{(0)}) - \left( \beta - \beta^{(0)} \right)^T \nabla f_i(\beta^{(0)}) \right)^2$$
with respect to $\delta = \beta - \beta^{(0)}$. This is an ordinary linear least squares problem in which the design matrix $X$ has entries $X_{ij} = \partial f_i(\beta^{(0)})/\partial\beta_j$ and the response is the residual vector $r$ with entries $r_i = Y_i - f_i(\beta^{(0)})$. The solution is
$$\hat\delta = \left( X^T X \right)^{-1} X^T r.$$
We compute this, set $\beta^{(1)} = \beta^{(0)} + \hat\delta$ and repeat the process, iterating until the estimates converge.
Example
In thermoluminescence dating (see Lecture 1) a common model is a saturating exponential,
$$Y_i = \beta_1 \left( 1 - e^{-x_i/\beta_2} \right) + \epsilon_i.$$
Thus in this case
$$f_i(\beta) = \beta_1 \left( 1 - e^{-x_i/\beta_2} \right).$$
The derivatives of $f_i$ with respect to the entries of $\beta$ are
$$\frac{\partial f_i}{\partial\beta_1} = 1 - e^{-x_i/\beta_2}$$
and
$$\frac{\partial f_i}{\partial\beta_2} = -\frac{\beta_1 x_i}{\beta_2^2} \, e^{-x_i/\beta_2}.$$
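A sketch of the iteration above for this model in S, with simulated data standing in for the Lecture 1 data (the true parameter values and the starting guess are made up):

    # Gauss-Newton for Y = beta1 * (1 - exp(-x / beta2)) + error.
    set.seed(1)
    x <- 1:10
    y <- 5 * (1 - exp(-x / 4)) + rnorm(length(x), sd = 0.1)  # simulated data
    beta <- c(4, 3)                              # initial guess
    for (iter in 1:50) {
      e <- exp(-x / beta[2])
      f <- beta[1] * (1 - e)                     # f_i(beta)
      X <- cbind(1 - e,                          # df/dbeta1
                 -beta[1] * x * e / beta[2]^2)   # df/dbeta2
      delta <- as.vector(solve(t(X) %*% X, t(X) %*% (y - f)))
      beta  <- beta + delta                      # beta^(new) = beta^(old) + delta-hat
    }
    beta                                         # close to the true (5, 4)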
Software and computing details
In SAS the procedure proc nlin will do non-linear least squares.
You need to specify quite a few ingredients. The model statement must
specify the exact functional form of the function $f_i(\beta)$. You have a
statement parms which is needed to specify the initial guess for $\beta$.
You usually will have a bunch of der statements to specify derivatives of
the predicted values with respect to each parameter and, if you use some algorithms,
you will even specify some second derivatives. (You get a choice of 5 different
iterative algorithms. The one described above is called Gauss-Newton.)
In Splus the function nls can be used to do the same thing.
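In modern S/R dialects a call for the thermoluminescence model might look like this (a sketch; the data are the simulated values from the Gauss-Newton example above, and the start list plays the role of the parms statement in SAS):

    # Non-linear least squares with nls; Gauss-Newton is its default algorithm.
    set.seed(1)
    x <- 1:10
    y <- 5 * (1 - exp(-x / 4)) + rnorm(length(x), sd = 0.1)
    fit <- nls(y ~ b1 * (1 - exp(-x / b2)), start = list(b1 = 4, b2 = 3))
    summary(fit)                                 # estimates, standard errors, etc.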