The GENMOD Procedure
This is a brief introduction to the theory of generalized linear models. See the "References" section for sources of more detailed information.
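For a response Y whose distribution belongs to the exponential family with natural parameter $\theta$, dispersion parameter $\phi$, and known weight w, standard theory gives expressions for the mean and variance of Y: $E(Y) = b'(\theta)$ and $\mathrm{Var}(Y) = b''(\theta)\,\phi/w$, where b is the function of $\theta$ that appears in the exponential-family density and the primes denote derivatives with respect to $\theta$. Written in terms of the mean $\mu$, the variance is $\mathrm{Var}(Y) = V(\mu)\,\phi/w$, where V is the variance function.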
Probability distributions of the response Y in generalized linear models are usually parameterized in terms of the mean $\mu$ and dispersion parameter $\phi$ instead of the natural parameter $\theta$. The probability distributions that are available in the GENMOD procedure are shown in the following list. The PROC GENMOD scale parameter and the variance of Y are also shown.
The negative binomial distribution contains a parameter k, called the negative binomial dispersion parameter. This is not the same as the generalized linear model dispersion $\phi$, but it is an additional distribution parameter that must be estimated or set to a fixed value.
For the binomial distribution, the response is the binomial proportion Y = events/trials. The variance function is $V(\mu) = \mu(1 - \mu)$, and the binomial trials parameter n is regarded as a weight w.
If a weight variable is present, $\phi$ is replaced with $\phi/w$, where w is the weight variable.
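The relationship between the variance function, the dispersion parameter, and the weight can be sketched in a few lines of Python. This is only an illustration: the names `VARIANCE_FUNCTIONS` and `response_variance` are invented here, and the variance functions listed are the standard textbook ones for these families rather than anything read from PROC GENMOD itself.

```python
# Standard variance functions V(mu); the dictionary and function names are
# illustrative assumptions, not part of PROC GENMOD.
VARIANCE_FUNCTIONS = {
    "normal": lambda mu: 1.0,
    "poisson": lambda mu: mu,
    "binomial": lambda mu: mu * (1.0 - mu),  # response is the proportion events/trials
    "gamma": lambda mu: mu ** 2,
    "inverse_gaussian": lambda mu: mu ** 3,
}


def response_variance(dist, mu, phi=1.0, w=1.0):
    """Var(Y) = phi * V(mu) / w, so a weight w replaces phi with phi / w."""
    return phi * VARIANCE_FUNCTIONS[dist](mu) / w


# Binomial proportion with mean 0.3, with 20 trials treated as the weight.
var_prop = response_variance("binomial", mu=0.3, w=20)
```

With the trials count used as the weight, the result is the familiar variance of a binomial proportion, mu(1 - mu)/n.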
PROC GENMOD works with a scale parameter that is related to the exponential family dispersion parameter $\phi$ instead of with $\phi$ itself. The scale parameters are related to the dispersion parameter as shown previously with the probability distribution definitions. Thus, the scale parameter output in the "Analysis of Parameter Estimates" table is related to the exponential family dispersion parameter. If you specify a constant scale parameter with the SCALE= option in the MODEL statement, it is also related to the exponential family dispersion parameter in the same way.
For the binomial, multinomial, and Poisson distributions, terms involving binomial coefficients or factorials of the observed counts are dropped from the computation of the log-likelihood function since they do not affect parameter estimates or their estimated covariances.
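On the rth iteration, the algorithm updates the parameter vector $\beta_r$ with $\beta_{r+1} = \beta_r - H^{-1} s$, where H is the Hessian (second derivative) matrix and s is the gradient (first derivative) vector of the log-likelihood function, both evaluated at the current value of the parameter vector.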
In some cases, the scale parameter is estimated by maximum likelihood. In these cases, elements corresponding to the scale parameter are computed and included in s and H.
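If $\eta_i = x_i'\beta$ is the linear predictor for observation i and g is the link function, then $\eta_i = g(\mu_i)$, so that $\hat{\mu}_i = g^{-1}(x_i'\hat{\beta})$ is an estimate of the mean of the ith observation, obtained from an estimate $\hat{\beta}$ of the parameter vector $\beta$.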
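The gradient vector and Hessian matrix for the regression parameters are given by

$$s = \sum_i \frac{w_i (y_i - \mu_i)\, x_i}{V(\mu_i)\, g'(\mu_i)\, \phi}$$

$$H = -X' W_o X$$

where X is the design matrix, $x_i$ is the transpose of its ith row, V is the variance function, and g is the link function. $W_o$ is a diagonal matrix with ith diagonal element

$$w_{oi} = w_{ei} + w_i (y_i - \mu_i)\, \frac{V(\mu_i)\, g''(\mu_i) + V'(\mu_i)\, g'(\mu_i)}{V(\mu_i)^2\, g'(\mu_i)^3\, \phi}, \qquad w_{ei} = \frac{w_i}{\phi\, V(\mu_i)\, g'(\mu_i)^2}$$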
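As a rough illustration of the Newton-Raphson update (new estimate = current estimate minus the inverse Hessian times the gradient), the sketch below fits one special case: a Poisson response with a log link, an intercept, one covariate, unit weights, and dispersion fixed at 1. In that case the gradient reduces to the sum of (y_i - mu_i) x_i and the Hessian to -X'WX with W = diag(mu_i). The function name and the data are invented for the example, and safeguards such as step control that a production routine would use are omitted.

```python
import math


def poisson_newton_raphson(x, y, iterations=25, tol=1e-10):
    """Fit log(mu_i) = b0 + b1 * x_i by Newton-Raphson: beta <- beta - H^{-1} s."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient s = sum (y_i - mu_i) x_i (log link, V(mu) = mu, phi = w = 1)
        s0 = sum(yi - mi for yi, mi in zip(y, mu))
        s1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Hessian H = -X' W X with W = diag(mu_i) for this model
        h00 = -sum(mu)
        h01 = -sum(mi * xi for xi, mi in zip(x, mu))
        h11 = -sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Solve H d = s for the 2x2 case, then step beta <- beta - d
        d0 = (h11 * s0 - h01 * s1) / det
        d1 = (h00 * s1 - h01 * s0) / det
        b0, b1 = b0 - d0, b1 - d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1


x_obs = [0, 1, 2, 3, 4]      # made-up covariate values
y_obs = [1, 2, 4, 7, 12]     # made-up counts
b0, b1 = poisson_newton_raphson(x_obs, y_obs)
```

At convergence the gradient is numerically zero, which is a convenient check that the iteration has reached a stationary point of the log likelihood.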
The correlation matrix is the normalized covariance matrix. That is, if $\sigma_{ij}$ is an element of $\Sigma$, then the corresponding element of the correlation matrix is $\sigma_{ij}/(\sigma_i \sigma_j)$, where $\sigma_i = \sqrt{\sigma_{ii}}$.
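The normalization can be sketched directly: each covariance entry is divided by the product of the two corresponding standard deviations. The function name and the 2 x 2 matrix are made up for the illustration.

```python
import math


def correlation_from_covariance(cov):
    """Normalize a covariance matrix: r_ij = s_ij / (s_i * s_j), s_i = sqrt(s_ii)."""
    sd = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(len(cov))]
            for i in range(len(cov))]


# Variances 4 and 9 (standard deviations 2 and 3), covariance 1.2.
corr = correlation_from_covariance([[4.0, 1.2], [1.2, 9.0]])
```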
Note that these statistics are not valid for GEE models.
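If $L(y, \mu)$ is the log-likelihood function expressed as a function of the predicted mean values $\mu$ and the vector y of response values, then the scaled deviance is defined by $D^*(y, \mu) = 2\,(L(y, y) - L(y, \mu))$. For specific distributions, this can be expressed as $D^*(y, \mu) = D(y, \mu)/\phi$, where D is the deviance.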
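Distribution | Deviance |
normal | $\sum_i w_i (y_i - \mu_i)^2$ |
Poisson | $2 \sum_i w_i \left[ y_i \log(y_i/\mu_i) - (y_i - \mu_i) \right]$ |
binomial | $2 \sum_i w_i m_i \left[ y_i \log(y_i/\mu_i) + (1 - y_i) \log\left(\frac{1 - y_i}{1 - \mu_i}\right) \right]$ |
gamma | $2 \sum_i w_i \left[ -\log(y_i/\mu_i) + \frac{y_i - \mu_i}{\mu_i} \right]$ |
inverse Gaussian | $\sum_i \frac{w_i (y_i - \mu_i)^2}{\mu_i^2\, y_i}$ |
multinomial | $2 \sum_i \sum_j w_i\, y_{ij} \log\left(\frac{y_{ij}}{m_i\, p_{ij}}\right)$ |
negative binomial | $2 \sum_i w_i \left[ y_i \log(y_i/\mu_i) - (y_i + 1/k) \log\left(\frac{y_i + 1/k}{\mu_i + 1/k}\right) \right]$ |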
In the binomial case, $y_i = r_i/m_i$, where $r_i$ is a binomial count and $m_i$ is the binomial number of trials parameter.
In the multinomial case, $y_{ij}$ refers to the observed number of occurrences of the jth category for the ith subpopulation defined by the AGGREGATE= variable, $m_i$ is the total number in the ith subpopulation, and $p_{ij}$ is the category probability.
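As an illustration of how a deviance is evaluated, the sketch below computes the standard Poisson and binomial deviances, using the binomial convention y_i = r_i/m_i described above. The function names are invented for the example, and terms of the form 0 log 0 are taken to be zero, which is the usual convention rather than something stated in this section.

```python
import math


def _ylogy(y, mu):
    """y * log(y / mu), with the usual convention that the term is 0 when y == 0."""
    return 0.0 if y == 0 else y * math.log(y / mu)


def poisson_deviance(y, mu, w=None):
    """2 * sum w_i [ y_i log(y_i / mu_i) - (y_i - mu_i) ]."""
    w = w or [1.0] * len(y)
    return 2.0 * sum(wi * (_ylogy(yi, mi) - (yi - mi))
                     for yi, mi, wi in zip(y, mu, w))


def binomial_deviance(r, m, mu, w=None):
    """r: event counts, m: trials, mu: fitted proportions; y_i = r_i / m_i."""
    w = w or [1.0] * len(r)
    total = 0.0
    for ri, mi, pi, wi in zip(r, m, mu, w):
        yi = ri / mi
        total += wi * mi * (_ylogy(yi, pi) + _ylogy(1.0 - yi, 1.0 - pi))
    return 2.0 * total
```

Both functions return zero when the fitted means reproduce the data exactly, which matches the definition of the deviance as twice the gap between the saturated and fitted log likelihoods.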
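Pearson's chi-square statistic is defined as $X^2 = \sum_i \frac{w_i (y_i - \mu_i)^2}{V(\mu_i)}$, and the scaled Pearson's chi-square is $X^2/\phi$.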
The scaled version of both of these statistics, under certain regularity conditions, has a limiting chi-square distribution, with degrees of freedom equal to the number of observations minus the number of parameters estimated. The scaled version can be used as an approximate guide to the goodness of fit of a given model. Use caution before applying these statistics to ensure that all the conditions for the asymptotic distributions hold. McCullagh and Nelder (1989) advise that differences in deviances for nested models can be better approximated by chi-square distributions than the deviances themselves.
In cases where the dispersion parameter is not known, an estimate can be used to obtain an approximation to the scaled deviance and Pearson's chi-square statistic. One strategy is to fit a model that contains a sufficient number of parameters so that all systematic variation is removed, estimate $\phi$ from this model, and then use this estimate in computing the scaled deviance of sub-models. The deviance or Pearson's chi-square divided by its degrees of freedom is sometimes used as an estimate of the dispersion parameter $\phi$. For example, since the limiting chi-square distribution of the scaled deviance $D^* = D/\phi$ has n-p degrees of freedom, where n is the number of observations and p the number of parameters, equating $D^*$ to its mean and solving for $\phi$ yields $\hat{\phi} = D/(n-p)$. Similarly, an estimate of $\phi$ based on Pearson's chi-square $X^2$ is $\hat{\phi} = X^2/(n-p)$. Alternatively, a maximum likelihood estimate of $\phi$ can be computed by the procedure, if desired. See the discussion in the "Type 1 Analysis" section for more on the estimation of the dispersion parameter.
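The degrees-of-freedom estimate of the dispersion can be sketched as follows: compute Pearson's chi-square from the observed values, fitted means, and variance function, then divide by n - p. The function names, the counts, and the fitted means are all made up for the illustration, and a Poisson variance function V(mu) = mu is assumed.

```python
def pearson_chi_square(y, mu, variance, w=None):
    """X^2 = sum w_i (y_i - mu_i)^2 / V(mu_i) for a variance function V."""
    w = w or [1.0] * len(y)
    return sum(wi * (yi - mi) ** 2 / variance(mi)
               for yi, mi, wi in zip(y, mu, w))


def dispersion_estimate(statistic, n_obs, n_params):
    """phi_hat = statistic / (n - p); statistic is the deviance D or Pearson X^2."""
    return statistic / (n_obs - n_params)


y = [2, 3, 6, 7, 8, 9, 10, 12, 15]                       # made-up counts
mu = [3.0, 4.0, 5.0, 6.0, 8.0, 9.0, 11.0, 12.0, 14.0]    # made-up fitted means
x2 = pearson_chi_square(y, mu, variance=lambda m: m)     # Poisson: V(mu) = mu
phi_hat = dispersion_estimate(x2, n_obs=len(y), n_params=2)
```

The same `dispersion_estimate` call applies to a deviance: pass D in place of the Pearson statistic.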
If neither SCALE=DEVIANCE nor SCALE=PEARSON is specified, the values of the SCALE= and NOSCALE options in the MODEL statement determine how the scale parameter is handled, as displayed in the following table.
NOSCALE | SCALE=value | Action |
present | present | scale fixed at value |
present | not present | scale fixed at 1 |
not present | not present | scale estimated by ML |
not present | present | scale estimated by ML, starting point at value |
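The four rows of the table amount to a small decision rule, sketched below. The function name and the returned dictionary layout are hypothetical and only mirror the table for a numeric SCALE= value; they are not an interface of PROC GENMOD.

```python
def resolve_scale_action(noscale, scale_value=None):
    """Mirror the NOSCALE / SCALE=value table for a numeric SCALE= value."""
    if noscale:
        # NOSCALE holds the scale fixed: at SCALE=value if given, otherwise at 1.
        fixed = 1.0 if scale_value is None else scale_value
        return {"estimate_by_ml": False, "scale": fixed}
    # Without NOSCALE the scale is estimated by ML; SCALE=value is only a start.
    return {"estimate_by_ml": True, "start": scale_value}
```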
The meaning of the scale parameter displayed in the "Analysis Of Parameter Estimates" table is different for the gamma distribution than for the other distributions. The relation of the scale parameter as used by PROC GENMOD to the exponential family dispersion parameter $\phi$ is displayed in the following table. For the binomial and Poisson distributions, $\phi$ is the overdispersion parameter, as defined in the "Overdispersion" section, which follows.
Distribution | Scale |
normal | $\sqrt{\phi}$ |
inverse Gaussian | $\sqrt{\phi}$ |
gamma | $1/\phi$ |
binomial | $\sqrt{\phi}$ |
Poisson | $\sqrt{\phi}$ |
In the case of the negative binomial distribution, PROC GENMOD reports the "dispersion" parameter estimated by maximum likelihood. This is the negative binomial parameter k defined in the "Response Probability Distributions" section.
The SCALE= option in the MODEL statement enables you to specify a value of $\sigma = \sqrt{\phi}$ for the binomial and Poisson distributions. If you specify the SCALE=DEVIANCE option in the MODEL statement, the procedure uses the deviance divided by degrees of freedom as an estimate of $\phi$, and all statistics are adjusted appropriately. You can use Pearson's chi-square instead of the deviance by specifying the SCALE=PEARSON option.
The function obtained by dividing a log-likelihood function for the binomial or Poisson distribution by a dispersion parameter is not a legitimate log-likelihood function. It is an example of a quasi-likelihood function. Most of the asymptotic theory for log likelihoods also applies to quasi-likelihoods, which justifies computing standard errors and likelihood ratio statistics using quasi-likelihoods instead of proper log likelihoods. Refer to McCullagh and Nelder (1989, Chapter 9) and McCullagh (1983) for details on quasi-likelihood functions.
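A minimal sketch of what adjusting for overdispersion typically involves, assuming the usual quasi-likelihood conventions: standard errors are multiplied by the square root of the estimated dispersion (equivalently, the parameter covariance matrix is multiplied by the dispersion), and the log likelihood is divided by it. The function name and the numbers are invented, and the exact set of statistics that PROC GENMOD adjusts is not spelled out in this section.

```python
import math


def adjust_for_overdispersion(std_errors, log_likelihood, phi_hat):
    """Inflate standard errors by sqrt(phi_hat); divide the log likelihood by phi_hat."""
    scale = math.sqrt(phi_hat)  # scale parameter for binomial/Poisson models
    adjusted_se = [se * scale for se in std_errors]
    quasi_ll = log_likelihood / phi_hat  # a quasi-likelihood, not a true log likelihood
    return scale, adjusted_se, quasi_ll


# Made-up standard errors and log likelihood, with an estimated dispersion of 4.
scale, adjusted_se, quasi_ll = adjust_for_overdispersion([0.10, 0.25], -120.0, phi_hat=4.0)
```

A dispersion estimate of 4 doubles every standard error, which shows how strongly unmodeled overdispersion can understate uncertainty in an unadjusted fit.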
Although the estimate of the dispersion parameter is often used to indicate overdispersion or underdispersion, this estimate may also indicate other problems such as an incorrectly specified model or outliers in the data. You should carefully assess whether this type of model is appropriate for your data.
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.