Example 29.7: Log-Linear Model for Count Data

The GENMOD Procedure

These data, from Thall and Vail (1990), concern the treatment of people suffering from epileptic seizure episodes; they are also analyzed in Diggle, Liang, and Zeger (1994). The data consist of the number of epileptic seizures in an eight-week baseline period, before any treatment, and in each of four two-week treatment periods, in which patients received either a placebo or the drug Progabide in addition to other therapy. A portion of the data is displayed in Table 29.5. See "Gee Model for Count Data, Exchangeable Correlation" in the SAS/STAT Sample Program Library for the complete data set.

Table 29.5: Epileptic Seizure Data

Patient ID	Treatment	Baseline	Visit1	Visit2	Visit3	Visit4
104	Placebo	11	5	3	3	3
106	Placebo	11	3	5	3	3
107	Placebo	6	2	4	0	5
.
.
.
101	Progabide	76	11	14	9	8
102	Progabide	38	8	7	9	4
103	Progabide	19	0	4	3	0
.
.
.

Model the data as a log-linear model with $V(\mu) = \mu$ (the Poisson variance function) and

$\log(E(Y_{ij})) = \beta_{0} + x_{i1}\beta_{1} + x_{i2}\beta_{2} + x_{i1}x_{i2}\beta_{3} + \log(t_{ij})$

where

$Y_{ij}$ = number of epileptic seizures in interval $j$
$t_{ij}$ = length of interval $j$
$x_{i1} = \begin{cases} 1 & \text{weeks 8-16 (treatment)} \\ 0 & \text{weeks 0-8 (baseline)} \end{cases}$
$x_{i2} = \begin{cases} 1 & \text{Progabide group} \\ 0 & \text{placebo group} \end{cases}$

The correlations between the counts are modeled as $r_{ij}=\alpha$, $i \neq j$ (exchangeable correlations). For comparison, the correlations are also modeled as independent (identity correlation matrix). In this model, the regression parameters are interpreted in terms of the log seizure rate, as displayed in Table 29.6.

Table 29.6: Interpretation of Regression Parameters

Treatment	Visit	$\log(E(Y_{ij})/t_{ij})$
Placebo	Baseline	$\beta_{0}$
	1-4	$\beta_{0}+\beta_{1}$
Progabide	Baseline	$\beta_{0}+\beta_{2}$
	1-4	$\beta_{0}+\beta_{1}+\beta_{2}+\beta_{3}$

The difference between the log seizure rates in the pretreatment (baseline) period and the treatment periods is $\beta_{1}$ for the placebo group and $\beta_{1}+\beta_{3}$ for the Progabide group. A value of $\beta_{3} < 0$ therefore indicates that the seizure rate fell more (or rose less) from baseline in the Progabide group than in the placebo group.
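The role of $\beta_{3}$ can be made explicit by exponentiating the entries of Table 29.6. The short derivation below is a sketch; the rate symbols $\lambda$ and their subscripts are introduced here only for illustration and do not appear in the original example.

```latex
\begin{align*}
\lambda_{\text{placebo, base}} &= e^{\beta_0}, &
\lambda_{\text{placebo, visit}} &= e^{\beta_0+\beta_1}, \\
\lambda_{\text{prog, base}} &= e^{\beta_0+\beta_2}, &
\lambda_{\text{prog, visit}} &= e^{\beta_0+\beta_1+\beta_2+\beta_3}, \\[4pt]
\frac{\lambda_{\text{prog, visit}}/\lambda_{\text{prog, base}}}
     {\lambda_{\text{placebo, visit}}/\lambda_{\text{placebo, base}}}
  &= \frac{e^{\beta_1+\beta_3}}{e^{\beta_1}} = e^{\beta_3}.
\end{align*}
```

Under this reading, $e^{\beta_3}$ is the ratio of the two visit-to-baseline rate ratios, so $\beta_{3} < 0$ corresponds to a ratio below 1.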

The following statements input the data, which are arranged as one visit per observation:

   data thall;
      /* one record per patient visit; trt: 0=placebo, 1=Progabide; bline=baseline count */
      input id y visit trt bline age;
   datalines;
   104 5 1  0 11 31
   104 3 2  0 11 31
   104 3 3  0 11 31
   104 3 4  0 11 31
   106 3 1  0 11 30
   106 5 2  0 11 30
   106 3 3  0 11 30
   106 3 4  0 11 30
   107 2 1  0 6 25
   107 4 2  0 6 25
   107 0 3  0 6 25
   107 5 4  0 6 25
   114 4 1  0 8 36
   114 4 2  0 8 36
   ...
   run;

Some further data manipulations create an observation for the baseline measures, a log time interval variable for use as an offset, and an indicator variable for whether the observation is for a baseline measurement or a visit measurement. Patient 207 is deleted as an outlier, as in the Diggle, Liang, and Zeger (1994) analysis.

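   data new;
      set thall;
      /* keep every visit record as is */
      output;
      /* for the first visit, also write a baseline record (visit=0) */
      /* whose response is the baseline seizure count                */
      if visit=1 then do;
         y=bline;
         visit=0;
         output;
      end;
   run;

   data new2;
      set new;
      /* subsetting IF: drop patient 207 (outlier) */
      if id ne 207;
      /* x1 = 0 for baseline, 1 for treatment visits;       */
      /* ltime = log of the interval length in weeks        */
      if visit=0 then do;
         x1=0;
         ltime=log(8);
      end;
      else do;
         x1=1;
         ltime=log(2);
      end;
   run;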
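Before fitting the model, it can help to confirm that the reshaped data have the intended layout. The following statements are a minimal sketch of such a check and are not part of the original example; the output data set name new2_sorted is hypothetical.

```sas
/* Illustrative check, assuming the data set new2 created above.      */
/* Every visit=0 record should have x1=0, and visits 1-4 should       */
/* have x1=1, with equal counts for each visit value.                 */
proc freq data=new2;
   tables visit*x1 / norow nocol nopercent;
run;

/* Sort a copy by patient and visit (new2_sorted is a hypothetical    */
/* name) and list the first few records for inspection.               */
proc sort data=new2 out=new2_sorted;
   by id visit;
run;

proc print data=new2_sorted(obs=10);
   var id visit y trt bline x1 ltime;
run;
```

With the complete data set, each remaining patient should contribute five records: one baseline record with ltime=log(8) and four visit records with ltime=log(2).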

The GEE solution is requested by using the REPEATED statement in the GENMOD procedure. The SUBJECT=ID option specifies that observations with the same value of the id variable belong to the same cluster. The TYPE= option specifies the working correlation structure; the value EXCH requests the exchangeable structure. The CORRW option displays the working correlation matrix, and the COVB option displays the covariance matrices of the parameter estimates.

   proc genmod data=new2;
      class id;
      model y=x1 | trt / d=poisson offset=ltime;
      repeated subject=id / corrw covb type=exch;
   run;

These statements first produce the usual output from fitting a generalized linear model (GLM) to these data. The estimates are used as initial values for the GEE solution.
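To look at the independence fit on its own, one option is to run the same mean model without the REPEATED statement. The following statements are a sketch under that assumption and are not part of the original example; the title of the resulting parameter estimates table may differ from the one shown for the initial estimates.

```sas
/* Illustrative only: ordinary Poisson GLM with the same mean model   */
/* and offset. Omitting the REPEATED statement treats all             */
/* observations as independent.                                       */
proc genmod data=new2;
   model y=x1 | trt / d=poisson offset=ltime;
run;
```

Because no clustering is taken into account here, the standard errors from this fit treat the five counts from each patient as independent.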

Information about the GEE model is displayed in Output 29.7.2, and the results of fitting the model are displayed in Output 29.7.3. Compare these with the independence model displayed in Output 29.7.1. The parameter estimates are nearly identical, but the standard errors for the independence case are underestimated. The coefficient of the interaction term, $\beta_{3}$, is highly significant under the independence model (p < 0.0001) and only marginally significant under the exchangeable correlations model (p = 0.0781).

Output 29.7.1: Independence Model

The GENMOD Procedure

Analysis Of Initial Parameter Estimates
Parameter	DF	Estimate	Standard Error	Wald 95% Confidence Limits		Chi-Square	Pr > ChiSq
Intercept	1	1.3476	0.0341	1.2809	1.4144	1565.44	<.0001
x1	1	0.1108	0.0469	0.0189	0.2027	5.58	0.0181
trt	1	-0.1080	0.0486	-0.2034	-0.0127	4.93	0.0264
x1*trt	1	-0.3016	0.0697	-0.4383	-0.1649	18.70	<.0001
Scale	0	1.0000	0.0000	1.0000	1.0000

NOTE: The scale parameter was held fixed.

Output 29.7.2: GEE Model Information

The GENMOD Procedure

GEE Model Information
Correlation Structure	Exchangeable
Subject Effect	id (58 levels)
Number of Clusters	58
Correlation Matrix Dimension	5
Maximum Cluster Size	5
Minimum Cluster Size	5

Output 29.7.3: GEE Parameter Estimates

The GENMOD Procedure

Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates
Parameter	Estimate	Standard Error	95% Confidence Limits		Z	Pr > |Z|
Intercept	1.3476	0.1574	1.0392	1.6560	8.56	<.0001
x1	0.1108	0.1161	-0.1168	0.3383	0.95	0.3399
trt	-0.1080	0.1937	-0.4876	0.2716	-0.56	0.5770
x1*trt	-0.3016	0.1712	-0.6371	0.0339	-1.76	0.0781
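As a worked check of the interaction estimate in Output 29.7.3, the Z statistic and confidence limits can be recomputed from the rounded estimate and standard error, and the limits can be moved to the rate-ratio scale by exponentiating. This is a sketch that uses the usual normal-approximation multiplier 1.96; small differences from the displayed values come from rounding.

```latex
\begin{align*}
Z &= \frac{\hat\beta_3}{\widehat{\mathrm{SE}}(\hat\beta_3)}
   = \frac{-0.3016}{0.1712} \approx -1.76, \\
\hat\beta_3 \pm 1.96\,\widehat{\mathrm{SE}}(\hat\beta_3)
  &= -0.3016 \pm 1.96 \times 0.1712 \approx (-0.637,\ 0.034), \\
e^{\hat\beta_3} &= e^{-0.3016} \approx 0.74, \qquad
\left(e^{-0.6371},\ e^{0.0339}\right) \approx (0.53,\ 1.03).
\end{align*}
```

On the rate-ratio scale the interval includes 1, which matches the two-sided p-value of 0.0781 being above 0.05.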

Table 29.7 displays the regression coefficients, standard errors, and normalized coefficients that result from fitting the model using independent and exchangeable working correlation matrices.

Table 29.7: Results of Model Fitting

Variable	Correlation Structure	Coef.	Std. Error	Coef./S.E.
Intercept	Exchangeable	1.35	0.16	8.56
	Independent	1.35	0.03	39.52
Visit (x₁)	Exchangeable	0.11	0.12	0.95
	Independent	0.11	0.05	2.36
Treat (x₂)	Exchangeable	-0.11	0.19	-0.56
	Independent	-0.11	0.05	-2.22
x₁*x₂	Exchangeable	-0.30	0.17	-1.76
	Independent	-0.30	0.07	-4.32

The fitted exchangeable working correlation matrix is requested with the CORRW option and is displayed in Output 29.7.4.

Output 29.7.4: Working Correlation Matrix

The GENMOD Procedure

Working Correlation Matrix
	Col1	Col2	Col3	Col4	Col5
Row1	1.0000	0.5941	0.5941	0.5941	0.5941
Row2	0.5941	1.0000	0.5941	0.5941	0.5941
Row3	0.5941	0.5941	1.0000	0.5941	0.5941
Row4	0.5941	0.5941	0.5941	1.0000	0.5941
Row5	0.5941	0.5941	0.5941	0.5941	1.0000

The COVB option produces both the model-based (naive) and the empirical (robust) covariance matrices of the parameter estimates. Output 29.7.5 contains these estimates.

Output 29.7.5: Covariance Matrices

The GENMOD Procedure

Covariance Matrix (Model-Based)
	Prm1	Prm2	Prm3	Prm4
Prm1	0.01223	0.001520	-0.01223	-0.001520
Prm2	0.001520	0.01519	-0.001520	-0.01519
Prm3	-0.01223	-0.001520	0.02495	0.005427
Prm4	-0.001520	-0.01519	0.005427	0.03748

Covariance Matrix (Empirical)
	Prm1	Prm2	Prm3	Prm4
Prm1	0.02476	-0.001152	-0.02476	0.001152
Prm2	-0.001152	0.01348	0.001152	-0.01348
Prm3	-0.02476	0.001152	0.03751	-0.002999
Prm4	0.001152	-0.01348	-0.002999	0.02931

The two covariance estimates are similar, indicating an adequate correlation model.
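One way to compare the two matrices is through the standard errors they imply, which are the square roots of the diagonal elements. The following arithmetic is a sketch based on the rounded entries of Output 29.7.5, listed in the order Prm1 through Prm4.

```latex
\begin{align*}
\text{Model-based: } & \sqrt{0.01223} \approx 0.111, \quad
  \sqrt{0.01519} \approx 0.123, \quad
  \sqrt{0.02495} \approx 0.158, \quad
  \sqrt{0.03748} \approx 0.194 \\
\text{Empirical: }   & \sqrt{0.02476} \approx 0.157, \quad
  \sqrt{0.01348} \approx 0.116, \quad
  \sqrt{0.03751} \approx 0.194, \quad
  \sqrt{0.02931} \approx 0.171
\end{align*}
```

The empirical values agree with the Standard Error column of Output 29.7.3 (0.1574, 0.1161, 0.1937, 0.1712), which also confirms that Prm1 through Prm4 correspond to the intercept, x1, trt, and the interaction, in that order.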
