The LOGISTIC Procedure
Antibodies produced in response to an infectious disease like malaria
remain in the body after the individual has recovered from the
disease. A serological test detects the presence
or absence of such antibodies. An individual with such antibodies
is termed seropositive. In areas where the disease is endemic,
the inhabitants are at fairly constant risk of infection.
The probability of an individual never having been infected
in $Y$ years is $\exp(-\mu Y)$, where $\mu$
is the mean number of infections per year (refer to the appendix
of Draper et al. 1972).
Rather than estimating the unknown $\mu$, it is of interest to
epidemiologists
to estimate the probability of a person living in the area
being infected in one year. This infection rate $\gamma$ is given
by

$$\gamma = 1 - e^{-\mu}$$

The following SAS statements create the data set sero, which contains the results of a serological survey of malarial infection. Individuals of nine age groups were tested. Variable A represents the midpoint of the age range for each age group. Variable N represents the number of individuals tested in each age group, and variable R represents the number of individuals that are seropositive.
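data sero;
   /* group = age group index, A = midpoint of the age range, */
   /* N = number tested, R = number seropositive              */
   input group A N R;
   X=log(A);   /* log of the age midpoint, used later as the offset */
   label X='Log of Midpoint of Age Range';
   datalines;
1 1.5 123 8
2 4.0 132 6
3 7.5 182 18
4 12.5 140 14
5 17.5 138 20
6 25.0 161 39
7 35.0 133 19
8 47.0 92 25
9 60.0 74 44
;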
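For the ith group with age midpoint $A_i$, the probability of
being seropositive is
$p_i = 1 - \exp(-\mu A_i)$, where $\mu$ is the mean number of
infections per year. Because $-\log(1 - p_i) = \mu A_i$, it follows that

$$\log(-\log(1 - p_i)) = \log(\mu) + \log(A_i)$$

The left side is the complementary log-log transform of $p_i$. In the
following statements, the LINK=CLOGLOG option together with X=log(A)
as an offset makes the intercept an estimate of $\beta_0 = \log(\mu)$,
and the CLPARM=PL option requests a profile likelihood confidence
interval for it.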
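proc logistic data=sero;
   /* intercept-only model: the intercept is the log of the mean */
   /* number of infections per year                              */
   model R/N= / offset=X       /* log(A) enters with coefficient 1     */
                link=cloglog   /* complementary log-log link           */
                clparm=pl      /* profile likelihood confidence limits */
                scale=none;    /* display goodness-of-fit statistics   */
   title 'Constant Risk of Infection';
run;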
Output 39.10.1: Modeling Constant Risk of Infection
Results of fitting this constant risk model are shown in Output 39.10.1.
The output reports the maximum likelihood estimate $\hat{\beta}_0$ of
$\beta_0 = \log(\mu)$, the log of the mean number of infections per
year, together with its estimated standard error.
The infection rate is estimated as

$$\hat{\gamma} = 1 - e^{-\hat{\mu}} = 1 - e^{-e^{\hat{\beta}_0}}$$

The 95% confidence interval for the infection rate $\gamma$, obtained by
back-transforming the 95% confidence interval for $\beta_0$,
is (0.0082, 0.011), or 8 to 11 infections per thousand individuals;
that is, in repeated sampling, intervals constructed this way
contain the true infection rate about 95% of the time.
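Back-transforming the endpoints is valid because the map from the intercept to the infection rate is strictly increasing. A short sketch of that step follows, writing $\beta_0$ for the log of the mean number of infections per year and $\gamma$ for the one-year infection rate; the names $g$, $\beta_L$, and $\beta_U$ are introduced here only for illustration.

```latex
% g maps the intercept to the one-year infection rate.
g(\beta) = 1 - \exp\{-e^{\beta}\},
\qquad
g'(\beta) = e^{\beta}\exp\{-e^{\beta}\} > 0 \quad \text{for all } \beta .

% A strictly increasing g preserves the ordering of the endpoints,
% so the two events below are the same event.
\beta_L \le \beta_0 \le \beta_U
\iff
g(\beta_L) \le \gamma \le g(\beta_U).
```

The interval $(g(\beta_L),\, g(\beta_U))$ therefore has the same coverage as the interval for $\beta_0$ from which it is computed.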
The goodness-of-fit statistics for the constant risk
model are statistically significant (p < 0.0001),
indicating that the assumption of constant risk of
infection is not correct. You can fit a more extensive
model by allowing a separate risk of infection for each
age group. Suppose $\mu_i$ is the mean number of
infections per year for the ith age group. The
probability of being seropositive for the ith group with age
midpoint $A_i$ is $p_i = 1 - \exp(-\mu_i A_i)$, so that

$$\log(-\log(1 - p_i)) = \log(\mu_i) + \log(A_i)$$

In the following SAS statements,
nine dummy variables (agegrp1-agegrp9) are created as the design
variables for the age groups. PROC LOGISTIC is invoked to fit
a complementary log-log model that contains
agegrp1-agegrp9 as the
only explanatory variables, with no intercept term and with
X=log(A) as an offset term. Note that $\log(\mu_i)$ is
the regression parameter associated with agegrpi.
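data two;
   /* the initial values make agegrp1-agegrp9 start at 0; array */
   /* elements with initial values are retained across rows     */
   array agegrp(9) agegrp1-agegrp9 (0 0 0 0 0 0 0 0 0);
   set sero;
   agegrp[group]=1;   /* flag the age group of the current row */
   output;
   agegrp[group]=0;   /* reset before the next row is read     */
run;
proc logistic data=two;
   model R/N=agegrp1-agegrp9 / offset=X
                               noint          /* one parameter per age group */
                               link=cloglog
                               clparm=pl;
   title 'Infectious Rates and 95% Confidence Intervals';
run;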
Output 39.10.2: Modeling Separate Risk of Infection

Table 39.3: Infection Rate in One Year (Number Infected per 1000 People)

| Age Group | Point Estimate | Lower 95% Confidence Limit | Upper 95% Confidence Limit |
| --- | --- | --- | --- |
| 1 | 44 | 20 | 80 |
| 2 | 12 | 5 | 23 |
| 3 | 14 | 8 | 21 |
| 4 | 8 | 5 | 14 |
| 5 | 9 | 6 | 13 |
| 6 | 11 | 8 | 15 |
| 7 | 4 | 3 | 7 |
| 8 | 7 | 4 | 10 |
| 9 | 15 | 11 | 20 |
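The point estimates in the table above can be cross-checked without PROC LOGISTIC. The separate-risk model has one parameter per age group, so it is saturated: as long as every group has 0 < R < N, each fitted probability equals the observed proportion R/N, and the estimate for a group is the complementary log-log of R/N minus the offset. The following statements are a minimal sketch of that check, assuming the data set sero created earlier; the names closed, p, logmu, and per1000 are illustrative and do not appear in the original example.

```sas
/* Closed-form check of the separate-risk point estimates.        */
/* The names closed, p, logmu, and per1000 are illustrative only. */
data closed;
   set sero;
   p       = R / N;                           /* observed seropositive proportion */
   logmu   = log(-log(1 - p)) - X;            /* cloglog(p) minus offset log(A)   */
   per1000 = 1000 * (1 - exp(-exp(logmu)));   /* infections per 1000 per year     */
run;

proc print data=closed noobs;
   var group A N R logmu per1000;
   format logmu 8.4 per1000 6.1;
run;
```

For the first age group, logmu evaluates to -3.1048, which agrees with the point estimate reported for that group. This shortcut gives point estimates only; the profile likelihood confidence limits still require the fitted model.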
Results of fitting the model for separate risk of infection are
shown in Output 39.10.2. For the first age group, the
point estimate of $\log(\mu_1)$, the log of the mean number of
infections per year in that group, is -3.1048.
This translates into an
infection rate of 1-exp(-exp(-3.1048)) = 0.0438.
A 95% confidence interval for the infection rate
is obtained
by transforming the
95% confidence interval for $\log(\mu_1)$.
For the first age group, the lower and upper confidence limits are
1-exp(-exp(-3.8880)) = 0.0203 and 1-exp(-exp(-2.4833)) = 0.0801,
respectively. Table 39.3 shows the estimated infection
rate in one year's time for each age group.
Note that
the infection rate for the first age group
(44 per thousand) is high compared to the other age groups
(4 to 15 per thousand).
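The back-transformation used for the first age group can be reproduced in a short DATA step. This is a sketch only: the three input values are the estimate and confidence limits quoted in the text, and the names backtransform, limit, beta, rate, and per1000 are illustrative rather than part of the original example.

```sas
/* Back-transform complementary log-log scale values to infection rates. */
/* Input values are the ones quoted in the text for the first age group. */
data backtransform;
   input limit $ beta;
   rate    = 1 - exp(-exp(beta));   /* one-year infection rate       */
   per1000 = round(1000 * rate);    /* infections per 1000 per year  */
   datalines;
estimate -3.1048
lower    -3.8880
upper    -2.4833
;

proc print data=backtransform noobs;
   format rate 6.4;
run;
```

The same DATA step applies to any other age group once its estimate and confidence limits are read from the PROC LOGISTIC output.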
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.