Chapter Contents |
Previous |
Next |
The LOGISTIC Procedure |
Often survival times are not observed more precisely than the interval (for instance, a day) within which the event occurred. Survival data of this form are known as grouped or interval-censored data. A discrete analogue of the continuous proportional hazards model (Prentice and Gloeckler 1978; Allison 1982) is used to investigate the relationship between these survival times and a set of explanatory variables.
Suppose Ti is the discrete survival time variable of the ith subject with covariates xi. The discrete-time hazard rate is defined as
Using elementary properties of conditional probabilities, it can be shown that
Suppose ti is the observed survival time of the ith subject. Suppose if Ti=ti is an event time and 0 otherwise. The likelihood for the grouped survival data is given by
Note that the likelihood L for the grouped survival data is the same as the likelihood of a binary response model with event probabilities .If the data are generated by a continuous-time proportional hazards model, Prentice and Gloeckler (1978) have shown that
Consider a study of the effect of insecticide on flour-beetles. Four different concentrations of an insecticide were sprayed on separate groups of flour-beetles. The numbers of male and female flour-beetles dying in successive intervals were saved in the data set beetles.
data beetles(keep=time sex conc freq); input time m20 f20 m32 f32 m50 f50 m80 f80; conc=.20; freq= m20; sex=1; output; freq= f20; sex=2; output; conc=.32; freq= m32; sex=1; output; freq= f32; sex=2; output; conc=.50; freq= m50; sex=1; output; freq= f50; sex=2; output; conc=.80; freq= m80; sex=1; output; freq= f80; sex=2; output; datalines; 1 3 0 7 1 5 0 4 2 2 11 2 10 5 8 4 10 7 3 10 4 11 11 11 6 8 15 4 7 8 16 10 15 6 14 9 5 4 9 3 5 4 3 8 3 6 3 3 2 1 2 1 2 4 7 2 0 1 0 1 1 1 1 8 1 0 0 1 1 4 0 1 9 0 0 1 1 0 0 0 0 10 0 0 0 0 0 0 1 1 11 0 0 0 0 1 1 0 0 12 1 0 0 0 0 1 0 0 13 1 0 0 0 0 1 0 0 14 101 126 19 47 7 17 2 4 ;
The data set beetles contains four variables: time, sex, conc, and freq. time represents the interval death time; for example, time=2 is the interval between day 1 and day 2. Insects surviving the duration (13 days) of the experiment are given a time value of 14. The variable sex represents the sex of the insects (1=male, 2=female), conc represents the concentration of the insecticide (mg/cm2), and freq represents the frequency of the observations.
To use PROC LOGISTIC with the grouped survival data, you must expand the data so that each beetle has a separate record for each day of survival. A beetle that died in the third day (time=3) would contribute three observations to the analysis, one for each day it was alive at the beginning of the day. A beetle that survives the 13-day duration of the experiment (time=14) would contribute 13 observations.
A new data set days that contains the beetle-day observations is created from the data set beetles. In addition to the variables sex, conc and freq, the data set contains an outcome variable y and 13 indicator variables day1, day2, ..., day13. y has a value of 1 if the observation corresponds to the day that the beetle died and has a value of 0 otherwise. An observation for the first day will have a value of 1 for day1 and a value of 0 for day2 -day13; an observation for the second day will have a value of 1 for day2 and a value of 0 for day1 and day2 -day13. For instance, Output 39.11.1 shows an observation in the beetles data set with time=3, and Output 39.11.2 shows the corresponding beetle-day observations in the data set days.
data days; retain day1-day13 0; array dd[13] day1-day13; set beetles; if time = 14 then do day=1 to 13; y=0; dd[day]=1; output; dd[day]=0; end; else do day=1 to time; if day=time then y=1; else y=0; dd[day]=1; output; dd[day]=0; end;Output 39.11.1: An Observation with Time=3 in Data Set Beetles
|
proc logistic data=days descending outest=est1; model y= day1-day13 sex conc / noint link=cloglog technique=newton; freq freq; run;Output 39.11.3: Parameter Estimates for the Grouped Proportional Hazards Model
|
The coefficients of day1 -day13 are the maximum likelihood estimates of , respectively. The baseline survivor function S0(t) is estimated by
The following DATA step computes the survivor curves for male and female flour-beetles exposed to the insecticide of concentrations .20 mg/cm2 and .80 mg/cm2. The GPLOT procedure in SAS/GRAPH software is used to plot the survival curves. Instead of plotting them as step functions, the SPLINE option is used to smooth the curves. These smoothed survival curves are displayed in Output 39.11.4.
data one (keep=day survival element s_m20 s_f20 s_m80 s_f80); array dd day1-day13; array sc[4] m20 f20 m80 f80; array s_sc[4] s_m20 s_f20 s_m80 s_f80 (1 1 1 1); set est1; m20= exp(sex + .20 * conc); f20= exp(2 * sex + .20 * conc); m80= exp(sex + .80 * conc); f80= exp(2 * sex + .80 * conc); survival=1; day=0; output; do over dd; element= exp(-exp(dd)); survival= survival * element; do i=1 to 4; s_sc[i] = survival ** sc[i]; end; day + 1; output; end; label s_m20= 'Male at .20 conc.' s_m80= 'Male at .80 conc.' s_f20= 'Female at .20 conc.' s_f80= 'Female at .80 conc.'; run; title1 'Flour-Beetles Sprayed with Insecticide'; legend1 label=none frame cframe=ligr cborder=black position=center value=(justify=center); axis1 label=(angle=90 'Survival Function'); proc gplot data=one; plot (s_m20 s_f20 s_m80 s_f80) * day / overlay legend=legend1 vaxis=axis1 cframe=ligr; symbol1 v=dot i=spline c=black height=.8; symbol2 v=dot i=spline c=red height=.8; symbol3 v=dot i=spline c=blue height=.8; symbol4 v=dot i=spline c=yellow height=.8; run;
The probability of survival is displayed on the vertical axis. Notice that most of the insecticide effect occurs by day 6 for both the high and low concentrations.
Output 39.11.4: Predicted Survival at Concentrations of 0.20 and 0.80 mg/cm2
Chapter Contents |
Previous |
Next |
Top |
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.