The LOGISTIC Procedure

Getting Started

The LOGISTIC procedure is similar in use to the other regression procedures in the SAS System. To demonstrate the similarity, suppose the response variable y is binary or ordinal, and x1 and x2 are two explanatory variables of interest. To fit a logistic regression model, you can use a MODEL statement similar to that used in the REG procedure:

   proc logistic;
      model y=x1 x2;
   run;

The response variable y can be either character or numeric. PROC LOGISTIC enumerates the total number of response categories and orders the response levels according to the ORDER= option in the PROC LOGISTIC statement. The procedure also allows the input of binary response data that are grouped:

   proc logistic;
      model r/n=x1 x2;
   run;

Here, n represents the number of trials and r represents the number of events.

The following example illustrates the use of PROC LOGISTIC. The data, taken from Cox and Snell (1989, pp. 10-11), consist of the number, r, of ingots not ready for rolling, out of n tested, for several combinations of heating time and soaking time. The following invocation of PROC LOGISTIC fits the binary logit model to the grouped data:

   data ingots;
      input Heat Soak r n @@;
      datalines;
   7 1.0 0 10  14 1.0 0 31  27 1.0 1 56  51 1.0 3 13
   7 1.7 0 17  14 1.7 0 43  27 1.7 4 44  51 1.7 0  1
   7 2.2 0  7  14 2.2 2 33  27 2.2 0 21  51 2.2 0  1
   7 2.8 0 12  14 2.8 0 31  27 2.8 1 22  51 4.0 0  1
   7 4.0 0  9  14 4.0 0 19  27 4.0 1 16
   ;

   proc logistic data=ingots;
      model r/n=Heat Soak;
   run;

The results of this analysis are shown in the following tables.

The SAS System

The LOGISTIC Procedure

Model Information
Data Set                      WORK.INGOTS
Response Variable (Events)    r
Response Variable (Trials)    n
Number of Observations        19
Link Function                 Logit
Optimization Technique        Fisher's scoring


PROC LOGISTIC first lists background information about the fitting of the model. Included are the name of the input data set, the response variable(s) used, the number of observations used, and the link function used.

The LOGISTIC Procedure

Response Profile
Ordered Value    Binary Outcome    Total Frequency
      1          Event                    12
      2          Nonevent                375

Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.


The "Response Profile" table lists the response categories (which are Event and Nonevent when grouped data are input), their ordered values, and their total frequencies for the given data.

The LOGISTIC Procedure

Model Fit Statistics
Criterion    Intercept Only    Intercept and Covariates
AIC                 108.988                     101.346
SC                  112.947                     113.221
-2 Log L            106.988                      95.346

Testing Global Null Hypothesis: BETA=0
Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio       11.6428     2        0.0030
Score                  15.1091     2        0.0005
Wald                   13.0315     2        0.0015


The "Model Fit Statistics" table contains the Akaike Information Criterion (AIC), the Schwarz Criterion (SC), and the negative of twice the log likelihood (-2 Log L) for the intercept-only model and the fitted model. AIC and SC can be used to compare different models fitted to the same data; models with smaller values are preferred. Results of the likelihood ratio test, the efficient score test, and the Wald test for the joint significance of the explanatory variables (Heat and Soak) are included in the "Testing Global Null Hypothesis: BETA=0" table.
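
The entries in these two tables are linked by simple arithmetic, which the following DATA step sketches as a hand check. It assumes the usual definitions AIC = -2 Log L + 2k and SC = -2 Log L + k log(N), where k is the number of estimated parameters and N is the total frequency (12 + 375 = 387). The variable names are illustrative only and are not produced by PROC LOGISTIC.

   data _null_;
      m2ll_0 = 106.988;           /* -2 Log L, intercept only           */
      m2ll_1 = 95.346;            /* -2 Log L, intercept and covariates */
      N      = 387;               /* total frequency: 12 + 375          */

      aic_0 = m2ll_0 + 2*1;       /* one parameter (intercept)          */
      aic_1 = m2ll_1 + 2*3;       /* three parameters                   */
      sc_0  = m2ll_0 + 1*log(N);  /* LOG is the natural logarithm       */
      sc_1  = m2ll_1 + 3*log(N);

      lr    = m2ll_0 - m2ll_1;    /* likelihood ratio chi-square, 2 DF  */
      p_lr  = 1 - probchi(lr, 2); /* upper-tail p-value                 */

      put aic_0= aic_1= sc_0= sc_1= lr= p_lr=;
   run;

Because the inputs are the rounded values displayed above, the results written to the log should agree with the tables only up to rounding.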

The LOGISTIC Procedure

Analysis of Maximum Likelihood Estimates
Parameter    DF    Estimate    Standard Error    Chi-Square    Pr > ChiSq
Intercept     1     -5.5592            1.1197       24.6503        <.0001
Heat          1      0.0820            0.0237       11.9454        0.0005
Soak          1      0.0568            0.3312        0.0294        0.8639

Odds Ratio Estimates
Effect    Point Estimate    95% Wald Confidence Limits
Heat               1.085         1.036        1.137
Soak               1.058         0.553        2.026


The "Analysis of Maximum Likelihood Estimates" table lists the parameter estimates, their standard errors, and the results of the Wald test for individual parameters. The odds ratio for each slope parameter, estimated by exponentiating the corresponding parameter estimate, is shown in the "Odds Ratio Estimates" table, along with 95% Wald confidence limits.
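
As a rough hand check, the following DATA step recomputes the Wald chi-squares, odds ratios, and 95% Wald confidence limits from the displayed estimates and standard errors. It assumes the Wald chi-square is the squared ratio of estimate to standard error and that the confidence limits are obtained by exponentiating the estimate plus or minus a normal quantile times the standard error. The variable names are illustrative.

   data _null_;
      input Effect $ Estimate StdErr;
      z         = probit(0.975);                /* about 1.96           */
      WaldChiSq = (Estimate/StdErr)**2;         /* 1 DF Wald statistic  */
      PrChiSq   = 1 - probchi(WaldChiSq, 1);    /* upper-tail p-value   */
      OddsRatio = exp(Estimate);
      Lower     = exp(Estimate - z*StdErr);     /* lower 95% Wald limit */
      Upper     = exp(Estimate + z*StdErr);     /* upper 95% Wald limit */
      put Effect= WaldChiSq= PrChiSq= OddsRatio= Lower= Upper=;
      datalines;
   Heat 0.0820 0.0237
   Soak 0.0568 0.3312
   ;

The inputs are rounded to four decimals, so small discrepancies from the table are expected, most visibly in the Wald chi-square for Heat.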

Using the parameter estimates, you can calculate the estimated logit of p as

logit(\hat{p}) = -5.5592 + 0.0820 × Heat + 0.0568 × Soak

If Heat=7 and Soak=1, then logit(\hat{p}) = -4.9284. Using this logit estimate, you can calculate \hat{p} as follows:

\hat{p} = 1 / (1 + e^{4.9284}) = 0.0072

This gives the predicted probability of the event (ingot not ready for rolling) for Heat=7 and Soak=1. Note that PROC LOGISTIC can calculate these statistics for you; use the OUTPUT statement with the P= option.
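
A minimal sketch of that approach follows. The output data set name pred and the variable name phat are arbitrary choices for illustration; the data set is assumed to be the events/trials version of INGOTS created earlier.

   proc logistic data=ingots;
      model r/n=Heat Soak;
      output out=pred p=phat;    /* phat = predicted event probability */
   run;

   proc print data=pred;
      var Heat Soak r n phat;
   run;

For the observation with Heat=7 and Soak=1.0, the printed value of phat should be close to the hand-calculated 0.0072.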

The LOGISTIC Procedure

Association of Predicted Probabilities and Observed Responses
Percent Concordant    64.4    Somers' D    0.460
Percent Discordant    18.4    Gamma        0.555
Percent Tied          17.2    Tau-a        0.028
Pairs                 4500    c            0.730


Finally, the "Association of Predicted Probabilities and Observed Responses" table contains four measures of association for assessing the predictive ability of a model. They are based on the number of pairs of observations with different response values, the number of concordant pairs, and the number of discordant pairs, which are also displayed. Formulas for these statistics are given in the "Rank Correlation of Observed Responses and Predicted Probabilities" section.
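
The following DATA step is a sketch of how the four measures relate to the pair counts, assuming the commonly used definitions (the authoritative formulas are in the section cited above). Here t is the number of pairs with different responses, nc and nd are the concordant and discordant counts recovered from the displayed percentages, and N is the total frequency. All variable names are illustrative.

   data _null_;
      t  = 12*375;               /* pairs with different responses: 4500 */
      nc = 0.644*t;              /* concordant pairs (64.4%)             */
      nd = 0.184*t;              /* discordant pairs (18.4%)             */
      N  = 12 + 375;             /* total frequency                      */

      SomersD   = (nc - nd)/t;
      GammaStat = (nc - nd)/(nc + nd);
      TauA      = (nc - nd)/(0.5*N*(N - 1));
      c         = (nc + 0.5*(t - nc - nd))/t;

      put SomersD= GammaStat= TauA= c=;
   run;

Because nc and nd are reconstructed from percentages rounded to one decimal, the results should match the table only approximately.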

To illustrate the use of an alternative form of input data, the following program creates the INGOTS data set with new variables NotReady and Freq instead of r and n. The variable NotReady represents the response of individual units; it has a value of 1 for units not ready for rolling (event) and a value of 0 for units ready for rolling (nonevent). The variable Freq represents the frequency of occurrence of each combination of Heat, Soak, and NotReady. Note that, compared to the previous data set, NotReady=1 implies Freq=r, and NotReady=0 implies Freq=n-r.

   data ingots;
      input Heat Soak NotReady Freq @@;
      datalines;
   7 1.0 0 10  14 1.0 0 31  14 4.0 0 19  27 2.2 0 21  51 1.0 1  3
   7 1.7 0 17  14 1.7 0 43  27 1.0 1  1  27 2.8 1  1  51 1.0 0 10
   7 2.2 0  7  14 2.2 1  2  27 1.0 0 55  27 2.8 0 21  51 1.7 0  1
   7 2.8 0 12  14 2.2 0 31  27 1.7 1  4  27 4.0 1  1  51 2.2 0  1
   7 4.0 0  9  14 2.8 0 31  27 1.7 0 40  27 4.0 0 15  51 4.0 0  1
   ;
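
The same layout can also be derived from the events/trials data rather than typed by hand. The sketch below assumes the earlier data set had been saved under the hypothetical name ingots_grouped (the program above reuses the name INGOTS), and writes the result to a hypothetical data set ingots_long. Each input row yields one event row and one nonevent row, and rows with zero frequency are skipped.

   data ingots_long;
      set ingots_grouped;                             /* has Heat Soak r n */
      NotReady=1; Freq=r;   if Freq>0 then output;    /* events            */
      NotReady=0; Freq=n-r; if Freq>0 then output;    /* nonevents         */
      keep Heat Soak NotReady Freq;
   run;

Applied to the 19 grouped observations, this should produce the same 25 Heat, Soak, NotReady, and Freq combinations listed above, possibly in a different order.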

The following SAS statements invoke PROC LOGISTIC to fit the same model using the alternative form of the input data set.

   proc logistic data=ingots descending;
      model NotReady = Soak Heat;
      freq Freq;
   run;

The results of this analysis are the same as those of the previous one. The displayed output for the two runs is identical except for the background information of the model fit and the "Response Profile" table.

PROC LOGISTIC models the probability of the response level that corresponds to the Ordered Value 1 as displayed in the "Response Profile" table. By default, Ordered Values are assigned to the sorted response values in ascending order.

The DESCENDING option reverses the default ordering of the response values so that NotReady=1 corresponds to the Ordered Value 1 and NotReady=0 corresponds to the Ordered Value 2, as shown in the following table:

The LOGISTIC Procedure

Response Profile
Ordered Value    NotReady    Total Frequency
      1              1                    12
      2              0                   375


If the ORDER= option and the DESCENDING option are specified together, the response levels are ordered according to the ORDER= option and then reversed. You should always check the "Response Profile" table to ensure that the outcome of interest has been assigned Ordered Value 1. See the "Response Level Ordering" section for more detail.
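
As an optional cross-check of the totals that the "Response Profile" table should report, the weighted frequencies of the response can be tabulated directly. This is only a sketch: PROC FREQ confirms the counts for each value of NotReady (12 events and 375 nonevents in these data), but it does not show which level PROC LOGISTIC assigns to Ordered Value 1.

   proc freq data=ingots;
      tables NotReady;
      weight Freq;       /* count each row Freq times */
   run;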


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.