The CATMOD Procedure

Example 22.5: Log-Linear Model, Structural and Sampling Zeros

This example illustrates a log-linear model of independence, using data that contain structural zero frequencies as well as sampling (random) zero frequencies.

In a population of six squirrel monkeys, the joint distribution of genital display with respect to active or passive role was observed. The data are from Fienberg (1980, Table 8-2). Since a monkey cannot have both the active and passive roles in the same interaction, the diagonal cells of the table are structural zeros. See Agresti (1990) for more information on the quasi-independence model.

Since there is only one population, the structural zeros are automatically deleted by PROC CATMOD. The sampling zeros are replaced in the DATA step by a positive number close to zero (1E-20). The row for Monkey `t' is also deleted: it contains all zeros, so the cell frequencies predicted by a model of independence are also zero. In addition, the CONTRAST statements compare the behavior of the two monkeys labeled `u' and `v'. The following statements produce Output 22.5.1 through Output 22.5.9:

   title 'Behavior of Squirrel Monkeys';
   data Display;
      input Active $ Passive $ wt @@;
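      if Active ne 't';             /* drop the all-zero row for Monkey t        */
      if Active ne Passive then     /* diagonal zeros are structural: keep as 0  */
         if wt=0 then wt=1e-20;     /* sampling zeros get a tiny positive weight */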
      datalines;
   r r  0   r s  1   r t  5   r u  8   r v  9   r w  0
   s r 29   s s  0   s t 14   s u 46   s v  4   s w  0
   t r  0   t s  0   t t  0   t u  0   t v  0   t w  0
   u r  2   u s  3   u t  1   u u  0   u v 38   u w  2
   v r  0   v s  0   v t  0   v u  0   v v  0   v w  1
   w r  9   w s 25   w t  4   w u  6   w v 13   w w  0
   ;

   proc catmod data=Display;
      weight wt;
      model Active*Passive=_response_
            / freq pred=freq noparm noresponse oneway;
      loglin Active Passive;
      contrast 'Passive, U vs. V' Passive 0 0 0 1 -1;
      contrast 'Active,  U vs. V' Active  0 0 1 -1;
      title2 'Test Quasi-Independence for the Incomplete Table';
   quit;

Output 22.5.1: Log-Linear Model Analysis with Zero Frequencies
 
Behavior of Squirrel Monkeys
Test Quasi-Independence for the Incomplete Table

The CATMOD Procedure

Response            Active*Passive    Response Levels     25
Weight Variable     wt                Populations          1
Data Set            DISPLAY           Total Frequency    220
Frequency Missing   0                 Observations        25

The results of the ONEWAY option are shown in Output 22.5.2. Monkey `t' does not show up as a value for the Active variable since that row was removed.

Output 22.5.2: Output from the ONEWAY option
 
Behavior of Squirrel Monkeys
Test Quasi-Independence for the Incomplete Table

The CATMOD Procedure

One-Way Frequencies
Variable Value Frequency
Active r 23
  s 93
  u 46
  v 1
  w 57
Passive r 40
  s 29
  t 24
  u 60
  v 64
  w 3

Output 22.5.3: Profiles
 
Behavior of Squirrel Monkeys
Test Quasi-Independence for the Incomplete Table

The CATMOD Procedure

Sample Sample Size
1 220
 
Response Profiles
Response Active Passive
1 r s
2 r t
3 r u
4 r v
5 r w
6 s r
7 s t
8 s u
9 s v
10 s w
11 u r
12 u s
13 u t
14 u v
15 u w
16 v r
17 v s
18 v t
19 v u
20 v w
21 w r
22 w s
23 w t
24 w u
25 w v

Sampling zeros are displayed as 1E-20 in Output 22.5.4. Each Response Number corresponds to the response profile with the same number in Output 22.5.3.

Output 22.5.4: Frequency of Response by Response Number
 
Behavior of Squirrel Monkeys
Test Quasi-Independence for the Incomplete Table

The CATMOD Procedure

Response Frequencies
Sample Response Number
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
1 1 5 8 9 1E-20 29 14 46 4 1E-20 2 3 1 38 2 1E-20 1E-20 1E-20 1E-20 1 9 25 4 6 13

Output 22.5.5: Iteration History
 
Behavior of Squirrel Monkeys
Test Quasi-Independence for the Incomplete Table

The CATMOD Procedure

Maximum Likelihood Analysis
Iteration  Sub Iteration  -2 Log Likelihood  Convergence Criterion  Parameter Estimates 1 2 3 4 5 6 7 8 9
0 0 1416.3054 1.0000 0 0 0 0 0 0 0 0 0
1 0 1238.2417 0.1257 -0.4976 1.1112 0.1722 -0.8804 -0.006978 0.0827 -0.4735 0.7287 0.5791
2 0 1205.1264 0.0267 -0.3420 1.0962 0.5612 -1.7549 0.2233 0.3899 -0.4086 0.7875 0.5728
3 0 1199.5068 0.004663 -0.1570 1.2687 0.7058 -2.3992 0.3034 0.4360 -0.3162 0.8812 0.6703
4 0 1198.6271 0.000733 -0.0466 1.3791 0.8170 -2.8422 0.3309 0.4625 -0.2890 0.9085 0.6968
5 0 1198.5611 0.0000551 -0.002748 1.4230 0.8609 -3.0176 0.3334 0.4649 -0.2866 0.9110 0.6992
6 0 1198.5603 6.5351E-7 0.002760 1.4285 0.8664 -3.0396 0.3334 0.4649 -0.2865 0.9110 0.6992
7 0 1198.5603 1.217E-10 0.002837 1.4285 0.8665 -3.0399 0.3334 0.4649 -0.2865 0.9110 0.6992
 
Maximum likelihood computations converged.

Output 22.5.6: Analysis of Variance Table
 
Behavior of Squirrel Monkeys
Test Quasi-Independence for the Incomplete Table

The CATMOD Procedure

Maximum Likelihood Analysis of Variance
Source DF Chi-Square Pr > ChiSq
Active 4 56.58 <.0001
Passive 5 47.94 <.0001
Likelihood Ratio 15 135.17 <.0001

The analysis of variance table (Output 22.5.6) shows that the model of independence does not fit, since the likelihood ratio test for the interaction is significant (chi-square of 135.17 with 15 DF, p < .0001). In other words, the active and passive roles of the squirrel monkeys are not independent.
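The 15 degrees of freedom for the likelihood ratio test can be checked by counting. The following is only a sketch, in notation assumed here rather than taken from the CATMOD output: $m_{ij}$ is the expected frequency for active monkey $i$ displaying to passive monkey $j$, and $\lambda^{A}$ and $\lambda^{P}$ are the Active and Passive main-effect parameters.

```latex
\[
  \log m_{ij} = \mu + \lambda^{A}_{i} + \lambda^{P}_{j}, \qquad i \neq j
\]
\[
  \mathrm{DF}
  = \underbrace{(25 - 1)}_{\text{response functions}}
  - \underbrace{(4 + 5)}_{\text{Active and Passive parameters}}
  = 15
\]
```

The count of 25 is the number of response levels reported in Output 22.5.1, and the 4 and 5 are the DF values shown for Active and Passive in Output 22.5.6.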

Output 22.5.7: Contrasts between Monkeys `u' and `v'
 
Behavior of Squirrel Monkeys
Test Quasi-Independence for the Incomplete Table

The CATMOD Procedure

Contrasts of Maximum Likelihood Estimates
Contrast DF Chi-Square Pr > ChiSq
Passive, U vs. V 1 1.31 0.2524
Active, U vs. V 1 14.87 0.0001

The contrasts in Output 22.5.7 would be meaningful only if the model fit these data. In that case, they would suggest that monkeys `u' and `v' have similar passive behavior patterns (p = 0.2524) but very different active behavior patterns (p = 0.0001).
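Stated as hypotheses, the two CONTRAST statements can be sketched as follows. The symbols $\lambda^{A}$ and $\lambda^{P}$ are notation introduced here for the Active and Passive main-effect parameters. The parameter ordering is an assumption based on the sorted levels: r, s, u, v for Active and r, s, t, u, v for Passive, with the last level, w, not given a parameter of its own.

```latex
\[
  H_0^{\text{Passive}}:\; \lambda^{P}_{u} - \lambda^{P}_{v} = 0
  \qquad \text{coefficients } (0,\; 0,\; 0,\; 1,\; -1)
\]
\[
  H_0^{\text{Active}}:\; \lambda^{A}_{u} - \lambda^{A}_{v} = 0
  \qquad \text{coefficients } (0,\; 0,\; 1,\; -1)
\]
```

The Active contrast has one fewer coefficient because the row for Monkey `t' was removed from the data, leaving five Active levels and six Passive levels.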

Output 22.5.8: Response Function Predicted Values
 
Behavior of Squirrel Monkeys
Test Quasi-Independence for the Incomplete Table

The CATMOD Procedure

Maximum Likelihood Predicted Values for Response Functions
Sample  Function Number  Observed Function  Observed Standard Error  Predicted Function  Predicted Standard Error  Residual
1 1 -2.5649494 1.037749 -0.973554 0.339019 -1.5913953
  2 -0.9555114 0.526235 -1.7250404 0.345438 0.76952896
  3 -0.4855078 0.449359 -0.5275144 0.309254 0.0420066
  4 -0.3677248 0.433629 -0.7392682 0.249006 0.37154345
  5 -48.616651 1E10 -3.560517 0.634104 -45.056134
  6 0.80234647 0.333775 0.32058886 0.26629 0.48175761
  7 0.07410797 0.385164 -0.2993416 0.295634 0.37344956
  8 1.26369204 0.314105 0.89818441 0.250857 0.36550763
  9 -1.178655 0.571772 0.6864306 0.173396 -1.8650856
  10 -48.616651 1E10 -2.1348182 0.608071 -46.481833
  11 -1.8718022 0.759555 -0.2414953 0.287218 -1.6303069
  12 -1.4663371 0.640513 -0.1099394 0.303568 -1.3563977
  13 -2.5649494 1.037749 -0.8614257 0.314794 -1.7035236
  14 1.0726368 0.321308 0.12434644 0.204345 0.94829036
  15 -1.8718022 0.759555 -2.6969023 0.617433 0.82510014
  16 -48.616651 1E10 -4.1478747 1.024508 -44.468777
  17 -48.616651 1E10 -4.0163187 1.030062 -44.600332
  18 -48.616651 1E10 -4.7678051 1.032457 -43.848846
  19 -48.616651 1E10 -3.5702791 1.020794 -45.046372
  20 -2.5649494 1.037749 -6.6032817 1.161289 4.03833233
  21 -0.3677248 0.433629 -0.3658417 0.202959 -0.001883
  22 0.65392647 0.34194 -0.2342858 0.232794 0.88821229
  23 -1.178655 0.571772 -0.9857722 0.239408 -0.1928828
  24 -0.7731899 0.493548 0.21175381 0.185007 -0.9849437

Output 22.5.9: Predicted Frequencies
 
Behavior of Squirrel Monkeys
Test Quasi-Independence for the Incomplete Table

The CATMOD Procedure

Maximum Likelihood Predicted Values for Frequencies
Sample  Active  Passive  Function Number  Observed Frequency  Observed Standard Error  Predicted Frequency  Predicted Standard Error  Residual
1 r s F1 1 0.997725 5.25950838 1.36156 -4.2595084
  r t F2 5 2.210512 2.48072585 0.691066 2.51927415
  r u F3 8 2.776525 8.21594841 1.855146 -0.2159484
  r v F4 9 2.937996 6.64804868 1.50932 2.35195132
  r w F5 1E-20 1E-10 0.39576868 0.240268 -0.3957687
  s r F6 29 5.017696 19.1859928 3.147915 9.81400723
  s t F7 14 3.620648 10.321716 2.169599 3.67828404
  s u F8 46 6.031734 34.1846262 4.428706 11.8153738
  s v F9 4 1.981735 27.6609647 3.722788 -23.660965
  s w F10 1E-20 1E-10 1.64670026 0.952712 -1.6467003
  u r F11 2 1.407771 10.936396 2.12322 -8.936396
  u s F12 3 1.720201 12.4740717 2.554336 -9.4740717
  u t F13 1 0.997725 5.8835826 1.380655 -4.8835826
  u v F14 38 5.606814 15.7672979 2.684692 22.2327021
  u w F15 2 1.407771 0.93865177 0.551645 1.06134823
  v r F16 1E-20 1E-10 0.21996583 0.221779 -0.2199658
  v s F17 1E-20 1E-10 0.2508934 0.253706 -0.2508934
  v t F18 1E-20 1E-10 0.11833763 0.120314 -0.1183376
  v u F19 1E-20 1E-10 0.39192393 0.393255 -0.3919239
  v w F20 1 0.997725 0.01887928 0.021728 0.98112072
  w r F21 9 2.937996 9.6576454 1.808656 -0.6576454
  w s F22 25 4.707344 11.0155266 2.275019 13.9844734
  w t F23 4 1.981735 5.19563797 1.184452 -1.195638
  w u F24 6 2.415857 17.2075014 2.772098 -11.207501
  w v F25 13 3.497402 13.9236886 2.24158 -0.9236886

Output 22.5.8 displays the predicted response functions, and Output 22.5.9 displays the predicted cell frequencies (requested by the PRED=FREQ option). Because the model does not fit, these predicted values should be ignored.
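Although the predicted values are not useful here, the observed columns of Output 22.5.8 can be reproduced by hand as a rough check. The sketch below assumes that the response functions are the default generalized logits, taken relative to the last response profile (w, v; observed frequency 13). The standard error formula is also an assumption; it is used because it agrees with the printed values. The variable names in this DATA step are illustrative only.

```sas
   data _null_;
      n_ref = 13;                     /* frequency of response 25 (w, v)        */
      do n = 1, 29, 1e-20;            /* responses 1 (r, s), 6 (s, r), 5 (r, w) */
         f  = log(n / n_ref);         /* observed generalized logit             */
         se = sqrt(1/n + 1/n_ref);    /* assumed standard error formula         */
         put n= f= se=;               /* write the values to the SAS log        */
      end;
   run;
```

The three log lines should agree, to the printed precision, with functions 1, 6, and 5 in Output 22.5.8. The standard error of 1E10 reported for each sampling zero is essentially the square root of 1/1E-20.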

Structural and Sampling Zeros with Raw Data

The preceding PROC CATMOD step uses cell count data as input, so structural and sampling zeros are easily identified and manipulated in a single DATA step before the procedure is invoked. When the input data set contains raw data and structural or sampling zeros (or both) may exist, use the following steps:
  1. Run PROC FREQ on the raw data. In the TABLES statement, list all dependent and independent variables separated by asterisks and use the SPARSE option and the OUT= option. This creates an output data set that contains all possible zero frequencies.
  2. Use a DATA step to change the zero frequencies associated with sampling zeros to a small value, such as 1E-20.
  3. Use the resulting data set as input to PROC CATMOD, and specify the statement WEIGHT COUNT to use adjusted frequencies.

For example, suppose the data set RawDisplay contains the raw data for the squirrel monkey data. The following statements show how to obtain the same analysis as shown previously:

   proc freq data=RawDisplay;
      tables Active*Passive / sparse out=Combos noprint;
   run;

   data Combos2;
      set Combos;
      if Active ne 't';
      if Active ne Passive then 
         if count=0 then count=1e-20;
   run;

   proc catmod data=Combos2;
      weight count;
      model Active*Passive=_response_
            / freq pred=freq noparm noresponse;
      loglin Active Passive;
   quit;

The first IF statement in the DATA step is needed only for this particular example; since observations for Monkey `t' were deleted from the Display data set, they also need to be deleted from Combos2.
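To experiment with these statements without the original raw data, a stand-in for RawDisplay can be built from the cell counts. This is a hypothetical sketch and not part of the original example: it assumes the Display data set created earlier is available, and it writes one observation per observed interaction.

```sas
   /* Hypothetical stand-in for the raw data: expand each cell count */
   /* in Display into that many observations.                        */
   data RawDisplay;
      set Display;
      /* Cells with wt=0 or wt=1e-20 contribute no observations */
      if wt >= 1 then do i = 1 to wt;
         output;
      end;
      keep Active Passive;
   run;
```

The resulting data set should contain 220 observations, matching the Total Frequency in Output 22.5.1. The zero cells are absent from it and are restored as zero counts by the SPARSE option in the PROC FREQ step.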


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.