The FREQ Procedure

Example 28.2: Computing Chi-square Tests for One-Way Frequency Tables

This example examines whether the children's hair color (from Example 28.1) has a specified multinomial distribution in each of the two regions. The hypothesized distribution for hair color is 30% fair, 12% red, 30% medium, 25% dark, and 3% black.

To test the hypothesis for each region, the data are first sorted by Region. The FREQ procedure then uses a BY statement to produce a separate table for each BY group (Region). The ORDER=DATA option orders the frequency table values (hair color) by their order in the data set. The WEIGHT statement names Count as the variable that holds the frequency of each observation. The TABLES statement requests a frequency table for hair color, and the NOCUM option suppresses the display of the cumulative frequencies and percentages. The TESTP= option specifies the hypothesized percentages for the chi-square test. The number of percentages specified must equal the number of table levels, and the percentages must sum to 100. The following statements produce Output 28.2.1.

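   /* BY-group processing requires the data to be sorted by Region */
   proc sort data=Color;
      by Region;
   run;

   /* One-way table of Hair for each Region, with a chi-square test
      against the hypothesized percentages */
   proc freq data=Color order=data;
      weight Count;
      tables Hair/nocum testp=(30 12 30 25 3);
      by Region;
      title 'Hair Color of European Children';
   run;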

Output 28.2.1: One-way Frequency Table with BY Group

   Hair Color of European Children

   The FREQ Procedure

   Geographic Region=1

                     Hair Color

   Hair       Frequency    Percent    Test Percent
   -----------------------------------------------
   fair              76      30.89           30.00
   red               19       7.72           12.00
   medium            83      33.74           30.00
   dark              65      26.42           25.00
   black              3       1.22            3.00

   Chi-Square Test for Specified Proportions
   -----------------------------------------
   Chi-Square       7.7602
   DF                    4
   Pr > ChiSq       0.1008


   Hair Color of European Children

   The FREQ Procedure

   Geographic Region=2

                     Hair Color

   Hair       Frequency    Percent    Test Percent
   -----------------------------------------------
   fair             152      29.46           30.00
   red               94      18.22           12.00
   medium           134      25.97           30.00
   dark             117      22.67           25.00
   black             19       3.68            3.00

   Chi-Square Test for Specified Proportions
   -----------------------------------------
   Chi-Square      21.3824
   DF                    4
   Pr > ChiSq       0.0003

The frequency tables list the variable values (hair color) in the order in which they appear in the data set. The "Test Percent" column lists the hypothesized percentages for the chi-square test. Always check that you have ordered the TESTP= percentages to correctly match the order of the variable levels.

PROC FREQ computes a chi-square statistic for each region. The chi-square statistic is significant at the 0.05 level for Region 2 (p=0.0003) but not for Region 1 (p=0.1008). This indicates a significant departure from the hypothesized percentages in Region 2. The largest discrepancy there is for red hair, which is observed at 18.22% against a hypothesized 12%.
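
The reported statistics can be checked by hand. The following is a worked sketch, not part of the procedure output. It assumes the usual Pearson chi-square statistic for specified proportions; the symbols are introduced here for illustration only: n_i is the observed frequency of level i, n is the total frequency in the region, p_i is the TESTP= percentage divided by 100, and e_i is the expected frequency. The closed-form tail probability shown applies only because the test has 4 degrees of freedom.

```latex
% Pearson chi-square statistic for specified proportions (5 hair-color levels)
\[
Q_P = \sum_{i=1}^{5} \frac{(n_i - e_i)^2}{e_i}, \qquad e_i = n\,p_i
\]

% Region 1: n = 246, so e = (73.80, 29.52, 73.80, 61.50, 7.38)
\[
Q_P = \frac{(76-73.80)^2}{73.80} + \frac{(19-29.52)^2}{29.52}
    + \frac{(83-73.80)^2}{73.80} + \frac{(65-61.50)^2}{61.50}
    + \frac{(3-7.38)^2}{7.38}
\]
\[
Q_P \approx 0.0656 + 3.7490 + 1.1469 + 0.1992 + 2.5995 = 7.7602
\]

% Upper-tail probability of a chi-square variable with 4 degrees of freedom
\[
\Pr\left(\chi^2_4 > x\right) = e^{-x/2}\left(1 + \frac{x}{2}\right),
\qquad
x = 7.7602:\quad e^{-3.8801}\,(4.8801) \approx 0.1008
\]
```

The same calculation for Region 2 (n = 516) gives 21.3824, of which the red-hair term alone contributes about 16.62.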


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.