The FREQ Procedure

Example 28.1: Creating an Output Data Set with Table Cell Frequencies

The data set Color records the eye and hair color of children from two different regions of Europe. Instead of one observation per child, the data are recorded as cell counts: the variable Count contains the number of children in a region who exhibit each of the 15 possible eye and hair color combinations. Combinations that were not observed in a region are omitted from the data set.

   data Color;
      input Region Eyes $ Hair $ Count @@;
         label Eyes  ='Eye Color'
               Hair  ='Hair Color'
               Region='Geographic Region';
         datalines;
   1 blue  fair   23  1 blue  red     7  1 blue  medium 24
   1 blue  dark   11  1 green fair   19  1 green red     7
   1 green medium 18  1 green dark   14  1 brown fair   34
   1 brown red     5  1 brown medium 41  1 brown dark   40 
   1 brown black   3  2 blue  fair   46  2 blue  red    21
   2 blue  medium 44  2 blue  dark   40  2 blue  black   6
   2 green fair   50  2 green red    31  2 green medium 37
   2 green dark   23  2 brown fair   56  2 brown red    42
   2 brown medium 53  2 brown dark   54  2 brown black  13
   ;
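
Because each record is a cell count rather than a single child, a quick check of the totals can confirm that the data were read as intended. The following is a minimal sketch and not part of the original example: it uses PROC MEANS to count the records and sum Count, first overall and then by Region. The overall sum should be 762, the total number of children reported in the tables later in this example.

```sas
/* Illustrative sanity check: number of records and total of the cell counts. */
proc means data=Color n sum maxdec=0;
   var Count;
run;

/* The same summary by region; CLASS does not require sorted input. */
proc means data=Color n sum maxdec=0;
   class Region;
   var Count;
run;
```

With the data lines shown above, the first step should report 27 records, and the second should report 13 records for region 1 and 14 for region 2.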

The following statements read the Color data set and create an output data set containing the frequencies, percentages, and expected cell frequencies of the Eyes by Hair two-way table. The TABLES statement requests three tables: one-way frequency tables for Eyes and for Hair, and an Eyes by Hair crosstabulation table. The OUT= option creates the FreqCnt data set, which contains the frequencies for the last table requested, the crosstabulation. The OUTEXPECT option adds the expected cell frequencies to FreqCnt, and the SPARSE option includes the cells that have zero counts. The WEIGHT statement specifies that Count contains the observation weights, so each record contributes its cell count to the tables rather than counting as a single observation. These statements produce Output 28.1.1 through Output 28.1.3.

   proc freq data=Color;
      weight Count;
      tables Eyes Hair Eyes*Hair/out=FreqCnt outexpect sparse;
      title 'Eye and Hair Color of European Children';
   run;
   proc print data=FreqCnt noobs;
      title2 'Output Data Set from PROC FREQ';
   run;
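
To see what the SPARSE and OUTEXPECT options contribute, the same table can be requested without them. The following variant is a sketch outside the original example; the data set name FreqNoSparse is hypothetical, and NOPRINT is used only to suppress the displayed tables.

```sas
/* Illustrative variant: same crosstabulation, without SPARSE and OUTEXPECT. */
/* FreqNoSparse is a hypothetical data set name chosen for this sketch.      */
proc freq data=Color noprint;
   weight Count;
   tables Eyes*Hair / out=FreqNoSparse;
run;

proc print data=FreqNoSparse noobs;
   title2 'OUT= Data Set without SPARSE or OUTEXPECT';
run;
```

Under these assumptions, FreqNoSparse should contain 14 observations instead of 15, because the green eyes and black hair combination never occurs in Color, and it should contain no EXPECTED variable.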

Output 28.1.1: Frequency Table

   Eye and Hair Color of European Children

   The FREQ Procedure

   Eye Color

                                    Cumulative    Cumulative
   Eyes      Frequency    Percent    Frequency       Percent
   ---------------------------------------------------------
   blue            222      29.13          222         29.13
   brown           341      44.75          563         73.88
   green           199      26.12          762        100.00

   Hair Color

                                    Cumulative    Cumulative
   Hair      Frequency    Percent    Frequency       Percent
   ---------------------------------------------------------
   black            22       2.89           22          2.89
   dark            182      23.88          204         26.77
   fair            228      29.92          432         56.69
   medium          217      28.48          649         85.17
   red             113      14.83          762        100.00

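Output 28.1.2: Cross Tabulation Table

   Eye and Hair Color of European Children

   The FREQ Procedure

   Table of Eyes by Hair

   Eyes(Eye Color)     Hair(Hair Color)

   Frequency|
   Percent  |
   Row Pct  |
   Col Pct  |black   |dark    |fair    |medium  |red     |  Total
   ---------+--------+--------+--------+--------+--------+
   blue     |      6 |     51 |     69 |     68 |     28 |    222
            |   0.79 |   6.69 |   9.06 |   8.92 |   3.67 |  29.13
            |   2.70 |  22.97 |  31.08 |  30.63 |  12.61 |
            |  27.27 |  28.02 |  30.26 |  31.34 |  24.78 |
   ---------+--------+--------+--------+--------+--------+
   brown    |     16 |     94 |     90 |     94 |     47 |    341
            |   2.10 |  12.34 |  11.81 |  12.34 |   6.17 |  44.75
            |   4.69 |  27.57 |  26.39 |  27.57 |  13.78 |
            |  72.73 |  51.65 |  39.47 |  43.32 |  41.59 |
   ---------+--------+--------+--------+--------+--------+
   green    |      0 |     37 |     69 |     55 |     38 |    199
            |   0.00 |   4.86 |   9.06 |   7.22 |   4.99 |  26.12
            |   0.00 |  18.59 |  34.67 |  27.64 |  19.10 |
            |   0.00 |  20.33 |  30.26 |  25.35 |  33.63 |
   ---------+--------+--------+--------+--------+--------+
   Total          22      182      228      217      113      762
                2.89    23.88    29.92    28.48    14.83   100.00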

By default, PROC FREQ displays the variable values in alphabetical order (Output 28.1.1). The 'Eyes*Hair' specification produces a crosstabulation table (Output 28.1.2) with eye color defining the table rows and hair color defining the table columns. A zero cell count for green eyes and black hair indicates that this eye and hair color combination does not occur in the data.
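
If a different ordering is wanted, the ORDER= option in the PROC FREQ statement controls how the variable levels are arranged. The following is a sketch outside the original example, using ORDER=DATA to list levels in the order in which they first appear in the input data set; the title text is illustrative.

```sas
/* Illustrative only: arrange levels in input-data order, not sorted order. */
proc freq data=Color order=data;
   weight Count;
   tables Eyes*Hair;
   title 'Eye and Hair Color, Levels in Data Order';
run;
```

With the Color data as entered, the rows should then appear as blue, green, brown and the columns as fair, red, medium, dark, black. The cell counts themselves do not change.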

Output 28.1.3: OUT= Data Set

   Output Data Set from PROC FREQ

   Eyes     Hair      COUNT    EXPECTED    PERCENT
   blue     black         6       6.409     0.7874
   blue     dark         51      53.024     6.6929
   blue     fair         69      66.425     9.0551
   blue     medium       68      63.220     8.9239
   blue     red          28      32.921     3.6745
   brown    black        16       9.845     2.0997
   brown    dark         94      81.446    12.3360
   brown    fair         90     102.031    11.8110
   brown    medium       94      97.109    12.3360
   brown    red          47      50.568     6.1680
   green    black         0       5.745     0.0000
   green    dark         37      47.530     4.8556
   green    fair         69      59.543     9.0551
   green    medium       55      56.671     7.2178
   green    red          38      29.510     4.9869


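The output data set (Output 28.1.3) contains the frequency counts and percentages for the last table requested in the TABLES statement, the Eyes by Hair crosstabulation. Because SPARSE is specified, the data set also includes an observation for the zero cell count (green eyes, black hair), and because OUTEXPECT is specified, it includes the variable EXPECTED, which holds the expected frequency for each table cell.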
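
The expected frequency of a cell is its row total multiplied by its column total, divided by the overall total. For blue eyes and black hair, for example, 222 x 22 / 762 is approximately 6.409, which matches Output 28.1.3. The following sketch, not part of the original example, recomputes that quantity from FreqCnt with PROC SQL so the two columns can be compared; the names ExpCheck, RowTotal, ColTotal, and ExpectedByHand are hypothetical.

```sas
/* Illustrative check: expected = (row total * column total) / overall total. */
/* ExpCheck, RowTotal, ColTotal, and ExpectedByHand are hypothetical names.   */
proc sql;
   create table ExpCheck as
   select a.Eyes, a.Hair, a.COUNT, a.EXPECTED,
          r.RowTotal * c.ColTotal / (select sum(COUNT) from FreqCnt)
             as ExpectedByHand format=8.3
   from FreqCnt as a,
        (select Eyes, sum(COUNT) as RowTotal from FreqCnt group by Eyes) as r,
        (select Hair, sum(COUNT) as ColTotal from FreqCnt group by Hair) as c
   where a.Eyes = r.Eyes and a.Hair = c.Hair
   order by a.Eyes, a.Hair;
quit;

proc print data=ExpCheck noobs;
   title2 'EXPECTED Recomputed from Row and Column Totals';
run;
```

If the sketch behaves as intended, ExpectedByHand should agree with EXPECTED in every row, including the green eyes and black hair cell, where the observed count is 0 but the expected frequency is about 5.745.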


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.