The FREQ Procedure

TABLES Statement

TABLES requests < / options > ;

The TABLES statement requests one-way to n-way frequency and crosstabulation tables and statistics for those tables.

If you omit the TABLES statement, PROC FREQ generates one-way frequency tables for all data set variables that are not listed in the other statements. The following argument is required in the TABLES statement.

requests
specifies the frequency and crosstabulation tables to produce. A request is composed of one variable name or several variable names separated by asterisks. To request a one-way frequency table, use a single variable. To request a two-way crosstabulation table, use an asterisk between two variables. To request a multiway table (an n-way table, where n>2), separate the desired variables with asterisks. The unique values of these variables form the rows, columns, and strata of the table.

For two-way to multiway tables, the values of the last variable form the crosstabulation table columns, while the values of the next-to-last variable form the rows. Each level (or combination of levels) of the other variables forms one stratum. PROC FREQ produces a separate crosstabulation table for each stratum. For example, a specification of A*B*C*D in a TABLES statement produces k tables, where k is the number of different combinations of values for A and B. Each table lists the values for C down the side and the values for D across the top.
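As a minimal sketch of the A*B*C*D request, the following step assumes a hypothetical data set WORK.DEMO in which A has two levels, B has three levels, and C and D have two levels each. The data set name, the variable Count, and the cell counts are illustrative assumptions. Because every combination of A and B occurs, the request should produce six tables, each with C down the side and D across the top.

```sas
/* Hypothetical data: one record per combination of A, B, C, and D. */
/* Count is an arbitrary cell frequency used only for illustration. */
data work.demo;
   do A = 1 to 2;
      do B = 1 to 3;
         do C = 1 to 2;
            do D = 1 to 2;
               Count = A + B + C + D;
               output;
            end;
         end;
      end;
   end;
run;

proc freq data=work.demo;
   weight Count;      /* Count supplies the frequency of each cell    */
   tables A*B*C*D;    /* strata: A and B; rows: C; columns: D         */
run;
```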

You can use multiple TABLES statements in the PROC FREQ step. PROC FREQ builds all the table requests in one pass of the data, so that there is essentially no loss of efficiency. You can also specify any number of table requests in a single TABLES statement. To specify multiple table requests quickly, use a grouping syntax by placing parentheses around several variables and joining other variables or variable combinations. For example, the following statements illustrate grouping syntax.

Table 28.8: Grouping Syntax
Request                   Equivalent to
tables A*(B C);           tables A*B A*C;
tables (A B)*(C D);       tables A*C B*C A*D B*D;
tables (A B C)*D;         tables A*D B*D C*D;
tables A--C;              tables A B C;
tables (A--C)*D;          tables A*D B*D C*D;
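
The grouping syntax can be checked against its expanded form. This sketch assumes a hypothetical data set WORK.GROUPED with variables A, B, C, and a frequency variable Count (all names and values are illustrative). The two PROC FREQ steps should request the same pair of two-way tables.

```sas
/* Hypothetical cell counts for three two-level variables. */
data work.grouped;
   input A B C Count;
   datalines;
1 1 1 4
1 1 2 6
1 2 1 3
1 2 2 7
2 1 1 5
2 1 2 5
2 2 1 8
2 2 2 2
;
run;

proc freq data=work.grouped;
   weight Count;
   tables A*(B C);    /* grouped request */
run;

proc freq data=work.grouped;
   weight Count;
   tables A*B A*C;    /* the same two requests written out */
run;
```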

Without Options

If you request a one-way frequency table for a variable without specifying options, PROC FREQ produces frequencies, cumulative frequencies, percentages of the total frequency, and cumulative percentages for each value of the variable. If you request a two-way or an n-way crosstabulation table without specifying options, PROC FREQ produces crosstabulation tables that include cell frequencies, cell percentages of the total frequency, cell percentages of row frequencies, and cell percentages of column frequencies. The procedure excludes observations with missing values from the table but displays the total frequency of missing observations below each table.

Options

The following table lists the options available with the TABLES statement. Descriptions follow in alphabetical order.

Table 28.9: TABLES Statement Options
Option        Description

Control Statistical Analysis
AGREE         requests tests and measures of classification agreement
ALL           requests tests and measures of association produced by CHISQ, MEASURES, and CMH
ALPHA=        sets the confidence level for confidence limits
BINOMIAL      requests binomial proportion, confidence limits, and test for one-way tables
CHISQ         requests chi-square tests and measures of association based on chi-square
CL            requests confidence limits for the MEASURES statistics
CMH           requests all Cochran-Mantel-Haenszel statistics
CMH1          requests the CMH correlation statistic, and adjusted relative risks and odds ratios
CMH2          requests CMH correlation and row mean scores (ANOVA) statistics, and adjusted relative risks and odds ratios
CONVERGE=     specifies convergence criterion to compute polychoric correlation
FISHER        requests Fisher's exact test for tables larger than 2×2
JT            requests Jonckheere-Terpstra test
MAXITER=      specifies maximum number of iterations to compute polychoric correlation
MEASURES      requests measures of association and their asymptotic standard errors
MISSING       treats missing values as nonmissing
PLCORR        requests polychoric correlation
RELRISK       requests relative risk measures for 2×2 tables
RISKDIFF      requests risks and risk differences for 2×2 tables
SCORES=       specifies the type of row and column scores
TESTF=        specifies expected frequencies for a one-way table chi-square test
TESTP=        specifies expected proportions for a one-way table chi-square test
TREND         requests Cochran-Armitage test for trend

Control Additional Table Information
CELLCHI2      displays each cell's contribution to the total Pearson chi-square statistic
CUMCOL        displays the cumulative column percentage in each cell
DEVIATION     displays the deviation of the cell frequency from the expected value for each cell
EXPECTED      displays the expected cell frequency for each cell
MISSPRINT     displays missing value frequencies
SPARSE        lists all possible combinations of variable levels even when a combination does not occur
TOTPCT        displays percentage of total frequency on n-way tables when n>2

Control Displayed Output
NOCOL         suppresses display of the column percentage for each cell
NOCUM         suppresses display of cumulative frequencies and cumulative percentages in one-way frequency tables and in list format
NOFREQ        suppresses display of the frequency count for each cell
NOPERCENT     suppresses display of the percentage, row percentage, and column percentage in crosstabulation tables, or percentages and cumulative percentages in one-way frequency tables and in list format
NOPRINT       suppresses display of tables but displays statistics
NOROW         suppresses display of the row percentage for each cell
LIST          displays two-way to n-way tables in list format
PRINTKWT      displays kappa coefficient weights
SCOROUT       displays the row and the column scores

Create an Output Data Set
OUT=          specifies an output data set to contain variable values and frequency counts
OUTEXPECT     includes the expected frequency of each cell in the output data set
OUTPCT        includes the percentage of column frequency, row frequency, and two-way table frequency in the output data set


You can specify the following options in a TABLES statement.

AGREE < (WT=FC) >
requests tests and measures of classification agreement for square tables. The AGREE option provides McNemar's test for 2×2 tables and Bowker's test of symmetry for tables with more than two response categories. The AGREE option also produces the simple kappa coefficient, the weighted kappa coefficient, the asymptotic standard errors for the simple and weighted kappas, and the corresponding confidence limits. When there are multiple strata, the AGREE option provides overall simple and weighted kappas as well as tests for equal kappas among strata. When there are multiple strata and two response categories, PROC FREQ computes Cochran's Q test. For more information, see the section "Tests and Measures of Agreement".

The (WT=FC) specification requests that PROC FREQ use Fleiss-Cohen weights to compute the weighted kappa coefficient. By default, PROC FREQ uses Cicchetti-Allison weights. See the section "Weighted Kappa Coefficient" for more information. You can specify the option PRINTKWT to display the kappa coefficient weights.
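
A minimal sketch of an agreement analysis follows. It assumes a hypothetical data set WORK.RATINGS in which two raters classify the same subjects into three categories; the data set name, variable names, and counts are illustrative only. The step requests Fleiss-Cohen weights and displays them with PRINTKWT.

```sas
/* Hypothetical 3x3 agreement table, one record per cell. */
data work.ratings;
   input Rater1 Rater2 Count;
   datalines;
1 1 20
1 2 5
1 3 1
2 1 4
2 2 15
2 3 6
3 1 2
3 2 3
3 3 18
;
run;

proc freq data=work.ratings;
   weight Count;
   /* Square table: both raters use the same three categories. */
   tables Rater1*Rater2 / agree(wt=fc) printkwt;
run;
```

Omitting (wt=fc) leaves the default Cicchetti-Allison weights in effect.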

ALL
requests all of the tests and measures that are computed by the CHISQ, MEASURES, and CMH options. The number of CMH statistics computed can be controlled by the CMH1 and CMH2 options.

ALPHA=α
sets the level α used for confidence limits. The value of the ALPHA= option must be between 0.0001 and 0.9999, and the default is 0.05. A value of α results in 100(1 - α)% confidence limits, so the default of ALPHA=0.05 results in 95% confidence limits. If α is between 0 and 1 but outside the range of 0.0001 to 0.9999, PROC FREQ uses the closest range endpoint. For example, if you specify ALPHA=0.000001, PROC FREQ uses 0.0001 to determine confidence limits.
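
For instance, ALPHA=0.1 corresponds to 100(1 - 0.1)% = 90% confidence limits. The sketch below assumes a hypothetical data set WORK.SURVEY with two ordinal variables and a frequency variable Count (names and counts are illustrative) and combines ALPHA= with the MEASURES and CL options.

```sas
/* Hypothetical 3x3 table of two ordinal variables. */
data work.survey;
   input Severity Improvement Count;
   datalines;
1 1 12
1 2 8
1 3 5
2 1 7
2 2 14
2 3 9
3 1 3
3 2 10
3 3 16
;
run;

proc freq data=work.survey;
   weight Count;
   /* 90% confidence limits for the measures of association */
   tables Severity*Improvement / measures cl alpha=0.1;
run;
```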

BINOMIAL < (p=value) >
requests the binomial proportion for one-way tables. This is the proportion of observations for the first variable level that appears in the output. The BINOMIAL option also provides the asymptotic standard error, asymptotic and exact confidence intervals, and the asymptotic test for the binomial proportion. To specify the null hypothesis proportion value for the test, use the p= specification. If you omit p=value, PROC FREQ uses 0.5 as the default for the test. See the section "Binomial Proportion" for more information.
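
The following sketch tests a null proportion of 0.6. It assumes a hypothetical data set WORK.TRIAL with a character variable Response and a frequency variable Count (illustrative names and counts). Because the proportion is computed for the first level that appears in the output, the sketch also uses ORDER=DATA in the PROC FREQ statement, an option not described in this section, so that Success is listed first.

```sas
/* Hypothetical one-way counts; Success is the first record. */
data work.trial;
   input Response $ Count;
   datalines;
Success 34
Failure 16
;
run;

proc freq data=work.trial order=data;
   weight Count;
   /* Proportion of the first level, tested against p = 0.6 */
   tables Response / binomial(p=0.6);
run;
```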

CELLCHI2
displays each cell's contribution to the total Pearson chi-square statistic, which is computed as
\[ \frac{(\text{frequency} - \text{expected})^2}{\text{expected}} \]

The CELLCHI2 option is valid for contingency tables but has no effect on tables that are produced with the LIST option.
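
As a worked illustration, consider a hypothetical 2×2 table with cell frequencies 30, 10, 10, and 30. Every row and column total is 40 and the table total is 80, so each expected frequency is 40 × 40 / 80 = 20, and each cell contributes (30 - 20)²/20 = 5 or (10 - 20)²/20 = 5 to the Pearson chi-square. The sketch below (data set and variable names are assumptions) displays these quantities.

```sas
/* Hypothetical 2x2 table, one record per cell. */
data work.cells;
   input Row Col Count;
   datalines;
1 1 30
1 2 10
2 1 10
2 2 30
;
run;

proc freq data=work.cells;
   weight Count;
   /* Show frequency, expected frequency, and chi-square contribution; */
   /* suppress the percentages to keep each cell compact.              */
   tables Row*Col / expected cellchi2 norow nocol nopercent;
run;
```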

CHISQ
requests chi-square tests of homogeneity or independence and measures of association based on chi-square. The tests include the Pearson chi-square, likelihood-ratio chi-square, and Mantel-Haenszel chi-square. The measures include the phi coefficient, the contingency coefficient, and Cramer's V. For 2×2 tables, the CHISQ option includes Fisher's exact test and the continuity-adjusted chi-square. For one-way tables, the CHISQ option requests a chi-square goodness-of-fit test for equal proportions. If you specify the null hypothesis proportions with the TESTP= option, then PROC FREQ computes a chi-square goodness-of-fit test for the specified proportions. If you specify null hypothesis frequencies with the TESTF= option, PROC FREQ computes a chi-square goodness-of-fit test for the specified frequencies. See the section "Chi-Square Tests and Statistics" for more information.

CL
requests confidence limits for the MEASURES statistics. If you omit the MEASURES option, the CL option invokes MEASURES. The FREQ procedure determines the confidence coefficient using the ALPHA= option, which by default equals 0.05 and produces 95% confidence limits.

For more information, see the section "Confidence Limits".

CMH
requests Cochran-Mantel-Haenszel statistics, which test for association between the row and column variables after adjusting for the remaining variables in a multiway table. In addition, for 2×2 tables, PROC FREQ computes the adjusted Mantel-Haenszel and logit estimates of the odds ratios and relative risks and the corresponding confidence limits. For the stratified 2×2 case, PROC FREQ computes the Breslow-Day test for homogeneity of odds ratios. The CMH1 and CMH2 options control the number of CMH statistics that PROC FREQ computes. For more information, see the section "Cochran-Mantel-Haenszel Statistics".

CMH1
requests the Cochran-Mantel-Haenszel correlation statistic and, for 2×2 tables, the adjusted Mantel-Haenszel and logit estimates of the odds ratios and relative risks and the corresponding confidence limits. For the stratified 2×2 case, PROC FREQ computes the Breslow-Day test for homogeneity of odds ratios. Except for 2×2 tables, the CMH1 option requires less memory than the CMH option, which can require an enormous amount for large tables.

CMH2
requests the Cochran-Mantel-Haenszel correlation statistic, row mean scores (ANOVA) statistic, and, for 2×2 tables, the adjusted Mantel-Haenszel and logit estimates of the odds ratios and relative risks and the corresponding confidence limits. For the stratified 2×2 case, PROC FREQ computes the Breslow-Day test for homogeneity of odds ratios. Except for tables with two columns, the CMH2 option requires less memory than the CMH option, which can require an enormous amount for large tables.
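
A stratified 2×2 analysis can be sketched as follows. The data set WORK.MULTI, its variables, and its counts are hypothetical. Center is the stratification variable, Treatment forms the rows, and Response forms the columns, so the CMH statistics adjust the Treatment-by-Response association for Center.

```sas
/* Hypothetical two-center trial, one record per cell. */
data work.multi;
   input Center $ Treatment $ Response $ Count;
   datalines;
A Active  Yes 20
A Active  No  10
A Placebo Yes 12
A Placebo No  18
B Active  Yes 15
B Active  No  15
B Placebo Yes  9
B Placebo No  21
;
run;

proc freq data=work.multi;
   weight Count;
   /* Stratum: Center; rows: Treatment; columns: Response */
   tables Center*Treatment*Response / cmh;
run;
```

Replacing CMH with CMH1 or CMH2 in this step limits the request to the subsets of statistics described above.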

CONVERGE=value
specifies the convergence criterion for computing the polychoric correlation when the PLCORR option is specified. The value of the CONVERGE= option must be a positive number; by default, CONVERGE=0.0001. Iterative computation of the polychoric correlation stops when the convergence measure falls below the value of the CONVERGE= option or when the number of iterations specified by the MAXITER= option is exceeded, whichever happens first.

See the section "Polychoric Correlation" for more information.

CUMCOL
displays the cumulative column percentages in the cells of the crosstabulation table.

DEVIATION
displays the deviation of the cell frequency from the expected frequency for each cell of the crosstabulation table. The DEVIATION option is valid for contingency tables but has no effect on tables produced with the LIST option.

FISHER  |  EXACT
requests Fisher's exact test for tables that are larger than 2×2. This test is also known as the Freeman-Halton test. For more information, see the section "Fisher's Exact Test" and the "EXACT Statement" section.

If you omit the CHISQ option in the TABLES statement, the FISHER option invokes CHISQ. You can also request Fisher's exact test by specifying the FISHER option in the EXACT statement.

Caution: For tables with many rows or columns or with large total frequency, PROC FREQ may require a large amount of time or memory to compute exact p-values (see the section "Computational Resources").

EXPECTED
displays the expected cell frequencies under the hypothesis of independence (or homogeneity). The EXPECTED option is valid for crosstabulation tables but has no effect on tables produced with the LIST option.

JT
performs the Jonckheere-Terpstra test. For more information, see the section "Jonckheere-Terpstra Test".

LIST
displays two-way to n-way tables in a list format rather than as crosstabulation tables. PROC FREQ ignores the LIST option when you request statistical tests or measures of association.

MAXITER=number
specifies the maximum number of iterations for computing the polychoric correlation when the PLCORR option is specified. The value of the MAXITER= option must be a positive integer; by default, MAXITER=20. Iterative computation of the polychoric correlation stops when the number of iterations specified by the MAXITER= option is exceeded or when the convergence measures fall below the value of the CONVERGE= option, whichever happens first. For more information see the section "Polychoric Correlation".

MEASURES
requests several measures of association and their asymptotic standard errors (ASE). The measures include gamma, Kendall's tau-b, Stuart's tau-c, Somers' D (C|R), Somers' D (R|C), the Pearson and Spearman correlation coefficients, lambda (symmetric and asymmetric), uncertainty coefficients (symmetric and asymmetric), and, for 2×2 tables, the odds ratio, column 1 relative risk, column 2 relative risk, and the corresponding confidence limits.

For more information, see the section "Measures of Association".

MISSING
treats missing values as nonmissing and includes them in calculations of percentages and other statistics.

For more information, see the section "Missing Values".

MISSPRINT
displays missing value frequencies for all tables, even though PROC FREQ does not use the frequencies in the calculation of statistics. For more information, see the section "Missing Values".
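
The difference between MISSPRINT and MISSING can be seen by running both on the same data. This sketch assumes a hypothetical data set WORK.ANSWERS in which the numeric variable Answer is missing for two observations (names and values are illustrative).

```sas
/* Hypothetical raw data; a period denotes a missing numeric value. */
data work.answers;
   input Group $ Answer;
   datalines;
A 1
A 2
A .
B 1
B .
B 2
B 2
;
run;

proc freq data=work.answers;
   /* Display the missing level but leave it out of the statistics */
   tables Group*Answer / missprint;
   /* Treat the missing value as an ordinary level of Answer */
   tables Group*Answer / missing;
run;
```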

NOCOL
suppresses the display of column percentages in cells of the crosstabulation table.

NOCUM
suppresses the display of cumulative frequencies and cumulative percentages for one-way frequency tables and for frequencies in list format.

NOFREQ
suppresses the display of cell frequencies for a crosstabulation table. This also suppresses frequencies for row totals.

NOPERCENT
suppresses the display of cell percentages, row total percentages, and column total percentages for a crosstabulation table. For one-way frequency tables and frequencies in list format, the NOPERCENT option suppresses the display of percentages and cumulative percentages.

NOPRINT
suppresses the display of frequency and crosstabulation tables but displays all requested tests and statistics. Use the NOPRINT option in the PROC FREQ statement to suppress the display of all tables.

NOROW
suppresses the display of row percentages in cells of the crosstabulation table.

OUT=SAS-data-set
names the output data set that contains variable values and frequency counts. The variable COUNT contains the frequencies and the variable PERCENT contains the percentages. If more than one table request appears in the TABLES statement, the contents of the data set correspond to the last table request in the TABLES statement. For more information, see the section "Output Data Sets" and see the following descriptions for the options OUTEXPECT and OUTPCT.

OUTEXPECT
includes the expected frequency in the output data set when you specify the OUT= option in the TABLES statement. The variable EXPECTED contains the expected frequency for each table cell.

For more information, see the section "Output Data Sets".

OUTPCT
includes the following additional variables in the output data set when you specify the OUT= option in the TABLES statement:
PCT_COL
the percentage of column frequency

PCT_ROW
the percentage of row frequency

PCT_TABL
the percentage of stratum frequency, for n-way tables where n > 2

For more information, see the section "Output Data Sets".
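
A minimal sketch of creating and inspecting the output data set follows. The data set names WORK.CELLS and WORK.CELLOUT and the frequency variable Wt are assumptions. For this two-way request, the output data set should contain the table variables together with COUNT, EXPECTED, PERCENT, PCT_ROW, and PCT_COL.

```sas
/* Hypothetical 2x2 table, one record per cell. */
data work.cells;
   input Row Col Wt;
   datalines;
1 1 30
1 2 10
2 1 10
2 2 30
;
run;

proc freq data=work.cells;
   weight Wt;
   /* Write the cell statistics to a data set; NOPRINT here */
   /* suppresses the displayed crosstabulation table.       */
   tables Row*Col / out=work.cellout outexpect outpct noprint;
run;

proc print data=work.cellout;
run;
```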

PLCORR
requests the polychoric correlation coefficient. For 2×2 tables, this statistic is more commonly known as the tetrachoric correlation coefficient, and it is labeled as such in the displayed output. If you omit the MEASURES option, the PLCORR option invokes MEASURES. For more information, see the section "Polychoric Correlation" and the descriptions for the CONVERGE= and MAXITER= options in this list.
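
The following sketch requests the polychoric correlation with a tighter convergence criterion and a larger iteration limit than the defaults. The data set WORK.ITEMS, its variables, and its counts are hypothetical, and the values 0.00001 and 50 are arbitrary illustrations rather than recommendations.

```sas
/* Hypothetical 3x3 table of two ordinal items. */
data work.items;
   input Item1 Item2 Count;
   datalines;
1 1 18
1 2 9
1 3 3
2 1 8
2 2 15
2 3 7
3 1 2
3 2 10
3 3 20
;
run;

proc freq data=work.items;
   weight Count;
   /* PLCORR invokes MEASURES; iteration stops at convergence */
   /* or after 50 iterations, whichever happens first.        */
   tables Item1*Item2 / plcorr converge=0.00001 maxiter=50;
run;
```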

PRINTKWT
displays the weights PROC FREQ uses to compute the weighted kappa coefficient. You must also specify the AGREE option, which requests the weighted kappa coefficient. You can specify (WT=FC) with the AGREE option to request Fleiss-Cohen weights. By default, PROC FREQ uses Cicchetti-Allison weights.

See the section "Weighted Kappa Coefficient" for more information.

RELRISK
requests relative risk measures and their confidence limits for 2×2 tables. These measures include the odds ratio and the column 1 and 2 relative risks. For more information, see the section "Odds Ratio and Relative Risks for 2×2 Tables". You can also obtain the RELRISK measures by specifying the MEASURES option, which produces other measures of association in addition to the relative risks.

RISKDIFF
requests column 1 and 2 risks (or binomial proportions), risk differences, and their confidence limits for 2×2 tables. See the section "Risks and Risk Differences" for more information.
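
RELRISK and RISKDIFF can be requested together for a 2×2 table. The sketch below assumes a hypothetical data set WORK.COHORT with numeric variables Exposure and Outcome and a frequency variable Count (illustrative names and counts).

```sas
/* Hypothetical 2x2 cohort table, one record per cell. */
data work.cohort;
   input Exposure Outcome Count;
   datalines;
1 1 25
1 2 75
2 1 10
2 2 90
;
run;

proc freq data=work.cohort;
   weight Count;
   /* Odds ratio and relative risks, plus risks and risk differences */
   tables Exposure*Outcome / relrisk riskdiff;
run;
```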

SCORES=type
specifies the type of row and column scores that PROC FREQ uses with the Mantel-Haenszel chi-square, Pearson correlation, Cochran-Armitage test for trend, weighted kappa coefficient, and Cochran-Mantel-Haenszel statistics, where type is one of the following (the default is SCORES=TABLE):

MODRIDIT
RANK
RIDIT
TABLE

By default, the row or column scores are the integers 1,2,... for character variables and the actual variable values for numeric variables. Using other types of scores yields nonparametric analyses.

For more information, see the section "Scores".

SCOROUT
displays the row and the column scores. You specify the score type with the SCORES= option. PROC FREQ uses the scores when it calculates the Mantel-Haenszel chi-square, Pearson correlation, Cochran-Armitage test for trend, weighted kappa coefficient, or Cochran-Mantel-Haenszel statistics. The SCOROUT option displays the row and column scores only when statistics are computed for two-way tables. To store the scores in an output data set, use the Output Delivery System.

For more information, see the section "Scores".

SPARSE
lists all possible combinations of the variable values for an n-way table when n>1, even if a combination does not occur in the data. The SPARSE option has no effect unless you also specify the LIST or OUT= option. When you use the SPARSE and LIST options, PROC FREQ lists any combination of values with a frequency count of zero. When you use the SPARSE and OUT= options, PROC FREQ includes empty crosstabulation table cells in the output data set.

For more information, see the section "Missing Values".
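
A minimal sketch of SPARSE with LIST follows. The hypothetical data set WORK.SPARSEDEMO (an assumed name, with illustrative counts) contains no record for the combination A=2, B=2, so with SPARSE the list output should include that combination with a frequency count of zero.

```sas
/* Hypothetical data: the combination A=2, B=2 never occurs. */
data work.sparsedemo;
   input A B Count;
   datalines;
1 1 5
1 2 3
2 1 4
;
run;

proc freq data=work.sparsedemo;
   weight Count;
   /* List format, including combinations that do not occur */
   tables A*B / list sparse;
run;
```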

TESTF=(values)
specifies the null hypothesis frequencies for a one-way chi-square test for specified frequencies. You can separate values with blanks or commas. The sum of the frequency values must equal the total frequency for the one-way table. The number of TESTF= values must equal the number of variable levels in the one-way table. List these values in the order in which the corresponding variable levels appear in the output. If you omit the CHISQ option, the TESTF= option invokes CHISQ.

For more information, see the section "Chi-Square Test for One-Way Tables".

TESTP=(values)
specifies the null hypothesis proportions for a one-way chi-square test for specified proportions. You can separate values with blanks or commas. Specify values in probability form as numbers between 0 and 1, where the proportions sum to 1. Or specify values in percentage form as numbers between 0 and 100, where the percentages sum to 100. The number of TESTP= values must equal the number of variable levels in the one-way table. List these values in the order in which the corresponding variable levels appear in the output. If you omit the CHISQ option, the TESTP= option invokes CHISQ.

For more information, see the section "Chi-Square Test for One-Way Tables".
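
The following sketch shows TESTP= and TESTF= specifying the same null hypothesis for a one-way table. It assumes a hypothetical data set WORK.COLORS with a total frequency of 100 (names and counts are illustrative). The levels appear in the output in the order Blue, Green, Red, so the values are listed in that order, and the TESTF= values sum to the total frequency.

```sas
/* Hypothetical one-way counts; total frequency is 100. */
data work.colors;
   input Color $ Count;
   datalines;
Blue 28
Green 52
Red 20
;
run;

proc freq data=work.colors;
   weight Count;
   /* Null proportions 0.3, 0.5, 0.2 (these sum to 1) */
   tables Color / testp=(0.3 0.5 0.2);
   /* The same hypothesis as frequencies (these sum to 100) */
   tables Color / testf=(30 50 20);
run;
```

Neither statement specifies CHISQ, because TESTP= and TESTF= invoke it.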

TOTPCT
displays the percentage of total frequency on crosstabulation tables, for n-way tables where n > 2. This percentage is also available with the LIST option or as the PERCENT variable in the OUT= output data set.

TREND
performs the Cochran-Armitage test for trend. The table must be 2×C or R×2. For more information, see the section "Cochran-Armitage Test for Trend".
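
A minimal sketch of a trend test on a 2×3 table follows. The data set WORK.DOSE, its variables, and its counts are hypothetical. Response has two levels and forms the rows; Dose has three ordered levels and forms the columns.

```sas
/* Hypothetical dose-response counts, one record per cell. */
data work.dose;
   input Response $ Dose Count;
   datalines;
No 1 25
No 2 20
No 3 12
Yes 1 5
Yes 2 10
Yes 3 18
;
run;

proc freq data=work.dose;
   weight Count;
   /* 2 x C table: two response levels by three dose levels */
   tables Response*Dose / trend;
run;
```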

Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.