Output Data Sets
The OUTC= Data Set
The OUTC= data set contains two or three character variables
and 4n+4 numeric variables, where n is the
number of axes from DIMENS=n (two by default).
The OUTC= data set contains one observation for each
row, column, supplementary row, and supplementary
column point, and one observation for inertias.
The first variable is named _TYPE_
and identifies the type of observation.
The values of _TYPE_ are as follows:
- The 'INERTIA' observation contains the total inertia
in the Inertia variable, and each dimension's
inertia in the Contr1-Contrn variables.
- The 'OBS' observations contain the coordinates
and statistics for the rows of the table.
- The 'SUPOBS' observations contain the coordinates and
statistics for the supplementary rows of the table.
- The 'VAR' observations contain the coordinates
and statistics for the columns of the table.
- The 'SUPVAR' observations contain the coordinates and
statistics for the supplementary columns of the table.
If you specify the SOURCE option, then the data set also contains a variable
_VAR_ containing the name or label of the input variable
from which that row originates. The name of the next variable
is either _NAME_ or (if you specify an ID statement) the name
of the ID variable.
For observations with a value of 'OBS' or 'SUPOBS' for the
_TYPE_ variable, the
values of the _NAME_ or ID variable are constructed as follows:
- When you use a VAR statement without an ID statement,
the values are 'Row1', 'Row2', and so on.
- When you specify a VAR statement with an ID statement, the
values are set equal to the values of the ID variable.
- When you specify a TABLES statement, the _NAME_ variable has
values formed from the appropriate row variable values.
For observations with a value of 'VAR' or 'SUPVAR' for the
_TYPE_ variable,
the values of the _NAME_ or ID variable are equal to the
names or labels of the VAR (or SUPPLEMENTARY) variables.
When you specify a TABLES statement, the values are
formed from the appropriate column variable values.
The remaining variables contain the
numerical results of the correspondence analysis.
- Quality contains the quality of each point's
representation in the DIMENS=n dimensional
display, which is the sum of squared cosines
over the first n dimensions.
- Mass contains the masses or marginal
sums of the relative frequency matrix.
- Inertia contains each point's relative
contribution to the total inertia.
- Dim1-Dimn contain the point coordinates.
- Contr1-Contrn contain the
partial contributions to inertia.
- SqCos1-SqCosn contain the squared cosines.
- Best1-Bestn and Best contain the summaries
of the partial contributions to inertia.
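The variable count and the definition of Quality given above can be checked with a small sketch. This is illustrative Python, not SAS code. The function names are hypothetical, and the ordering of the names simply follows the list above, which is an assumption about the data set's actual column order.

```python
def outc_numeric_names(n=2):
    """Numeric OUTC= variable names for DIMENS=n, in the order the list mentions them.

    The ordering is an assumption for illustration; only the names and the
    count (4n+4) come from the text.
    """
    names = ["Quality", "Mass", "Inertia"]
    for prefix in ("Dim", "Contr", "SqCos", "Best"):
        names += [f"{prefix}{k}" for k in range(1, n + 1)]
    names.append("Best")
    return names


def quality(sq_cos):
    """Quality of a point: the sum of its squared cosines over the n retained dimensions."""
    return sum(sq_cos)


names = outc_numeric_names(2)
print(len(names), names)      # 4*2 + 4 = 12 names
print(quality([0.5, 0.25]))   # made-up squared cosines for one point
```

With the default DIMENS=2 the list has 12 names, which matches 4n+4.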
The OUTF= Data Set
The OUTF= data set contains frequencies and percentages. It is similar
to a PROC FREQ output data set.
The OUTF= data set begins with a variable called _TYPE_, which
contains the observation type. If the SOURCE option is specified, the
data set also contains two variables, _ROWVAR_ and _COLVAR_,
that contain the names or labels of the row and column input variables
from which each cell originates.
The next two variables are classification variables that contain the row
and column levels. If you use TABLES statement input and each variable
list consists of a single variable, the names of these two variables
match the names of the input variables; otherwise, these variables are
named Row and Column. The final two variables are
Count and Percent, which contain frequencies and
percentages.
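The variable layout described above can be summarized as a small sketch. This is illustrative Python rather than SAS. The function name, its parameters, and the input variable names Region and Product are hypothetical; only the naming rules come from the text.

```python
def outf_variable_names(source=False, tables_input=False, row_vars=(), col_vars=()):
    """Return the OUTF= variable names implied by the description above.

    source       -- True if the SOURCE option is specified
    tables_input -- True if the input comes from a TABLES statement
    row_vars, col_vars -- the row and column variable lists (hypothetical names)
    """
    names = ["_TYPE_"]
    if source:
        names += ["_ROWVAR_", "_COLVAR_"]
    # Input names are reused only when each TABLES list has exactly one variable.
    if tables_input and len(row_vars) == 1 and len(col_vars) == 1:
        names += [row_vars[0], col_vars[0]]
    else:
        names += ["Row", "Column"]
    names += ["Count", "Percent"]
    return names


print(outf_variable_names(tables_input=True, row_vars=["Region"], col_vars=["Product"]))
print(outf_variable_names(source=True))
```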
The _TYPE_ variable can have the following values:
- 'OBSERVED' observations contain the contingency table.
- 'SUPOBS' observations contain the supplementary rows.
- 'SUPVAR' observations contain the supplementary columns.
- 'EXPECTED' observations contain the product of the
row marginals and the column marginals divided by
the grand frequency of the observed frequency table.
For ordinary two-way contingency tables, this
is the expected frequency matrix under the
hypothesis of row and column independence.
- 'DEVIATION' observations contain the matrix of
deviations between the observed frequency matrix
and the product of its row marginals and column
marginals divided by its grand frequency.
For ordinary two-way contingency tables, these
are the observed minus expected frequencies under
the hypothesis of row and column independence.
- 'CELLCHI2' observations contain contributions
to the total chi-square test statistic.
- 'RP' observations contain the row profiles.
- 'SUPRP' observations contain supplementary row profiles.
- 'CP' observations contain the column profiles.
- 'SUPCP' observations contain supplementary column profiles.
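The quantities behind several of these observation types can be sketched for a small two-way table. This is illustrative Python, not the procedure's implementation. The function name and the example counts are made up. The expected and deviation formulas follow the descriptions above. The cell chi-square formula (squared deviation divided by expected frequency) and the profile definitions (each row or column divided by its own total) are the standard ones and are assumed here, since the text does not spell them out. The sketch also assumes no zero marginals.

```python
def outf_tables(observed):
    """Compute the matrices described above from an observed frequency table.

    observed is a list of rows of counts. Returns a dict keyed by _TYPE_ value.
    """
    row_tot = [sum(row) for row in observed]
    col_tot = [sum(col) for col in zip(*observed)]
    grand = sum(row_tot)

    # EXPECTED: row marginal times column marginal, divided by the grand frequency.
    expected = [[r * c / grand for c in col_tot] for r in row_tot]
    # DEVIATION: observed minus expected.
    deviation = [[o - e for o, e in zip(orow, erow)]
                 for orow, erow in zip(observed, expected)]
    # CELLCHI2 (assumed Pearson form): squared deviation over expected.
    cellchi2 = [[d * d / e for d, e in zip(drow, erow)]
                for drow, erow in zip(deviation, expected)]
    # RP and CP (assumed): rows divided by row totals, columns by column totals.
    row_profiles = [[o / rt for o in orow] for orow, rt in zip(observed, row_tot)]
    col_profiles = [[o / ct for o, ct in zip(orow, col_tot)] for orow in observed]

    return {
        "OBSERVED": observed,
        "EXPECTED": expected,
        "DEVIATION": deviation,
        "CELLCHI2": cellchi2,
        "RP": row_profiles,
        "CP": col_profiles,
    }


result = outf_tables([[10, 20], [30, 40]])  # made-up counts
for kind, matrix in result.items():
    print(kind, matrix)
```

Summing the CELLCHI2 cells gives the total chi-square statistic for the table, and each row of RP (and each column of CP) sums to 1.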
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.