OUTBY= Data Set
The OUTBY= data set describes the cross sections
in the input data file.
These cross sections are represented as BY groups in the OUT= data set.
The OUTBY= data set contains the following variables:
- the BY variables, whose values identify the different
cross sections in the data file. The BY variables depend on
the file type.
- BYSELECT, a numeric variable that reports whether the
BY variable values for this observation
satisfy the WHERE statement condition.
The value of BYSELECT is 1 for BY groups selected
by the WHERE statement for output to the OUT= data set
and is 0 for BY groups that are excluded by the WHERE statement.
BYSELECT is added to the data set only if a WHERE statement is given.
When there is no WHERE statement, all the BY groups are selected.
- ST_DATE, a numeric variable that gives the starting date for the BY group.
The starting date is the earliest of the starting dates of all the
series that have data for the current BY group.
- END_DATE, a numeric variable that gives the ending date for the BY group.
The ending date is the latest of the ending dates of all the
series that have data for the BY group.
- NTIME, a numeric variable that gives the number of
time periods between ST_DATE and END_DATE, inclusive.
Usually, NTIME is the same as NOBS, but the two can differ when time
periods are not equally spaced and when the OUT= data set is not specified.
NTIME is an upper limit on NOBS.
- NOBS, a numeric variable that gives the number of
time series observations in the OUT= data set
between ST_DATE and END_DATE, inclusive. When a given BY
group is discarded by a WHERE statement, the value of NOBS
for this BY group is 0, since the OUT= data
set does not contain any observations for this BY group. Note
that BYSELECT=0 for every discarded BY group.
- NINRANGE, a numeric variable that gives the number of observations
in the range (from,to) defined by the RANGE statement.
This variable is added to the OUTBY= data
set only when the RANGE statement is specified.
- NSERIES, a numeric variable that gives the total number of
unique time series variables having data for the BY group.
- NSELECT, a numeric variable that gives
the total number of selected time series variables
having data for the BY group.
- the generic variables, whose values remain
constant for all the series in the current BY group.
Of the variables in this list, you can control the attributes of only
the BY and generic variables.
The variables NOBS, NTIME, and NINRANGE give observation counts, while the
variables NSERIES and NSELECT give series counts.
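As an illustration of how the observation counts can be used, the following
sketch lists the BY groups for which NOBS falls short of NTIME. The file type
IMFIFSP, the fileref IFSFILE, the file path, and the data set names IFSDATA,
BYINFO, and GAPS are placeholders chosen for this example, not values taken
from this section; substitute the ones that match your data file.

```sas
/* Placeholder fileref and path: point this at your own data file. */
filename ifsfile 'path-to-your-data-file';

/* Write both the series (OUT=) and the BY group summary (OUTBY=).  */
/* FILETYPE= and INTERVAL= are assumptions made for this sketch.    */
proc datasource filetype=imfifsp infile=ifsfile interval=month
                out=ifsdata outby=byinfo;
run;

/* Keep only BY groups with fewer observations than time periods   */
/* spanned between ST_DATE and END_DATE.                            */
data gaps;
   set byinfo;
   if nobs < ntime;
run;

proc print data=gaps;
run;
```

Because NTIME is a limit on NOBS, any BY group listed in GAPS covers fewer
observations than the number of periods in its date range.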
By default, observations for only
the selected BY groups (where BYSELECT=1) are output to the OUTBY= data set, and
the date and time range variables
are computed over only the selected time series variables.
If the OUTSELECT=OFF option is specified, the OUTBY= data set contains an observation
for each BY group, and the date and time range variables
are computed over all the time series variables.
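The effect of OUTSELECT=OFF can be sketched as follows. The file type, fileref,
path, data set name ALLBY, the BY variable COUNTRY, and the value '111' are
assumptions made for illustration; the BY variables actually available depend
on the file type.

```sas
/* Placeholder fileref and path: point this at your own data file. */
filename ifsfile 'path-to-your-data-file';

/* OUTSELECT=OFF keeps an observation in OUTBY= for every BY group, */
/* including those excluded by the WHERE statement.                 */
proc datasource filetype=imfifsp infile=ifsfile interval=month
                outselect=off outby=allby;
   where country='111';   /* hypothetical BY variable and value */
run;

/* BYSELECT is present because a WHERE statement was given.         */
/* List the BY groups that the WHERE statement excluded.            */
proc print data=allby;
   where byselect=0;
run;
```

Without OUTSELECT=OFF, the same run would write only the BY groups with
BYSELECT=1 to the OUTBY= data set, so the listing of excluded groups would
be empty.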
For file types that have no BY variables,
the OUTBY= data set contains one observation giving ST_DATE, END_DATE, NTIME,
NOBS, NINRANGE, NSERIES, and NSELECT for all the series in the file.
If you do not know the BY variable names
or their possible values,
you can do an initial run of PROC DATASOURCE with the OUTBY= option.
The information contained in the OUTBY= data set can help you design
your WHERE expression and RANGE statement for the subsequent executions of
PROC DATASOURCE to obtain different subsets of the same data file.
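A minimal sketch of this two-step approach follows. The file type, fileref,
path, data set names, the BY variable COUNTRY, the value '111', and the date
range are all placeholders assumed for this example; in practice you would
choose them after inspecting the OUTBY= data set from the first run.

```sas
/* Placeholder fileref and path: point this at your own data file. */
filename ifsfile 'path-to-your-data-file';

/* Step 1: survey run. Only OUTBY= is requested, so the result      */
/* lists the BY groups with their date ranges and series counts.    */
proc datasource filetype=imfifsp infile=ifsfile interval=month
                outby=byinfo;
run;

proc print data=byinfo;
run;

/* Step 2: extraction run. The WHERE expression and RANGE statement */
/* use values read from BYINFO (hypothetical values shown here).    */
proc datasource filetype=imfifsp infile=ifsfile interval=month
                out=subset outby=subby;
   where country='111';
   range from 1990:1 to 1995:12;
run;
```

The second step can be repeated with different WHERE expressions and RANGE
statements to obtain other subsets of the same data file.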
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.