Example 10.4: DRI/McGraw-Hill Tape Format CITIBASE Files
This example illustrates how to extract daily series from a
sample CITIBASE file. Also, it shows how the OUTSELECT= option affects the
contents of the auxiliary data sets.
The daily series contained in the sample data file CITIDEMO are listed
by the following statements:
proc datasource filetype=citibase infile=citidemo interval=weekday
                outall=citiall outby=citikey;
run;
title1 'Summary Information on Daily Data for CITIDEMO File';
proc print data=citikey noobs;
run;
title1 'Daily Series Available in CITIDEMO File';
proc print data=citiall( drop=label );
run;
Output 10.4.1: Printout of the OUTBY= and OUTALL= Data Sets
Summary Information on Daily Data for CITIDEMO File
| ST_DATE   | END_DATE  | NTIME | NOBS | NSERIES | NSELECT |
| 01JAN1988 | 14MAR1991 |   835 |  835 |      10 |      10 |
Daily Series Available in CITIDEMO File
| Obs | NAME        | SELECTED | TYPE | LENGTH | VARNUM | BLKNUM | FORMAT | FORMATL | FORMATD | ST_DATE   | END_DATE  | NTIME | NOBS | CODE        | ATTRIBUT | NDEC |
|   1 | DSIUSNYDJCM |        1 |    1 |      5 |      . |     42 |        |       0 |       0 | 04JAN1988 | 14MAR1991 |   834 |  834 | DSIUSNYDJCM |        1 |    2 |
|   2 | DSIUSNYSECM |        1 |    1 |      5 |      . |     43 |        |       0 |       0 | 04JAN1988 | 14MAR1991 |   834 |  834 | DSIUSNYSECM |        1 |    2 |
|   3 | DSIUSWIL    |        1 |    1 |      5 |      . |     44 |        |       0 |       0 | 04JAN1988 | 14MAR1991 |   834 |  834 | DSIUSWIL    |        1 |    2 |
|   4 | DFXWCAN     |        1 |    1 |      5 |      . |     45 |        |       0 |       0 | 01JAN1988 | 14MAR1991 |   835 |  835 | DFXWCAN     |        1 |    4 |
|   5 | DFXWUK90    |        1 |    1 |      5 |      . |     46 |        |       0 |       0 | 01JAN1988 | 14MAR1991 |   835 |  835 | DFXWUK90    |        1 |    2 |
|   6 | DSIUKAS     |        1 |    1 |      5 |      . |     47 |        |       0 |       0 | 01JAN1988 | 14MAR1991 |   835 |  835 | DSIUKAS     |        1 |    2 |
|   7 | DSIJPND     |        1 |    1 |      5 |      . |     48 |        |       0 |       0 | 01JAN1988 | 14MAR1991 |   835 |  835 | DSIJPND     |        1 |    2 |
|   8 | DCP05       |        1 |    1 |      5 |      . |     49 |        |       0 |       0 | 04JAN1988 | 24FEB1989 |   300 |  300 | DCP05       |        2 |    2 |
|   9 | DCD1M       |        1 |    1 |      5 |      . |     50 |        |       0 |       0 | 04JAN1988 | 08MAR1991 |   830 |  830 | DCD1M       |        1 |    2 |
|  10 | DTBD3M      |        1 |    1 |      5 |      . |     51 |        |       0 |       0 | 04JAN1988 | 08MAR1991 |   830 |  830 | DTBD3M      |        1 |    2 |
Note the following from Output 10.4.1:
- The OUTALL= data set reports the time range of each variable.
- There are ten observations in the OUTALL= data set, the same number
as reported by the NSERIES and NSELECT variables in the OUTBY= data set.
- The VARNUM variable contains only missing values, since no OUT= data
set is created.
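The second point can be checked directly against the data sets. The following is a minimal sketch, assuming the CITIALL and CITIKEY data sets created by the preceding step are still in the WORK library; the column alias N_OUTALL is an illustrative name and is not produced by PROC DATASOURCE.

```sas
/* Sketch only: compare the OUTALL= observation count with the      */
/* NSERIES and NSELECT values stored in the OUTBY= data set.        */
/* N_OUTALL is a hypothetical alias chosen for this illustration.   */
proc sql;
   /* One observation per series is expected in OUTALL= */
   select count(*) as n_outall
      from citiall;

   /* Counts reported by PROC DATASOURCE in OUTBY= */
   select nseries, nselect
      from citikey;
quit;
```

If the counts agree, the first query returns the same number that the second query shows for NSERIES.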
The next step is to demonstrate how the OUTSELECT= option affects the contents
of the OUTBY= and OUTALL= data sets when a KEEP statement is present.
First, set the OUTSELECT= option to OFF.
proc datasource filetype=citibase infile=citidemo interval=weekday
                outall=alloff outby=keyoff outselect=off;
   keep dsiusnysecm dc:;
run;
title1 'Summary Information on Daily Data for CITIDEMO File';
proc print data=keyoff;
run;
title1 'Daily Series Available in CITIDEMO File';
proc print data=alloff( keep=name kept selected st_date
                        end_date ntime nobs );
run;
Output 10.4.2: Printout of the OUTBY= and OUTALL= Data Sets with OUTSELECT=OFF
Summary Information on Daily Data for CITIDEMO File
| Obs | ST_DATE   | END_DATE  | NTIME | NOBS | NSERIES | NSELECT |
|   1 | 01JAN1988 | 14MAR1991 |   835 |  834 |      10 |       3 |
Daily Series Available in CITIDEMO File
| Obs | NAME        | KEPT | SELECTED | ST_DATE   | END_DATE  | NTIME | NOBS | CODE        |
|   1 | DSIUSNYDJCM |    0 |        0 | 04JAN1988 | 14MAR1991 |   834 |  834 | DSIUSNYDJCM |
|   2 | DSIUSNYSECM |    1 |        1 | 04JAN1988 | 14MAR1991 |   834 |  834 | DSIUSNYSECM |
|   3 | DSIUSWIL    |    0 |        0 | 04JAN1988 | 14MAR1991 |   834 |  834 | DSIUSWIL    |
|   4 | DFXWCAN     |    0 |        0 | 01JAN1988 | 14MAR1991 |   835 |  835 | DFXWCAN     |
|   5 | DFXWUK90    |    0 |        0 | 01JAN1988 | 14MAR1991 |   835 |  835 | DFXWUK90    |
|   6 | DSIUKAS     |    0 |        0 | 01JAN1988 | 14MAR1991 |   835 |  835 | DSIUKAS     |
|   7 | DSIJPND     |    0 |        0 | 01JAN1988 | 14MAR1991 |   835 |  835 | DSIJPND     |
|   8 | DCP05       |    1 |        1 | 04JAN1988 | 24FEB1989 |   300 |  300 | DCP05       |
|   9 | DCD1M       |    1 |        1 | 04JAN1988 | 08MAR1991 |   830 |  830 | DCD1M       |
|  10 | DTBD3M      |    0 |        0 | 04JAN1988 | 08MAR1991 |   830 |  830 | DTBD3M      |
Then, set the OUTSELECT= option to ON.
proc datasource filetype=citibase infile=citidemo interval=weekday
                outall=allon outby=keyon outselect=on;
   keep dsiusnysecm dc:;
run;
title1 'Summary Information on Daily Data for CITIDEMO File';
proc print data=keyon;
run;
title1 'Daily Series Available in CITIDEMO File';
proc print data=allon( keep=name kept selected st_date
                       end_date ntime nobs );
run;
Output 10.4.3: Printout of the OUTBY= and OUTALL= Data Sets with OUTSELECT=ON
Summary Information on Daily Data for CITIDEMO File
| Obs | ST_DATE   | END_DATE  | NTIME | NOBS | NSERIES | NSELECT |
|   1 | 04JAN1988 | 14MAR1991 |   834 |  834 |      10 |       3 |
Daily Series Available in CITIDEMO File
| Obs | NAME        | KEPT | SELECTED | ST_DATE   | END_DATE  | NTIME | NOBS | CODE        |
|   1 | DSIUSNYSECM |    1 |        1 | 04JAN1988 | 14MAR1991 |   834 |  834 | DSIUSNYSECM |
|   2 | DCP05       |    1 |        1 | 04JAN1988 | 24FEB1989 |   300 |  300 | DCP05       |
|   3 | DCD1M       |    1 |        1 | 04JAN1988 | 08MAR1991 |   830 |  830 | DCD1M       |
Comparison of Output 10.4.2 and Output 10.4.3 reveals the following:
- The OUTALL= data set contains ten (NSERIES) observations when OUTSELECT=OFF, and
three (NSELECT) observations when OUTSELECT=ON.
- The observations in OUTALL=ALLON are those for which SELECTED=1 in
OUTALL=ALLOFF.
- The time ranges in the OUTBY= data set are computed
over all the variables (selected or not) when OUTSELECT=OFF,
giving ST_DATE='01JAN88'd and END_DATE='14MAR91'd, and over
only the selected variables when OUTSELECT=ON, giving
ST_DATE='04JAN88'd and END_DATE='14MAR91'd.
In both cases, this corresponds to computing the
time ranges over all the series reported in the OUTALL= data set.
- The variable NTIME is the number of time periods between ST_DATE and END_DATE,
while NOBS is the number of observations the OUT= data set is to contain.
Thus, NTIME differs depending on whether the OUTSELECT=
option is set to OFF (835) or ON (834), while NOBS stays the same (834).
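The relationships listed above can be checked with a few short steps. This is a sketch under stated assumptions, not part of the original example: it assumes the ALLOFF and ALLON data sets are still in the WORK library, that INTCK with the WEEKDAY interval counts each weekday boundary crossed (so adding 1 gives the number of weekday periods from ST_DATE through END_DATE), and it uses ALLSEL, NTIME_OFF, and NTIME_ON as illustrative names only.

```sas
/* Sketch only: recompute NTIME from the OUTBY= date ranges.        */
/* Assumption: INTCK('weekday', start, end) + 1 equals the number   */
/* of weekday periods from start through end, inclusive.            */
data _null_;
   ntime_off = intck('weekday', '01JAN1988'd, '14MAR1991'd) + 1;
   ntime_on  = intck('weekday', '04JAN1988'd, '14MAR1991'd) + 1;
   put ntime_off= ntime_on=;   /* written to the SAS log */
run;

/* Subset the OUTSELECT=OFF listing to the selected series ...      */
/* ALLSEL is a hypothetical data set name used for illustration.    */
data allsel;
   set alloff;
   where selected = 1;
run;

/* ... and compare it with the OUTSELECT=ON listing. */
proc compare base=allon compare=allsel;
run;
```

If the assumption about INTCK holds, the two values written to the log should match the NTIME values reported for OUTSELECT=OFF and OUTSELECT=ON, and PROC COMPARE should report which, if any, variables or values differ between the two listings.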
The KEEP statement in the last two examples also introduces
an additional variable, KEPT, in the OUTALL= data sets of
Output 10.4.2 and Output 10.4.3. KEPT, which reports the outcome
of the KEEP statement, is added to the
OUTALL= data set only when there is a KEEP statement. Compare
Output 10.4.1, where no KEEP statement is used and KEPT does not appear.
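One way to confirm which OUTALL= data sets carry the KEPT variable is to query the SAS dictionary tables. The following is a minimal sketch, assuming the CITIALL, ALLOFF, and ALLON data sets from the earlier steps are all stored in the WORK library.

```sas
/* Sketch only: list the OUTALL= data sets that contain KEPT.       */
/* Assumption: all three data sets reside in the WORK library.      */
proc sql;
   select memname, name
      from dictionary.columns
      where libname = 'WORK'
        and memname in ('CITIALL', 'ALLOFF', 'ALLON')
        and upcase(name) = 'KEPT';
quit;
```

Only the data sets created by steps that include a KEEP statement would be expected to appear in the result.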
Adding the RANGE statement to the last example
generates the data sets in Output 10.4.4:
proc datasource filetype=citibase infile=citidemo interval=weekday
                outby=keyrange out=citiday outselect=on;
   keep dsiusnysecm dc:;
   range to '12jan88'd;
run;
title1 'Summary Information on Daily Data for CITIDEMO File';
proc print data=keyrange;
run;
title1 'Daily Data in CITIDEMO File';
proc print data=citiday;
run;
Output 10.4.4: Printout of the OUT=CITIDAY Data Set for FILETYPE=CITIBASE
Summary Information on Daily Data for CITIDEMO File
| Obs | ST_DATE   | END_DATE  | NTIME | NOBS | NINRANGE | NSERIES | NSELECT |
|   1 | 04JAN1988 | 14MAR1991 |   834 |  834 |        7 |      10 |       3 |
Daily Data in CITIDEMO File
| Obs | DATE      | DSIUSNYSECM | DCP05   | DCD1M   |
|   1 | 04JAN1988 |     142.900 | 6.81000 | 6.89000 |
|   2 | 05JAN1988 |     144.540 | 6.84000 | 6.85000 |
|   3 | 06JAN1988 |     144.820 | 6.79000 | 6.87000 |
|   4 | 07JAN1988 |     145.890 | 6.77000 | 6.88000 |
|   5 | 08JAN1988 |     137.030 | 6.73000 | 6.88000 |
|   6 | 11JAN1988 |     138.810 | 6.81000 | 6.89000 |
|   7 | 12JAN1988 |     137.740 | 6.73000 | 6.83000 |
The OUTBY= data set in this last example contains an additional
variable, NINRANGE. This variable is added because there is
a RANGE statement. Its value, 7, is the number of
observations in the OUT= data set. In this case, NOBS gives
the number of observations the OUT= data set would contain
if there were no RANGE statement.
Note that the OUT= data set does not contain data for 09JAN1988
and 10JAN1988. This is because the WEEKDAY interval skips over weekends.
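The weekday count behind NINRANGE can be reproduced with a short DATA step. This is an illustrative sketch only: it walks the calendar dates covered by the RANGE statement, from the selected series' start date of 04JAN1988 through 12JAN1988, and uses the WEEKDAY function (1 = Sunday, 7 = Saturday) to separate weekend days from weekdays. The variable names DOW and N_WEEKDAYS are hypothetical.

```sas
/* Sketch only: classify each calendar date in the selected range.  */
/* WEEKDAY() returns 1 for Sunday through 7 for Saturday.           */
data _null_;
   n_weekdays = 0;
   do date = '04JAN1988'd to '12JAN1988'd;
      dow = weekday(date);
      if dow in (1, 7) then
         put date date9. ' is a weekend day (skipped by INTERVAL=WEEKDAY)';
      else do;
         n_weekdays + 1;
         put date date9. ' is a weekday';
      end;
   end;
   put 'Weekdays in range: ' n_weekdays;
run;
```

The dates flagged as weekend days are the ones absent from the OUT= data set, and the final count corresponds to the NINRANGE value in the OUTBY= data set.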
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.