Example 10.2: BLS Consumer Price Index Surveys
This example compares changes in the price of medical care services
across geographic regions for all urban consumers (SURVEY='CU')
since May 1980. The source of data is the Consumer Price Index
Surveys distributed by the U.S. Department of Labor, Bureau of
Labor Statistics.
An initial run of PROC DATASOURCE provides descriptive information
on the available regions (the OUTBY= data set), as well as the
series variable name corresponding to medical care services (the
OUTCONT= data set):
   filename datafile 'host-specific-file-name' <host-options>;

   /* Initial run: describe cross sections (OUTBY=) and series (OUTCONT=) */
   proc datasource filetype=blscpi interval=month
                   outby=cpikey outcont=cpicont;
      where survey='CU';
   run;

   title1 'Partial Listing of the OUTBY= Data Set';
   proc print data=cpikey noobs;
      /* Keep only the four regional cross sections */
      where upcase(areaname) in
            ('NORTHEAST','NORTH CENTRAL','SOUTH','WEST');
   run;

   title1 'Partial Listing of the OUTCONT= Data Set';
   proc print data=cpicont noobs;
      /* Keep only series whose label mentions medical care */
      where index( upcase(label), 'MEDICAL CARE' );
   run;
The OUTBY= data set in Output 10.2.1 lists all cross sections
available for the four geographical regions: Northeast (AREA='0100'),
North Central (AREA='0200'), Southern (AREA='0300'), and
Western (AREA='0400'). The OUTCONT= data set gives the variable names
for medical care related series.
Output 10.2.1: Partial Listings of the OUTBY= and OUTCONT= Data Sets
Partial Listing of the OUTBY= Data Set

survey  season  area  basptype  baseper            st_date  end_date  ntime  nobs  nseries  nselect  surtitle          areaname
CU      U       0100  A         DECEMBER 1977=100  DEC1966  JUL1990     284   284        1        1  ALL URBAN CONSUM  NORTHEAST
CU      U       0100  S         1982-84=100        DEC1966  JUL1990     284   284       90       90  ALL URBAN CONSUM  NORTHEAST
CU      U       0100  S         DECEMBER 1982=100  DEC1982  JUL1990      92    92        7        7  ALL URBAN CONSUM  NORTHEAST
CU      U       0100  S         DECEMBER 1986=100  DEC1986  JUL1990      44    44        1        1  ALL URBAN CONSUM  NORTHEAST
CU      U       0200  A         DECEMBER 1977=100  DEC1966  JUL1990     284   284        1        1  ALL URBAN CONSUM  NORTH CENTRAL
CU      U       0200  S         1982-84=100        DEC1966  JUL1990     284   284       90       90  ALL URBAN CONSUM  NORTH CENTRAL
CU      U       0200  S         DECEMBER 1982=100  DEC1982  JUL1990      92    92        7        7  ALL URBAN CONSUM  NORTH CENTRAL
CU      U       0200  S         DECEMBER 1986=100  DEC1986  JUL1990      44    44        1        1  ALL URBAN CONSUM  NORTH CENTRAL
CU      U       0300  A         DECEMBER 1977=100  DEC1966  JUL1990     284   284        1        1  ALL URBAN CONSUM  SOUTH
CU      U       0300  S         1982-84=100        DEC1966  JUL1990     284   284       90       90  ALL URBAN CONSUM  SOUTH
CU      U       0300  S         DECEMBER 1982=100  DEC1982  JUL1990      92    92        7        7  ALL URBAN CONSUM  SOUTH
CU      U       0300  S         DECEMBER 1986=100  DEC1986  JUL1990      44    44        1        1  ALL URBAN CONSUM  SOUTH
CU      U       0400  A         DECEMBER 1977=100  DEC1966  JUL1990     284   284        1        1  ALL URBAN CONSUM  WEST
CU      U       0400  S         1982-84=100        DEC1966  JUL1990     284   284       90       90  ALL URBAN CONSUM  WEST
CU      U       0400  S         DECEMBER 1982=100  DEC1982  JUL1990      92    92        7        7  ALL URBAN CONSUM  WEST
CU      U       0400  S         DECEMBER 1986=100  DEC1986  JUL1990      44    44        1        1  ALL URBAN CONSUM  WEST

Partial Listing of the OUTCONT= Data Set

name  selected  type  length  varnum  label                        format  formatl  formatd
ASL5         1     1       5       .  SERVICES LESS MEDICAL CARE                 0        0
A0L5         1     1       5       .  ALL ITEMS LESS MEDICAL CARE                0        0
A5           1     1       5       .  MEDICAL CARE                               0        0
A51          1     1       5       .  MEDICAL CARE COMMODITIES                   0        0
A512         1     1       5       .  MEDICAL CARE SERVICES                      0        0

The following statements make use
of this information to extract the data for A512 and descriptive information
on cross sections containing A512:
   /* Descriptive labels for the four regional AREA codes */
   proc format;
      value $areafmt '0100' = 'Northeast Region'
                     '0200' = 'North Central Region'
                     '0300' = 'Southern Region'
                     '0400' = 'Western Region';
   run;

   filename datafile 'host-specific-file-name' <host-options>;

   /* Extract A512 for the four regions, starting in May 1980 */
   proc datasource filetype=blscpi interval=month
                   out=medical outall=medinfo;
      where survey='CU' and area in ( '0100','0200','0300','0400' );
      keep a512;
      range from 1980:5;
      format area $areafmt.;
      rename a512=medcare;
   run;

   title1 'Information on Medical Care Service';
   proc print data=medinfo;
   run;
Output 10.2.2: Printout of the OUTALL= Data Set
Information on Medical Care Service
(variables listed down the page, one column per observation)

Variable  Obs 1                  Obs 2                  Obs 3                  Obs 4
survey    CU                     CU                     CU                     CU
season    U                      U                      U                      U
area      Northeast Region       North Central Region   Southern Region        Western Region
basptype  S                      S                      S                      S
baseper   1982-84=100            1982-84=100            1982-84=100            1982-84=100
length    5                      5                      5                      5
byselect  1                      1                      1                      1
name      MEDCAR                 MEDCAR                 MEDCAR                 MEDCAR
kept      1                      1                      1                      1
selected  1                      1                      1                      1
type      1                      1                      1                      1
varnum    7                      7                      7                      7
blknum    3479                   3578                   3677                   3776
label     MEDICAL CARE SERVICES  MEDICAL CARE SERVICES  MEDICAL CARE SERVICES  MEDICAL CARE SERVICES
format
formatl   0                      0                      0                      0
formatd   0                      0                      0                      0
st_date   DEC1977                DEC1977                DEC1977                DEC1977
end_date  JUL1990                JUL1990                JUL1990                JUL1990
ntime     152                    152                    152                    152
nobs      152                    152                    152                    152
ninrange  123                    123                    123                    123
surtitle  ALL URBAN CONSUM       ALL URBAN CONSUM       ALL URBAN CONSUM       ALL URBAN CONSUM
areaname  NORTHEAST              NORTH CENTRAL          SOUTH                  WEST
s_code    CUUR0100SA512          CUUR0200SA512          CUUR0300SA512          CUUR0400SA512
units
ndec      1                      1                      1                      1

Note that only the cross sections with BASEPER='1982-84=100' are
listed in the OUTALL= data set (see Output 10.2.2). This is
because only those cross sections contain data for MEDCARE.
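The same point can be checked from the data rather than by reading the listing. The following is a minimal sketch, assuming the MEDINFO data set created above is still available in the WORK library; it tabulates the cross sections by region and base period, and should list one row per region, each with BASEPER='1982-84=100':

```sas
/* Sketch: list the AREA/BASEPER combinations present in OUTALL=MEDINFO. */
/* AREA is displayed through the $AREAFMT. format assigned earlier.      */
proc freq data=medinfo;
   tables area*baseper / list;
run;
```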
The OUTALL= data set indicates that data values are
stored with one decimal place (see the NDEC variable). Therefore,
they need to be rescaled, as follows:
   data medical;
      set medical;
      medcare = medcare * 0.1;   /* NDEC=1: one implied decimal place */
   run;
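The factor 0.1 is hard-coded from the NDEC value read off the listing. A possible alternative, shown here only as a sketch, derives the factor from NDEC in MEDINFO. It assumes that NDEC is the same for every selected cross section (it is 1 for all four in Output 10.2.2) and that it is run instead of the DATA step above, not after it, because running both would scale the values twice. MEDSCALE is a hypothetical output data set name.

```sas
/* Sketch: rescale MEDCARE using NDEC from MEDINFO instead of a literal 0.1. */
/* MEDSCALE is a hypothetical name; run this in place of the step above.     */
data medscale;
   /* Read NDEC once; variables read by SET are retained across iterations. */
   if _n_ = 1 then set medinfo(keep=ndec obs=1);
   set medical;
   medcare = medcare * 10**(-ndec);   /* NDEC=1 gives a factor of 0.1 */
   drop ndec;
run;
```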
The variation of MEDCARE against DATE across the geographic regions
can be shown graphically with PROC GPLOT, plotting one line per region
and drawing a horizontal reference line at 100 (see Output 10.2.3).
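A minimal sketch of statements that would produce a plot of this kind is given below. It assumes that SAS/GRAPH is licensed and that MEDICAL has already been rescaled. The titles, the axis label, and the SYMBOL settings are illustrative assumptions, not taken from the original example; only the plot request (MEDCARE against DATE by AREA, with a reference line at 100) follows from the description of the plot in this example.

```sas
/* Sketch only: titles, axis label, colors, and line types are assumptions. */
title1 'Consumer Price Index for Medical Care Services';
title2 'by Geographic Region';
axis1 label=(angle=90 'Consumer Price Index');

/* One joined line per region; explicit colors keep each SYMBOL definition */
/* tied to a single region instead of being cycled through the color list. */
symbol1 interpol=join line=1 color=black;
symbol2 interpol=join line=2 color=blue;
symbol3 interpol=join line=3 color=red;
symbol4 interpol=join line=4 color=green;

proc gplot data=medical;
   /* AREA appears in the legend through its permanent $AREAFMT. format. */
   plot medcare*date=area / vref=100 vaxis=axis1;
   format date monyy7.;
run;
quit;
```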
Output 10.2.3: Plot of Time Series in the OUT= Data Set for FILETYPE=BLSCPI
This example illustrates the following features:
- Descriptive information needed to write KEEP and WHERE statements
can be obtained with an initial run of the DATASOURCE procedure.
- The OUTCONT= and OUTALL= data sets may contain information on how
data values are stored, such as the precision, the units, and so on.
- The OUTCONT= and OUTALL= data sets report the new series names assigned
by the RENAME statement, not the old names (see the NAME variable
in Output 10.2.2).
- You can use PROC FORMAT to define formats for series or BY
variables to enhance your output. Note that PROC DATASOURCE
associated a permanent format, $AREAFMT., with the BY variable
AREA. As a result, the formatted values are displayed in
the printout of the OUTALL=MEDINFO data set (see Output 10.2.2)
and in the legend created by PROC GPLOT.
- The base period for all the geographical areas is the same
(BASEPER='1982-84=100') as indicated by the intersections of plots
with the horizontal reference line drawn at 100.
This makes comparisons meaningful.
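The common base period can also be checked numerically. The sketch below assumes that MEDICAL has already been rescaled and that DATE is a SAS date variable in the OUT= data set. With BASEPER='1982-84=100', the mean of the index over 1982 through 1984 is expected to be close to 100 in each region.

```sas
/* Sketch: average the rescaled index over the 1982-84 base period by region. */
proc means data=medical mean maxdec=1;
   where '01jan1982'd <= date <= '31dec1984'd;
   class area;
   var medcare;
run;
```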
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.