Working with Time Series Data
Often, a collection of time series is related by a cross-sectional dimension. For example, the national average U.S. consumer price index data shown in the previous example can be disaggregated to show price indexes for major cities. In this case there are several related time series: CPI for New York, CPI for Chicago, CPI for Los Angeles, and so forth. When these time series are treated as one data set, the city whose price level is measured is a cross-sectional dimension of the data.
There are two basic ways to store such related time series in a SAS data set. The first way is to use a standard form time series data set with a different variable for each series.
For example, the following statements read CPI series for three major U.S. cities:
data citycpi;
   /* One variable per series: New York, Chicago, Los Angeles */
   input date monyy7. cpiny cpichi cpila;
   format date monyy7.;
datalines;
nov1989 133.200 126.700 130.000
dec1989 133.300 126.500 130.600
jan1990 135.100 128.100 132.100
feb1990 135.300 129.200 133.600
mar1990 136.600 129.500 134.500
apr1990 137.300 130.400 134.200
may1990 137.200 130.400 134.600
jun1990 137.100 131.700 135.000
jul1990 138.400 132.000 135.600
;
The second way is to store the data in a time series cross-sectional form. In this form, the series for all cross sections are stored in one variable and a cross-section ID variable is used to identify observations for the different series. The observations are sorted by the cross-section ID variable and by time within each cross section.
The following statements show how to read the CPI series for U.S. cities in time series cross-sectional form:
data cpicity;
   /* CITY occupies columns 1-11; +1 skips the blank before the date */
   input city $11. +1 date monyy7. cpi;
   format date monyy7.;
datalines;
Chicago     nov1989 126.700
Chicago     dec1989 126.500
Chicago     jan1990 128.100
Chicago     feb1990 129.200
Chicago     mar1990 129.500
Chicago     apr1990 130.400
Chicago     may1990 130.400
Chicago     jun1990 131.700
Chicago     jul1990 132.000
Los Angeles nov1989 130.000
Los Angeles dec1989 130.600
Los Angeles jan1990 132.100
... etc. ...
New York    may1990 137.200
New York    jun1990 137.100
New York    jul1990 138.400
;
proc sort data=cpicity;
   by city date;
run;
When processing a time series cross-section-form data set with most SAS/ETS procedures, use the cross-section ID variable in a BY statement to process the time series separately. The data set must be sorted by the cross-section ID variable and sorted by date within each cross section. The PROC SORT step in the preceding example ensures that the CPICITY data set is correctly sorted.
When the cross-section ID variable is used in a BY statement, each BY group in the data set is like a standard form time series data set. Thus, SAS/ETS procedures that expect a standard form time series data set can process time series cross-sectional data sets when a BY statement is used, producing an independent analysis for each cross section.
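As a minimal sketch of this BY-group processing, the following statements forecast the CPI series for each city separately with PROC FORECAST. The choice of procedure, the METHOD=EXPO and LEAD=3 settings, and the output data set name CPIFORE are illustrative assumptions and are not part of the original example.

```sas
/* Sketch: one independent forecast per city.                  */
/* CPIFORE, METHOD=EXPO, and LEAD=3 are illustrative choices.  */
proc forecast data=cpicity interval=month method=expo lead=3 out=cpifore;
   by city;     /* cross-section ID variable                */
   id date;     /* time ID within each cross section        */
   var cpi;     /* the single variable holding all series   */
run;
```

Because CPICITY is sorted by CITY and by DATE within CITY, each BY group reaches the procedure as a complete monthly series.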
It is also possible to analyze time series cross-sectional data jointly. The TSCSREG procedure expects the input data to be in the time series cross-sectional form described here. See Chapter 20 for more information.
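Both storage forms hold the same information, so data kept in standard form can be rearranged when a procedure needs the cross-sectional form. The following DATA step is one possible sketch of that conversion for the CITYCPI data set read earlier. The output data set name CPICITY2 is hypothetical, chosen so that the CPICITY data set is not overwritten.

```sas
/* Sketch: reshape standard form CITYCPI into cross-sectional form. */
/* CPICITY2 is a hypothetical data set name.                        */
data cpicity2;
   set citycpi;
   length city $11;
   /* Write one observation per city for each date */
   city = 'Chicago';     cpi = cpichi; output;
   city = 'Los Angeles'; cpi = cpila;  output;
   city = 'New York';    cpi = cpiny;  output;
   keep city date cpi;
run;

/* Sort by cross-section ID, then by date within each cross section */
proc sort data=cpicity2;
   by city date;
run;
```

The explicit OUTPUT statements write three observations for every input observation, one for each city variable.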
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.