Working with Time Series Data
Often, a collection of time series is related by a cross-sectional dimension. For example, the national average U.S. consumer price index data shown in the previous example can be disaggregated to show price indexes for major cities. In this case there are several related time series: CPI for New York, CPI for Chicago, CPI for Los Angeles, and so forth. When these time series are considered as one data set, the city whose price level is measured is a cross-sectional dimension of the data.
There are two basic ways to store such related time series in a SAS data set. The first way is to use a standard form time series data set with a different variable for each series.
For example, the following statements read CPI series for three major U.S. cities:
data citycpi;
   input date monyy7. cpiny cpichi cpila;
   format date monyy7.;
datalines;
nov1989  133.200  126.700  130.000
dec1989  133.300  126.500  130.600
jan1990  135.100  128.100  132.100
feb1990  135.300  129.200  133.600
mar1990  136.600  129.500  134.500
apr1990  137.300  130.400  134.200
may1990  137.200  130.400  134.600
jun1990  137.100  131.700  135.000
jul1990  138.400  132.000  135.600
;
The second way is to store the data in a time series cross-sectional form. In this form, the series for all cross sections are stored in one variable and a cross-section ID variable is used to identify observations for the different series. The observations are sorted by the cross-section ID variable and by time within each cross section.
The following statements indicate how to read the CPI series for U.S. cities in time series cross-sectional form:
data cpicity;
   /* CITY is read from columns 1-11; +1 skips the blank before the date */
   input city $11. +1 date monyy7. cpi;
   format date monyy7.;
datalines;
Chicago     nov1989  126.700
Chicago     dec1989  126.500
Chicago     jan1990  128.100
Chicago     feb1990  129.200
Chicago     mar1990  129.500
Chicago     apr1990  130.400
Chicago     may1990  130.400
Chicago     jun1990  131.700
Chicago     jul1990  132.000
Los Angeles nov1989  130.000
Los Angeles dec1989  130.600
Los Angeles jan1990  132.100
... etc. ...
New York    may1990  137.200
New York    jun1990  137.100
New York    jul1990  138.400
;

proc sort data=cpicity;
   by city date;
run;
When processing a data set in time series cross-sectional form with most SAS/ETS procedures, use the cross-section ID variable in a BY statement to process each time series separately. The data set must be sorted by the cross-section ID variable and by date within each cross section. The PROC SORT step in the preceding example ensures that the CPICITY data set is correctly sorted.
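If the series are already stored in standard form, the time series cross-sectional form can also be derived from them rather than read separately. The following statements are a minimal sketch of one way to do this, assuming the CITYCPI data set from the first example; the output data set name CPILONG is hypothetical and chosen only for illustration.

```sas
/* Reshape CITYCPI (one variable per city) into time series        */
/* cross-sectional form. CPILONG is a hypothetical data set name.  */
data cpilong;
   set citycpi;
   length city $11;
   /* Write one observation per city for each input date */
   city = 'Chicago';     cpi = cpichi; output;
   city = 'Los Angeles'; cpi = cpila;  output;
   city = 'New York';    cpi = cpiny;  output;
   keep city date cpi;
run;

/* Sort by cross section, then by date within each cross section */
proc sort data=cpilong;
   by city date;
run;
```

Each observation in CITYCPI produces three observations in CPILONG, one for each city, and the PROC SORT step puts them in the order that BY-group processing requires.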
When the cross-section ID variable is used in a BY statement, each BY group in the data set is like a standard form time series data set. Thus, SAS/ETS procedures that expect a standard form time series data set can process time series cross-sectional data sets when a BY statement is used, producing an independent analysis for each cross section.
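As an illustration of BY-group processing, the following statements sketch how a forecasting procedure might be run separately for each city. This assumes that the CPICITY data set is fully populated and sorted as described above; the choice of PROC FORECAST, the METHOD=EXPO and LEAD=3 settings, and the output data set name CPIFCST are assumptions made for this example only.

```sas
/* Forecast each city's CPI series independently.                  */
/* CPIFCST is a hypothetical output data set name; the method and  */
/* forecast horizon are arbitrary choices for illustration.        */
proc forecast data=cpicity interval=month
              method=expo lead=3 out=cpifcst;
   by city;      /* one independent analysis per cross section */
   id date;      /* time ID variable within each BY group      */
   var cpi;      /* series to forecast                         */
run;

proc print data=cpifcst;
   by city;
run;
```

Because CITY appears in the BY statement, the procedure treats the observations for each city as a separate standard form time series, and the output data set contains a separate set of forecasts for each BY group.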
It is also possible to analyze time series cross-sectional data jointly. The TSCSREG procedure expects the input data to be in the time series cross-sectional form described here. See Chapter 20 for more information.
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.