Working with Time Series Data

Cross-sectional Dimensions and BY Groups

Often, a collection of time series is related by a cross-sectional dimension. For example, the national average U.S. consumer price index data shown in the previous example can be disaggregated to show price indexes for major cities. In this case there are several related time series: CPI for New York, CPI for Chicago, CPI for Los Angeles, and so forth. When these time series are considered as one data set, the city whose price level is measured is a cross-sectional dimension of the data.

There are two basic ways to store such related time series in a SAS data set. The first way is to use a standard form time series data set with a different variable for each series.

For example, the following statements read CPI series for three major U.S. cities:

   data citycpi;
      /* the colon modifier reads DATE as the next blank-delimited value */
      input date : monyy7. cpiny cpichi cpila;
      format date monyy7.;
   datalines;
   nov1989  133.200  126.700  130.000
   dec1989  133.300  126.500  130.600
   jan1990  135.100  128.100  132.100
   feb1990  135.300  129.200  133.600
   mar1990  136.600  129.500  134.500
   apr1990  137.300  130.400  134.200
   may1990  137.200  130.400  134.600
   jun1990  137.100  131.700  135.000
   jul1990  138.400  132.000  135.600
   ;

The second way is to store the data in a time series cross-sectional form. In this form, the series for all cross sections are stored in one variable and a cross-section ID variable is used to identify observations for the different series. The observations are sorted by the cross-section ID variable and by time within each cross section.

The following statements read the CPI series for U.S. cities in time series cross-sectional form:

   data cpicity;
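      /* & lets CITY contain single blanks and ends it at two or more blanks; */
      /* : reads DATE as the next blank-delimited value                       */
      input city & $11. date : monyy7. cpi;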
      format date monyy7.;
   datalines;
   Chicago      nov1989  126.700
   Chicago      dec1989  126.500
   Chicago      jan1990  128.100
   Chicago      feb1990  129.200
   Chicago      mar1990  129.500
   Chicago      apr1990  130.400
   Chicago      may1990  130.400
   Chicago      jun1990  131.700
   Chicago      jul1990  132.000
   Los Angeles  nov1989  130.000
   Los Angeles  dec1989  130.600
   Los Angeles  jan1990  132.100
    ... etc. ...
   New York     may1990  137.200
   New York     jun1990  137.100
   New York     jul1990  138.400
   ;
   
   proc sort data=cpicity;
      by city date;
   run;
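
The cross-sectional layout can also be derived from a standard form data set rather than read from raw data. The following statements are a minimal sketch of one way to do this, assuming that the CITYCPI data set created earlier is available. The output data set name CPILONG is a hypothetical name chosen for illustration:

   data cpilong;
      set citycpi;
      length city $11;
      /* write one observation per city for each date */
      city = 'New York';    cpi = cpiny;  output;
      city = 'Chicago';     cpi = cpichi; output;
      city = 'Los Angeles'; cpi = cpila;  output;
      keep city date cpi;
   run;

   /* sort by cross-section ID, then by date within each cross section */
   proc sort data=cpilong;
      by city date;
   run;

Each observation of CITYCPI yields three observations of CPILONG, one for each city, so the result has the same variables and sort order as CPICITY.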

When you process a data set in time series cross-sectional form with most SAS/ETS procedures, use the cross-section ID variable in a BY statement to process each time series separately. The data set must be sorted by the cross-section ID variable and by date within each cross section. The PROC SORT step that follows the CPICITY DATA step ensures that the data set is correctly sorted.

When the cross-section ID variable is used in a BY statement, each BY group in the data set is like a standard form time series data set. Thus, SAS/ETS procedures that expect a standard form time series data set can process time series cross-sectional data sets when a BY statement is used, producing an independent analysis for each cross section.
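
As an illustration of BY-group processing, the following statements sketch how the EXPAND procedure might aggregate each city's monthly CPI series to quarterly averages. The choice of PROC EXPAND and the output data set name CPIQTR are assumptions made for this example; other SAS/ETS procedures that accept a BY statement follow the same pattern.

   proc expand data=cpicity out=cpiqtr from=month to=qtr;
      by city;                          /* one independent conversion per city */
      id date;                          /* time ID variable within each BY group */
      convert cpi / observed=average;   /* monthly values are period averages */
   run;

Because CITY is the BY variable, the procedure treats the observations for each city as a separate standard form time series, and CPIQTR contains one quarterly series for each city.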

It is also possible to analyze time series cross-sectional data jointly. The TSCSREG procedure expects the input data to be in the time series cross-sectional form described here. See Chapter 20 for more information.


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.