The DATASOURCE Procedure

OUT= Data Set

The OUT= data set can contain the following variables:

The values of BY variables remain constant in each cross section. Observations within each BY group correspond to the sampling of the series variables at the time periods indicated by the DATE variable.

If the OUT= data set has BY variables, you can create a set of single indexes for it by specifying the INDEX option. Under some circumstances, indexes can make subsequent PROC and DATA steps that use BY and WHERE statements more efficient. However, creating and maintaining indexes has a cost. The SAS Language: Reference, Version 7, First Edition lists the conditions under which the benefits of indexes outweigh that cost.

With data files containing cross sections, there can be various degrees of overlap among the series variables. At one extreme, all the series variables contain data for all the cross sections, and the output data set is very compact. At the other extreme, the set of time series variables is unique for each cross section, which makes the output data set very sparse, as depicted in Figure 10.8.

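BY variables   Series in          Series in          ...  Series in
               first BY group     second BY group         last BY group
BY1 ... BYP    F1 F2 F3 ... FN    S1 S2 S3 ... SM    ...  T1 T2 T3 ... TK
-----------    ---------------    ---------------         ---------------
BY group 1     data               missing            ...  missing
BY group 2     missing            data               ...  missing
...            ...                ...                ...  ...
BY group N     missing            missing            ...  data

Data is missing everywhere except in the blocks marked "data", where each BY group meets its own series.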

Figure 10.8: The OUT= Data Set containing unique Series for each BY Group
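
The sparseness shown in Figure 10.8 can be made concrete with a small sketch. The code below is written in Python purely to illustrate the layout; it is not SAS syntax and does not reproduce what PROC DATASOURCE does internally. The group names, series values, dates, and the helper stack_by_group are all hypothetical, and None stands in for a SAS missing value.

```python
# Hypothetical toy data: each BY group has its own, non-overlapping series.
groups = {
    "group1": {"F1": [1.0, 1.1], "F2": [2.0, 2.1]},
    "group2": {"S1": [3.0, 3.1], "S2": [4.0, 4.1]},
    "group3": {"T1": [5.0, 5.1], "T2": [6.0, 6.1]},
}
dates = ["1999-01", "1999-02"]


def stack_by_group(groups, dates):
    """Build one row per (BY group, date) over the union of all series names."""
    # dict.fromkeys keeps first-seen order and drops names shared by groups.
    all_series = list(
        dict.fromkeys(name for series in groups.values() for name in series)
    )
    rows = []
    for by_value, series in groups.items():
        for i, date in enumerate(dates):
            row = {"BY": by_value, "DATE": date}
            for name in all_series:
                # None marks a series that does not exist in this BY group.
                row[name] = series[name][i] if name in series else None
            rows.append(row)
    return all_series, rows


all_series, rows = stack_by_group(groups, dates)
cells = len(rows) * len(all_series)
missing = sum(row[name] is None for row in rows for name in all_series)
print(f"{missing} of {cells} series cells are missing")
# prints "24 of 36 series cells are missing"
```

With three groups that share no series names, two thirds of the series cells are empty. If every group carried the same series names instead, the union of columns would be no larger than one group's set and no cell would be missing, which is the compact extreme described above.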

The data in Figure 10.8 can be represented more compactly if cross-sectional information is incorporated into series variable names.
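
One way to read this is that once a series name carries the cross-section identifier, the BY variables are no longer needed and the rows can be aligned on the date alone. The following Python sketch illustrates that reading under stated assumptions: the sample rows, the name_BYvalue naming scheme, and the helper compact are hypothetical and are not part of PROC DATASOURCE.

```python
# Hypothetical sparse rows in the layout of Figure 10.8; None marks missing.
sparse_rows = [
    {"BY": "A", "DATE": "1999-01", "F1": 1.0, "S1": None},
    {"BY": "A", "DATE": "1999-02", "F1": 1.1, "S1": None},
    {"BY": "B", "DATE": "1999-01", "F1": None, "S1": 3.0},
    {"BY": "B", "DATE": "1999-02", "F1": None, "S1": 3.1},
]


def compact(rows, by_key="BY", date_key="DATE"):
    """Merge rows by date, moving the BY value into each series name."""
    merged = {}
    for row in rows:
        out = merged.setdefault(row[date_key], {date_key: row[date_key]})
        for name, value in row.items():
            # Skip the key columns and any cell that holds no data.
            if name in (by_key, date_key) or value is None:
                continue
            # Assumed naming scheme: <series name>_<BY value>.
            out[f"{name}_{row[by_key]}"] = value
    return [merged[date] for date in sorted(merged)]


compact_rows = compact(sparse_rows)
for row in compact_rows:
    print(row)
```

The four sparse rows collapse to two rows, one per date, with columns DATE, F1_A, and S1_B and no missing cells. A value that is missing inside its own BY group would simply be absent from that date's row in this sketch.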


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.