Chapter Contents

Previous

Next

How the DATA Step Identifies BY Groups

In the DATA step, SAS identifies the beginning and end of each BY group by creating two temporary variables for each BY variable: FIRST.variable and LAST.variable. These temporary variables are available for DATA step programming but are not added to the output data set. Their values indicate whether an observation is

You can take actions conditionally, based on whether you are processing the first or the last observation of a BY group.

When an observation is the first in a BY group, SAS sets the value of the FIRST.variable to 1. For all other observations in the BY group, the value of the FIRST.variable is 0. Likewise, if an observation is the last in a BY group, SAS sets the value of LAST.variable to 1. For all other observations in the BY group, the value of LAST.variable is 0. If the observations are sorted by more than one BY variable, the FIRST.variable for each variable in the BY statement is set to 1 at the first occurrence of a new value for the variable.

This example shows how SAS uses the FIRST.variable and LAST.variable to flag the beginning and end of four BY groups. Six temporary variables are created within the program data vector. These variables can be used during the DATA step, but they do not become variables in the new data set.

In the figure that follows, observations in the SAS data set are arranged in an order that can be used with this BY statement:

by State City ZipCode;

SAS creates the following temporary variables: FIRST.State, LAST.State, FIRST.City, LAST.City, FIRST.ZipCode, and LAST.ZipCode.

FIRST. and LAST. Values for Four BY Groups

[IMAGE]


Chapter Contents

Previous

Next

Top of Page

Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.