Chapter Contents

Previous

Next

Examples: Fixing Problems


Example 2: Reading Two-Digit Year Values in Raw Data

This example demonstrates how to use the YEARCUTOFF= system option to read a raw data file that contains two-digit year values. Here are the input data records:

   ADAMS,GERALD,M,819-68-375,01MAY99
   AVERY,JERRY,M,944-69-923,01MAY02
   BLALOCK,RALPH,M,189-62-924,01MAY02
   BRADLEY,JEREMY,M,142-96-693,01MAY00
   BURNETTE,THOMAS,M,922-64-294,01MAY99
   CARTER,DOROTHY,F,441-87-333,01MAY00
   CHIN,JACK,M,438-82-579,01MAY99
   DAVIDSON,JASON,M,617-00-610,01MAY99
   DENNIS,ROGER,M,342-14-341,01MAY99
   EDGERTON,JOSHUA,M,778-62-406,01MAY00
In the DATA step, the YEARCUTOFF= system option is set to 1900 and produces unintended graduation dates.
options YEARCUTOFF=1900 nodate pageno=1 linesize=80 pagesize=60;  
data students;
    infile 'students-data-file' dlm=',';
    input lname: $15. fname: $15. gender: $1. idnum: $11.
          graddate: date7.;
    format graddate date9.;
run;

proc print data=students noobs label;
    title1 'Projected Graduation Dates';
    title2 'Report Date: Sep 30, 1999';
    label lname='Last Name'
          fname='First Name'
          gender='Gender'
          idnum='Student Id'
          graddate='Projected Graduation';
run;
The following output shows the resulting dates in the last column of the report.

Output Data Set Showing Errors Due to Incorrect YEARCUTOFF= Value of 1900
                 Projected Graduation Dates                1
                     Report Date: Sept 30, 1999

 Last        First                              Projected
 Name        Name       Gender    Student Id    Graduation

 ADAMS       GERALD       M       819-68-375    01MAY1999 
 AVERY       JERRY        M       944-69-923    01MAY1999 
 BLALOCK     RALPH        M       189-62-924    01MAY1902 
 BRADLEY     JEREMY       M       142-96-693    01MAY1900    
 BURNETTE    THOMAS       M       922-64-294    01MAY1999 
 CARTER      DOROTHY      F       441-87-333    01MAY1900 
 CHIN        JACK         M       438-82-579    01MAY1999 
 DAVIDSON    JASON        M       617-00-610    01MAY1999 
 DENNIS      ROGER        M       342-14-341    01MAY1999 
 EDGERTON    JOSHUA       M       778-62-406    01MAY1900 

To correct the problem in the resulting dates, specify the appropriate YEARCUTOFF= value and rerun the DATA step.

options YEARCUTOFF=1950; 
This next output results from the same DATA step program with the OPTIONS statement containing a YEARCUTOFF= value set to 1950. Note that for the raw data shown in this example, the YEARCUTOFF= system option will yield correct results when set within a range of 1903 to 1998.

Output Data Set Showing Corrected Date Values Due to Setting the YEARCUTOFF= Value to 1950
                 Projected Graduation Dates                  1
                     Report Date: Sept 30, 1999

 Last        First                              Projected
 Name        Name       Gender    Student Id    Graduation

 ADAMS       GERALD       M       819-68-375    01MAY1999 
 AVERY       JERRY        M       944-69-923    01MAY1999 
 BLALOCK     RALPH        M       189-62-924    01MAY2002 
 BRADLEY     JEREMY       M       142-96-693    01MAY2000 
 BURNETTE    THOMAS       M       922-64-294    01MAY1999 
 CARTER      DOROTHY      F       441-87-333    01MAY2000 
 CHIN        JACK         M       438-82-579    01MAY1999 
 DAVIDSON    JASON        M       617-00-610    01MAY1999 
 DENNIS      ROGER        M       342-14-341    01MAY1999 
 EDGERTON    JOSHUA       M       778-62-406    01MAY2000 

Note:   The new value for YEARCUTOFF remains in effect for the remainder of the SAS job or SAS session, or until it is changed again.  [cautionend]


Example 3: Expanding Dates in External Files

When you make structural changes to stored data, you must modify all applications that access the changed data in order to accommodate the new data format or data definition. It may be less disruptive to add expanded fields to the end of the record and leave the rest unchanged.

If you decide you must alter the contents of an external file, you can use a DATA step to modify or correct the stored data. Changing two-digit years to four-digit years in raw data files by adding the century is called field expansion. Additional disk space is required to store the expanded data.

In the following example, flight dates are stored in a mmddyy format (030195) in a raw data file in columns 4-9. We'll expand the date field from 6 to 8 columns by inserting the century values (03011995).

Here's a partial listing of the original raw data file:

----|----10---|----20---|----30---|----40---|----50
182030195 8:21LGAYYZ  366 458 390104 16  3123178  
114030195 7:10LGALAX2,475 357 390172 18  6196210  
20203019510:43LGAORD  740 369 244151 11  5157210  
219030195 9:31LGAFRA3,442 412 334198 17  7222250  
43903019512:16LGALAX2,475 422 267167 13  5185210  
38703019511:40LGACPH3,856 423 398152  8  3163250  
290030195 6:56LGAWAS  229 327 253 96 16  7117180  
52303019515:19LGAORD  740 476 456177 20  3185210  
98203019510:28LGADFW1,383 383 355 49 19  2 56180  
62203019512:19LGALON3,857 255 243207 15  5227250  
The following program demonstrates field-expansion:
options yearcutoff=1950 nodate pageno=1 linesize=80 pagesize=60;
   data _null_;
     infile 'march.dat' truncover;
     input @1 string1 $char3. 
           @4 date mmddyy6.
           @10 string2 $char40.;
     file 'march.dat';
     put   @1 string 1 $char3.
           @4 date mmddyyn8.
           @12 string2 $char40.;
run; 

Here is a partial listing of MARCH2.DAT after expanding the two-digit year values to four digits:

----|----10---|----20---|----30---|----40---|----50
18203011995 8:21LGAYYZ  366 458 390104 16  3123178
11403011995 7:10LGALAX2,475 357 390172 18  6196210
2020301199510:43LGAORD  740 369 244151 11  5157210
21903011995 9:31LGAFRA3,442 412 334198 17  7222250
4390301199512:16LGALAX2,475 422 267167 13  5185210
3870301199511:40LGACPH3,856 423 398152  8  3163250
29003011995 6:56LGAWAS  229 327 253 96 16  7117180
5230301199515:19LGAORD  740 476 456177 20  3185210
9820301199510:28LGADFW1,383 383 355 49 19  2 56180
6220301199512:19LGALON3,857 255 243207 15  5227250


Example 4: Converting Dates Stored As Character Values to SAS Date Values

This example converts dates in character variables in a SAS data set, to SAS date values in a SAS data set. The Birth dates in the SAS data set YR20.SENIORS are stored in the character variable BIRTHDATE using a DDMMMYYYY format (18SEP1898). Some records contain two-digit years, others use four-digit years:

Listing of Seniors Birth Dates
     LNAME         FNAME      GENDER    BIRTHDATE        ID
    ADAMS         GERALD       M       04DEC96      259398646
    AVERY         JERRY        M       27JAN25      531691723
    BLALOCK       RALPH        M       06APR1888    523870522
    BRADLEY       JEREMY       M       26SEP1922    297193965
    BURNETTE      THOMAS       M       16FEB1921    226507519
    CARTER        DOROTHY      F       05APR18      287225611
    CHIN          JACK         M       24APR19      590364670
    DAVIDSON      JASON        M       16FEB1918    506603530
    DENNIS        ROGER        M       11MAR26      297222847
    EDGERTON      JOSHUA       M       10JAN06      168088351
The following SAS program uses the INPUT function to convert the character dates to SAS date values to enable chronological sequencing in a report. Because some birth dates are before 1900, note that the YEARCUTOFF= value must be adjusted.
 options yearcutoff=1880;
 options nodate pageno=1 linesize=80 pagesize=60; 
 data seniors(drop=tempvar);
    set yr20_seniors(rename=(Birthdate=tempvar));
    Birthdate=input(tempvar,date9.);
 run;

Note:   The DATE9. informat correctly reads two-digit as well as four-digit years.  [cautionend]
The following program illustrates how to print the WORK.SENIORS data set. The birth dates, in the BIRTHDATE variable, are written with a DATE9. format.

proc print data=seniors noobs;
   format birthdate date9. ID ssn.;
   title 'Listing of Seniors Birth Dates';
run;
The PROC PRINT step produces the listing of the reformatted WORK.SENIORS data set:

Reformatted Data Set WORK.SENIORS
                         Listing of Seniors Birth Dates                        1

          Lastname    Firstname    Gender        SSN        Birthdate

          ADAMS        GERALD        M       259-39-8646    04DEC1896
          AVERY        JERRY         M       531-69-1723    27JAN1925
          BLALOCK      RALPH         M       523-87-0522    06APR1888
          BRADLEY      JEREMY        M       297-19-3965    26SEP1922
          BURNETTE     THOMAS        M       226-50-7519    16FEB1921
          CARTER       DOROTHY       F       287-22-5611    05APR1918
          CHIN         JACK          M       590-36-4670    24APR1919
          DAVIDSON     JASON         M       506-60-3530    16FEB1918
          DENNIS       ROGER         M       297-22-2847    11MAR1926
          EDGERTON     JOSHUA        M       168-08-8351    10JAN1906


Example 5: Converting Data Stored As Simple Numeric Values To SAS Date Values

This example shows how the SAS System recognizes date values, which are stored as simple numeric variables, and displays them as more familiar date values in SAS data sets. Even though SAS is able to present the date values in many formats, the SAS system stores each date as a SAS date, which is a value equal to the number of days between January 1, 1960 and the calendar date. In the following SAS data set YR20.JUNIORS birth dates are stored as numeric variables in Julian (BDJULIAN) and numeric YYYYMMDD format (BDCOBOL). Note that in the first observation, the number 2003116 is a Julian date that represents the 116th day of year 2003. This date will be recognized by the SAS System, stored as a SAS date, and displayed in a more familiar DDMMMYY date format. Here is a partial listing of the YR20.JUNIORS data set:

                  
 Last Name     First Name   Gender     BDJULIAN   BDCOBOL  
 ALEXANDER     SUSAN          F        2003116    20030426
 BAREFOOT      JOSEPH         M        1997103    19970413
 BOSTIC        MARIE          F        1998143    19980523
 BRADY         CHRISTINE      F        2000289    20001015
 CAHILL        MARSHALL       M        2004305    20041031
 CARTER        KAREN          F        1999068    19990309
 CHOW          JANE           F        1998293    19981020
 DEAN          SANDRA         F        1997111    19970421
 DONALDSON     KAREN          F        1999030    19990130
 FERNANDEZ     KATRINA        F        1997042    19970211
 FOSTER        GERALD         M        1999213    19990801
 GRAHAM        ALVIN          M        1997320    19971116
 HARRISON      FELICIA        F        1999147    19990527
 HOWARD        GRETCHEN       F        1999298    19991025
The following program example takes the numeric date values, converts the Julian dates with the DATEJUL function and converts the COBOL dates with the PUT and INPUT functions, and then prints the results.
options nodate pageno=1 linesize=80 pagesize=60 yearcutoff=1960;

data juniors;
   set yr20_juniors;
   sasdate1=datejul(bdjulian);
   sasdate2=input(put(bdcobol,8.),yymmdd8.);
run;

title 'Birth Dates converted to SAS dates';
proc print data=juniors (obs=14) noobs;
   format sasdate1 sasdate2 date9.;
run;
The following output displays the first 14 observations in the output data set:

Output Showing Converted Dates from YR20.JUNIORS Data Set
                       Birth Dates converted to SAS dates                      1

  Lname       Fname       Gender   bdjulian    bdcobol    sasdate1    sasdate2

  ALEXANDER   SUSAN         F       2003116   20030426   26APR2003   26APR2003
  BAREFOOT    JOSEPH        M       1997103   19970413   13APR1997   13APR1997
  BOSTIC      MARIE         F       1998143   19980523   23MAY1998   23MAY1998
  BRADY       CHRISTINE     F       2000289   20001015   15OCT2000   15OCT2000
  CAHILL      MARSHALL      M       2004305   20041031   31OCT2004   31OCT2004
  CARTER      KAREN         F       1999068   19990309   09MAR1999   09MAR1999
  CHOW        JANE          F       1998293   19981020   20OCT1998   20OCT1998
  DEAN        SANDRA        F       1997111   19970421   21APR1997   21APR1997
  DONALDSON   KAREN         F       1999030   19990130   30JAN1999   30JAN1999
  FERNANDEZ   KATRINA       F       1997042   19970211   11FEB1997   11FEB1997
  FOSTER      GERALD        M       1999213   19990801   01AUG1999   01AUG1999
  GRAHAM      ALVIN         M       1997320   19971116   16NOV1997   16NOV1997
  HARRISON    FELICIA       F       1999147   19990527   27MAY1999   27MAY1999
  HOWARD      GRETCHEN      F       1999298   19991025   25OCT1999   25OCT1999


Chapter Contents

Previous

Next

Top of Page

Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.