Chapter Contents

Previous

Next
SAS/ACCESS Software for PC File Formats: Reference

Understanding DIF Essentials

This chapter introduces SAS System users to DIF files that can be created using software under a variety of microcomputer software packages.

This chapter focuses on the terms and concepts that help you access DIF files with SAS/ACCESS software. Then it describes DIF-specific statements you use in the ACCESS and DBLOAD procedures. Finally, it contains a section on how the SAS/ACCESS interface works.

Data Interchange Format (DIF) files are used by SAS/ACCESS software to access data indirectly from other software products, such as data in Lotus 1-2-3 spreadsheets and database files.


DIF Files

Various software products such as Lotus 1-2-3 enable you to use spreadsheet or database files to enter, organize, and perform calculations on data. Spreadsheets are most often used for general ledgers, income statements, and other types of financial record keeping. Database files also enable you to organize related information, such as, the data in an accounts-receivable journal.

In both spreadsheets and database files, the data are organized according to certain relationships among data items. These relationships are expressed by files in a tabular form, that is, in columns and rows. DIF files allow both character and numeric data in the same column. See DIF File Data Types in this chapter for more information. Each row can hold one data value for each column. The spreadsheet and database files can be translated to DIF files that the SAS/ACCESS interface can process.

A spreadsheet consists of columns and rows, and their intersection is called a cell. Columns and Rows of Data in a DIF File illustrates four columns from the spreadsheet CUSTOMERS and highlights a column and a row.

Columns and Rows of Data in a DIF File

[IMAGE]


DIF File Naming Conventions

DIF filenames must follow operating-environment specific conventions, so check the documentation that comes with your application or operating system software for further information.


DIF File Data Types

Every column in a DIF file has a name and one or two data types. A DIF file allows columns that include both character and numeric data.

Character data are generally considered text and can include dates and numbers if prefixes are used to indicate character data and to align the data in the cell. For example, in Lotus 1-2-3, the value "110 Maple Street uses the double quote prefix and aligns the label on the right side of the cell.

Numeric data include numbers (0 through 9), formulas, and cell entries that begin with one of the following symbols: +, $, @, -, or #. When you create and load a DIF file with PROC DBLOAD, the SAS/ACCESS engine supplies NA for a missing, numeric value. For decimal numbers, the SAS/ACCESS engine queries the operating environment for the current setting of the decimal separator and uses it when reading or creating DIF files.

Numeric data also include date and time values. In Lotus software, a date value is the integer portion of a number that can range from 01 January 1900 to 31 December 2099, that is, 1 to 73050. A Lotus software time value is the decimal portion of a number that represents time as a proportion of a day. For example, 0.0 is midnight, 0.5 is noon, and 0.999988 is 23:59:59 (on a 24-hour clock). While a number can have both a date and a time portion, the formats in Lotus 1-2-3 display a number only in a date format or a time format. For information about how the SAS/ACCESS interface handles date and time values and formats, see How the SAS System Handles Date and Time Values.

When you create an access descriptor, the interface software determines the column type by the value in the first row of data (excluding any rows that are defined for column names, blank rows for readability, and so on). If the first row in the column has no data value, the column type defaults to character data.

By default, any data value in a column that does not match the type is treated as a missing value. However, if you set the DIFNUMS environment variable to YES in your SAS configuration file, any numeric data values in a character column are converted to the character representation of the number and are not treated as missing values. Add the following line to your SAS configuration file to set the DIFNUMS environment variable to YES:

-SET DIFNUMS YES

The default for the DIFNUMS environment variable is NO. Refer to the SAS companion documentation for your operating system for more information about environment variables.

You can change the column type from the type determined by SAS/ACCESS software when you create an access descriptor.


How the SAS System Handles Date and Time Values

As an introduction to how the SAS System handles date and time values in DIF files, this section describes the date and time values in DIF files created from Lotus.

Lotus date and time datetime values and formats are described earlier in this chapter in DIF File Data Types.(footnote 1) The following sections describe how the ACCESS and DBLOAD procedures handle the conversions between the SAS values and the Lotus values and their respective formats.

Datetime Conversions in the ACCESS Procedure

When you create an access descriptor, the SAS System cannot distinguish a Lotus datetime value from other numeric data. The SAS System stores the Lotus datetime value as a number and displays it like other Lotus numeric data by using the SAS variable format 15.2 (the default format for this interface).

To convert a Lotus datetime value to a SAS datetime value, you must specify a SAS datetime format in the access descriptor. A Lotus datetime value is a number that represents the number of days between January 1, 1900, and a specified date; changing the default SAS format (15.2) to a datetime format in the descriptor causes the Lotus value to be converted to an equivalent SAS datetime value based on January 1, 1960. In other words, the Lotus numeric value for January 1, 1960 (which is 21916) is converted to the equivalent SAS representation of January 1, 1960 (which is 0) only if a SAS datetime format is stored in the descriptor for that column. Otherwise, the Lotus value of 21916 is treated as a SAS numeric value of 21916.

The following table shows how the SAS System uses a Lotus datetime value to convert to a SAS datetime format.

Value-to-Format Conversions
For a SAS format SAS System uses
date integer portion of the Lotus number
time decimal portion of the Lotus number
date-and-time integer and decimal portion of the Lotus number


Datetime Conversions in the DBLOAD Procedure

If a SAS variable is specified with a date, time, or datetime format in the FORMAT statement, the interface view engine converts that datetime value into the equivalent Lotus datetime value when the new DIF file is created. However, the DIF file has no way of relating this formatting information to Lotus products. Therefore, when you load the DIF file into a 1-2-3 worksheet, the datetime values are represented as numbers. You should assign (from within Lotus) a Lotus datetime format to any datetime column that you load from a DIF file.

If a SAS variable represents a date, time, or datetime value, but it has not been assigned a SAS datetime format-- the SAS datetime value is represented as a number--the number is not converted into an equivalent Lotus datetime value in the DIF file. Rather, the number is written to the new DIF file as is. Because SAS dates are based on January 1, 1960, and Lotus dates are based on January 1, 1900, if you assign a Lotus datetime format to an unconverted Lotus column, the datetime values in that column are inaccurate.

To maintain a SAS variable format in the input data set, yet change it only while the DBLOAD procedure is in progress, use the DBLOAD FORMAT statement. This statement enables you to assign a temporary format to a SAS variable for the duration of the procedure without affecting how the SAS System stores the variable.

For example, if the SAS format for the BIRTHDAT variable in the MYDATA.SASEMPS data set is left at the default 15.2 format, you can specify the FORMAT statement to change the variable's format to DATE7. while you are creating and loading the DIF file. When you load the DIF file into a 1-2-3 worksheet, you can specify an equivalent Lotus date format. When the DBLOAD procedure has completed, the SAS format for the BIRTHDAT variable reverts to the 15.2 format.

You can specify the FORMAT statement in the PROC DBLOAD statement when you invoke the procedure using any of the methods of processing.

Note:   There are certain display restrictions on the SAS datetime values that are loaded into Lotus 1-2-3 worksheets through DIF files. If you load a SAS variable with a DATETIMEw.d format into a DIF file, Lotus stores the number with both integer and decimal portions. However, when you load the DIF file into a 1-2-3 worksheet and specify a format for the column, you can only specify a date format (that uses the integer portion) or time format (that uses the decimal portion) for that column, not both at the same time.  [cautionend]


FOOTNOTE 1:  In this description, datetime (in lowercase) refers to any value or format that represents a date, a time, or both a date and time. [arrow]


Chapter Contents

Previous

Next

Top of Page

Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.