Chapter Contents

Previous

Next
SAS/ACCESS Software for PC File Formats: Reference

Understanding XLS Essentials

This chapter introduces SAS System users to XLS files. It focuses on the terms and concepts that help you use the SAS/ACCESS interface and includes descriptions of

XLS files contain data in the form of Microsoft Excel spreadsheets. Unless otherwise noted, Excel 4 and Excel 5 files are referred to collectively throughout this report as XLS. Excel 5 files are not supported under OS/2.

Note:   The ACCESS and DBLOAD procedures do not support files in the Excel 97 (Version 8) format. However, you can still access Excel 97 files through the SAS Import/Export facility. From a SAS session's PROGRAM EDITOR window, select the File menu and then select the Import Data or Export Data item. Information about how to import and export Excel data is available from the [Help] button. To write code to import or export Excel data, refer to the IMPORT or EXPORT procedure description in the SAS Procedures Guide.  [cautionend]


XLS Files

Various software products, such as the Microsoft Excel spreadsheet, enable you to use spreadsheet or database files to enter, organize, and perform calculations on data. Spreadsheets are most often used for general ledgers, income statements, and other types of financial record keeping. Database files also enable you to organize related information, such as the data in an accounts-receivable journal.

In spreadsheets, the data are organized according to certain relationships among data items. These relationships are expressed in a tabular form--in columns and rows. Each column represents one category of data, and each row can hold one data value for each column.

A Microsoft Excel 5.0 worksheet, for example, is an electronic spreadsheet consisting of a grid of 256 columns and 16,384 rows. The intersection of a column and a row is called a cell. Columns and Rows of Data in an XLS File illustrates a portion of a standard Excel worksheet.

Columns and Rows of Data in an XLS File

[IMAGE]

Column letters for each column appear above the worksheet. Columns are lettered A through IV (A to Z, AA to AZ, BA to BZ, and so on to IV). Row numbers for each row appear to the left of the worksheet. Rows are numbered 1 to 16,384. For Excel 4 files, only one worksheet (worksheet 1) is allowed per file, but more than one worksheet can be stored in a workbook. You must convert any worksheets you store in a workbook back to worksheets before you can use the data in a SAS program.

A range is a subset of cells in a worksheet. A range is identified by its address, which begins with the name of the top left cell and ends with the name of the bottom right cell separated by two periods. For example, the range B3..D6 is the range address for a rectangular block of 12 cells whose top left cell is B3 and whose bottom right cell is D6 (as shaded in the display).

XLS File Naming Conventions

The following conventions apply to XLS filenames. Filenames must also follow operating-system specific conventions, so check the documentation that comes with your Microsoft Excel product or other software products for further information.


XLS Data Types

Microsoft Excel software has two data types: character and numeric. Microsoft Excel character data may be entered as labels or formula strings; Microsoft Excel numeric data may be entered as numbers or formulas.

Character data are generally considered text and can include dates and numbers.

Numeric data can include numbers (0 through 9), formulas, and cell entries that begin with one of the following symbols: +, $, @, -, =, or #. When you create and load an Excel file with PROC DBLOAD, the SAS/ACCESS engine supplies #NA for a missing, numeric value.

Numeric data also can include date and time values. In Microsoft Excel software, a date value is the integer portion of a number that can range from 01 January 1900 to 31 December 2078, that is, 1 to 65380. A Microsoft Excel software time value is the decimal portion of a number that represents time as a proportion of a day. For example, 0.0 is midnight, 0.5 is noon, and 0.999988 is 23:59:59 (on a 24-hour clock). While a number can have both a date and a time portion, the formats in Microsoft Excel display a number only in a date, time, or datetime format. For information on how the SAS/ACCESS interface handles date and time values and formats, see How the SAS System Handles Date and Time Values.

When you create an access descriptor, the interface software uses the column types and formats in the XLS file to determine the corresponding SAS variable formats. The SAS System generates its default formats based on the values that you specify for the SCANTYPE, SKIPROWS, and GETNAMES statements (or in the corresponding fields in the Access Descriptor Identification window). You can change the formats generated by the software interface. For more information, see How the SAS/ACCESS Interface Works.

When you create an access descriptor, any data value that does not match the column type (character or numeric) is treated as a missing value. This is the default action. However, you can use the MIXED=YES statement to convert numeric data values in a character column to their character representation.

You can also set the SS_MIXED environment variable to YES in your SAS configuration file so that both numeric and character data are displayed as SAS character data. Add this line to your SAS configuration file:

-SET SS_MIXED YES

See Setting Environment Variables for more information on environment variables. For more information on changing the column type from the type determined by SAS/ACCESS software when you create an access descriptor, refer to the sections on XLS-specific procedure statements later in this chapter.


How the SAS System Handles Date and Time Values

The conversion of date and time values between SAS data sets and Microsoft Excel spreadsheets is transparent to users. However, you are encouraged to understand the differences between them.

Microsoft Excel date and time values and formats are described in XLS Data Types.

Datetime Conversions in the ACCESS Procedure

As described earlier in this chapter, an XLS date value is the integer portion of a number that represents the number of days between January 1, 1900 and a specified date. An XLS time value is a decimal portion of a number that represents time as a portion of the day. For example, 0.0 is 12:00:00 a.m., and 0.9999884 is 11:59:59 p.m. While a number can have both a date and a time portion, the formats in XLS display a number only in a date format or in a time format. For example, for 1:00 p.m., March 12, 1994, the XLS date value is 34405, the time value is 0.5416667, and the datetime value is 34405.5416667.(footnote 1)

The SAS System handles date and time values differently than XLS. A SAS date value is an integer that represents the number of days between January 1, 1960 and a specified date. A SAS time value is an integer that represents the number of seconds since midnight of the current day. When a date and a time are both present, the SAS System stores the value as the number of seconds since midnight, January 1, 1960. For example, for 1:00 p.m., March 12, 1994, the SAS date value is12489, and the SAS time value is 46800. Therefore, the SAS datetime value is 1079096400.

When you create an access descriptor, the SAS System converts an XLS datetime format to its corresponding SAS datetime format if an XLS datetime format is specified for the variable in the XLS file. Note that if the datetime value does not have an XLS format in the XLS file, the SAS System treats the datetime value like a numeric value.

To convert an XLS datetime format to a SAS datetime format, you need a SAS datetime format in the access descriptor. For example, changing the default SAS numeric format (15.2) to a SAS date format in the descriptor causes the XLS date value (based on January 1, 1900) to be converted to an equivalent SAS date value (based on January 1, 1960). In other words, the XLS numeric value for January 1, 1960 (which is 21916) is converted to the equivalent SAS representation of January 1, 1960 (which is 0) only if a SAS datetime format is assigned in the descriptor for that column. Otherwise, the XLS value of 21916 is treated as a SAS numeric value of 21916.

The following table shows how the SAS System uses a Microsoft Excel datetime value to convert to a SAS datetime format.

Value-to-Format Conversions
For a SAS format SAS System uses
date integer portion of the Microsoft Excel number
time decimal portion of the Microsoft Excel number
date-and-time integer and decimal portion of the Microsoft Excel number


Datetime Conversions in the DBLOAD Procedure

If a SAS variable is specified with a date, time, or datetime format in the FORMAT statement, the interface view engine converts that SAS datetime format into the equivalent Microsoft Excel datetime format when the new XLS file is created.

However, if a SAS datetime format is not specified in the input SAS data set, you have to assign a format by using a PROC DBLOAD FORMAT statement. Doing so assigns a Microsoft Excel datetime format to the SAS variable when the variable is loaded into an XLS file. If you do not assign a SAS datetime format, the SAS numeric value for the date is written to the XLS file. Because SAS dates are based on January 1, 1960, and Microsoft Excel dates are based on January 1, 1900, the date value in the XLS file will be inaccurate.

To maintain a SAS variable format in the input data set, yet change it just while the DBLOAD procedure is in progress, use the FORMAT statement in PROC DBLOAD. This statement enables you to assign a temporary format to a SAS variable for the duration of the procedure without affecting the input SAS data set.

For example, if the SAS format for the BIRTHDAT variable in the MYDATA.SASEMPS access descriptor is left at the default 15.2 format, you can specify the FORMAT statement to change the variable's format to DATE7. while you are creating and loading the XLS file. When you load the XLS file, the DATE7. format becomes an equivalent Microsoft column format, DDMMMYY. When the DBLOAD procedure has completed, the SAS format for the BIRTHDAT variable returns to the 15.2 format.

You can specify the FORMAT statement in the PROC DBLOAD statement when you invoke the procedure using any of the methods of processing.


ACCESS Procedure Data Conversions

Use PROC ACCESS to define descriptors that identify spreadsheet data and the conversions necessary to use the data in SAS programs. The Microsoft Excel label data type is formatted as a SAS character type, and the Microsoft Excel number data type is formatted as a SAS numeric type.

Fonts, attributes, and colors in the XLS files are not read into the SAS data sets. However, the ACCESS procedure supports most of the XLS number formats and automatically converts them to the corresponding SAS formats. Any XLS data strings longer than 200 characters are truncated while being converted into SAS data sets, and any SAS data file created from XLS files can contain up to 256 variables and 16,384 observations.

Default SAS System Variable Formats for XLS File Data shows the default SAS System variable formats that the ACCESS procedure assigns to each type of standard XLS file data. Default SAS System Variable Formats for Customized XLS Format Strings provides SAS System variable formats for customized XLS format strings. XLS file numeric data include date and time values. See How the SAS System Handles Date and Time Values for more information.

Default SAS System Variable Formats for XLS File Data
XLS File Data SAS Variable Format
Data Type XLS Format String Type Format
Char (table note 1) @ (table note 2) Char $w.
Numeric (table note 3) General Num BEST
Numeric 0 Num w.d
Numeric 0.00 Num w.d
Numeric #,##0 Num COMMAw.d
Numeric #,##0.00 Num COMMAw.d
Numeric #,##0_);(#,##0) Num NEGPARENw.d
Numeric #,##0_);[Red](#,##0) Num NEGPARENw.d
Numeric #,##0.00_);(#,##0.00) Num NEGPARENw.d
Numeric #,##0.00_);[Red](#,##0.00) Num NEGPARENw.d
Numeric $#,##0_);($#,##0) Num DOLLARw.d
Numeric $#,##0_);[Red]($#,##0) Num DOLLARw.d
Numeric ($#,##0.00_);($#,##0.00) Num DOLLARw.d
Numeric ($#,##0.00_);[Red]($#,##0.00) Num DOLLARw.d
Numeric _($*#,##0_);_($*(#,##0);_($*"-"_);_(@_) Num DOLLARw.d
Numeric _(*#,##0_);_(*(#,##0);_(*"-"_);_(@_) Num NEGPARENw.d
Numeric _($*#,##0.00_);_($*(#,##0.00);_($*"-"??_);_(@_) Num DOLLARw.d
Numeric _(*#,##0.00_);_(*(#,##0.00);_(*"-"??_);_(@_) Num NEGPARENw.d
Numeric 0% Num PERCENTw.d
Numeric 0.00% Num PERCENTw.d
Numeric 0.00E+00 Num Ew.d
Numeric ##0.0E+0 Num Ew.d
Numeric m/d/yy
Num MMDDYYw.
Numeric d-mmm-yy
Num MMDDYYw.
Numeric d-mmm
Num DATEw.
Numeric mmm-yy
Num MONYYw.
Numeric h:mm AM/PM Num TIMEw.
Numeric h:mm:ss AM/PM Num TIMEw.
Numeric h:mm
Num TIMEw.
Numeric hh:mm Num TIMEw.
Numeric h:mm:ss
Num TIMEw.
Numeric hh:mm:ss
Num TIMEw.
Numeric m/d/yy h:mm
Num DATETIMEw.
Numeric ddmmmyy
Num DATEw.
Numeric ddmmmyyyy:hh:mm:ss
Num DATETIMEw.
Numeric dd
Num DATEw.
Numeric dd/mm/yy
Num DDMMYYw.
Numeric dddd
Num DATEw.
Numeric mm/dd/yy
Num MMDDYYw.
Numeric mm:ss
Num MMSSw.
Numeric mm yy
Num MONYYw.
Numeric mm yyyy
Num MONYYw.
Numeric mm:yy
Num MONYYw.
Numeric mm:yyyy
Num MONYYw.
Numeric mm-yy
Num MONYYw.
Numeric mm-yyyy
Num MONYYw.
Numeric mmyy
Num MONYYw.
Numeric mmyyyy
Num MONYYw.
Numeric mm.yy
Num MONYYw.
Numeric mm.yyyy
Num MONYYw.
Numeric mm/yy
Num MONYYw.
Numeric mm/yyyy
Num MONYYw.
Numeric mmmm
Num MONYYw.
Numeric m
Num MONYYw.
Numeric mmmyy
Num MONYYw.
Numeric mmmyyyy
Num MONYYw.
Numeric dddd, mmmm dd, yyyy
Num MONYYw.
Numeric dddd, dd mmmm yyyy
Num MONYYw.
Numeric mmmm dd, yyyy
Num MONYYw.
Numeric dd mmmm yyyy
Num MONYYw.
Numeric yy
Num YYMMDDw.
Numeric yyyy
Num YYMMDDw.
Numeric yy mm
Num YYMMDDw.
Numeric yyyy mm
Num YYMMDDw.
Numeric yy:mm
Num YYMMDDw.
Numeric yyyy:mm
Num YYMMDDw.
Numeric yy-mm
Num YYMMDDw.
Numeric yyyy-mm
Num YYMMDDw.
Numeric yymm
Num YYMMDDw.
Numeric yyyymm
Num YYMMDDw.
Numeric yy.mm
Num YYMMDDw.
Numeric yyyy.mm
Num YYMMDDw.
Numeric yy/mm
Num YYMMDDw.
Numeric yyyy/mm
Num YYMMDDw.
Numeric yy-mm-dd
Num YYMMDDw.
Numeric yymmm
Num YYMMDDw.
Numeric yyyymmm
Num YYMMDDw.

TABLE NOTE 1:  

Label data.  [arrow]

TABLE NOTE 2:  

The XLS character format for Excel Version 5.0.  [arrow]

TABLE NOTE 3:  

Number, formula, or missing data.  [arrow]

Default SAS System Variable Formats for Customized XLS Format Strings
XLS File Data SAS Variable Format
Data Type XLS Format String Type Format
Numeric "$" Num DOLLARw.d
Numeric "E" Num Ew.d
Numeric "m, d and y" Num MMDDYYw.
Numeric "m and h" Num TIMEw.d
Numeric "m and s" Num TIMEw.d
Numeric "m and y" Num MONYYw.
Numeric "m" Num DATEw.
Numeric "d" Num DATEw.
Numeric "y" Num DATEw.
Numeric "0.0" Num w.d
Numeric Fraction values (#?/?) Num BESTw.d
Numeric Percent values (0.0%) Num PERCENTw.d
Numeric All others Num BESTw.d

Note that w is based on Excel column width; .d is controlled by the Excel format string.

If XLS file data fall outside of the valid SAS data ranges, you receive an error message in the SAS log when you try to access the data.

The SAS/ACCESS interface does not fully support the Microsoft Excel hidden and text formats. XLS data in hidden format are displayed in SAS data sets; however, you can drop the hidden column when you are creating the access descriptor. If you want to display the formula in the text format, add a space to indicate that the formula entry is a label. Otherwise, the results of the formula are displayed.

If you have set the SS_MIXED environment variable to YES, the numerical values in XLS files are converted to character strings in SAS data sets if the corresponding SAS variable type is specified as character.


DBLOAD Procedure Data Conversions

This section explains how SAS data are read into Microsoft Excel data when a table is loaded. In this conversion, the SAS character data type is converted into the Microsoft Excel label type and the SAS numeric type is converted into the Microsoft Excel number type.

The SAS/ACCESS interface automatically converts SAS formats to the same or associated Microsoft Excel formats and column widths. However, you can temporarily assign other formats and column widths to SAS variables by using the FORMAT statement so that the loaded XLS file columns have the formats you want. Converting SAS System Variable Formats to XLS File Data shows the SAS System variable types and formats and the XLS data types, formats, and column widths that you can assign them to.

Note:   The FORMAT statement in PROC DBLOAD only changes the format of SAS variables while you are creating and loading the XLS files. When the procedure is completed, the formats of SAS variables return to their original settings.  [cautionend]

XLS values are numeric data. See How the SAS System Handles Date and Time Values for more information.

Converting SAS System Variable Formats to XLS File Data
SAS Variable Format XLS File Data
Type Format XLS Format String Data Type
Char " " General LABEL
Char $CHAR General LABEL
Char $ General LABEL
Num BESTw.d General NUMBER
Num COMMAw.d #,##0 NUMBER
Num COMMAXw.d #,##0 NUMBER
Num DATEw. ddmmmyy
NUMBER
Num DATETIMEw.d ddmmmyyyy:hh:mm:ss NUMBER
Num DAYw. dd NUMBER
Num DDMMYYw. dd/mm/yy NUMBER
Num DOLLARw.d "$"#,##0_);("$"#,##0) NUMBER
Num DOLLARXw.d "$"#,##0_);("$"#,##0) NUMBER
Num DOWNAMEw.d dddd
NUMBER
Num Ew. 0.00E+00 NUMBER
Num HHMMw.d h:mm NUMBER
Num HOURw.d h:mm NUMBER
Num JULDAYw. m/d/yy NUMBER
Num JULIANw. m/d/yy NUMBER
Num MMDDYYw. mm/dd/yy NUMBER
Num MMSSw.d mm:ss NUMBER
Num MMYYxw. mmyy
NUMBER
Num MMYYC mm:yy NUMBER
Num MMYYD mm-yy NUMBER
Num MMYYN mmyy
NUMBER
Num MMYYP mm.yy NUMBER
Num MMYYS mm/yy NUMBER
Num MONNAMEw. mmmm
NUMBER
Num MONTHw. m
NUMBER
Num MONYYw. mmmyy
NUMBER
Num NEGPARENw.d #,##0_);(#,##0) NUMBER
Num NENGOw. m/d/yy NUMBER
Num PERCENTw.d 0% NUMBER
Num QTRw. m/d/yy NUMBER
Num QTRRw. m/d/yy NUMBER
Num SSNw. 000-00-0000 NUMBER
Num TIMEw.d h:mm:ss NUMBER
Num TODw. h:mm:ss NUMBER
Num W 0 NUMBER
Num WEEKDATEw. dddd, mmmmdd, yyyy
NUMBER
Num WEEKDATXw. dddd, dd mmmmyyyy
NUMBER
Num WEEKDAYw. m/d/yy NUMBER
Num WORDDATEw. mmmmdd, yyyy NUMBER
Num WORDDATXw. ddmmmmyyyy
NUMBER
Num YEARw. yy or yyyy NUMBER
Num YYMM yy mm
NUMBER
Num YYMMC yy:mm NUMBER
Num YYMMD yy-mm NUMBER
Num YYMMN yymm
NUMBER
Num YYMMP yy.mm NUMBER
Num YYMMS yy/mm NUMBER
Num YYMMDDw. yy-mm-dd NUMBER
Num YYMONw. yymmm
NUMBER
Num Zw.d 0w.d NUMBER
Num FRACTw. # ?/? NUMBER

Note that Excel column widths are set to w and displayed in the column. If data are larger than column width, the data are displayed as pound signs (###), in which case the data can be viewed by adjusting the column width.


FOOTNOTE 1:  In this description, datetime (in lowercase) refers to any value or format that represents a date, a time, or both a date and a time. [arrow]


Chapter Contents

Previous

Next

Top of Page

Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.