Chapter Contents

Previous

Next
SAS/ACCESS Software for Relational Databases: Reference

Performance and Efficient View Descriptors

When you create, update, and use view descriptors, follow these guidelines to minimize the use of SAS System resources and to reduce the time it takes the DBMS to access data.


General Information

When you create or update view descriptors, select only the columns that your program needs. Selecting unnecessary columns adds extra processing time due to data conversions.

When you use a view descriptor in a DATA step or SAS procedure, columns that you specified in the VAR and KEEP statements are passed to the DBMS for processing. Columns that you specified in DROP statements are not passed to the DBMS. Therefore, only a subset of the columns is returned to the SAS System, and performance is usually enhanced. (This approach can be applied in some of DBMSs, such as CA-OpenIngres, ORACLE, Oracle Rdb, and SYBASE.)


Sorting DBMS Data

Sorting DBMS data can be resource-intensive, whether it is done using the SORT procedure, a BY statement, or an ORDER BY clause in a view descriptor or SQL procedure. You should sort data only when sorted data are needed for your program. The following list includes guidelines and information about sorting data:


Extracting Data Using a View

In some cases, it might be more efficient to use a view descriptor to extract (that is, copy) DBMS data and place it in a SAS data file instead of using the view descriptor to read the data directly.

A DBMS table is read every time a view descriptor is referred to in a SAS program and the program is executed; the program's output reflects the latest updated level of the DBMS table. If many users are reading the same DBMS table repeatedly, DBMS performance may decrease. If you create several reports during the same SAS session, they may not be based on the same DBMS data due to updating by other users. Therefore, in the following circumstances, it is better to extract data:


Using a Subset of the DBMS Data

When you create or update view descriptors, you should specify selection criteria (where possible) to subset the number of rows that the DBMS returns to the SAS System.

As a general rule, a view descriptor's WHERE clause is passed to the DBMS for processing. How WHERE Clauses Are Processed with View Descriptors and the explanation afterwards describe this process in more detail.

How WHERE Clauses Are Processed with View Descriptors
View descriptor with a WHERE clause System that processes the WHERE clause
used in a DATA step, PROC step, or PROC SQL query
without additional WHERE clause DBMS
with additional WHERE clause DBMS: compound WHERE clause built with AND; otherwise, SAS System (table note 1)
used with a PROC SQL Pass-Through query (table note 2)
without additional WHERE clause DBMS
with additional WHERE clause SAS System. Each DBMS table in the join is processed by the DBMS and then returned to the SAS System for final processing. PROC SQL cannot optimize a join of this kind.

TABLE NOTE 1:  

The interface view engine builds a compound WHERE clause using AND operator(s) if the clauses are valid for the DBMS. Otherwise, only the valid part of the WHERE clause is sent to the DBMS for processing and the SAS System processes the remaining part(s). See "Using Multiple WHERE Clauses" below for more information. [arrow]

TABLE NOTE 2:  

When a view descriptor and a Pass-Through query are used within the same PROC SQL query, they are usually joined in the FROM clause. [arrow]

Note:   See SQL Procedure's Interaction with SAS/ACCESS Software for more information on Pass-Through queries.  [cautionend]

Using Multiple WHERE Clauses

When a view descriptor that includes a DBMS-specific SQL WHERE clause is used in a DATA step or procedure that includes a SAS WHERE statement, the SAS/ACCESS interface view engine tries to pass both WHERE conditions to the DBMS for processing. If all or part of the WHERE clause is valid for the DBMS, SAS/ACCESS software uses one or more logical AND operators to build a compound WHERE clause.

However, if a WHERE clause contains SAS enhancements or features that are not supported by the DBMS, the WHERE conditions are split up, and only the valid condition (that is, valid DBMS-specific syntax) is sent to the DBMS. The SAS System processes the remaining part(s) of the WHERE clause.

For example, a DBMS-specific SQL cannot parse a colon modifier on a comparison operator, and therefore, the first half of the following WHERE clause cannot be passed to the DBMS.

The second half of the WHERE clause is passed to the DBMS, and the DBMS returns to the SAS System all the rows in the INVOICE table for which PAIDON is on or after 01JAN94. The SAS System must then process the first half of the WHERE clause to subset the rows that were returned from the DBMS for the countries Argentina and Australia:

where country =: 'A' and paidon >= '01JAN94'd;

It is more efficient to use a LIKE operator that is valid in both the DBMS-specific SQL and the SAS System:

where country like 'A%' and paidon >= '01JAN94'd;

In this case, the DBMS can process both halves of the WHERE clause.

Note that the SAS/ACCESS interface view engine can translate certain SAS conventions, such as datetime formats or the IS MISSING operator, to their DBMS-specific SQL equivalents and pass these clauses to the DBMS.

In most cases it is more efficient for the DBMS to process the WHERE conditions. Therefore, write WHERE conditions using features (such as SQL operators) that are valid in both the SAS System and the DBMS so that the DBMS can process the entire WHERE clause.

It is also more efficient to use a SAS WHERE statement instead of a subsetting IF statement. As just described, a WHERE clause is passed to the DBMS for processing and returns a subset of rows to the SAS System for further processing. In contrast, when you use a subsetting IF statement, every row is returned to the SAS System to be evaluated by the IF statement. Therefore, using a SAS WHERE statement often improves performance.

Writing Efficient WHERE Clauses for View Descriptors

You should write WHERE clauses that enable the DBMS to use its indexes, where possible. This is a good practice whether you specify the WHERE clause in the view descriptor's selection criteria or you use a SAS WHERE statement in a DATA step that references a view descriptor. This practice is especially important when you are accessing large DBMS tables.

You cannot tell the DBMS to use an index, but you can write WHERE clauses that enable it to use its DBMS indexes effectively. Here are some guidelines for writing WHERE clauses that enable the DBMS to use indexes effectively.

Note:   The guidelines for each specific DBMS may vary.  [cautionend]

For a list of operators that are generally accepted by the SQLs of most DBMSs, see the SAS Procedures Guide.


Chapter Contents

Previous

Next

Top of Page

Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.