Chapter Contents

Previous

Next
SAS/ACCESS Software for Relational Databases: Reference

Maximizing Teradata Performance

A major objective of SAS/ACCESS when reading DBMS tables is to take advantage of Teradata's rate of data transfer. This section describes the actions that you can take to ensure that SAS/ACCESS delivers optimal read performance including


About the PreFetch Facility

PreFetch is a SAS/ACCESS for Teradata facility that speeds up a SAS job by exploiting the parallel processing capability of Teradata. To obtain benefit from the facility, your SAS job must run more than once and have the following characteristics:

  • use SAS/ACCESS to query Teradata DBMS tables

  • should not contain SAS statements that create, update, or delete Teradata DBMS tables

  • run SAS code that changes infrequently or not at all.

In brief, the ideal job is a stable read-only SAS job.

Enabling PreFetch

Use of PreFetch is optional. To use the facility, you must explicitly enable it with the LIBNAME option PREFETCH=.

How Prefetch Works

When reading DBMS tables, SAS/ACCESS submits SQL statements on your behalf to Teradata. Each SQL statement that is submitted has an execution cost: the amount of time Teradata spends processing the statement before it returns the requested data to SAS/ACCESS.

When PreFetch is enabled, the first time you run your SAS job, SAS/ACCESS identifies and selects statements with a high execution cost. SAS/ACCESS then stores (caches) the selected SQL statements to one or more Teradata macros that it creates.

On subsequent runs of the job, when PreFetch is enabled, SAS/ACCESS extracts statements from the cache and submits them to Teradata in advance. The rows selected by these SQL statements are immediately available to SAS/ACCESS because Teradata 'prefetches' them. Your SAS job runs faster because PreFetch reduces the wait for SQL statements with a high execution cost. However, Prefetch improves elapsed time only on subsequent runs of a SAS job. During the first run, SAS/ACCESS only creates the SQL cache and stores selected SQL statements; no prefetching is performed.

The PreFetch Option Arguments

unique_storename As mentioned, when PreFetch is enabled, SAS/ACCESS creates one or more Teradata macros to store the selected SQL statements that PreFetch caches. You can easily distinguish a PreFetch macro from other Teradata macros. The PreFetch Teradata macro contains a comment that is prefaced with the text, "SAS/ACCESS PreFetch Cache".

The name that the PreFetch facility assigns the macro is the value that you enter for the unique_storename argument. The unique_storename name must be unique. Do not specify a name that exists in the Teradata DBMS already for a DBMS table, view or macro. Also, do not enter a name that exists already in another SAS job that employs the Prefetch facility.

#sessions This argument specifies how many cached SQL statements SAS/ACCESS submits in parallel to Teradata. In general, your SAS job completes faster if you increase the number of statements that Teradata works on in advance. However, a large number (too many sessions) can strain client and server resources. A valid value is 1 through 9. If you do not specify a value for this argument, the default is 3.

In addition to the specified number of sessions, SAS/ACCESS adds an additional session for submitting SQL statements that are not stored in the PreFetch cache. Thus, if the default is 3, SAS/ACCESS actually opens up to 4 sessions on the Teradata server.

algorithm This argument is present to handle future enhancements. Currently PreFetch only supports one algorithm, SEQUENTIAL.

When and Why Use PreFetch

If you have a read-only SAS job that runs frequently, this is an ideal candidate for PreFetch. For example, a daily job that extracts data from Teradata tables. To help you decide when to use PreFetch, consider the following daily jobs:

In these examples, Job 1 is an excellent candidate for the facility. In contrast, Job 2 is not. Using PreFetch with Job 2 does not return incorrect results but can impose a performance penalty. PreFetch uses stored SQL statements. Thus, Job 2 is not a good candidate because the SQL statements that the job generates with the WHERE clause change each time the job is run. Consequently, the SQL statements that the job generates never match the statements that are stored.

The impact of Prefetch on processing performance varies by SAS job. Some jobs improve elapsed time 5% or less; others improve elapsed time 25% or more.

Possible Unexpected Results

It is unlikely, but possible, to write a SAS job that delivers unexpected or incorrect results. This can occur if the job contains code that waits on some Teradata or system event before proceeding. For example, SAS code that pauses the SAS job until another user updates a given data item in a Teradata table. Or, SAS code that pauses the SAS job until a given time; for example, 5:00 p.m. In both cases, PreFetch would generate SQL statements in advance. But, table results from these SQL statements would not reflect data changes that will be made by the scheduled Teradata or system event.

PreFetch Processing of Unusual Conditions

PreFetch Facility is designed to handle unusual conditions gracefully. Some of these unusual conditions include:

Condition: Your job contains SAS code that creates updates, or deletes Teradata tables.
PreFetch is designed only for read operations and is disabled when it encounters a non-read operation. The facility will return a performance benefit up to the point where the first non-read operation is encountered. After that, SAS/ACCESS will disable the PreFetch facility and continue processing.

Condition: Your SQL cache name (unique_storename value) is identical to the name of a Teradata table.
PreFetch issues a warning message. SAS/ACCESS will disable the PreFetch facility and will continue processing.

Condition: You change your SAS code for a job that has PreFetch enabled.
PreFetch detects that the SQL statements for the job changed and deletes the cache. SAS/ACCESS will disable Prefetch and will continue processing. The next time that you run the job, PreFetch creates a fresh cache.

Condition: Your SAS job encounters a PreFetch cache that was created by a different SAS job.
PreFetch deletes the cache. SAS/ACCESS will disable Prefetch and will continue processing. The next time that you run the job, PreFetch creates a fresh cache.

Condition: You remove the PreFetch option from an existing job.
Prefetch is disabled. Even if the SQL cache (Teradata macro) still exists in your database, SAS/ACCESS ignores it.

Condition: You accidentally delete the SQL cache (the Teradata macro created by PreFetch) for a SAS job that has PreFetch enabled
SAS/ACCESS simply rebuilds the cache on the next run of the job. In subsequent job runs, PreFetch continues to enhance performance


Using PreFetch as a LIBNAME Option

If you specify the PreFetch= option in a LIBNAME statement, PreFetch applies the option to tables read by the libref.

Note:   If you have more than one LIBNAME in your SAS job, and you specify PreFetch for each LIBNAME, remember to make the SQL cache name for each LIBNAME unique.  [cautionend]

Example 1: Applying PreFetch to One of Two LIBNAMES

libname one teradata user=kamdar password=ellis
 prefetch='pf_store1';
libname two teradata user=larry password=riley;

In Example 1, you apply PreFetch to one of two LIBNAMES. During the first job run, PreFetch stores SQL statements for tables referenced by the LIBNAME ONE in a Teradata macro named PF_STORE1 for reuse later.

Example 2: Applying PreFetch to Multiple LIBNAMES

libname emp teradata user=kamdar password=ellis
 prefetch='emp_sas_macro';
libname sale teradata user=larry password=riley
 prefetch='sale_sas_macro';  

In Example 2, you apply PreFetch to multiple LIBNAMES. During the first job run, PreFetch stores SQL statements for tables referenced by LIBNAME EMP to a Teradata macro named EMP_SAS_MACRO and SQL statements for tables referenced by LIBNAME SALE to a Teradata macro named SALE_SAS_MACRO.


Using Prefetch as a Global Option

Unlike other Teradata LIBNAME options, you can also invoke PreFetch globally for a SAS job. To do this, place the OPTION DEBUG= statement in your SAS program before all LIBNAME statements and PROC SQL steps. If your job contains multiple LIBNAME statements, the global PreFetch invocation creates a uniquely named SQL cache name for each of the LIBNAMES.

Note:   Do not be confused by the DEBUG option here. It is merely a mechanism to deliver the PreFetch capability globally. PreFetch is not for debugging; it is a supported feature of SAS/ACCESS for Teradata.  [cautionend]

Example 1: Using PreFetch Globally with Multiple LIBNAMES

option debug="PREFETCH(unique_mac,2,SEQUENTIAL)";
libname one teradata user=kamdar password=ellis;
libname two teradata user=kamdar password=ellis
                          database=larry;
libname three teradata user=kamdar password=ellis
                            database=wayne;
proc print data=one.kamdar_goods;
run;
proc print data=two.larry_services;
run;
proc print data=three.wayne_miscellaneous;
run; 

In Example 1, the first time you run the job with Prefetch enabled, the facility creates 3 Teradata macros: UNIQUE_MAC1, UNIQUE_MAC2, and UNIQUE_MAC3. In subsequent runs of the job, PreFetch extracts SQL statements from these Teradata macros, enhancing the job performance across all three LIBNAMES referenced by the job.

Example 2: PreFetch Selects the Algorithm

option debug='prefetch(pf_unique_sas,3)';

In Example 2, PreFetch selects the algorithm, that is, the order of the SQL statements. (The option debug statement must the first statement in your SAS job.)

Example 3: User Specifies the Sequential Algorithm

option debug='prefetch(sas_pf_store,3,sequential)';

In Example 3, the user specifies for PreFetch to use the SEQUENTIAL algorithm. (The option debug statement must the first statement in your SAS job.)


Matching Teradata Data Types to More Efficient SAS Formats

You can further optimize read operations, with or without use of the PreFetch facility, by matching the DBMS data types to more efficient SAS formats. Internally, SAS employs floating point numbers and fixed-length character strings. The corresponding Teradata data types are FLOAT and CHAR. When SAS/ACCESS issues SELECT statements on your behalf, it can optimize read processing if the SELECT statement requests FLOAT and CHAR columns from Teradata. In this case, SAS/ACCESS can access the data directly, reading it from the buffers that Teradata controls on the client (your PC, workstation or terminal). SAS/ACCESS then avoids having to move each column from Teradata's buffers into buffers maintained by SAS/ACCESS. Reading data from Teradata's client buffers improves performance. This is true especially when SAS/ACCESS reads a large amount of Teradata data.

In order to take advantage of the optimization just described, you may want to create new tables. For the new tables, where possible define FLOAT columns in place of DECIMAL, INTEGER, SMALLINT, and BYTEINT numeric Teradata columns. Similarly, where possible define CHAR columns in place of VARCHAR and LONG VARCHAR character Teradata columns. These data type changes can speed up processing. But, only when you read the data using SAS/ACCESS. Therefore, create a new table only if you intend to access the table primarily through SAS/ACCESS.

When creating a table, remember that changing other numeric types to FLOAT automatically increases the size of a Teradata table. Other numeric types require less physical storage than does the FLOAT data type. Therefore, before creating a table, consider the overall table design. Additionally, consider running benchmarks with SAS/ACCESS.


Chapter Contents

Previous

Next

Top of Page

Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.