Chapter Contents

Previous

Next
SAS/ACCESS Software for Relational Databases: Reference

Maximizing DB2 Performance

Among the factors that affect DB2 performance are the size of the table that is being accessed and the form of the SQL SELECT statement. If the table that is being accessed is larger than 10,000 rows (or 1,000 pages), you should evaluate all SAS programs that access the table directly. When you evaluate the programs, consider the following questions:

DB2 has a Resource Limit Facility to limit the execution time of dynamic SQL statements. If the time limit is exceeded, the dynamic statement is terminated and the SQL code -905 is returned. The following list describes several situations in which the RLF could stop a user from consuming large quantities of CPU time:

There are several things that you can do in your SAS application to make the DB2 engine perform better:


Making the Most of Your Connections

Beginning in Version 7, the DB2 engine supports more than one connection to DB2 per SAS session. This is an improvement over Version 6 in a number of ways, especially in a server environment. One advantage is being able to segregate tasks that fetch rows from a cursor from tasks that must issue commits. This eliminates having to resynchronize the cursor, prepare the statement, and fetch rows until you are positioned back on the row you were on. This separation also allows tasks that must issue commits to eliminate locking contention to do so sooner, since they are not delayed until after cursors are closed to prevent having to resynchronize. In general, tables opened for input fetch from cursors and do not issue commits, while update opens may, and output opens do, issue commits.

You can control how the DB2 engine uses connections by using the CONNECTION= option on the LIBNAME statement. At one extreme is CONNECTION=UNIQUE, which causes each table access, whether it is for input, update, or output, to create and use its own connection. Conversely, CONNECTION=SHARED means that only one connection is made, and that input, update, and output accesses all share that connection.

The default value for the CONNECTION= option is CONNECTION=SHAREDREAD, which means that tables opened for input share one connection, while update and output opens get their own connections. CONNECTION=SHAREDREAD allows the best separation between tasks that fetch from cursors and tasks that must issue commits, eliminating the resynchronizing of cursors.

The values GLOBAL and GLOBALREAD perform similarly to SHARED and SHAREDREAD. The difference is that you can share the given connection across any of the librefs that you specify as GLOBAL or GLOBALREAD.

Although the default value of CONNECTION= SHAREDREAD is optimal, there are times when another value might be better. If you must use multiple librefs, you might want to set them each as GLOBALREAD. This way, you will have one connection for all of your input opens, regardless of which libref you use, as opposed to one connection per libref for input opens. In a single-user environment (as opposed to a server session), you might know that you will not have multiple opens occurring at the same time. In this case, you might want to use SHARED (or GLOBAL for multiple librefs). This eliminates the overhead of creating separate connections for input, update, and output transactions, while having only one open at a time eliminates the problem of having to resynchronize input cursors if a commit occurs.

Another reason for using SHARED or GLOBAL is the case of opening a table for output while opening another table within the same database for input. This can result in a -911 deadlock situation unless both opens occur in the same connection.

As explained in Information for the Database Administrator, the first connection to DB2 is made from the main SAS task. Subsequent connections are made from corresponding subtasks, which the DB2 engine attaches; DB2 allows only one connection per task. Due to the system overhead of intertask communication, the connection established from the main SAS task is a faster connection in terms of CPU time. Since this is true, if you are reading or writing large numbers of rows, you will have better performance (less CPU time) if you use the first connection for these operations. If you are only reading rows, SHAREDREAD or GLOBALREAD can share the first connection. However, if you are both reading and writing rows (input and output opens), you can use CONNECTION=UNIQUE to make each open use the first connection. UNIQUE causes each open to have its own connection. If you only have one open at a time, and some are input while others are output (for large amounts of data), the performance benefit of using the main SAS task connection far outweighs the overhead of establishing a new connection for each open.

One other type of connection that the DB2 engine uses, and which is not user controllable, is the utility connection. This connection is used to access the system catalog, issues commits to release locks, and is a separate connection. Utility procedures such as DATASETS and CONTENTS can cause this connection to be created, although other actions necessitate it as well. There is one connection of this type per libref, but it is not created until it is needed. If you have critical steps which must use the main SAS task connection for performance reasons, refrain from using the DEFER=YES option on the LIBNAME statement. It is possible that the utility connection can be established from that task, causing the connection you use for your opens to be from a slower subtask.

In summary, there is not one value for the CONNECTION= option which works best in all possible situations. You might need to try different values and arrange your SAS programs in different ways to obtain the best performance possible.

For additional information about


Chapter Contents

Previous

Next

Top of Page

Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.