Chapter Contents

Previous

Next
SAS/ACCESS Interface to SYSTEM 2000 Data Management Software: Reference

Using a BY Key to Resolve Ambiguous Inserts

Each time the interface view engine is called to add an observation, it inspects the changes you made from observation to observation, in order to determine how many data records to insert into the database. The purpose is to reduce data redundancy.

If none of the data changed, or if the changes were only at the bottom level of the view, the engine needs to insert only a single new data record at the bottom level. Since the engine inserts at least one record for any add, and since only one record is called for here, there is no question about how many records to insert for this kind of add. That is, the insert is not ambiguous.

However, if any data values changed in records above the bottom record in the path, an ambiguous situation develops. A certain number of new records seem to be required by your changes, but some of the new data may already be present in existing database records. That is, the actual number of new records to be added to the database may be different.

In optimized load mode, the engine ignores the ambiguity; it inserts all of the new data that are at or below the highest level record that changed. Therefore, when you specify optimized load mode, be sure that your incoming data are always sorted by major-to-minor sort keys at every level (from level 0 down to the bottom level in the view). If the data are not sorted correctly, redundancy will not be eliminated.

On the other hand, in insert mode, the engine can determine whether some of the new data are already present in an existing record. If so, the engine needs to insert records only for those data not already present in the database. If not, the engine needs to insert a record at every level that changed. Use the optional BY key capability to eliminate redundancy and to help the interface view engine find an existing path for inserting the new records.

A BY key is similar to a BY group in the SAS System, which groups observations based on one or more fields. Many SAS procedures process records in BY groups. Also, some updates in the DATA step are performed by matching specified BY variables in different data sets. A similar matching process occurs with BY key items in the SAS/ACCESS interface to SYSTEM 2000 software.

If you specify a BY key, it should contain one or more database items at each level above the bottom level in your view descriptor.

BY keys do cause extra processing time because the engine issues one or more where-clauses to look for already-existing records.


Examples of Using a BY Key

Suppose you have a view with C1 and C11 in the BY key and three observations.

[IMAGE]

Suppose you are in the FSEDIT procedure on observation 1. You enter the DUP command and values A, CCC, and 4. This is not an ambiguous insert; a BY key is not required. The changes in values from observation 1 to your new input are confined to the bottom of the view. Here is the result.

[IMAGE]

Now suppose you are in the FSEDIT procedure on observation 1. You enter an ADD or a DUP command and the values B, DDD, and 5 for C1, C11, and C21, respectively. The insert is ambiguous because all the fields in the new observation are different from observation 1. Without a BY key, the result is

[IMAGE]

With a BY key, the engine finds the BY key values C1=B and C11=DDD in the database. The result then is

[IMAGE]


BY Key Considerations

The recommended way to use BY keys is as follows:

Consider the following caution areas when you do not use BY keys in this way:

In addition, the notion of a prior observation is important during inserts, because the engine compares your new data to it. The prior observation is obvious for SAS procedures that pass through a file sequentially, for example, the DBLOAD procedure. However, other SAS procedures can jump around within a file at random, for example, the FSEDIT procedure.

When you add observations through procedures that do not use sequential processing, remember that the prior observation is the last observation that the procedure showed you. For example, in the FSVIEW procedure, the prior observation is the last observation that the procedure showed you at the bottom of your display before your first update.

In some cases, there is no prior observation, such as when you enter the DBLOAD procedure. That is, the procedure calls the engine to add an observation without any prior retrieval. If this situation occurs, the engine issues a GET1 ... LAST command for the record at the top of the view and retrieves the last record that was inserted in the database.

These details are not very important if you are using a BY key consistently.


Chapter Contents

Previous

Next

Top of Page

Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.