SAS/ACCESS Interface to SYSTEM 2000 Data Management Software: Reference

Using a BY Key to Resolve Ambiguous Inserts

Each time the interface view engine is called to add an observation, it inspects the changes you made from observation to observation, in order to determine how many data records to insert into the database. The purpose is to reduce data redundancy.

If none of the data changed, or if the changes were only at the bottom level of the view, the engine needs to insert only a single new data record at the bottom level. Since the engine inserts at least one record for any add, and since only one record is called for here, there is no question about how many records to insert for this kind of add. That is, the insert is not ambiguous.

However, if any data values changed in records above the bottom record in the path, an ambiguous situation develops. A certain number of new records seem to be required by your changes, but some of the new data may already be present in existing database records. That is, the actual number of new records to be added to the database may be different.

In optimized load mode, the engine ignores the ambiguity; it inserts all of the new data that are at or below the highest level record that changed. Therefore, when you specify optimized load mode, be sure that your incoming data are always sorted by major-to-minor sort keys at every level (from level 0 down to the bottom level in the view). If the data are not sorted correctly, redundancy will not be eliminated.

On the other hand, in insert mode, the engine can determine whether some of the new data are already present in an existing record. If so, the engine needs to insert records only for those data not already present in the database. If not, the engine needs to insert a record at every level that changed. Use the optional BY key capability to eliminate redundancy and to help the interface view engine find an existing path for inserting the new records.

A BY key is similar to a BY group in the SAS System, which groups observations based on one or more fields. Many SAS procedures process records in BY groups. Also, some updates in the DATA step are performed by matching specified BY variables in different data sets. A similar matching process occurs with BY key items in the SAS/ACCESS interface to SYSTEM 2000 software.

If you specify a BY key, it should contain one or more database items at each level above the bottom level in your view descriptor.

BY keys do cause extra processing time because the engine issues one or more where-clauses to look for already-existing records.

Examples of Using a BY Key

Suppose you have a view with C1 and C11 in the BY key and three observations.

[IMAGE]

Suppose you are in the FSEDIT procedure on observation 1. You enter the DUP command and values A, CCC, and 4. This is not an ambiguous insert; a BY key is not required. The changes in values from observation 1 to your new input are confined to the bottom of the view. Here is the result.

[IMAGE]

Now suppose you are in the FSEDIT procedure on observation 1. You enter an ADD or a DUP command and the values B, DDD, and 5 for C1, C11, and C21, respectively. The insert is ambiguous because all the fields in the new observation are different from observation 1. Without a BY key, the result is

[IMAGE]

With a BY key, the engine finds the BY key values C1=B and C11=DDD in the database. The result then is

[IMAGE]

BY Key Considerations

include an item from each level above the bottom of the view
standardize all database updates through the same view or consistent views.

Consider the following caution areas when you do not use BY keys in this way:

Potentially inadequate BY keys. The engine does not enforce the important suggestion that a BY key must contain at least one item at every level above the bottom level in the view descriptor. However, if the BY key does not contain enough unique items, it may be inadequate to help the engine; the engine may behave as though there were no BY key at all.
Inconsistent use of a BY key. The engine does not enforce consistent use of BY keys; one view descriptor might have a BY key and another might not. In this case, redundant data could be added to the database through the view descriptor that does not have a BY key. Also, some applications could enter redundant data through the QUEST procedure, which does not call the engine for database updates.
If data are added in any way other than through a view descriptor using a BY key, the engine might find several qualified database records that match the incoming data. The engine would simply pick one record that works and use it when inserting the new records. Thus, the incoming data might be attached beneath a different existing record than the one you expect.
To avoid this situation, make sure that all users who update the database follow the same rules. That is, ensure that all data entry is done through the interface view engine and that all users use the same view descriptor (or at least consistent view descriptors).

In addition, the notion of a prior observation is important during inserts, because the engine compares your new data to it. The prior observation is obvious for SAS procedures that pass through a file sequentially, for example, the DBLOAD procedure. However, other SAS procedures can jump around within a file at random, for example, the FSEDIT procedure.

When you add observations through procedures that do not use sequential processing, remember that the prior observation is the last observation that the procedure showed you. For example, in the FSVIEW procedure, the prior observation is the last observation that the procedure showed you at the bottom of your display before your first update.

In some cases, there is no prior observation, such as when you enter the DBLOAD procedure. That is, the procedure calls the engine to add an observation without any prior retrieval. If this situation occurs, the engine issues a GET1 ... LAST command for the record at the top of the view and retrieves the last record that was inserted in the database.

These details are not very important if you are using a BY key consistently.

Chapter Contents
Previous
Next
Top of Page