SAS/SHARE User's Guide

Implications of Data Translation

Other parts of SAS translate data from one representation to another. The CPORT and CIMPORT procedures translate data into and from a single representation system called transport format. The DOWNLOAD and UPLOAD procedures of SAS/CONNECT software each perform this "to-transport/from-transport" translation as they move data between hosts that have dissimilar architectures.

In SAS/SHARE software, translation of numeric variables occurs when the server machine and the client machine represent floating-point numbers differently. For character variables, translation occurs when their character representations differ. Values are translated directly from the source representation to the target representation; they do not pass through transport format. Translation occurs both when data flows from the server to the client and when it flows from the client to the server. Therefore, data that flows across architectures from a server to a client and is then sent back to that server undergoes two translations.

For all hosts on which SAS/SHARE software currently runs, the remote engine performs all data conversion on behalf of itself and the server. Thus, it converts outgoing data to the server format, and it converts incoming data from the server to its own format. The administrative procedure, PROC OPERATE, works in the same way. However, the server does all data conversion for clients other than SAS, such as the SAS ODBC driver.


Numeric Translation

Translation from one numeric representation to another can alter the value of a variable. A common type of alteration is loss of precision. This occurs when the source representation uses more bits to represent the mantissa than the target representation. Such distortions always have a very small magnitude, but they might represent a significant percentage change to a very small original value.
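The effect can be reproduced outside SAS with a short Python sketch. It is only an illustration under stated assumptions: a 64-bit IEEE double stands in for the source representation and a 32-bit IEEE float (24-bit mantissa) stands in for a target with fewer mantissa bits. Neither format is claimed to be what a particular SAS/SHARE host uses, and the function name is hypothetical.

```python
import struct


def translate_to_narrow(value: float) -> float:
    """Round-trip a 64-bit double through a 32-bit float (assumed target format)."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


# An ordinary value: the absolute and relative errors are both tiny.
original = 0.1
translated = translate_to_narrow(original)
abs_error = abs(translated - original)
rel_error = abs_error / abs(original)
print(f"{original!r} -> {translated!r}  abs={abs_error:.3e}  rel={rel_error:.3e}")

# A very small value: the absolute error is still tiny,
# but it is a large fraction of the original value.
tiny = 1e-45
tiny_translated = translate_to_narrow(tiny)
tiny_abs_error = abs(tiny_translated - tiny)
tiny_rel_error = tiny_abs_error / tiny
print(f"{tiny!r} -> {tiny_translated!r}  abs={tiny_abs_error:.3e}  rel={tiny_rel_error:.3e}")
```

Values that fit exactly in the narrower mantissa, such as 0.5, pass through unchanged in this model.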

CAUTION:
SAS/SHARE software does not produce any warnings about loss of precision during translation.

A rare type of value distortion is loss of magnitude. This occurs when the source representation has a greater exponent range than the target representation and a value whose magnitude lies in the excess range of the source representation is translated. Of course, the magnitude of this type of distortion is potentially very great, as is the percentage change. SAS/SHARE software does produce a warning when this type of alteration occurs.
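A comparable situation can be sketched in Python, again assuming a 64-bit double as the source and a 32-bit float as a target with a smaller exponent range. The function name is hypothetical, and clamping to the largest representable magnitude is an assumption made purely for illustration; this section does not say what value SAS/SHARE stores in such a case, only that a warning is produced.

```python
import math
import struct
import warnings

# Largest finite 32-bit float, computed exactly in double precision.
FLOAT32_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127


def translate_with_range_check(value: float) -> float:
    """Translate to the narrower format, warning if the magnitude does not fit."""
    try:
        return struct.unpack("<f", struct.pack("<f", value))[0]
    except OverflowError:
        # The exponent lies outside the target's range: warn, then clamp
        # (the clamping is an illustrative assumption, not SAS behavior).
        warnings.warn(f"loss of magnitude translating {value!r}", RuntimeWarning)
        return math.copysign(FLOAT32_MAX, value)


print(translate_with_range_check(1.5))     # fits: no warning
print(translate_with_range_check(1e300))   # does not fit: warning is issued
```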

See SAS Language Reference: Dictionary for a detailed description of the numeric representation of SAS variable values.


Character Translation

Character data is not altered by the translation process when the translation between the two representation schemes is one-to-one. Current Institute-supplied translation tables provide one-to-one translation between the five character representations that SAS recognizes: EBCDIC, ASCII-OEM, ASCII-ANSI, ASCII-ISO, and ASCII-Mac.

Use caution if the client and server machines are configured for different natural languages, such as Swedish and English. Translation tables on each machine are typically customized to best support the native language. This provides the best mapping of the native language to the coding system, but it makes translation alteration more likely.

A site or individual user can freely modify their translation tables. In such cases, the modifications must be checked carefully to ensure that the new translation is still one-to-one, or to verify that alteration of character data through updating cannot occur (or is not important). See Character-Translation Tables for information about translation tables.
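The kind of check described above can be sketched in Python by modeling a translation table as a 256-byte mapping. The tables below are hypothetical toy data, not the contents of any supplied TRANTAB entry, and the helper names are illustrative only.

```python
def is_one_to_one(table: bytes) -> bool:
    """True if the 256-entry table maps every byte to a distinct byte."""
    return len(table) == 256 and len(set(table)) == 256


def round_trips(export_table: bytes, import_table: bytes) -> bool:
    """True if export followed by import returns every byte unchanged."""
    all_bytes = bytes(range(256))
    return all_bytes.translate(export_table).translate(import_table) == all_bytes


# Hypothetical export table: identity, except that 0xE5 and 0x86 trade places.
export_work = bytearray(range(256))
export_work[0xE5], export_work[0x86] = 0x86, 0xE5
export_table = bytes(export_work)

# The matching import table is the inverse mapping.
import_work = bytearray(256)
for source, target in enumerate(export_table):
    import_work[target] = source
import_table = bytes(import_work)

# Hypothetical site customization: send 0xE5 to 0x61, which 0x61 already maps to.
custom_work = bytearray(export_table)
custom_work[0xE5] = 0x61
custom_export = bytes(custom_work)

print(is_one_to_one(export_table), round_trips(export_table, import_table))
print(is_one_to_one(custom_export), round_trips(custom_export, import_table))
```

With the customized table, two source bytes share one target, so a value sent out and read back can no longer be restored to the original byte.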


Character-Translation Tables

The tables that are used for character translation in SAS/SHARE software are stored in SAS catalog entries of type TRANTAB. Each of these catalog entries contains two translation tables. The first is for import translation and the second is for export translation. For example, the EBCDIC/ASCII-OEM translation entry on OS/390 contains an import table for ASCII-OEM to EBCDIC translation and an export table for EBCDIC to ASCII-OEM translation. The names of the catalog entries that contain these translation tables are given in the following table.

Translation Table Set      Catalog Entry Name
EBCDIC/ASCII-ISO           _0000030
EBCDIC/ASCII-ANSI          _0000060
EBCDIC/ASCII-OEM           _00000A0
EBCDIC/ASCII-MAC           _0000120
ASCII-ISO/ASCII-ANSI       _0000050
ASCII-ISO/ASCII-OEM        _0000090
ASCII-ISO/ASCII-MAC        _0000110
ASCII-ANSI/ASCII-OEM       _00000C0
ASCII-ANSI/ASCII-MAC       _0000140
ASCII-OEM/ASCII-MAC        _0000180

Character-translation catalog entries are stored in the SASUSER.PROFILE and SASHELP.HOST catalogs. The translation process locates a given translation entry by searching first the SASUSER.PROFILE catalog and then the SASHELP.HOST catalog.
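The search order can be pictured with a small Python model. This is only a sketch of the lookup rule stated above: the dictionary contents are hypothetical placeholders, and the function is not part of any SAS interface.

```python
# Catalogs are searched in this order, as described above.
SEARCH_ORDER = ("SASUSER.PROFILE", "SASHELP.HOST")


def locate_trantab(entry_name, catalogs):
    """Return the name of the first catalog that holds the entry, or None."""
    for catalog in SEARCH_ORDER:
        if entry_name in catalogs.get(catalog, {}):
            return catalog
    return None


# Hypothetical catalog contents for illustration.
catalogs = {
    "SASUSER.PROFILE": {"_00000C0": "user-modified tables"},
    "SASHELP.HOST": {"_00000C0": "supplied tables", "_0000060": "supplied tables"},
}

print(locate_trantab("_00000C0", catalogs))  # found in the user's profile first
print(locate_trantab("_0000060", catalogs))  # falls through to SASHELP.HOST
```

One consequence of this order is that a modified entry saved in SASUSER.PROFILE takes precedence over the supplied entry of the same name in SASHELP.HOST.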

The remote engine performs all data conversion on behalf of itself and the server. Therefore, only the translation tables in the client SAS session are used. However, the server translation tables are used in data conversion for clients other than SAS because the server performs all data conversion on behalf of itself and these clients.

SAS site administrators can use the TRANTAB procedure to replace or update the translation tables. See the SAS Procedures Guide and SAS Language Reference: Dictionary for details.

CAUTION:
Do not attempt to update a translation table in a client session while connected across architectures to a host that requires its use. You cannot ensure that the new version of the table will be used for subsequent conversions.

Note:   The default character set for Microsoft Windows hosts is ASCII-ANSI, so the supplied _00000C0 TRANTAB entry contains an import translation table for ASCII-OEM to ASCII-ANSI and an export table for ASCII-ANSI to ASCII-OEM. If the WINCHARSET SAS system option is used to switch the character set to ASCII-OEM, the _00000C0 TRANTAB entry must be updated to switch the import and export tables if they differ.


Data Translation Considerations

Data translation in SAS/SHARE software has consequences that users need to consider. For example, suppose that a user has assigned two SAS data libraries, FOO and ZOO, through a server on a host that has a different architecture from the user's machine. The user copies the data sets contained in FOO to ZOO:

proc copy in=foo out=zoo mt=data;
run;

The contents of the copied data sets in ZOO are not guaranteed to be identical to the contents of the original data sets in FOO. The data sets in ZOO have undergone two translations from the originals in FOO. This may have resulted in a slight change in the precision of numeric variables in the data sets in ZOO.

For another example, suppose a user is using the FSEDIT procedure to edit a data set across architectures. A user who enters a DUP command and who then modifies variable X before saving the new record may find that, aside from the value of variable X, the new record is not identical to the old record. The original values of the duplicated record have undergone two translations, from server-machine format to user-machine format and back, while the new value that the user entered for the variable X has undergone only one translation from user-machine format to server-machine format.
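The DUP scenario can be modeled in a few lines of Python. The sketch assumes, purely for illustration, that the server stores 64-bit doubles and the user's machine uses a narrower 32-bit float; the variable names PRICE and RATE and both function names are hypothetical.

```python
import struct


def server_to_client(value: float) -> float:
    """Translate a server value to the assumed narrower client format."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def client_to_server(value: float) -> float:
    """Translate back; widening a client value to the server format is exact."""
    return float(value)


# The original observation as stored on the server.
old_record = {"X": 1.0, "PRICE": 0.1, "RATE": 2.5}

# DUP: the values are translated to client format for display (first translation).
on_client = {name: server_to_client(value) for name, value in old_record.items()}

# The user types a new value for X in the client session.
on_client["X"] = 2.0

# Saving translates every value to server format
# (second translation for duplicated values, first for X).
new_record = {name: client_to_server(value) for name, value in on_client.items()}

for name in old_record:
    print(name, repr(old_record[name]), "->", repr(new_record[name]))
```

In this model RATE survives both translations because 2.5 is exactly representable in the narrower format, while PRICE comes back slightly changed even though the user never touched it.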

Note:   When editing or updating a data set across architectures by using the FSEDIT procedure, the FSVIEW procedure, the DATA step MODIFY statement, and so forth, any variables that are not updated in an updated observation will be exempt from translation and will be unaltered.



Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.