The EXPAND Procedure

Transformation Operations

The operations that can be used in the TRANSFORMIN= and TRANSFORMOUT= options are shown in Table 11.1. Operations are applied to each value of the series. Each value of the series is replaced by the result of the operation.

In Table 11.1, x_t or x represents the value of the series at a particular time period t before the transformation is applied, y_t represents the value of the result series, and N represents the total number of observations.

The notation [n] indicates that the argument n is optional; the default is 1. The notation window is used as the argument for the moving statistics operators, and it indicates that you can specify either an integer number of periods n or a list of n weights in parentheses. The notation s indicates the length of seasonality, and it is a required argument.

Table 11.1: Transformation Operations
Syntax              Result
+ number            adds the specified number: x + number
- number            subtracts the specified number: x - number
* number            multiplies by the specified number: x * number
/ number            divides by the specified number: x / number
ABS                 absolute value: |x|
CD_I s              classical decomposition irregular component
CD_S s              classical decomposition seasonal component
CD_SA s             classical decomposition seasonally adjusted series
CD_TC s             classical decomposition trend-cycle component
CDA_I s             classical decomposition (additive) irregular component
CDA_S s             classical decomposition (additive) seasonal component
CDA_SA s            classical decomposition (additive) seasonally adjusted series
CEIL                smallest integer greater than or equal to x: ceil(x)
CMOVAVE window      centered moving average
CMOVCSS window      centered moving corrected sum of squares
CMOVMAX n           centered moving maximum
CMOVMED n           centered moving median
CMOVMIN n           centered moving minimum
CMOVRANGE n         centered moving range
CMOVSTD window      centered moving standard deviation
CMOVSUM n           centered moving sum
CMOVUSS window      centered moving uncorrected sum of squares
CMOVVAR window      centered moving variance
CUAVE [n]           cumulative average
CUCSS [n]           cumulative corrected sum of squares
CUMAX [n]           cumulative maximum
CUMED [n]           cumulative median
CUMIN [n]           cumulative minimum
CURANGE [n]         cumulative range
CUSTD [n]           cumulative standard deviation
CUSUM [n]           cumulative sum
CUUSS [n]           cumulative uncorrected sum of squares
CUVAR [n]           cumulative variance
DIF [n]             lag n difference: x_t - x_{t-n}
EWMA number         exponentially weighted moving average of x with
                    smoothing weight number, where 0 < number < 1:
                    y_t = number x_t + (1 - number) y_{t-1}.
                    This operation is also called simple exponential smoothing.
EXP                 exponential function: exp(x)
FLOOR               largest integer less than or equal to x: floor(x)
ILOGIT              inverse logistic function: exp(x)/(1 + exp(x))
LAG [n]             value of the series n periods earlier: x_{t-n}
LEAD [n]            value of the series n periods later: x_{t+n}
LOG                 natural logarithm: log(x)
LOGIT               logistic function: log(x/(1 - x))
MAX number          maximum of x and number: max(x, number)
MIN number          minimum of x and number: min(x, number)
> number            missing value if x <= number, else x
>= number           missing value if x < number, else x
= number            missing value if x \ne number, else x
^= number           missing value if x = number, else x
< number            missing value if x >= number, else x
<= number           missing value if x > number, else x
MOVAVE n            moving average of n neighboring values:
                    (1/n) \sum_{j=0}^{n-1} x_{t-j}
MOVAVE(w1 ... wn)   weighted moving average of neighboring values:
                    (\sum_{j=1}^{n} w_j x_{t-j+1}) / (\sum_{j=1}^{n} w_j)
MOVAVE window       backward moving average
MOVCSS window       backward moving corrected sum of squares
MOVMAX n            backward moving maximum
MOVMED n            backward moving median
MOVMIN n            backward moving minimum
MOVRANGE n          backward moving range
MOVSTD window       backward moving standard deviation
MOVSUM n            backward moving sum
MOVUSS window       backward moving uncorrected sum of squares
MOVVAR window       backward moving variance
MISSONLY <MEAN>     indicates that the following moving time window
                    statistic operator should replace only missing values with the
                    moving statistic and should leave nonmissing values unchanged.
                    If the option MEAN is specified, then missing values are
                    replaced by the overall mean of the series.
NEG                 changes the sign: -x
NOMISS              indicates that the following moving time window
                    statistic operator should not allow missing values.
RECIPROCAL          reciprocal: 1/x
REVERSE             reverses the series: x_{N-t}
SETMISS number      replaces missing values in the series with the number specified.
SIGN                -1, 0, or 1 as x is < 0, equals 0, or > 0, respectively
SQRT                square root: \sqrt{x}
SQUARE              square: x^2
SUM                 cumulative sum: \sum_{j=1}^{t} x_j
SUM n               cumulative sum of n-period lags:
                    x_t + x_{t-n} + x_{t-2n} + ...
TRIM n              sets x_t to a missing value if t \le n or t \ge N-n+1.
TRIMLEFT n          sets x_t to a missing value if t \le n.
TRIMRIGHT n         sets x_t to a missing value if t \ge N-n+1.

Moving Time Window Operators

Some operators compute statistics for a set of values within a moving time window; these are called moving time window operators. There are backward and centered versions of these operators.

The centered moving time window operators are CMOVAVE, CMOVCSS, CMOVMAX, CMOVMED, CMOVMIN, CMOVRANGE, CMOVSTD, CMOVSUM, CMOVUSS, and CMOVVAR. These operators compute statistics of the n values x_i for observations t-(n-1)/2 \le i \le t+(n-1)/2.

The backward moving time window operators are MOVAVE, MOVCSS, MOVMAX, MOVMED, MOVMIN, MOVRANGE, MOVSTD, MOVSUM, MOVUSS, and MOVVAR. These operators compute statistics of the n values x_t, x_{t-1}, ..., x_{t-n+1}.

All the moving time window operators accept an argument n specifying the number of periods to include in the time window. For example, the following statement computes a five-period backward moving average of X.

      convert x=y / transformout=( movave 5 );

In this example, the final result is y_t = (x_t + x_{t-1} + x_{t-2} + x_{t-3} + x_{t-4})/5.

The following statement computes a five-period centered moving average of X.

      convert x=y / transformout=( cmovave 5 );

In this example, the final result is y_t = (x_{t-2} + x_{t-1} + x_t + x_{t+1} + x_{t+2})/5.

If the window width for a centered moving time window operator is not an odd number, one more lagged value than lead value is included in the time window. For example, the result of the CMOVAVE 4 operator is y_t = (x_{t-2} + x_{t-1} + x_t + x_{t+1})/4.

You can compute a forward moving time window operation by combining a backward moving time window operator with the REVERSE operator. For example, the following statement computes a five-period forward moving average of X.

      convert x=y / transformout=( reverse movave 5 reverse );

In this example, the final result is y_t = (x_t + x_{t+1} + x_{t+2} + x_{t+3} + x_{t+4})/5.
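
The REVERSE, MOVAVE, REVERSE combination can be checked with a small Python sketch. This is an illustration of the arithmetic only, not the procedure's implementation; the function names `movave` and `forward_movave` are hypothetical, missing values are not handled, and the window is assumed to shrink where fewer than n values are available.

```python
def movave(x, n):
    """Backward moving average; the window is shortened at the start of the series."""
    out = []
    for t in range(len(x)):
        window = x[max(0, t - n + 1): t + 1]
        out.append(sum(window) / len(window))
    return out


def forward_movave(x, n):
    """Forward moving average built as REVERSE, MOVAVE n, REVERSE."""
    return movave(x[::-1], n)[::-1]


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = forward_movave(x, 5)
# y[0] averages x[0] through x[4]; near the end of the series the window
# shrinks to the values that remain.
print(y)
```

Because the backward operator shortens its window at the start of the reversed series, the forward result is based on fewer values at the end of the original series.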

Some of the moving time window operators enable you to specify a list of weight values to compute weighted statistics. These are CMOVAVE, CMOVCSS, CMOVSTD, CMOVUSS, CMOVVAR, MOVAVE, MOVCSS, MOVSTD, MOVUSS, and MOVVAR.

To specify a weighted moving time window operator, enter the weight values in parentheses after the operator name. The window width n is equal to the number of weights that you specify; do not specify n.

For example, the following statement computes a weighted five-period centered moving average of X.

      convert x=y / transformout=( cmovave( .1 .2 .4 .2 .1 ) );

In this example, the final result is y_t = .1 x_{t-2} + .2 x_{t-1} + .4 x_t + .2 x_{t+1} + .1 x_{t+2}.

The weight values must be greater than zero. If the weights do not sum to 1, the weights specified are divided by their sum to produce the weights used to compute the statistic.
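
The normalization of the weights can be sketched in Python as follows. The function name `weighted_cmovave_at` is hypothetical, the sketch covers only an odd number of weights at a point where the full window is available, and it is meant as an illustration rather than the procedure's own code.

```python
def weighted_cmovave_at(x, t, weights):
    """Weighted centered moving average at 0-based index t (full window only)."""
    n = len(weights)
    if n % 2 == 0:
        raise ValueError("this sketch handles an odd number of weights only")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be greater than zero")
    half = (n - 1) // 2
    if t - half < 0 or t + half >= len(x):
        raise ValueError("full window not available at this index")
    total = sum(weights)  # weights are divided by their sum before use
    window = x[t - half: t + half + 1]
    return sum((w / total) * v for w, v in zip(weights, window))


x = [2.0, 4.0, 6.0, 8.0, 10.0]
a = weighted_cmovave_at(x, 2, [0.1, 0.2, 0.4, 0.2, 0.1])
b = weighted_cmovave_at(x, 2, [1, 2, 4, 2, 1])  # same weights, not summing to 1
print(a, b)
```

Both calls give the same average up to floating-point rounding, because the second weight list is rescaled by its sum of 10.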

At the beginning of the series, and at the end of the series for the centered operators, a complete time window is not available. The computation of the moving time window operators is adjusted for these boundary conditions as follows.

For backward moving window operators, the width of the time window is shortened at the beginning of the series. For example, the results of the MOVSUM 3 operator are

      y_1 = x_1
      y_2 = x_1 + x_2
      y_3 = x_1 + x_2 + x_3
      y_4 = x_2 + x_3 + x_4
      y_5 = x_3 + x_4 + x_5
      ...

For centered moving window operators, the width of the time window is shortened at both the beginning and the end of the series. When an observation is unavailable at one side of the moving time window, the corresponding observation at the other side of the window is ignored. For example, the results of the CMOVSUM 5 operator are

      y_1 = x_1
      y_2 = x_1 + x_2 + x_3
      y_3 = x_1 + x_2 + x_3 + x_4 + x_5
      y_4 = x_2 + x_3 + x_4 + x_5 + x_6
      ...
      y_{N-2} = x_{N-4} + x_{N-3} + x_{N-2} + x_{N-1} + x_N
      y_{N-1} = x_{N-2} + x_{N-1} + x_N
      y_N = x_N

For weighted moving time window operators, the weights for the unavailable or unused observations are ignored and the remaining weights are renormalized to sum to 1.
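
The boundary adjustments for unweighted operators can be illustrated with a short Python sketch. The function names `movsum` and `cmovsum` are hypothetical; the sketch assumes no missing values and, for the centered case, an odd window width.

```python
def movsum(x, n):
    """Backward moving sum; the window is shortened at the start of the series."""
    return [sum(x[max(0, t - n + 1): t + 1]) for t in range(len(x))]


def cmovsum(x, n):
    """Centered moving sum for odd n; the window is shortened symmetrically."""
    if n % 2 == 0:
        raise ValueError("this sketch handles odd window widths only")
    half = (n - 1) // 2
    out = []
    for t in range(len(x)):
        # Largest reach available on both sides of t; an unavailable value on
        # one side causes the matching value on the other side to be ignored.
        k = min(half, t, len(x) - 1 - t)
        out.append(sum(x[t - k: t + k + 1]))
    return out


x = [1, 2, 3, 4, 5, 6, 7]
print(movsum(x, 3))
print(cmovsum(x, 5))
```

In the centered case the second value is the sum of the first three observations, and the next-to-last value is the sum of the last three, which matches the pattern shown above for CMOVSUM 5.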

Cumulative Statistics Operators

Some operators compute cumulative statistics for a set of current and previous values of the series. The cumulative statistics operators are CUAVE, CUCSS, CUMAX, CUMED, CUMIN, CURANGE, CUSTD, CUSUM, CUUSS, and CUVAR. These operators compute statistics of the values x_t, x_{t-n}, x_{t-2n}, ..., x_{t-in} for t-in > 0.

By default, the cumulative statistics operators compute the statistics from all previous values of the series, so that y_t is based on the set of values x_1, x_2, ..., x_t. For example, the following statement computes y_t as the cumulative sum of nonmissing x_i values for i \le t.

      convert x=y / transformout=( cusum );

You can also specify a lag increment argument n for the cumulative statistics operators. In this case, the statistic is computed from the current and every nth previous value. For example, the following statement computes y_t as the cumulative sum of nonmissing x_i values for odd i when t is odd and for even i when t is even.

      convert x=y / transformout=( cusum 2 );

The results of this example are

      y_1 = x_1
      y_2 = x_2
      y_3 = x_1 + x_3
      y_4 = x_2 + x_4
      y_5 = x_1 + x_3 + x_5
      y_6 = x_2 + x_4 + x_6
      ...
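
The lag increment can be pictured with the following Python sketch. The function name `cusum` is hypothetical and the sketch assumes a series with no missing values.

```python
def cusum(x, n=1):
    """Cumulative sum of the current value and every nth previous value."""
    out = []
    for t in range(len(x)):
        # Add the running total from n periods earlier, if there is one.
        previous = out[t - n] if t >= n else 0
        out.append(x[t] + previous)
    return out


x = [1, 2, 3, 4, 5, 6]
print(cusum(x))     # lag increment 1: ordinary cumulative sum
print(cusum(x, 2))  # lag increment 2: odd and even periods accumulate separately
```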

Missing Values

You can truncate the length of the result series by using the TRIM, TRIMLEFT, and TRIMRIGHT operators to set values to missing at the beginning or end of the series.

You can use these operators to trim the results of moving time window operators so that the result series contains only values computed from a full width time window. For example, the following statement computes a centered five-period moving average of X and sets to missing the values at the ends of the series that are averages of fewer than five values.

      convert x=y / transformout=( cmovave 5 trim 2 );
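
The effect of the three trimming operators can be sketched in Python. The function names are hypothetical and `None` stands in for a missing value; positions are counted from t = 1 to N as in Table 11.1.

```python
def trim(y, n):
    """Set y_t to missing if t <= n or t >= N-n+1."""
    N = len(y)
    return [None if (t <= n or t >= N - n + 1) else v
            for t, v in enumerate(y, start=1)]


def trimleft(y, n):
    """Set y_t to missing if t <= n."""
    return [None if t <= n else v for t, v in enumerate(y, start=1)]


def trimright(y, n):
    """Set y_t to missing if t >= N-n+1."""
    N = len(y)
    return [None if t >= N - n + 1 else v for t, v in enumerate(y, start=1)]


y = [1, 2, 3, 4, 5, 6, 7]
print(trim(y, 2))
print(trimleft(y, 2))
print(trimright(y, 2))
```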

Normally, the moving time window and cumulative statistics operators ignore missing values and compute their results for the nonmissing values. When preceded by the NOMISS operator, these operators produce a missing result if any value within the time window is missing. The NOMISS operator does not perform any calculations, but serves to modify the operation of the moving time window operator that follows it. The NOMISS operator has no effect unless it is followed by a moving time window operator.

For example, the following statement computes a five-period moving average of the variable X but produces a missing value when any of the five values are missing.

      convert x=y / transformout=( nomiss movave 5 );

The following statement computes the cumulative sum of the variable X but produces a missing value for all periods after the first missing X value.

      convert x=y / transformout=( nomiss cusum );
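
The difference between the default behavior and NOMISS can be sketched in Python, with `None` standing in for a missing value. The function names and the `nomiss` flag are hypothetical. The sketch assumes that the default average divides by the number of nonmissing values in the window, that a window with no nonmissing values yields a missing result, and that NOMISS looks only at the values inside the (possibly shortened) window; none of these details is stated in this section.

```python
def movave(x, n, nomiss=False):
    """Backward moving average that skips missing values unless nomiss is set."""
    out = []
    for t in range(len(x)):
        window = x[max(0, t - n + 1): t + 1]
        present = [v for v in window if v is not None]
        if not present or (nomiss and len(present) < len(window)):
            out.append(None)
        else:
            out.append(sum(present) / len(present))
    return out


def cusum(x, nomiss=False):
    """Cumulative sum of nonmissing values; with nomiss, missing from the first gap on."""
    out, total, seen_missing = [], 0.0, False
    for v in x:
        if v is None:
            seen_missing = True
        else:
            total += v
        out.append(None if (nomiss and seen_missing) else total)
    return out


x = [1.0, 2.0, None, 4.0, 5.0, 6.0]
print(movave(x, 3), movave(x, 3, nomiss=True))
print(cusum(x), cusum(x, nomiss=True))
```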

Similar to the NOMISS operator, the MISSONLY operator does not perform any calculations (unless followed by the MEAN option), but it serves to modify the operation of the moving time window operator that follows it. When preceded by the MISSONLY operator, these moving time window operators replace any missing values with the moving statistic and leave nonmissing values unchanged.

For example, the following statement replaces any missing values of the variable X with an exponentially weighted moving average of the past values of X and leaves nonmissing values unchanged. In effect, the missing values are interpolated by simple exponential smoothing.

      convert x=y / transformout=( missonly ewma 0.3 );

Similarly, the following statement replaces any missing values of the variable X with the overall mean of X.

      convert x=y / transformout=( missonly mean );

You can use the SETMISS operator to replace missing values with a specified number. For example, the following statement replaces any missing values of the variable X with the number 8.77.

      convert x=y / transformout=( setmiss 8.77 );
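
The two simplest replacement rules, MISSONLY MEAN and SETMISS, can be sketched in Python with `None` standing in for a missing value. The function names are hypothetical, and the sketch assumes that the overall mean is taken over the nonmissing values only.

```python
def setmiss(x, number):
    """Replace missing values with the specified number."""
    return [number if v is None else v for v in x]


def missonly_mean(x):
    """Replace missing values with the mean of the nonmissing values."""
    present = [v for v in x if v is not None]
    if not present:
        return list(x)  # nothing to average; leave the series unchanged
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in x]


x = [2.0, None, 4.0, None, 6.0]
print(setmiss(x, 8.77))
print(missonly_mean(x))
```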

Classical Decomposition Operators

If y_t is a seasonal time series with s observations per season, classical decomposition methods "break down" the time series into four components: trend, cycle, seasonal, and irregular. The trend and cycle components are often combined to form the trend-cycle component. There are two forms of decomposition: multiplicative and additive.

      y_t = TC_t S_t I_t          (multiplicative)
      y_t = TC_t + S_t + I_t      (additive)

where

TC_t
is the trend-cycle component.

S_t
is the seasonal component or seasonal factors that are periodic with period s and with mean one (multiplicative) or zero (additive).

I_t
is the irregular or random component that is assumed to have mean one (multiplicative) or zero (additive).

The CD_TC operator computes the trend-cycle component for both the multiplicative and additive models. When s is odd, this operator computes an s-period centered moving average as follows:
      TC_t = \sum_{k=-\lfloor s/2 \rfloor}^{\lfloor s/2 \rfloor} y_{t+k} / s

In the case s=5, the CD_TC s operator is equivalent to the following CMOVAVE operator:
      convert x=tc / transformout=( cmovave 5 trim 2 );


When s is even, the CD_TC s operator computes the average of two adjacent s-period centered moving averages as follows:
      TC_t = \sum_{k=-\lfloor s/2 \rfloor}^{\lfloor s/2 \rfloor - 1} (y_{t+k} + y_{t+1+k}) / (2s)

In the case s=12, the CD_TC s operator is equivalent to the following CMOVAVE operator:
      convert x=tc / transformout=(cmovave 12 movave 2 trim 6);
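
The two trend-cycle formulas can be evaluated directly with a short Python sketch. The function name `cd_tc` is hypothetical; the sketch assumes no missing values and leaves the result missing (`None`) wherever the full set of terms in the formula is not available, which mirrors the TRIM operators in the equivalent statements.

```python
def cd_tc(y, s):
    """Trend-cycle component from the odd-s and even-s formulas."""
    N = len(y)
    half = s // 2
    out = [None] * N
    for t in range(half, N - half):
        if s % 2 == 1:
            # s-period centered moving average
            out[t] = sum(y[t + k] for k in range(-half, half + 1)) / s
        else:
            # average of two adjacent s-period moving averages
            out[t] = sum(y[t + k] + y[t + 1 + k]
                         for k in range(-half, half)) / (2 * s)
    return out


y = [float(t) for t in range(1, 13)]  # a purely linear series
print(cd_tc(y, 4))
print(cd_tc(y, 5))
```

For a linear series the interior values of the trend-cycle component reproduce the series itself, which is a convenient check on the weights.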


The CD_S and CDA_S operators compute the seasonal components for the multiplicative and additive models, respectively. First, the trend-cycle component is computed as shown previously. Second, the seasonal-irregular component is computed by SI_t = y_t/TC_t for the multiplicative model and by SI_t = y_t - TC_t for the additive model. The seasonal component is obtained by averaging the seasonal-irregular component for each season.

      S_{k+js} = \sum_{t = k \bmod s} SI_t / (n/s)

where 0 \le j \le n/s and 1 \le k \le s. The seasonal components are normalized to sum to one (multiplicative) or zero (additive).

The CD_I and CDA_I operators compute the irregular component for the multiplicative and additive models respectively. First, the seasonal component is computed as shown previously. Next, the irregular component is determined from the seasonal-irregular and seasonal components as appropriate.

      I_t = SI_t / S_t      (multiplicative)
      I_t = SI_t - S_t      (additive)

The CD_SA and CDA_SA operators compute the seasonally adjusted time series for the multiplicative and additive models, respectively. After decomposition, the original time series can be seasonally adjusted as appropriate.

      \tilde y_t = y_t / S_t = TC_t I_t        (multiplicative)
      \tilde y_t = y_t - S_t = TC_t + I_t      (additive)

The following statements compute all the multiplicative classical decomposition components for the variable X for s=12.

      convert x=tc / transformout=( cd_tc 12 );
      convert x=s  / transformout=( cd_s  12 );
      convert x=i  / transformout=( cd_i  12 );
      convert x=sa / transformout=( cd_sa 12 );

The following statements compute all the additive classical decomposition components for the variable X for s=4.

      convert x=tc / transformout=( cd_tc  4 );
      convert x=s  / transformout=( cda_s  4 );
      convert x=i  / transformout=( cda_i  4 );
      convert x=sa / transformout=( cda_sa 4 );
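
The additive decomposition steps described in this section can be traced with a Python sketch. The function name `additive_decomposition` is hypothetical. The sketch assumes no missing values, leaves the trend-cycle and irregular components missing (`None`) where the full moving average is unavailable, and averages the seasonal-irregular values that are available for each season; these boundary choices are assumptions and may differ from what the procedure does.

```python
def additive_decomposition(y, s):
    """Return (TC, S, I, SA) for the additive model y = TC + S + I."""
    N = len(y)
    half = s // 2
    tc = [None] * N
    for t in range(half, N - half):
        if s % 2 == 1:
            tc[t] = sum(y[t - half: t + half + 1]) / s
        else:
            inner = sum(y[t - half + 1: t + half])
            tc[t] = (y[t - half] + 2 * inner + y[t + half]) / (2 * s)
    # seasonal-irregular component SI_t = y_t - TC_t
    si = [None if tc[t] is None else y[t] - tc[t] for t in range(N)]
    # average SI over each season, using the values that are available
    raw = []
    for k in range(s):
        vals = [si[t] for t in range(k, N, s) if si[t] is not None]
        if not vals:
            raise ValueError("series too short for this seasonal length")
        raw.append(sum(vals) / len(vals))
    shift = sum(raw) / s  # normalize so the s seasonal values sum to zero
    seasonal = [raw[t % s] - shift for t in range(N)]
    irregular = [None if si[t] is None else si[t] - seasonal[t] for t in range(N)]
    adjusted = [y[t] - seasonal[t] for t in range(N)]
    return tc, seasonal, irregular, adjusted


pattern = [3.0, -1.0, -4.0, 2.0]  # quarterly effects that sum to zero
y = [10.0 + 2.0 * t + pattern[t % 4] for t in range(16)]
tc, seasonal, irregular, adjusted = additive_decomposition(y, 4)
print(seasonal[:4])
```

The test series is a linear trend plus an exact quarterly pattern, so the sketch recovers the pattern as the seasonal component and leaves an irregular component of zero wherever it is defined.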

Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.