Working with Time Series Data

Using PROC GPLOT

The following statements use the GPLOT procedure to plot CPI in the USCPI data set against DATE. (The USCPI data set was shown in a previous example; the data set plotted in the following example contains more observations than shown previously.) The SYMBOL statement draws a smooth curve through the plotted points (I=SPLINE) and specifies the plotting symbol (V=CIRCLE) and its height (H=2).

   proc gplot data=uscpi;
      symbol i=spline v=circle h=2;
      plot cpi * date;
   run;

The plot is shown in Figure 2.6.

Figure 2.6: Plot of Monthly CPI Over Time
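
SYMBOL definitions are global in SAS/GRAPH and stay in effect until they are changed or reset, so settings from one step can carry over into the next. As a minimal sketch, assuming the same USCPI data set, the following statements reset the SYMBOL definitions and then connect the points with straight line segments (I=JOIN) instead of a spline. The values V=DOT and H=1 are illustrative choices, not part of the original example.

   /* Clear SYMBOL definitions left over from earlier steps */
   goptions reset=symbol;

   proc gplot data=uscpi;
      /* I=JOIN connects points with straight segments; V=DOT plots filled dots */
      symbol i=join v=dot h=1;
      plot cpi * date;
   run;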

Controlling the Time Axis: Tick Marks and Reference Lines

You can control the spacing of the tick marks on the time axis. The following statements use the HAXIS= option to tell PROC GPLOT to mark the axis at the start of each quarter. (The GPLOT procedure prints a warning message indicating that the intervals on the axis are not evenly spaced. This message simply reflects the fact that quarters do not all contain the same number of days, and it can be ignored.)

   proc gplot data=uscpi;
      symbol i=spline v=circle h=2;
      format date yyqc.;
      plot cpi * date /
           haxis= '1jan89'd to '1jul91'd by qtr;
   run;

The plot is shown in Figure 2.7.

Figure 2.7: Plot of Monthly CPI Over Time
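
The tick-mark list can also be stored in an AXIS definition and referenced by name, which may be convenient when several plots share the same time axis. The following sketch is an alternative to the statements above rather than part of the original example. MINOR=NONE is an added assumption that suppresses the minor tick marks.

   proc gplot data=uscpi;
      symbol i=spline v=circle h=2;
      /* AXIS1 holds the same quarterly tick values as the HAXIS= list above */
      axis1 order=('1jan89'd to '1jul91'd by qtr) minor=none;
      format date yyqc.;
      /* Refer to the axis definition by name */
      plot cpi * date / haxis=axis1;
   run;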

The following example labels the tick marks with year and quarter values. The FORMAT statement causes PROC GPLOT to use the YYQC format to print the date values. This example also shows how to place reference lines on the plot with the HREF= option. Reference lines are drawn to mark the boundaries between years.

   proc gplot data=uscpi;
      symbol i=spline v=circle h=2;
      plot cpi * date /
           haxis= '1jan89'd to '1jul91'd by qtr
           HREF='1jan90'd to '1jan91'd by year;
      format date yyqc6.;
   run;

The plot is shown in Figure 2.8.

Figure 2.8: Plot of Monthly CPI Over Time
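
The appearance of the reference lines can be adjusted with the LHREF= and CHREF= options of the PLOT statement. The following sketch repeats the previous example and draws the year boundaries as dashed gray lines. The line type 2 and the color GRAY are illustrative choices, not part of the original example.

   proc gplot data=uscpi;
      symbol i=spline v=circle h=2;
      plot cpi * date /
           haxis= '1jan89'd to '1jul91'd by qtr
           href='1jan90'd to '1jan91'd by year
           lhref=2        /* line type for the reference lines (2 is dashed) */
           chref=gray;    /* color of the reference lines */
      format date yyqc6.;
   run;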

Overlay Plots of Different Variables

You can plot two or more series on the same graph. To plot series stored in different variables, specify multiple plot requests on one PLOT statement and use the OVERLAY option. Specify a different SYMBOL statement for each plot.

For example, the following statements plot the CPI, FORECAST, L95, and U95 variables produced by PROC ARIMA in a previous example. The SYMBOL1 statement is used for the actual series. Values of the actual series are labeled with a star, and the points are not connected. The SYMBOL2 statement is used for the forecast series. Values of the forecast series are labeled with an open circle, and the points are connected with a smooth curve. The SYMBOL3 statement is used for the upper and lower confidence limit series. The confidence limit points are not plotted, but a broken line is drawn between them. A reference line is drawn to mark the start of the forecast period. Quarterly tick marks with YYQC format date values are used.

   proc arima data=uscpi;
      identify var=cpi(1);
      estimate q=1;
      forecast id=date interval=month lead=12 out=arimaout;
   run;
   
   proc gplot data=arimaout;
      symbol1 i=none   v=star h=2;
      symbol2 i=spline v=circle h=2;
      symbol3 i=spline l=5;
      format date yyqc4.;
      plot cpi * date = 1 
           forecast * date = 2 
           ( l95 u95 ) * date = 3 /
           overlay
           haxis= '1jan89'd to '1jul92'd by qtr
           HREF='15jul91'd ;
   run;

The plot is shown in Figure 2.9.

Figure 2.9: Plot of ARIMA Forecast
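
The overlay pattern is easier to see with fewer series. The following minimal sketch, assuming the ARIMAOUT data set created by the PROC ARIMA step above, overlays only the actual and forecast series. The number after the equal sign in each plot request selects the SYMBOL definition to use. The colors and the use of I=JOIN are illustrative choices, not part of the original example.

   /* Clear earlier SYMBOL definitions so that only the two below apply */
   goptions reset=symbol;

   proc gplot data=arimaout;
      symbol1 i=none v=star h=2 c=black;   /* actual values: points only */
      symbol2 i=join v=none c=blue;        /* forecasts: connecting line only */
      format date yyqc4.;
      plot cpi * date = 1
           forecast * date = 2 /
           overlay;
   run;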

Overlay Plots of Interleaved Series

You can also plot several series on the same graph when the different series are stored in the same variable in interleaved form. Plot interleaved time series by using the values of the ID variable to distinguish the different series and by selecting different SYMBOL statements for each plot.

The following example plots the output data set produced by PROC FORECAST in a previous example. Because the residual series has a different scale from the other series, it is excluded from the plot with a WHERE statement.

The _TYPE_ variable is used on the PLOT statement to identify the different series and to select the SYMBOL statements to use for each plot. The first SYMBOL statement is used for the first sorted value of _TYPE_, which is _TYPE_=ACTUAL. The second SYMBOL statement is used for the second sorted value of the _TYPE_ variable (_TYPE_=FORECAST), and so forth.

   proc forecast data=uscpi interval=month lead=12
                 out=foreout outfull outresid;
      var cpi;
      id date;
   run;
   
   proc gplot data=foreout;
      symbol1 i=none   v=star h=2;
      symbol2 i=spline v=circle h=2;
      symbol3 i=spline l=20;
      symbol4 i=spline l=20;
      format date yyqc4.;
      plot cpi * date = _type_ /
           haxis= '1jan89'd to '1jul92'd by qtr
           HREF='15jul91'd ;
      where _type_ ^= 'RESIDUAL';
   run;

The plot is shown in Figure 2.10.

Figure 2.10: Plot of Forecast
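
Because the SYMBOL statements are matched to the sorted values of _TYPE_, it can help to list those values before plotting. The following sketch, assuming the FOREOUT data set created above, uses PROC FREQ, which by default reports the values in sorted order.

   /* List the distinct _TYPE_ values in sorted order */
   proc freq data=foreout;
      tables _type_;
   run;

In character sort order, RESIDUAL falls between L95 and U95. If the WHERE statement did not exclude it, RESIDUAL would be the fourth value and U95 the fifth, so the SYMBOL assignments would shift accordingly.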

Residual Plots

The following example plots the residual series that was excluded from the plot in the previous example. The SYMBOL statement specifies a needle plot, so that each residual point is plotted as a vertical line showing deviation from zero.

   proc gplot data=foreout;
      symbol1 i=needle v=circle width=6;
      format date yyqc4.;
      plot cpi * date /
           haxis= '1jan89'd to '1jul91'd by qtr ;
      where _type_ = 'RESIDUAL';
   run;

The plot is shown in Figure 2.11.

Figure 2.11: Plot of Residuals
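
A horizontal reference line at zero can make the sign of each residual easier to read. The following sketch repeats the previous example and adds the VREF= option of the PLOT statement. The reference line is an addition for illustration and is not part of the original example.

   proc gplot data=foreout;
      symbol1 i=needle v=circle width=6;
      format date yyqc4.;
      plot cpi * date /
           vref=0         /* horizontal reference line at zero */
           haxis= '1jan89'd to '1jul91'd by qtr ;
      where _type_ = 'RESIDUAL';
   run;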


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.