SOME USEFUL OPERATIONS IN SHAZAM AND REGRESSION ANALYSIS

Contents:
1. Plots
2. Time Trends
3. Converting Monthly to Quarterly Data
4. Creating Indicator (i.e., Dummy) Variables
5. Creating Seasonal Indicator Variables
6. Creating Lagged Variables
7. Adjusting for inflation
8. Adjusting for population growth
9. Using more than one datafile

1. PLOT COMMAND

This command produces a graph of one variable plotted against another, or of one variable plotted against time. It lets you judge by eye whether a relationship is better modelled as linear or curvilinear. To plot Y against X, the command is:

    PLOT Y X / options

To plot Y against time, the command is:

    PLOT Y / TIME options

For more details issue the command HELP PLOT (press the "pause" key after you issue the help command to stop the text from flashing past you on the screen, and the "print screen" key to resume scrolling).

2. TIME TRENDS

You may wish to include a time trend as an independent variable to try to capture the effect of factors on which you have no data. If you issue the command:

    GENR T = TIME(x)

a time index named T is created whose first value is x+1. Thus if you have annual data beginning in 1971, the command:

    GENR T = TIME(1970)

creates a variable T giving you the year of each observation. If you have quarterly data, the command:

    GENR T = TIME(0)

creates a variable T giving you the observation number: 1 for the first quarter in the sample, 2 for the second, and so on.

3. CONVERTING MONTHLY TO QUARTERLY DATA

Suppose you have 96 monthly observations on X in the file "mdata" on your A: drive which you need to convert to 32 quarterly observations.
Do the following: Type into a program file on your A: drive:

    file 4 A:mdata
    file 11 A:data
    file 12 A:qdata
    file 13 A:adata
    READ (4) X
    SMPL 1 32
    GENR X1 = (SUM(X,3))/3
    READ (12) Q
    PRINT X X1
    WRITE (13) Q X1
    STOP

This program takes the monthly data in MDATA, converts it to quarterly data, then takes the quarterly data in QDATA and puts the combined dataset into the file ADATA. Dividing by 3 gives the quarterly average, which is appropriate when X is a stock, a price index or a rate. Do not divide by 3 if X is a flow (such as monthly sales), because the quarterly value is then the sum of the three monthly values. The references to the A drive are for an IBM PC. For a MAC, the name of the disk would replace `A'.

4. INDICATOR / DUMMY VARIABLES

Such variables are used as regressors when we believe the constant term differs between one part of our dataset and the rest. Indicator / dummy variables are 1 for one part of our observations and 0 for the rest. There are two ways of generating such variables in SHAZAM. For example, suppose we have annual data from 1961 to 1987 and we want an indicator / dummy variable which is 0 before 1973 (the date of the oil crisis) and 1 from 1973 to 1987. Use either of:

    GENR D = (YEAR.GE.1973)
    GENR D = DUM(YEAR-1972)

To use these commands you must have read or created a variable called YEAR which gives the year of each observation. In the first command .GE. means greater than or equal to. The command creates a variable which is 1 when the expression in brackets is true and 0 when it is false. Other logical operators which can be used are .EQ., .NE., .GT., .LT., .LE., .AND., .OR. and .NOT.. In the second command DUM(x) is 1 when x is positive and 0 otherwise, so DUM(YEAR-1972) is 1 from 1973 onwards.

5. SEASONAL INDICATOR VARIABLES

With quarterly data it is often useful to allow the constant term to vary from season to season. This can be done by creating seasonal indicator variables with the command:

    MATRIX S = SEAS(nob,nseas)

For nob put the number of observations you want and for nseas put the number of seasons you want. With nseas equal to 4, the variables S:1, S:2, S:3 and S:4 are the four seasonal indicator variables. Only THREE of these variables can be used in an OLS command that also includes a constant term, because the four together sum to 1 for every observation and would be perfectly collinear with the constant.

6. LAGGED VARIABLES

Sometimes theory or common sense suggests that the previous value of a dependent variable may be important in explaining its current value. The command:

    GENR LQ = LAG(Q)

creates a variable whose value in period t is equal to the value of Q in period t-1. LQ can be used as a regressor, but remember to change the SMPL command to SMPL 2 N (where N is the number of observations) before doing the regression, because the first observation of LQ is set equal to 0.

7. USE OF PRICE INDICES

Time series such as commodity prices, wages or incomes should in many contexts be adjusted to remove the effects of overall inflation on the purchasing power of the dollar. The usual procedure is to divide by the consumer price index or some other price index appropriate to the situation.

8. USE OF PER CAPITA VARIABLES

In some contexts it is appropriate to divide a variable by the population involved. For example, in comparing welfare among provinces, it is provincial domestic income per capita that is important, not provincial domestic income.

9. USING MORE THAN ONE DATAFILE

More than one datafile can be read during a SHAZAM program. Assign a number to each datafile you wish to read. The numbers can be 4 or any number from 11 to 89. A separate READ command will be needed in your program for each datafile you wish to read.

SHAZAM commands

(a) smpl 1 120
    - Set sample range to 1, 120.
(b) file 11 a:data
    - Assign unit 11 to the file data in the a: drive.
(c) read(11) x y z
    - Read variables x, y & z from unit 11 (i.e., a:data).
(d) print x y z / beg=1 end=20
    - Print the first 20 observations of x, y & z.
(e) stat x y z / pcor
    - Report summary statistics on x, y & z.
    - Print the correlation matrix.
(f) plot y / time
    - Plot the time-series y against time.
(g) plot y x
    - Plot y against x.
(h) genr lnx = log(x)
    - Take the natural log of x and store in a variable lnx.
(i) genr t = time(0)
    - Generate a linear time trend and store in t.
(j) genr x2 = x**2
    - Take the square of x and store in x2.
(k) genr d = ((time(0).ge.110).and.(time(0).le.115))
    - Generate a dummy variable which is equal to 1 from observation 110 to 115, and 0 otherwise.
(l) smpl 2 120
    genr lagp = lag(p)
    genr inf = ((p - lagp) / lagp)*100
    - Compute the percentage rate of change of p.
    - If p is the price level, inf is the inflation rate.
(m) ols y x / auxrsqr rstat resid=err predict=yhat
    - Regress y on x. Report auxiliary R-squared and residual statistics.
    - Store the residuals in err and the fitted values in yhat.
(n) stop
    - End the SHAZAM run.

Sample Program

The following sample program produces all the statistics and output needed to answer the questions stated in the project outline. It assumes that you can look up the commands and tests it uses in the textbook.

* ---------------------------------------------------- Specify the output file
file output a:output
* ------------------------------------------------------------- Documentation
* Name  : Put your name here
* I.D.  : Your student id number
* Tut.# : Please make sure that the tutorial number is correct
* T.A.  : M.H. Franco Wong
* Date  : What date is it?
*
* y   = Sales of clothing in Montreal, current $       D450541
* x1  = Personal income, current $                     D10111
* x2  = CPI for clothing in Montreal (1981=100)        D486183
* cpi = CPI for all items in Canada (1981=100)         D484000
* x3  = Bank rate                                      B14006
* ---------------------------------------------------------- Read monthly data
file 11 a:data
smpl 1 120
read(11) month y x2 cpi x3
print y x2 cpi x3 / beg=1 end=20
print y x2 cpi x3 / beg=101 end=120
* ------------------------------------------ Convert monthly data to quarterly
smpl 1 40
genr y = sum(y,3)
genr x2 = sum(x2,3)/3
genr cpi = sum(cpi,3)/3
genr x3 = sum(x3,3)/3
* --------------------------------------------------------- Read quarterly data
read(11) quarter x1
* -------------------------------------------------- Transform data (deflating)
genr y = (y/cpi)*100
genr x1 = (x1/cpi)*100
genr x2 = (x2/cpi)*100
print y x1 x2 x3
plot y / time nopretty
* ---------------------------------------------------------- Multicollinearity
stat x1 x2 x3 / pcor
plot x1 x2 / nopretty
plot x1 x3 / nopretty
plot x2 x3 / nopretty
* ------------------------------------------------------------- OLS estimation
ols y x1 x2 x3 / auxrsqr rstat resid=err predict=yhat
* --------------------------------------------------------- Heteroskedasticity
genr errsq = err**2
plot errsq yhat / nopretty
plot errsq x1 / nopretty
plot errsq x2 / nopretty
plot errsq x3 / nopretty
ols errsq yhat / noanova
gen1 lmstat = $n * $r2
print lmstat
* ------------------------------------------------------------ Autocorrelation
plot err / time nopretty
stop
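
The arithmetic behind several of the GENR commands in the sample program (summing or averaging blocks of three months, deflating by a price index, lagging, and the percentage rate of change) can be checked on a few numbers outside SHAZAM. What follows is a minimal sketch in Python, not SHAZAM code. The function names are made up for illustration, and it assumes that SUM(x,3) under a shortened SMPL range gives the sums of consecutive, non-overlapping blocks of three observations, which is how this handout uses it.

```python
# Illustration only: plain-Python versions of the arithmetic described in the
# handout. These helper names are hypothetical; they are not SHAZAM commands.

def block_sum(values, k):
    """Sum each consecutive, non-overlapping block of k observations.

    Assumed to mirror the handout's use of SUM(x,3) for flows such as sales.
    """
    if len(values) % k != 0:
        raise ValueError("number of observations must be a multiple of k")
    return [sum(values[i:i + k]) for i in range(0, len(values), k)]


def block_mean(values, k):
    """Average each block of k observations (for indexes, rates and stocks)."""
    return [total / k for total in block_sum(values, k)]


def deflate(nominal, price_index, base=100.0):
    """Convert current-dollar values to constant dollars: (y / cpi) * 100."""
    return [y / p * base for y, p in zip(nominal, price_index)]


def lag(values, fill=0.0):
    """Shift a series back one period; the first value is set to `fill` (0)."""
    return [fill] + list(values[:-1])


def pct_change(values):
    """Percentage change from the previous period, from observation 2 on."""
    return [(cur - prev) / prev * 100 for prev, cur in zip(values, values[1:])]


if __name__ == "__main__":
    # Six made-up monthly observations, i.e. two quarters.
    monthly_sales = [10.0, 12.0, 14.0, 20.0, 22.0, 24.0]
    monthly_cpi = [100.0, 100.0, 100.0, 110.0, 110.0, 110.0]

    q_sales = block_sum(monthly_sales, 3)   # flow: add the three months
    q_cpi = block_mean(monthly_cpi, 3)      # index: average the three months

    print("quarterly sales:", q_sales)
    print("quarterly cpi:  ", q_cpi)
    print("real sales:     ", deflate(q_sales, q_cpi))
    print("lagged cpi:     ", lag(q_cpi))
    print("inflation (%):  ", pct_change(q_cpi))
```

The sketch keeps the sum and the average as separate helpers because the choice between them depends on the kind of series: the sample program sums clothing sales but averages the two price indexes and the bank rate.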