Economics 435
Summer 1994
D. Maki

MIDTERM EXAMINATION

Instructions: There are three parts to this examination, each with two questions. Answer one question from each part (a total of three questions). Questions are equally weighted. Total time allowed = three hours.

Part I. Answer one of the following two questions.

1. The following table gives data on expenditure on tobacco (Y) and disposable personal income (X) for the United Kingdom for the years noted.

    Year     Y       X    |  Year     Y       X
    1962   2701   51484   |  1972   2747   70214
    1963   2787   53684   |  1973   2918   75059
    1964   2753   55754   |  1974   2885   74049
    1965   2652   56970   |  1975   2748   74005
    1966   2737   58278   |  1976   2653   73437
    1967   2753   59226   |  1977   2523   72288
    1968   2740   60367   |  1978   2746   78259
    1969   2707   60576   |  1979   2731   83666
    1970   2702   62485   |  1980   2685   84771
    1971   2605   64544   |  1981   2492   82903

A variable T was generated using the SHAZAM command: genr t=time(0). The following output is reported:

    |_ols y x t

     R-SQUARE =   0.5593     R-SQUARE ADJUSTED =   0.5074

    VARIABLE    ESTIMATED     STANDARD     T-RATIO
      NAME     COEFFICIENT     ERROR        17 DF
    X          0.30500E-01   0.7087E-02      4.303
    T          -57.714       12.60          -4.579
    CONSTANT    1257.4       351.4           3.578

    |_ols y x

     R-SQUARE =   0.0156     R-SQUARE ADJUSTED =  -0.0391

    VARIABLE    ESTIMATED     STANDARD     T-RATIO
      NAME     COEFFICIENT     ERROR        18 DF
    X         -0.11883E-02   0.2225E-02     -0.5340
    CONSTANT    2793.6       152.2          18.36

    |_ols y t

     R-SQUARE =   0.0791     R-SQUARE ADJUSTED =   0.0280

    VARIABLE    ESTIMATED     STANDARD     T-RATIO
      NAME     COEFFICIENT     ERROR        18 DF
    T          -4.7602       3.827          -1.244
    CONSTANT    2763.2       45.85          60.27

a. Interpret the results of the estimations in economic terms. Is there anything which is "surprising"? Explain.
b. What econometric problems do you suspect are present in the above estimations? (Examine the individual observations carefully.) Explain.

2. Assume a student estimated three models explaining the death rate due to coronary heart disease in the U.S. using annual data covering the period 1947-80 (n = 34), with t-values shown in parentheses.

    Variable       Model 1   Model 2   Model 3
    Constant       226.002   247.004   139.678
                    (1.54)    (1.94)    (1.79)
    CAL            -69.983   -77.762
                   (-0.89)   (-1.06)
    CIG             10.116    10.640    10.706
                    (2.00)    (2.32)    (2.33)
    UNEMP           -0.613
                   (-0.39)
    EDFAT            2.810     2.733     3.380
                    (1.68)    (2.40)    (3.50)
    MEAT             0.112
                    (0.46)
    SPIRITS         21.716    23.650    26.749
                    (2.57)    (3.11)    (3.80)
    BEER            -3.467    -3.489    -4.132
                   (-2.67)   (-4.27)   (-4.79)
    WINE            -4.562
                   (-0.28)
    Adjusted R2      0.645     0.674     0.672

The variables are defined as:

    CAL     = per capita consumption of calcium
    UNEMP   = percent of civilian labor force unemployed
    CIG     = per capita consumption of cigarettes
    EDFAT   = per capita intake of edible fats and oils
    MEAT    = per capita consumption of meat (beef, veal, pork, and mutton)
    SPIRITS = per capita consumption of distilled spirits
    BEER    = per capita consumption of malted liquor
    WINE    = per capita consumption of wine

a. Interpret the results of the estimations. Is there anything which is "surprising"? Explain.
b. What econometric problems do you suspect are present in the estimations? Explain.
c. If you had access to the requisite data, what additional estimations (if any) would you perform? Explain.

Part II. Answer one of the following two questions.

3. Answer all three (unrelated) parts of this question:

a. In the equation

    Y = a + B1X1 + B2X2 + B3X3 + u,

explain how you would test the joint hypotheses B1=B2 and B3=1.
b. True, false or uncertain: "Compared with the unconstrained regression, estimation of a least squares regression under a constraint will result in a higher R2 if the constraint is true and a lower R2 if the constraint is false". Explain.
c. Explain how you would estimate a linear regression equation which is piecewise linear with a joint (or knot) at X = Xo if: (i) Xo is known, and (ii) Xo is unknown.

4. Answer both (unrelated) parts of this question.

a. Assume you have biannual data (2 observations per year), and you define a seasonal dummy variable, D1, equal to unity for first-half observations and zero otherwise.
Consider the two procedures:

    I.   Y = A + B1D1 + B2X + u

    II.  Y = a1 + a2D1 + v
         X = a3 + a4D1 + w
         (Note: v and w are seasonally adjusted Y and X, respectively.)
         v = b1 + b2w + u'

Derive the relationship between B1 and b1.
b. Given two models:

    I.   Y = ao + a1X1 + a2X2 + a3X3 + a4X4 + e
    II.  Y = bo + b1X1 + b2X2 + b3X5 + b4X6 + v

how would you test which model is preferable? Be specific about how you would interpret the results of your tests.

Part III. Answer one of the following two questions.

5. Assume you have micro data (observations on individuals) giving annual income (Y) in some year; educational attainment measured as highest level = (i) less than high school completion, (ii) high school diploma, (iii) some post-secondary, but no university degree, or (iv) university degree; gender; and age measured as (i) less than 19 years, (ii) 20-35 years, (iii) 36-55 years, or (iv) 56 years and over. Assume you want a model which makes Y = f(education, gender, age). Explain how you would set up the model, define the variables, and do the estimation. How would you test the hypotheses:

a. Education does not affect Y.
b. Gender does not affect Y.
c. Age does not affect Y.
d. The effect of age on Y is nonlinear.
e. The effect of education on income differs between males and females.

6. Assume we wish to explain the geographical mobility of persons between provinces, and that we have data as of some year on the net flows of persons for that year between each pair of provinces (45 observations). Assume that all net flows are recorded as positive numbers. Assume further that the relevant theory suggests that persons move in response to basic factors (differences in income levels, differences in employment probabilities, costs of moving proxied by distance) as well as non-basic factors (climate; educational, health and cultural facilities; opportunities for recreation).

a. What variables would you use to estimate a regression model based on this theory?
Please define your variables carefully and completely.
b. In terms of your model, how would you test the hypothesis that the basic factors are more important than the non-basic factors? Explain.
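
Supplementary note on Question 1. The three SHAZAM regressions reported there can be re-estimated from the printed data. What follows is only a minimal sketch under stated assumptions: it uses plain Python rather than SHAZAM, it assumes that genr t=time(0) produces T = 1, 2, ..., 20, and the helper names mean, solve and ols are illustrative and do not come from the exam. It fits each model by ordinary least squares in deviation-from-mean form and prints the coefficients and R-squared for comparison with the reported output.

```python
# Sketch: re-estimate the Question 1 regressions by ordinary least squares.
# Assumptions (not stated in the exam): T = 1..20 for genr t=time(0);
# helper names are hypothetical; only the Python standard library is used.

Y = [2701, 2787, 2753, 2652, 2737, 2753, 2740, 2707, 2702, 2605,
     2747, 2918, 2885, 2748, 2653, 2523, 2746, 2731, 2685, 2492]
X = [51484, 53684, 55754, 56970, 58278, 59226, 60367, 60576, 62485, 64544,
     70214, 75059, 74049, 74005, 73437, 72288, 78259, 83666, 84771, 82903]
T = list(range(1, len(Y) + 1))


def mean(values):
    return sum(values) / len(values)


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(y, *regressors):
    """OLS with a constant term. Returns (slopes, constant, r_squared)."""
    ybar = mean(y)
    xbars = [mean(col) for col in regressors]
    dy = [v - ybar for v in y]
    dx = [[v - xb for v in col] for col, xb in zip(regressors, xbars)]
    # Normal equations in deviation form: (dx'dx) b = dx'dy
    a = [[sum(p * q for p, q in zip(ci, cj)) for cj in dx] for ci in dx]
    b = [sum(p * q for p, q in zip(ci, dy)) for ci in dx]
    slopes = solve(a, b)
    constant = ybar - sum(s * xb for s, xb in zip(slopes, xbars))
    fitted = [sum(s * col[i] for s, col in zip(slopes, dx))
              for i in range(len(y))]
    ssr = sum((d - f) ** 2 for d, f in zip(dy, fitted))
    sst = sum(d ** 2 for d in dy)
    return slopes, constant, 1 - ssr / sst


for label, regs in [("ols y x t", (X, T)), ("ols y x", (X,)), ("ols y t", (T,))]:
    slopes, constant, r2 = ols(Y, *regs)
    print(label, [round(s, 6) for s in slopes], round(constant, 1), round(r2, 4))
```

Working in deviations from the means keeps the normal equations reasonably well conditioned even though X and T are strongly trended together. Standard errors and t-ratios are left out to keep the sketch short.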