BUEC 333
Summer 2001
D. Maki

FINAL EXAMINATION - A

Important - record on the top front of your answer sheet the letter "A", "B" or "C" from the examination title above.

This examination consists of 30 multiple choice questions, with a value of 5 points each, for a total of 150 points on the examination. Choose the letter corresponding to the one best answer to each question. Your grade will be computed on the basis of the number of correct answers. The exams will be machine graded, and making the answers legible to the machine is your responsibility. Use soft pencil (HB or softer) only. Fill in the appropriate circles corresponding to your name on the answer sheet.

This is a closed book examination - no notes are allowed. A formula sheet and tables are attached. Total time allowed = 3 hours.

Infinite populations are assumed in all questions where population size is relevant.

Note: the notation X**2 means X-squared.

1. A statistician constructed 95% confidence intervals for predictions based on values of the X variable of 3, 4 and 5. If the sample size was 50 and the sum of the X's was 190, which of the three X values would provide the widest interval?
   a. 3
   b. 4
   c. 5
   d. they would all be the same width
   e. cannot be determined from the given information

2. The standard error of estimate for a simple (one X) regression equation, given by the square root of SSE/(n-2), equals 18.69 for an estimation based on 10 observations. If the variance of the 10 X values was 18 and the variance of the 10 Y values was 27, what is the estimated standard error of the slope coefficient, b?
   a. 1.47
   b. 0.68
   c. 1.39
   d. 4.41
   e. cannot be determined from the given information

Questions 3 and 4 refer to the following information.

A college admission office is interested in seeing how well the number of applications received by January 1 predicts the number of new students entering in the fall.
The admission officer knows from a study conducted 10 years ago that the slope of the regression line for predicting new enrollment was 0.85 at that time. To see if the slope has changed she uses annual data for the 10 years since the previous study. She finds that the standard error of the estimated regression slope coefficient is 0.06.

3. Testing whether the slope is the same as it was 10 years ago, at the 0.05 level, what will be the upper and lower limits of the acceptance region?
   a. 0.612, 0.888
   b. 0.717, 0.983
   c. 0.712, 0.988
   d. 0.79, 0.91
   e. cannot be determined from the information given

4. If SSR = 11.25, what was the value of the estimated slope coefficient?
   a. about 0.8
   b. about 0.7
   c. about 0.6
   d. about 0.5
   e. cannot be determined from the information given

5. To test whether the mean number of units sold per day was equal to 7, a random sample of daily sales by a company for 9 different days was taken, producing a sample mean of 6.0. If the variance of daily sales is known to be 4, what is the p-value associated with this test?
   a. 0.668
   b. 0.1336
   c. greater than 0.40
   d. less than 0.10
   e. cannot be determined from the information given

6. Consider a Mann-Whitney test with the following values: n1=12, n2=15, R1=192, R2=186. If all of the observations came from identical populations, what is the mean of the U statistic?
   a. 66
   b. 90
   c. 114
   d. 420
   e. none of the above

Questions 7-10 use the following information.

To model teacher effectiveness, a principal uses class average test scores on a standardized exam to indicate effectiveness and uses two independent variables, gender of teacher (a dummy variable that = 1 if the teacher is male, 0 otherwise) and years of experience. The data are:

   Average test score   Gender of teacher (X1)   Experience (X2)
           4                      M                    10
           5                      F                     2
           3                      M                    12
           1                      M                     8
           2                      F                     1
           4                      M                    12
           2                      M                     8

7. For this problem, Sum(X1Y) is:
   a. 105
   b. 158
   c. 21
   d. 14
   e. none of the above

8. To solve for the least squares coefficients, the equation with Sum(Y) on the left hand side is:
   a. 21=5a+5b1+53b2
   b. 14=7a+5b1+50b2
   c. 21=7a+5b1+53b2
   d. 14=5a+5b1+50b2
   e. none of the above

9. The most obvious potential problem with this estimation is:
   a. multicollinearity
   b. nonlinearity
   c. autocorrelation
   d. heteroscedasticity
   e. bias due to the use of a dummy variable

10. Assume for this estimation the equation turns out to be: Y-hat=2.636-5.594X1+0.576X2. Then the predicted test score for a student taught by a female teacher with 5 years of experience is:
   a. 3.212
   b. -0.078
   c. -2.382
   d. 5.516
   e. none of the above

11. A really tough-marking professor believes that in his class 5% of students should receive A's, 15% B's, 30% C's, 30% D's and 20% F's. In a class of 150, there were 10 A's, 30 B's, 36 C's, 50 D's and 24 F's. Using a goodness of fit test, what is the value of the calculated test statistic?
   a. 6.75
   b. 1.363
   c. 21.334
   d. 16.919
   e. 6.889

12. The estimating equation developed by the method of least squares does NOT ensure that:
   a. the values of the slope and intercept are unique
   b. the sum of the errors of the observed points around the regression line is zero
   c. outlier data points are more influential than other data points
   d. the observed errors are uncorrelated with all of the X's
   e. the observed errors are uncorrelated with the Y variable

Questions 13 and 14 use the following information.

A rock shop sells four types of rocks and minerals. The following gives data for three years on the price per gram and quantity sold (thousands of grams).

                    1998               1999               2000
   Mineral     Price  Quantity    Price  Quantity    Price  Quantity
   Agate        .3       3         .3       5         .3       5
   Amethyst     .25      2         .3       4         .3       3
   Hematite     .25      2         .2       3         .25      4
   Pyrite       .2       5         .25      5         .25      6

13. What is the Laspeyres price index for 1998 using 2000 as the base year?
   a. 56.3
   b. 177.6
   c. 90.8
   d. 110.1
   e. 95.9

14. What is the Laspeyres quantity index for 1999 using 1998 as the base year?
   a. 177.6
   b. 146.6
   c. 150.0
   d. 110.1
   e. 148.8

15. Someone calculated that no customers entered a store during a half-hour period 92 times out of 240 samples. If he wants to test whether no customers enter in a half-hour period less than 40% of the time, what is the p-value for this test?
   a. .2981
   b. .7019
   c. .7981
   d. .2019
   e. cannot be determined from the information given

16. For matched pairs data, testing the hypothesis that the variances of the observations on X and Y are equal versus the alternative that the variance of X is greater than the variance of Y:
   a. can be done with an F test, as long as the variance of X is divided by the variance of Y
   b. can be done with an F test, as long as the larger estimated variance is put in the numerator
   c. can be done by testing whether the correlation of (X-Y) and (X+Y) is positive
   d. can be done by testing whether the correlation of (X-Y) and (X+Y) is negative
   e. cannot be done with any technique discussed in Newbold

Questions 17 and 18 use the following information.

An ambidextrous basketball star wishes to analyze her free-throw shooting right-handed and left-handed. She takes two random samples of 50 shots each, and makes 46 right-handed and 42 left-handed.

17. Testing whether the percentages right-handed and left-handed differ, the standard error is:
   a. .0011
   b. .0325
   c. .0042
   d. .0650
   e. 1.230

18. The alternative hypothesis that the proportion of left-handed free throws made is less than .85 is tested with alpha=.01. If the actual proportion is 0.80, what is the power of the test?
   a. .1170
   b. .0901
   c. .9099
   d. .8830
   e. .5000

19. A researcher estimates a multiple regression equation with five independent variables and an intercept using 100 observations of quarterly data, obtaining an R-square value of 0.84. To test for seasonality, three dummy variables are added, and the R-square increases to 0.93. Testing the null hypothesis that there is no seasonality, the calculated test statistic is:
   a. 0.390
   b. 3.90
   c. 39.00
   d. 390.00
   e. cannot be determined from the information given

20. It is desired to test whether a time series of observations is random. The observations (in time order) are: 17, 27, 18, 19, 52, 43, 22, 33, 35, 40. Testing the one-tailed alternative hypothesis of a positive association between adjacent observations, the p-value is:
   a. 0.040
   b. 0.167
   c. 0.357
   d. 0.643
   e. impossible to determine from the information given

21. An ichthyologist wants to test two types of fish food to determine if they have different effects on weight gain. He takes 40 pairs of fish (each pair of the same species and age) and feeds one fish of each pair food A and the other food B. After two weeks, the fish are weighed to see which of the pair is heavier. Food A produced the heavier fish in 26 pairs, food B in 14 pairs. For a sign test with these data, the upper limit of a 95% acceptance region is:
   a. 0.350
   b. 0.148
   c. 0.648
   d. 0.822
   e. 0.655

22. Suspecting that men and women appreciate different characteristics of automobiles, a group of men was asked to rank 6 automobiles, 1=best, 6=worst; and then a group of women was asked to do the same. The results were (both presented in the same vehicle order):

   Men's rank     1  2  3  4  5  6
   Women's rank   4  3  5  2  1  6

   The Spearman's rank correlation for these data is:
   a. 0.7
   b. 0.07
   c. 0.3
   d. 0.03
   e. greater than 0.7

Questions 23 and 24 refer to the following information.

Information is available on 15 successful dot.com companies giving the age of the CEO - call this sample 1, and on 9 failed dot.com companies giving the age of the CEO - call this sample 2. The mean for sample 1 = 21, and the estimated variance is 87. The mean for sample 2 = 29, and the estimated variance is 95.

23. Testing to see if the mean ages of CEOs in the two types of companies are equal, the calculated test statistic is:
   a. 1.98
   b. 1.96
   c. significant at the .05 level
   d. less than 1.50
   e. 2.00

24. Testing to see if the variances are equal in the two samples at the .10 level, the conclusion is:
   a. accept the null hypothesis since 1.09 < 2.70
   b. accept the null hypothesis since 1.09 < 3.22
   c. accept the null hypothesis since the calculated test statistic is < 1.00
   d. reject the null hypothesis
   e. none of the above

25. A researcher used data on 150 households to investigate the quantity of oranges purchased per year (Y). The independent variables were price of oranges (PO), price of grapefruit (PG), total expenditure (E) and number of persons in the household (NP). The results were (standard errors in parentheses):

   Y = 3.1  + 0.83E + 0.1PO - 0.56PG + 12.5NP
      (1.0)  (0.83)  (0.02)  (0.06)   (15.2)

   R-square=0.20

   Which of the following is a reasonable interpretation?
   a. the R-square is too low
   b. none of the slope coefficients are significant
   c. this appears to be a good estimation
   d. household size is a silly variable, and should be dropped
   e. there must be a mistake - the coefficient of E and its standard error are the same

26. In a monthly time series, the first 13 observations (in order) are: 5, 7, 4, 9, 13, 8, 10, 6, 10, 12, 15, 17, 20. In obtaining a centred moving average for the purposes of calculating a seasonal index by the ratio-to-moving-average method, the first observation in the CMA is:
   a. 7.25
   b. 8.50
   c. 9.75
   d. greater than 10
   e. none of the above

27. In a regression of 19 time series observations with 5 independent variables, if one tests the null hypothesis of no autocorrelation against a two-tailed alternative using the .05 Durbin-Watson table, the conclusion is:
   a. it is impossible to test the null hypothesis
   b. it is impossible to clearly accept the null hypothesis
   c. it is impossible to clearly reject the null hypothesis
   d. it is impossible to end up with an indeterminate result
   e. none of the above

28. Given the four data points: {X=-2, Y=2}, {X=-1, Y=0}, {X=1, Y=0} and {X=2, Y=2}, regression of Y on X would disclose:
   a. a positive relationship for Y=a+b*X+e
   b. a negative relationship for Y=a+b*X+e
   c. a negative b1 and positive b2 for Y=a+b1*X+b2*X**2
   d. a positive b1 and negative b2 for Y=a+b1*X+b2*X**2
   e. none of the above

Questions 29 and 30 refer to the following information.

A government administrator selected a sample of 8 hospitals from the large number in operation and obtained the results shown below when he regressed each hospital's operating budget (in $ millions) for 1999 (Y) on its bed size (X):

   Y-hat=0.83 + 0.0124X
   MSE=SSE/(n-2)=0.27
   Sum(X - X-bar)**2=805,000
   X-bar=380

29. The 90% confidence interval for the slope coefficient is:
   a. -.9972 to 1.0220
   b. -.3998 to .4246
   c. -.0211 to .0459
   d. .0113 to .0135
   e. .0118 to .0130

30. A ninth hospital is selected from the population of operating hospitals. If the hospital has X = 500 beds, then a 90% prediction interval for its operating budget for 1999 is:
   a. 5.95 to 8.11
   b. 4.84 to 7.56
   c. 5.67 to 8.39
   d. 6.47 to 7.59
   e. 6.76 to 7.30
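
Worked sketch for Questions 29 and 30 (not part of the original examination). The Python below illustrates the usual simple-regression formulas for a slope confidence interval and a prediction interval for a single new observation, using only the figures given above. The critical value 1.943 is an assumption read from a standard t table (6 degrees of freedom, 0.05 in each tail), and the function names are hypothetical, chosen for illustration only.

```python
import math

# Figures given in the information for Questions 29 and 30.
n = 8                 # number of hospitals in the sample
intercept = 0.83      # estimated intercept
slope = 0.0124        # estimated slope
mse = 0.27            # SSE/(n-2)
sxx = 805_000         # Sum(X - X-bar)**2
x_bar = 380           # mean bed size

# Assumption: t critical value taken from a standard t table,
# n - 2 = 6 degrees of freedom, 0.05 in each tail (90% two-sided).
T_CRIT = 1.943


def slope_interval():
    """90% confidence interval for the slope coefficient."""
    se_b = math.sqrt(mse / sxx)          # standard error of the slope
    half_width = T_CRIT * se_b
    return slope - half_width, slope + half_width


def prediction_interval(x_new):
    """90% prediction interval for one new observation at x_new."""
    y_hat = intercept + slope * x_new
    # The leading 1 accounts for the variability of the new observation itself.
    se_pred = math.sqrt(mse * (1 + 1 / n + (x_new - x_bar) ** 2 / sxx))
    half_width = T_CRIT * se_pred
    return y_hat - half_width, y_hat + half_width


if __name__ == "__main__":
    print("slope interval:", slope_interval())
    print("prediction interval at X = 500:", prediction_interval(500))
```

Dropping the leading 1 inside the square root would instead give the narrower interval for the mean budget of all 500-bed hospitals, which is not what Question 30 asks for.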