Assignment Answers
The Answers to Assignment #1
Here's the output:
asking price
Mean 56112.34568
Standard Error 1813.565466
Median 52900
Mode 49900
Standard Deviation 16322.08919
Sample Variance 266410595.7
Kurtosis 0.429181538
Skewness 0.781429828
Range 74000
Minimum 25900
Maximum 99900
Sum 4545100
Count 81
Confidence Level(95.0%) 3609.113807
selling price
Mean 51939.44444
Standard Error 1751.837902
Median 48000
Mode 45000
Standard Deviation 15766.54112
Sample Variance 248583818.8
Kurtosis 0.381142789
Skewness 0.797841069
Range 70000
Minimum 25000
Maximum 95000
Sum 4207095
Count 81
Confidence Level(95.0%) 3486.271919
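Several of the figures above are tied together by simple identities, which makes a handy sanity check on the output. The sketch below is in Python rather than Excel (an assumption; the assignment itself only requires the Descriptive Statistics tool), and the variable names are illustrative. It recomputes the mean, sample variance, standard error, and range for the asking price from other lines of the same output.

```python
import math

# Values copied from the "asking price" block of the output above.
total = 4545100.0
count = 81
std_dev = 16322.08919
minimum, maximum = 25900.0, 99900.0

mean = total / count                      # Mean = Sum / Count
variance = std_dev ** 2                   # Sample Variance = Standard Deviation squared
std_error = std_dev / math.sqrt(count)    # Standard Error = Standard Deviation / sqrt(n)
value_range = maximum - minimum           # Range = Maximum - Minimum

print(f"mean={mean:.5f} variance={variance:.1f} "
      f"std_error={std_error:.6f} range={value_range:.0f}")
```

The same identities should hold for the selling price block if its values are substituted in.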
METHOD 1
Doing the F-test manually, using the output from the Descriptive Statistics tool:
F = 266410595.7 / 248583818.8 = 1.072
METHOD 2
F-Test Two-Sample for Variances
                     asking price  selling price
Mean                 56112.34568   51939.44444
Variance             266410595.7   248583818.8
Observations         81            81
df                   80            80
F                    1.07171
P(F<=f) one-tail     0.37876
F Critical one-tail  1.44773
The null hypothesis states that the two variances are in fact identical. Given that the observed F-statistic has a one-tail P-value of 37.876%, we cannot reject the null hypothesis at the 5% or even the 10% level of significance. Equivalently, the observed F of 1.07171 is below the one-tail critical value of 1.44773. We therefore conclude that the statistical evidence gives us no reason to reject the null hypothesis that the variance of the selling prices is the same as the variance of the asking prices.
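The manual calculation and the decision rule can be sketched in a few lines of Python (an illustration only; the assignment was done in Excel). The variances and the critical value are copied from the output above, and the sketch assumes the critical value was produced at Excel's default alpha of 0.05.

```python
# Sample variances and the one-tail critical value, copied from the output above.
var_asking = 266410595.7
var_selling = 248583818.8
f_critical = 1.44773   # "F Critical one-tail", df = (80, 80); alpha = 0.05 is assumed

# Put the larger variance in the numerator so that F >= 1.
f_stat = max(var_asking, var_selling) / min(var_asking, var_selling)

# One-tail decision rule: reject H0 (equal variances) only if F exceeds the critical value.
reject_null = f_stat > f_critical

print(f"F = {f_stat:.5f}, reject H0: {reject_null}")
```

Because F is well below the critical value, the sketch reaches the same decision as the P-value comparison.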
The Answers to Assignment #2
Before you actually run any regressions, it is always a good idea to see exactly what the data look like. Following section 14.1 of the textbook, you should have been able to produce the following chart.
The graph and regressed trend-line do not give you very much information, however. In order to get more detailed information, you will have to run a full regression. The output is as follows:
SUMMARY OUTPUT
Regression Statistics
Multiple R         0.984629521
R Square           0.969495294
Adjusted R Square  0.969109159
Standard Error     2868.736216
Observations       81
ANOVA
            df  SS           MS          F            Significance F
Regression  1   20662705504  2.0663E+10  2510.764351  1.23254E-61
Residual    79  650142150.5  8229647.47
Total       80  21312847654
               Coefficients  Standard Error  t Stat      P-value      Lower 95%    Upper 95%    Lower 95.0%  Upper 95.0%
Intercept      3169.232921   1103.622675     2.87166347  0.005239803  972.5250795  5365.940762  972.525079   5365.940762
selling price  1.019323817   0.020342728     50.1075279  1.23254E-61  0.978832595  1.059815039  0.9788326    1.059815039
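Most of the summary figures in this output can be rebuilt from the ANOVA sums of squares and the coefficient table. The following Python sketch (an illustration, not part of the Excel assignment; variable names are made up here) recomputes R Square, the F statistic, the standard error of the regression, and the t Stat for the slope.

```python
import math

# Values copied from the ANOVA and coefficient tables above.
ss_regression = 20662705504.0
ss_residual = 650142150.5
df_regression, df_residual = 1, 79
slope, slope_se = 1.019323817, 0.020342728

ss_total = ss_regression + ss_residual
r_square = ss_regression / ss_total                     # share of variation explained
ms_residual = ss_residual / df_residual                 # mean squared error
f_stat = (ss_regression / df_regression) / ms_residual  # F = MS regression / MS residual
std_error = math.sqrt(ms_residual)                      # standard error of the regression
t_slope = slope / slope_se                              # t Stat = coefficient / its standard error

print(f"R^2={r_square:.6f} F={f_stat:.2f} SE={std_error:.3f} t={t_slope:.4f}")
```

With a single explanatory variable, the square of the slope's t Stat equals the F statistic, which is why both carry the same P-value in the output.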
RESIDUAL
OUTPUT
Observation  Predicted asking price  Residuals  Standard Residuals
1 45980.83323 -1080.83323 -0.3767628
2 42413.19987 -1413.199871 -0.4926211
3 53625.76186 -725.7618568 -0.2529901
4 64838.32384 -2338.323843 -0.8151059
5 65347.98575 -347.985751 -0.1213028
6 70444.60484 -544.6048354 -0.1898414
7 70444.60484 2455.395165 0.85591528
8 73502.57629 -602.5762861 -0.2100494
9 88588.56878 -2688.568776 -0.9371962
10 93379.39072 120.6092846 0.04204265
11 94908.37644 4991.623559 1.74000786
12 34258.60934 -2358.609336 -0.8221771
13 30690.97598 -790.9759769 -0.2757228
14 39864.89033 -1964.890329 -0.6849324
15 40884.21415 -984.2141457 -0.3430828
16 41801.60558 -1901.605581 -0.6628722
17 44451.8475 -1551.847505 -0.5409516
18 44706.67846 10193.32154 3.55324463
19 49038.80468 861.1953192 0.30020025
20 49038.80468 861.1953192 0.30020025
21 52606.43804 -5706.43804 -1.9891819
22 70342.67245 1557.327546 0.54286188
23 41801.60558 -1901.605581 -0.6628722
24 41903.53796 -3.537962625 -0.0012333
25 45980.83323 -3080.83323 -1.073934
26 46286.63038 -2386.630375 -0.8319449
27 47509.81896 -1609.818956 -0.5611596
28 48019.48086 -2019.480864 -0.7039619
29 48529.14277 1370.857228 0.47786102
30 48936.8723 63.12770086 0.0220054
31 49038.80468 3861.195319 1.34595691
32 52096.77613 5403.223869 1.88348578
33 53116.09995 -3216.099948 -1.121086
34 55664.40949 3235.590509 1.12788011
35 65347.98575 -347.985751 -0.1213028
36 65347.98575 -1447.985751 -0.5047469
37 66061.51242 -3161.512423 -1.1020576
38 71463.92865 5436.071348 1.89493594
39 86753.78591 146.2140944 0.05096812
40 58722.38094 -2222.380941 -0.7746899
41 70954.26674 545.7332561 0.19023473
42 74521.9001 -4621.900103 -1.6111276
43 86753.78591 -3853.785906 -1.3433741
44 96947.02407 52.97592551 0.01846664
45 41903.53796 -2003.537963 -0.6984044
46 48019.48086 -3519.480864 -1.2268402
47 44961.50941 -1061.509413 -0.3700268
48 47000.15705 899.8429529 0.31367225
49 52096.77613 2403.223869 0.83772912
50 59028.17809 -1028.178086 -0.358408
51 59232.04285 667.9571503 0.23284021
52 61270.69048 2229.309517 0.77710509
53 72483.25247 2416.747531 0.84244327
54 100004.9955 -104.9955251 -0.0365999
55 41903.53796 7996.462037 2.78745114
56 28652.32834 -2752.328343 -0.9594219
57 28902.06268 797.9373218 0.27814942
58 60251.36667 4648.633333 1.62044642
59 39355.22842 -855.2284204 -0.2981203
60 33748.94743 6151.052572 2.14416806
61 52096.77613 -2296.776131 -0.800623
62 52096.77613 -196.7761315 -0.0685933
63 56174.0714 -1274.071399 -0.4441229
64 58212.71903 -2312.719033 -0.8061804
65 60251.36667 -351.3666666 -0.1224813
66 64328.66193 671.3380659 0.23401875
67 64328.66193 571.3380659 0.1991602
68 71463.92865 -1563.928652 -0.5451629
69 51077.45231 3822.547685 1.3324849
70 74521.9001 4378.099897 1.52614237
71 49038.80468 861.1953192 0.30020025
72 55154.74758 745.2524179 0.25978423
73 36297.25697 2202.74303 0.7678444
74 37826.2427 -2326.242695 -0.8108946
75 38845.56651 3154.433488 1.09958994
76 39864.89033 35.10967115 0.01223872
77 49038.80468 861.1953192 0.30020025
78 51077.45231 -3177.452315 -1.107614
79 59232.04285 -2332.04285 -0.8129164
80 68304.02482 -3404.02482 -1.1865939
81 54135.42377 -1235.423765 -0.4306509
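Each row of the residual output follows directly from the fitted line. The Python sketch below (illustrative only) reproduces the row for observation 1. The actual asking price of 44900 is the table's Predicted plus Residual; the selling price of 42000 is an assumption, back-solved from the predicted value and the coefficients, since the raw data are not listed here. In this output the Standard Residuals column appears to equal the residual divided by the regression's Standard Error; other Excel versions may scale it differently.

```python
# Coefficients and standard error copied from the regression output above.
intercept = 3169.232921
slope = 1.019323817
regression_std_error = 2868.736216

# Observation 1. asking_price = Predicted + Residual from the table
# (45980.83323 - 1080.83323 = 44900). selling_price is an assumption,
# back-solved from the predicted value and the fitted line.
selling_price = 42000.0
asking_price = 44900.0

predicted = intercept + slope * selling_price
residual = asking_price - predicted                    # actual minus predicted
standard_residual = residual / regression_std_error    # scaling inferred from this output

print(f"predicted={predicted:.2f} residual={residual:.2f} "
      f"standard residual={standard_residual:.4f}")
```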
The following two graphs are also part of the regression output:
The equation you were looking for is:
asking price = 3169.23 + 1.0193 (selling price)
The Answers to Assignment #3
Your output should have looked something like this:
SUMMARY OUTPUT
Regression Statistics
Multiple R         0.985265572
R Square           0.970748247
Adjusted R Square  0.969608568
Standard Error     2748.602963
Observations       81
ANOVA
            df  SS           MS           F            Significance F
Regression  3   19304984495  6434994832   851.7736127  6.16519E-59
Residual    77  581721005.1  7554818.247
Total       80  19886705500
The Regression Output
              Coefficients  Standard Error  t Stat        P-value      Lower 95%     Upper 95%    Lower 95.0%   Upper 95.0%
Intercept     -809.4898084  1213.059803     -0.667312367  0.506567616  -3225.003385  1606.023768  -3225.003385  1606.023768
asking price  0.939904447   0.024139085     38.93703723   2.04821E-52  0.89183733    0.987971564  0.89183733    0.987971564
days on sale  -17.60678093  9.811878374     -1.794435301  0.076668613  -37.14475041  1.931188559  -37.14475041  1.931188559
lot size      0.217499996   0.282492101     0.76993302    0.443695326  -0.345014319  0.780014312  -0.345014319  0.780014312
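With three explanatory variables, Adjusted R Square becomes the more useful summary, because plain R Square never falls when a variable is added. A small Python sketch (illustrative; not part of the Excel assignment) shows how the adjusted figure and the F statistic in this output follow from the other reported values.

```python
# Values copied from the Assignment #3 output above.
r_square = 0.970748247
n_obs = 81
n_predictors = 3          # asking price, days on sale, lot size
ms_regression = 6434994832.0
ms_residual = 7554818.247

# Adjusted R Square penalizes R Square for each extra predictor.
adj_r_square = 1 - (1 - r_square) * (n_obs - 1) / (n_obs - n_predictors - 1)

# F tests whether all slope coefficients are jointly zero.
f_stat = ms_regression / ms_residual

print(f"adjusted R^2 = {adj_r_square:.6f}, F = {f_stat:.2f}")
```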
RESIDUAL
OUTPUT
Observation  Predicted selling price  Residuals  Standard Residuals
1 40798.27607 1201.723933 0.437212631
2 37909.92805 590.0719478 0.214680678
3 48943.16132 556.8386812 0.202589711
4 57919.55029 2580.449705 0.938822282
5 60452.73135 547.2686537 0.199107933
6 65019.93267 980.0673319 0.356569263
7 67866.77992 -1866.779915 -0.67917409
8 68403.063 596.937001 0.21717833
9 80725.80725 3074.192753 1.118456465
10 86859.2835 1640.716499 0.596927428
11 93050.74737 -3050.747367 -1.109926536
12 28523.06025 1976.939745 0.71925257
etc.....etc..........
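Interpretation
All students should at least have been able to generate the following equation from this output:
selling price = -809.49 + 0.9399 (asking price) - 17.607 (days on sale) + 0.2175 (lot size)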
However, it doesn't have to end there, and when you do Assignments 4 and 5, you will want to do further analysis. The t-tests on the Constant (intercept) and Lot size do not look very good: their P-values are about 0.507 and 0.444. At a 10% level of significance, we would fail to reject the null hypothesis that the coefficient is equal to zero for both.
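The decision for each coefficient comes down to comparing its P-value with the chosen level of significance. A minimal Python sketch of that comparison (illustrative; the dictionary keys simply mirror the row labels in the output) is:

```python
# P-values copied from the Assignment #3 coefficient table.
p_values = {
    "Intercept": 0.506567616,
    "asking price": 2.04821e-52,
    "days on sale": 0.076668613,
    "lot size": 0.443695326,
}
alpha = 0.10  # 10% level of significance

# H0 for each row: the coefficient equals zero. Reject when the p-value < alpha.
significant = {name: p < alpha for name, p in p_values.items()}

for name, is_sig in significant.items():
    verdict = "reject H0" if is_sig else "fail to reject H0"
    print(f"{name}: {verdict} at the {alpha:.0%} level")
```

At the 10% level only asking price and days on sale come out as statistically different from zero; at the 5% level days on sale would drop out as well.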
There are three things to note here:
Important Note: When it comes to doing the project, you will realize that it isn't quite so easy to drop a variable, as we have done here with Lot size. The reason for this is fairly simple. When you conduct more thorough investigations of your data, as you are expected to do in your project, you often find that you can no longer trust your t-tests and F-tests. Why is this? T and F tests are no longer valid when the errors are not independently and identically distributed according to a normal distribution with a mean of zero (this is often shortened to IID~N(0)). When you have heteroscedasticity or autocorrelation, the errors are no longer independently and identically distributed according to a normal distribution with a mean of zero. T tests also cease to be believable when there is serious multicollinearity in the data. Almost all projects have at least one of these problems (hetero, auto, multi). Thus, as you can now see, in the real world, it isn't quite so easy to drop a variable.
The Answers to Assignment #4
These answers will not be posted until the end of week #9 (i.e., late Friday afternoon). If you do not see the answers up by Saturday morning, please email the Lab-TA.
The Answers to Assignment #5
These answers will not be posted until the end of week #10. If you do not see them up by the end of that week, please email the Lab-TA and remind them.