is N(0,6/10), U is
,
V is
,
X is $t_6$,
since Z1 and V are independent and
,
and
$Y=(U/4)/(V/6)$ has an $F_{4,6}$ distribution, using the fact that U and V are independent. Your answer
should specify the mean and variance for
,
the various degrees of
freedom and note the required independence for X and Y.
The required probabilities are 0.197, 0.95, 0.725, 0.025, 0.688, 0.034.
I used S-Plus to compute these; if you used tables, your answers will be less
accurate, particularly for
.
You need to remember:
Differentiate
with respect to
and
get
which is 0 if and only if
.
The second derivative of the function being
minimized is
so this is a minimum.
Note: I saw a dismaying tendency to write things like
Let
.
Then
.
Use
to see that
You have to compute
which is simply
.
The standard error is then
.
If we assume that the errors are independent
)
random
variables then
is independent of the usual estimate of
,
namely
$s^2 = 0.12/3 = 0.04$ in this case. The usual $t$ statistic then has a $t$ distribution and the confidence interval
is
which boils down to
.
Note: several people used formulas for simple linear regression as if they had fitted an intercept as well. They then had 2 degrees of freedom in spite of the fact that the question says there are n-1 degrees of freedom for error. Also: many of you neglected to assume that the errors have a normal distribution.
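The degrees-of-freedom point in the note above can be illustrated with a short sketch. It assumes the model has no intercept (regression through the origin), and the four data points are hypothetical, chosen only so that n - 1 = 3 as in this question; they are not the assignment's data. The quantile 3.182 is the upper 2.5% point of the t distribution with 3 degrees of freedom.

```python
import math

# Hypothetical data (not from the assignment): n = 4 observations.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.1, 1.9, 3.2, 3.9]
n = len(x)

# Least squares slope for the no-intercept model Y_i = beta * x_i + error_i.
sum_xx = sum(xi * xi for xi in x)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
beta_hat = sum_xy / sum_xx

# Only one parameter is fitted, so the error sum of squares has n - 1
# degrees of freedom (not n - 2 as in regression with an intercept).
df_error = n - 1
sse = sum((yi - beta_hat * xi) ** 2 for xi, yi in zip(x, y))
s2 = sse / df_error
std_error = math.sqrt(s2 / sum_xx)

# 95% confidence interval using the t quantile with 3 degrees of freedom.
t_crit = 3.182
ci = (beta_hat - t_crit * std_error, beta_hat + t_crit * std_error)
print(beta_hat, std_error, ci)
```

Fitting an intercept as well would use up one more degree of freedom and change both the estimate and its standard error, which is the mistake the note describes.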
Let
;
then
.
We have
.
The difference
is then simply
In this case Yi is
and the likelihood is
Writing the data as
$Y^T = [Y_{1,1}, Y_{1,2}, Y_{1,3}, Y_{2,1}, Y_{2,2}, Y_{2,3}]$ the design matrix is
The matrix $X_a^T X_a$ is 6 by 6 but has rank only 4, so it is singular and
must have determinant 0. The normal equations are
NOTE: You need to check that the equations are consistent. Many students row reduced only the matrix of coefficients, forgetting that if A has less than full rank then Ax=b might have no solutions. My solution above row reduces the augmented matrix, checking that there is at least one solution.
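The consistency check in the note can be sketched in code: a system Ax=b has a solution exactly when the coefficient matrix and the augmented matrix have the same rank. The design matrix below is an assumption (an additive two-way layout with columns for an overall mean, two row effects and three column effects, which matches the stated 6 by 6 size and rank 4), and the data vector y is hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix by Gauss-Jordan elimination in exact arithmetic."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

# Assumed design matrix X_a: columns are mu, alpha_1, alpha_2,
# beta_1, beta_2, beta_3; rows are ordered Y11, Y12, Y13, Y21, Y22, Y23.
X_a = [[1, int(i == 1), int(i == 2), int(j == 1), int(j == 2), int(j == 3)]
       for i in (1, 2) for j in (1, 2, 3)]
y = [3, 5, 4, 6, 9, 7]  # hypothetical data

# Normal equations: (X^T X) b = X^T y.
XtX = [[sum(row[p] * row[q] for row in X_a) for q in range(6)]
       for p in range(6)]
Xty = [sum(row[p] * yi for row, yi in zip(X_a, y)) for p in range(6)]
augmented = [coef_row + [rhs] for coef_row, rhs in zip(XtX, Xty)]

rank_coef = rank(XtX)
rank_aug = rank(augmented)
print(rank_coef, rank_aug)  # equal ranks mean the system is consistent

# A rank-deficient system that is NOT consistent, for contrast.
bad_coef = [[1, 1], [1, 1]]
bad_aug = [[1, 1, 1], [1, 1, 2]]
print(rank(bad_coef), rank(bad_aug))  # ranks differ: no solution
```

Row reducing only the coefficient matrix gives the first rank but says nothing about the second, which is why the augmented matrix has to be reduced.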
The restrictions give
and
.
In each model equation which mentions either
or
you replace that parameter by the equivalent formula. So, for
example,
Note: some people eliminated
and not
.
This is fine
and leads to a similar matrix with the rows in a different order.
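This makes the design matrix $X_c$ just the corresponding columns, 1, 3, 5 and 6, of $X_a$.
To write
$X_c = X_a A$ just let A be the $6 \times 4$
matrix which picks
out columns 1, 3, 5 and 6 of $X_a$, namely,
$$A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$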
The easy way to do this is to say: the fitted vector
is the closest point
in the column space of the design matrix to the data vector Y. Since all three have the
same column space they all have the same closest point and so the same
.
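The column-space argument can be checked numerically. The sketch below assumes an additive two-way layout, builds two full-rank design matrices with the same column space (one setting the first level of each factor to zero, one using sum-to-zero restrictions), and compares the fitted vectors. The particular parametrizations and the data vector y are illustrative assumptions, not taken from the assignment.

```python
from fractions import Fraction

def solve(A, b):
    """Solve a nonsingular square system by Gauss-Jordan elimination."""
    n = len(A)
    m = [[Fraction(v) for v in row] + [Fraction(bi)]
         for row, bi in zip(A, b)]
    for c in range(n):
        p = next(i for i in range(c, n) if m[i][c] != 0)
        m[c], m[p] = m[p], m[c]
        pivot = m[c][c]
        m[c] = [v / pivot for v in m[c]]
        for i in range(n):
            if i != c:
                f = m[i][c]
                m[i] = [a - f * bb for a, bb in zip(m[i], m[c])]
    return [row[n] for row in m]

def fitted(X, y):
    """Fitted vector X b, where b solves the normal equations."""
    k = len(X[0])
    XtX = [[sum(r[p] * r[q] for r in X) for q in range(k)]
           for p in range(k)]
    Xty = [sum(r[p] * yi for r, yi in zip(X, y)) for p in range(k)]
    b = solve(XtX, Xty)
    return [sum(x * bi for x, bi in zip(r, b)) for r in X]

cells = [(i, j) for i in (1, 2) for j in (1, 2, 3)]

# Corner restrictions: columns mu, alpha_2, beta_2, beta_3.
X_c = [[1, int(i == 2), int(j == 2), int(j == 3)] for i, j in cells]

# Sum-to-zero restrictions: columns mu, alpha_1, beta_1, beta_2,
# with alpha_2 = -alpha_1 and beta_3 = -beta_1 - beta_2.
col_codes = {1: (1, 0), 2: (0, 1), 3: (-1, -1)}
X_b = [[1, 1 if i == 1 else -1, *col_codes[j]] for i, j in cells]

y = [3, 5, 4, 6, 9, 7]  # hypothetical data
fit_c = fitted(X_c, y)
fit_b = fitted(X_b, y)
print(fit_c == fit_b)  # the two parametrizations give the same fit
```

The parameter estimates differ between the two parametrizations, but the fitted vectors agree because both are the projection of Y onto the same column space.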
Algebra is an alternative tactic:
The matrix $A_{bc}$ is invertible and we have
The algebraic approach makes it a bit more difficult to deal with the case of $X_a$ because the normal equations have many solutions.
Suppose that
is any solution of the normal equations
.
Then
Thus
The following SAS code was used for all these parts:
    data gpa;
      infile "CH01PR19.DAT";
      input gpa test;
    proc glm;
      model gpa=test;
      estimate "fit_at_5" intercept 1 test 5;
      estimate "fit4.7" intercept 1 test 4.7;
      output out=gpaout r=resid p=fitted;
    run;
    proc print data=gpaout;
      var test gpa fitted resid;
    run;
    proc means data=gpaout;
      var resid;
    run;
    proc rank normal=vw data=gpaout out=gpaout2;
      var resid;
      ranks normscr;
    run;
    proc gplot data=gpaout2;
      plot gpa*test fitted*test / overlay;
    run;
    proc gplot data=gpaout2;
      plot fitted*test;
      plot resid*normscr;
    run;

obtaining the following output:
Dependent Variable: GPA

                               Sum of          Mean
Source            DF          Squares        Square   F Value   Pr > F
Model              1       6.43372807    6.43372807     34.00   0.0001
Error             18       3.40627193    0.18923733
Corrected Total   19       9.84000000

R-Square       C.V.       Root MSE     GPA Mean
0.653834       17.40057   0.43501      2.50000

Source            DF        Type I SS   Mean Square   F Value   Pr > F
TEST               1       6.43372807    6.43372807     34.00   0.0001

Source            DF      Type III SS   Mean Square   F Value   Pr > F
TEST               1       6.43372807    6.43372807     34.00   0.0001

Dependent Variable: GPA

                             T for H0:     Pr > |T|   Std Error of
Parameter       Estimate   Parameter=0                  Estimate
fit_at_5      2.50000000         25.70       0.0001     0.09727213
fit4.7        2.24802632         21.12       0.0001     0.10643937

                             T for H0:     Pr > |T|   Std Error of
Parameter       Estimate   Parameter=0                  Estimate
INTERCEPT   -1.699561404         -2.34       0.0311     0.72677682
TEST         0.839912281          5.83       0.0001     0.14404759

OBS   TEST   GPA    FITTED      RESID
  1    5.5   3.1   2.91996    0.18004
  2    4.8   2.3   2.33202   -0.03202
  3    4.7   3.0   2.24803    0.75197
  4    3.9   1.9   1.57610    0.32390
  5    4.5   2.5   2.08004    0.41996
  6    6.2   3.7   3.50789    0.19211
  7    6.0   3.4   3.33991    0.06009
  8    5.2   2.6   2.66798   -0.06798
  9    4.7   2.8   2.24803    0.55197
 10    4.3   1.6   1.91206   -0.31206
 11    4.9   2.0   2.41601   -0.41601
 12    5.4   2.9   2.83596    0.06404
 13    5.0   2.3   2.50000   -0.20000
 14    6.3   3.2   3.59189   -0.39189
 15    4.6   1.8   2.16404   -0.36404
 16    4.3   1.4   1.91206   -0.51206
 17    5.0   2.0   2.50000   -0.50000
 18    5.9   3.8   3.25592    0.54408
 19    4.1   2.2   1.74408    0.45592
 20    4.7   1.5   2.24803   -0.74803

Analysis Variable : RESID

 N           Mean      Std Dev      Minimum      Maximum
----------------------------------------------------------
20   -7.77156E-17    0.4234117   -0.7480263    0.7519737
----------------------------------------------------------
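As a cross-check on the SAS output, the estimates can be recomputed directly from the 20 (TEST, GPA) pairs in the listing. This is a minimal sketch using the usual simple linear regression formulas. The variable names are my own, and it is not the method used to produce the output above.

```python
import math

# (TEST, GPA) pairs copied from the PROC PRINT listing.
test = [5.5, 4.8, 4.7, 3.9, 4.5, 6.2, 6.0, 5.2, 4.7, 4.3,
        4.9, 5.4, 5.0, 6.3, 4.6, 4.3, 5.0, 5.9, 4.1, 4.7]
gpa = [3.1, 2.3, 3.0, 1.9, 2.5, 3.7, 3.4, 2.6, 2.8, 1.6,
       2.0, 2.9, 2.3, 3.2, 1.8, 1.4, 2.0, 3.8, 2.2, 1.5]
n = len(test)

x_bar = sum(test) / n
y_bar = sum(gpa) / n
sxx = sum((x - x_bar) ** 2 for x in test)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(test, gpa))

# Least squares estimates of slope and intercept.
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Analysis of variance quantities.
fits = [intercept + slope * x for x in test]
sse = sum((y - f) ** 2 for y, f in zip(gpa, fits))
sst = sum((y - y_bar) ** 2 for y in gpa)
mse = sse / (n - 2)
f_value = (sst - sse) / mse
r_square = 1 - sse / sst

# Standard errors of the slope and of the fitted value at TEST = 4.7.
se_slope = math.sqrt(mse / sxx)
se_fit_47 = math.sqrt(mse * (1 / n + (4.7 - x_bar) ** 2 / sxx))

print(round(intercept, 6), round(slope, 6))
print(round(sse, 6), round(mse, 6), round(f_value, 2), round(r_square, 6))
print(round(se_slope, 6), round(se_fit_47, 6))
```

The printed values can be compared line by line with the INTERCEPT and TEST estimates, the Error row of the analysis of variance table, and the standard errors reported by the ESTIMATE statements.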
Now to answer the questions:
                               Sum of          Mean
Source            DF          Squares        Square   F Value   Pr > F
Model              1       6.43372807    6.43372807     34.00   0.0001
Error             18       3.40627193    0.18923733
Corrected Total   19       9.84000000