.
Let $Y_1$ be the observation at x=2500 and $Y_2$ the observation at
x=2000. Then the mean of $Y_1 - Y_2$ is 1.3(500)=650 while the SD of the
difference is $\sqrt{2}\,\sigma \approx 495$. The probability we want is that
a standard normal is more than (1000-650)/495=0.707. This is roughly 0.24.
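The tail-area arithmetic can be checked numerically. The following is only a sketch: the error SD `sigma = 350` is an assumption, chosen because it reproduces the quoted SD of 495 for the difference, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

slope = 1.3          # slope used in the solution text
sigma = 350.0        # ASSUMED error SD; sqrt(2) * 350 is about 495
gap_x = 2500 - 2000  # separation between the two x values

mean_diff = slope * gap_x          # mean of the difference, 650
sd_diff = sqrt(2) * sigma          # SD of the difference of two independent observations
z = (1000 - mean_diff) / sd_diff   # standardized cutoff
prob = 1 - NormalDist().cdf(z)     # upper-tail probability for a standard normal

print(round(mean_diff), round(sd_diff), round(z, 3), round(prob, 2))
```

The factor sqrt(2) comes from adding the variances of two independent observations that share the same error SD.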
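Write $x_1 > x_2$ for the two x values. Then $Y_1 - Y_2$ is normal with mean
$1.3(x_1 - x_2)$ and SD 495. We want
the probability that $Y_1 > Y_2$,
or the probability to the right of $-1.3(x_1 - x_2)/495$ for a standard normal,
to be 0.95. This means that $-1.3(x_1 - x_2)/495 = -1.645$
or $x_1 - x_2 = 1.645(495)/1.3$.
The answer is then 626.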
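As a rough check of the 626 figure, the required separation can be computed from the 0.95 normal quantile. This sketch takes the slope 1.3 and the SD 495 from the text; the variable names are illustrative.

```python
from statistics import NormalDist

slope = 1.3      # slope used in the solution text
sd_diff = 495.0  # SD of the difference, from the text
target = 0.95    # required probability that the first observation exceeds the second

z = NormalDist().inv_cdf(target)   # upper 5% point of the standard normal, about 1.645
separation = z * sd_diff / slope   # required difference between the two x values

print(round(z, 3), round(separation))
```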
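options pagesize=60 linesize=80;
* Read the (x, y) pairs from the raw data file;
data q16;
  infile 'q16.dat';
  input x y;
run;
* Fit y on x, then plot residuals against fitted values and y against x;
proc reg data=q16;
  model y=x;
  plot residual.*predicted.;
  plot y*x;
run;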
The output for the full data set is
Dependent Variable: Y

Analysis of Variance

                     Sum of          Mean
Source     DF       Squares        Square   F Value   Prob>F
Model       1  538208.57051  538208.57051   385.024   0.0001
Error       4    5591.42949    1397.85737
C Total     5  543800.00000

Root MSE      37.38793     R-square    0.9897
Dep Mean     560.00000     Adj R-sq    0.9871
C.V.           6.67642

Parameter Estimates

                Parameter      Standard    T for H0:
Variable  DF     Estimate         Error  Parameter=0  Prob > |T|
INTERCEP   1   137.875631   26.37756553        5.227      0.0064
X          1     9.311567    0.47454663       19.622      0.0001
[Plot of RESIDUAL against Predicted Value of Y (PRED) for the full data set: six points, with residuals from about -40 to about 45.]
[Plot of Y against X for the full data set: five points with Y between about 275 and 550, and one point at Y = 1200.]
while that for the edited data set is
                     Sum of          Mean
Source     DF       Squares        Square   F Value   Prob>F
Model       1   49839.81693   49839.81693    61.274   0.0043
Error       3    2440.18307     813.39436
C Total     4   52280.00000

Root MSE      28.52007     R-square    0.9533
Dep Mean     432.00000     Adj R-sq    0.9378
C.V.           6.60187

Parameter Estimates

                Parameter      Standard    T for H0:
Variable  DF     Estimate         Error  Parameter=0  Prob > |T|
INTERCEP   1   190.352403   33.40167361        5.699      0.0107
X          1     7.551487    0.96470574        7.828      0.0043
[Plot of RESIDUAL against Predicted Value of Y (PRED) for the edited data set: five points, with residuals from about -23 to about 30.]
[Plot of Y against X for the edited data set: five points with Y between about 280 and 560; the X axis runs from 12.5 to 45.]
The sample correlation can be computed, but I doubt the utility of the
calculation unless the x levels were set by random sampling of
pairs. The value is 0.99.
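One way to arrive at the quoted 0.99 is sketched below. It assumes the value is the signed square root of the R-square printed for the full data set, with the sign taken from the fitted slope; that reading is an assumption, not something the text states.

```python
from math import copysign, sqrt

r_square = 0.9897  # R-square from the full-data regression output
slope = 9.311567   # fitted slope from the same output; its sign gives the sign of r

# In simple linear regression the squared sample correlation equals R-square.
r = copysign(sqrt(r_square), slope)

print(round(r, 2))
```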
is given on page 503 and you could carry
out a t-test using
and getting P-values from the
t-distribution on 3 degrees of freedom.
. The right hand side is
which is evidently
.
                     Sum of          Mean
Source     DF       Squares        Square   F Value   Prob>F
Model       1       8.17906       8.17906   674.982   0.0001
Error       8       0.09694       0.01212
C Total     9       8.27600

Root MSE       0.11008     R-square    0.9883
Dep Mean       3.92000     Adj R-sq    0.9868
C.V.           2.80814

Parameter Estimates

                Parameter      Standard    T for H0:
Variable  DF     Estimate         Error  Parameter=0  Prob > |T|
INTERCEP   1     2.141648    0.07679262       27.889      0.0001
X          1     0.006801    0.00026176       25.980      0.0001
The test statistic is $t = (0.006801 - 0.0060)/0.00026176$, which is 3.06. You get a
P-value from t tables on 8 degrees of freedom and conclude
that the slope is not 0.0060 ($0.01 < P < 0.02$,
two tailed).
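For readers without t tables to hand, the statistic and an approximate two-tailed P-value can be reproduced as follows. This is a sketch only: the helper names `t_pdf` and `two_sided_p` are illustrative, and the tail area is obtained by Simpson's rule on the t density rather than by any library routine.

```python
from math import gamma, pi, sqrt

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided P-value: 1 minus twice the area from 0 to |t| (Simpson's rule, steps even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * total * h / 3

# Values taken from the regression output: slope estimate, its standard error, error df.
estimate, std_error, null_value, df = 0.006801, 0.00026176, 0.0060, 8
t_stat = (estimate - null_value) / std_error
p_value = two_sided_p(t_stat, df)

print(round(t_stat, 2), round(p_value, 3))
```

The T and Prob columns printed by SAS test the hypothesis that the slope is 0, so the statistic for a null value of 0.0060 has to be formed by hand as above.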
(in my
notation). You are allowed to assume that
. Now

Putting the two pieces together we get
which is just
.
and
. The fitted slope
based on the starred data is

The estimated standard error
of
is

To evaluate this note first that
. Next

Assembling the pieces shows that

on 12
degrees of freedom. For a two-sided test I get P a bit over 0.10
and conclude that there is only quite weak evidence for a correlation
between content and gas porosity.
%.
must be either
0.16 or -0.16 which is a pretty weak correlation.)
                     Sum of          Mean
Source     DF       Squares        Square   F Value   Prob>F
Model       1      25.62223      25.62223    17.604   0.0057
Error       6       8.73277       1.45546
C Total     7      34.35500

Root MSE       1.20643     R-square    0.7458
Dep Mean      77.72500     Adj R-sq    0.7034
C.V.           1.55217

Parameter Estimates

                Parameter      Standard    T for H0:
Variable  DF     Estimate         Error  Parameter=0  Prob > |T|
INTERCEP   1    81.173057    0.92589886       87.669      0.0001
X          1    -0.133258    0.03176040       -4.196      0.0057
.
The variance of the slope estimate $\hat\beta_1$ is inversely proportional
to $S_{xx} = \sum (x_i - \bar{x})^2$. Putting 4 data points at 0 and 4 at
50 makes $S_{xx}$ equal to 5000 while the value for
the data set is only about 1450.
Thus the new design estimates $\beta_1$ more precisely.
Using only 3 points at 0 and at 50 gives a sum of 3750, which is still much
more precise than the design used.
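The sums of squares quoted for the two proposed designs are easy to verify. The sketch below also backs out the sum of squares for the observed x values from the printed output, using the standard identity that the standard error of the slope is Root MSE divided by the square root of that sum. It assumes that the regression output printed just before this answer belongs to the same data set; the function name `sxx` is illustrative.

```python
def sxx(xs):
    """Sum of squared deviations of the x values about their mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)

design_8 = [0] * 4 + [50] * 4  # 4 points at 0 and 4 at 50
design_6 = [0] * 3 + [50] * 3  # 3 points at 0 and 3 at 50

# ASSUMPTION: Root MSE and the slope's standard error come from the output shown
# before this answer. SE(slope) = Root MSE / sqrt(Sxx), so Sxx = (Root MSE / SE)^2.
root_mse, se_slope = 1.20643, 0.03176040
sxx_data = (root_mse / se_slope) ** 2

print(sxx(design_8), sxx(design_6), round(sxx_data))
```

Because the standard error scales as one over the square root of the sum, moving from about 1450 to 5000 cuts the standard error of the slope by a factor of roughly 1.9 for the same error variance.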
General Linear Models Procedure

R-Square        C.V.     Root MSE       Y Mean
0.070002    26.22198    198.15080    755.66667

                             T for H0:   Pr > |T|   Std Error of
Parameter       Estimate   Parameter=0                  Estimate
INTERCEPT    684.4057037          5.78     0.0007    118.3236387
X             14.8804795          0.73     0.4915     20.5000646
The test statistic is 0.73 with 7 degrees of freedom. Since the test is one
sided we get a P-value of 0.4915/2, about 0.25, which is certainly not significant.
Thus it seems quite possible that there is no (linear) relation between
eye weight and thickness.
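The halving step can be written out explicitly. This sketch assumes the one-sided alternative is that the slope is positive, which the text does not state directly. With that alternative, halving the two-sided P-value is valid only because the observed statistic is positive.

```python
# Values taken from the GLM output for the slope.
estimate, std_error = 14.8804795, 20.5000646
two_sided_p = 0.4915

t_stat = estimate / std_error

# ASSUMED alternative: slope > 0. A positive statistic lies in the direction of the
# alternative, so the one-sided P-value is half the two-sided one; otherwise it is
# one minus that half.
if t_stat > 0:
    one_sided_p = two_sided_p / 2
else:
    one_sided_p = 1 - two_sided_p / 2

print(round(t_stat, 2), round(one_sided_p, 3))
```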
it clearly suffices to
check that
. But this last value is

as required.

Divide through by SSTotal and use the formula
where
, etc., to get

Then

which is the usual t-statistic. Note the use of the fact that
.