Reading: 6.5, Chapter 15, Appendix A.
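An informal method of selecting p, the model order, is based on $R^2$, the fraction of the variation in the response explained by the fitted model.
Note: adding more terms never decreases $R^2$ (and in practice almost always increases it), so a larger $R^2$ alone does not justify a higher order.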
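The note about $R^2$ can be checked numerically. The following is a minimal sketch, not the course's SAS analysis: the data values and the helper names `solve` and `r_squared` are made up for illustration, and the fit uses plain least squares through the normal equations.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(x, y, p):
    """R^2 of the least squares polynomial fit of order p."""
    design = [[xi ** j for j in range(p + 1)] for xi in x]
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p + 1)]
           for i in range(p + 1)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p + 1)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    ybar = sum(y) / len(y)
    ss_err = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - ss_err / ss_tot


# Hypothetical data, chosen only for illustration.
x = [-1.0, -0.75, -0.5, -0.25, 0.0, 0.25, 0.5, 0.75, 1.0]
y = [2.1, 1.2, 0.9, 0.3, 0.1, 0.4, 0.7, 1.5, 1.8]

r2_by_order = [r_squared(x, y, p) for p in range(1, 6)]
for p, r2 in enumerate(r2_by_order, start=1):
    print(p, round(r2, 4))
```

Because each lower-order model is a special case of the next one, the printed values should form a non-decreasing sequence, whether or not the extra terms are really needed.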
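Formal methods can be based on hypothesis tests. Starting from the largest order p under consideration, we can
test
$$H_o: \beta_p = 0$$
and then, if we accept this hypothesis, test
$$H_o: \beta_{p-1} = 0$$
and then, if we accept that hypothesis, test
$$H_o: \beta_{p-2} = 0$$
and so on, stopping when we first reject a hypothesis. This is
"backwards elimination".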
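Justification: Unless the coefficient of the highest order term, $\beta_p$, is 0
there is no good reason to
suppose that $\beta_{p-1} = 0$,
and so on down through the lower order coefficients.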
Apparent conclusion in our example: p=5 is best; look at the P values in the SAS outputs.
Problems arising with that conclusion:
Question: What is distribution theory?
Answer: How to compute the "distribution" of an estimator, test or other statistic, T:
In this course we
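The standard normal density is
$$\frac{1}{\sqrt{2\pi}} e^{-z^2/2}, \qquad -\infty < z < \infty.$$
Reminder: if X has density f(x) then
$$E(g(X)) = \int_{-\infty}^{\infty} g(x) f(x)\, dx.$$
So: $Z \sim N(0,1)$
implies
$$E(Z) = \int_{-\infty}^{\infty} z \frac{e^{-z^2/2}}{\sqrt{2\pi}}\, dz = \left[ -\frac{e^{-z^2/2}}{\sqrt{2\pi}} \right]_{-\infty}^{\infty} = 0.$$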
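Next we compute the variance of Z remembering that, since E(Z) = 0,
$$\mathrm{Var}(Z) = E(Z^2) - [E(Z)]^2 = E(Z^2):$$
$$E(Z^2) = \int_{-\infty}^{\infty} z^2 \frac{e^{-z^2/2}}{\sqrt{2\pi}}\, dz = \int u\, dv$$
where
$u = -z/\sqrt{2\pi}$
and
$dv = -z e^{-z^2/2}\, dz$. We do integration
by parts and see that
$v = e^{-z^2/2}$ and
$du = -dz/\sqrt{2\pi}$.
This gives
$$E(Z^2) = \left[ -\frac{z e^{-z^2/2}}{\sqrt{2\pi}} \right]_{-\infty}^{\infty} + \int_{-\infty}^{\infty} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\, dz = 0 + 1 = 1$$
because the integral of the normal density is 1. We have thus shown
that
$$E(Z) = 0 \quad \text{and} \quad \mathrm{Var}(Z) = 1.$$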
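The mean and variance of a standard normal can also be checked by numerical integration. This is a minimal sketch using the trapezoid rule on the interval [-10, 10]; the interval, the number of steps and the helper names `normal_density` and `integrate` are choices made here for illustration.

```python
import math


def normal_density(z):
    """Standard normal density exp(-z^2/2) / sqrt(2 pi)."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)


def integrate(g, lo=-10.0, hi=10.0, n=20000):
    """Trapezoid rule for the integral of g(z) times the normal density over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (g(lo) * normal_density(lo) + g(hi) * normal_density(hi))
    for i in range(1, n):
        z = lo + i * h
        total += g(z) * normal_density(z)
    return total * h


total_mass = integrate(lambda z: 1.0)       # integral of the density
mean_z = integrate(lambda z: z)             # E(Z)
second_moment = integrate(lambda z: z * z)  # E(Z^2)
var_z = second_moment - mean_z ** 2         # Var(Z)

print(total_mass, mean_z, var_z)
```

The three printed numbers should be very close to 1, 0 and 1, in agreement with the integration by parts argument.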
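Definition: If
$Z \sim N(0,1)$
then
$X = \mu + \sigma Z \sim N(\mu, \sigma^2)$.
Note: $E(X) = \mu + \sigma E(Z) = \mu$ and $\mathrm{Var}(X) = \sigma^2 \mathrm{Var}(Z) = \sigma^2$.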
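A quick simulation illustrates the construction of a normal variable with mean $\mu$ and standard deviation $\sigma$ as $\mu + \sigma Z$ with Z standard normal. This is only a sketch: the values of `mu` and `sigma`, the seed and the sample size are arbitrary choices made here, not taken from the notes.

```python
import random

mu, sigma = 3.0, 2.0       # illustrative values, not from the notes
rng = random.Random(350)   # fixed seed so the run is repeatable

# Draw standard normals and apply the location-scale transformation.
z = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
x = [mu + sigma * zi for zi in z]

sample_mean = sum(x) / len(x)
sample_var = sum((xi - sample_mean) ** 2 for xi in x) / len(x)

print(sample_mean, sample_var)
```

The sample mean and sample variance should be close to `mu` and `sigma ** 2`, up to simulation error.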
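Definition: If
$Z_1, \ldots, Z_n$
are independent N(0,1)
then
$$Z = (Z_1, \ldots, Z_n)^T$$
has a standard multivariate normal distribution.
We can define
$E(Z)$
and
$\mathrm{Var}(Z)$
for vectors like Z as
follows:
If X is a random vector of length n, say
$X = (X_1, \ldots, X_n)^T$,
then
$$E(X) = (E(X_1), \ldots, E(X_n))^T.$$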
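Definition: If X is a random vector with mean vector $\mu = E(X)$ then
$$\mathrm{Var}(X) = E\left[ (X - \mu)(X - \mu)^T \right].$$
Definition: If M is a matrix then
$E(M)$
is a matrix
whose ijth entry is
$E(M_{ij})$.
So
$\mathrm{Var}(X)$
has ijth entry
$$E\left[ (X_i - \mu_i)(X_j - \mu_j) \right] = \mathrm{Cov}(X_i, X_j)$$
and
diagonal entries
$\mathrm{Var}(X_i)$.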
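To make the entry-by-entry description of the variance matrix concrete, here is a small sketch. It assumes a hypothetical random vector X of length 2 that takes each of four listed values with probability 1/4, so that every expectation is a plain average; the data and the function names are made up for illustration.

```python
def mean_vector(rows):
    """E(X): the vector of componentwise expectations."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def variance_matrix(rows):
    """Var(X) = E[(X - mu)(X - mu)^T], computed entry by entry."""
    n = len(rows)
    mu = mean_vector(rows)
    d = len(mu)
    return [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / n
             for j in range(d)]
            for i in range(d)]


# Hypothetical equally likely values of X = (X_1, X_2)^T.
values = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 4.0]]

mu = mean_vector(values)
var_x = variance_matrix(values)

print(mu)
print(var_x)
```

The off-diagonal entries of `var_x` are the covariance of the two components (the matrix is symmetric) and the diagonal entries are the two variances.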
In class I started discussion of the Normal distribution. I computed the
mean and variance of a standard normal and then of
$X = \mu + \sigma Z$.
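Here I will just show you a few more integrals:
The kth moment of a standard normal is
$$E(Z^k) = \int_{-\infty}^{\infty} z^k \frac{e^{-z^2/2}}{\sqrt{2\pi}}\, dz.$$
Integration by parts gives $E(Z^k) = (k-1) E(Z^{k-2})$, so the kth moment is 0 for odd k and $(k-1)(k-3) \cdots 1$ for even k.
We can also compute the moment generating function of Z, that is,
$$M_Z(t) = E(e^{tZ}) = \int_{-\infty}^{\infty} e^{tz} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\, dz = e^{t^2/2}.$$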
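The moment generating function of a standard normal is $E(e^{tZ}) = e^{t^2/2}$. A minimal numerical check, assuming a trapezoid rule on [-15, 15] (the interval, step count, test values of t and the function names are choices made here for illustration):

```python
import math


def normal_density(z):
    """Standard normal density exp(-z^2/2) / sqrt(2 pi)."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)


def mgf_numeric(t, lo=-15.0, hi=15.0, n=30000):
    """Trapezoid approximation to E(exp(tZ)) for standard normal Z."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        z = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0  # half weight at the endpoints
        total += weight * math.exp(t * z) * normal_density(z)
    return total * h


# Each entry is (t, numerical value, closed form exp(t^2 / 2)).
checks = [(t, mgf_numeric(t), math.exp(t * t / 2.0)) for t in (-1.0, 0.0, 0.5, 2.0)]
for t, numeric, exact in checks:
    print(t, numeric, exact)
```

The two columns of values should agree to many decimal places for moderate t; for large |t| the integration interval would need to be widened.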
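Now if
$X = \mu + \sigma Z$
then
$X - \mu = \sigma Z$
so
that the kth
central moment of X, namely
$$E\left[ (X - \mu)^k \right] = \sigma^k E(Z^k),$$
is 0 for odd k and
$\sigma^k (k-1)(k-3) \cdots 1$
for k even.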
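The moments of a standard normal Z are 0 for odd k and $(k-1)(k-3) \cdots 1$ for even k, and the kth central moment of $\mu + \sigma Z$ is $\sigma^k$ times the kth moment of Z. The sketch below compares the closed form with a trapezoid rule approximation; the value of `sigma`, the integration settings and the function names are assumptions made here for illustration.

```python
import math


def normal_moment(k):
    """Closed form for E(Z^k): 0 for odd k, (k-1)(k-3)...1 for even k."""
    if k % 2 == 1:
        return 0
    result = 1
    for j in range(k - 1, 0, -2):  # empty product (= 1) when k = 0
        result *= j
    return result


def moment_numeric(k, lo=-12.0, hi=12.0, n=24000):
    """Trapezoid approximation to the integral of z^k times the normal density."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        z = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * z ** k * math.exp(-z * z / 2.0)
    return total * h / math.sqrt(2.0 * math.pi)


for k in range(9):
    print(k, normal_moment(k), moment_numeric(k))

# Central moments of X = mu + sigma * Z; sigma is an illustrative value.
sigma = 2.0
central_moments_x = {k: sigma ** k * normal_moment(k) for k in range(1, 7)}
print(central_moments_x)
```

Note that `mu` does not appear in the central moments at all, since subtracting the mean removes it.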
Similarly