STAT 330 Lecture 29
Reading for Today's Lecture: 12.1, 12.2, 12.3
Goals of Today's Lecture:
Today's notes
In the model
$$Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk}$$
if the levels $\alpha_1, \ldots, \alpha_I$ are the only levels of interest of Factor 1, we call Factor 1 (and the $\alpha_i$) a fixed effect. If on the other hand they are a sample of size $I$ from a population of possible levels, we refer to Factor 1 as a random effect. Often Randomized Blocks designs have blocks which are regarded as random. For instance, in an experiment where 5 runs of some production process can be run on a single day, we often treat DAY as a blocking factor and then pretend the days we tried are a sample of possible days.
We call the Model a Fixed Effects model if both factors are fixed, a Random Effects model if both are random and a Mixed model if we have one fixed and one random factor. For mixed models with replicates we get different F tests for main effects. Moreover, the injunction that we test main effects only when there are no interactions is no longer relevant.
END OF CHAPTER 11
Simple Linear Regression and Correlation
Here are two experimental designs used to investigate the relation between two continuous variables.
1: Controlled Experiment: A variable, $x$, is set at values $x_1, \ldots, x_n$ and corresponding values $Y_1, \ldots, Y_n$ of a response variable are measured.
Example: Chapter 12, question 9. $x$ is the "Burner area liberation rate" and $Y$ is the $\mathrm{NO}_x$ (nitrous oxides) emission rate.
2: A sample of $n$ pairs: We sample $n$ pairs of numbers from a population and get $(X_1, Y_1), \ldots, (X_n, Y_n)$.
Example: we sample 1074 families and measure the Father's height (X) and Son's height (Y) for each family.
In this section our goal is to predict Y from the value of X and not the other way around. We do not treat the variables symmetrically.
Regression Models:
We assume for each observation a model equation of the form
$$Y_i = \mu(x_i) + \epsilon_i$$
where $\mu(x) = E(Y \mid X = x)$.
Assumptions: the errors $\epsilon_i$ are independent with mean 0 and constant variance $\sigma^2$.
Definition: The regression is called linear if $\mu(x)$ is a linear function of the parameters $\beta$. (This jargon is used also when each of $X$ and $\beta$ is a vector.)
Our example is Simple Linear Regression:
$$Y_i = \beta_0 + \beta_1 x_i + \epsilon_i$$
where the $\epsilon_i$ are independent mean-0 homoscedastic errors. (Notice that the map $(\beta_0, \beta_1) \mapsto \beta_0 + \beta_1 x$ is a linear function of $(\beta_0, \beta_1)$. At the same time this model describes a straight-line function of $x$.)
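To make the model concrete, here is a small simulation sketch. The parameter values $\beta_0 = 1$, $\beta_1 = 2$, $\sigma = 0.5$ and the design points are arbitrary choices for illustration, not values from the notes.

```python
import random

random.seed(1)

# Arbitrary illustrative parameters -- not values from the notes.
beta0, beta1, sigma = 1.0, 2.0, 0.5
x = [i / 10 for i in range(1, 21)]  # fixed design points x_1, ..., x_20

# Y_i = beta0 + beta1 * x_i + epsilon_i with independent N(0, sigma^2) errors
y = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in x]

for xi, yi in list(zip(x, y))[:3]:
    print(f"x = {xi:.1f}, Y = {yi:.3f}")
```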
Estimation
Estimation is based on least squares. We choose $\hat\beta_0$ and $\hat\beta_1$ to minimize
$$S(\beta_0, \beta_1) = \sum_{i=1}^n (Y_i - \beta_0 - \beta_1 x_i)^2.$$
To minimize this we take the derivatives $\partial S / \partial \beta_0$ and $\partial S / \partial \beta_1$ and set them both equal to 0. We get
$$-2 \sum_{i=1}^n (Y_i - \beta_0 - \beta_1 x_i) = 0$$
and
$$-2 \sum_{i=1}^n x_i (Y_i - \beta_0 - \beta_1 x_i) = 0.$$
These two equations are called the normal equations, usually written in the form
$$n \beta_0 + \beta_1 \sum x_i = \sum Y_i$$
and
$$\beta_0 \sum x_i + \beta_1 \sum x_i^2 = \sum x_i Y_i.$$
The solution is
$$\hat\beta_1 = \frac{\sum (x_i - \bar{x})(Y_i - \bar{Y})}{\sum (x_i - \bar{x})^2}$$
and
$$\hat\beta_0 = \bar{Y} - \hat\beta_1 \bar{x}.$$
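A minimal computational sketch of these closed-form formulas; the data are hypothetical, chosen to lie near the line $y = 2x$.

```python
# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# beta1_hat = sum (x_i - xbar)(Y_i - Ybar) / sum (x_i - xbar)^2
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
beta1_hat = sxy / sxx
# beta0_hat = Ybar - beta1_hat * xbar
beta0_hat = ybar - beta1_hat * xbar

print(f"beta0_hat = {beta0_hat:.3f}, beta1_hat = {beta1_hat:.3f}")
```

In practice one would call a library least squares routine, but the closed-form solution is exactly this arithmetic.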
There is an ANOVA table for this least squares analysis based on the identity
$$\sum_{i=1}^n (Y_i - \bar{Y})^2 = \sum_{i=1}^n (Y_i - \hat{Y}_i)^2 + \sum_{i=1}^n (\hat{Y}_i - \bar{Y})^2$$
where $\hat{Y}_i$ is the so-called fitted value, namely $\hat{Y}_i = \hat\beta_0 + \hat\beta_1 x_i$. The quantity $\sum (Y_i - \hat{Y}_i)^2$ is called the Error Sum of Squares and the quantity $\sum (\hat{Y}_i - \bar{Y})^2$ is called the Regression Sum of Squares. We get the following ANOVA table.

Source       df     Sum of Squares                   Mean Square   F
Regression   1      $\sum (\hat{Y}_i - \bar{Y})^2$   SS/1          MS(Reg)/MSE
Error        n-2    $\sum (Y_i - \hat{Y}_i)^2$       SS/(n-2)
Total        n-1    $\sum (Y_i - \bar{Y})^2$
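The decomposition can be verified numerically; this sketch reuses the same hypothetical data and checks that Total SS = Error SS + Regression SS before forming the F ratio.

```python
# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar

fitted = [b0 + b1 * xi for xi in x]  # fitted values hat(Y)_i
ess = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # Error SS, df = n - 2
regss = sum((fi - ybar) ** 2 for fi in fitted)          # Regression SS, df = 1
totss = sum((yi - ybar) ** 2 for yi in y)               # Total SS, df = n - 1

f_stat = (regss / 1) / (ess / (n - 2))  # F = MS(Reg) / MSE

print(f"Regression SS = {regss:.3f}, Error SS = {ess:.3f}, Total SS = {totss:.3f}")
print(f"F = {f_stat:.1f}")
```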
In this table the P value is used to test $H_0: \beta_1 = 0$. However, for simple linear regression it is usually better to use a technique which easily provides confidence intervals for $\beta_1$ and can be used to test other values of $\beta_1$.
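One standard such technique is the t-based confidence interval $\hat\beta_1 \pm t_{\alpha/2,\, n-2}\, s / \sqrt{S_{xx}}$. A sketch with the same hypothetical data; the critical value 3.182 is $t_{0.025}$ with $n - 2 = 3$ degrees of freedom, taken from a t table.

```python
import math

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar

# s^2 = Error SS / (n - 2) estimates sigma^2
ess = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(ess / (n - 2))

se_b1 = s / math.sqrt(sxx)  # estimated standard error of beta1_hat
t_crit = 3.182              # t_{0.025} with 3 degrees of freedom (t table)

lo, hi = b1 - t_crit * se_b1, b1 + t_crit * se_b1
print(f"95% CI for beta1: ({lo:.3f}, {hi:.3f})")
```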
Let $S_{xx} = \sum (x_i - \bar{x})^2$ and note that because $\sum (x_i - \bar{x}) = 0$,
$$\hat\beta_1 = \frac{\sum (x_i - \bar{x})(Y_i - \bar{Y})}{S_{xx}} = \frac{\sum (x_i - \bar{x}) Y_i}{S_{xx}}.$$
If the errors $\epsilon_i$ are normal, so that the $Y_i$s are normal, then $\hat\beta_1$ is normal and we can compute the mean and variance of $\hat\beta_1$ as follows:
$$E(\hat\beta_1) = \frac{\sum (x_i - \bar{x}) E(Y_i)}{S_{xx}} = \frac{\sum (x_i - \bar{x})(\beta_0 + \beta_1 x_i)}{S_{xx}} = \frac{\beta_1 \sum (x_i - \bar{x}) x_i}{S_{xx}} = \beta_1.$$
So $\hat\beta_1$ is unbiased. Next:
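A quick Monte Carlo check of unbiasedness, and of the standard companion fact $\mathrm{Var}(\hat\beta_1) = \sigma^2 / S_{xx}$, with arbitrary parameter values and a fixed design (none of these numbers come from the notes):

```python
import random

random.seed(42)

# Arbitrary true parameters -- not values from the notes.
beta0, beta1, sigma = 1.0, 2.0, 0.5
x = [float(i) for i in range(1, 11)]  # fixed design points
n = len(x)
xbar = sum(x) / n
sxx = sum((xi - xbar) ** 2 for xi in x)

estimates = []
for _ in range(20000):
    # Draw a fresh data set from the model and record beta1_hat.
    y = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in x]
    ybar = sum(y) / n
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    estimates.append(b1)

mean_b1 = sum(estimates) / len(estimates)
var_b1 = sum((b - mean_b1) ** 2 for b in estimates) / len(estimates)

print(f"mean of beta1_hat: {mean_b1:.4f} (theory: {beta1})")
print(f"var of beta1_hat:  {var_b1:.6f} (theory: {sigma**2 / sxx:.6f})")
```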