Course outline:
Basic structure of typical multivariate data set:
Case by variables: data in matrix. Each row is a case, each column is a variable.
Example: Fisher's iris data: 5 rows of 150 by 5 matrix:
Case # | Variety | Sepal Length | Sepal Width | Petal Length | Petal Width |
1 | Setosa | 5.1 | 3.5 | 1.4 | 0.2 |
2 | Setosa | 4.9 | 3.0 | 1.4 | 0.2 |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
51 | Versicolor | 7.0 | 3.2 | 4.7 | 1.4 |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
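Vector valued random variable: function $X:\Omega\mapsto R^p$ such that, writing $X=(X_1,\ldots,X_p)^T$, $P(X_1\le x_1,\ldots,X_p\le x_p)$ is defined for any constants $(x_1,\ldots,x_p)$.
Cumulative Distribution Function (CDF) of $X$: function $F_X$ on $R^p$ defined by $F_X(x_1,\ldots,x_p)=P(X_1\le x_1,\ldots,X_p\le x_p)$.
Defn: Distribution of rv $X$ is absolutely continuous if there is a function $f$ such that $P(X\in A)=\int_A f(x)\,dx$ (1) for any (Borel) set $A$.
Defn: Any $f$ satisfying (1) is a density of $X$.
For most $x$, $F$ is differentiable at $x$ and $\partial^pF(x)/\partial x_1\cdots\partial x_p=f(x)$.
Basic tactic: specify density of $X=(X_1,\ldots,X_p)^T$.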
Tools: marginal densities, conditional densities, independence, transformation.
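Marginalization: Simplest multivariate problem: $X=(X_1,\ldots,X_p)$ has density $f(x_1,\ldots,x_p)$ and we want the density of $Y=(X_1,\ldots,X_q)$ for some $q<p$. Integrate out the unwanted coordinates: $f_Y(x_1,\ldots,x_q)=\int\cdots\int f(x_1,\ldots,x_p)\,dx_{q+1}\cdots dx_p$.
$f_Y$ is the marginal density of $Y$ and $f$ the joint density of $X$ but they are both just densities. ``Marginal'' just to distinguish from the joint density of $X$.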
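Def'n: Events $A$ and $B$ are independent if $P(AB)=P(A)P(B)$.
Def'n: $A_i$, $i=1,\ldots,p$ are independent if $P(A_{i_1}\cdots A_{i_r})=\prod_{j=1}^rP(A_{i_j})$ for any $1\le i_1<\cdots<i_r\le p$.
Def'n: $X$ and $Y$ are independent if $P(X\in A;\,Y\in B)=P(X\in A)P(Y\in B)$ for all $A$ and $B$.
Def'n: Rvs $X_1,\ldots,X_p$ independent: $P(X_1\in A_1,\ldots,X_p\in A_p)=\prod_{i=1}^pP(X_i\in A_i)$ for any $A_1,\ldots,A_p$.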
Theorem:
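Theorem: If $X_1,\ldots,X_p$ are independent and $Y_i=g_i(X_i)$ then $Y_1,\ldots,Y_p$ are independent. Moreover, $(X_1,\ldots,X_q)$ and $(X_{q+1},\ldots,X_p)$ are independent.
Conditional density of $Y$ given $X=x$: $f_{Y|X}(y|x)=f_{X,Y}(x,y)/f_X(x)$; in words, conditional = joint/marginal.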
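Suppose $Y=g(X)\in R^p$ with $X\in R^p$ having density $f_X$. Assume $g$ is a one to one (``injective'') map, i.e., $g(x_1)=g(x_2)$ if and only if $x_1=x_2$. Find $f_Y$:
Step 1: Solve for $x$ in terms of $y$: $x=g^{-1}(y)$.
Step 2: Use basic equation: $f_Y(y)\,dy=f_X(x)\,dx$, rewritten as $f_Y(y)=f_X(g^{-1}(y))\left|\det(\partial x/\partial y)\right|$, where $\partial x/\partial y$ is the $p\times p$ matrix with entries $\partial x_i/\partial y_j$.
Equivalent formula inverts the matrix: $f_Y(y)=f_X(g^{-1}(y))/\left|\det(\partial y/\partial x)\right|$.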
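Example: The density $f_X(x_1,x_2)=\frac{1}{2\pi}\exp\{-(x_1^2+x_2^2)/2\}$ is the standard bivariate normal density. Let $Y=(Y_1,Y_2)$ be $X$ in polar co-ordinates: $Y_1=\sqrt{X_1^2+X_2^2}$ and $0\le Y_2<2\pi$ is the angle (argument) of the point $(X_1,X_2)$.
Solve for $x$ in terms of $y$: $x_1=y_1\cos(y_2)$, $x_2=y_1\sin(y_2)$, so that $\left|\det(\partial x/\partial y)\right|=y_1$ and $f_Y(y_1,y_2)=\frac{1}{2\pi}\,y_1e^{-y_1^2/2}\,1(0\le y_1<\infty)\,1(0\le y_2<2\pi)$.
Next: marginal densities of $Y_1$, $Y_2$?
Factor $f_Y$ as $f_Y(y_1,y_2)=h_1(y_1)h_2(y_2)$ where $h_1(y_1)=y_1e^{-y_1^2/2}1(0\le y_1<\infty)$ and $h_2(y_2)=1(0\le y_2<2\pi)/(2\pi)$.
Then $f_{Y_1}(y_1)=h_1(y_1)\int h_2(y_2)\,dy_2$, so the marginal density of $Y_1$ is a multiple of $h_1$; similarly the marginal density of $Y_2$ is a multiple of $h_2$.
Remark: easy to check $\int_0^\infty ye^{-y^2/2}\,dy=1$.
Thus: have proved original bivariate normal density integrates to 1.
Put $I=\int_{-\infty}^\infty e^{-x^2/2}\,dx$. Get $I^2=\int\!\!\int e^{-(x^2+y^2)/2}\,dx\,dy=2\pi$, so $I=\sqrt{2\pi}$.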
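The change of variables recipe can be checked numerically for the polar co-ordinate example. The following is a minimal sketch, not part of the original notes: the function names and the test point are illustrative assumptions, and the Jacobian is estimated by finite differences rather than derived.

```python
import math

def polar_to_cartesian(y1, y2):
    """Inverse map g^{-1}: polar (radius y1, angle y2) to Cartesian (x1, x2)."""
    return (y1 * math.cos(y2), y1 * math.sin(y2))

def jacobian_det(y1, y2, h=1e-6):
    """Central-difference estimate of det(dx/dy) for the inverse map."""
    a_plus, a_minus = polar_to_cartesian(y1 + h, y2), polar_to_cartesian(y1 - h, y2)
    b_plus, b_minus = polar_to_cartesian(y1, y2 + h), polar_to_cartesian(y1, y2 - h)
    dx1_dy1 = (a_plus[0] - a_minus[0]) / (2 * h)
    dx2_dy1 = (a_plus[1] - a_minus[1]) / (2 * h)
    dx1_dy2 = (b_plus[0] - b_minus[0]) / (2 * h)
    dx2_dy2 = (b_plus[1] - b_minus[1]) / (2 * h)
    return dx1_dy1 * dx2_dy2 - dx1_dy2 * dx2_dy1

def standard_bivariate_normal(x1, x2):
    """Density of two independent N(0,1) variables."""
    return math.exp(-(x1 * x1 + x2 * x2) / 2) / (2 * math.pi)

def density_of_y(y1, y2):
    """f_Y(y) = f_X(g^{-1}(y)) * |det(dx/dy)|."""
    x1, x2 = polar_to_cartesian(y1, y2)
    return standard_bivariate_normal(x1, x2) * abs(jacobian_det(y1, y2))

# Arbitrary test point (an assumption chosen for illustration).
y1, y2 = 1.5, 0.7
numeric = density_of_y(y1, y2)
closed_form = y1 * math.exp(-y1 * y1 / 2) / (2 * math.pi)
print(numeric, closed_form)
```

The two printed numbers should agree to many decimal places, reflecting that the Jacobian of the polar map is the radius.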
Notation:
Defn: The transpose, $A^T$, of an $m\times n$ matrix $A$ is the $n\times m$ matrix whose entries are given by $(A^T)_{ij}=A_{ji}$.
Defn: rank of matrix $A$, rank$(A)$: # of linearly independent columns of $A$. We have
rank$(A)$ = dim(column space of $A$) = dim(row space of $A$) = rank$(A^T)$.
If $A$ is $m\times n$ then rank$(A)\le\min(m,n)$.
For now: all matrices square $n\times n$.
If there is a matrix $B$ such that $BA=I_{n\times n}$ then we call $B$ the inverse of $A$. If $B$ exists it is unique and $AB=I$ and we write $B=A^{-1}$. The matrix $A$ has an inverse if and only if rank$(A)=n$.
Inverses have the following properties:
Again $A$ is $n\times n$. The determinant is a function on the set of $n\times n$ matrices such that:
det | ||
det | ||
det |
Here are some properties of the determinant:
Defn: The inner product or dot product of $x$ and $y$ is $\langle x,y\rangle=x^Ty=\sum_ix_iy_i$.
Defn: Two vectors $x$ and $y$ are orthogonal if $x^Ty=\sum_ix_iy_i=0$.
Defn: The norm (or length) of $x$ is $\|x\|=(x^Tx)^{1/2}=(\sum_ix_i^2)^{1/2}$.
$A$ is orthogonal if each column of $A$ has length 1 and is orthogonal to each other column of $A$.
Suppose $A$ is an $n\times n$ matrix. The function $g(x)=x^TAx$ is called a quadratic form.
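If $A$ is $n\times n$ and $v\neq0\in R^n$ and $\lambda\in R$ such that $Av=\lambda v$, then $\lambda$ is an eigenvalue of $A$ and $v$ is a corresponding eigenvector. Then $(A-\lambda I)v=0$ with $v\neq0$, so $A-\lambda I$ is singular.
Therefore det$(A-\lambda I)=0$.
Conversely: if $A-\lambda I$ singular then there is $v\neq0$ such that $(A-\lambda I)v=0$.
Fact: det$(A-\lambda I)$ is polynomial in $\lambda$ of degree $n$.
Each root is an eigenvalue.
In general the roots could be multiple roots or complex valued.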
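Matrix $A$ is diagonalized by a non-singular matrix $P$ if $P^{-1}AP=D$ is diagonal.
If so then $AP=PD$ so each column of $P$ is eigenvector of $A$ with the $i$th column having eigenvalue $D_{ii}$.
Thus to be diagonalizable $A$ must have $n$ linearly independent eigenvectors.
If $A$ is symmetric then every eigenvalue of $A$ is real and $A$ is diagonalizable with $P$ orthogonal: $A=PDP^T$ (the spectral decomposition).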
Defn: A symmetric matrix $A$ is non-negative definite if $x^TAx\ge0$ for all $x$. It is positive definite if in addition $x^TAx=0$ implies $x=0$.
$A$ is non-negative definite iff all its eigenvalues are non-negative.
$A$ is positive definite iff all eigenvalues positive.
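The eigenvalue criterion is easy to try out in the $2\times2$ symmetric case, where the eigenvalues have a closed form. A minimal sketch, with illustrative function names and example matrices that are not from the notes:

```python
import math

def eigenvalues_sym2(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], smallest first."""
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b * b)
    return (mean - radius, mean + radius)

def classify(a, b, d, tol=1e-12):
    """Classify [[a, b], [b, d]] using the sign of its smallest eigenvalue."""
    smallest, _ = eigenvalues_sym2(a, b, d)
    if smallest > tol:
        return "positive definite"
    if smallest >= -tol:
        return "non-negative definite"
    return "not non-negative definite"

print(classify(2.0, 1.0, 2.0))  # eigenvalues 1 and 3
print(classify(1.0, 1.0, 1.0))  # eigenvalues 0 and 2
print(classify(1.0, 2.0, 1.0))  # eigenvalues -1 and 3
```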
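A non-negative definite matrix has a symmetric non-negative definite square root. If $A=PDP^T$ for $P$ orthogonal and $D$ diagonal then $A^{1/2}=PD^{1/2}P^T$ is symmetric, non-negative definite and satisfies $A^{1/2}A^{1/2}=A$; here $D^{1/2}$ is the diagonal matrix with entries $D_{ii}^{1/2}$.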
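Suppose $S$ vector subspace of $R^n$, $a_1,\ldots,a_m$ basis for $S$. Given any $x\in R^n$ there is a unique $y\in S$ which is closest to $x$; $y$ minimizes $(x-y)^T(x-y)$ over $y\in S$. Any $y\in S$ has the form $y=Ac$ where $A$ is the $n\times m$ matrix with columns $a_1,\ldots,a_m$ and $c\in R^m$. Define $Q=A(A^TA)^{-1}A^T$ ($A$ has rank $m$ so $A^TA$ is invertible). Then $(x-Ac)^T(x-Ac)=(x-Qx)^T(x-Qx)+2(Qx-Ac)^T(x-Qx)+(Qx-Ac)^T(Qx-Ac)$.
Note that $x-Qx=(I-Q)x$ and that $QAc=Ac$, so $Qx-Ac=Q(x-Ac)$; since $Q^T=Q$ and $Q^T(I-Q)=Q-Q^2=0$ the cross term vanishes and $(x-Ac)^T(x-Ac)=(x-Qx)^T(x-Qx)+(Qx-Ac)^T(Qx-Ac)$.
Choose $c$ to minimize: minimize second term.
Achieved by making $Qx=Ac$.
Since $Qx=A(A^TA)^{-1}A^Tx$ can take $c=(A^TA)^{-1}A^Tx$.
Summary: closest point $y$ in $S$ is $y=Qx=A(A^TA)^{-1}A^Tx$, the orthogonal projection of $x$ onto $S$.
Notice that the matrix $Q$ is idempotent: $Q^2=A(A^TA)^{-1}A^TA(A^TA)^{-1}A^T=Q$.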
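The projection onto a subspace can be illustrated with a small hand-rolled example. This is a sketch under stated assumptions: the basis matrix `A` (two columns in $R^3$), the vector `x`, and the helper names are all made up for illustration, and the $2\times2$ inverse is written out directly rather than taken from a library.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def projection_matrix(a):
    """Q = A (A^T A)^{-1} A^T for an n x 2 matrix A of rank 2."""
    at = transpose(a)
    return matmul(matmul(a, inverse_2x2(matmul(at, a))), at)

# Hypothetical basis: columns (1,1,1) and (0,1,2) span the subspace S.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
Q = projection_matrix(A)

x = [[1.0], [2.0], [4.0]]          # point to project, as a column vector
y = matmul(Q, x)                   # closest point in S
residual = [[x[i][0] - y[i][0]] for i in range(3)]
QQ = matmul(Q, Q)                  # should reproduce Q (idempotent)
cross = matmul(transpose(A), residual)  # should be (numerically) zero
print(y, cross)
```

With this choice of `A` the projection is the fitted line from regressing `x` on an intercept and the values 0, 1, 2, which is one way to see why the residual is orthogonal to both columns.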
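Suppose $A_{11}$ is a $p\times r$ matrix, $A_{12}$ is $p\times s$, $A_{21}$ is $q\times r$ and $A_{22}$ is $q\times s$. Make $(p+q)\times(r+s)$ matrix by putting $A_{ij}$ in 2 by 2 matrix: $A=\begin{bmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\end{bmatrix}$.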
We can work with partitioned matrices just like ordinary matrices always making sure that in products we never change the order of multiplication of things.
Note partitioning of $A$ and $B$ must match.
Addition: dimensions of $A_{ij}$ and $B_{ij}$ must be the same.
Multiplication formula: $A_{12}$ must have as many columns as $B_{21}$ has rows, etc.
In general: need $A_{ij}B_{jk}$ to make sense for each $i,j,k$.
Works with more than a 2 by 2 partitioning.
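Defn: block diagonal matrix: partitioned matrix $A$ for which $A_{ij}=0$ if $i\neq j$. If $A=\begin{bmatrix}A_{11}&0\\0&A_{22}\end{bmatrix}$ with each $A_{ii}$ square and invertible then $A^{-1}=\begin{bmatrix}A_{11}^{-1}&0\\0&A_{22}^{-1}\end{bmatrix}$.
Partitioned inverses. Suppose $A$, $C$ are symmetric positive definite. Look for inverse of $\begin{bmatrix}A&B\\B^T&C\end{bmatrix}$ of form $\begin{bmatrix}E&F\\F^T&G\end{bmatrix}$. Multiply to get equations $AE+BF^T=I$, $AF+BG=0$, $B^TE+CF^T=0$, $B^TF+CG=I$.
Solve to get $F^T=-C^{-1}B^TE$, $E=(A-BC^{-1}B^T)^{-1}$, $F=-A^{-1}BG$ and $G=(C-B^TA^{-1}B)^{-1}$.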
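Defn: $Z\in R^1\sim N(0,1)$ iff $f_Z(z)=\frac{1}{\sqrt{2\pi}}e^{-z^2/2}$.
Defn: $Z\in R^p\sim MVN_p(0,I)$ if and only if $Z=(Z_1,\ldots,Z_p)^T$ with the $Z_i$ independent and each $Z_i\sim N(0,1)$.
In this case according to our theorem $f_Z(z_1,\ldots,z_p)=\prod_i\frac{1}{\sqrt{2\pi}}e^{-z_i^2/2}=(2\pi)^{-p/2}\exp\{-z^Tz/2\}$.
Defn: $X\in R^p$ has a multivariate normal distribution if it has the same distribution as $AZ+\mu$ for some $\mu\in R^p$, some $p\times q$ matrix of constants $A$ and $Z\sim MVN_q(0,I)$.
$p=q$, $A$ singular: $X$ does not have a density.
$A$ invertible: derive multivariate normal density by change of variables: $X=AZ+\mu$ iff $Z=A^{-1}(X-\mu)$, so $f_X(x)=f_Z(A^{-1}(x-\mu))\,|\det(A^{-1})|$. With $\Sigma=AA^T$ we have $\Sigma^{-1}=(A^T)^{-1}A^{-1}$ and $\det\Sigma=(\det A)^2$, giving the $MVN(\mu,\Sigma)$ density $f_X(x)=(2\pi)^{-p/2}(\det\Sigma)^{-1/2}\exp\{-(x-\mu)^T\Sigma^{-1}(x-\mu)/2\}$.
For which $\mu$, $\Sigma$ is this a density?
Any $\mu$ but if $x\in R^p$ then $x^T\Sigma x=x^TAA^Tx=(A^Tx)^T(A^Tx)\ge0$, with equality only when $A^Tx=0$, that is, $x=0$; so $\Sigma$ must be positive definite.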
Conversely, if $\Sigma$ is a positive definite symmetric matrix then there is a square invertible matrix $A$ such that $AA^T=\Sigma$ so that there is a $MVN(\mu,\Sigma)$ distribution. ($A$ can be found via the Cholesky decomposition, e.g.)
When $A$ is singular $X$ will not have a density: there is $a\neq0$ such that $P(a^TX=a^T\mu)=1$; $X$ is confined to a hyperplane.
Still true: distribution of $X$ depends only on $\Sigma=AA^T$: if $AA^T=BB^T$ then $AZ+\mu$ and $BZ+\mu$ have the same distribution.
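A lower triangular $A$ with $AA^T=\Sigma$ can be computed by the Cholesky recursion. The sketch below is a plain textbook implementation written for illustration (the example matrix `sigma` is an assumption, not data from the notes); it assumes its input is symmetric positive definite and does no error checking.

```python
import math

def cholesky(sigma):
    """Return lower triangular A with A A^T = sigma (sigma symmetric positive definite)."""
    n = len(sigma)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(a[i][k] * a[j][k] for k in range(j))
            if i == j:
                a[i][i] = math.sqrt(sigma[i][i] - partial)
            else:
                a[i][j] = (sigma[i][j] - partial) / a[j][j]
    return a

def times_transpose(a):
    """Compute A A^T."""
    n = len(a)
    return [[sum(a[i][k] * a[j][k] for k in range(n)) for j in range(n)] for i in range(n)]

# Hypothetical covariance matrix used only to exercise the code.
sigma = [[4.0, 2.0, 2.0],
         [2.0, 3.0, 1.0],
         [2.0, 1.0, 3.0]]
A = cholesky(sigma)
rebuilt = times_transpose(A)
print(A)
```

Given such an `A`, a draw from $MVN(\mu,\Sigma)$ could then be formed as $AZ+\mu$ with independent standard normal entries in $Z$, matching the definition in the text.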
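Defn: If $X\in R^p$ has density $f$ then $E(g(X))=\int g(x)f(x)\,dx$ for any $g$ from $R^p$ to $R$.
FACT: if $Y=g(X)$ for a smooth $g$ (mapping $R\to R$) then $E(Y)=\int yf_Y(y)\,dy=\int g(x)f_X(x)\,dx$ by the change of variables formula for integration.
Linearity: $E(aX+bY)=aE(X)+bE(Y)$ for real $X$ and $Y$.
Defn: The $r$th moment (about the origin) of a real rv $X$ is $\mu_r'=E(X^r)$ (provided it exists). We generally use $\mu$ for $E(X)$.
Defn: The $r$th central moment is $\mu_r=E[(X-\mu)^r]$. We call $\sigma^2=\mu_2$ the variance.
Defn: For an $R^p$ valued random vector $X$, $\mu_X=E(X)$ is the vector whose $i$th entry is $E(X_i)$ (provided all entries exist).
Fact: same idea used for random matrices.
Defn: The ($p\times p$) variance covariance matrix of $X$ is $\mathrm{Var}(X)=E[(X-\mu)(X-\mu)^T]$, which exists provided each component $X_i$ has a finite second moment.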
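Example moments: If $Z\sim N(0,1)$ then $E(Z)=0$ and $\mathrm{Var}(Z)=E(Z^2)=1$.
If now $X\sim N(\mu,\sigma^2)$, that is, $X=\sigma Z+\mu$, then $E(X)=\sigma E(Z)+\mu=\mu$ and $\mathrm{Var}(X)=\sigma^2$.
Similarly for $X\sim MVN(\mu,\Sigma)$ we have $X=AZ+\mu$ with $Z\sim MVN(0,I)$ and $E(X)=\mu$ and $\mathrm{Var}(X)=A\,\mathrm{Var}(Z)A^T=AA^T=\Sigma$.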
Theorem: If $X_1,\ldots,X_p$ are independent and each $X_i$ is integrable then $X=X_1\cdots X_p$ is integrable and $E(X_1\cdots X_p)=E(X_1)\cdots E(X_p)$.
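Defn: The moment generating function of a real valued $X$ is $M_X(t)=E(e^{tX})$, defined for those real $t$ for which the expected value is finite.
Defn: The moment generating function of $X\in R^p$ is $M_X(u)=E[e^{u^TX}]$, defined for those vectors $u$ for which the expected value is finite.
Example: If $Z\sim N(0,1)$ then $M_Z(t)=e^{t^2/2}$.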
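Theorem: ($p=1$) If $M$ is finite for all $t$ in a neighbourhood $(-\epsilon,\epsilon)$ of $0$ then every moment of $X$ is finite, $M$ is $C^\infty$ (in fact analytic) and $\mu_k'=\frac{d^k}{dt^k}M_X(0)$.
Note: $C^\infty$ means has continuous derivatives of all orders. Analytic means has convergent power series expansion in neighbourhood of each $t\in(-\epsilon,\epsilon)$.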
The proof, and many other facts about mgfs, rely on techniques of complex variables.
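Theorem: Suppose $X$ and $Y$ are $R^p$ valued random vectors such that $M_X(u)=M_Y(u)$ for $u$ in some open neighbourhood of $0$ in $R^p$. Then $X$ and $Y$ have the same distribution.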
The proof relies on techniques of complex variables.
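If $X_1,\ldots,X_p$ are independent and $Y=\sum X_i$ then mgf of $Y$ is product mgfs of individual $X_i$: $M_Y(t)=\prod_iM_{X_i}(t)$.
Example: If $Z_1,\ldots,Z_p$ are independent $N(0,1)$ then $E(e^{\sum a_iZ_i})=\prod_iE(e^{a_iZ_i})=\prod_ie^{a_i^2/2}=\exp(\sum a_i^2/2)$.
Conclusion: If $Z\sim MVN_p(0,I)$ then $M_Z(u)=\exp(u^Tu/2)$.
Example: If $X\sim N(\mu,\sigma^2)$ then $X=\sigma Z+\mu$ and $M_X(t)=E(e^{t(\sigma Z+\mu)})=e^{t\mu}e^{\sigma^2t^2/2}$.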
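Theorem: Suppose $X=AZ+\mu$ and $Y=A^*Z^*+\mu^*$ where $Z\sim MVN_p(0,I)$ and $Z^*\sim MVN_q(0,I)$. Then $X$ and $Y$ have the same distribution if and only if the following two conditions hold:
1) $\mu=\mu^*$.
2) $AA^T=A^*(A^*)^T$.
Alternatively: if $X$, $Y$ each MVN then $E(X)=E(Y)$ and $\mathrm{Var}(X)=\mathrm{Var}(Y)$ imply that $X$ and $Y$ have the same distribution.
Proof: If 1 and 2 hold the mgf of $X$ is $E(e^{t^TX})=E(e^{t^T(AZ+\mu)})=e^{t^T\mu}E(e^{(A^Tt)^TZ})=e^{t^T\mu+(A^Tt)^T(A^Tt)/2}=e^{t^T\mu+t^T\Sigma t/2}$ where $\Sigma=AA^T$.
Thus mgf is determined by $\mu$ and $\Sigma$.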
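Theorem: If $X\sim MVN_p(\mu,\Sigma)$ then there is a $p\times p$ matrix $A$ such that $X$ has same distribution as $AZ+\mu$ for $Z\sim MVN_p(0,I)$.
We may assume that $A$ is symmetric and non-negative definite, or that $A$ is upper triangular, or that $A$ is lower triangular.
Proof: Pick any $A$ such that $AA^T=\Sigma$ such as $PD^{1/2}P^T$ from the spectral decomposition. Then $AZ+\mu\sim MVN_p(\mu,\Sigma)$.
From the symmetric square root can produce an upper triangular square root by the Gram Schmidt process: if $A$ has rows $a_1^T,\ldots,a_p^T$ then let $v_p$ be $a_p/\sqrt{a_p^Ta_p}$. Choose $v_{p-1}$ proportional to $a_{p-1}-bv_p$ where $b=a_{p-1}^Tv_p$ so that $v_{p-1}$ has unit length. Continue in this way; you automatically get $a_j^Tv_k=0$ if $j>k$. If $P$ has columns $v_1,\ldots,v_p$ then $P$ is orthogonal and $AP$ is an upper triangular square root of $\Sigma$.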
Defn: The covariance between $X$ and $Y$ is $\mathrm{Cov}(X,Y)=E\{(X-\mu_X)(Y-\mu_Y)^T\}$. This is a matrix.
Properties:
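Properties of the $MVN$ distribution
1: All margins are multivariate normal: if $X=\begin{bmatrix}X_1\\X_2\end{bmatrix}$, $\mu=\begin{bmatrix}\mu_1\\\mu_2\end{bmatrix}$ and $\Sigma=\begin{bmatrix}\Sigma_{11}&\Sigma_{12}\\\Sigma_{21}&\Sigma_{22}\end{bmatrix}$ then $X\sim MVN(\mu,\Sigma)$ implies $X_1\sim MVN(\mu_1,\Sigma_{11})$.
2: $MX+\nu\sim MVN(M\mu+\nu,M\Sigma M^T)$: affine transformation of MVN is normal.
3: If $\Sigma_{12}=\mathrm{Cov}(X_1,X_2)=0$ then $X_1$ and $X_2$ are independent.
4: All conditionals are normal: the conditional distribution of $X_1$ given $X_2=x_2$ is $MVN(\mu_1+\Sigma_{12}\Sigma_{22}^{-1}(x_2-\mu_2),\ \Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21})$.
Proof of (1): If $X=AZ+\mu$ then $X_1=[I\,|\,0]X$ for $I$ the identity matrix of correct dimension.
So $X_1=([I\,|\,0]A)Z+[I\,|\,0]\mu$.
Compute mean and variance to check rest.
Proof of (2): If $X=AZ+\mu$ then $MX+\nu=MAZ+\nu+M\mu$.
Proof of (3): If $u=\begin{bmatrix}u_1\\u_2\end{bmatrix}$ then $M_X(u)=M_{X_1}(u_1)M_{X_2}(u_2)$.
Proof of (4): first case: assume $\Sigma_{22}$ has an inverse.
Define $W=X_1-\Sigma_{12}\Sigma_{22}^{-1}X_2$. Then $\mathrm{Cov}(W,X_2)=\Sigma_{12}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{22}=0$, so by (2) and (3) $W$ and $X_2$ are jointly MVN and independent, with $\mathrm{Var}(W)=\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$.
Now joint density of $W$ and $X_2$ factors: $f_{W,X_2}(w,x_2)=f_W(w)f_{X_2}(x_2)$.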
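Specialization to bivariate case:
Write $\Sigma=\begin{bmatrix}\sigma_1^2&\rho\sigma_1\sigma_2\\\rho\sigma_1\sigma_2&\sigma_2^2\end{bmatrix}$ where $\sigma_i^2=\mathrm{Var}(X_i)$ and $\rho=\mathrm{Cov}(X_1,X_2)/(\sigma_1\sigma_2)$.
Then $W=X_1-\rho\frac{\sigma_1}{\sigma_2}X_2$ is independent of $X_2$, with variance $\sigma_1^2-2\rho\frac{\sigma_1}{\sigma_2}\rho\sigma_1\sigma_2+\rho^2\frac{\sigma_1^2}{\sigma_2^2}\sigma_2^2$.
This simplifies to $\sigma_1^2(1-\rho^2)$.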
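In the bivariate case the conditional distribution of $X_1$ given $X_2=x_2$ has mean $\mu_1+(\Sigma_{12}/\Sigma_{22})(x_2-\mu_2)$ and variance $\Sigma_{11}-\Sigma_{12}^2/\Sigma_{22}$. A minimal sketch of that arithmetic follows; the function name and all numerical values are hypothetical choices for illustration.

```python
def conditional_normal(mu1, mu2, s11, s12, s22, x2):
    """Mean and variance of X1 given X2 = x2 for a bivariate normal."""
    cond_mean = mu1 + (s12 / s22) * (x2 - mu2)
    cond_var = s11 - s12 * s12 / s22
    return cond_mean, cond_var

# Hypothetical parameter values (assumptions, not from the notes).
sigma1, sigma2, rho = 2.0, 3.0, 0.5
s11, s22, s12 = sigma1 ** 2, sigma2 ** 2, rho * sigma1 * sigma2

mean, var = conditional_normal(mu1=1.0, mu2=-1.0, s11=s11, s12=s12, s22=s22, x2=2.0)
print(mean, var)  # var should equal sigma1^2 * (1 - rho^2)
```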
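More generally: any $X$ and $Y$ and any real $\lambda$: $0\le\mathrm{Var}(X-\lambda Y)=\mathrm{Var}(X)-2\lambda\mathrm{Cov}(X,Y)+\lambda^2\mathrm{Var}(Y)$.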
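Defn: Multiple correlation between $X_1$ (a scalar) and $X_2$ (a vector): $R^2=\max_a\mathrm{Corr}^2(X_1,a^TX_2)$.
Thus: maximize $\frac{(a^T\Sigma_{21})^2}{\Sigma_{11}\,a^T\Sigma_{22}a}$ over $a$.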
Note
Summary: maximum squared correlation is $R^2=\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}/\Sigma_{11}$.
Notice: since $R^2$ is squared correlation between two scalars ($X_1$ and $a^TX_2$) we have $0\le R^2\le1$.
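Correlation matrices, partial correlations:
Correlation between two scalars $X$ and $Y$ is $\rho_{XY}=\mathrm{Cov}(X,Y)/\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}$.
If $X$ has variance $\Sigma$ then the correlation matrix of $X$ is $R_X$ with entries $R_{ij}=\Sigma_{ij}/\sqrt{\Sigma_{ii}\Sigma_{jj}}$.
If $X_1,X_2$ are MVN with the usual partitioned variance covariance matrix then the conditional variance of $X_1$ given $X_2$ is $\Sigma_{11\cdot2}=\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$.
From this define partial correlation matrix $R_{11\cdot2}$ with entries $(\Sigma_{11\cdot2})_{ij}/\sqrt{(\Sigma_{11\cdot2})_{ii}(\Sigma_{11\cdot2})_{jj}}$.
Note: these are used even when $X_1,X_2$ are NOT MVN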
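The partial correlation recipe (conditional covariance, then rescale to unit diagonal) can be sketched for the simplest situation, conditioning on a single variable, where no matrix inverse is needed. The function name and the correlation matrix `R` below are assumptions made up for illustration.

```python
import math

def partial_correlation_matrix(sigma, keep, given):
    """Partial correlations of the variables in `keep` given one variable `given`.

    sigma is a covariance (or correlation) matrix as nested lists.
    """
    s22 = sigma[given][given]
    # Conditional covariance: Sigma_11 - Sigma_12 Sigma_22^{-1} Sigma_21.
    cond = [[sigma[i][j] - sigma[i][given] * sigma[given][j] / s22 for j in keep]
            for i in keep]
    sd = [math.sqrt(cond[i][i]) for i in range(len(keep))]
    return [[cond[i][j] / (sd[i] * sd[j]) for j in range(len(keep))]
            for i in range(len(keep))]

# Hypothetical correlation matrix for three variables.
R = [[1.0, 0.5, 0.6],
     [0.5, 1.0, 0.7],
     [0.6, 0.7, 1.0]]
partial = partial_correlation_matrix(R, keep=[0, 1], given=2)
print(partial[0][1])
```

For three variables this reproduces the familiar formula $(r_{12}-r_{13}r_{23})/\sqrt{(1-r_{13}^2)(1-r_{23}^2)}$.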
Given data $X_1,\ldots,X_n$ with model $\{f_\theta(x);\theta\in\Theta\}$:
Definition: The likelihood function is map $L$: domain $\Theta$, values given by $L(\theta)=f_\theta(X_1,\ldots,X_n)$.
Key Point: think about how the density depends on $\theta$ not about how it depends on $X$.
Notice: $X$, observed value of the data, has been plugged into the formula for density.
We use likelihood for most inference problems:
Maximum Likelihood Estimation
To find MLE maximize $L$.
Typical function maximization problem:
Set gradient of $L$ equal to 0
Check root is maximum, not minimum or saddle point.
Often $L$ is product of $n$ terms (given $n$ independent observations).
Much easier to work with logarithm of $L$: log of product is sum and logarithm is monotone increasing.
Definition: The Log Likelihood function is $\ell(\theta)=\log\{L(\theta)\}$.
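Simplest problem: collect replicate measurements $X_1,\ldots,X_n$ from single population.
Model: $X_i$ are iid $MVN_p(\mu,\Sigma)$.
Parameters ($\theta$): $(\mu,\Sigma)$. Parameter space: $\mu\in R^p$ and $\Sigma$ is some positive definite $p\times p$ matrix.
Log likelihood is $\ell(\mu,\Sigma)=-\frac{np}{2}\log(2\pi)-\frac n2\log\det\Sigma-\frac12\sum_{i=1}^n(X_i-\mu)^T\Sigma^{-1}(X_i-\mu)$. Its gradient in $\mu$ is $\Sigma^{-1}\sum_i(X_i-\mu)$, which vanishes only at $\mu=\bar X$, and its second derivative matrix in $\mu$ is $-n\Sigma^{-1}$.
Fact: if second derivative matrix is negative definite everywhere then function is concave; no more than 1 critical point.
Summary: $\ell$ is maximized at $\hat\mu=\bar X$ (regardless of the choice of $\Sigma$).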
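More difficult: differentiate $\ell$ wrt $\Sigma$.
Somewhat simpler: set $D=\Sigma^{-1}$.
First derivative wrt $D$ is matrix with entries $\partial\ell/\partial D_{ij}$.
Need: derivative of two functions: $\partial\log\det D/\partial D$ and $\partial(x^TDx)/\partial D$.
Fact: $ij$th entry of $\partial(x^TDx)/\partial D$ is $x_ix_j$.
Fact: $\partial\det D/\partial D_{ij}$ is the $(i,j)$ cofactor of $D$; expansion by minors.
Conclusion: $\partial\log\det D/\partial D=D^{-1}$ (using symmetry of $D$).
Set $\partial\ell/\partial D=0$ and find only critical point is $\hat\Sigma=\frac1n\sum_{i=1}^n(X_i-\bar X)(X_i-\bar X)^T$.
Usual sample covariance matrix is $S=\frac{1}{n-1}\sum_{i=1}^n(X_i-\bar X)(X_i-\bar X)^T$.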
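The two covariance estimates differ only in the divisor, which a few lines of code make concrete. This is a minimal sketch with made-up data (three bivariate observations chosen so the arithmetic is easy to follow); the function names are illustrative.

```python
def mean_vector(data):
    n, p = len(data), len(data[0])
    return [sum(row[j] for row in data) / n for j in range(p)]

def scatter_matrix(data):
    """Sum over observations of (x - xbar)(x - xbar)^T."""
    xbar = mean_vector(data)
    p = len(xbar)
    return [[sum((row[i] - xbar[i]) * (row[j] - xbar[j]) for row in data)
             for j in range(p)] for i in range(p)]

def scaled(matrix, divisor):
    return [[entry / divisor for entry in row] for row in matrix]

# Hypothetical data: n = 3 observations on p = 2 variables.
data = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
n = len(data)

mu_hat = mean_vector(data)                       # MLE of mu
sigma_hat = scaled(scatter_matrix(data), n)      # MLE of Sigma (divide by n)
S = scaled(scatter_matrix(data), n - 1)          # usual sample covariance
print(mu_hat, sigma_hat, S)
```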
Properties of MLEs:
1) $\hat\mu=\bar X\sim MVN_p(\mu,\Sigma/n)$.
2) $E(S)=\Sigma$.
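Distribution of $S$? Joint distribution of $\bar X$ and $S$?
Theorem: Suppose $X_1,\ldots,X_n$ are independent $N(\mu,\sigma^2)$ random variables. Then
1) $\bar X$ (sample mean) and $s^2$ (sample variance) are independent.
2) $n^{1/2}(\bar X-\mu)/\sigma\sim N(0,1)$.
3) $(n-1)s^2/\sigma^2\sim\chi^2_{n-1}$.
4) $n^{1/2}(\bar X-\mu)/s\sim t_{n-1}$.
Proof: Let $Z_i=(X_i-\mu)/\sigma$.
Then $Z_1,\ldots,Z_n$ are independent $N(0,1)$.
So $Z=(Z_1,\ldots,Z_n)^T$ is multivariate standard normal.
Note that $\bar X=\sigma\bar Z+\mu$ and $s^2=\sigma^2\sum(Z_i-\bar Z)^2/(n-1)$. Thus $n^{1/2}(\bar X-\mu)/\sigma=n^{1/2}\bar Z$, $(n-1)s^2/\sigma^2=\sum(Z_i-\bar Z)^2$ and $n^{1/2}(\bar X-\mu)/s=n^{1/2}\bar Z/s_Z$ where $(n-1)s_Z^2=\sum(Z_i-\bar Z)^2$.
So: reduced to $\mu=0$ and $\sigma=1$.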
Step 1: Define
Put . Since
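Thus $\sqrt n\bar Z$ is independent of $(Z_1-\bar Z,\ldots,Z_n-\bar Z)$.
Since $s_Z^2$ is a function of $(Z_1-\bar Z,\ldots,Z_n-\bar Z)$ we see that $\sqrt n\bar Z$ and $s_Z^2$ are independent.
Also, see $\sqrt n\bar Z\sim N(0,1)$.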
First 2 parts done.
Consider . Note that .
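Now: distribution of quadratic forms:
Suppose $Z\sim MVN(0,I)$ and $A$ is symmetric. Put $A=PDP^T$ for $D$ diagonal, $P$ orthogonal.
Then $Z^TAZ=(Z^*)^TDZ^*$ where $Z^*=P^TZ\sim MVN(0,P^TP=I)$ is again standard multivariate normal.
So: $Z^TAZ$ has same distribution as $\sum_i\lambda_iZ_i^2$ where $\lambda_1,\ldots,\lambda_n$ are the eigenvalues of $A$.
Special case: if all $\lambda_i$ are either 0 or 1 then $Z^TAZ$ has a chi-squared distribution with df = number of $\lambda_i$ equal to 1.
When are eigenvalues all 1 or 0?
Answer: if and only if $A$ is idempotent.
1) If $A$ idempotent and $\lambda,x$ is an eigenpair then $\lambda x=Ax=A^2x=\lambda Ax=\lambda^2x$, so $\lambda\in\{0,1\}$.
2) Conversely if all eigenvalues of $A$ are 0 or 1 then $D$ has 1s and 0s on diagonal so $D^2=D$ and $A^2=PDP^TPDP^T=PD^2P^T=A$.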
Since it has the law
So eigenvalues are those of and is iff is idempotent and .
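Our case: $\sum(Z_i-\bar Z)^2=Z^TQZ$ with $Q=I-\mathbf{1}\mathbf{1}^T/n$, where $\mathbf{1}$ is the column vector of $n$ ones. Check $Q^2=Q$. How many degrees of freedom: trace$(D)$.
Defn: The trace of a square matrix $A$ is trace$(A)=\sum_iA_{ii}$.
Property: trace$(AB)$ = trace$(BA)$.
So: trace$(Q)$ = trace$(PDP^T)$ = trace$(DP^TP)$ = trace$(D)$.
Conclusion: df for $(n-1)s^2/\sigma^2$ is trace$(I-\mathbf{1}\mathbf{1}^T/n)=n-1$.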
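The degrees-of-freedom argument rests on the centering matrix $Q=I-\mathbf{1}\mathbf{1}^T/n$ being idempotent with trace $n-1$ and on $z^TQz=\sum(z_i-\bar z)^2$. A minimal numerical check, with an arbitrary vector `z` assumed for illustration:

```python
def centering_matrix(n):
    """Q = I - (1/n) * ones * ones^T."""
    return [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def quadratic_form(z, q):
    """z^T Q z for a list z and a square matrix q."""
    return sum(z[i] * q[i][j] * z[j] for i in range(len(z)) for j in range(len(z)))

n = 5
Q = centering_matrix(n)
QQ = matmul(Q, Q)
trace_Q = sum(Q[i][i] for i in range(n))

z = [2.0, -1.0, 0.5, 3.0, 1.5]   # arbitrary illustrative values
zbar = sum(z) / n
sum_sq_dev = sum((zi - zbar) ** 2 for zi in z)
print(trace_Q, quadratic_form(z, Q), sum_sq_dev)
```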
Derivation of the density:
Suppose $Z_1,\ldots,Z_n$ independent $N(0,1)$. Define $\chi^2_n$ distribution to be that of $U=Z_1^2+\cdots+Z_n^2$. Define angles $\theta_1,\ldots,\theta_{n-1}$ by
Matrix of partial derivatives is
FACT: multiplying a column in a matrix by $c$ multiplies the determinant by $c$.
SO: Jacobian of transformation is
Thus joint density of is
Answer has the form
Evaluate by making
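Fourth part: consequence of first 3 parts and def'n of $t_\nu$ distribution.
Defn: $T\sim t_\nu$ if $T$ has same distribution as $Z/\sqrt{U/\nu}$ for $Z\sim N(0,1)$, $U\sim\chi^2_\nu$ and $Z$, $U$ independent.
Derive density of $T$ in this definition: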
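Theorem: Suppose $X_1,\ldots,X_n$ are independent $MVN_p(\mu,\Sigma)$ random variables. Then
1) $\bar X$ (sample mean) and $S$ (sample variance-covariance matrix) are independent.
2) $n^{1/2}(\bar X-\mu)\sim MVN_p(0,\Sigma)$.
3) $(n-1)S\sim\mathrm{Wishart}_p(n-1,\Sigma)$.
4) $T^2=n(\bar X-\mu)^TS^{-1}(\bar X-\mu)$ is Hotelling's $T^2$; $(n-p)T^2/(p(n-1))$ has an $F_{p,n-p}$ distribution.
Proof: Let $X_i=AZ_i+\mu$ where $AA^T=\Sigma$ and $Z_1,\ldots,Z_n$ are independent $MVN_p(0,I)$.
So $\bar X=A\bar Z+\mu$.
Note that $S_X=AS_ZA^T$ and $T^2=n\bar Z^TS_Z^{-1}\bar Z$.
Consequences. In 1, 2 and 4: can assume $\mu=0$ and $\Sigma=I$. In 3 can take $\mu=0$. Step 1: Do general $\Sigma$. Define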
Compute variance covariance matrix
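Defn: If $A$ is $p\times q$ and $B$ is $r\times s$ then $A\otimes B$ is the $pr\times qs$ matrix with the pattern $\begin{bmatrix}A_{11}B&\cdots&A_{1q}B\\\vdots&&\vdots\\A_{p1}B&\cdots&A_{pq}B\end{bmatrix}$.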
Conclusions so far:
1) $\bar X$ and $S$ are independent.
2) $n^{1/2}(\bar X-\mu)\sim MVN_p(0,\Sigma)$.
Next: Wishart law.
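Defn: The $\mathrm{Wishart}_p(n,\Sigma)$ distribution is the distribution of $\sum_{i=1}^nZ_iZ_i^T$ where $Z_1,\ldots,Z_n$ are iid $MVN_p(0,\Sigma)$.
Properties of Wishart.
1) If $AA^T=\Sigma$ then $\mathrm{Wishart}_p(n,\Sigma)$ is the distribution of $AWA^T$ for $W\sim\mathrm{Wishart}_p(n,I)$.
2) if $W_i\sim\mathrm{Wishart}_p(n_i,\Sigma)$ independent then $W_1+W_2\sim\mathrm{Wishart}_p(n_1+n_2,\Sigma)$.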
Proof of part 3: rewrite
Uses further props of Wishart distribution.
3: If and then
4: If and then
5: If then
6: If is partitioned into components then
Given data $X_1,\ldots,X_n$ iid $MVN_p(\mu,\Sigma)$ test $H_0:\mu=\mu_0$.
Example: no realistic ones. This hypothesis is not intrinsically useful. However: other tests can sometimes be reduced to it.
Example: Eleven water samples split in half. One half of each to each of two labs. Measure biological oxygen demand (BOD) and suspended solids (SS). For sample $i$ let $X_{i1}$ be BOD for lab A, $X_{i2}$ be SS for lab A, $X_{i3}$ be BOD for lab B and $X_{i4}$ be SS for lab B. Question: are labs measuring the same thing? Is there bias in one or the other?
Notation: $X_i$ is vector of 4 measurements on sample $i$.
Data:
Sample | Lab A BOD | Lab A SS | Lab B BOD | Lab B SS |
1 | 6 | 27 | 25 | 15 |
2 | 6 | 23 | 28 | 13 |
3 | 18 | 64 | 36 | 22 |
4 | 8 | 44 | 35 | 29 |
5 | 11 | 30 | 15 | 31 |
6 | 34 | 75 | 44 | 64 |
7 | 28 | 26 | 42 | 30 |
8 | 71 | 124 | 54 | 64 |
9 | 43 | 54 | 34 | 56 |
10 | 33 | 30 | 29 | 20 |
11 | 20 | 14 | 39 | 21 |
Model: $X_1,\ldots,X_{11}$ are iid $MVN_4(\mu,\Sigma)$.
Multivariate problem because: not able to assume independence between any two measurements on same sample.
Potential sub-model: each measurement is
true measurement + lab bias + measurement error.
Model for measurement error vector is multivariate normal mean 0 and diagonal covariance matrix .
Lab bias is unknown vector .
True measurement should be same for both labs so has form
This would give structured model
This model has variance covariance matrix
We skip this model and let be unrestricted.
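Question of interest: $H_0:\mu_1=\mu_3$ and $\mu_2=\mu_4$.
Reduction: partition $X_i$ as $\begin{bmatrix}U_i\\V_i\end{bmatrix}$ where $U_i$ (lab A) and $V_i$ (lab B) each have two components.
Define $W_i=U_i-V_i$. Then our model makes $W_i$ iid $MVN_2(\mu_W,\Sigma_W)$. Our hypothesis is $H_0:\mu_W=0$.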
Carrying out our test in SPlus:
Working on CSS unix workstation:
Start SPlus then read in, print out data:
[61]ehlehl% mkdir .Data
[62]ehlehl% Splus
S-PLUS : Copyright (c) 1988, 1996 MathSoft, Inc.
S : Copyright AT&T.
Version 3.4 Release 1 for Sun SPARC, SunOS 5.3 : 1996
Working data will be in .Data
> # Read in and print out data
> eff <- read.table("effluent.dat",header=T)
> eff
   BODLabA SSLabA BODLabB SSLabB
 1       6     27      25     15
 2       6     23      28     13
 3      18     64      36     22
 4       8     44      35     29
 5      11     30      15     31
 6      34     75      44     64
 7      28     26      42     30
 8      71    124      54     64
 9      43     54      34     56
10      33     30      29     20
11      20     14      39     21
Do some graphical preliminary analysis.
Look for non-normality, non-linearity, outliers.
Make plots on screen or saved in file.
> # Make pairwise scatterplots on screen using
> # motif graphics device and then in a postscript
> # file.
> motif()
> pairs(eff)
> postscript("pairs.ps",horizontal=F,
+ height=6,width=6)
> pairs(eff)
> dev.off()
Generated postscript file "pairs.ps".
 motif
     2
> cor(eff)
          BODLabA    SSLabA   BODLabB    SSLabB
BODLabA 0.9999999 0.7807413 0.7228161 0.7886035
 SSLabA 0.7807413 1.0000000 0.6771183 0.7896656
BODLabB 0.7228161 0.6771183 1.0000001 0.6038079
 SSLabB 0.7886035 0.7896656 0.6038079 1.0000001
Notice high correlations.
Mostly caused by variation in true levels from sample to sample.
Get partial correlations.
Adjust for overall BOD and SS content of sample.
> aug <- cbind(eff,(eff[,1]+eff[,3])/2,
+ (eff[,2]+eff[,4])/2)
> aug
   BODLabA SSLabA BODLabB SSLabB   X2   X3
 1       6     27      25     15 15.5 21.0
 2       6     23      28     13 17.0 18.0
 3      18     64      36     22 27.0 43.0
 4       8     44      35     29 21.5 36.5
 5      11     30      15     31 13.0 30.5
 6      34     75      44     64 39.0 69.5
 7      28     26      42     30 35.0 28.0
 8      71    124      54     64 62.5 94.0
 9      43     54      34     56 38.5 55.0
10      33     30      29     20 31.0 25.0
11      20     14      39     21 29.5 17.5
> bigS <- var(aug)
Now compute partial correlations for first four variables given means of BOD and SS:
> S11 <- bigS[1:4,1:4]
> S12 <- bigS[1:4,5:6]
> S21 <- bigS[5:6,1:4]
> S22 <- bigS[5:6,5:6]
> S11dot2 <- S11 - S12 %*% solve(S22,S21)
> S11dot2
           BODLabA     SSLabA    BODLabB     SSLabB
BODLabA  24.804665  -7.418491 -24.804665   7.418491
 SSLabA  -7.418491  59.142084   7.418491 -59.142084
BODLabB -24.804665   7.418491  24.804665  -7.418491
 SSLabB   7.418491 -59.142084  -7.418491  59.142084
> S11dot2SD <- diag(sqrt(diag(S11dot2)))
> S11dot2SD
         [,1]     [,2]     [,3]     [,4]
[1,] 4.980428 0.000000 0.000000 0.000000
[2,] 0.000000 7.690389 0.000000 0.000000
[3,] 0.000000 0.000000 4.980428 0.000000
[4,] 0.000000 0.000000 0.000000 7.690389
> R11dot2 <- solve(S11dot2SD)%*%
+ S11dot2%*%solve(S11dot2SD)
> R11dot2
          [,1]      [,2]      [,3]      [,4]
[1,]  1.000000 -0.193687 -1.000000  0.193687
[2,] -0.193687  1.000000  0.193687 -1.000000
[3,] -1.000000  0.193687  1.000000 -0.193687
[4,]  0.193687 -1.000000 -0.193687  1.000000
Notice little residual correlation. Carry out Hotelling's $T^2$ test of $H_0:\mu_W=0$.
> w <- eff[,1:2]-eff[3:4]
> dimnames(w)<-list(NULL,c("BODdiff","SSdiff"))
> w
      BODdiff SSdiff
 [1,]     -19     12
 [2,]     -22     10
etc
 [8,]      17     60
etc
> Sw <- var(w)
> cor(w)
          BODdiff    SSdiff
BODdiff 1.0000001 0.3057682
 SSdiff 0.3057682 1.0000000
> mw <- apply(w,2,mean)
> mw
   BODdiff   SSdiff
 -9.363636 13.27273
> Tsq <- 11*mw%*%solve(Sw,mw)
> Tsq
         [,1]
[1,] 13.63931
> FfromTsq <- (11-2)*Tsq/(2*(11-1))
> FfromTsq
        [,1]
[1,] 6.13769
> 1-pf(FfromTsq,2,9)
[1] 0.02082779
Conclusion: Pretty clear evidence of difference in mean level between labs. Which measurement causes the difference?
> TBOD <- sqrt(11)*mw[1]/sqrt(Sw[1,1])
> TBOD
   BODdiff
 -2.200071
> 2*pt(TBOD,1)
   BODdiff
 0.2715917
> 2*pt(TBOD,10)
    BODdiff
 0.05243474
> TSS <- sqrt(11)*mw[2]/sqrt(Sw[2,2])
> TSS
  SSdiff
 2.15153
> 2*pt(-TSS,10)
     SSdiff
 0.05691733
> postscript("differences.ps",
+ horizontal=F,height=6,width=6)
> plot(w)
> abline(h=0)
> abline(v=0)
> dev.off()
Conclusion? Neither? Not a problem - summarizes evidence!
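The S-Plus calculations above can be reproduced without any matrix library, which may help in following the arithmetic. The sketch below is not the author's code: it re-enters the effluent data from the table, forms the lab A minus lab B differences, and computes $T^2$, the $F$ statistic and the two univariate $t$ statistics. The p-value line relies on the closed form $P(F_{2,m}>f)=(1+2f/m)^{-m/2}$, which holds only because the numerator degrees of freedom equal 2 here.

```python
import math

# Effluent data from the table: (BOD lab A, SS lab A, BOD lab B, SS lab B).
eff = [(6, 27, 25, 15), (6, 23, 28, 13), (18, 64, 36, 22), (8, 44, 35, 29),
       (11, 30, 15, 31), (34, 75, 44, 64), (28, 26, 42, 30), (71, 124, 54, 64),
       (43, 54, 34, 56), (33, 30, 29, 20), (20, 14, 39, 21)]

# Differences lab A - lab B for BOD and SS.
w = [(a_bod - b_bod, a_ss - b_ss) for a_bod, a_ss, b_bod, b_ss in eff]
n, p = len(w), 2

mw = [sum(row[j] for row in w) / n for j in range(p)]
Sw = [[sum((row[i] - mw[i]) * (row[j] - mw[j]) for row in w) / (n - 1)
       for j in range(p)] for i in range(p)]

# Solve Sw x = mw by Cramer's rule (adequate for a 2 x 2 system).
det = Sw[0][0] * Sw[1][1] - Sw[0][1] * Sw[1][0]
x = [(Sw[1][1] * mw[0] - Sw[0][1] * mw[1]) / det,
     (Sw[0][0] * mw[1] - Sw[1][0] * mw[0]) / det]

Tsq = n * (mw[0] * x[0] + mw[1] * x[1])
F_from_Tsq = (n - p) * Tsq / (p * (n - 1))
# Upper tail of F with 2 and n - p df; closed form valid only for numerator df 2.
p_value = (1 + 2 * F_from_Tsq / (n - p)) ** (-(n - p) / 2)

t_bod = math.sqrt(n) * mw[0] / math.sqrt(Sw[0][0])
t_ss = math.sqrt(n) * mw[1] / math.sqrt(Sw[1][1])
print(Tsq, F_from_Tsq, p_value, t_bod, t_ss)
```

The printed values can be compared with the session output (13.63931, 6.13769, 0.02082779, -2.200071 and 2.15153).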
Problem: several tests at level 0.05 on same data. Simultaneous or Multiple comparisons.
Confidence interval for $a^T\mu$: $a^T\bar X\pm t_{n-1,\alpha/2}\sqrt{a^TSa/n}$.
Give coverage intervals for 6 parameters of interest: 4 entries in $\mu$ and $\mu_1-\mu_3$ and $\mu_2-\mu_4$.
Problem: each confidence interval has 5% error rate. Pick out last interval (on basis of looking most interesting) and ask about error rate?
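Solution: adjust 2.23, the $t$ multiplier, to get a multiplier $c$ with $P(a^T\mu\in a^T\bar X\pm c\sqrt{a^TSa/n}\ \mbox{for all}\ a)=1-\alpha$.
Based on inequality: $(x^Ty)^2\le(x^TAx)(y^TA^{-1}y)$ for any symmetric positive definite $A$.
Proof by Cauchy Schwarz: inner product of vectors $A^{1/2}x$ and $A^{-1/2}y$.
Put $x=a$, $y=\bar X-\mu$ and $A=S$ to get $n(a^T\bar X-a^T\mu)^2\le(a^TSa)\,T^2$ for all $a$, where $T^2=n(\bar X-\mu)^TS^{-1}(\bar X-\mu)$. So all the intervals cover whenever $T^2\le c^2$.
In fact the probability of this happening is exactly equal to $1-\alpha$ (when $c^2$ is the upper $\alpha$ point of the distribution of $T^2$) because for each data set the supremum of $n(a^T\bar X-a^T\mu)^2/(a^TSa)$ over $a$ is $T^2$.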
Our case
Coverage probability of single interval using ? From distribution:
Probability all 6 intervals would cover using ?
Use Bonferroni inequality: $P(\cup_iA_i)\le\sum_iP(A_i)$.
Usually just use
General Bonferroni strategy. If we want intervals for $\theta_1,\ldots,\theta_k$ get interval for $\theta_i$ at level $1-\alpha/k$. Simultaneous coverage probability is at least $1-\alpha$. Notice that Bonferroni narrower in our example unless the number of intervals $k$ is large.
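To see how much the Bonferroni adjustment widens an interval, one can compare the ordinary $t$ multiplier on 10 degrees of freedom (the 2.23 mentioned above) with the multiplier for $k=6$ intervals at overall level 0.05. The sketch below assumes nothing beyond the standard library: it integrates the $t$ density with Simpson's rule and finds quantiles by bisection, which is a crude stand-in for a proper quantile function rather than a recommended method.

```python
import math

def t_upper_tail(x, df, steps=2000):
    """P(T > x) for x >= 0, as 0.5 minus a Simpson's rule integral of the t density."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(u):
        return const * (1 + u * u / df) ** (-(df + 1) / 2)

    h = x / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3

def t_upper_quantile(alpha, df):
    """Find x with P(T > x) = alpha by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha:
            lo = mid   # tail still too heavy: move right
        else:
            hi = mid
    return (lo + hi) / 2

df, alpha, k = 10, 0.05, 6                  # n - 1 = 10, six intervals as in the text
single = t_upper_quantile(alpha / 2, df)    # one interval at level 0.95
bonferroni = t_upper_quantile(alpha / (2 * k), df)  # each of k intervals at level 1 - alpha/k
print(single, bonferroni)
```

The first number should be close to 2.23; the second is the larger multiplier that would be used in each of the six Bonferroni intervals.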
Motivations for :
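1: Hypothesis $H_0:\mu=0$ is true iff all hypotheses $H_{0a}:a^T\mu=0$ are true. Natural test for $H_{0a}$ rejects if $t(a)=n^{1/2}a^T\bar X/\sqrt{a^TSa}$ is large in absolute value.
Fact: $\sup_at^2(a)=T^2$.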
2: likelihood ratio method.
Compute
In our case to test find
Now write
Again conclude: likelihood ratio test rejects for $T^2>c$ where $c$ chosen to make level $\alpha$.
3: compare estimates of .
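In univariate regression $F$ tests to compare a restricted model with a full model have form $F=\frac{(ESS_{\rm Restricted}-ESS_{\rm Full})/df_{\rm Numerator}}{ESS_{\rm Full}/df_{\rm Error}}$.
Here: substitute matrices.
Analogue of ESS for full model: $E=\sum_i(X_i-\bar X)(X_i-\bar X)^T$.
Analogue of ESS for reduced model: $E+H=\sum_i(X_i-\mu_0)(X_i-\mu_0)^T$.
In 1 sample example: $H=n(\bar X-\mu_0)(\bar X-\mu_0)^T$.
Test of $\mu=\mu_0$ based on comparing $H$ to $E$.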
To make comparison. If null true
Measures of size based on eigenvalues of
Suggested size measures for :
For our matrix: eigenvalues all 0 except for one. (So really-matrix not close to .)
Largest eigenvalue is $n(\bar X-\mu_0)^TE^{-1}(\bar X-\mu_0)=T^2/(n-1)$.
But: see two sample problem for precise tests based on suggestions.
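Test $H_0:\mu_1=\mu_2$.
Case 1: for motivation only. $\Sigma_i$ known, $i=1,2$.
Natural test statistic: based on $D=\bar X_1-\bar X_2$, which has the $MVN_p(\mu_1-\mu_2,\ n_1^{-1}\Sigma_1+n_2^{-1}\Sigma_2)$ distribution.
If $\Sigma_i$ not known must estimate. No universally agreed best procedure (even for $p=1$ -- called Behrens-Fisher problem).
Usually: assume $\Sigma_1=\Sigma_2$.
If so: MLE of $\mu_i$ is $\bar X_i$ and of $\Sigma$ is $\hat\Sigma=\frac{1}{n_1+n_2}\sum_{i=1}^2\sum_{j=1}^{n_i}(X_{ij}-\bar X_i)(X_{ij}-\bar X_i)^T$.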
Possible test developments:
1) By analogy with 1 sample:
2) Union-intersection: test of based on
Get
3) Likelihood ratio: the MLE of for the unrestricted model is
This simplifies to
If are the eigenvalues of then
Two sample analysis in SAS on css network
data long;
infile 'tab57sh';
input group a b c;
run;
proc print;
run;
proc glm;
class group;
model a b c = group;
manova h=group / printh printe;
run;
Notes:
1) First 4 lines form DATA step:
a) creates data set named long by reading in 4 columns of data from file named tab57sh stored in same directory as I was in when I typed sas.
b) Calls variables group (=1 or 2 as label for the two groups) and a, b, c which are names for the 3 test scores for each subject.
2) Next two lines: print out data: result is (slightly edited)
Obs group  a  b  c
  1     1 19 20 18
  2     1 20 21 19
  3     1 19 22 22
etc till
 11     2 15 17 15
 12     2 13 14 14
 13     2 14 16 13
3) Then use proc glm to do analysis:
a) class group declares that the variable group defines levels of a categorical variable.
b) model statement says to regress the variables a, b, c on variable group.
c) manova statement says to do both 3 univariate regressions and a multivariate regression and to print out the $H$ and $E$ matrices where $H$ is the matrix corresponding to the presence of the factor group in the model.
Output of MANOVA: First univariate results
The GLM Procedure
Class Level Information
Class   Levels  Values
group        2  1 2
Number of observations 13

Dependent Variable: a
                    Sum of
Source      DF     Squares  Mean Square  F Value  Pr > F
Model        1   54.276923    54.276923    19.38  0.0011
Error       11   30.800000     2.800000
Corrd Tot   12   85.076923

R-Square  Coeff Var  Root MSE    a Mean
0.637975   10.21275  1.673320  16.38462

Source  DF   Type ISS   Mean Square  F Value  Pr > F
group    1   54.276923    54.276923    19.38  0.0011
Source  DF   TypeIIISS  Mean Square  F Value  Pr > F
group    1   54.276923    54.276923    19.38  0.0011

Dependent Variable: b
                    Sum of
Source      DF     Squares  Mean Square  F Value  Pr > F
Model        1   70.892308    70.892308    34.20  0.0001
Error       11   22.800000     2.072727
Corrd Tot   12   93.692308

Dependent Variable: c
                    Sum of
Source      DF     Squares  Mean Square  F Value  Pr > F
Model        1    94.77692     94.77692    39.64  <.0001
Error       11    26.30000      2.39090
Corrd Tot   12   121.07692
The matrices $E$ and $H$.
E = Error SSCP Matrix
       a     b     c
a   30.8  12.2  10.2
b   12.2  22.8   3.8
c   10.2   3.8  26.3

Partial Correlation Coefficients from the Error SSCP Matrix / Prob > |r|
DF = 11
          a         b         c
a  1.000000  0.460381  0.358383
                0.1320    0.2527
b  0.460381  1.000000  0.155181
     0.1320              0.6301
c  0.358383  0.155181  1.000000
     0.2527    0.6301

H = Type III SSCP Matrix for group
              a             b             c
a  54.276923077  62.030769231  71.723076923
b  62.030769231  70.892307692  81.969230769
c  71.723076923  81.969230769  94.776923077
The eigenvalues of $E^{-1}H$.
Characteristic Roots and Vectors of: E Inverse * H
H = Type III SSCP Matrix for group
E = Error SSCP Matrix
Characteristic            Characteristic Vector V'EV=1
      Root  Percent            a            b            c
  5.816159   100.00   0.00403434   0.12874606   0.13332232
  0.000000     0.00  -0.09464169  -0.10311602   0.16080216
  0.000000     0.00  -0.19278508   0.16868694   0.00000000

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of No Overall group Effect
H = Type III SSCP Matrix for group
E = Error SSCP Matrix
S=1 M=0.5 N=3.5
Statistic             Value      F  NumDF  DenDF  Pr > F
Wilks' Lambda        0.1467  17.45      3      9  0.0004
Pillai's Trace       0.8533  17.45      3      9  0.0004
Hotelling-Lawley Tr  5.8162  17.45      3      9  0.0004
Roy's Greatest Root  5.8162  17.45      3      9  0.0004
Things to notice:
Wilks' Lambda:
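Data $X_{ij}$, $j=1,\ldots,n_i$; $i=1,\ldots,k$.
Model $X_{ij}$ independent $MVN_p(\mu_i,\Sigma)$.
First problem of interest: test $H_0:\mu_1=\cdots=\mu_k$.
Based on $E$ and $H$. MLE of $\mu_i$ is $\bar X_i$.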
1 19 20 18
1 20 21 19
1 19 22 22
1 18 19 21
1 16 18 20
1 17 22 19
1 20 19 20
1 15 19 19
2 12 14 12
2 15 15 17
2 15 17 15
2 13 14 14
2 14 16 13
3 15 14 17
3 13 14 15
3 12 15 15
3 12 13 13
4 8 9 10
4 10 10 12
4 11 10 10
4 11 7 12
Code
data three;
infile 'tab57for3sams';
input group a b c;
run;
proc print;
run;
proc glm;
class group;
model a b c = group;
manova h=group / printh printe;
run;
data four;
infile 'table5.7';
input group a b c;
run;
proc print;
run;
proc glm;
class group;
model a b c = group;
manova h=group / printh printe;
run;
Pieces of output: first set of code does first 3 groups.
So: $H$ has rank 2.
Characteristic Roots & Vectors of: E Inverse * H
Characteristic           Characteristic Vector V'EV=1
      Root  Percent         a         b        c
6.90568180    96.94   0.01115   0.14375  0.08795
0.21795125     3.06  -0.07763  -0.09587  0.16926
0.00000000     0.00  -0.18231   0.13542  0.02083

S=2 M=0 N=5
Statistic          Value      F  NumDF  Den DF  Pr > F
Wilks'            0.1039   8.41      6      24  <.0001
Pillai's          1.0525   4.81      6      26  0.0020
Hotelling-Lawley  7.1236  13.79      6  14.353  <.0001
Roy's             6.9057  29.92      3      13  <.0001
NOTE: F Statistic for Roy's is an upper bound.
NOTE: F Statistic for Wilks' is exact.
Notice two eigenvalues not 0. Note that exact distribution for Wilks' Lambda is available. Now 4 groups
      Root  Percent         a         b        c
15.3752900    98.30   0.01128   0.13817  0.08126
 0.2307260     1.48  -0.04456  -0.09323  0.15451
 0.0356937     0.23  -0.17289   0.09020  0.04777

S=3 M=-0.5 N=6.5
Statistic            Value      F  NumDF  Den DF  Pr > F
Wilks'          0.04790913  10.12      9  36.657  <.0001
Pillai's        1.16086747   3.58      9      51  0.0016
Hot'ng-Lawley  15.64170973  25.02      9  20.608  <.0001
Roy's          15.37528995  87.13      3      17  <.0001
NOTE: F Statistic for Roy's is an upper bound.
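The four MANOVA criteria in the SAS output are all simple functions of the eigenvalues $\lambda_i$ of $E^{-1}H$: Wilks' Lambda is $\prod_i1/(1+\lambda_i)$, Pillai's trace is $\sum_i\lambda_i/(1+\lambda_i)$, the Hotelling-Lawley trace is $\sum_i\lambda_i$ and Roy's greatest root, as SAS reports it, is the largest $\lambda_i$. The sketch below (function and variable names are illustrative) recomputes the statistic values from the printed eigenvalues; it does not attempt the F approximations or p-values.

```python
def manova_statistics(eigenvalues):
    """Wilks, Pillai, Hotelling-Lawley and Roy criteria from eigenvalues of E^{-1}H."""
    wilks = 1.0
    for lam in eigenvalues:
        wilks /= (1.0 + lam)
    return {
        "wilks": wilks,
        "pillai": sum(lam / (1.0 + lam) for lam in eigenvalues),
        "hotelling_lawley": sum(eigenvalues),
        "roy": max(eigenvalues),
    }

# Eigenvalues copied from the SAS output for the 2, 3 and 4 group analyses.
two_groups = manova_statistics([5.816159, 0.0, 0.0])
three_groups = manova_statistics([6.90568180, 0.21795125, 0.0])
four_groups = manova_statistics([15.3752900, 0.2307260, 0.0356937])

for label, stats in [("2 groups", two_groups), ("3 groups", three_groups),
                     ("4 groups", four_groups)]:
    print(label, stats)
```

Comparing the printed dictionaries with the Value columns above gives a quick check that the reported criteria really are determined by the characteristic roots.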
Test ?
Define Then put
Data
.
Model: independent, .
Note: this is the fixed effects model.
Usual approach: define grand mean, main effects, interactions:
Test additive effects: for all .
Usual test based on ANOVA:
Stack observations into vector , say.
Estimate , , etc by least squares.
Form vectors with entries , etc.
Write
Fact: all vectors on RHS are independent and orthogonal. So:
Our problem is like this one BUT the errors are not modeled as independent.
In the analogy:
labels group.
labels the columns: ie is a, b, c.
runs from 1 to .
But
Now do analysis in SAS.
Tell SAS that the variables A, B and C are repeated measurements of the same quantity.
proc glm;
class group;
model a b c = group;
repeated scale;
run;
The results are as follows:
General Linear Models Procedure
Repeated Measures Analysis of Variance
Repeated Measures Level Information
Dependent Variable  A  B  C
Level of SCALE      1  2  3

Manova Test Criteria and Exact F Statistics for the Hypothesis of no SCALE Effect
H = Type III SS&CP Matrix for SCALE
E = Error SS&CP Matrix
S=1 M=0 N=7
Statistic           Value       F  NumDF  DenDF  Pr > F
Wilks' Lambda     0.56373  6.1912      2     16  0.0102
Pillai's Trace    0.43627  6.1912      2     16  0.0102
Hotelling-Lawley  0.77390  6.1912      2     16  0.0102
Roy's             0.77390  6.1912      2     16  0.0102
Note: should look at interactions first.
Manova Test Criteria and F Approximations for the Hypothesis of no SCALE*GROUP Effect
S=2 M=0 N=7
Statistic           Value       F  NumDF  DenDF  Pr > F
Wilks' Lambda     0.56333  1.7725      6     32  0.1364
Pillai's Trace    0.48726  1.8253      6     34  0.1234
Hotelling-Lawley  0.68534  1.7134      6     30  0.1522
Roy's             0.50885  2.8835      3     17  0.0662
NOTE: F Statistic for Roy's Greatest Root is an upper bound.
NOTE: F Statistic for Wilks' Lambda is exact.
Repeated Measures Analysis of Variance
Tests of Hypotheses for Between Subjects Effects
Source  DF  Type III SS  Mean Square      F  Pr > F
GROUP    3   743.900000   247.966667  70.93  0.0001
Error   17    59.433333     3.496078

Repeated Measures Analysis of Variance
Univariate Tests of Hypotheses for Within Subject Effects
Source: SCALE
                                       Adj Pr > F
DF  TypeIIISS     MS     F  Pr > F   G - G   H - F
 2     16.624  8.312  5.39  0.0093  0.0101  0.0093
Source: SCALE*GROUP
                                       Adj Pr > F
DF    TypeIII     MS     F  Pr > F   G - G   H - F
 6    18.9619  3.160  2.05  0.0860  0.0889  0.0860
Source: Error(SCALE)
DF  TypeIII SS  Mean Square
34     52.4667   1.54313725
Greenhouse-Geisser Epsilon = 0.9664
Huynh-Feldt Epsilon = 1.2806
Greenhouse-Geisser, Huynh-Feldt test to see if $\Sigma$ has certain structure.
Return to 2 way anova model. Express as:
For fixed effects model is iid .
For MANOVA model vector of is MVN but with covariance as for .
Intermediate model. Put in subject effect.
Assume
Essentially model says
Do univariate anova: The data reordered:
1 1 1 19
1 1 2 20
1 1 3 18
2 1 1 20
2 1 2 21
2 1 3 19
et cetera
2 4 2 10
2 4 3 12
3 4 1 11
3 4 2 10
3 4 3 10
4 4 1 11
4 4 2 7
4 4 3 12
The four columns are now labels for subject number, group, scale (a, b or c) and the response. The sas commands:
data long;
infile 'table5.7uni';
input subject group scale score;
run;
proc print;
run;
proc glm;
class group;
class scale;
class subject;
model score =group subject(group) scale group*scale;
random subject(group) ;
run;
Some of the output:
Dependent Variable: SCORE
                 Sum of    Mean
Source  DF      Squares  Square      F  Pr > F
Model   28     843.5333  30.126  19.52  0.0001
Error   34      52.4667   1.543
Total   62     896.0000

Root MSE  SCORE Mean
1.242231    15.33333

Source          DF   TypeISS        MS       F  Pr > F
GROUP            3  743.9000  247.9667  160.69  0.0001
SUBJECT(GROUP)  17   59.4333    3.4961    2.27  0.0208
SCALE            2   21.2381   10.6190    6.88  0.0031
GROUP*SCALE      6   18.9620    3.1603    2.05  0.0860

Source          DF  TypeIIISS        MS       F  Pr > F
GROUP            3   743.9000  247.9667  160.69  0.0001
SUBJECT(GROUP)  17    59.4333    3.4961    2.27  0.0208
SCALE            2    16.6242    8.3121    5.39  0.0093
GROUP*SCALE      6    18.9619    3.1603    2.05  0.0860

Source          Type III Expected Mean Square
GROUP           Var(Error) + 3 Var(SUBJECT(GROUP)) + Q(GROUP,GROUP*SCALE)
SUBJECT(GROUP)  Var(Error) + 3 Var(SUBJECT(GROUP))
SCALE           Var(Error) + Q(SCALE,GROUP*SCALE)
GROUP*SCALE     Var(Error) + Q(GROUP*SCALE)
Type I Sums of Squares:
Type III Sums of Squares:
Notice hypothesis of no group by scale interaction is acceptable.
Under the assumption of no such group by scale interaction the hypothesis of no group effect is tested by dividing group ms by subject(group) ms.
Value is 70.9 on 3,17 degrees of freedom.
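The two competing $F$ ratios for the group effect differ only in the denominator mean square. A small sketch of the arithmetic, using mean squares copied from the SAS output above (variable names are illustrative):

```python
# Mean squares copied from the SAS tables above.
ms_group = 247.966667          # GROUP, 3 df
ms_subject_in_group = 3.496078 # SUBJECT(GROUP), 17 df
ms_error = 1.54313725          # Error, 34 df

# Random subject effect: test GROUP against SUBJECT(GROUP).
f_random_subjects = ms_group / ms_subject_in_group

# Fixed effects table: GROUP is tested against the residual error instead.
f_fixed_effects = ms_group / ms_error

print(round(f_random_subjects, 2), round(f_fixed_effects, 2))
```

The first ratio is the value on 3 and 17 degrees of freedom quoted in the text; the second is the $F$ value printed in the fixed effects table.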
This is NOT the F value in the table above since the table above is for FIXED effects.
Notice that the sums of squares in this table match those produced in the repeated measures ANOVA. This is not accidental.