#One-sample Hotelling T^2 test
#x1 is the sample mean vector, x2 is our target (hypothesized) mean vector
#there is only one sample here; x2 is not a second sample mean

x1=c(2.218, 0.503,1.598)
x2=c(2.20,0.50,1.60)
x=cbind(x1,x2)

n=50
p=length(x1)



S=matrix(0.001*c(1.48, 0.85, 0.33, 0.85, 2.10, 0.42, 0.33, 0.42, 1.40),nrow=3,byrow=TRUE)

invS=solve(S)

#null hypothesis: the population mean vector (estimated by x1) equals
#the target mean vector (x2)

hotelling=n*(x1-x2)%*%invS%*%t(t(x1-x2))

#(n-p)/((n-1)*p) * T^2 has an F distribution, with df1=p=3,df2=n-p=50-3=47
#p_value = P(T2>hotelling)=P(F>hotelling*(n-p)/((n-1)*p)); this comes
#from the relationship between the Hotelling T^2 distribution and the F distribution

def1=p
def2=n-p


#note the parentheses: T^2 is divided by (n-1)*p, not multiplied by p
h=hotelling*(n-p)/((n-1)*p)


(p_value=1-pf(h,df1=def1,df2=def2))

#The F statistic is about 4.34, so p_value is roughly 0.009 < 0.05=alpha
#and we reject the null hypothesis
#the conclusion is: the sample mean vector is significantly different
#from the target mean vector at the 0.05 level (and at 0.1);
#at the 0.01 level the rejection is only marginal
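
#The steps above can be wrapped in one function. This is only a minimal
#sketch: the name hotelling_one_sample and its arguments are hypothetical
#(not from any package), and it assumes the summary statistics come from
#n i.i.d. multivariate normal observations.
hotelling_one_sample=function(xbar,mu0,S,n){
  p=length(xbar)
  d=xbar-mu0
  T2=n*drop(t(d)%*%solve(S,d))       #T^2 = n * d' S^{-1} d
  Fstat=T2*(n-p)/((n-1)*p)           #scaled T^2 is F(p, n-p) under the null
  p_val=pf(Fstat,df1=p,df2=n-p,lower.tail=FALSE)
  list(T2=T2,F=Fstat,df1=p,df2=n-p,p_value=p_val)
}

#usage with the same summary statistics as this script (values repeated
#here so the sketch runs on its own)
S_chk=matrix(0.001*c(1.48,0.85,0.33,0.85,2.10,0.42,0.33,0.42,1.40),nrow=3,byrow=TRUE)
res_chk=hotelling_one_sample(c(2.218,0.503,1.598),c(2.20,0.50,1.60),S_chk,n=50)
res_chk$T2
res_chk$p_value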


#Since the Hotelling test rejected the null you can
#construct simultaneous confidence intervals for the individual means
#(subtract the target value to get intervals for the mean differences)
#i.e. x_bar +- sqrt((n-1)*p/(n-p) * F_{p,n-p}(alpha)) * s/sqrt(n)
# more reading on page 276, see Example 6.1

#construct the first confidence interval

ci_lower=x1[1]-sqrt((n-1)*p/(n-p)*qf(.95,df1=def1,df2=def2))* sqrt(S[1,1]/n)
ci_upper=x1[1]+sqrt((n-1)*p/(n-p)*qf(.95,df1=def1,df2=def2))* sqrt(S[1,1]/n)
print("The first confidence interval is:")
(CI=c(ci_lower,ci_upper))
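
#The same T^2 interval can be computed for all p components at once.
#A minimal sketch; the helper name t2_simultaneous_ci is hypothetical and
#the inputs repeat the summary statistics used above so it runs on its own.
t2_simultaneous_ci=function(xbar,S,n,alpha=0.05){
  p=length(xbar)
  #common multiplier sqrt((n-1)*p/(n-p) * F quantile)
  crit=sqrt((n-1)*p/(n-p)*qf(1-alpha,df1=p,df2=n-p))
  half=crit*sqrt(diag(S)/n)          #half-width for each component
  cbind(lower=xbar-half,upper=xbar+half)
}

S_t2=matrix(0.001*c(1.48,0.85,0.33,0.85,2.10,0.42,0.33,0.42,1.40),nrow=3,byrow=TRUE)
xbar_t2=c(2.218,0.503,1.598)
(ci_t2=t2_simultaneous_ci(xbar_t2,S_t2,n=50))
#a component differs from its target when the target value falls outside its row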


# W=sqrt(qf(.95,def1,def2)*2)
#Bonferroni 100(1-alpha)% confidence interval
#i.e. x_bar +- t_{n-1}(alpha/(2p)) * s/sqrt(n)
alpha.2p=0.05/(2*p)

#upper-tail quantile, so the multiplier is positive
t_crit=qt(1-alpha.2p,df=n-1)
ci_lower=x1[1]-t_crit* sqrt(S[1,1]/n)
ci_upper=x1[1]+t_crit* sqrt(S[1,1]/n)
print("The first Bonferroni confidence interval is:")
(CI=c(ci_lower,ci_upper))


#The Bonferroni method is more appropriate when only the p individual means
#are of interest: it still provides adequate simultaneous protection,
#but with less conservatism than the T^2-based intervals.


#Since Hotelling rejected the null, we can use univariate t-based confidence
#intervals (Bonferroni or Scheffe); that is why we choose Bonferroni
#Also notice the Bonferroni interval is narrower than the Hotelling
#F-distribution-based confidence interval
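
#A quick numeric check of the width comparison. Both intervals share the
#factor s/sqrt(n), so comparing the multipliers is enough. This is a sketch
#for the n=50, p=3, alpha=0.05 setting of this script; the *_cmp names are
#introduced here only for illustration.
n_cmp=50
p_cmp=3
alpha_cmp=0.05
mult_t=qt(1-alpha_cmp/2,df=n_cmp-1)                    #unadjusted t interval
mult_bonf=qt(1-alpha_cmp/(2*p_cmp),df=n_cmp-1)         #Bonferroni
mult_T2=sqrt((n_cmp-1)*p_cmp/(n_cmp-p_cmp)*
             qf(1-alpha_cmp,df1=p_cmp,df2=n_cmp-p_cmp)) #T^2 simultaneous
c(t=mult_t,Bonferroni=mult_bonf,T2=mult_T2)
#ratio of Bonferroni width to T^2 width
mult_bonf/mult_T2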

#This is analogous to Fisher's recommendation for handling multiple
#comparisons in univariate analysis of variance


#Suppose only one variable violates the null hypothesis

#This alone may cause T2 to be significant

#So if there are p variables, the follow-up tests cover p-1 true null
#hypotheses and may find spurious evidence of significance with a probability
#(the familywise type I error rate, not a p-value) that can be
#considerably higher than alpha
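
#A rough illustration of that inflation. It assumes, purely for simplicity,
#that the m follow-up tests are independent (the components here are
#correlated, so the exact numbers would differ); the *_fw names are
#introduced only for this sketch.
alpha_fw=0.05
m_fw=1:5                                    #number of true nulls tested
fwer_unadj=1-(1-alpha_fw)^m_fw              #each test at level alpha
fwer_bonf=1-(1-alpha_fw/m_fw)^m_fw          #each test at level alpha/m
round(cbind(m=m_fw,unadjusted=fwer_unadj,bonferroni=fwer_bonf),4)
#with p=3 variables there are p-1=2 true nulls: row m=2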