Canonical Correlations

General goal: explore correlation structure between two sets of variables $ {\bf X}_1$ and $ {\bf X}_2$.

Begin with population definitions.

Assume

$\displaystyle {\bf X}= \left[\begin{array}{c} {\bf X}_1\\  {\bf X}_2\end{array}\right]
$

where $ {\bf X}_1$ has $ p_1$ components and $ {\bf X}_2$ has $ p_2$ components.

Partition variance covariance matrix of $ {\bf X}$ as usual into $ {\boldsymbol\Sigma}_{ij}$.

Which linear combination of $ {\bf X}_1$ entries is most correlated with which linear combination of $ {\bf X}_2$ entries?

Consider vectors $ {\bf a}$, $ {\bf b}$.

$\displaystyle {\rm Corr}({\bf a}^T{\bf X}_1,{\bf b}^T{\bf X}_2) =
\frac{{\bf a...
... a}^T{\boldsymbol\Sigma}_{11}{\bf a}{\bf b}^T{\boldsymbol\Sigma}_{22}{\bf b}}}
$

Scale invariant as function of either $ {\bf a}$ or $ {\bf b}$.

So: maximize $ {\bf a}^T{\boldsymbol\Sigma}_{12} {\bf b}$ subject to two conditions:

$\displaystyle {\bf a}^T{\boldsymbol\Sigma}_{11i}{\bf a}=1={\bf b}^T{\boldsymbol\Sigma}_{22}{\bf b}.
$

Two Lagrange multipliers gives equations:

$\displaystyle {\boldsymbol\Sigma}_{12}{\bf b}$ $\displaystyle =\lambda_1{\boldsymbol\Sigma}_{11}{\bf a}$    
$\displaystyle {\boldsymbol\Sigma}_{21}{\bf a}$ $\displaystyle =\lambda_2{\boldsymbol\Sigma}_{22}{\bf b}$    

Manipulate to get

$\displaystyle {\boldsymbol\Sigma}_{12}{\boldsymbol\Sigma}_{22}^{-1} {\boldsymbol\Sigma}_{21}{\bf a} =
\lambda_1\lambda_2{\boldsymbol\Sigma}_{11}{\bf a}
$

Homework problem: maximize $ {\bf a}^T {\bf x} $ subject to $ {\bf x}^T{\bf Q}{\bf x}= 1$ over $ {\bf x}$.

Equivalent to maximizing

$\displaystyle \frac{{\bf a}^T {\bf x}}{\sqrt{{\bf x}^T{\bf Q}{\bf x}}}
$

Solution is

$\displaystyle {\bf x} = {\bf Q}^{-1} {\bf a}
$

Maximum value is

$\displaystyle \sqrt{{\bf a}^T{\bf Q}^{-1} {\bf a}}
$

Apply to our problem to maximize over $ {\bf b}$ with $ {\bf a}$ fixed. Get

$\displaystyle \sqrt{\frac{{\bf a}^T{\boldsymbol\Sigma}_{12}{\boldsymbol\Sigma}_...
...} {\boldsymbol\Sigma}_{21}{\bf a}}{
{\bf a}^T{\boldsymbol\Sigma}_{11}{\bf a}}}
$

Generalized eigenvalue problems: suppose $ {\bf A}$, $ {\bf B}$ symmetric and suppose $ {\bf B}$ not singular. Find solutions of

$\displaystyle ({\bf A}-\lambda{\bf B}){\bf v} = 0
$

for non-zero $ {\bf v}$. (Notice solution set is a vector space.)

Write $ {\bf B}={\bf B}^{1/2}{\bf B}^{1/2}$ for symmetric $ {\bf B}^{1/2}$.

Define $ {\bf w} ={\bf B}^{1/2} {\bf v}$, multiply basic equation by $ {\bf B}^{-1/2}$ to get

$\displaystyle {\bf B}^{-1/2}{\bf A}{\bf B}^{-1/2}{\bf w} = \lambda {\bf w}
$

Thus solutions are of form $ (\lambda,{\bf B}^{-1/2}{\bf w})$ where $ (\lambda,{\bf w})$ is an eigenvalue-eigenvector pair for

$\displaystyle {\bf B}^{-1/2}{\bf A}{\bf B}^{-1/2}{\bf w}.$

Equivalently: find eigenvalues of $ {\bf B}^{-1}{\bf A}$ or of $ {\bf A}{\bf B}^{-1}$.

Corresponding maximization problem: maximize

$\displaystyle \frac{{\bf v}^t {\bf A}{\bf v}}{{\bf v}^t{\bf B}{\bf v}}
$

Maximum value is largest $ \lambda$. Application to our problem:

Having maximized over $ {\bf b}$ now must maximize correlation squared by maximizing

$\displaystyle \frac{{\bf a}^T{\boldsymbol\Sigma}_{12}{\boldsymbol\Sigma}_{22}^{-1} {\boldsymbol\Sigma}_{21}{\bf a}}{
{\bf a}^T{\boldsymbol\Sigma}_{11}{\bf a}}
$

so $ {\bf B}={\boldsymbol\Sigma}_{11}$ and $ {\bf A}= {\boldsymbol\Sigma}_{12}{\boldsymbol\Sigma}_{22}^{-1}{\boldsymbol\Sigma}_{21}$.

Maximal squared correlation is largest eigenvalue of

$\displaystyle {\boldsymbol\Sigma}_{11}^{-1}{\boldsymbol\Sigma}_{12}{\boldsymbol\Sigma}_{22}^{-1}{\boldsymbol\Sigma}_{21}.
$

First canonical correlates: find largest eigenvalue of

$\displaystyle {\boldsymbol\Sigma}_{11}^{-1}{\boldsymbol\Sigma}_{12}{\boldsymbol\Sigma}_{22}^{-1}{\boldsymbol\Sigma}_{21}
$

and $ {\bf a}_1$ the corresponding eigenvector. Then

$\displaystyle {\bf b}_1 = {\boldsymbol\Sigma}_{22}^{-1} {\boldsymbol\Sigma}_{21}{\bf a}_1$

Maximal value of squared correlation is largest eigenvalue

Corresponding $ {\bf a}_1$, $ {\bf b}_1$ are eigenvectors of

$\displaystyle {\boldsymbol\Sigma}_{11}^{-1}{\boldsymbol\Sigma}_{12}{\boldsymbol\Sigma}_{22}^{-1}{\boldsymbol\Sigma}_{21}
$

and

$\displaystyle {\boldsymbol\Sigma}_{22}^{-1}{\boldsymbol\Sigma}_{21}{\boldsymbol\Sigma}_{11}^{-1}{\boldsymbol\Sigma}_{12}
$

Usually normalized.

Second canonical correlates: repeat maximization but require:

Independence of $ {\bf a}_2^T {\bf X}_1$ and $ {\bf a}_1^T {\bf X}_1$ and of $ {\bf b}_2^T {\bf X}_2$ and $ {\bf b}_1^T {\bf X}_2$.

And so on to get $ \min\{p_1,p_2\}$ triples:

$\displaystyle \lambda_i,{\bf a}_i,{\bf b}_i
$

with $ \lambda_i$ in decreasing order.

With data: just replace $ {\boldsymbol\Sigma}$ by $ {\bf S}$.

Canonical Correlations example

Data: Table 9.12 in Johnson and Wichern.

Sales data: 3 measurements on the sales performance of 50 salespeople for a large firm and 4 test scores.

Use SAS to do canonical correlation analysis between the first 3 variables and the remaining 4.

Here is the SAS code.


data sales;
 infile "T9-12.DAT";
 input growth profit new 
         create mech abst math;
proc cancorr;
 var growth profit new;
 with create mech abst math;
run;

And here is the output.


        Canonical Correlation Analysis
            Adjusted   Approx   Squared
  Canonical Canonical Standard Canonical
   Correln  Correln   Error    Correln
1 0.994483  0.994021  0.001572  0.988996
2 0.878107  0.872097  0.032704  0.771071
3 0.383606  0.366795  0.121835  0.147153
Eigenvalues of INV(E)*H =CanRsq/(1-CanRsq)

 Eigenvalue Difference Proportion  Cumul
1  89.8745   86.5063     0.9621    0.9621
2   3.3682    3.1956     0.0361    0.9982
3   0.1725         .     0.0018    1.0000
Notice the first canonical correlates account for most of the correlation between the two sets of variables.


   Canonical Correlation Analysis
Test of H0: The canonical correlations in 
  current row and all that follow are zero
  Likelihood
      Ratio  Approx F Num DF Den DF  Pr > F
1 0.00214847  87.3915   12   114.06  0.0001
2 0.19524127  18.5263    6    88     0.0001
3 0.85284669   3.8822    2    45     0.0278
Multivariate Statistics and F Approximations
S=3    M=0    N=20.5
Statistic    Value     F  NumDF DenDF Pr > F
Wilks'      0.0022   87.39  12  114.1 0.0001
Pillai's    1.9072   19.63  12  135   0.0001
Hotel'g-Ly 93.4152  324.35  12  125   0.0001
Roy's      89.8745 1011.08   4   45   0.0001
NOTE: F Statistic for Roy's Greatest Root is 
an upper bound.
Clearly all 3 of the canonical correlations are non-zero. The multivariate tests are of $ H_o: \Sigma_{12}=0$.


Canonical Correlation Analysis
Raw Canonical Coefficients 
       for the 'VAR' Variables
            V1      V2       V3
GROWTH  0.06238  -0.17407 -0.37715
PROFIT  0.02093   0.24216  0.10352
NEW     0.07826  -0.23829  0.38342
Raw Canonical Coefficients 
       for the 'WITH' Variables
          W1       W2       W3
CREATE 0.06975 -0.19239  0.24656
MECH   0.03074  0.20157 -0.14190
ABST   0.08956 -0.49576 -0.28022
MATH   0.06283  0.06832  0.01133
Canonical Correlation Analysis
Standardized Canonical Coefficients 
     for the 'VAR' Variables
          V1     V2      V3
GROWTH 0.4577 -1.2772 -2.7673
PROFIT 0.2119  2.4517  1.0480
NEW    0.3688 -1.1229  1.8067
Standardized Canonical Coefficients 
     for the 'WITH' Variables
         W1      W2      W3
CREATE 0.2755 -0.7600  0.9739
MECH   0.1040  0.6823 -0.4803
ABST   0.1916 -1.0607 -0.5996
MATH   0.6621  0.7199  0.1194

The standardized coefficients are easiest to interpret.

For instance a quantity which is not too different from the average of the 3 sales indices is strongly correlated with a weighted average of the psychological test scores which puts most of the weight on Math.

The second set of correlates focus on the relation between profitability minus the average of the other two sales indices correlated with (Math + Mech) - (Abstract + Creativity).

The last one is not particularly meaningful to me but it is a very small part of the correlation structure between the two sets of variables.


Canonical Structure
Correlations Between the 'VAR' Variables 
         and Their Canonical Variables
         V1      V2      V3
GROWTH 0.9799  0.0006 -0.1996
PROFIT 0.9464  0.3229  0.0075
NEW    0.9519 -0.1863  0.2434
Correlations Between the 'WITH' Variables 
         and Their Canonical Variables

         W1      W2      W3
CREATE 0.6383 -0.2157  0.6514
MECH   0.7212  0.2376 -0.0677
ABST   0.6472 -0.5013 -0.5742
MATH   0.9441  0.1975 -0.0942
Canonical Structure
Correlations Between the 'VAR' Variables and the
Canonical Variables of the 'WITH' Variables
         W1      W2     W3
GROWTH 0.9745  0.0006 -0.0766
PROFIT 0.9412  0.2835  0.0029
NEW    0.9466 -0.1636  0.0934
Correlations Between the 'WITH' Variables and
the Canonical Variables of the 'VAR' Variables
         V1      V2     V3
CREATE 0.6348 -0.1894  0.2499
MECH   0.7172  0.2086 -0.0260
ABST   0.6437 -0.4402 -0.2203
MATH   0.9389  0.1735 -0.0361



Richard Lockhart
2002-11-22