Linear Algebra Review

Notation:

Defn: The transpose, $ A^T$, of an $ m \times n$ matrix $ A$ is the $ n \times m$ matrix whose entries are given by

$\displaystyle (A^T)_{ij} = A_{ji}
$

so that $ A^T$ is $ n \times m$. We have

$\displaystyle (A+B)^T = A^T+B^T
$

and

$\displaystyle (AB)^T = B^T A^T \, .
$
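These identities are easy to check numerically. A minimal NumPy sketch (not part of the original notes; the small matrices are arbitrary illustrations):

    import numpy as np

    A = np.array([[1., 2., 3.], [4., 5., 6.]])    # 2 x 3
    B = np.array([[1., 0.], [2., 1.], [0., 3.]])  # 3 x 2
    C = np.array([[0., 1., 1.], [1., 0., 2.]])    # 2 x 3

    print(np.allclose((A + C).T, A.T + C.T))      # (A+C)^T = A^T + C^T
    print(np.allclose((A @ B).T, B.T @ A.T))      # (AB)^T = B^T A^T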

Defn: rank of matrix $ A$, rank$ (A)$: # of linearly independent columns of $ A$. We have

$\displaystyle \text{rank}(A) = \dim(\text{column space of } A) = \dim(\text{row space of } A) = \text{rank}(A^T)
$

If $ A$ is $ m \times n$ then rank$ (A) \le \min(m,n)$.
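A quick numerical check of the rank facts (a NumPy sketch, not from the notes; the matrix is an arbitrary illustration):

    import numpy as np

    A = np.array([[1., 2., 3.],
                  [2., 4., 6.],   # twice the first row
                  [0., 1., 1.]])
    print(np.linalg.matrix_rank(A))    # 2
    print(np.linalg.matrix_rank(A.T))  # 2, equals rank(A)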

Matrix inverses

For now: all matrices square $ n \times n$.

If there is a matrix $ B$ such that $ BA=I_{n \times n}$ then we call $ B$ the inverse of $ A$. If $ B$ exists it is unique and $ AB=I$ and we write $ B=A^{-1}$. The matrix $ A$ has an inverse if and only if rank$ (A) = n$.

Inverses have the following properties:

$\displaystyle (AB)^{-1} = B^{-1}A^{-1}
$

(if one side exists then so does the other) and

$\displaystyle (A^T)^{-1} = (A^{-1})^T
$
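Both identities can be verified numerically. A NumPy sketch (not part of the original notes; random matrices are invertible with probability one):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 3))
    B = rng.standard_normal((3, 3))

    # (AB)^{-1} = B^{-1} A^{-1}
    print(np.allclose(np.linalg.inv(A @ B), np.linalg.inv(B) @ np.linalg.inv(A)))
    # (A^T)^{-1} = (A^{-1})^T
    print(np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T))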

Determinants

Again $ A$ is $ n \times n$. The determinant is a function on the set of $ n \times n$ matrices such that:

  1. det$ (I) = 1$.

  2. If $ A^\prime$ is the matrix $ A$ with two columns interchanged then

       det$\displaystyle (A^\prime) = -$   det$\displaystyle (A)\, .
$

    (So: two equal columns implies det$ (A)=0$.)

  3. det$ (A)$ is a linear function of each column of $ A$. If $ A = ( a_1, \ldots, a_n)$ with $ a_i$ denoting the $ i$th column of the matrix then

    $\displaystyle \det(a_1,\ldots,a_i+b_i,\ldots,a_n) = \det(a_1,\ldots,a_i,\ldots,a_n) + \det(a_1,\ldots,b_i,\ldots,a_n)
$

Here are some properties of the determinant:

  1. det$ (A^T) =$   det$ (A)$.

  2. det$ (AB) =$   det$ (A)$det$ (B)$.

  3. det$ (A^{-1}) = 1/$det$ (A)$.

  4. $ A$ is invertible if and only if det$ (A) \neq 0$ if and only if rank$ (A) = n$.

  5. Determinants can be computed (slowly) by expansion by minors.
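Properties 1-3 can be checked numerically; a NumPy sketch (not part of the original notes, using random matrices for illustration):

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((4, 4))
    B = rng.standard_normal((4, 4))

    print(np.allclose(np.linalg.det(A.T), np.linalg.det(A)))                       # det(A^T) = det(A)
    print(np.allclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))  # det(AB) = det(A) det(B)
    print(np.allclose(np.linalg.det(np.linalg.inv(A)), 1 / np.linalg.det(A)))      # det(A^{-1}) = 1/det(A)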

Special Kinds of Matrices

  1. $ A$ is symmetric if $ A^T=A$.

  2. $ A$ is orthogonal if $ A^T=A^{-1}$ (or $ AA^T = A^TA=I$).

  3. $ A$ is idempotent if $ AA\equiv A^2 = A$.

  4. $ A$ is diagonal if $ i \neq j$ implies $ A_{ij}=0$.

Inner Products, Orthogonal and Orthonormal Vectors

Defn: The inner product or dot product of $ x$ and $ y$ is

$\displaystyle <x,y> = x^Ty = \sum x_i y_i
$

Defn: $ x$ and $ y$ are orthogonal if $ x^Ty=0$.

Defn: The norm (or length) of $ x$ is $ \vert\vert x\vert\vert = (x^Tx)^{1/2} = (\sum x_i^2)^{1/2}$

$ A$ is orthogonal if each column of $ A$ has length 1 and is orthogonal to each other column of $ A$.
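A small numerical illustration of inner products, norms and an orthogonal matrix (a NumPy sketch, not part of the original notes; the vectors and the rotation matrix are arbitrary choices):

    import numpy as np

    x = np.array([1., 2., 2.])
    y = np.array([2., 1., -2.])
    print(x @ y)              # <x, y> = 0, so x and y are orthogonal
    print(np.sqrt(x @ x))     # ||x|| = 3

    theta = 0.3               # any 2 x 2 rotation matrix is orthogonal
    Q = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    print(np.allclose(Q.T @ Q, np.eye(2)))   # columns have length 1 and are mutually orthogonal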

Quadratic Forms

Suppose $ A$ is an $ n \times n$ matrix. The function

$\displaystyle g(x) = x^T A x
$

is called a quadratic form. Now

$\displaystyle g(x) = \sum_{ij} A_{ij} x_i x_j = \sum_i A_{ii} x_i^2 + \sum_{i < j} (A_{ij}+A_{ji}) x_ix_j
$

so that $ g(x)$ depends only on the total $ A_{ij}+A_{ji}$. In fact

$\displaystyle x^T Ax = x^T A^T x = x^T\left(\frac{A+A^T}{2} \right) x
$

Thus we will assume that $ A$ is symmetric.
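A numerical check that only the symmetric part of $ A$ matters (a NumPy sketch, not part of the original notes; the matrix and vector are random illustrations):

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((3, 3))   # not symmetric in general
    x = rng.standard_normal(3)

    S = (A + A.T) / 2                 # symmetric part of A
    print(np.allclose(x @ A @ x, x @ S @ x))   # same quadratic form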

Eigenvalues and eigenvectors

If $ A$ is $ n \times n$, $ v\neq 0\in \mathbb {R}^n$, and $ \lambda\in \mathbb {R}$ are such that

$\displaystyle Av=\lambda v
$

then $ \lambda$ is an eigenvalue (or characteristic or latent value) of $ A$ and $ v$ is a corresponding eigenvector. Since $ Av-\lambda v = (A-\lambda I)v=0$ with $ v \neq 0$, the matrix $ A-\lambda I$ is singular.

Therefore det$ (A-\lambda I) =0$.

Conversely: if $ A-\lambda I$ singular then there is $ v \neq 0$ such that $ (A-\lambda I)v=0$.

Fact: det$ (A-\lambda I)$ is a polynomial in $ \lambda$ of degree $ n$.

Each root is an eigenvalue.

For general $ A$ the roots could be multiple roots or complex valued.
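A numerical illustration (a NumPy sketch, not part of the original notes; the 2 x 2 matrix is an arbitrary example with real eigenvalues):

    import numpy as np

    A = np.array([[0., 1.],
                  [-2., 3.]])
    lam, V = np.linalg.eig(A)      # eigenvalues and eigenvectors (columns of V)
    for l, v in zip(lam, V.T):
        print(np.allclose(A @ v, l * v))                        # A v = lambda v
        print(np.isclose(np.linalg.det(A - l * np.eye(2)), 0))  # det(A - lambda I) = 0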

Diagonalization

Matrix $ A$ is diagonalized by a non-singular matrix $ P$ if $ P^{-1} A P \equiv D$ is diagonal.

If so then $ AP=PD$ so each column of $ P$ is an eigenvector of $ A$, with the $ i$th column having eigenvalue $ D_{ii}$.

Thus to be diagonalizable $ A$ must have $ n$ linearly independent eigenvectors.
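A numerical check (a NumPy sketch, not part of the original notes; the matrix is an arbitrary example with distinct eigenvalues):

    import numpy as np

    A = np.array([[4., 1.],
                  [2., 3.]])
    lam, P = np.linalg.eig(A)      # columns of P are eigenvectors
    D = np.diag(lam)
    print(np.allclose(np.linalg.inv(P) @ A @ P, D))   # P^{-1} A P = D
    print(np.allclose(A @ P, P @ D))                  # equivalently A P = P D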

Symmetric Matrices

If $ A$ is symmetric then

  1. Every eigenvalue of $ A$ is real (not complex).

  2. $ A$ is diagonalizable, and the columns of $ P$ may be taken to have unit length and to be mutually orthogonal: $ A$ is diagonalized by an orthogonal matrix $ P$; in symbols, $ P^TAP = D$.

  3. The diagonal entries of $ D$ are the eigenvalues of $ A$.

  4. If $ \lambda_1 \neq \lambda_2$ are two eigenvalues of $ A$ and $ v_1$ and $ v_2$ are corresponding eigenvectors then

    $\displaystyle v_1^T A v_2 = v_1^T \lambda_2 v_2 = \lambda_2 v_1^T v_2
$

    and

    $\displaystyle (v_1^T A v_2 )$ $\displaystyle = (v_1^T A v_2 )^T = v_2^T A^T v_1$    
      $\displaystyle = v_2^T A v_1 = v_2^T \lambda_1 v_1 = \lambda_1 v_2^T v_1$    

    Since $ (\lambda_1-\lambda_2)v_1^T v_2 =0$ and $ \lambda_1 \neq \lambda_2$ we see $ v_1^T v_2 = 0$. Eigenvectors corresponding to distinct eigenvalues are orthogonal.
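These facts can be seen numerically with np.linalg.eigh, which is designed for symmetric matrices (a sketch, not part of the original notes; the matrix is a random symmetric example):

    import numpy as np

    rng = np.random.default_rng(3)
    M = rng.standard_normal((4, 4))
    A = (M + M.T) / 2                    # a symmetric matrix

    lam, P = np.linalg.eigh(A)           # real eigenvalues, orthonormal eigenvectors
    print(np.allclose(P.T @ P, np.eye(4)))           # P is orthogonal
    print(np.allclose(P.T @ A @ P, np.diag(lam)))    # P^T A P = D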

Positive Definite Matrices

Defn: A symmetric matrix $ {\bf A}$ is non-negative definite if $ x^T {\bf A}x \ge 0$ for all $ x$. It is positive definite if in addition $ x^T {\bf A}x = 0$ implies $ x=0$.

$ {\bf A}$ is non-negative definite iff all its eigenvalues are non-negative.

$ {\bf A}$ is positive definite iff all eigenvalues positive.

A non-negative definite matrix has a symmetric non-negative definite square root. If

$\displaystyle {\bf P} {\bf D} {\bf P}^T = {\bf A}
$

for $ {\bf P}$ orthogonal and $ \bf D$ diagonal then

$\displaystyle {\bf A}^{1/2} = {\bf P} {\bf D}^{1/2} {\bf P}^T
$

is symmetric, non-negative definite and

$\displaystyle {\bf A}^{1/2}{\bf A}^{1/2}= {\bf A}
$

Here $ {\bf D}^{1/2}$ is diagonal with

$\displaystyle ({\bf D}^{1/2})_{ii} = ({\bf D}_{ii})^{1/2}.
$

Many other square roots are possible. If $ {\bf A}{\bf A}^T = {\bf M}$ and $ \bf P$ is orthogonal and $ {\bf A}^*={\bf A}{\bf P}$ then $ {\bf A}^*({\bf A}^*)^T={\bf M}$.
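The symmetric square root can be computed from the spectral decomposition. A NumPy sketch (not part of the original notes; $ {\bf A}={\bf M}{\bf M}^T$ is a random non-negative definite example):

    import numpy as np

    rng = np.random.default_rng(4)
    M = rng.standard_normal((3, 3))
    A = M @ M.T                              # non-negative definite

    lam, P = np.linalg.eigh(A)               # A = P D P^T
    lam = np.clip(lam, 0, None)              # guard against tiny negative round-off
    Ahalf = P @ np.diag(np.sqrt(lam)) @ P.T  # A^{1/2} = P D^{1/2} P^T
    print(np.allclose(Ahalf, Ahalf.T))       # symmetric
    print(np.allclose(Ahalf @ Ahalf, A))     # squares back to A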

Orthogonal Projections

Suppose $ S$ is a vector subspace of $ \mathbb {R}^n$ and $ a_1,\ldots,a_m$ is a basis for $ S$. Given any $ x\in \mathbb {R}^n$ there is a unique $ y\in S$ which is closest to $ x$; $ y$ minimizes

$\displaystyle (x-y)^T(x-y)
$

over $ y\in S$. Any $ y$ in $ S$ is of the form

$\displaystyle y = c_1 a_1 + \cdots + c_m a_m = Ac
$

where $ A$ is the $ n \times m$ matrix with columns $ a_1,\ldots,a_m$ and $ c$ is the column vector with $ i$th entry $ c_i$. Define

$\displaystyle Q= A(A^TA)^{-1}A^T
$

($ A$ has rank $ m$ so $ A^TA$ is invertible.) Then

$\displaystyle (x$ $\displaystyle -Ac)^T (x-Ac)$    
  $\displaystyle = (x-Qx+Qx-Ac)^T (x-Qx+Qx-Ac)$    
  $\displaystyle = (x-Qx)^T(x-Qx) + (Qx-Ac)^T(x-Qx)$    
  $\displaystyle \quad +(x-Qx)^T(Qx-Ac)$    
  $\displaystyle \quad + (Qx-Ac)^T(Qx-Ac)$    

Note that $ x-Qx = (I-Q)x$ and that

$\displaystyle QAc = A(A^TA)^{-1}A^TAc = Ac
$

so that

$\displaystyle Qx-Ac = Q(x-Ac)
$

Then

$\displaystyle (Qx-Ac)^T(x-Qx) = (x-Ac)^T Q^T (I-Q) x
$

Since $ Q^T=Q$ we see that

$\displaystyle Q^T(I-Q)$ $\displaystyle = Q(I-Q)$    
  $\displaystyle = Q-Q^2$    
  $\displaystyle = Q-A(A^TA)^{-1}A^TA(A^TA)^{-1}A^T$    
  $\displaystyle = Q-Q = 0$    

This shows that

$\displaystyle (x-Ac)^T(x-Ac) =$ $\displaystyle (x-Qx)^T(x-Qx)$    
$\displaystyle \quad$ $\displaystyle + (Qx-Ac)^T(Qx-Ac)$    

Choose $ Ac$ to minimize: minimize second term.

Achieved by making $ Qx=Ac$.

Since $ Qx=A(A^TA)^{-1}A^T x$ can take

$\displaystyle c=(A^TA)^{-1}A^T x.$

Summary: closest point $ y$ in $ S$ is

$\displaystyle y=Qx=A(A^TA)^{-1}A^T x
$

We call $ y$ the orthogonal projection of $ x$ onto $ S$.

Notice that the matrix $ Q$ is idempotent:

$\displaystyle Q^2=Q
$

We call $ Qx$ the orthogonal projection of $ x$ on $ S$ because $ Qx$ is perpendicular to the residual $ x-Qx = (I-Q)x$.
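A numerical check of the projection (a NumPy sketch, not part of the original notes; $ A$ is a random $ 5 \times 2$ matrix whose columns span an arbitrary subspace $ S$):

    import numpy as np

    rng = np.random.default_rng(5)
    A = rng.standard_normal((5, 2))       # columns form a basis for S (rank 2 with probability one)
    x = rng.standard_normal(5)

    Q = A @ np.linalg.inv(A.T @ A) @ A.T  # Q = A (A^T A)^{-1} A^T
    print(np.allclose(Q @ Q, Q))          # idempotent
    print(np.allclose(Q, Q.T))            # symmetric
    print(np.isclose((Q @ x) @ (x - Q @ x), 0))   # Qx is perpendicular to the residual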

Partitioned Matrices

Suppose $ A_{11}$ is a $ p \times r$ matrix, $ A_{12}$ is $ p \times s$, $ A_{21}$ is $ q \times r$ and $ A_{22}$ is $ q \times s$. Make a $ (p+q) \times (r+s)$ matrix by arranging the $ A_{ij}$ in a 2 by 2 pattern of blocks:

$\displaystyle A = \left[ \begin{array}{cc}
A_{11} & A_{12}
\\
A_{21} & A_{22}
\end{array}\right]
$

For instance if

$\displaystyle A_{11} = \left[ \begin{array}{cc} 1 & 0 \\  0 & 1 \end{array}\right]
$

$\displaystyle A_{12} = \left[ \begin{array}{c} 2 \\  3 \end{array}\right]
$

$\displaystyle A_{21} = \left[ \begin{array}{cc} 4 & 5\end{array}\right]
$

and

$\displaystyle A_{22} = \left[ 6 \right]
$

then

$\displaystyle A = \left[ \begin{array}{cc|c}
1 & 0 & 2
\\
0 & 1 & 3
\\
\hline
4 & 5 & 6
\end{array}\right]
$

Lines indicate partitioning.
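This example can be assembled with np.block (a NumPy sketch, not part of the original notes):

    import numpy as np

    A11 = np.eye(2)
    A12 = np.array([[2.], [3.]])
    A21 = np.array([[4., 5.]])
    A22 = np.array([[6.]])

    A = np.block([[A11, A12],
                  [A21, A22]])
    print(A)
    # [[1. 0. 2.]
    #  [0. 1. 3.]
    #  [4. 5. 6.]]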

We can work with partitioned matrices just like ordinary matrices, always making sure that in products we never change the order of multiplication.

$\displaystyle \left[ \begin{array}{cc} A_{11} & A_{12} \\ A_{21} & A_{22} \end{array} \right] + \left[ \begin{array}{cc} B_{11} & B_{12} \\ B_{21} & B_{22} \end{array} \right] = \left[ \begin{array}{cc} A_{11}+B_{11} & A_{12}+B_{12} \\ A_{21}+B_{21} & A_{22}+B_{22} \end{array} \right]
$

and

$\displaystyle \left[ \begin{array}{cc} A_{11} & A_{12} \\ A_{21} & A_{22} \end{array} \right] \left[ \begin{array}{cc} B_{11} & B_{12} \\ B_{21} & B_{22} \end{array} \right] = \left[ \begin{array}{cc} A_{11}B_{11}+A_{12}B_{21} & A_{11}B_{12}+A_{12}B_{22} \\ A_{21}B_{11}+A_{22}B_{21} & A_{21}B_{12}+ A_{22}B_{22} \end{array}\right]
$

Note partitioning of $ A$ and $ B$ must match.

Addition: dimensions of $ A_{ij}$ and $ B_{ij}$ must be the same.

For the multiplication formula, $ A_{12}$ must have as many columns as $ B_{21}$ has rows, etc.

In general: need $ A_{ij}B_{jk}$ to make sense for each $ i,j,k$.

Works with more than a 2 by 2 partitioning.

Defn: A block diagonal matrix is a partitioned matrix $ A$ for which $ A_{ij}=0$ if $ i \neq j$. If

$\displaystyle A = \left[ \begin{array}{cc}
A_{11} & 0
\\
0 & A_{22}
\end{array}\right]
$

then $ A$ is invertible iff each $ A_{ii}$ is invertible and then

$\displaystyle A^{-1} = \left[ \begin{array}{cc}
A_{11}^{-1} & 0
\\
0 & A_{22}^{-1}
\end{array}\right]
$

Moreover det$ (A) =$   det$ (A_{11})$   det$ (A_{22})$. Similar formulas work for larger matrices.
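A numerical check of the block diagonal facts (a NumPy sketch, not part of the original notes; the blocks are arbitrary invertible examples):

    import numpy as np

    A11 = np.array([[2., 1.],
                    [1., 2.]])
    A22 = np.array([[3.]])
    A = np.block([[A11, np.zeros((2, 1))],
                  [np.zeros((1, 2)), A22]])

    Ainv = np.block([[np.linalg.inv(A11), np.zeros((2, 1))],
                     [np.zeros((1, 2)), np.linalg.inv(A22)]])
    print(np.allclose(np.linalg.inv(A), Ainv))                                     # blockwise inverse
    print(np.isclose(np.linalg.det(A), np.linalg.det(A11) * np.linalg.det(A22)))   # det(A) = det(A11) det(A22)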

Partitioned inverses. Suppose $ {\bf A}$ and $ {\bf C}$ are symmetric positive definite. Look for an inverse of

$\displaystyle \left[\begin{array}{cc} {\bf A}& {\bf B}\\  {\bf B}^T & {\bf C}\end{array}\right]
$

of the form

$\displaystyle \left[\begin{array}{cc} {\bf E}& {\bf F}\\  {\bf F}^T & {\bf G}\end{array}\right]
$

Multiply to get equations

$\displaystyle {\bf A}{\bf E}+{\bf B}{\bf F}^T$ $\displaystyle = {\bf I}$    
$\displaystyle {\bf A}{\bf F}+{\bf B}{\bf G}$ $\displaystyle = {\bf0}$    
$\displaystyle {\bf B}^T{\bf E}+{\bf C}{\bf F}^T$ $\displaystyle = {\bf0}$    
$\displaystyle {\bf B}^T{\bf F}+{\bf C}{\bf G}$ $\displaystyle = {\bf I}$    

Solve to get

$\displaystyle {\bf F}^T$ $\displaystyle =-{\bf C}^{-1} {\bf B}^T{\bf E}$    
$\displaystyle {\bf A}{\bf E}-{\bf B}{\bf C}^{-1}{\bf B}^T {\bf E}$ $\displaystyle = {\bf I}$    
$\displaystyle {\bf E}$ $\displaystyle = ({\bf A}-{\bf B}{\bf C}^{-1}{\bf B}^T)^{-1}$    
$\displaystyle {\bf G}$ $\displaystyle = ({\bf C}-{\bf B}^T{\bf A}^{-1}{\bf B})^{-1}$    
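A numerical check of these formulas (a NumPy sketch, not part of the original notes; $ {\bf M}$ is a random symmetric positive definite matrix partitioned into a $ 3 \times 3$ block $ {\bf A}$ and a $ 2 \times 2$ block $ {\bf C}$):

    import numpy as np

    rng = np.random.default_rng(6)
    X = rng.standard_normal((5, 5))
    M = X @ X.T + np.eye(5)                     # symmetric positive definite
    A, B, C = M[:3, :3], M[:3, 3:], M[3:, 3:]   # M = [[A, B], [B^T, C]]

    E = np.linalg.inv(A - B @ np.linalg.inv(C) @ B.T)
    G = np.linalg.inv(C - B.T @ np.linalg.inv(A) @ B)
    F = -np.linalg.inv(A) @ B @ G               # from A F + B G = 0

    Minv = np.block([[E, F], [F.T, G]])
    print(np.allclose(np.linalg.inv(M), Minv))  # True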



Richard Lockhart
2002-09-24