Linear Algebra Review

Notation:

Defn: The transpose, $ A^T$, of an $ m \times n$ matrix $ A$ is the $ n \times m$ matrix whose entries are given by

$\displaystyle (A^T)_{ij} = A_{ji}
$

so that $ A^T$ is $ n \times m$. We have

$\displaystyle (A+B)^T = A^T+B^T
$

and

$\displaystyle (AB)^T = B^T A^T \, .
$
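These identities are easy to check numerically. A minimal NumPy sketch (not part of the original notes; the small matrices are arbitrary illustrations):

    import numpy as np

    A = np.array([[1., 2., 3.], [4., 5., 6.]])    # 2 x 3
    B = np.array([[1., 0.], [2., 1.], [0., 3.]])  # 3 x 2
    C = np.array([[0., 1., 1.], [1., 0., 2.]])    # 2 x 3

    print(np.allclose((A + C).T, A.T + C.T))      # (A+C)^T = A^T + C^T
    print(np.allclose((A @ B).T, B.T @ A.T))      # (AB)^T = B^T A^T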

Defn: rank of matrix $ A$, rank$ (A)$: # of linearly independent columns of $ A$. We have

$\displaystyle \text{rank}(A) = \dim(\text{column space of } A) = \dim(\text{row space of } A) = \text{rank}(A^T)
$

If $ A$ is $ m \times n$ then rank$ (A) \le \min(m,n)$.
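A quick numerical check of the rank facts (a NumPy sketch, not from the notes; the matrix is an arbitrary illustration):

    import numpy as np

    A = np.array([[1., 2., 3.],
                  [2., 4., 6.],   # twice the first row
                  [0., 1., 1.]])
    print(np.linalg.matrix_rank(A))    # 2
    print(np.linalg.matrix_rank(A.T))  # 2, equals rank(A)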

Matrix inverses

For now: all matrices square $ n \times n$.

If there is a matrix $ B$ such that $ BA=I_{n \times n}$ then we call $ B$ the inverse of $ A$. If $ B$ exists it is unique and $ AB=I$ and we write $ B=A^{-1}$. The matrix $ A$ has an inverse if and only if rank$ (A) = n$.

Inverses have the following properties:

$\displaystyle (AB)^{-1} = B^{-1}A^{-1}
$

(if one side exists then so does the other) and

$\displaystyle (A^T)^{-1} = (A^{-1})^T
$
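Both identities can be verified numerically. A NumPy sketch (not part of the original notes; random matrices are invertible with probability one):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 3))
    B = rng.standard_normal((3, 3))

    # (AB)^{-1} = B^{-1} A^{-1}
    print(np.allclose(np.linalg.inv(A @ B), np.linalg.inv(B) @ np.linalg.inv(A)))
    # (A^T)^{-1} = (A^{-1})^T
    print(np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T))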

Determinants

Again $ A$ is $ n \times n$. The determinant is a function on the set of $ n \times n$ matrices such that:

  1. det$ (I) = 1$.

  2. If $ A^\prime$ is the matrix $ A$ with two columns interchanged then

       det$\displaystyle (A^\prime) = -$   det$\displaystyle (A)\, .
$

    (So: two equal columns implies det$ (A)=0$.)

  3. det$ (A)$ is a linear function of each column of $ A$. If $ A = ( a_1, \ldots, a_n)$ with $ a_i$ denoting the $ i$th column of the matrix then

    $\displaystyle \det(a_1,\ldots,a_i+b_i,\ldots,a_n) = \det(a_1,\ldots,a_i,\ldots,a_n) + \det(a_1,\ldots,b_i,\ldots,a_n)
$

Here are some properties of the determinant:

  1. det$ (A^T) =$   det$ (A)$.

  2. det$ (AB) =$   det$ (A)$det$ (B)$.

  3. det$ (A^{-1}) = 1/$det$ (A)$.

  4. $ A$ is invertible if and only if det$ (A) \neq 0$ if and only if rank$ (A) = n$.

  5. Determinants can be computed (slowly) by expansion by minors.
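Properties 1-3 can be checked numerically; a NumPy sketch (not part of the original notes, using random matrices for illustration):

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((4, 4))
    B = rng.standard_normal((4, 4))

    print(np.allclose(np.linalg.det(A.T), np.linalg.det(A)))                       # det(A^T) = det(A)
    print(np.allclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))  # det(AB) = det(A) det(B)
    print(np.allclose(np.linalg.det(np.linalg.inv(A)), 1 / np.linalg.det(A)))      # det(A^{-1}) = 1/det(A)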

Special Kinds of Matrices

  1. $ A$ is symmetric if $ A^T=A$.

  2. $ A$ is orthogonal if $ A^T=A^{-1}$ (or $ AA^T = A^TA=I$).

  3. $ A$ is idempotent if $ AA\equiv A^2 = A$.

  4. $ A$ is diagonal if $ i \neq j$ implies $ A_{ij}=0$.

Inner Products, Orthogonal and Orthonormal Vectors

Defn: The inner product or dot product of $ x$ and $ y$ is

$\displaystyle <x,y> = x^Ty = \sum x_i y_i
$

Defn: $ x$ and $ y$ are orthogonal if $ x^Ty=0$.

Defn: The norm (or length) of $ x$ is $ \vert\vert x\vert\vert = (x^Tx)^{1/2} = (\sum x_i^2)^{1/2}$

$ A$ is orthogonal if each column of $ A$ has length 1 and is orthogonal to each other column of $ A$.
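A small numerical illustration of inner products, norms and an orthogonal matrix (a NumPy sketch, not part of the original notes; the vectors and the rotation matrix are arbitrary choices):

    import numpy as np

    x = np.array([1., 2., 2.])
    y = np.array([2., 1., -2.])
    print(x @ y)              # <x, y> = 0, so x and y are orthogonal
    print(np.sqrt(x @ x))     # ||x|| = 3

    theta = 0.3               # any 2 x 2 rotation matrix is orthogonal
    Q = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    print(np.allclose(Q.T @ Q, np.eye(2)))   # columns have length 1 and are mutually orthogonal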

Quadratic Forms

Suppose $ A$ is an $ n \times n$ matrix. The function

$\displaystyle g(x) = x^T A x
$

is called a quadratic form. Now

$\displaystyle g(x) = \sum_{ij} A_{ij} x_i x_j = \sum_i A_{ii} x_i^2 + \sum_{i < j} (A_{ij}+A_{ji}) x_ix_j
$

so that $ g(x)$ depends only on the total $ A_{ij}+A_{ji}$. In fact

$\displaystyle x^T Ax = x^T A^T x = x^T\left(\frac{A+A^T}{2} \right) x
$

Thus we will assume that $ A$ is symmetric.
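A numerical check that only the symmetric part of $ A$ matters (a NumPy sketch, not part of the original notes; the matrix and vector are random illustrations):

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((3, 3))   # not symmetric in general
    x = rng.standard_normal(3)

    S = (A + A.T) / 2                 # symmetric part of A
    print(np.allclose(x @ A @ x, x @ S @ x))   # same quadratic form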

Eigenvalues and eigenvectors

If $ A$ is $ n \times n$, $ v\neq 0\in \mathbb {R}^n$, and $ \lambda\in \mathbb {R}$ are such that

$\displaystyle Av=\lambda v
$

then $ \lambda$ is an eigenvalue (or characteristic or latent value) of $ A$ and $ v$ is a corresponding eigenvector. Since $ Av-\lambda v = (A-\lambda I)v=0$ with $ v \neq 0$, the matrix $ A-\lambda I$ is singular.

Therefore det$ (A-\lambda I) =0$.

Conversely: if $ A-\lambda I$ singular then there is $ v \neq 0$ such that $ (A-\lambda I)v=0$.

Fact: det$ (A-\lambda I)$ is a polynomial in $ \lambda$ of degree $ n$.

Each root is an eigenvalue.

For general $ A$ the roots could be multiple roots or complex valued.
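A numerical illustration (a NumPy sketch, not part of the original notes; the 2 x 2 matrix is an arbitrary example with real eigenvalues):

    import numpy as np

    A = np.array([[0., 1.],
                  [-2., 3.]])
    lam, V = np.linalg.eig(A)      # eigenvalues and eigenvectors (columns of V)
    for l, v in zip(lam, V.T):
        print(np.allclose(A @ v, l * v))                        # A v = lambda v
        print(np.isclose(np.linalg.det(A - l * np.eye(2)), 0))  # det(A - lambda I) = 0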

Diagonalization

Matrix $ A$ is diagonalized by a non-singular matrix $ P$ if $ P^{-1} A P \equiv D$ is diagonal.

If so then $ AP=PD$ so each column of $ P$ is an eigenvector of $ A$, with the $ i$th column having eigenvalue $ D_{ii}$.

Thus to be diagonalizable $ A$ must have $ n$ linearly independent eigenvectors.
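A numerical check (a NumPy sketch, not part of the original notes; the matrix is an arbitrary example with distinct eigenvalues):

    import numpy as np

    A = np.array([[4., 1.],
                  [2., 3.]])
    lam, P = np.linalg.eig(A)      # columns of P are eigenvectors
    D = np.diag(lam)
    print(np.allclose(np.linalg.inv(P) @ A @ P, D))   # P^{-1} A P = D
    print(np.allclose(A @ P, P @ D))                  # equivalently A P = P D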

Symmetric Matrices

If $ A$ is symmetric then

  1. Every eigenvalue of $ A$ is real (not complex).

  2. $ A$ is diagonalizable, and the columns of $ P$ may be taken to have unit length and to be mutually orthogonal: $ A$ is diagonalized by an orthogonal matrix $ P$; in symbols, $ P^TAP = D$.

  3. The diagonal entries of $ D$ are the eigenvalues of $ A$.

  4. If $ \lambda_1 \neq \lambda_2$ are two eigenvalues of $ A$ and $ v_1$ and $ v_2$ are corresponding eigenvectors then

    $\displaystyle v_1^T A v_2 = v_1^T \lambda_2 v_2 = \lambda_2 v_1^T v_2
$

    and

    $\displaystyle (v_1^T A v_2 )$ $\displaystyle = (v_1^T A v_2 )^T = v_2^T A^T v_1$    
      $\displaystyle = v_2^T A v_1 = v_2^T \lambda_1 v_1 = \lambda_1 v_2^T v_1$    

    Since $ (\lambda_1-\lambda_2)v_1^T v_2 =0$ and $ \lambda_1 \neq \lambda_2$ we see $ v_1^T v_2 = 0$. Eigenvectors corresponding to distinct eigenvalues are orthogonal.
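These facts can be seen numerically with np.linalg.eigh, which is designed for symmetric matrices (a sketch, not part of the original notes; the matrix is a random symmetric example):

    import numpy as np

    rng = np.random.default_rng(3)
    M = rng.standard_normal((4, 4))
    A = (M + M.T) / 2                    # a symmetric matrix

    lam, P = np.linalg.eigh(A)           # real eigenvalues, orthonormal eigenvectors
    print(np.allclose(P.T @ P, np.eye(4)))           # P is orthogonal
    print(np.allclose(P.T @ A @ P, np.diag(lam)))    # P^T A P = D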

Positive Definite Matrices

Defn: A symmetric matrix $ {\bf A}$ is non-negative definite if $ x^T {\bf A}x \ge 0$ for all $ x$. It is positive definite if in addition $ x^T {\bf A}x = 0$ implies $ x=0$.

$ {\bf A}$ is non-negative definite iff all its eigenvalues are non-negative.

$ {\bf A}$ is positive definite iff all eigenvalues positive.

A non-negative definite matrix has a symmetric non-negative definite square root. If

$\displaystyle {\bf P} {\bf D} {\bf P}^T = {\bf A}
$

for $ {\bf P}$ orthogonal and $ \bf D$ diagonal then

$\displaystyle {\bf A}^{1/2} = {\bf P} {\bf D}^{1/2} {\bf P}^T
$

is symmetric, non-negative definite and

$\displaystyle {\bf A}^{1/2}{\bf A}^{1/2}= {\bf A}
$

Here $ {\bf D}^{1/2}$ is diagonal with

$\displaystyle ({\bf D}^{1/2})_{ii} = ({\bf D}_{ii})^{1/2}.
$

Many other square roots are possible. If $ {\bf A}{\bf A}^T = {\bf M}$ and $ \bf P$ is orthogonal and $ {\bf A}^*={\bf A}{\bf P}$ then $ {\bf A}^*({\bf A}^*)^T={\bf M}$.
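The symmetric square root can be computed from the spectral decomposition. A NumPy sketch (not part of the original notes; $ {\bf A}={\bf M}{\bf M}^T$ is a random non-negative definite example):

    import numpy as np

    rng = np.random.default_rng(4)
    M = rng.standard_normal((3, 3))
    A = M @ M.T                              # non-negative definite

    lam, P = np.linalg.eigh(A)               # A = P D P^T
    lam = np.clip(lam, 0, None)              # guard against tiny negative round-off
    Ahalf = P @ np.diag(np.sqrt(lam)) @ P.T  # A^{1/2} = P D^{1/2} P^T
    print(np.allclose(Ahalf, Ahalf.T))       # symmetric
    print(np.allclose(Ahalf @ Ahalf, A))     # squares back to A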

Orthogonal Projections

Suppose $ S$ is a vector subspace of $ \mathbb {R}^n$ and $ a_1,\ldots,a_m$ is a basis for $ S$. Given any $ x\in \mathbb {R}^n$ there is a unique $ y\in S$ which is closest to $ x$; $ y$ minimizes

$\displaystyle (x-y)^T(x-y)
$

over $ y\in S$. Any $ y$ in $ S$ is of the form

$\displaystyle y = c_1 a_1 + \cdots + c_m a_m = Ac
$

where $ A$ is the $ n \times m$ matrix with columns $ a_1,\ldots,a_m$ and $ c$ is the column vector with $ i$th entry $ c_i$. Define

$\displaystyle Q= A(A^TA)^{-1}A^T
$

($ A$ has rank $ m$ so $ A^TA$ is invertible.) Then

$\displaystyle (x$ $\displaystyle -Ac)^T (x-Ac)$    
  $\displaystyle = (x-Qx+Qx-Ac)^T (x-Qx+Qx-Ac)$    
  $\displaystyle = (x-Qx)^T(x-Qx) + (Qx-Ac)^T(x-Qx)$    
  $\displaystyle \quad +(x-Qx)^T(Qx-Ac)$    
  $\displaystyle \quad + (Qx-Ac)^T(Qx-Ac)$    

Note that $ x-Qx = (I-Q)x$ and that

$\displaystyle QAc = A(A^TA)^{-1}A^TAc = Ac
$

so that

$\displaystyle Qx-Ac = Q(x-Ac)
$

Then

$\displaystyle (Qx-Ac)^T(x-Qx) = (x-Ac)^T Q^T (I-Q) x
$

Since $ Q^T=Q$ we see that

$\displaystyle Q^T(I-Q)$ $\displaystyle = Q(I-Q)$    
  $\displaystyle = Q-Q^2$    
  $\displaystyle = Q-A(A^TA)^{-1}A^TA(A^TA)^{-1}A^T$    
  $\displaystyle = Q-Q = 0$    

This shows that

$\displaystyle (x-Ac)^T(x-Ac) =$ $\displaystyle (x-Qx)^T(x-Qx)$    
$\displaystyle \quad$ $\displaystyle + (Qx-Ac)^T(Qx-Ac)$    

Choose $ Ac$ to minimize: minimize second term.

Achieved by making $ Qx=Ac$.

Since $ Qx=A(A^TA)^{-1}A^T x$ can take

$\displaystyle c=(A^TA)^{-1}A^T x.$

Summary: closest point $ y$ in $ S$ is

$\displaystyle y=Qx=A(A^TA)^{-1}A^T x
$

We call $ y$ the orthogonal projection of $ x$ onto $ S$.

Notice that the matrix $ Q$ is idempotent:

$\displaystyle Q^2=Q
$

We call $ Qx$ the orthogonal projection of $ x$ on $ S$ because $ Qx$ is perpendicular to the residual $ x-Qx = (I-Q)x$.
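A numerical check of the projection (a NumPy sketch, not part of the original notes; $ A$ is a random $ 5 \times 2$ matrix whose columns span an arbitrary subspace $ S$):

    import numpy as np

    rng = np.random.default_rng(5)
    A = rng.standard_normal((5, 2))       # columns form a basis for S (rank 2 with probability one)
    x = rng.standard_normal(5)

    Q = A @ np.linalg.inv(A.T @ A) @ A.T  # Q = A (A^T A)^{-1} A^T
    print(np.allclose(Q @ Q, Q))          # idempotent
    print(np.allclose(Q, Q.T))            # symmetric
    print(np.isclose((Q @ x) @ (x - Q @ x), 0))   # Qx is perpendicular to the residual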

Partitioned Matrices

Suppose $ A_{11}$ is a $ p \times r$ matrix, $ A_{12}$ is $ p \times s$, $ A_{21}$ is $ q \times r$ and $ A_{22}$ is $ q \times s$. Make a $ (p+q) \times (r+s)$ matrix by arranging the $ A_{ij}$ in a 2 by 2 pattern of blocks:

$\displaystyle A = \left[ \begin{array}{cc}
A_{11} & A_{12}
\\
A_{21} & A_{22}
\end{array}\right]
$

For instance if

$\displaystyle A_{11} = \left[ \begin{array}{cc} 1 & 0 \\  0 & 1 \end{array}\right]
$

$\displaystyle A_{12} = \left[ \begin{array}{c} 2 \\  3 \end{array}\right]
$

$\displaystyle A_{21} = \left[ \begin{array}{cc} 4 & 5\end{array}\right]
$

and

$\displaystyle A_{22} = \left[ 6 \right]
$

then

$\displaystyle A = \left[ \begin{array}{cc|c}
1 & 0 & 2
\\
0 & 1 & 3
\\
\hline
4 & 5 & 6
\end{array}\right]
$

Lines indicate partitioning.
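This example can be assembled with np.block (a NumPy sketch, not part of the original notes):

    import numpy as np

    A11 = np.eye(2)
    A12 = np.array([[2.], [3.]])
    A21 = np.array([[4., 5.]])
    A22 = np.array([[6.]])

    A = np.block([[A11, A12],
                  [A21, A22]])
    print(A)
    # [[1. 0. 2.]
    #  [0. 1. 3.]
    #  [4. 5. 6.]]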

We can work with partitioned matrices just like ordinary matrices, always making sure that in products we never change the order of multiplication.

$\displaystyle \left[ \begin{array}{cc} A_{11} & A_{12} \\ A_{21} & A_{22} \end{array} \right] + \left[ \begin{array}{cc} B_{11} & B_{12} \\ B_{21} & B_{22} \end{array} \right] = \left[ \begin{array}{cc} A_{11}+B_{11} & A_{12}+B_{12} \\ A_{21}+B_{21} & A_{22}+B_{22} \end{array} \right]
$

and

$\displaystyle \left[ \begin{array}{cc} A_{11} & A_{12} \\ A_{21} & A_{22} \end{array} \right] \left[ \begin{array}{cc} B_{11} & B_{12} \\ B_{21} & B_{22} \end{array} \right] = \left[ \begin{array}{cc} A_{11}B_{11}+A_{12}B_{21} & A_{11}B_{12}+A_{12}B_{22} \\ A_{21}B_{11}+A_{22}B_{21} & A_{21}B_{12}+ A_{22}B_{22} \end{array}\right]
$

Note partitioning of $ A$ and $ B$ must match.

Addition: dimensions of $ A_{ij}$ and $ B_{ij}$ must be the same.

For the multiplication formula, $ A_{12}$ must have as many columns as $ B_{21}$ has rows, etc.

In general: need $ A_{ij}B_{jk}$ to make sense for each $ i,j,k$.

Works with more than a 2 by 2 partitioning.

Defn: A block diagonal matrix is a partitioned matrix $ A$ for which $ A_{ij}=0$ if $ i \neq j$. If

$\displaystyle A = \left[ \begin{array}{cc}
A_{11} & 0
\\
0 & A_{22}
\end{array}\right]
$

then $ A$ is invertible iff each $ A_{ii}$ is invertible and then

$\displaystyle A^{-1} = \left[ \begin{array}{cc}
A_{11}^{-1} & 0
\\
0 & A_{22}^{-1}
\end{array}\right]
$

Moreover det$ (A) =$   det$ (A_{11})$   det$ (A_{22})$. Similar formulas work for larger matrices.
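A numerical check of the block diagonal facts (a NumPy sketch, not part of the original notes; the blocks are arbitrary invertible examples):

    import numpy as np

    A11 = np.array([[2., 1.],
                    [1., 2.]])
    A22 = np.array([[3.]])
    A = np.block([[A11, np.zeros((2, 1))],
                  [np.zeros((1, 2)), A22]])

    Ainv = np.block([[np.linalg.inv(A11), np.zeros((2, 1))],
                     [np.zeros((1, 2)), np.linalg.inv(A22)]])
    print(np.allclose(np.linalg.inv(A), Ainv))                                     # blockwise inverse
    print(np.isclose(np.linalg.det(A), np.linalg.det(A11) * np.linalg.det(A22)))   # det(A) = det(A11) det(A22)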

Partitioned inverses. Suppose $ {\bf A}$ and $ {\bf C}$ are symmetric positive definite. Look for an inverse of

$\displaystyle \left[\begin{array}{cc} {\bf A}& {\bf B}\\  {\bf B}^T & {\bf C}\end{array}\right]
$

of the form

$\displaystyle \left[\begin{array}{cc} {\bf E}& {\bf F}\\  {\bf F}^T & {\bf G}\end{array}\right]
$

Multiply to get equations

$\displaystyle {\bf A}{\bf E}+{\bf B}{\bf F}^T$ $\displaystyle = {\bf I}$    
$\displaystyle {\bf A}{\bf F}+{\bf B}{\bf G}$ $\displaystyle = {\bf0}$    
$\displaystyle {\bf B}^T{\bf E}+{\bf C}{\bf F}^T$ $\displaystyle = {\bf0}$    
$\displaystyle {\bf B}^T{\bf F}+{\bf C}{\bf G}$ $\displaystyle = {\bf I}$    

Solve to get

$\displaystyle {\bf F}^T$ $\displaystyle =-{\bf C}^{-1} {\bf B}^T{\bf E}$    
$\displaystyle {\bf A}{\bf E}-{\bf B}{\bf C}^{-1}{\bf B}^T {\bf E}$ $\displaystyle = {\bf I}$    
$\displaystyle {\bf E}$ $\displaystyle = ({\bf A}-{\bf B}{\bf C}^{-1}{\bf B}^T)^{-1}$    
$\displaystyle {\bf G}$ $\displaystyle = ({\bf C}-{\bf B}^T{\bf A}^{-1}{\bf B})^{-1}$    
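A numerical check of these formulas (a NumPy sketch, not part of the original notes; $ {\bf M}$ is a random symmetric positive definite matrix partitioned into a $ 3 \times 3$ block $ {\bf A}$ and a $ 2 \times 2$ block $ {\bf C}$):

    import numpy as np

    rng = np.random.default_rng(6)
    X = rng.standard_normal((5, 5))
    M = X @ X.T + np.eye(5)                     # symmetric positive definite
    A, B, C = M[:3, :3], M[:3, 3:], M[3:, 3:]   # M = [[A, B], [B^T, C]]

    E = np.linalg.inv(A - B @ np.linalg.inv(C) @ B.T)
    G = np.linalg.inv(C - B.T @ np.linalg.inv(A) @ B)
    F = -np.linalg.inv(A) @ B @ G               # from A F + B G = 0

    Minv = np.block([[E, F], [F.T, G]])
    print(np.allclose(np.linalg.inv(M), Minv))  # True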



Richard Lockhart
2002-09-24