Chapter 07

Diagonalization

01 — What is Diagonalization?

Definition

A matrix $A\in\mathbb{R}^{n\times n}$ is diagonalisable if there exists an invertible matrix $P$ such that

$A=PDP^{-1}$

where $D$ is a diagonal matrix. Equivalently, $P^{-1}AP=D$ .

The columns of $P$ are $n$ linearly independent eigenvectors of $A$ , and the diagonal entries of $D$ are the corresponding eigenvalues.

Note

Diagonalisation rewrites $A$ in its "natural" coordinate system — the eigenvector basis — where the transformation is simply scaling along each axis.

Why is this useful?

Matrix powers: $A^k=PD^kP^{-1}$ , and $D^k=\text{diag}(\lambda_1^k,\ldots,\lambda_n^k)$ is trivial to compute.
Matrix exponential: $e^{tA}=Pe^{tD}P^{-1}$ , needed for ODEs and Markov chains.
Understanding long-run behaviour: as $k\to\infty$ , the dominant eigenvalue governs $A^k\mathbf{x}_0$ .
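
Both the power and the exponential identity come from the cancellation $P^{-1}P=I$ between adjacent factors. A short derivation, shown for $k=2$ and then for the exponential power series:

$A^2=(PDP^{-1})(PDP^{-1})=PD(P^{-1}P)DP^{-1}=PD^2P^{-1}$

By induction, $A^k=PD^kP^{-1}$ for every integer $k\geq0$. Substituting this into the series for the exponential gives

$e^{tA}=\sum_{k=0}^{\infty}\frac{t^k}{k!}A^k=P\left(\sum_{k=0}^{\infty}\frac{t^k}{k!}D^k\right)P^{-1}=Pe^{tD}P^{-1},\quad e^{tD}=\text{diag}(e^{t\lambda_1},\ldots,e^{t\lambda_n})$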

Common mistake

Wrong: every square matrix $A$ can be written as $PDP^{-1}$ .
Why it happens: eigenvalues always exist (over $\mathbb{C}$ ), so it feels like diagonalisation should too.
Correct: diagonalisation requires $n$ linearly independent eigenvectors. Defective matrices (where geometric multiplicity $<$ algebraic multiplicity for some $\lambda$ ) cannot be diagonalised — they require Jordan form instead.
Check: $A=\begin{pmatrix}2&1\\0&2\end{pmatrix}$ has only one independent eigenvector for $\lambda=2$ but the eigenvalue has algebraic multiplicity 2 — not diagonalisable.
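
The check above can also be done mechanically: the geometric multiplicity of $\lambda$ is $n-\text{rank}(A-\lambda I)$. The following is a minimal sketch in plain Python with exact rational arithmetic; the helper names `rank` and `geometric_multiplicity` are illustrative choices, not part of any library.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix via Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(n_cols):
        # Find a row at or below pivot_row with a non-zero entry in this column.
        pivot = next((r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        # Eliminate this column from every row below the pivot.
        for r in range(pivot_row + 1, n_rows):
            factor = rows[r][col] / rows[pivot_row][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return pivot_row


def geometric_multiplicity(matrix, eigenvalue):
    """dim ker(A - lambda*I) = n - rank(A - lambda*I)."""
    n = len(matrix)
    shifted = [[matrix[i][j] - (eigenvalue if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)


A = [[2, 1], [0, 2]]
print(geometric_multiplicity(A, 2))  # 1, while the algebraic multiplicity is 2
```

Because the geometric multiplicity (1) is smaller than the algebraic multiplicity (2), the matrix is defective.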

02 — Diagonalizability Criterion

Definition

$A\in\mathbb{R}^{n\times n}$ is diagonalisable if and only if $A$ has $n$ linearly independent eigenvectors.

Sufficient conditions (each independently guarantees diagonalisability):

$A$ has $n$ distinct real eigenvalues.
$A$ is symmetric ( $A=A^\top$ ): the spectral theorem guarantees an orthonormal eigenbasis.
All eigenvalues of $A$ are real and, for each one, the geometric multiplicity equals the algebraic multiplicity (this condition is also necessary).

Example

If $A$ is $3\times3$ with eigenvalues $-1,\,2,\,5$ (all distinct), then $A$ automatically has 3 linearly independent eigenvectors and is diagonalisable, regardless of the rest of the matrix.

03 — The Diagonalization Procedure

Diagonalise $A=\begin{pmatrix}4&1\\2&3\end{pmatrix}$ — eigenvalues $\lambda_1=5$, $\lambda_2=2$ (from Chapter 06)

For $\lambda_1=5$ : $\mathbf{v}_1=\begin{pmatrix}1\\1\end{pmatrix}$ . For $\lambda_2=2$ : $\mathbf{v}_2=\begin{pmatrix}-1\\2\end{pmatrix}$ .

$P=\begin{pmatrix}1&-1\\1&2\end{pmatrix}$ — column 1 is $\mathbf{v}_1$ for $\lambda_1=5$ ; column 2 is $\mathbf{v}_2$ for $\lambda_2=2$ .

$D=\begin{pmatrix}5&0\\0&2\end{pmatrix}$ — $d_{11}=\lambda_1=5$ corresponds to column 1 of $P$ ; $d_{22}=\lambda_2=2$ corresponds to column 2.

$\det(P)=1\cdot2-(-1)\cdot1=3$, so $P^{-1}=\frac{1}{3}\begin{pmatrix}2&1\\-1&1\end{pmatrix}$, by the $2\times2$ inverse formula: swap the diagonal entries, negate the off-diagonal entries, divide by $\det(P)$.

$PD=\begin{pmatrix}1&-1\\1&2\end{pmatrix}\begin{pmatrix}5&0\\0&2\end{pmatrix}=\begin{pmatrix}5&-2\\5&4\end{pmatrix}$

$PDP^{-1}=\begin{pmatrix}5&-2\\5&4\end{pmatrix}\cdot\frac{1}{3}\begin{pmatrix}2&1\\-1&1\end{pmatrix}=\frac{1}{3}\begin{pmatrix}10+2&5-2\\10-4&5+4\end{pmatrix}=\frac{1}{3}\begin{pmatrix}12&3\\6&9\end{pmatrix}=\begin{pmatrix}4&1\\2&3\end{pmatrix}=A\,\checkmark$

Common mistake

Wrong: putting $\lambda_1$ first in $D$ but $\mathbf{v}_2$ first in $P$ .
Why it happens: writing eigenvalues in one order and eigenvectors in another without tracking correspondences.
Correct: the $j$ -th column of $P$ must be an eigenvector for the $j$ -th diagonal entry of $D$ . Swapping both simultaneously (swapping column $j$ in $P$ and entry $j$ in $D$ ) is fine.
Check: verify $AP=PD$ column by column: $A\mathbf{v}_j=\lambda_j\mathbf{v}_j$ for each $j$ .
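
The column-by-column check can be sketched in a few lines of plain Python. The function names `mat_vec` and `columns_match` are illustrative; the matrices are the $A$ and $P$ from the worked example in this section.

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def columns_match(A, P, eigenvalues):
    """For each j, report whether A v_j == lambda_j v_j, where v_j is column j of P."""
    n = len(A)
    results = []
    for j, lam in enumerate(eigenvalues):
        v = [P[i][j] for i in range(n)]
        results.append(mat_vec(A, v) == [lam * x for x in v])
    return results


A = [[4, 1], [2, 3]]
P = [[1, -1], [1, 2]]  # columns v_1 = (1, 1), v_2 = (-1, 2)

print(columns_match(A, P, [5, 2]))  # [True, True]: orders agree
print(columns_match(A, P, [2, 5]))  # [False, False]: D swapped, P not
```

Exact equality is safe here because every entry is an integer; with floating-point eigenvectors a tolerance would be needed instead.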

04 — Computing Matrix Powers via Diagonalization

$A^k=PD^kP^{-1},\quad D^k=\begin{pmatrix}\lambda_1^k&0&\cdots&0\\0&\lambda_2^k&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&\lambda_n^k\end{pmatrix}$

Compute $A^{10}$ for $A=\begin{pmatrix}4&1\\2&3\end{pmatrix}$

$D^{10}=\begin{pmatrix}5^{10}&0\\0&2^{10}\end{pmatrix}=\begin{pmatrix}9765625&0\\0&1024\end{pmatrix}$ — $5^{10}=9{,}765{,}625$ ; $2^{10}=1024$ .

$A^{10}=\frac{1}{3}\begin{pmatrix}1&-1\\1&2\end{pmatrix}\begin{pmatrix}9765625&0\\0&1024\end{pmatrix}\begin{pmatrix}2&1\\-1&1\end{pmatrix}$ — use $P$ and $P^{-1}$ from the diagonalisation step.

$PD^{10}=\begin{pmatrix}9765625&-1024\\9765625&2048\end{pmatrix}$ : right-multiplying by a diagonal matrix scales column $j$ of $P$ by the $j$-th diagonal entry of $D^{10}$.

$A^{10}=\frac{1}{3}\begin{pmatrix}9765625\cdot2+1024&9765625-1024\\9765625\cdot2-2048&9765625+2048\end{pmatrix}=\frac{1}{3}\begin{pmatrix}19532274&9764601\\19529202&9767673\end{pmatrix}=\begin{pmatrix}6510758&3254867\\6509734&3255891\end{pmatrix}$

Every entry of the bracketed matrix is divisible by 3, as it must be: $A$ has integer entries, so $A^{10}$ does too.
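
A computation of this size is easy to get wrong by hand, so it is worth cross-checking. The sketch below evaluates $PD^{10}P^{-1}$ with exact fractions and compares it with plain repeated multiplication; the helper names `mat_mul` and `power_by_diagonalisation` are illustrative.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def power_by_diagonalisation(P, eigenvalues, P_inv, k):
    """Return P D^k P^{-1} with D = diag(eigenvalues)."""
    n = len(eigenvalues)
    D_k = [[eigenvalues[i] ** k if i == j else 0 for j in range(n)] for i in range(n)]
    return mat_mul(mat_mul(P, D_k), P_inv)


A = [[4, 1], [2, 3]]
P = [[1, -1], [1, 2]]
P_inv = [[Fraction(2, 3), Fraction(1, 3)], [Fraction(-1, 3), Fraction(1, 3)]]

via_diag = power_by_diagonalisation(P, [5, 2], P_inv, 10)

# Independent check: multiply A by itself (nine products give A^10).
direct = A
for _ in range(9):
    direct = mat_mul(direct, A)

print(via_diag == direct)  # True
print(direct)              # [[6510758, 3254867], [6509734, 3255891]]
```

Each row of the result sums to $5^{10}=9765625$, which is consistent with $(1,1)^\top$ being an eigenvector of $A$ for $\lambda_1=5$.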

Note

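Without diagonalisation, computing $A^{10}$ by repeated multiplication takes 9 matrix products (4 with repeated squaring). With diagonalisation, once $P$ , $D$ and $P^{-1}$ are known, any power costs only $n$ scalar powers plus two matrix products, which is far cheaper for large $k$ .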

05 — Orthogonal Diagonalization of Symmetric Matrices

Definition

If $A=A^\top$ (symmetric), then $A$ is orthogonally diagonalisable: there exists an orthogonal matrix $Q$ ( $Q^\top Q=I$ , i.e. $Q^{-1}=Q^\top$ ) such that

$A=QDQ^\top$

The columns of $Q$ are orthonormal eigenvectors; the diagonal entries of $D$ are real eigenvalues.

Example

A covariance matrix $\Sigma$ is symmetric ( $\Sigma=\Sigma^\top$ ) and positive semi-definite. By the spectral theorem:

$\Sigma=QDQ^\top, \quad D=\text{diag}(\lambda_1,\ldots,\lambda_p),\;\lambda_i\geq0$

The columns of $Q$ are orthonormal principal directions (PCs); $D$ contains their variances. This makes PCA numerically stable, and the decomposition is unique up to the sign of each column of $Q$ when the eigenvalues are distinct.
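
As a small numerical illustration, the sketch below checks both defining properties, $Q^\top Q=I$ and $\Sigma=QDQ^\top$, for a hypothetical $2\times2$ covariance matrix. The matrix and its eigenvectors were chosen by hand for this example and do not come from real data.

```python
import math


def transpose(M):
    return [list(col) for col in zip(*M)]


def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def close(X, Y, tol=1e-12):
    """Entrywise comparison of two matrices up to a small tolerance."""
    return all(abs(x - y) <= tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))


# Hypothetical covariance matrix (symmetric, positive semi-definite).
Sigma = [[4.0, 2.0], [2.0, 1.0]]

s = 1 / math.sqrt(5)
Q = [[2 * s, -1 * s], [1 * s, 2 * s]]  # columns: unit eigenvectors (2,1)/sqrt(5), (-1,2)/sqrt(5)
D = [[5.0, 0.0], [0.0, 0.0]]           # variances along each principal direction

I2 = [[1.0, 0.0], [0.0, 1.0]]
print(close(mat_mul(transpose(Q), Q), I2))                 # True: Q is orthogonal
print(close(mat_mul(mat_mul(Q, D), transpose(Q)), Sigma))  # True: Sigma = Q D Q^T
```

The second eigenvalue is zero, so all the variance lies along the first principal direction; this is the boundary case $\lambda_i=0$ allowed by positive semi-definiteness.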

06 — Quant Application — Matrix Powers in Markov Chains

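In the convention used here, a Markov chain transition matrix $M$ has non-negative entries and columns summing to 1 (a column-stochastic matrix), and the state distribution is a column vector. The state after $k$ steps is $\mathbf{s}_k=M^k\mathbf{s}_0$ .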

If $M$ is diagonalisable with $M=PDP^{-1}$ :

$\mathbf{s}_k=PD^kP^{-1}\mathbf{s}_0$

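As $k\to\infty$ , the terms belonging to eigenvalues with $|\lambda|<1$ decay to zero. A stochastic matrix always has the eigenvalue $\lambda_1=1$ , and no eigenvalue has modulus greater than 1. If $\lambda_1=1$ is simple and is the only eigenvalue with $|\lambda|=1$ (for example, when every entry of some power of $M$ is positive), then $M^k$ converges to a steady-state matrix as $k\to\infty$ , every column of which is the steady-state distribution.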
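
The convergence can be observed directly by iterating $\mathbf{s}_{k+1}=M\mathbf{s}_k$. The sketch below uses a hypothetical two-state column-stochastic matrix (eigenvalues $1$ and $0.4$) chosen purely for illustration.

```python
from fractions import Fraction


def step(M, s):
    """One step of the chain: s_{k+1} = M s_k (column-stochastic convention)."""
    return [sum(m * x for m, x in zip(row, s)) for row in M]


# Hypothetical two-state chain; each COLUMN sums to 1.
M = [[Fraction(9, 10), Fraction(1, 2)],
     [Fraction(1, 10), Fraction(1, 2)]]

s = [Fraction(0), Fraction(1)]  # start entirely in state 2
for _ in range(50):
    s = step(M, s)

steady = [Fraction(5, 6), Fraction(1, 6)]  # solves M s = s with entries summing to 1
print([float(x) for x in s])
print(max(abs(a - b) for a, b in zip(s, steady)) < Fraction(1, 10**15))  # True
```

The distance to the steady state shrinks by a factor of $0.4$ per step, the modulus of the second eigenvalue, which is why 50 steps are far more than enough here.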

In credit risk, rating transition matrices are Markov chains. Computing the 5-year migration probability matrix requires $M^5=PD^5P^{-1}$ — diagonalisation makes this tractable.

Similarly, in multi-factor versions of the Vasicek interest rate model, the mean-reversion dynamics involve a matrix exponential $e^{tA}$ , which reduces to $Pe^{tD}P^{-1}$ when $A$ is diagonalisable.

Exercises

EXERCISE 7.1

Check whether the matrix has $n$ distinct eigenvalues (automatic diagonalisability), or has repeated eigenvalues (check geometric vs algebraic multiplicity). A zero eigenvalue does not prevent diagonalisability by itself.

(a) $A=\begin{pmatrix}1&0\\0&3\end{pmatrix}$ : already diagonal; eigenvalues $1,3$ distinct. Diagonalisable.

(b) $B=\begin{pmatrix}2&1\\0&2\end{pmatrix}$ : $\lambda=2$ (multiplicity 2). $B-2I=\begin{pmatrix}0&1\\0&0\end{pmatrix}$ ; $\dim\ker=1 < 2$ . Not diagonalisable (defective).

(c) $C=\begin{pmatrix}0&-1\\1&0\end{pmatrix}$ : $p(\lambda)=\lambda^2+1$ ; roots $\pm i\in\mathbb{C}$ . Not diagonalisable over $\mathbb{R}$ (diagonalisable over $\mathbb{C}$ ).

Determine whether each matrix is diagonalisable (over $\mathbb{R}$ ) and explain why:
(a) $\begin{pmatrix}1&0\\0&3\end{pmatrix}$ , (b) $\begin{pmatrix}2&1\\0&2\end{pmatrix}$ , (c) $\begin{pmatrix}0&-1\\1&0\end{pmatrix}$ .
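
For $2\times2$ matrices the whole decision can be read off the discriminant of the characteristic polynomial $\lambda^2-\text{tr}(A)\lambda+\det(A)$. The sketch below is one way to encode that; `classify_2x2` is an illustrative name, and the exact comparisons assume integer or rational entries (floating-point input would need a tolerance).

```python
def classify_2x2(M):
    """Decide diagonalisability over the reals for a real 2x2 matrix."""
    (a, b), (c, d) = M
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det  # discriminant of lambda^2 - trace*lambda + det
    if disc > 0:
        return "diagonalisable: two distinct real eigenvalues"
    if disc < 0:
        return "not diagonalisable over R: complex eigenvalues"
    # Repeated eigenvalue lambda = trace/2: the eigenspace is 2-dimensional
    # only when M - lambda*I is the zero matrix, i.e. M = lambda*I.
    if b == 0 and c == 0 and a == d:
        return "diagonalisable: scalar multiple of I"
    return "not diagonalisable: defective"


for name, M in [("a", [[1, 0], [0, 3]]), ("b", [[2, 1], [0, 2]]), ("c", [[0, -1], [1, 0]])]:
    print(name, classify_2x2(M))
```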

EXERCISE 7.2

Find eigenvalues from $\det(A-\lambda I)=0$ . Find eigenvectors for each. Form $P$ from eigenvectors as columns; $D$ with eigenvalues matching. Verify $AP=PD$ .

$A=\begin{pmatrix}3&-2\\1&0\end{pmatrix}$ .

$p(\lambda)=(3-\lambda)(0-\lambda)-(-2)(1)=\lambda^2-3\lambda+2=(\lambda-1)(\lambda-2) \Rightarrow \lambda_1=2,\,\lambda_2=1$ .

For $\lambda=2$ : $(A-2I)=\begin{pmatrix}1&-2\\1&-2\end{pmatrix}$ ; writing $\mathbf{v}=(x,y)^\top$ , the equation $x-2y=0$ gives $x=2y$ ; $\mathbf{v}_1=\begin{pmatrix}2\\1\end{pmatrix}$ .

For $\lambda=1$ : $(A-I)=\begin{pmatrix}2&-2\\1&-1\end{pmatrix}$ ; the equation $x-y=0$ gives $x=y$ ; $\mathbf{v}_2=\begin{pmatrix}1\\1\end{pmatrix}$ .

$P=\begin{pmatrix}2&1\\1&1\end{pmatrix}$ , $D=\begin{pmatrix}2&0\\0&1\end{pmatrix}$ .

$\det(P)=2-1=1$ ; $P^{-1}=\begin{pmatrix}1&-1\\-1&2\end{pmatrix}$ .

Check: $PDP^{-1}=\begin{pmatrix}2&1\\1&1\end{pmatrix}\begin{pmatrix}2&0\\0&1\end{pmatrix}\begin{pmatrix}1&-1\\-1&2\end{pmatrix}=\begin{pmatrix}4&1\\2&1\end{pmatrix}\begin{pmatrix}1&-1\\-1&2\end{pmatrix}=\begin{pmatrix}3&-2\\1&0\end{pmatrix}=A\,\checkmark$ .

Diagonalise $A=\begin{pmatrix}3&-2\\1&0\end{pmatrix}$ : find $P$ and $D$ such that $A=PDP^{-1}$ .

EXERCISE 7.3

Once $A=PDP^{-1}$ , use $A^k=PD^kP^{-1}$ . Compute $D^6$ by raising each diagonal entry to the 6th power, then carry out the two matrix multiplications.

Using $A=PDP^{-1}$ from Exercise 7.2 with $\lambda_1=2$ , $\lambda_2=1$ .

$D^6=\begin{pmatrix}2^6&0\\0&1^6\end{pmatrix}=\begin{pmatrix}64&0\\0&1\end{pmatrix}$ .

$A^6=PD^6P^{-1}=\begin{pmatrix}2&1\\1&1\end{pmatrix}\begin{pmatrix}64&0\\0&1\end{pmatrix}\begin{pmatrix}1&-1\\-1&2\end{pmatrix}$ .

$PD^6=\begin{pmatrix}128&1\\64&1\end{pmatrix}$ .

$A^6=\begin{pmatrix}128&1\\64&1\end{pmatrix}\begin{pmatrix}1&-1\\-1&2\end{pmatrix}=\begin{pmatrix}127&-126\\63&-62\end{pmatrix}$ .

Using the diagonalisation from Exercise 7.2, compute $A^6$ .
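
Carrying out the same product with a general exponent gives a closed form, which is a useful cross-check on the arithmetic (a sketch using $P$ , $D$ and $P^{-1}$ from Exercise 7.2):

$A^k=\begin{pmatrix}2&1\\1&1\end{pmatrix}\begin{pmatrix}2^k&0\\0&1\end{pmatrix}\begin{pmatrix}1&-1\\-1&2\end{pmatrix}=\begin{pmatrix}2^{k+1}-1&2-2^{k+1}\\2^k-1&2-2^k\end{pmatrix}$

Setting $k=1$ recovers $A=\begin{pmatrix}3&-2\\1&0\end{pmatrix}$ , and $k=6$ gives $\begin{pmatrix}127&-126\\63&-62\end{pmatrix}$ .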

EXERCISE 7.4

For a symmetric matrix, find eigenvalues, then for each eigenvalue find an eigenvector and normalise it to unit length. The resulting unit vectors form the columns of $Q$ . Verify $Q^\top Q=I$ .

$S=\begin{pmatrix}2&1\\1&2\end{pmatrix}$ . $p(\lambda)=(2-\lambda)^2-1=\lambda^2-4\lambda+3=(\lambda-3)(\lambda-1) \Rightarrow \lambda_1=3,\,\lambda_2=1$ .

For $\lambda=3$ : $S-3I=\begin{pmatrix}-1&1\\1&-1\end{pmatrix}$ ; $\mathbf{v}=\begin{pmatrix}1\\1\end{pmatrix}$ ; normalised: $\mathbf{q}_1=\frac{1}{\sqrt{2}}\begin{pmatrix}1\\1\end{pmatrix}$ .

For $\lambda=1$ : $S-I=\begin{pmatrix}1&1\\1&1\end{pmatrix}$ ; $\mathbf{v}=\begin{pmatrix}-1\\1\end{pmatrix}$ ; normalised: $\mathbf{q}_2=\frac{1}{\sqrt{2}}\begin{pmatrix}-1\\1\end{pmatrix}$ .

$Q=\frac{1}{\sqrt{2}}\begin{pmatrix}1&-1\\1&1\end{pmatrix}$ , $D=\begin{pmatrix}3&0\\0&1\end{pmatrix}$ , so $S=QDQ^\top$ .

Verify: $Q^\top Q=\frac{1}{2}\begin{pmatrix}1&1\\-1&1\end{pmatrix}\begin{pmatrix}1&-1\\1&1\end{pmatrix}=\frac{1}{2}\begin{pmatrix}2&0\\0&2\end{pmatrix}=I\,\checkmark$ .

Orthogonally diagonalise the symmetric matrix $S=\begin{pmatrix}2&1\\1&2\end{pmatrix}$ : find orthogonal $Q$ and diagonal $D$ such that $S=QDQ^\top$ .

EXERCISE 7.5

Long-run behaviour is governed by the dominant eigenvalue. If $|\lambda_1|>|\lambda_2|$ , then $A^k\mathbf{x}_0\approx\lambda_1^k(\mathbf{u}_1^\top\mathbf{x}_0)\mathbf{v}_1$ for large $k$ , where $\mathbf{u}_1^\top$ is the first row of $P^{-1}$ : the component along $\mathbf{v}_2$ decays relative to the component along $\mathbf{v}_1$ . What happens if $|\lambda|>1$ vs $|\lambda|<1$ ?

$A^k\mathbf{x}_0=PD^kP^{-1}\mathbf{x}_0=c_1\lambda_1^k\mathbf{v}_1+c_2\lambda_2^k\mathbf{v}_2$ where $P^{-1}\mathbf{x}_0=(c_1,c_2)^\top$ .

Case $|\lambda_1|>1$ , $|\lambda_2|<1$ : the $\mathbf{v}_2$ component decays. As $k\to\infty$ , $A^k\mathbf{x}_0\approx c_1\lambda_1^k\mathbf{v}_1$ , which diverges along $\mathbf{v}_1$ provided $c_1\neq0$ . The dominant eigenvalue $\lambda_1$ governs the long-run growth rate.

Case $|\lambda_1|=|\lambda_2|<1$ : both components decay. $A^k\mathbf{x}_0\to\mathbf{0}$ — the system is stable.

Case $\lambda_1=1$ , $|\lambda_2|<1$ (for example a stochastic matrix): the state converges to $c_1\mathbf{v}_1$ , a multiple of the eigenvector for $\lambda_1=1$ . For a stochastic matrix this limit is the steady-state distribution.

Suppose $A=PDP^{-1}$ with $D=\text{diag}(\lambda_1,\lambda_2)$ . Describe the long-run behaviour of $A^k\mathbf{x}_0$ for three cases: $|\lambda_1|>1>|\lambda_2|$ ; $|\lambda_1|=|\lambda_2|<1$ ; $\lambda_1=1>|\lambda_2|$ .
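
The three cases can be made concrete by evaluating $c_1\lambda_1^k\mathbf{v}_1+c_2\lambda_2^k\mathbf{v}_2$ for a large $k$. In the sketch below the eigenvectors, the starting vector $\mathbf{x}_0=(1,0)^\top$ and the eigenvalue pairs are all hypothetical choices made for illustration.

```python
def iterate(lams, coeffs, vectors, k):
    """Return c_1 lam_1^k v_1 + c_2 lam_2^k v_2 as a list of floats."""
    dim = len(vectors[0])
    return [sum(c * lam ** k * v[i] for lam, c, v in zip(lams, coeffs, vectors))
            for i in range(dim)]


v1, v2 = [1.0, 1.0], [-1.0, 2.0]  # eigenvectors (hypothetical choice)
c1, c2 = 2 / 3, -1 / 3            # coordinates of x_0 = (1, 0) in that basis

cases = {
    "|l1| > 1 > |l2|": (1.5, 0.5),
    "|l1| = |l2| < 1": (0.5, -0.5),
    "l1 = 1 > |l2|": (1.0, 0.5),
}
for label, lams in cases.items():
    x = iterate(lams, (c1, c2), (v1, v2), 60)
    print(label, x)
```

In the first case the output is very large and points along $\mathbf{v}_1$ ; in the second it is numerically zero; in the third it is approximately $c_1\mathbf{v}_1=(2/3,\,2/3)^\top$ .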

EXERCISE 7.6

A credit rating transition matrix $M$ is stochastic. $M^4$ gives 4-year transitions. Diagonalise $M$ (or note the special structure), raise $D$ to the 4th power, and read off the probability of migrating from AA to default in 4 years. Eigenvalue $1$ corresponds to the absorbing state (default is absorbing here).

$M=\begin{pmatrix}0.9&0.1\\0&1\end{pmatrix}$ (AA stays AA with prob $0.9$ ; default is absorbing).

Upper triangular: eigenvalues $\lambda_1=0.9$ , $\lambda_2=1$ .

$M^4=\begin{pmatrix}0.9^4&\star\\0&1\end{pmatrix}=\begin{pmatrix}0.6561&0.3439\\0&1\end{pmatrix}$ .

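(Top-right entry: $1-0.9^4=0.3439$ , since each row of $M^4$ must sum to 1. Here $M$ is written in the row-stochastic convention, with entry $(i,j)$ the probability of moving from state $i$ to state $j$ ; this is the transpose of the column convention used in Section 06.)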

Interpretation: under this simple model, a bond rated AA today has a $34.39\%$ probability of having defaulted within 4 years. The dominant eigenvalue $\lambda_2=1$ corresponds to the absorbing default state; all probability mass eventually flows there as $k\to\infty$ , because $0.9^k\to0$ .

A simplified two-state credit rating model has transition matrix $M=\begin{pmatrix}0.9&0.1\\0&1\end{pmatrix}$ where state 1 = AA-rated, state 2 = default (absorbing). Compute the 4-year transition matrix $M^4$ and interpret the result for a bond currently rated AA.
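
The four-year matrix can be confirmed by direct multiplication. The sketch below uses exact fractions so that the entries come out without rounding noise; `mat_mul` is an illustrative helper.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


# Row-stochastic: entry (i, j) is the one-year probability of moving from i to j.
M = [[Fraction(9, 10), Fraction(1, 10)],
     [Fraction(0), Fraction(1)]]

M4 = M
for _ in range(3):  # three products give M^4
    M4 = mat_mul(M4, M)

print([[float(x) for x in row] for row in M4])  # [[0.6561, 0.3439], [0.0, 1.0]]
```

The top-right entry is the four-year AA-to-default probability, and each row still sums to 1.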

Chapter Summary

Concept	Formula / Rule
Diagonalisation	$A=PDP^{-1}$ ; columns of $P$ = eigenvectors; diagonal of $D$ = eigenvalues
Diagonalisable condition	$n$ linearly independent eigenvectors (sufficient: $n$ distinct eigenvalues)
Matrix power	$A^k=PD^kP^{-1}$ ; $D^k=\text{diag}(\lambda_1^k,\ldots,\lambda_n^k)$
Matrix exponential	$e^{tA}=Pe^{tD}P^{-1}$ for diagonalisable $A$
Spectral theorem	Symmetric $A=QDQ^\top$ ; $Q$ orthogonal
Defective matrix	$m_g<m_a$ for some $\lambda$ ; requires Jordan form
Dominant eigenvalue	Governs $A^k\mathbf{x}_0$ as $k\to\infty$

Next chapter: Chapter 08 — Linear Transformations, where we study maps $T:V\to W$ satisfying $T(\mathbf{u}+\mathbf{v})=T(\mathbf{u})+T(\mathbf{v})$ and $T(c\mathbf{v})=cT(\mathbf{v})$ , and connect them to matrix representations.