A matrix $A\in\mathbb{R}^{n\times n}$ is diagonalisable if there exists an invertible matrix $P$ such that
$$A=PDP^{-1}$$
where $D$ is a diagonal matrix. Equivalently, $P^{-1}AP=D$.
The columns of $P$ are $n$ linearly independent eigenvectors of $A$, and the diagonal entries of $D$ are the corresponding eigenvalues.
Note
Diagonalisation rewrites A in its "natural" coordinate system — the eigenvector basis — where the transformation is simply scaling along each axis.
Why is this useful?
Matrix powers: $A^k=PD^kP^{-1}$, and $D^k=\operatorname{diag}(\lambda_1^k,\dots,\lambda_n^k)$ is trivial to compute.
Matrix exponential: $e^{tA}=Pe^{tD}P^{-1}$, needed for ODEs and Markov chains.
Understanding long-run behaviour: as $k\to\infty$, the dominant eigenvalue governs $A^kx_0$.
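As a rough illustration of the matrix-exponential identity, the sketch below evaluates $e^{tA}=Pe^{tD}P^{-1}$ for a small example and compares it with a truncated Taylor series. The matrix `A`, its eigenvector matrix `P`, the time `t = 0.3` and the helper names are assumptions chosen for illustration.

```python
import math

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Illustrative matrix (an assumption): eigenvalues 5 and 2,
# eigenvectors (1, 1) and (-1, 2) stored as the columns of P.
A = [[4.0, 1.0], [2.0, 3.0]]
P = [[1.0, -1.0], [1.0, 2.0]]
P_inv = [[2 / 3, 1 / 3], [-1 / 3, 1 / 3]]
eigenvalues = [5.0, 2.0]

def expm_diag(t):
    """e^{tA} = P e^{tD} P^{-1}; e^{tD} only needs scalar exponentials."""
    exp_tD = [[math.exp(t * eigenvalues[0]), 0.0],
              [0.0, math.exp(t * eigenvalues[1])]]
    return matmul(matmul(P, exp_tD), P_inv)

def expm_series(t, terms=40):
    """Truncated Taylor series: sum of (tA)^m / m! for m < terms."""
    total = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    tA = [[t * a for a in row] for row in A]
    for m in range(1, terms):
        # term becomes (tA)^m / m! by multiplying the previous term by tA / m.
        term = [[x / m for x in row] for row in matmul(term, tA)]
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total

E_diag = expm_diag(0.3)
E_series = expm_series(0.3)
max_gap = max(abs(E_diag[i][j] - E_series[i][j])
              for i in range(2) for j in range(2))
print(max_gap)  # expected to be at the level of floating-point rounding
```

The diagonalised route needs two scalar exponentials and two small matrix products, whereas the series needs one matrix product per term.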
Common mistake
Wrong: every square matrix $A$ can be written as $PDP^{-1}$.
Why it happens: eigenvalues always exist (over $\mathbb{C}$), so it feels like diagonalisation should too.
Correct: diagonalisation requires $n$ linearly independent eigenvectors. Defective matrices (where geometric multiplicity < algebraic multiplicity for some $\lambda$) cannot be diagonalised; they require Jordan form instead.
Check: $A=\begin{pmatrix}2&1\\0&2\end{pmatrix}$ has only one independent eigenvector for $\lambda=2$, but the eigenvalue has algebraic multiplicity 2, so $A$ is not diagonalisable.
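The check can be made mechanical, because the geometric multiplicity of $\lambda$ equals $n-\operatorname{rank}(A-\lambda I)$. The sketch below computes it with exact fractions for the $2\times 2$ matrix with rows $(2,1)$ and $(0,2)$; the helper names `rank` and `geometric_multiplicity` are illustrative and do not come from any library.

```python
from fractions import Fraction

def rank(M):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0  # number of pivots found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def geometric_multiplicity(A, lam):
    """dim ker(A - lam*I) = n - rank(A - lam*I)."""
    n = len(A)
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)

A = [[2, 1], [0, 2]]                   # defective: one eigenvector for lambda = 2
print(geometric_multiplicity(A, 2))    # 1, below the algebraic multiplicity 2

I2 = [[2, 0], [0, 2]]                  # same eigenvalue, but already diagonal
print(geometric_multiplicity(I2, 2))   # 2
```

Both matrices have the repeated eigenvalue 2; only the second has enough independent eigenvectors to fill the columns of $P$.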
02 — Diagonalizability Criterion
Definition
$A\in\mathbb{R}^{n\times n}$ is diagonalisable if and only if $A$ has $n$ linearly independent eigenvectors.
Sufficient condition: $A$ is symmetric ($A=A^\top$); the spectral theorem guarantees an orthonormal eigenbasis.
Equivalent condition: for every eigenvalue, the geometric multiplicity equals the algebraic multiplicity.
Example
If $A$ is $3\times 3$ with eigenvalues $-1, 2, 5$ (all distinct), then $A$ automatically has 3 linearly independent eigenvectors and is diagonalisable, whatever its individual entries are, because eigenvectors belonging to distinct eigenvalues are linearly independent.
Wrong: putting $\lambda_1$ first in $D$ but $v_2$ first in $P$.
Why it happens: writing eigenvalues in one order and eigenvectors in another without tracking correspondences.
Correct: the $j$-th column of $P$ must be an eigenvector for the $j$-th diagonal entry of $D$. Swapping both simultaneously (column $j$ of $P$ and entry $j$ of $D$) is fine.
Check: verify $AP=PD$ column by column: $Av_j=\lambda_jv_j$ for each $j$.
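The $AP=PD$ check is easy to run numerically. The following is a minimal sketch on an assumed example matrix with eigenpairs $(5,(1,1)^\top)$ and $(2,(-1,2)^\top)$; the variable names are illustrative.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Illustrative example (an assumption, not prescribed by the text).
A = [[4, 1], [2, 3]]
P = [[1, -1], [1, 2]]          # columns: v1 = (1, 1), v2 = (-1, 2)
D_matched = [[5, 0], [0, 2]]   # lambda_1 = 5 pairs with v1, lambda_2 = 2 with v2
D_swapped = [[2, 0], [0, 5]]   # eigenvalues in the wrong order for this P

print(matmul(A, P) == matmul(P, D_matched))  # True
print(matmul(A, P) == matmul(P, D_swapped))  # False

# Swapping the columns of P together with the diagonal entries is fine.
P_swapped = [[-1, 1], [2, 1]]
print(matmul(A, P_swapped) == matmul(P_swapped, D_swapped))  # True
```

Checking $AP=PD$ avoids computing $P^{-1}$ at all, which makes it a cheap test before going on to matrix powers.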
04 — Computing Matrix Powers via Diagonalization
$$A^k=PD^kP^{-1},\qquad D^k=\begin{pmatrix}\lambda_1^k&0&\cdots\\0&\lambda_2^k&\cdots\\\vdots&&\ddots\end{pmatrix}$$
Compute $A^{10}$ for $A=\begin{pmatrix}4&1\\2&3\end{pmatrix}$
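$$A^{10}=\frac{1}{3}\begin{pmatrix}1&-1\\1&2\end{pmatrix}\begin{pmatrix}9765625&0\\0&1024\end{pmatrix}\begin{pmatrix}2&1\\-1&1\end{pmatrix}$$
Use $P$ and $P^{-1}$ from the diagonalisation step; the eigenvalues are 5 and 2, so $D^{10}=\operatorname{diag}(5^{10},2^{10})$.
$$PD^{10}=\begin{pmatrix}9765625&-1024\\9765625&2048\end{pmatrix}$$
Column $j$ of $PD^{10}$ is column $j$ of $P$ scaled by the $j$-th diagonal entry of $D^{10}$.
$$A^{10}=\frac{1}{3}\begin{pmatrix}9765625\cdot 2+1024&9765625-1024\\9765625\cdot 2-2048&9765625+2048\end{pmatrix}=\frac{1}{3}\begin{pmatrix}19532274&9764601\\19529202&9767673\end{pmatrix}=\begin{pmatrix}6510758&3254867\\6509734&3255891\end{pmatrix}$$
The entries are integers, as they must be for a power of an integer matrix: every entry of the bracketed product is divisible by $\det(P)=3$.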
Note
Without diagonalisation, computing $A^{10}$ by repeated multiplication takes 9 matrix multiplications (4 with repeated squaring). With diagonalisation, it requires only raising scalars to powers plus two matrix products, which is far cheaper for large $k$.
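The two routes to $A^{10}$ can be compared directly. The sketch below uses exact rational arithmetic so that the factor $\frac{1}{3}$ in $P^{-1}$ introduces no rounding; the variable names are illustrative.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[4, 1], [2, 3]]
P = [[1, -1], [1, 2]]
P_inv = [[Fraction(2, 3), Fraction(1, 3)],
         [Fraction(-1, 3), Fraction(1, 3)]]
D10 = [[5**10, 0], [0, 2**10]]   # only scalar powers are needed here

via_diag = matmul(matmul(P, D10), P_inv)

# Reference route: multiply A by itself, 9 matrix products in total.
via_products = A
for _ in range(9):
    via_products = matmul(via_products, A)

print(via_diag == via_products)  # True
print(via_products)              # [[6510758, 3254867], [6509734, 3255891]]
```

A quick sanity check on the result: each row of $A$ sums to 5, so each row of $A^{10}$ sums to $5^{10}=9765625$.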
05 — Orthogonal Diagonalization of Symmetric Matrices
Definition
If $A=A^\top$ (symmetric), then $A$ is orthogonally diagonalisable: there exists an orthogonal matrix $Q$ ($Q^\top Q=I$, i.e. $Q^{-1}=Q^\top$) such that
$$A=QDQ^\top$$
The columns of $Q$ are orthonormal eigenvectors; the diagonal entries of $D$ are real eigenvalues.
Example
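A covariance matrix $\Sigma$ is symmetric ($\Sigma=\Sigma^\top$) and positive semi-definite. By the spectral theorem:
$$\Sigma=QDQ^\top,\qquad D=\operatorname{diag}(\lambda_1,\dots,\lambda_p),\qquad \lambda_i\ge 0$$
The columns of $Q$ are orthonormal principal directions (PCs); $D$ contains their variances. This makes PCA a numerically stable decomposition, and it is unique up to the sign of each direction when the eigenvalues are distinct.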
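For a $2\times 2$ covariance matrix the orthogonal diagonalisation can be written in closed form. The sketch below is one way to do it under the assumption that the off-diagonal entry is non-zero; the function name `symmetric_2x2_eigh` and the sample matrix `Sigma` are hypothetical.

```python
import math

def symmetric_2x2_eigh(S):
    """Closed-form orthogonal diagonalisation of a symmetric 2x2 matrix.

    Returns (eigenvalues, Q) with eigenvalues in descending order and the
    matching orthonormal eigenvectors as the columns of Q.
    Assumes the off-diagonal entry S[0][1] is non-zero.
    """
    a, b, d = S[0][0], S[0][1], S[1][1]
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    eigenvalues = [mean + radius, mean - radius]
    columns = []
    for lam in eigenvalues:
        # (b, lam - a) solves (S - lam*I) v = 0 when b != 0.
        norm = math.hypot(b, lam - a)
        columns.append((b / norm, (lam - a) / norm))
    Q = [[columns[0][0], columns[1][0]],
         [columns[0][1], columns[1][1]]]
    return eigenvalues, Q

Sigma = [[5.0, 2.0], [2.0, 2.0]]   # assumed sample covariance matrix
variances, Q = symmetric_2x2_eigh(Sigma)

# Rebuild Sigma as Q D Q^T and measure the worst entrywise error.
rebuilt = [[sum(Q[i][k] * variances[k] * Q[j][k] for k in range(2))
            for j in range(2)] for i in range(2)]
error = max(abs(rebuilt[i][j] - Sigma[i][j])
            for i in range(2) for j in range(2))
print(variances)  # approximately [6.0, 1.0]
print(error)      # close to zero
```

Here the first principal direction is proportional to $(2,1)^\top$ and carries variance 6; the second is proportional to $(1,-2)^\top$ and carries variance 1.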
06 — Quant Application — Matrix Powers in Markov Chains
A Markov chain transition matrix $M$ has columns summing to 1 (a column-stochastic matrix). The state after $k$ steps is $s_k=M^ks_0$.
If $M$ is diagonalisable with $M=PDP^{-1}$:
$$s_k=PD^kP^{-1}s_0$$
As $k\to\infty$, eigenvalues with $|\lambda|<1$ decay to zero. A stochastic matrix always has the eigenvalue $\lambda_1=1$, and no eigenvalue exceeds it in modulus; when every other eigenvalue satisfies $|\lambda|<1$, $M^k$ converges to a steady-state matrix as $k\to\infty$.
In credit risk, rating transition matrices are Markov chains. The 5-year migration probability matrix is $M^5=PD^5P^{-1}$, and the same factorisation gives the matrix for any other horizon $k$ by changing the exponent.
Similarly, in multi-factor versions of the Vasicek interest rate model, the mean-reversion dynamics involve a matrix exponential $e^{tA}$, which reduces to $Pe^{tD}P^{-1}$ when $A$ is diagonalisable.
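To see the decay of the non-dominant eigenvalue in a concrete chain, the sketch below compares the eigen-decomposition formula for $s_k$ with step-by-step iteration. The two-state column-stochastic matrix and its transition probabilities `a` and `b` are assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical column-stochastic matrix: entry (i, j) is P(state j -> state i).
a, b = Fraction(1, 10), Fraction(1, 2)   # P(1 -> 2) and P(2 -> 1)
M = [[1 - a, b], [a, 1 - b]]

# Eigenpairs of this 2x2 family: lambda_1 = 1 with v1 = (b, a) / (a + b),
# and lambda_2 = 1 - a - b with v2 = (1, -1).
lam2 = 1 - a - b
v1 = [b / (a + b), a / (a + b)]

def state_via_eigen(s0, k):
    """s_k = c1*v1 + c2*lam2**k*v2, where c1 = 1 for a probability vector."""
    c2 = s0[0] - v1[0]
    return [v1[0] + c2 * lam2**k, v1[1] - c2 * lam2**k]

def state_via_steps(s0, k):
    """Apply s -> M s a total of k times."""
    s = list(s0)
    for _ in range(k):
        s = [M[0][0] * s[0] + M[0][1] * s[1],
             M[1][0] * s[0] + M[1][1] * s[1]]
    return s

s0 = [Fraction(1), Fraction(0)]          # start in state 1 with certainty
print(state_via_eigen(s0, 5) == state_via_steps(s0, 5))  # True
print([float(x) for x in state_via_eigen(s0, 50)])       # close to v1 = (5/6, 1/6)
```

The only $k$-dependence sits in `lam2**k`, so the distance to the steady state shrinks by the factor $|\lambda_2|=0.4$ per step in this example.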
Exercises
EXERCISE 7.1
Diagonalise $A=\begin{pmatrix}3&-2\\1&0\end{pmatrix}$: find $P$ and $D$ such that $A=PDP^{-1}$.
Hint
Check whether the matrix has $n$ distinct eigenvalues (automatic diagonalisability), or has repeated eigenvalues (check geometric vs algebraic multiplicity). A zero eigenvalue does not prevent diagonalisability by itself.
EXERCISE 7.3
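Using the diagonalisation from Exercise 7.2, compute $A^6$.
Hint
Once $A=PDP^{-1}$, use $A^k=PD^kP^{-1}$. Compute $D^6$ by raising each diagonal entry to the 6th power, then carry out the two matrix multiplications.
Solution
Using $A=PDP^{-1}$ from Exercise 7.2 with $\lambda_1=2$, $\lambda_2=1$.
$D^6=\begin{pmatrix}2^6&0\\0&1^6\end{pmatrix}=\begin{pmatrix}64&0\\0&1\end{pmatrix}$.
$A^6=PD^6P^{-1}=\begin{pmatrix}2&1\\1&1\end{pmatrix}\begin{pmatrix}64&0\\0&1\end{pmatrix}\begin{pmatrix}1&-1\\-1&2\end{pmatrix}$.
$PD^6=\begin{pmatrix}128&1\\64&1\end{pmatrix}$.
$A^6=\begin{pmatrix}128&1\\64&1\end{pmatrix}\begin{pmatrix}1&-1\\-1&2\end{pmatrix}=\begin{pmatrix}127&-126\\63&-62\end{pmatrix}$.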
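The same factorisation gives a closed form for every power, which is a useful cross-check on the arithmetic. A short derivation, taking $P$, $D$ and $P^{-1}$ as quoted in the solution to this exercise:

```latex
A^k = PD^kP^{-1}
    = \begin{pmatrix}2&1\\1&1\end{pmatrix}
      \begin{pmatrix}2^k&0\\0&1\end{pmatrix}
      \begin{pmatrix}1&-1\\-1&2\end{pmatrix}
    = \begin{pmatrix}2^{k+1}&1\\2^k&1\end{pmatrix}
      \begin{pmatrix}1&-1\\-1&2\end{pmatrix}
    = \begin{pmatrix}2^{k+1}-1 & 2-2^{k+1}\\ 2^k-1 & 2-2^k\end{pmatrix}
```

Setting $k=1$ recovers $A$ itself, and $k=6$ gives the entries $127$, $-126$, $63$, $-62$ of $A^6$.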
EXERCISE 7.4
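Orthogonally diagonalise the symmetric matrix $S=\begin{pmatrix}2&1\\1&2\end{pmatrix}$: find orthogonal $Q$ and diagonal $D$ such that $S=QDQ^\top$.
Hint
For a symmetric matrix, find the eigenvalues, then for each eigenvalue find an eigenvector and normalise it to unit length. The resulting unit vectors form the columns of $Q$. Verify $Q^\top Q=I$.
Solution
The characteristic polynomial is $(2-\lambda)^2-1=(\lambda-3)(\lambda-1)$, so the eigenvalues are 3 and 1.
For $\lambda=3$: $S-3I=\begin{pmatrix}-1&1\\1&-1\end{pmatrix}$; $v=\begin{pmatrix}1\\1\end{pmatrix}$; normalised: $q_1=\frac{1}{\sqrt{2}}\begin{pmatrix}1\\1\end{pmatrix}$.
For $\lambda=1$: $S-I=\begin{pmatrix}1&1\\1&1\end{pmatrix}$; $v=\begin{pmatrix}-1\\1\end{pmatrix}$; normalised: $q_2=\frac{1}{\sqrt{2}}\begin{pmatrix}-1\\1\end{pmatrix}$.
$Q=\frac{1}{\sqrt{2}}\begin{pmatrix}1&-1\\1&1\end{pmatrix}$, $D=\begin{pmatrix}3&0\\0&1\end{pmatrix}$, so $S=QDQ^\top$.
Verify: $Q^\top Q=\frac{1}{2}\begin{pmatrix}1&1\\-1&1\end{pmatrix}\begin{pmatrix}1&-1\\1&1\end{pmatrix}=\frac{1}{2}\begin{pmatrix}2&0\\0&2\end{pmatrix}=I$ ✓.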
EXERCISE 7.5
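Suppose $A=PDP^{-1}$ with $D=\operatorname{diag}(\lambda_1,\lambda_2)$. Describe the long-run behaviour of $A^kx_0$ for three cases: $|\lambda_1|>1>|\lambda_2|$; $|\lambda_1|=|\lambda_2|<1$; $\lambda_1=1>|\lambda_2|$.
Hint
Long-run behaviour is governed by the dominant eigenvalue. If $|\lambda_1|>|\lambda_2|$, then $A^kx_0\approx c_1\lambda_1^kv_1$ for large $k$, where $c_1$ is the coefficient of $v_1$ when $x_0$ is expanded in the eigenvector basis: the component along $v_2$ decays relative to the component along $v_1$. What happens if $|\lambda|>1$ vs $|\lambda|<1$?
Solution
$A^kx_0=PD^kP^{-1}x_0=c_1\lambda_1^kv_1+c_2\lambda_2^kv_2$ where $P^{-1}x_0=(c_1,c_2)^\top$.
Case $|\lambda_1|>1$, $|\lambda_2|<1$: the $v_2$ component decays. As $k\to\infty$, $A^kx_0\approx c_1\lambda_1^kv_1$, which diverges along $v_1$ whenever $c_1\neq 0$. The dominant eigenvalue $\lambda_1$ governs the long-run growth rate.
Case $|\lambda_1|=|\lambda_2|<1$: both components decay. $A^kx_0\to 0$, so the system is stable.
Case $\lambda_1=1$, $|\lambda_2|<1$ (for example, a stochastic matrix): the state converges to $c_1v_1$, a multiple of the eigenvector for $\lambda_1=1$. For a Markov chain this limit is the steady-state distribution.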
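The three cases can be tabulated by tracking the size of each eigen-component $c_i\lambda_i^k$ separately. In the sketch below the eigenvalues, the coefficients $c_1=c_2=1$ and the sampled steps are all assumed values chosen only to illustrate the pattern.

```python
def component_sizes(lam1, lam2, c1=1.0, c2=1.0, steps=(0, 10, 50)):
    """Sizes |c_i * lam_i**k| of the two eigen-components of A^k x0."""
    return [(k, abs(c1 * lam1**k), abs(c2 * lam2**k)) for k in steps]

# Hypothetical eigenvalue pairs, one per case in the exercise.
cases = {
    "growth (|l1| > 1 > |l2|)": (1.1, 0.5),
    "decay  (|l1| = |l2| < 1)": (0.8, -0.8),
    "steady (l1 = 1 > |l2|)": (1.0, 0.5),
}

results = {label: component_sizes(l1, l2) for label, (l1, l2) in cases.items()}
for label, rows in results.items():
    print(label)
    for k, size1, size2 in rows:
        print(f"  k={k:>2}  |c1*l1^k|={size1:.3e}  |c2*l2^k|={size2:.3e}")
```

In the growth case the first component exceeds 100 by $k=50$ while the second is negligible; in the decay case both shrink at the same rate; in the steady case the first stays at 1 and only the second dies out.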
EXERCISE 7.6
A simplified two-state credit rating model has transition matrix $M=\begin{pmatrix}0.9&0.1\\0&1\end{pmatrix}$, where state 1 = AA-rated and state 2 = default (absorbing). Compute the 4-year transition matrix $M^4$ and interpret the result for a bond currently rated AA.
Hint
A credit rating transition matrix $M$ is stochastic. $M^4$ gives 4-year transitions. Diagonalise $M$ (or note the special structure), raise $D$ to the 4th power, and read off the probability of migrating from AA to default in 4 years. Eigenvalue 1 corresponds to the absorbing state (default is absorbing here).
Solution
$M=\begin{pmatrix}0.9&0.1\\0&1\end{pmatrix}$ (AA stays AA with probability 0.9; default is absorbing). Here each row sums to 1: row $i$ holds the transition probabilities out of state $i$.
Upper triangular: eigenvalues $\lambda_1=0.9$, $\lambda_2=1$.
$M^4=\begin{pmatrix}0.9^4&\star\\0&1\end{pmatrix}=\begin{pmatrix}0.6561&0.3439\\0&1\end{pmatrix}$.
(Top-right entry: $1-0.9^4=0.3439$, since rows must sum to 1 for a row-stochastic matrix.)
Interpretation: a bond rated AA today has a 34.39% probability of having defaulted within 4 years according to this simple model. The dominant eigenvalue $\lambda_2=1$ corresponds to the default (absorbing) state; all probability mass eventually flows there as $k\to\infty$, because $0.9^k\to 0$.
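The 4-year matrix can be confirmed by direct multiplication. This is a minimal sketch using exact fractions, with rows as the "from" state and columns as the "to" state; the helper name `matmul` is illustrative.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Row 1: AA -> (AA, default); row 2: default -> (AA, default).
M = [[Fraction(9, 10), Fraction(1, 10)],
     [Fraction(0), Fraction(1)]]

M4 = M
for _ in range(3):          # three further products give M^4
    M4 = matmul(M4, M)

print([[float(x) for x in row] for row in M4])  # [[0.6561, 0.3439], [0.0, 1.0]]
print(M4[0][1] == 1 - Fraction(9, 10) ** 4)     # True
```

The second line of output confirms the shortcut used in the solution: the cumulative default probability is one minus the probability of surviving four consecutive years.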
Chapter Summary
| Concept | Formula / Rule |
|---|---|
| Diagonalisation | $A=PDP^{-1}$; columns of $P$ = eigenvectors; diagonal of $D$ = eigenvalues |
| Diagonalisable condition | $n$ linearly independent eigenvectors (sufficient: $n$ distinct eigenvalues) |
| Matrix power | $A^k=PD^kP^{-1}$; $D^k=\operatorname{diag}(\lambda_1^k,\dots,\lambda_n^k)$ |
| Matrix exponential | $e^{tA}=Pe^{tD}P^{-1}$ for diagonalisable $A$ |
| Spectral theorem | Symmetric $A=QDQ^\top$; $Q$ orthogonal |
| Defective matrix | $m_g<m_a$ for some $\lambda$; requires Jordan form |
| Dominant eigenvalue | Governs $A^kx_0$ as $k\to\infty$ |
Next chapter: Chapter 08 — Linear Transformations, where we study maps T:V→W satisfying T(u+v)=T(u)+T(v) and T(cv)=cT(v), and connect them to matrix representations.