Chapter 18

Matrix Decompositions — SVD

00 · Symbol Glossary

$A = U\Sigma V^T$ ("A equals U Sigma V transpose"): SVD

The singular value decomposition of any $m \times n$ matrix $A$. $U$ holds the left singular vectors (orthonormal in $\mathbb{R}^m$), $\Sigma$ is diagonal with nonnegative singular values $\sigma_1 \geq \sigma_2 \geq \cdots \geq 0$, and $V$ holds the right singular vectors (orthonormal in $\mathbb{R}^n$). Unlike LU without pivoting or an eigendecomposition, the SVD exists for every matrix: square, rectangular, singular, or rank-deficient.

$\sigma_i$ ("sigma i"): singular value

The $i$-th singular value of $A$: the square root of the $i$-th largest eigenvalue of $A^TA$ (equivalently of $AA^T$, which has the same nonzero eigenvalues). $\sigma_1$ is the largest, and $\sigma_i \geq 0$ always. In data analysis on a centred data matrix, $\sigma_i^2$ is proportional to the variance along the $i$-th principal direction.

$\mathbf{u}_i$ ("u sub i"): left singular vector

The $i$-th column of $U$. Satisfies $A\mathbf{v}_i = \sigma_i \mathbf{u}_i$ and $AA^T\mathbf{u}_i = \sigma_i^2 \mathbf{u}_i$. Left singular vectors are orthonormal eigenvectors of $AA^T$.

$\mathbf{v}_i$ ("v sub i"): right singular vector

The $i$-th column of $V$. Satisfies $A^T\mathbf{u}_i = \sigma_i \mathbf{v}_i$ and $A^TA\mathbf{v}_i = \sigma_i^2 \mathbf{v}_i$. Right singular vectors are orthonormal eigenvectors of $A^TA$.

$A^+$ ("A plus"): Moore–Penrose pseudoinverse

The pseudoinverse $A^+ = V\Sigma^+ U^T$, where $\Sigma^+$ inverts the nonzero singular values and leaves zeros at zero. For full-rank square $A$, $A^+ = A^{-1}$. For overdetermined least squares, $\hat{\mathbf{x}} = A^+\mathbf{b}$ gives the minimum-norm least squares solution.

01 · The Universal Decomposition

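Each earlier factorisation needs a side condition before it is useful. LU without row exchanges needs nonzero pivots, and solving a system with it needs a square invertible matrix. QR yields an invertible $R$ only when the columns are linearly independent. Eigenvalue decomposition requires a square diagonalisable matrix. The SVD exists for every $m \times n$ real matrix, with no conditions.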

Definition — Singular Value Decomposition

For any $A \in \mathbb{R}^{m \times n}$ with rank $r$ , there exist:

$U \in \mathbb{R}^{m \times m}$ — orthogonal ( $U^TU = UU^T = I_m$ ).

$V \in \mathbb{R}^{n \times n}$ — orthogonal ( $V^TV = VV^T = I_n$ ).

$\Sigma \in \mathbb{R}^{m \times n}$ — diagonal with $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0$ and $\sigma_{r+1} = \cdots = 0$ .

such that:

$$A = U\Sigma V^T$$

$\sigma_i$ — the singular values of $A$ .

$\mathbf{u}_i$ — columns of $U$ , the left singular vectors.

$\mathbf{v}_i$ — columns of $V$ , the right singular vectors.
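The shapes in this definition can be checked numerically. The sketch below assumes NumPy and uses a hypothetical $4 \times 3$ matrix chosen only for illustration; `np.linalg.svd` returns the singular values as a vector, so the rectangular $\Sigma$ is rebuilt by hand.

```python
import numpy as np

# Hypothetical 4 x 3 matrix, chosen only to illustrate the shapes.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [2.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
m, n = A.shape

# full_matrices=True returns square U (m x m) and Vt (n x n);
# s holds the min(m, n) singular values in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Embed the singular values in an m x n rectangular "diagonal" Sigma.
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)

A_rebuilt = U @ Sigma @ Vt

print(U.shape, Sigma.shape, Vt.shape)        # (4, 4) (4, 3) (3, 3)
print(np.allclose(A, A_rebuilt))             # reconstruction check
print(np.allclose(U.T @ U, np.eye(m)))       # U is orthogonal
print(np.allclose(Vt @ Vt.T, np.eye(n)))     # V is orthogonal
```

Note that NumPy returns $V^T$ (here `Vt`), not $V$, so the right singular vectors are the rows of `Vt`.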

✓ Example — SVD of a Return Panel

A $252 \times 50$ matrix $A$ holds daily excess returns for 50 stocks over one trading year. SVD gives $A = U\Sigma V^T$ :

$\mathbf{v}_1$ — the first right singular vector — is the portfolio of stock weights explaining the most cross-sectional variance ( $\sigma_1^2$ ).

$\mathbf{v}_2$ — orthogonal to $\mathbf{v}_1$ — captures the next-largest independent pattern.

Keeping the top $k \ll 50$ singular values approximates $A$ with a rank-$k$ matrix, compressing 50-dimensional daily data into $k$ statistical factors. This is PCA in matrix form: once the columns of $A$ are centred, Chapter 15's spectral decomposition of the covariance matrix is equivalent to the SVD of the data matrix.

❌ Failure — Treating SVD as Optional for Singular Matrices

$A = \begin{pmatrix}1&2\\2&4\end{pmatrix}$ has rank 1 and is singular. LU produces a zero second pivot, and QR produces an $R$ with a zero on its diagonal because the columns are dependent, so neither factor can be inverted to solve $A\mathbf{x} = \mathbf{b}$. The eigenvalue decomposition gives $\lambda_1 = 5$, $\lambda_2 = 0$, but it is available only because this $A$ happens to be square and diagonalisable.

SVD: $\sigma_1 = 5$, $\sigma_2 = 0$. $\mathbf{v}_1 = \frac{1}{\sqrt{5}}\begin{pmatrix}1\\2\end{pmatrix}$ spans the row space and, because $A$ is symmetric, $\mathbf{u}_1 = \mathbf{v}_1$ also spans the column space; $\mathbf{v}_2 = \frac{1}{\sqrt{5}}\begin{pmatrix}-2\\1\end{pmatrix}$ spans the null space of $A$.

Consequence: SVD is the robust tool for rank-deficient and rectangular matrices. It is not a special case: it applies to every matrix, including those where LU, QR, and eigendecomposition need extra conditions.
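The rank and null space of this singular matrix can be read off numerically. The sketch below assumes NumPy; the tolerance used to decide which singular values count as zero is an illustrative choice, not a universal rule.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

U, s, Vt = np.linalg.svd(A)

# Singular values below a tolerance are treated as zero.
# This tolerance is an illustrative assumption.
tol = max(A.shape) * np.finfo(float).eps * s[0]
rank = int(np.sum(s > tol))

# Rows of Vt beyond the rank span the null space of A.
null_basis = Vt[rank:].T

print(s)                                  # approximately [5, 0]
print(rank)                               # 1
print(np.allclose(A @ null_basis, 0.0))   # null-space check
```

The sign of each singular vector is arbitrary, so the computed null-space vector may be $\pm\frac{1}{\sqrt{5}}(-2, 1)^T$.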

02 · Singular Values from $A^TA$

The right singular vectors and singular values come from the eigenproblem of $A^TA$ .

Definition — Singular Values as Eigenvalues

The singular values of $A$ satisfy:

$$\sigma_i = \sqrt{\lambda_i(A^TA)}$$

$\lambda_i(A^TA)$ — the $i$ -th eigenvalue of $A^TA$ , ordered $\lambda_1 \geq \lambda_2 \geq \cdots \geq 0$ .

The right singular vector $\mathbf{v}_i$ is the unit eigenvector of $A^TA$ for $\lambda_i$ .

The left singular vector is recovered by:

$$\mathbf{u}_i = \frac{1}{\sigma_i} A\mathbf{v}_i \quad (\sigma_i > 0)$$

Step-by-step — SVD of $A = \begin{pmatrix}1&2\\2&1\end{pmatrix}$

Form $A^TA$ : $A^TA = \begin{pmatrix}1&2\\2&1\end{pmatrix}\begin{pmatrix}1&2\\2&1\end{pmatrix} = \begin{pmatrix}1+4&2+2\\2+2&4+1\end{pmatrix} = \begin{pmatrix}5&4\\4&5\end{pmatrix}$ .

The $(1,1)$ entry $5$ comes from $1^2+2^2$ ; the $(1,2)$ entry $4$ from $1\cdot2+2\cdot1$ .

Find eigenvalues of $A^TA$ : characteristic polynomial $\det(A^TA - \lambda I) = (5-\lambda)^2 - 16 = \lambda^2 - 10\lambda + 9 = (\lambda-9)(\lambda-1)$ .

$\lambda_1 = 9$ , $\lambda_2 = 1$ . Singular values: $\sigma_1 = \sqrt{9} = 3$ , $\sigma_2 = \sqrt{1} = 1$ .

Right singular vector $\mathbf{v}_1$ (eigenvector for $\lambda_1=9$): $(A^TA - 9I)\mathbf{v} = \begin{pmatrix}-4&4\\4&-4\end{pmatrix}\mathbf{v} = \mathbf{0}$.

Writing $\mathbf{v} = (x, y)^T$, the first row gives $-4x+4y=0 \Rightarrow x=y$. Unit vector: $\mathbf{v}_1 = \frac{1}{\sqrt{2}}\begin{pmatrix}1\\1\end{pmatrix}$.

Right singular vector $\mathbf{v}_2$ (eigenvector for $\lambda_2=1$): $(A^TA - I)\mathbf{v} = \begin{pmatrix}4&4\\4&4\end{pmatrix}\mathbf{v} = \mathbf{0}$.

With the same components, the first row gives $4x+4y=0 \Rightarrow x=-y$. Unit vector: $\mathbf{v}_2 = \frac{1}{\sqrt{2}}\begin{pmatrix}1\\-1\end{pmatrix}$.

Left singular vectors from $\mathbf{u}_i = A\mathbf{v}_i/\sigma_i$ :

$A\mathbf{v}_1 = \frac{1}{\sqrt{2}}\begin{pmatrix}1+2\\2+1\end{pmatrix} = \frac{3}{\sqrt{2}}\begin{pmatrix}1\\1\end{pmatrix}$ . $\mathbf{u}_1 = \frac{1}{3}\cdot\frac{3}{\sqrt{2}}\begin{pmatrix}1\\1\end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix}1\\1\end{pmatrix}$ .

$A\mathbf{v}_2 = \frac{1}{\sqrt{2}}\begin{pmatrix}1-2\\2-1\end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix}-1\\1\end{pmatrix}$ . $\mathbf{u}_2 = \frac{1}{1}\cdot\frac{1}{\sqrt{2}}\begin{pmatrix}-1\\1\end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix}-1\\1\end{pmatrix}$ .

Assemble the SVD:

$$A = \underbrace{\frac{1}{\sqrt{2}}\begin{pmatrix}1&-1\\1&1\end{pmatrix}}_{U} \underbrace{\begin{pmatrix}3&0\\0&1\end{pmatrix}}_{\Sigma} \underbrace{\frac{1}{\sqrt{2}}\begin{pmatrix}1&1\\1&-1\end{pmatrix}^T}_{V^T}$$

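Verify: $U\Sigma V^T = 3\mathbf{u}_1\mathbf{v}_1^T + 1\cdot\mathbf{u}_2\mathbf{v}_2^T = \frac{3}{2}\begin{pmatrix}1&1\\1&1\end{pmatrix} + \frac{1}{2}\begin{pmatrix}-1&1\\1&-1\end{pmatrix} = \begin{pmatrix}1&2\\2&1\end{pmatrix} = A$ ✓. $\sigma_1=3$ is the stretch along the dominant direction and $\sigma_2=1$ the stretch along the secondary direction. $U$ and $V$ are orthogonal ($U^TU = I_2$ ✓).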
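The hand computation above can be mirrored step by step in code. This is a sketch assuming NumPy; it follows the $A^TA$ route used in this section for illustration, whereas library routines such as `np.linalg.svd` avoid forming $A^TA$ explicitly because doing so squares the condition number.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])

# eigh is for symmetric matrices and returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(A.T @ A)

# Reorder to descending so that sigma_1 >= sigma_2.
order = np.argsort(eigvals)[::-1]
sigma = np.sqrt(eigvals[order])
V = eigvecs[:, order]

# u_i = A v_i / sigma_i, valid here because both sigma_i are positive.
# Broadcasting divides column i of A @ V by sigma[i].
U = (A @ V) / sigma

print(sigma)                                           # approximately [3, 1]
print(np.allclose(U @ np.diag(sigma) @ V.T, A))        # reconstruction check
print(np.allclose(sigma, np.linalg.svd(A, compute_uv=False)))
```

The computed columns of $U$ and $V$ may differ from the hand-derived ones by a simultaneous sign flip of $\mathbf{u}_i$ and $\mathbf{v}_i$, which leaves the product $\sigma_i\mathbf{u}_i\mathbf{v}_i^T$ unchanged.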

03 · Geometric Interpretation

SVD reveals that every linear map is a composition of three steps: rotate or reflect ($V^T$), scale along the coordinate axes ($\Sigma$), then rotate or reflect ($U$).

Definition — Action of SVD on the Unit Sphere

The unit sphere in $\mathbb{R}^n$ maps under $A$ to an ellipsoid in $\mathbb{R}^m$ (for $n = 2$, the unit circle maps to an ellipse). Its nonzero semi-axes have lengths $\sigma_1, \sigma_2, \ldots, \sigma_r$, oriented along $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r$.

$$\|A\mathbf{v}_i\| = \sigma_i$$

$\mathbf{v}_i$ is the unit input direction that $A$ maps to $\sigma_i\mathbf{u}_i$: the output has length exactly $\sigma_i$ and points along $\mathbf{u}_i$, which need not be parallel to $\mathbf{v}_i$.

The operator norm is $\|A\|_2 = \sigma_1$, the maximum stretching factor. When $A$ has full rank, $r = \min(m, n)$ and the condition number is $\kappa(A) = \sigma_1/\sigma_r$.
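A short numerical check of these two facts, assuming NumPy and reusing the chapter's $2 \times 2$ example. The random unit vectors are synthetic and serve only to illustrate that no input is stretched by more than $\sigma_1$ or less than $\sigma_2$.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])
s = np.linalg.svd(A, compute_uv=False)

op_norm = np.linalg.norm(A, 2)   # spectral (operator 2-) norm
kappa = np.linalg.cond(A)        # 2-norm condition number by default

# Synthetic unit vectors on the circle (columns of x).
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 1000))
x /= np.linalg.norm(x, axis=0)
stretch = np.linalg.norm(A @ x, axis=0)

print(op_norm, s[0])             # both approximately 3
print(kappa, s[0] / s[-1])       # both approximately 3
print(stretch.min(), stretch.max())   # stays within [sigma_2, sigma_1]
```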

❌ Failure — Confusing Singular Values with Eigenvalues

For $A = \begin{pmatrix}0&1\\0&0\end{pmatrix}$ , eigenvalues are $0, 0$ but $\sigma_1 = 1$ .

Why it breaks: eigenvalues come from $A\mathbf{v} = \lambda\mathbf{v}$ — $A$ must act parallel to $\mathbf{v}$ . Singular values come from $A^TA$ — they measure stretching in the best-matching input-output directions, which need not be the same vector.

Consequence: $\sigma_i = |\lambda_i|$ holds for symmetric matrices (more generally, for normal matrices), but not for general $A$. For general $A$, use SVD for norms, conditioning, and low-rank structure, not the eigenvalue decomposition of $A$ itself.
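The contrast is easy to see numerically. The sketch below assumes NumPy and compares the nilpotent matrix from this box with the symmetric matrix used earlier in the chapter.

```python
import numpy as np

# Nilpotent matrix: both eigenvalues are zero, yet it stretches e_2 by 1.
N = np.array([[0.0, 1.0],
              [0.0, 0.0]])
eig_N = np.linalg.eigvals(N)
sv_N = np.linalg.svd(N, compute_uv=False)

# Symmetric matrix: singular values equal |eigenvalues|.
S = np.array([[1.0, 2.0],
              [2.0, 1.0]])
eig_S = np.linalg.eigvalsh(S)                 # ascending: [-1, 3]
sv_S = np.linalg.svd(S, compute_uv=False)     # descending: [3, 1]
abs_eig_S = np.sort(np.abs(eig_S))[::-1]

print(eig_N, sv_N)        # eigenvalues [0, 0], singular values [1, 0]
print(abs_eig_S, sv_S)    # both approximately [3, 1]
```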

04 · Low-Rank Approximation (Eckart–Young)

Definition — Best Rank-$k$ Approximation

For $A = U\Sigma V^T = \sum_{i=1}^r \sigma_i \mathbf{u}_i \mathbf{v}_i^T$ , the rank- $k$ matrix closest to $A$ in the Frobenius norm is:

$$A_k = \sum_{i=1}^k \sigma_i \mathbf{u}_i \mathbf{v}_i^T = U_k \Sigma_k V_k^T$$

$A_k$ — keep the top $k$ singular triplets, discard the rest.

Eckart–Young theorem: $\|A - A_k\|_F = \sqrt{\sigma_{k+1}^2 + \cdots + \sigma_r^2}$ — the discarded singular values measure the approximation error exactly.
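The error formula can be confirmed on a small example. This is a sketch assuming NumPy; the $6 \times 4$ matrix is synthetic random data, and `rank_k_approx` is a hypothetical helper name introduced only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))   # synthetic matrix, illustration only

# Thin SVD: U is 6 x 4, s has 4 entries, Vt is 4 x 4.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

def rank_k_approx(U, s, Vt, k):
    """Keep the top k singular triplets (hypothetical helper)."""
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

k = 2
A_k = rank_k_approx(U, s, Vt, k)

error = np.linalg.norm(A - A_k, "fro")
predicted = np.sqrt(np.sum(s[k:] ** 2))   # sqrt of discarded sigma_i^2

print(np.linalg.matrix_rank(A_k))         # 2
print(np.isclose(error, predicted))       # Eckart-Young error formula
```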

✓ Example — Compressing a Covariance Structure

For $A = \begin{pmatrix}1&2\\2&1\end{pmatrix}$ with $\sigma_1=3$ , $\sigma_2=1$ : the rank-1 approximation is $A_1 = 3\mathbf{u}_1\mathbf{v}_1^T$ .

Fraction of "energy" retained: $\sigma_1^2/(\sigma_1^2+\sigma_2^2) = 9/10 = 90\%$ .

In a 500-stock return panel, if the top 10 singular values capture 85% of total variance ( $\sum_{i=1}^{10}\sigma_i^2 / \sum_{i=1}^{500}\sigma_i^2$ ), a rank-10 factor model replaces 500 dimensions with 10 — the backbone of statistical risk models in portfolio management.
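Choosing $k$ from a target energy fraction can be sketched as follows, assuming NumPy. The function name `smallest_rank_for_energy` is hypothetical, and the input is the chapter's pair $\sigma_1 = 3$, $\sigma_2 = 1$ rather than a real return panel.

```python
import numpy as np

def smallest_rank_for_energy(s, target):
    """Smallest k with sum_{i<=k} s_i^2 / sum_i s_i^2 >= target.

    Hypothetical helper; assumes s is sorted in descending order.
    """
    energy = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(energy, target)) + 1
    return min(k, len(s))   # guard against rounding at target = 1.0

s = np.array([3.0, 1.0])

print(smallest_rank_for_energy(s, 0.85))   # 1, since 9/10 >= 0.85
print(smallest_rank_for_energy(s, 0.95))   # 2, since 9/10 < 0.95
```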

05 · Connection to PCA and the Pseudoinverse

Definition — SVD and PCA

If $X$ is the $T \times p$ centred data matrix (each column demeaned), then:

$$X = U\Sigma V^T$$

Columns of $V$ are the principal component directions. $\sigma_i^2/T$ is the variance explained by PC $i$. The sample covariance is $\hat{\Sigma} = \frac{1}{T}X^TX = V\frac{\Sigma^T\Sigma}{T}V^T$, an eigendecomposition with eigenvalues $\sigma_i^2/T$. Here $\hat{\Sigma}$ denotes the covariance matrix, while $\Sigma$ is the matrix of singular values.

PCA from Chapter 15 and SVD of the data matrix are the same computation in different packaging.
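The equivalence can be checked on synthetic data. This sketch assumes NumPy, uses the $1/T$ covariance convention from the definition above, and generates a hypothetical $200 \times 3$ data matrix purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
T, p = 200, 3

# Synthetic correlated data; the mixing matrix is an arbitrary choice.
mix = np.array([[2.0, 0.5, 0.0],
                [0.0, 1.0, 0.3],
                [0.0, 0.0, 0.5]])
raw = rng.standard_normal((T, p)) @ mix
X = raw - raw.mean(axis=0)           # centre each column

# Route 1: SVD of the centred data matrix.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
var_from_svd = s ** 2 / T

# Route 2: eigendecomposition of the sample covariance (1/T convention).
cov = X.T @ X / T
eigvals, eigvecs = np.linalg.eigh(cov)            # ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# Directions agree up to a sign flip per component.
aligned = np.abs(Vt @ eigvecs)

print(np.allclose(var_from_svd, eigvals))
print(np.allclose(aligned, np.eye(p), atol=1e-6))
```

With the $1/(T-1)$ convention that many libraries use, the variances become $\sigma_i^2/(T-1)$; the directions are unchanged.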

Definition — Moore–Penrose Pseudoinverse

For $A = U\Sigma V^T$ , define:

$$A^+ = V\Sigma^+ U^T, \quad \Sigma^+_{ii} = \begin{cases}1/\sigma_i & \sigma_i > 0 \\ 0 & \sigma_i = 0\end{cases}$$

The least squares solution of minimum Euclidean norm is $\hat{\mathbf{x}} = A^+\mathbf{b}$ . When $A$ is invertible, $A^+ = A^{-1}$ .
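A minimal sketch of this definition, assuming NumPy. The helper name `pinv_via_svd` and the relative cutoff `rtol` used to decide which singular values count as zero are assumptions for illustration; the rank-1 matrix is the one from section 01.

```python
import numpy as np

def pinv_via_svd(A, rtol=1e-12):
    """Pseudoinverse from the SVD (hypothetical helper).

    Singular values at or below rtol * sigma_1 are treated as zero.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_inv = np.zeros_like(s)
    keep = s > rtol * s[0]
    s_inv[keep] = 1.0 / s[keep]       # invert only the nonzero sigma_i
    return Vt.T @ np.diag(s_inv) @ U.T

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
A_plus = pinv_via_svd(A)

print(A_plus * 25)                                # approximately [[1, 2], [2, 4]]
print(np.allclose(A_plus, np.linalg.pinv(A)))     # compare with NumPy's routine
print(np.allclose(A @ A_plus @ A, A))             # a defining Penrose identity
```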

✓ Example — Hedging with a Rank-Deficient Exposure Matrix

A desk has exposure matrix $A \in \mathbb{R}^{m \times n}$ mapping $n$ instrument weights to $m$ risk factors, and a target exposure vector $\mathbf{e} \in \mathbb{R}^m$ to hedge. Redundancy leaves the rank at $r < n$, so many weight vectors produce the same exposures. The minimum-norm hedge $\mathbf{w} = A^+\mathbf{e}$ uses SVD to distribute hedging across instruments without inflating position sizes: the pseudoinverse selects the smallest $\|\mathbf{w}\|_2$ among all exact or least-squares solutions.
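The minimum-norm property can be illustrated with a toy case. The exposure matrix, target vector, and null-space direction below are all hypothetical numbers chosen for illustration, and the sketch assumes NumPy.

```python
import numpy as np

# Hypothetical exposures: 2 risk factors (rows) x 3 instruments (columns).
# Instrument 3 duplicates instrument 1, so rank r = 2 < n = 3.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
e = np.array([2.0, 1.0])             # hypothetical target exposure

w = np.linalg.pinv(A) @ e            # minimum-norm hedge

# Adding a null-space vector leaves the exposures A @ w unchanged
# but makes the positions larger.
null_vec = np.array([1.0, 0.0, -1.0])
w_alt = w + 0.5 * null_vec

print(w)                                           # approximately [1, 1, 1]
print(np.allclose(A @ w, e), np.allclose(A @ w_alt, e))
print(np.linalg.norm(w), np.linalg.norm(w_alt))    # first is smaller
```

The pseudoinverse splits the exposure equally between the two identical instruments, which is exactly what minimising $\|\mathbf{w}\|_2$ requires.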

06 · Practice Exercises

EXERCISE 18.1

Find the SVD of $A = \begin{pmatrix}3&0\\0&-2\end{pmatrix}$. List $\sigma_1$, $\sigma_2$, and the columns of $U$ and $V$.

Hint: For diagonal $A$, the singular values are $|a_{ii}|$, and $U = I$, $V = I$ up to sign flips in columns that keep each $\sigma_i \geq 0$.

Solution:

$A = \begin{pmatrix}3&0\\0&-2\end{pmatrix}$. $A^TA = \begin{pmatrix}9&0\\0&4\end{pmatrix}$.

$\sigma_1 = 3$, $\sigma_2 = 2$. $\mathbf{v}_1 = \begin{pmatrix}1\\0\end{pmatrix}$, $\mathbf{v}_2 = \begin{pmatrix}0\\1\end{pmatrix}$.

$\mathbf{u}_1 = A\mathbf{v}_1/\sigma_1 = \begin{pmatrix}3\\0\end{pmatrix}/3 = \begin{pmatrix}1\\0\end{pmatrix}$.

$\mathbf{u}_2 = A\mathbf{v}_2/\sigma_2 = \begin{pmatrix}0\\-2\end{pmatrix}/2 = \begin{pmatrix}0\\-1\end{pmatrix}$.

$A = \begin{pmatrix}1&0\\0&-1\end{pmatrix}\begin{pmatrix}3&0\\0&2\end{pmatrix}\begin{pmatrix}1&0\\0&1\end{pmatrix}^T$ ✓.

EXERCISE 18.2

Compute the singular values of $A = \begin{pmatrix}1&0\\1&1\end{pmatrix}$ by solving the eigenproblem for $A^TA$. State whether $A$ is full rank.

Hint: Compute $A^TA$, find its eigenvalues, and take square roots for $\sigma_i$. Eigenvectors of $A^TA$ give $\mathbf{v}_i$; then $\mathbf{u}_i = A\mathbf{v}_i/\sigma_i$.

Solution:

$A = \begin{pmatrix}1&0\\1&1\end{pmatrix}$. $A^TA = \begin{pmatrix}2&1\\1&1\end{pmatrix}$.

Characteristic: $(2-\lambda)(1-\lambda)-1 = \lambda^2-3\lambda+1 = 0$. $\lambda = \frac{3\pm\sqrt{5}}{2}$.

$\sigma_1 = \sqrt{\frac{3+\sqrt{5}}{2}} \approx 1.618$, $\sigma_2 = \sqrt{\frac{3-\sqrt{5}}{2}} \approx 0.618$.

For $\lambda_1$: eigenvector $\mathbf{v}_1 \propto \begin{pmatrix}1\\(\lambda_1-2)\end{pmatrix}$, normalised to unit length.

$\mathbf{u}_1 = A\mathbf{v}_1/\sigma_1$. (Full numeric values follow from the quadratic formula.)

Rank $= 2$: both singular values are positive.
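A numerical check for this exercise, assuming NumPy. The closed forms used for comparison follow from $\frac{3+\sqrt{5}}{2} = \varphi^2$, where $\varphi = \frac{1+\sqrt{5}}{2}$ is the golden ratio, and from $\sigma_1\sigma_2 = |\det A| = 1$.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0]])

s = np.linalg.svd(A, compute_uv=False)

# Closed forms: sigma_1 = phi and sigma_2 = 1 / phi.
phi = (1.0 + np.sqrt(5.0)) / 2.0
closed_form = np.array([phi, 1.0 / phi])

# Same values via the eigenvalues of A^T A, sorted descending.
lam = np.linalg.eigvalsh(A.T @ A)[::-1]

print(s)                               # approximately [1.618, 0.618]
print(np.allclose(s, closed_form))
print(np.allclose(s, np.sqrt(lam)))
print(np.linalg.matrix_rank(A))        # 2, so A is full rank
```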

EXERCISE 18.3

For $A = \begin{pmatrix}1&2\\2&1\end{pmatrix}$, write the rank-1 approximation $A_1$. Compute $\|A-A_1\|_F$ and the fraction of squared singular values retained.

Hint: Rank-1 approximation: $A_1 = \sigma_1 \mathbf{u}_1\mathbf{v}_1^T$. Frobenius error $= \sigma_2$.

Solution:

From the chapter: $A = \begin{pmatrix}1&2\\2&1\end{pmatrix}$, $\sigma_1=3$, $\sigma_2=1$, $\mathbf{u}_1 = \mathbf{v}_1 = \frac{1}{\sqrt{2}}\begin{pmatrix}1\\1\end{pmatrix}$.

$A_1 = 3 \cdot \frac{1}{2}\begin{pmatrix}1\\1\end{pmatrix}\begin{pmatrix}1&1\end{pmatrix} = \frac{3}{2}\begin{pmatrix}1&1\\1&1\end{pmatrix}$.

$\|A - A_1\|_F = \sigma_2 = 1$.

Fraction retained: $\sigma_1^2/(\sigma_1^2+\sigma_2^2) = 9/10 = 90\%$.

EXERCISE 18.4

For $A = \begin{pmatrix}1&2\\2&4\end{pmatrix}$ (rank 1): find $\sigma_1$, $\sigma_2$, identify $\mathbf{v}_1$ and $\mathbf{v}_2$, and describe the column space and null space.

Hint: $\sigma_1=5$, $\sigma_2=0$. Because $A$ is symmetric, $\mathbf{u}_1 = \mathbf{v}_1$ spans both the column space and the row space; $\mathbf{v}_2$ spans the null space. $A^+$ inverts only $\sigma_1$.

Solution:

$A = \begin{pmatrix}1&2\\2&4\end{pmatrix}$. $A^TA = \begin{pmatrix}5&10\\10&20\end{pmatrix}$, with eigenvalues $25$ and $0$.

$\sigma_1 = 5$, $\sigma_2 = 0$. $\mathbf{v}_1 = \frac{1}{\sqrt{5}}\begin{pmatrix}1\\2\end{pmatrix}$.

$A^+ = \frac{1}{\sigma_1}\mathbf{v}_1\mathbf{u}_1^T = \frac{1}{25}\begin{pmatrix}1&2\\2&4\end{pmatrix}$ (the pseudoinverse of a rank-1 matrix).

Null space: $\mathbf{v}_2 = \frac{1}{\sqrt{5}}\begin{pmatrix}-2\\1\end{pmatrix}$. $A\mathbf{v}_2 = \mathbf{0}$ ✓.

Column space of $A$ = span of $\begin{pmatrix}1\\2\end{pmatrix}$. Null space of $A$ = span of $\begin{pmatrix}-2\\1\end{pmatrix}$. SVD separates both cleanly.

EXERCISE 18.5

Centred data matrix $X = \begin{pmatrix}1&-1\\0&0\\-1&1\end{pmatrix}$ (3 days, 2 assets). Find $\sigma_1$, $\sigma_2$ and the fraction of total variance explained by the first principal component.

Hint: PCA of centred $X$ equals SVD of $X$. Variance of PC $i$ is $\sigma_i^2/T$. Fraction explained by top $k$ PCs: $\sum_{i=1}^k \sigma_i^2 / \sum_{i=1}^p \sigma_i^2$.

Solution:

$X = \begin{pmatrix}1&-1\\0&0\\-1&1\end{pmatrix}$ (3 observations, 2 assets, already centred).

$X^TX = \begin{pmatrix}2&-2\\-2&2\end{pmatrix}$. Eigenvalues: $4$ and $0$. $\sigma_1 = 2$, $\sigma_2 = 0$.

One PC explains $100\%$ of variance: the assets move in perfect opposition (correlation $-1$).

$\mathbf{v}_1 = \frac{1}{\sqrt{2}}\begin{pmatrix}1\\-1\end{pmatrix}$, which is long asset 1 and short asset 2.

Total sum of squares $= \sigma_1^2 + \sigma_2^2 = 4$, so the total variance is $4/T = 4/3$. The factor $1/T$ cancels in the ratio: PC1 fraction $= 4/4 = 100\%$.

EXERCISE 18.6

For $A = \begin{pmatrix}1&2\\2&1\end{pmatrix}$ and $\mathbf{b}=\begin{pmatrix}3\\3\end{pmatrix}$, compute $\hat{\mathbf{x}} = A^+\mathbf{b}$ using the SVD from this chapter. Verify $A\hat{\mathbf{x}}=\mathbf{b}$.

Hint: Use $A = U\Sigma V^T$ from the chapter example. $A^+ = V\Sigma^+ U^T$ with $\Sigma^+ = \text{diag}(1/3, 1/1)$. Compute $\hat{\mathbf{x}} = A^+\mathbf{b}$.

Solution:

$A = \begin{pmatrix}1&2\\2&1\end{pmatrix}$, $\mathbf{b}=\begin{pmatrix}3\\3\end{pmatrix}$.

Exact solution exists: $\mathbf{x} = \begin{pmatrix}1\\1\end{pmatrix}$ since $A\begin{pmatrix}1\\1\end{pmatrix} = \begin{pmatrix}3\\3\end{pmatrix}$.

Via SVD: both singular values are positive, so $\Sigma^+ = \Sigma^{-1}$ and $A^+ = V\Sigma^{-1}U^T$. With $\sigma_1=3$, $\sigma_2=1$:

$A^+\mathbf{b} = V\Sigma^{-1}U^T\mathbf{b}$.

$U^T\mathbf{b} = \frac{1}{\sqrt{2}}\begin{pmatrix}1&1\\-1&1\end{pmatrix}\begin{pmatrix}3\\3\end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix}6\\0\end{pmatrix}$.

$\Sigma^{-1}U^T\mathbf{b} = \begin{pmatrix}6/(3\sqrt{2})\\0\end{pmatrix} = \begin{pmatrix}\sqrt{2}\\0\end{pmatrix}$.

$V\begin{pmatrix}\sqrt{2}\\0\end{pmatrix} = \frac{\sqrt{2}}{\sqrt{2}}\begin{pmatrix}1\\1\end{pmatrix} = \begin{pmatrix}1\\1\end{pmatrix}$ ✓.

The pseudoinverse recovers the exact solution. For inconsistent $\mathbf{b}$, it returns the minimum-norm least squares solution.
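The three multiplication steps of this exercise can be followed in code. The sketch assumes NumPy and applies $V\Sigma^{-1}U^T\mathbf{b}$ from right to left without ever forming $A^+$ as a matrix, which is valid here because both singular values are positive.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])
b = np.array([3.0, 3.0])

U, s, Vt = np.linalg.svd(A)

# Step 1: rotate b into the left singular basis.
c = U.T @ b
# Step 2: divide by the singular values (Sigma^{-1}).
d = c / s
# Step 3: map back through the right singular vectors.
x_hat = Vt.T @ d

print(x_hat)                                        # approximately [1, 1]
print(np.allclose(A @ x_hat, b))                    # verifies A x = b
print(np.allclose(x_hat, np.linalg.solve(A, b)))    # matches a direct solve
```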

07 · Summary

Term	Definition
SVD	$A = U\Sigma V^T$ ; exists for every $m \times n$ matrix
Singular value $\sigma_i$	$\sigma_i = \sqrt{\lambda_i(A^TA)}$ ; $\sigma_1 \geq \cdots \geq 0$
Left singular vector $\mathbf{u}_i$	Column of $U$ ; $\mathbf{u}_i = A\mathbf{v}_i/\sigma_i$
Right singular vector $\mathbf{v}_i$	Unit eigenvector of $A^TA$ ; column of $V$
Operator norm	$\|A\|_2 = \sigma_1$
Rank- $k$ approximation	$A_k = \sum_{i=1}^k \sigma_i \mathbf{u}_i\mathbf{v}_i^T$ ; error $= \sqrt{\sum_{i>k}\sigma_i^2}$
PCA connection	SVD of centred $X$ gives PC directions and variances $\sigma_i^2/T$
Pseudoinverse $A^+$	$V\Sigma^+ U^T$ ; least squares with minimum norm

Next: Markov Chains & Steady States — stochastic matrices, stationary distributions as eigenvectors, and credit-rating transitions in quantitative risk.