Chapter 08

Linear Transformations

00 — Symbol Glossary


01 — Definition of a Linear Transformation

Definition

A function $T:V\to W$ between vector spaces is a linear transformation (or linear map) if for all $\mathbf{u},\mathbf{v}\in V$ and all scalars $c$:

  1. Additivity: $T(\mathbf{u}+\mathbf{v})=T(\mathbf{u})+T(\mathbf{v})$
  2. Homogeneity: $T(c\mathbf{v})=cT(\mathbf{v})$

Equivalently, both conditions together: $T(c\mathbf{u}+d\mathbf{v})=cT(\mathbf{u})+dT(\mathbf{v})$ for all scalars $c,d$.

Note

Every map $T(\mathbf{x})=A\mathbf{x}$ for a matrix $A$ is a linear transformation. The converse also holds in finite dimensions: once bases are chosen, every linear transformation between finite-dimensional spaces has a matrix representation.

Immediate consequences of linearity:

  • $T(\mathbf{0})=\mathbf{0}$ (the zero vector maps to zero), since $T(\mathbf{0})=T(0\cdot\mathbf{v})=0\cdot T(\mathbf{v})=\mathbf{0}$.
  • $T(-\mathbf{v})=-T(\mathbf{v})$.
  • $T(c_1\mathbf{v}_1+\cdots+c_k\mathbf{v}_k)=c_1T(\mathbf{v}_1)+\cdots+c_kT(\mathbf{v}_k)$.
Example
  • Scaling: $T(\mathbf{x})=c\mathbf{x}$ for a fixed scalar $c$.
  • Rotation by $\theta$: $T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}$.
  • Projection onto the $x$-axis: $T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}x\\0\end{pmatrix}$.
  • Differentiation: $T(f)=f'$ is linear on the space of differentiable functions.
Common mistake

Wrong: $T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}x+1\\y\end{pmatrix}$ (translation) is linear because it "looks simple."
Why it happens: translations feel like "scaling by 1", but they violate additivity: $T(\mathbf{u})+T(\mathbf{v})$ applies the shift twice, while $T(\mathbf{u}+\mathbf{v})$ applies it once.
Correct: $T(\mathbf{0})=\begin{pmatrix}1\\0\end{pmatrix}\neq\mathbf{0}$, so $T$ fails the necessary condition $T(\mathbf{0})=\mathbf{0}$ and is not linear. Translations are affine maps, not linear maps.
Check: always verify $T(\mathbf{0})=\mathbf{0}$ as a quick necessary condition.
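
The check above can be tried numerically. The following is a minimal sketch in plain Python, assuming vectors are stored as lists of integers; the helper names (`looks_linear`, `projection`, `translation`) are hypothetical and chosen only for illustration. A `False` result proves the map is not linear, while a `True` result only means no counterexample was found among the sampled inputs.

```python
import random

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def scale(c, v):
    return [c * a for a in v]

def looks_linear(T, dim, trials=100, seed=0):
    """Return False as soon as a sampled input violates linearity."""
    rng = random.Random(seed)
    # Quick necessary condition: the zero vector must map to zero.
    if any(x != 0 for x in T([0] * dim)):
        return False
    for _ in range(trials):
        u = [rng.randint(-9, 9) for _ in range(dim)]
        v = [rng.randint(-9, 9) for _ in range(dim)]
        c = rng.randint(-9, 9)
        if T(add(u, v)) != add(T(u), T(v)):   # additivity
            return False
        if T(scale(c, v)) != scale(c, T(v)):  # homogeneity
            return False
    return True

def projection(v):   # (x, y) -> (x, 0)
    return [v[0], 0]

def translation(v):  # (x, y) -> (x + 1, y)
    return [v[0] + 1, v[1]]

print(looks_linear(projection, 2))   # True
print(looks_linear(translation, 2))  # False
```

Integer inputs keep the comparisons exact; with floating-point inputs the equality tests would need a tolerance.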


02 — Kernel and Image

Definition

Let $T:V\to W$ be a linear transformation.

Kernel (null space): $\ker(T)=\{\mathbf{v}\in V : T(\mathbf{v})=\mathbf{0}_W\}$. This is a subspace of the domain $V$.

Image (range): $\text{Im}(T)=\{T(\mathbf{v}) : \mathbf{v}\in V\}$. This is a subspace of the codomain $W$.

Example

$T:\mathbb{R}^3\to\mathbb{R}^2$ defined by $T(\mathbf{x})=A\mathbf{x}$ with $A=\begin{pmatrix}1&2&3\\4&5&6\end{pmatrix}$.

Kernel: Solve $A\mathbf{x}=\mathbf{0}$. Row reducing $A$ gives $\begin{pmatrix}1&0&-1\\0&1&2\end{pmatrix}$. Free variable $x_3=t$; then $x_1=t$, $x_2=-2t$. Hence $\ker(T)=\text{span}\!\left\{\begin{pmatrix}1\\-2\\1\end{pmatrix}\right\}$.

Image: $\text{Col}(A)=\text{span}\!\left\{\begin{pmatrix}1\\4\end{pmatrix},\begin{pmatrix}2\\5\end{pmatrix}\right\}=\mathbb{R}^2$ (the two pivot columns are linearly independent, so they span $\mathbb{R}^2$).
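
The row reduction in this example can be reproduced mechanically. Below is a minimal sketch using exact fractions from the Python standard library; the function names `rref` and `matvec` are hypothetical helpers, not part of any library. It returns the reduced row echelon form together with the pivot columns, and then confirms that the kernel vector found above maps to zero.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row echelon form, pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0  # index of the next pivot row
    for c in range(len(m[0])):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]  # scale the pivot to 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

A = [[1, 2, 3], [4, 5, 6]]
R, pivots = rref(A)
print([[int(x) for x in row] for row in R])  # [[1, 0, -1], [0, 1, 2]]
print(pivots)                                # [0, 1]
print(matvec(A, [1, -2, 1]))                 # [0, 0]
```

The pivot indices `[0, 1]` identify columns 1 and 2 of $A$ as a basis of the image, and the single non-pivot column corresponds to the one free variable in the kernel.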


03 — Rank–Nullity Theorem

Definition

Let $T:V\to W$ be a linear transformation with $\dim(V)=n$ (finite). Then:

$$\dim(\ker(T))+\dim(\text{Im}(T))=n$$

Equivalently:

$$\text{nullity}(T)+\text{rank}(T)=n$$

Example

For $T:\mathbb{R}^5\to\mathbb{R}^3$ with $\text{rank}(T)=3$: $\text{nullity}(T)=5-3=2$. The kernel is 2-dimensional: there are 2 independent "input directions" that collapse to zero.

Note

The rank-nullity theorem is a conservation law for dimensions. It places hard limits on what a transformation can do: a map from $\mathbb{R}^5$ to $\mathbb{R}^3$ has rank at most 3, so it must collapse at least a 2-dimensional subspace.

Common mistake

Wrong: for $T:\mathbb{R}^5\to\mathbb{R}^3$ with $\text{rank}(T)=2$, the image is $\mathbb{R}^3$.
Why it happens: the codomain is $\mathbb{R}^3$, so the image "should be" $\mathbb{R}^3$.
Correct: the image is a 2-dimensional subspace of $\mathbb{R}^3$ (a plane through the origin), not all of $\mathbb{R}^3$.
Check: $\dim(\text{Im}(T))=\text{rank}(T)$, not $\dim(W)$. Surjectivity ($\text{Im}(T)=W$) is a separate condition.
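
To make the situation in this common mistake concrete, here is a small sketch with a hypothetical $3\times5$ matrix (not taken from the text) whose third row is the sum of the first two, so its rank is 2. The `rank` helper is an illustrative Gaussian elimination over exact fractions; the nullity is then obtained from the rank-nullity theorem rather than computed independently.

```python
from fractions import Fraction

def rank(rows):
    """Rank via Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

# Hypothetical map from R^5 to R^3; row 3 = row 1 + row 2.
A = [[1, 0, 2, 1, 0],
     [0, 1, 1, 0, 3],
     [1, 1, 3, 1, 3]]

n = len(A[0])        # dimension of the domain
rk = rank(A)         # dimension of the image
nullity = n - rk     # from rank-nullity
print(rk, nullity)   # 2 3
print(rk == len(A))  # False: the image is not all of R^3
```

Here the image is a plane inside $\mathbb{R}^3$ and the kernel is 3-dimensional, consistent with $2+3=5$.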


04 — Matrix Representation of a Linear Transformation

Every linear $T:\mathbb{R}^n\to\mathbb{R}^m$ has a unique standard matrix $A\in\mathbb{R}^{m\times n}$ such that $T(\mathbf{x})=A\mathbf{x}$.

How to find $A$: the $j$-th column of $A$ is $T(\mathbf{e}_j)$, where $\mathbf{e}_j$ is the $j$-th standard basis vector.

$$A = \begin{pmatrix} T(\mathbf{e}_1) & T(\mathbf{e}_2) & \cdots & T(\mathbf{e}_n) \end{pmatrix}$$

Find the matrix for $T:\mathbb{R}^2\to\mathbb{R}^2$: rotation by $90°$ counter-clockwise

Rotating $\begin{pmatrix}1\\0\end{pmatrix}$ by $90°$ counter-clockwise: the point $(1,0)$ maps to $(0,1)$.
$T(\mathbf{e}_1)=\begin{pmatrix}0\\1\end{pmatrix}$, which becomes column 1 of $A$.

Rotating $\begin{pmatrix}0\\1\end{pmatrix}$ by $90°$ counter-clockwise: the point $(0,1)$ maps to $(-1,0)$.
$T(\mathbf{e}_2)=\begin{pmatrix}-1\\0\end{pmatrix}$, which becomes column 2 of $A$.

$A=\begin{pmatrix}T(\mathbf{e}_1)&T(\mathbf{e}_2)\end{pmatrix}=\begin{pmatrix}0&-1\\1&0\end{pmatrix}$. This agrees with the general formula for rotation by $\theta$, $\begin{pmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{pmatrix}$, at $\theta=90°$, where $\cos90°=0$ and $\sin90°=1$.

Check: $A\begin{pmatrix}1\\0\end{pmatrix}=\begin{pmatrix}0\\1\end{pmatrix}\,\checkmark$; $A\begin{pmatrix}0\\1\end{pmatrix}=\begin{pmatrix}-1\\0\end{pmatrix}\,\checkmark$.
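
The column-by-column recipe translates directly into code. The sketch below assumes the map is given as a Python function on lists; `standard_matrix` and `rotate90` are hypothetical names used only for this illustration.

```python
def standard_matrix(T, n):
    """Build the matrix whose j-th column is T(e_j); returned as a list of rows."""
    columns = []
    for j in range(n):
        e_j = [1 if i == j else 0 for i in range(n)]  # j-th standard basis vector
        columns.append(T(e_j))
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]

def rotate90(v):
    """Rotation by 90 degrees counter-clockwise: (x, y) -> (-y, x)."""
    x, y = v
    return [-y, x]

print(standard_matrix(rotate90, 2))  # [[0, -1], [1, 0]]
```

The recipe only produces a meaningful matrix when the function passed in is actually linear; for a non-linear function the resulting matrix will not reproduce the map.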


05 — Injective, Surjective, and Bijective Transformations

Definition

Let $T:V\to W$ be a linear transformation.

  • Injective (one-to-one): $T(\mathbf{u})=T(\mathbf{v})\Rightarrow\mathbf{u}=\mathbf{v}$. Equivalently, $\ker(T)=\{\mathbf{0}\}$.
  • Surjective (onto): $\text{Im}(T)=W$.
  • Bijective: both injective and surjective. For $T:\mathbb{R}^n\to\mathbb{R}^n$ with standard matrix $A$: bijective $\iff$ $A$ is invertible $\iff$ $\det(A)\neq0$.

| $T:\mathbb{R}^n\to\mathbb{R}^m$ | Injective condition | Surjective condition |
|---|---|---|
| $n>m$ | Impossible (nullity $\geq n-m>0$) | Possible |
| $n<m$ | Possible | Impossible ($\text{rank}\leq n<m$) |
| $n=m$ | $\det(A)\neq0$ | $\det(A)\neq0$ (same condition) |
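
For a map given by an $m\times n$ matrix, both conditions reduce to a rank test: injective when the rank equals $n$ (nullity zero), surjective when the rank equals $m$. The following sketch applies this to two matrices used earlier in the chapter; `rank` and `classify` are hypothetical helper names, and the elimination uses exact fractions.

```python
from fractions import Fraction

def rank(rows):
    """Rank via Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def classify(A):
    """Injective iff rank = number of columns; surjective iff rank = number of rows."""
    rows, cols = len(A), len(A[0])
    r = rank(A)
    return {"injective": r == cols, "surjective": r == rows}

# R^3 -> R^2 (n > m): can be surjective, never injective.
print(classify([[1, 2, 3], [4, 5, 6]]))
# {'injective': False, 'surjective': True}

# R^2 -> R^2, rotation by 90 degrees: bijective.
print(classify([[0, -1], [1, 0]]))
# {'injective': True, 'surjective': True}
```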

06 — Composition and Invertibility

Definition

If $T:U\to V$ has matrix $A$ and $S:V\to W$ has matrix $B$, then the composition $S\circ T:U\to W$ has matrix $BA$:

$$(S\circ T)(\mathbf{x})=S(T(\mathbf{x}))=B(A\mathbf{x})=(BA)\mathbf{x}$$

Example

Rotate by $90°$: $A=\begin{pmatrix}0&-1\\1&0\end{pmatrix}$. Reflect across the $x$-axis: $B=\begin{pmatrix}1&0\\0&-1\end{pmatrix}$.

Composition (rotate, then reflect): $BA=\begin{pmatrix}1&0\\0&-1\end{pmatrix}\begin{pmatrix}0&-1\\1&0\end{pmatrix}=\begin{pmatrix}0&-1\\-1&0\end{pmatrix}$. In the other order (reflect, then rotate), $AB=\begin{pmatrix}0&1\\1&0\end{pmatrix}$, so $AB\neq BA$: the order of composition matters.
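
The two products in this example can be checked with a few lines of plain Python. This is a minimal sketch with matrices stored as lists of rows; `matmul` and `matvec` are hypothetical helpers, and the test vector `v` is arbitrary.

```python
def matmul(X, Y):
    """Product XY of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

A = [[0, -1], [1, 0]]  # rotate by 90 degrees counter-clockwise
B = [[1, 0], [0, -1]]  # reflect across the x-axis

BA = matmul(B, A)      # rotate first, then reflect
AB = matmul(A, B)      # reflect first, then rotate
print(BA)              # [[0, -1], [-1, 0]]
print(AB)              # [[0, 1], [1, 0]]

# Applying the maps one after the other agrees with the single matrix BA.
v = [2, 3]
print(matvec(BA, v) == matvec(B, matvec(A, v)))  # True
```

Note that the matrix of the map applied first sits on the right of the product.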


07 — Quant Application — Factor Models as Linear Maps

The systematic part of a linear factor model for asset returns is a linear transformation:

$$\mathbf{r}=B\mathbf{f}+\boldsymbol{\epsilon}$$

where $\mathbf{r}\in\mathbb{R}^n$ is the vector of $n$ asset returns, $\mathbf{f}\in\mathbb{R}^k$ is the vector of $k$ factor returns ($k\ll n$), $B\in\mathbb{R}^{n\times k}$ is the loading matrix, and $\boldsymbol{\epsilon}\in\mathbb{R}^n$ is the residual (idiosyncratic) return.

The transformation $T:\mathbb{R}^k\to\mathbb{R}^n$ defined by $T(\mathbf{f})=B\mathbf{f}$ is linear. Its image $\text{Im}(T)=\text{Col}(B)$ is the factor subspace: the slice of return space explained by the factors, which is $k$-dimensional when $B$ has full column rank.

The kernel $\ker(T^\top)=\ker(B^\top)$ consists of the directions in return space orthogonal to every column of $B$, and hence to the whole factor subspace: pure idiosyncratic risk.

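Rank-nullity in practice: if $B$ has rank $k$ (full column rank), then the nullity is $k-k=0$ and the factor map is injective: each factor profile $\mathbf{f}$ maps to a distinct return vector. If two columns of $B$ are exactly collinear, $B$ is rank deficient and two "different" factors have the same effect. If they are only nearly collinear, $B$ is close to rank deficient and the two effects are hard to tell apart. Either case is a sign of a misspecified model.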

In statistical factor models of the APT type, PCA (eigendecomposition of the return covariance matrix $\Sigma$) chooses $B$ so that $\text{Col}(B)$ captures the maximum variance in $\mathbf{r}$ with only $k$ factors.
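
A toy illustration of the factor subspace and its orthogonal complement follows. Everything here is an assumption made for the example: the loading matrix `B` (3 assets, 2 factors), the vector `w`, and the factor profile `f` are hypothetical numbers, and the helpers `rank`, `matvec` and `transpose` are illustrative, not a production risk model.

```python
from fractions import Fraction

def rank(rows):
    """Rank via Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

# Hypothetical loading matrix: 3 assets (rows), 2 factors (columns).
B = [[1, 0],
     [1, 1],
     [1, 2]]

k = len(B[0])
full_column_rank = rank(B) == k
print(full_column_rank)                  # True: f -> Bf is injective

# w is orthogonal to both columns of B, so it lies in ker(B^T).
w = [1, -2, 1]
print(matvec(transpose(B), w))           # [0, 0]

# Any factor-driven return Bf is therefore orthogonal to w.
f = [2, 1]
r = matvec(B, f)
print(r)                                 # [2, 3, 4]
print(sum(a * b for a, b in zip(w, r)))  # 0
```

In this toy case the factor subspace is a plane in $\mathbb{R}^3$ and $\ker(B^\top)$ is the line spanned by `w`, so the dimensions add up to $2+1=3$.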


Exercises

EXERCISE 8.1

Determine which of the following maps $T:\mathbb{R}^2\to\mathbb{R}^2$ are linear:
(a) $T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}2x-y\\x+3y\end{pmatrix}$, (b) $T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}x^2\\y\end{pmatrix}$, (c) $T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}x+2\\y-1\end{pmatrix}$.

Hint

Check the two linearity conditions: $T(\mathbf{u}+\mathbf{v})=T(\mathbf{u})+T(\mathbf{v})$ and $T(c\mathbf{v})=cT(\mathbf{v})$. If either fails for even one example, the map is not linear. Also check $T(\mathbf{0})=\mathbf{0}$ as a quick filter.

Solution

(a) $T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}2x-y\\x+3y\end{pmatrix}$. This equals $\begin{pmatrix}2&-1\\1&3\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}$, a matrix multiplication. Linear.

(b) $T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}x^2\\y\end{pmatrix}$. Check homogeneity: $T\begin{pmatrix}cx\\cy\end{pmatrix}=\begin{pmatrix}c^2x^2\\cy\end{pmatrix}\neq c\begin{pmatrix}x^2\\y\end{pmatrix}$ whenever $c\neq0,1$ and $x\neq0$. Not linear.

(c) $T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}x+2\\y-1\end{pmatrix}$. $T(\mathbf{0})=\begin{pmatrix}2\\-1\end{pmatrix}\neq\mathbf{0}$. Not linear (translation).

EXERCISE 8.2

Find the standard matrix for $T:\mathbb{R}^3\to\mathbb{R}^2$ defined by $T\begin{pmatrix}x\\y\\z\end{pmatrix}=\begin{pmatrix}x+2y\\3z-x\end{pmatrix}$, then compute $T\begin{pmatrix}2\\-1\\3\end{pmatrix}$.

Hint

Evaluate $T$ on each standard basis vector $\mathbf{e}_1,\mathbf{e}_2,\mathbf{e}_3$ and place the results as columns of the matrix. Then use the matrix to compute $T(\mathbf{v})$.

Solution

$T\begin{pmatrix}x\\y\\z\end{pmatrix}=\begin{pmatrix}x+2y\\3z-x\end{pmatrix}$.

$T(\mathbf{e}_1)=T\begin{pmatrix}1\\0\\0\end{pmatrix}=\begin{pmatrix}1\\-1\end{pmatrix}$; $T(\mathbf{e}_2)=T\begin{pmatrix}0\\1\\0\end{pmatrix}=\begin{pmatrix}2\\0\end{pmatrix}$; $T(\mathbf{e}_3)=T\begin{pmatrix}0\\0\\1\end{pmatrix}=\begin{pmatrix}0\\3\end{pmatrix}$.

Standard matrix: $A=\begin{pmatrix}1&2&0\\-1&0&3\end{pmatrix}$.

$T\begin{pmatrix}2\\-1\\3\end{pmatrix}=A\begin{pmatrix}2\\-1\\3\end{pmatrix}=\begin{pmatrix}2-2+0\\-2+0+9\end{pmatrix}=\begin{pmatrix}0\\7\end{pmatrix}$.

EXERCISE 8.3

For $T:\mathbb{R}^3\to\mathbb{R}^2$ with matrix $A=\begin{pmatrix}1&-1&2\\2&-2&4\end{pmatrix}$, find $\ker(T)$ and $\text{Im}(T)$. Verify the rank-nullity theorem.

Hint

For $T(\mathbf{x})=A\mathbf{x}$: the kernel is $\text{Nul}(A)$ (row reduce $[A|\mathbf{0}]$); the image is $\text{Col}(A)$ (spanned by the columns of $A$ corresponding to pivots). Use rank-nullity to verify.

Solution

$A=\begin{pmatrix}1&-1&2\\2&-2&4\end{pmatrix}$.

Row reduce: $R_2\leftarrow R_2-2R_1$: $\begin{pmatrix}1&-1&2\\0&0&0\end{pmatrix}$. One pivot (rank $=1$).

Kernel: Free variables $x_2=s$, $x_3=t$; $x_1=x_2-2x_3=s-2t$. $\ker(T)=\text{span}\!\left\{\begin{pmatrix}1\\1\\0\end{pmatrix},\begin{pmatrix}-2\\0\\1\end{pmatrix}\right\}$. Nullity $=2$.

Image: $\text{Col}(A)=\text{span}\!\left\{\begin{pmatrix}1\\2\end{pmatrix}\right\}$. Rank $=1$.

Rank-nullity check: $1+2=3=\dim(\mathbb{R}^3)\,\checkmark$.

EXERCISE 8.4

Let $T:\mathbb{R}^2\to\mathbb{R}^3$ have matrix $A=\begin{pmatrix}1&0\\0&1\\1&1\end{pmatrix}$. Is $T$ injective? Surjective? Justify.

Hint

$T$ is injective iff $\ker(T)=\{\mathbf{0}\}$; surjective iff $\text{rank}(A)=m$ (number of rows). Use the rank-nullity theorem and the dimensions to decide. A map from $\mathbb{R}^n$ to $\mathbb{R}^m$ with $n < m$ cannot be surjective.

Solution

$T:\mathbb{R}^2\to\mathbb{R}^3$ with $A=\begin{pmatrix}1&0\\0&1\\1&1\end{pmatrix}$.

Rank: column 1 and column 2 are linearly independent (neither is a multiple of the other). Rank $=2$.

Injective: nullity $=2-2=0$, so $\ker(T)=\{\mathbf{0}\}$. Yes, injective.

Surjective: $\text{rank}=2 < 3=\dim(\mathbb{R}^3)$. The image is a 2-D plane in $\mathbb{R}^3$, not all of $\mathbb{R}^3$. Not surjective.

Geometric interpretation: $T$ embeds $\mathbb{R}^2$ as the plane $\{(x,y,x+y): x,y\in\mathbb{R}\}$ inside $\mathbb{R}^3$.

EXERCISE 8.5

Let $T:\mathbb{R}^2\to\mathbb{R}^2$ have matrix $\begin{pmatrix}1&1\\0&1\end{pmatrix}$ and $S:\mathbb{R}^2\to\mathbb{R}^2$ have matrix $\begin{pmatrix}2&0\\0&2\end{pmatrix}$. Find the matrix of $S\circ T$ and the kernel of the composition.

Hint

The composition $S\circ T$ has matrix $BA$ (apply $T$ first with matrix $A$, then $S$ with matrix $B$). Compute the product, then find its kernel by row reduction.

Solution

$T:\mathbb{R}^2\to\mathbb{R}^2$: $A=\begin{pmatrix}1&1\\0&1\end{pmatrix}$ (shear). $S:\mathbb{R}^2\to\mathbb{R}^2$: $B=\begin{pmatrix}2&0\\0&2\end{pmatrix}$ (scaling by 2).

Matrix of $S\circ T$: $BA=\begin{pmatrix}2&0\\0&2\end{pmatrix}\begin{pmatrix}1&1\\0&1\end{pmatrix}=\begin{pmatrix}2&2\\0&2\end{pmatrix}$.

$\det(BA)=4\neq0$, so $BA$ is invertible and $\ker(S\circ T)=\{\mathbf{0}\}$. The composition is injective (and bijective, since the matrix is square).

EXERCISE 8.6

A 3-factor model for 4 assets has loading matrix $B=\begin{pmatrix}1&2&3\\0&1&2\\1&0&1\\2&1&1\end{pmatrix}$. Find the rank of $B$, the dimension of the factor subspace, and the dimension of the idiosyncratic (factor-orthogonal) subspace. Interpret the result.

Hint

In a factor model $\mathbf{r}=B\mathbf{f}$, the image of $B$ is the factor subspace. Rank of $B$ = number of independent factors. Use the rank-nullity theorem to find the idiosyncratic dimension. Then interpret: if two columns of $B$ are collinear, the model is misspecified.

Solution

$B=\begin{pmatrix}1&2&3\\0&1&2\\1&0&1\\2&1&1\end{pmatrix}$ ($4$ assets, $3$ factors).

Row reduce $B$: $R_3\leftarrow R_3-R_1$, $R_4\leftarrow R_4-2R_1$:

$\begin{pmatrix}1&2&3\\0&1&2\\0&-2&-2\\0&-3&-5\end{pmatrix}$. $R_3\leftarrow R_3+2R_2$, $R_4\leftarrow R_4+3R_2$: $\begin{pmatrix}1&2&3\\0&1&2\\0&0&2\\0&0&1\end{pmatrix}$. $R_4\leftarrow R_4-\tfrac{1}{2}R_3$: $\begin{pmatrix}1&2&3\\0&1&2\\0&0&2\\0&0&0\end{pmatrix}$.

Rank $=3$. $B$ has full column rank, so the 3 factors are independent.

The factor subspace $\text{Col}(B)$ is a 3-D subspace of $\mathbb{R}^4$. The nullity of $B^\top$ (idiosyncratic dimension) is $4-3=1$: one direction in asset return space is orthogonal to all factor loadings, which is pure idiosyncratic risk.

If the rank were 2 (e.g. two loading columns collinear), the model would be over-parameterised: two "different" factors would explain the same variance, and the loadings would not be uniquely determined. Near-collinearity is the practical version of this: $B$ is close to rank deficient and the loading estimates become unstable.


Chapter Summary

| Concept | Formula / Rule |
|---|---|
| Linearity | $T(c\mathbf{u}+d\mathbf{v})=cT(\mathbf{u})+dT(\mathbf{v})$ |
| Always true | $T(\mathbf{0})=\mathbf{0}$ |
| Standard matrix | $A=\begin{pmatrix}T(\mathbf{e}_1)&\cdots&T(\mathbf{e}_n)\end{pmatrix}$ |
| Kernel | $\ker(T)=\text{Nul}(A)$; subspace of domain |
| Image | $\text{Im}(T)=\text{Col}(A)$; subspace of codomain |
| Rank-nullity | $\text{rank}(T)+\text{nullity}(T)=\dim(V)$ |
| Injective | $\ker(T)=\{\mathbf{0}\}$ |
| Surjective | $\text{Im}(T)=W$ |
| Bijective | Injective + surjective; $A$ invertible when square |
| Composition | $(S\circ T)(\mathbf{x})=(BA)\mathbf{x}$ |

Up next: Chapter 09 (Change of Basis), where we see how the matrix representation $[T]_\mathcal{B}^\mathcal{C}$ changes when we switch coordinate systems, and how diagonalisation is a special case of this.