A function T:V→W between vector spaces is a linear transformation (or linear map) if for all u,v∈V and all scalars c:
Additivity: T(u+v)=T(u)+T(v)
Homogeneity: T(cv)=cT(v)
Equivalently, both conditions together: T(cu+dv)=cT(u)+dT(v) for all scalars c,d.
Note
Every map of the form T(x)=Ax for a matrix A is a linear transformation. The converse also holds in finite dimensions: once bases are chosen for the domain and codomain, every linear transformation between finite-dimensional spaces is represented by a matrix.
Immediate consequences of linearity:
T(0)=0 (the zero vector maps to zero).
T(−v)=−T(v).
$T(c_1v_1+\cdots+c_kv_k)=c_1T(v_1)+\cdots+c_kT(v_k)$.
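Each of these consequences follows in one line from the two defining conditions. The short derivation below is a sketch of that reasoning; the last line is written for two terms, and the general case with $k$ terms follows by induction.

```latex
\begin{align*}
T(0) &= T(0\cdot v) = 0\cdot T(v) = 0, \\
T(-v) &= T((-1)v) = (-1)\,T(v) = -T(v), \\
T(c_1v_1+c_2v_2) &= T(c_1v_1)+T(c_2v_2) = c_1T(v_1)+c_2T(v_2).
\end{align*}
```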
Example
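Scaling: $T(x)=cx$ for a fixed scalar $c$.
Rotation by $\theta$: $T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}$.
Projection onto the $x$-axis: $T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}x\\0\end{pmatrix}$.
Differentiation: $T(f)=f'$ is linear on the space of differentiable functions.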
Common mistake
Wrong: $T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}x+1\\y\end{pmatrix}$ (translation) is linear because it "looks simple." Why it happens: translations feel like "scaling by 1," but they violate additivity. Correct: $T(0)=\begin{pmatrix}1\\0\end{pmatrix}\neq 0$, so $T$ fails $T(0)=0$ and is not linear. Translations are affine maps, not linear maps. Check: always verify $T(0)=0$ as a quick necessary condition.
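The quick check described above can be automated on a handful of sample vectors. The following is a minimal sketch, not a proof technique: the helper name `passes_linearity_checks` and the sample values are illustrative assumptions, and passing on finitely many samples is only evidence of linearity, whereas a single failure disproves it.

```python
def translate(v):
    """T(x, y) = (x + 1, y): the translation from the text."""
    x, y = v
    return (x + 1, y)


def project(v):
    """T(x, y) = (x, 0): projection onto the x-axis."""
    x, y = v
    return (x, 0)


def add(u, v):
    """Componentwise sum of two vectors."""
    return tuple(a + b for a, b in zip(u, v))


def scale(c, v):
    """Scalar multiple c * v."""
    return tuple(c * a for a in v)


def passes_linearity_checks(T, samples, scalars):
    """Return False as soon as one sample violates a linearity condition."""
    zero = (0, 0)
    if T(zero) != zero:  # quick necessary condition T(0) = 0
        return False
    for u in samples:
        for v in samples:
            if T(add(u, v)) != add(T(u), T(v)):  # additivity
                return False
        for c in scalars:
            if T(scale(c, u)) != scale(c, T(u)):  # homogeneity
                return False
    return True


samples = [(1, 0), (0, 1), (2, -3)]
scalars = [-1, 0, 2]
print(passes_linearity_checks(translate, samples, scalars))  # False
print(passes_linearity_checks(project, samples, scalars))    # True
```

Integer samples are used so that the equality comparisons are exact; with floating-point inputs a tolerance would be needed.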
02 — Kernel and Image
Definition
Let T:V→W be a linear transformation.
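Kernel (null space): $\ker(T)=\{v\in V : T(v)=0_W\}$
This is a subspace of the domain $V$.
Image (range): $\operatorname{Im}(T)=\{T(v) : v\in V\}$
This is a subspace of the codomain $W$.
Image: $\operatorname{Col}(A)=\operatorname{span}\left\{\begin{pmatrix}1\\4\end{pmatrix},\begin{pmatrix}2\\5\end{pmatrix}\right\}=\mathbb{R}^2$ (since the two pivot columns span $\mathbb{R}^2$).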
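For a matrix map, both subspaces can be read off from the reduced row echelon form: pivot columns give a basis of the image, and each free column gives one kernel basis vector. The sketch below assumes a hypothetical matrix `A = [[1, 2, 3], [4, 5, 6]]`, chosen only because its first two columns match the pivot columns quoted above; the function names `rref` and `kernel_basis` are likewise illustrative.

```python
from fractions import Fraction


def rref(matrix):
    """Reduced row echelon form with exact arithmetic; returns (rows, pivot columns)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    pivots = []
    r = 0  # index of the next pivot row
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column: it is a free column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(n_rows):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == n_rows:
            break
    return rows, pivots


def kernel_basis(matrix):
    """Basis of ker(A): one vector per free column of the RREF."""
    rows, pivots = rref(matrix)
    n_cols = len(matrix[0])
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * n_cols
        v[f] = Fraction(1)
        for r, p in enumerate(pivots):
            v[p] = -rows[r][f]
        basis.append(v)
    return basis, pivots


A = [[1, 2, 3],
     [4, 5, 6]]  # hypothetical example matrix (an assumption, see above)
basis, pivots = kernel_basis(A)
print(pivots)                              # pivot columns span the image
print([[int(x) for x in v] for v in basis])  # kernel basis vectors
```

Here the pivot columns are the first two, so the image is all of $\mathbb{R}^2$, and the kernel is the line spanned by $(1,-2,1)$.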
03 — Rank–Nullity Theorem
Definition
Let T:V→W be a linear transformation with dim(V)=n (finite). Then:
dim(ker(T))+dim(Im(T))=n
nullity(T)+rank(T)=n
Example
For $T:\mathbb{R}^5\to\mathbb{R}^3$ with $\operatorname{rank}(T)=3$:
nullity(T)=5−3=2
The kernel is 2-dimensional — there are 2 independent "input directions" that collapse to zero.
Note
The rank-nullity theorem is a conservation law for dimensions. It places hard limits on what a transformation can do: a map from $\mathbb{R}^5$ to $\mathbb{R}^3$ must collapse at least a 2-dimensional subspace.
Common mistake
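Wrong: for $T:\mathbb{R}^5\to\mathbb{R}^3$ with $\operatorname{rank}(T)=2$, the image is $\mathbb{R}^3$. Why it happens: the codomain is $\mathbb{R}^3$, so the image "should be" $\mathbb{R}^3$. Correct: the image is a 2-dimensional subspace of $\mathbb{R}^3$ (a plane through the origin), not all of $\mathbb{R}^3$. Check: $\dim(\operatorname{Im}(T))=\operatorname{rank}(T)$, not $\dim(W)$. Surjectivity ($\operatorname{Im}(T)=W$) is a separate condition.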
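The dimension bookkeeping in this section can be tried on concrete matrices. The sketch below computes the rank by Gaussian elimination and then obtains the nullity from the theorem as $n-\operatorname{rank}$; the two $3\times 5$ matrices are hypothetical examples, one of rank 3 and one of rank 2, and the function names are illustrative.

```python
from fractions import Fraction


def rank(matrix):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # index of the next pivot row
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r


def rank_and_nullity(matrix):
    """Return (rank, nullity), with nullity = n - rank by rank-nullity."""
    n = len(matrix[0])  # dimension of the domain
    r = rank(matrix)
    return r, n - r


# Hypothetical maps R^5 -> R^3.
A_full = [[1, 0, 0, 1, 2],
          [0, 1, 0, 3, 4],
          [0, 0, 1, 5, 6]]   # rank 3: the image is all of R^3
A_rank2 = [[1, 0, 0, 0, 0],
           [0, 1, 0, 0, 0],
           [1, 1, 0, 0, 0]]  # row 3 = row 1 + row 2: the image is a plane

print(rank_and_nullity(A_full))   # (3, 2)
print(rank_and_nullity(A_rank2))  # (2, 3)
```

In the second case the codomain is still $\mathbb{R}^3$, but the image is only 2-dimensional, which is exactly the situation described in the common mistake.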
04 — Matrix Representation of a Linear Transformation
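Every linear $T:\mathbb{R}^n\to\mathbb{R}^m$ has a unique standard matrix $A\in\mathbb{R}^{m\times n}$ such that $T(x)=Ax$.
How to find $A$: the $j$-th column of $A$ is $T(e_j)$, where $e_j$ is the $j$-th standard basis vector.
$A=\begin{pmatrix}T(e_1)&T(e_2)&\cdots&T(e_n)\end{pmatrix}$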
Find the matrix for $T:\mathbb{R}^2\to\mathbb{R}^2$: rotation by $90°$ counter-clockwise
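Rotating $\begin{pmatrix}1\\0\end{pmatrix}$ by $90^\circ$ counter-clockwise: the point $(1,0)$ maps to $(0,1)$, so $T(e_1)=\begin{pmatrix}0\\1\end{pmatrix}$. This becomes column 1 of $A$.
Rotating $\begin{pmatrix}0\\1\end{pmatrix}$ by $90^\circ$ counter-clockwise: the point $(0,1)$ maps to $(-1,0)$, so $T(e_2)=\begin{pmatrix}-1\\0\end{pmatrix}$. This becomes column 2 of $A$.
$A=\begin{pmatrix}T(e_1)&T(e_2)\end{pmatrix}=\begin{pmatrix}0&-1\\1&0\end{pmatrix}$
This agrees with the general formula for rotation by $\theta$, $\begin{pmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{pmatrix}$, evaluated at $\theta=90^\circ$, where $\cos 90^\circ=0$ and $\sin 90^\circ=1$.
Check: $A\begin{pmatrix}1\\0\end{pmatrix}=\begin{pmatrix}0\\1\end{pmatrix}$ ✓; $A\begin{pmatrix}0\\1\end{pmatrix}=\begin{pmatrix}-1\\0\end{pmatrix}$ ✓.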
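The column-by-column recipe translates directly into code. The sketch below is one possible way to do it, assuming the map is given as a Python function on tuples; `standard_matrix` and `rotate_90_ccw` are illustrative names, not part of any library.

```python
def standard_matrix(T, n):
    """Build the standard matrix of T: R^n -> R^m column by column.

    Column j is T(e_j); the result is returned as a list of rows.
    """
    columns = []
    for j in range(n):
        e_j = [1 if i == j else 0 for i in range(n)]  # j-th standard basis vector
        columns.append(list(T(e_j)))
    m = len(columns[0])
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def rotate_90_ccw(v):
    """Rotation by 90 degrees counter-clockwise: (x, y) -> (-y, x)."""
    x, y = v
    return (-y, x)


A = standard_matrix(rotate_90_ccw, 2)
print(A)  # [[0, -1], [1, 0]]
```

The same helper works for any linear map given as a function, for example the projection `lambda v: (v[0], 0)`.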
05 — Injective, Surjective, and Bijective Transformations
Bijective: both injective and surjective. For $T:\mathbb{R}^n\to\mathbb{R}^n$, bijective $\iff$ $A$ is invertible $\iff$ $\det(A)\neq 0$.
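| $T:\mathbb{R}^n\to\mathbb{R}^m$ | Injective condition | Surjective condition |
| --- | --- | --- |
| $n>m$ | Impossible (nullity $\geq n-m>0$) | Possible |
| $n<m$ | Possible | Impossible (rank $\leq n<m$) |
| $n=m$ | $\det(A)\neq 0$ | $\det(A)\neq 0$ (same condition) |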
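All three rows of the table reduce to a comparison of the rank with the matrix dimensions: $T$ is injective exactly when $\operatorname{rank}(A)=n$ and surjective exactly when $\operatorname{rank}(A)=m$. The following sketch applies that test; the function names and the three example matrices are assumptions made for illustration.

```python
from fractions import Fraction


def rank(matrix):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r


def classify(matrix):
    """Injectivity and surjectivity of x -> Ax for an m x n matrix A."""
    m, n = len(matrix), len(matrix[0])
    r = rank(matrix)
    return {"injective": r == n, "surjective": r == m}


tall = [[1, 0], [0, 1], [1, 1]]   # n = 2 < m = 3
wide = [[1, 0, 2], [0, 1, 3]]     # n = 3 > m = 2
singular = [[1, 2], [2, 4]]       # n = m = 2, determinant 0

print(classify(tall))      # injective, not surjective
print(classify(wide))      # surjective, not injective
print(classify(singular))  # neither
```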
06 — Composition and Invertibility
Definition
If T:U→V has matrix A and S:V→W has matrix B, then the composition S∘T:U→W has matrix BA:
(S∘T)(x)=S(T(x))=B(Ax)=(BA)x
Example
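Rotate by $90^\circ$: $A=\begin{pmatrix}0&-1\\1&0\end{pmatrix}$. Reflect across the $x$-axis: $B=\begin{pmatrix}1&0\\0&-1\end{pmatrix}$.
Composition (rotate, then reflect):
$BA=\begin{pmatrix}1&0\\0&-1\end{pmatrix}\begin{pmatrix}0&-1\\1&0\end{pmatrix}=\begin{pmatrix}0&-1\\-1&0\end{pmatrix}$
Note that $AB=\begin{pmatrix}0&1\\1&0\end{pmatrix}\neq BA$: the order of composition matters.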
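The rotate-then-reflect example can be checked numerically. This is a small sketch with hand-written helpers (`matmul` and `apply` are illustrative names) that computes both products and confirms that applying $BA$ to a vector matches applying $A$ first and then $B$.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def apply(M, v):
    """Matrix-vector product Mv."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]


A = [[0, -1], [1, 0]]   # rotate by 90 degrees counter-clockwise
B = [[1, 0], [0, -1]]   # reflect across the x-axis

BA = matmul(B, A)       # rotate first, then reflect
AB = matmul(A, B)       # reflect first, then rotate
print(BA)  # [[0, -1], [-1, 0]]
print(AB)  # [[0, 1], [1, 0]]

v = [2, 1]              # arbitrary test vector
print(apply(BA, v) == apply(B, apply(A, v)))  # True
```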
07 — Quant Application — Factor Models as Linear Maps
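The systematic part of a linear factor model for asset returns is a linear transformation:
$r=Bf+\epsilon$
where $r\in\mathbb{R}^n$ is the vector of $n$ asset returns, $f\in\mathbb{R}^k$ is the vector of $k$ factor returns ($k\ll n$), $B\in\mathbb{R}^{n\times k}$ is the loading matrix, and $\epsilon\in\mathbb{R}^n$ is the vector of residual (idiosyncratic) returns.
The transformation $T:\mathbb{R}^k\to\mathbb{R}^n$ defined by $T(f)=Bf$ is linear. Its image $\operatorname{Im}(T)=\operatorname{Col}(B)$ is the factor subspace: the slice of return space explained by the factors, which is $k$-dimensional when $B$ has full column rank.
The kernel $\ker(T^\top)=\ker(B^\top)$ identifies the directions in return space orthogonal to every column of $B$, that is, pure idiosyncratic risk.
Rank-nullity in practice: if $B$ has rank $k$ (full column rank), then the nullity is $k-k=0$ and the factor map is injective, so each factor profile $f$ maps to a distinct return vector. If two columns of $B$ are exactly collinear, $B$ is rank deficient; if they are nearly collinear, $B$ is close to rank deficient and two "different" factors have nearly the same effect, a sign of a misspecified model.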
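In statistical factor models within the APT framework, PCA (eigendecomposition of the return covariance matrix $\Sigma$) chooses $B$ so that $\operatorname{Col}(B)$ captures the maximum variance in $r$ with only $k$ factors. Fundamental models such as Barra's instead specify the loadings from asset characteristics.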
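The injectivity point in the rank discussion above can be made concrete with a toy loading matrix. In the sketch below all numbers are hypothetical: `B_collinear` has a second column equal to twice the first, so two different factor vectors produce identical systematic returns, while `B_full` has independent columns and keeps them apart.

```python
def factor_returns(B, f):
    """Systematic returns Bf for a loading matrix B given as a list of rows."""
    return [sum(b * x for b, x in zip(row, f)) for row in B]


# Hypothetical loadings for 3 assets on 2 factors; the second column
# is exactly twice the first, so B_collinear has rank 1 rather than 2.
B_collinear = [[1, 2],
               [2, 4],
               [3, 6]]

# Hypothetical loadings with linearly independent columns (rank 2).
B_full = [[1, 0],
          [0, 1],
          [1, 1]]

f1 = [2, 0]  # two units of factor 1
f2 = [0, 1]  # one unit of factor 2

print(factor_returns(B_collinear, f1))  # [2, 4, 6]
print(factor_returns(B_collinear, f2))  # [2, 4, 6]  (same returns: not injective)
print(factor_returns(B_full, f1))       # [2, 0, 2]
print(factor_returns(B_full, f2))       # [0, 1, 1]
```

With the collinear loadings, the returns alone cannot tell the two factor profiles apart, which is why estimated loadings become unstable in that situation.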
Exercises
EXERCISE 8.1
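Determine which of the following maps $T:\mathbb{R}^2\to\mathbb{R}^2$ are linear: (a) $T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}2x-y\\x+3y\end{pmatrix}$, (b) $T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}x^2\\y\end{pmatrix}$, (c) $T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}x+2\\y-1\end{pmatrix}$.
Check the two linearity conditions: $T(u+v)=T(u)+T(v)$ and $T(cv)=cT(v)$. If either fails for even one example, the map is not linear. Also check $T(0)=0$ as a quick filter.
(a) $T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}2x-y\\x+3y\end{pmatrix}$. This equals $\begin{pmatrix}2&-1\\1&3\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}$, a matrix multiplication. Linear.
(b) $T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}x^2\\y\end{pmatrix}$. Check homogeneity: $T\begin{pmatrix}cx\\cy\end{pmatrix}=\begin{pmatrix}c^2x^2\\cy\end{pmatrix}\neq c\begin{pmatrix}x^2\\y\end{pmatrix}$ whenever $c\neq 0,1$ and $x\neq 0$. Not linear.
(c) $T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}x+2\\y-1\end{pmatrix}$. $T(0)=\begin{pmatrix}2\\-1\end{pmatrix}\neq 0$. Not linear (translation).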
EXERCISE 8.2
Evaluate $T$ on each standard basis vector $e_1,e_2,e_3$ and place the results as columns of the matrix. Then use the matrix to compute $T(v)$.
EXERCISE 8.3
For $T:\mathbb{R}^3\to\mathbb{R}^2$ with matrix $A=\begin{pmatrix}1&-1&2\\2&-2&4\end{pmatrix}$, find $\ker(T)$ and $\operatorname{Im}(T)$. Verify the rank-nullity theorem.
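One way to work this out is by a single row operation. The sketch below assumes the matrix entries are read column by column, giving $A=\begin{pmatrix}1&-1&2\\2&-2&4\end{pmatrix}$; under a different reading of the entries the numbers would change.

```latex
\begin{align*}
A=\begin{pmatrix}1&-1&2\\2&-2&4\end{pmatrix}
&\xrightarrow{R_2\to R_2-2R_1}
\begin{pmatrix}1&-1&2\\0&0&0\end{pmatrix},
\qquad \operatorname{rank}(A)=1,\\
\ker(T)&=\{x\in\mathbb{R}^3 : x_1-x_2+2x_3=0\}
=\operatorname{span}\left\{(1,1,0)^\top,\,(-2,0,1)^\top\right\},\\
\operatorname{Im}(T)&=\operatorname{span}\left\{(1,2)^\top\right\},\\
\dim\ker(T)+\dim\operatorname{Im}(T)&=2+1=3=\dim\mathbb{R}^3.
\end{align*}
```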
EXERCISE 8.4
Let $T:\mathbb{R}^2\to\mathbb{R}^3$ have matrix $A=\begin{pmatrix}1&0\\0&1\\1&1\end{pmatrix}$. Is $T$ injective? Surjective? Justify.
$T$ is injective iff $\ker(T)=\{0\}$; surjective iff $\operatorname{rank}(A)=m$ (the number of rows). Use the rank-nullity theorem and the dimensions to decide. A map from $\mathbb{R}^n$ to $\mathbb{R}^m$ with $n<m$ cannot be surjective.
Rank: columns 1 and 2 are linearly independent (neither is a multiple of the other). Rank $=2$.
Injective: nullity $=2-2=0$, so $\ker(T)=\{0\}$. Yes, injective.
Surjective: rank $=2<3=\dim(\mathbb{R}^3)$. The image is a 2-D plane in $\mathbb{R}^3$, not all of $\mathbb{R}^3$. Not surjective.
Geometric interpretation: $T$ embeds $\mathbb{R}^2$ as the plane $\{(x,y,x+y):x,y\in\mathbb{R}\}$ inside $\mathbb{R}^3$.
EXERCISE 8.5
Let $T:\mathbb{R}^2\to\mathbb{R}^2$ have matrix $A=\begin{pmatrix}1&1\\0&1\end{pmatrix}$ (a shear) and $S:\mathbb{R}^2\to\mathbb{R}^2$ have matrix $B=\begin{pmatrix}2&0\\0&2\end{pmatrix}$ (scaling by 2). Find the matrix of $S\circ T$ and the kernel of the composition.
The composition $S\circ T$ has matrix $BA$ (apply $T$ first with matrix $A$, then $S$ with matrix $B$). Compute the product, then find its kernel by row reduction.
Matrix of $S\circ T$: $BA=\begin{pmatrix}2&0\\0&2\end{pmatrix}\begin{pmatrix}1&1\\0&1\end{pmatrix}=\begin{pmatrix}2&2\\0&2\end{pmatrix}$.
$\det(BA)=4\neq 0$, so $BA$ is invertible and $\ker(S\circ T)=\{0\}$. The composition is injective (and bijective, since it is square).
EXERCISE 8.6
A 3-factor model for 4 assets has loading matrix $B=\begin{pmatrix}1&2&3\\0&1&2\\1&0&1\\2&1&1\end{pmatrix}$. Find the rank of $B$, the dimension of the factor subspace, and the dimension of the idiosyncratic (factor-orthogonal) subspace. Interpret the result.
In a factor model $r=Bf$, the image of $B$ is the factor subspace, and the rank of $B$ is the number of independent factors. Use the rank-nullity theorem to find the idiosyncratic dimension. Then interpret: if two columns of $B$ are collinear, the model is misspecified.
Rank $=3$. $B$ has full column rank, so the 3 factors are independent.
The factor subspace $\operatorname{Col}(B)$ is a 3-D subspace of $\mathbb{R}^4$. Nullity of $B^\top$ (the idiosyncratic dimension) $=4-3=1$: one direction in asset return space is orthogonal to all factors and carries pure idiosyncratic risk.
If the rank were 2 (for example, two exactly collinear loading columns), the model would be over-parameterised: two "different" factors would explain the same variance, leading to unstable loading estimates.
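The rank claim can be checked by testing the columns for linear independence. The sketch below assumes the entries of the loading matrix are read column by column, so that the columns are $(1,0,1,2)^\top$, $(2,1,0,1)^\top$ and $(3,2,1,1)^\top$; the scalars $a,b,c$ are introduced here only for the check.

```latex
\begin{align*}
a\begin{pmatrix}1\\0\\1\\2\end{pmatrix}
+b\begin{pmatrix}2\\1\\0\\1\end{pmatrix}
+c\begin{pmatrix}3\\2\\1\\1\end{pmatrix}=0
\;&\Longrightarrow\;
\begin{cases}
a+2b+3c=0\\
b+2c=0\\
a+c=0\\
2a+b+c=0
\end{cases}\\
b=-2c,\quad a=-c
\;&\Longrightarrow\;
a+2b+3c=-c-4c+3c=-2c=0
\;\Longrightarrow\; a=b=c=0.
\end{align*}
```

Only the trivial combination vanishes, so the three columns are independent and $\operatorname{rank}(B)=3$; rank-nullity applied to $B^\top:\mathbb{R}^4\to\mathbb{R}^3$ then gives $\dim\ker(B^\top)=4-3=1$.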
Chapter Summary
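| Concept | Formula / Rule |
| --- | --- |
| Linearity | $T(cu+dv)=cT(u)+dT(v)$ |
| Always true | $T(0)=0$ |
| Standard matrix | $A=\begin{pmatrix}T(e_1)&\cdots&T(e_n)\end{pmatrix}$ |
| Kernel | $\ker(T)=\operatorname{Nul}(A)$; subspace of domain |
| Image | $\operatorname{Im}(T)=\operatorname{Col}(A)$; subspace of codomain |
| Rank-nullity | $\operatorname{rank}(T)+\operatorname{nullity}(T)=\dim(V)$ |
| Injective | $\ker(T)=\{0\}$ |
| Surjective | $\operatorname{Im}(T)=W$ |
| Bijective | Injective + surjective; $A$ invertible when square |
| Composition | $(S\circ T)(x)=(BA)x$ |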
Up next: Chapter 09 (Change of Basis), where we see how the matrix representation $[T]_B^C$ changes when we switch coordinate systems, and how diagonalisation is a special case of this.