Chapter 01

Vectors & Vector Spaces

00 · Symbol Glossary

Every symbol you'll see in this chapter, named and explained. When a new one appears, refer back here.

$\mathbb{R}$: Blackboard R

The set of all real numbers, that is, every number on the number line: negative, zero, positive, decimals, and irrationals like $\pi$. The double-struck style of $\mathbb{R}$ is standard in mathematics to distinguish it from an ordinary $R$.

$\mathbb{R}^n$: R-n

The set of all vectors with exactly $n$ real-number components. $\mathbb{R}^2$ is the 2D plane, $\mathbb{R}^3$ is 3D space, and $\mathbb{R}^{100}$ is 100-dimensional space, the kind of space used constantly in machine learning.

$\mathbf{v}$: Bold v (vector)

A vector. Bold lowercase letters always denote vectors. Plain lowercase like $v$ (no bold) typically denotes a scalar: a single number, not a list.

$v_i$: v sub i (component)

The $i$-th entry of vector $\mathbf{v}$. The subscript $i$ is an index that selects a slot. $v_1$ is the first entry, $v_2$ the second, $v_n$ the last.

$\mathbf{v}^T$: v transpose

The superscript $T$ means transpose: flip the column vector into a row vector. A column $\begin{pmatrix}3\\5\end{pmatrix}$ becomes the row $\begin{pmatrix}3 & 5\end{pmatrix}$. Critical for writing dot products as matrix multiplication.

$\in$: "in" / element of

Membership. $\mathbf{v} \in \mathbb{R}^n$ reads "v is an element of $\mathbb{R}^n$", meaning $\mathbf{v}$ belongs to that space. Think of it as the mathematical word "in."

$\sum$: Sigma (summation)

Greek capital letter sigma. Means "add up a sequence." $\sum_{i=1}^{n} v_i$ means: start at $i=1$, go up to $i=n$, and add $v_1 + v_2 + \cdots + v_n$. A compact way to write long sums.

$\|\mathbf{v}\|$: Norm of v (length)

The Euclidean length (magnitude) of a vector — how long the arrow is. The double vertical bars are the norm notation. Always a non-negative number.

$\mathbf{u} \cdot \mathbf{v}$: Dot product

An operation between two vectors that returns a single number (a scalar). Computed by multiplying corresponding components and summing. The dot $\cdot$ distinguishes this from scalar multiplication.

$\mathbf{0}$: Zero vector

The vector where every component is 0. Bold to distinguish from the number zero. $\mathbf{0} \in \mathbb{R}^n$ is an $n$-component vector of zeros, the "do nothing" element of a vector space.

$\exists$: "there exists"

Existential quantifier. $\exists\, \mathbf{0}$ means "there exists a zero vector." Used in formal definitions to assert that something exists without naming it explicitly.

$\forall$: "for all"

Universal quantifier. $\forall\, \mathbf{v} \in V$ means "for every vector v in V": the statement that follows must hold without exception.

$c,\ d$: Scalars

Plain (non-bold) letters representing single real numbers. Called "scalars" to contrast with vectors. They scale vectors — stretching, shrinking, or flipping them.

$\theta$: Theta (angle)

Greek lowercase letter theta. In this chapter it represents the angle between two vectors. You'll encounter it again in the dot product's geometric formula.

$\cos\theta$: Cosine of theta

A trigonometric function that measures how aligned two directions are. $\cos(0°) = 1$ (same direction), $\cos(90°) = 0$ (perpendicular), $\cos(180°) = -1$ (opposite). Appears inside the dot product's geometric formula.

$\sqrt{\phantom{x}}$: Square root

$\sqrt{x}$ is the non-negative number which, multiplied by itself, gives $x$. Example: $\sqrt{25} = 5$ because $5 \times 5 = 25$. Used in the norm formula to "undo" the squaring of components.

$\hat{\mathbf{v}}$: v-hat (unit vector)

The hat accent means the vector has been normalized — its length is exactly 1. Direction is preserved, magnitude is removed. Read aloud as "v hat."

$\dim(V)$: Dimension of V

The number of vectors in any basis of vector space $V$. It measures the "degrees of freedom": how many independent directions exist in the space.

$\text{span}\{\cdot\}$: Span

The set of all possible linear combinations of the vectors inside the braces. Represents every point reachable by mixing those vectors with any real-number coefficients.


01 · What is a Vector?

A vector is an ordered list of numbers. "Ordered" means position matters — the first slot is different from the second slot.

Think of it as coordinates. If you say "I'm 3 blocks east and 5 blocks north," you've described a position using two numbers in a specific order. That's a vector: $\begin{pmatrix}3\\5\end{pmatrix}$.

Definition — Vector in $\mathbb{R}^n$

An $n$-dimensional vector is an ordered list of $n$ real numbers, written vertically as a column:

$$\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix} \in \mathbb{R}^n$$

$\mathbf{v}$: bold lowercase, the vector itself.

$v_1, v_2, \ldots, v_n$: the individual numbers inside, called components or entries.

$\in \mathbb{R}^n$: reads "is an element of R-n," meaning this vector lives in $n$-dimensional real space.

$\vdots$: vertical dots meaning "and so on, continuing the pattern."

Concrete Examples

Example — 2D Vector

A stock has return 4% and volatility 2%. As a vector in $\mathbb{R}^2$:

$$\mathbf{v} = \begin{pmatrix} 4 \\ 2 \end{pmatrix}$$

First slot ($v_1$) = 4 → return. Second slot ($v_2$) = 2 → volatility. Order is fixed: swapping them would mean something completely different.

Example — 3D Vector

A point in 3D space at $x=1$, $y=-3$, $z=7$:

$$\mathbf{p} = \begin{pmatrix} 1 \\ -3 \\ 7 \end{pmatrix} \in \mathbb{R}^3$$

$p_1 = 1$, $p_2 = -3$, $p_3 = 7$. Negative numbers are perfectly valid.

Geometric Picture

In 2D, a vector $\begin{pmatrix}3\\5\end{pmatrix}$ is an arrow starting at the origin $(0,0)$, pointing to the point $(3, 5)$. The two numbers tell you how far right and how far up to go. This arrow interpretation is critical: it lets you visualize addition and scaling.

Row vs Column

Vectors written horizontally like $(3,\ 5)$ are row vectors. Vectors written vertically (as above) are column vectors. Default in linear algebra: column. When you see $\mathbf{v}^T$ (v-transpose), it means flip the column into a row: $\mathbf{v}^T = \begin{pmatrix}3 & 5\end{pmatrix}$. This distinction matters a lot for matrix multiplication later.


02 · Vector Operations

Vector Addition

Add two vectors by adding their components in the same position. First slot with first slot, second with second, and so on. Both vectors must have the same number of components — you cannot add a 2D vector to a 3D vector.

Definition — Vector Addition

$$\mathbf{u} + \mathbf{v} = \begin{pmatrix}u_1\\u_2\\\vdots\\u_n\end{pmatrix} + \begin{pmatrix}v_1\\v_2\\\vdots\\v_n\end{pmatrix} = \begin{pmatrix}u_1+v_1\\u_2+v_2\\\vdots\\u_n+v_n\end{pmatrix}$$

Step-by-step — Adding $\mathbf{u} = \begin{pmatrix}2\\4\\-1\end{pmatrix}$ and $\mathbf{v} = \begin{pmatrix}3\\-2\\5\end{pmatrix}$
Confirm same dimension: both have 3 components, so $\mathbf{u}, \mathbf{v} \in \mathbb{R}^3$ and we can add them.
Add slot 1: $u_1 + v_1 = 2 + 3 = 5$. This becomes the first entry of the result.
Add slot 2: $u_2 + v_2 = 4 + (-2) = 2$. Adding a negative is the same as subtracting.
Add slot 3: $u_3 + v_3 = (-1) + 5 = 4$.
Assemble the result: $\mathbf{u}+\mathbf{v} = \begin{pmatrix}5\\2\\4\end{pmatrix}$.

Geometric meaning: place the tail of $\mathbf{v}$ at the tip of $\mathbf{u}$. The result is the arrow from the origin to where $\mathbf{v}$'s tip ends up. It is like walking 2 blocks east and 4 blocks north, then 3 more blocks east and 2 blocks south: the total displacement is 5 blocks east and 2 blocks north, the sum of the two legs in each direction.

Common mistake — Mismatched Dimensions

You cannot add $\begin{pmatrix}2\\4\end{pmatrix} \in \mathbb{R}^2$ and $\begin{pmatrix}1\\3\\5\end{pmatrix} \in \mathbb{R}^3$. There's no third slot in the first vector to pair with the $5$. The sum is undefined, full stop.
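As a minimal sketch of the addition rule, assuming vectors are stored as plain Python lists, the helper below (the name `add_vectors` is hypothetical, not something from this chapter) adds slot by slot and refuses mismatched dimensions instead of guessing:

```python
def add_vectors(u, v):
    """Component-wise sum of two vectors given as lists of numbers."""
    if len(u) != len(v):
        # Mismatched dimensions: the sum is undefined, so raise instead of guessing.
        raise ValueError(
            f"cannot add a {len(u)}-component vector to a {len(v)}-component vector"
        )
    # Pair slot i of u with slot i of v and add each pair.
    return [ui + vi for ui, vi in zip(u, v)]


result = add_vectors([2, 4, -1], [3, -2, 5])
print(result)  # [5, 2, 4]
```

Calling `add_vectors([2, 4], [1, 3, 5])` raises a `ValueError`, which mirrors the "undefined" case described above.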

Scalar Multiplication

Multiply a vector by a single number (a scalar). Multiply every component by that number. This stretches or shrinks — and possibly flips — the arrow.

Definition — Scalar Multiplication

$$c\mathbf{v} = c\begin{pmatrix}v_1\\v_2\\\vdots\\v_n\end{pmatrix} = \begin{pmatrix}cv_1\\cv_2\\\vdots\\cv_n\end{pmatrix}$$

$c$ is the scalar, just a real number. It's not bold because it's not a vector.

Step-by-step — Computing $-2 \cdot \mathbf{v}$ where $\mathbf{v} = \begin{pmatrix}3\\-1\\4\end{pmatrix}$
Identify scalar and vector: scalar $c = -2$, vector $\mathbf{v} = \begin{pmatrix}3\\-1\\4\end{pmatrix}$.
Multiply slot 1: $(-2) \times 3 = -6$. Negative times positive = negative.
Multiply slot 2: $(-2) \times (-1) = 2$. Negative times negative = positive.
Multiply slot 3: $(-2) \times 4 = -8$.
Result: $-2\mathbf{v} = \begin{pmatrix}-6\\2\\-8\end{pmatrix}$. The scalar $c = -2$ doubled the length ($|-2| = 2$) and flipped direction (negative sign).

Special cases: $c = 0$ gives the zero vector $\mathbf{0}$. $c = 1$ gives back $\mathbf{v}$ unchanged. $c = -1$ flips the arrow to point the opposite direction with the same length.
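The worked example and the three special cases can be reproduced with a short sketch. It assumes vectors are plain Python lists; the function name `scale` is an illustrative choice, not part of the chapter.

```python
def scale(c, v):
    """Multiply every component of vector v by the scalar c."""
    return [c * vi for vi in v]


v = [3, -1, 4]
print(scale(-2, v))  # [-6, 2, -8]   doubled in length and flipped
print(scale(0, v))   # [0, 0, 0]     the zero vector
print(scale(1, v))   # [3, -1, 4]    unchanged
print(scale(-1, v))  # [-3, 1, -4]   same length, opposite direction
```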

Dot Product

The most important operation in this chapter. Multiply corresponding components together, then add all those products up. The result is a single number (a scalar), not a vector.

Definition — Dot Product

$$\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n = \sum_{i=1}^{n} u_i v_i$$

$\sum_{i=1}^{n}$: "sum from $i=1$ to $n$." The index $i$ steps through $1, 2, 3, \ldots, n$, and for each value of $i$ you compute $u_i v_i$, then add them all.

$u_i v_i$: the $i$-th component of $\mathbf{u}$ multiplied by the $i$-th component of $\mathbf{v}$.

Step-by-step — $\mathbf{u} \cdot \mathbf{v}$ where $\mathbf{u} = \begin{pmatrix}2\\-1\\3\end{pmatrix}$, $\mathbf{v} = \begin{pmatrix}4\\5\\-2\end{pmatrix}$
Pair slot 1: $u_1 \times v_1 = 2 \times 4 = 8$. (First component of each vector.)
Pair slot 2: $u_2 \times v_2 = (-1) \times 5 = -5$. (Second component of each vector.)
Pair slot 3: $u_3 \times v_3 = 3 \times (-2) = -6$. (Third component of each vector.)
Sum all products: $\mathbf{u}\cdot\mathbf{v} = 8 + (-5) + (-6) = 8 - 5 - 6 = -3$. Result is $-3$, a scalar. The dot product is negative here; we'll see what that means geometrically next.

Geometric Meaning of the Dot Product

There is a second formula for the dot product that reveals its geometric meaning:

$$\mathbf{u} \cdot \mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta$$

where $\theta$ is the angle between the two vectors (both assumed non-zero), $\|\mathbf{u}\|$ and $\|\mathbf{v}\|$ are the lengths of the vectors (never negative, and positive for non-zero vectors), and $\cos\theta$ is the cosine of the angle.

Definition — What the sign of the dot product tells you

For non-zero vectors $\mathbf{u}$ and $\mathbf{v}$:

Positive: $\mathbf{u}\cdot\mathbf{v} > 0 \implies \cos\theta > 0 \implies \theta < 90°$. Vectors point roughly in the same direction.

Zero: $\mathbf{u}\cdot\mathbf{v} = 0 \implies \cos\theta = 0 \implies \theta = 90°$. Vectors are perpendicular (called orthogonal). This is hugely important in PCA.

Negative: $\mathbf{u}\cdot\mathbf{v} < 0 \implies \cos\theta < 0 \implies \theta > 90°$. Vectors point roughly in opposing directions.
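To connect the component formula with the geometric one, here is a small sketch using only Python's standard `math` module. The helper names `dot` and `angle_degrees` are assumptions made for illustration. Each length is computed as $\sqrt{\mathbf{u}\cdot\mathbf{u}}$, which is the same square-root-of-summed-squares quantity the geometric formula calls $\|\mathbf{u}\|$.

```python
import math


def dot(u, v):
    """Sum of products of corresponding components."""
    if len(u) != len(v):
        raise ValueError("dot product needs vectors of the same dimension")
    return sum(ui * vi for ui, vi in zip(u, v))


def angle_degrees(u, v):
    """Angle between two non-zero vectors, solved from u.v = |u| |v| cos(theta)."""
    norm_u = math.sqrt(dot(u, u))
    norm_v = math.sqrt(dot(v, v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("the angle is undefined when either vector is the zero vector")
    cos_theta = dot(u, v) / (norm_u * norm_v)
    # Clamp to [-1, 1] so floating-point rounding cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


u, v = [2, -1, 3], [4, 5, -2]
print(dot(u, v))                 # -3
print(angle_degrees(u, v) > 90)  # True: a negative dot product means an obtuse angle
print(dot([1, 0], [0, 1]))       # 0: perpendicular vectors
```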

Norm (Vector Length)

The norm is the Euclidean length of a vector — how long the arrow is. It's always a non-negative number.

Definition — Euclidean Norm (L2 Norm)

$$\|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2} = \sqrt{\sum_{i=1}^{n} v_i^2}$$

$\|\mathbf{v}\|$: double bars denote the norm. The "L2" label means we're squaring, summing, then square-rooting (as opposed to other norms, which use different powers). This extends Pythagoras' theorem to $n$ dimensions.

Step-by-step — Norm of $\mathbf{v} = \begin{pmatrix}2\\-3\\6\end{pmatrix}$
Square each component: $v_1^2 = 2^2 = 4$, $v_2^2 = (-3)^2 = 9$, $v_3^2 = 6^2 = 36$. Note: squaring makes negatives positive, so $(-3)^2 = 9$, not $-9$.
Sum the squares: $4 + 9 + 36 = 49$. This number came from $4$ (from squaring $2$), plus $9$ (from squaring $-3$), plus $36$ (from squaring $6$).
Take the square root: $\|\mathbf{v}\| = \sqrt{4 + 9 + 36} = \sqrt{49} = 7$, because $7 \times 7 = 49$.

Unit Vectors — Normalizing

A unit vector has norm exactly equal to 1. To convert any non-zero vector to a unit vector, divide by its norm. This preserves direction but removes magnitude. The zero vector has norm 0 and no direction, so it cannot be normalized.

Step-by-step — Normalize $\mathbf{v} = \begin{pmatrix}3\\4\end{pmatrix}$
Compute the norm: $\|\mathbf{v}\| = \sqrt{3^2 + 4^2} = \sqrt{9+16} = \sqrt{25} = 5$. The numbers $9$ and $16$ came from squaring $3$ and $4$ respectively.
Divide every component by 5: $\hat{\mathbf{v}} = \frac{\mathbf{v}}{\|\mathbf{v}\|} = \frac{1}{5}\begin{pmatrix}3\\4\end{pmatrix} = \begin{pmatrix}3/5\\4/5\end{pmatrix} = \begin{pmatrix}0.6\\0.8\end{pmatrix}$.
Verify: $\|\hat{\mathbf{v}}\| = \sqrt{(0.6)^2 + (0.8)^2} = \sqrt{0.36 + 0.64} = \sqrt{1} = 1$.
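The norm and normalization steps can be sketched in a few lines. The names `norm` and `normalize` are illustrative assumptions, and the guard against a zero norm is an added safety check: dividing by a norm of 0 is not defined.

```python
import math


def norm(v):
    """Euclidean (L2) length: square each component, sum, take the square root."""
    return math.sqrt(sum(vi * vi for vi in v))


def normalize(v):
    """Return the unit vector pointing in the same direction as v."""
    length = norm(v)
    if length == 0:
        raise ValueError("the zero vector has no direction and cannot be normalized")
    return [vi / length for vi in v]


print(norm([2, -3, 6]))                # 7.0
v_hat = normalize([3, 4])
print(v_hat)                           # [0.6, 0.8]
print(math.isclose(norm(v_hat), 1.0))  # True
```

The final check uses `math.isclose` rather than `==` because squaring and summing decimals such as 0.6 and 0.8 can introduce tiny floating-point rounding errors.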

03 · Linear Combinations

A linear combination means: take a set of vectors, multiply each by a scalar, and add the results. You're mixing vectors together. This is the single most fundamental operation in linear algebra — everything else builds on it.

Definition — Linear Combination

$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k = \sum_{i=1}^{k} c_i\mathbf{v}_i$$

$c_1, c_2, \ldots, c_k$: scalars called coefficients. They control how much of each vector you use.

$\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$: the vectors being combined. The subscripts here are labels (vector 1, vector 2, ...), not component indices.

$k$: the number of vectors in the combination.

Step-by-step — $3\mathbf{v}_1 - 2\mathbf{v}_2$ where $\mathbf{v}_1=\begin{pmatrix}1\\0\end{pmatrix}$, $\mathbf{v}_2=\begin{pmatrix}2\\1\end{pmatrix}$
Identify coefficients and vectors: $c_1 = 3$ with $\mathbf{v}_1$, and $c_2 = -2$ with $\mathbf{v}_2$.
Scale $\mathbf{v}_1$ by 3: $3\mathbf{v}_1 = 3\begin{pmatrix}1\\0\end{pmatrix} = \begin{pmatrix}3\\0\end{pmatrix}$.
Scale $\mathbf{v}_2$ by $-2$: $-2\mathbf{v}_2 = -2\begin{pmatrix}2\\1\end{pmatrix} = \begin{pmatrix}-4\\-2\end{pmatrix}$.
Add the scaled vectors component-by-component: $\begin{pmatrix}3\\0\end{pmatrix} + \begin{pmatrix}-4\\-2\end{pmatrix} = \begin{pmatrix}3+(-4)\\0+(-2)\end{pmatrix} = \begin{pmatrix}-1\\-2\end{pmatrix}$.

This result $\begin{pmatrix}-1\\-2\end{pmatrix}$ is one particular linear combination. By changing $c_1$ and $c_2$ to any real numbers, you can produce infinitely many different vectors. This is the idea behind span.
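A linear combination is just "scale each vector, then add," so it translates directly into a loop. The sketch below assumes vectors are Python lists of equal length; `linear_combination` is a hypothetical helper name.

```python
def linear_combination(coefficients, vectors):
    """Compute c_1*v_1 + ... + c_k*v_k for lists of scalars and vectors."""
    if not vectors or len(coefficients) != len(vectors):
        raise ValueError("need at least one vector and exactly one coefficient per vector")
    n = len(vectors[0])
    if any(len(v) != n for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    result = [0] * n
    for c, v in zip(coefficients, vectors):
        for i in range(n):
            # Add the scaled i-th component of this vector to the running total.
            result[i] += c * v[i]
    return result


print(linear_combination([3, -2], [[1, 0], [2, 1]]))  # [-1, -2]
```

Changing the coefficient list while keeping the vectors fixed produces other members of the same family of reachable vectors.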


04 · Span

Definition — Span

The span of vectors $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ is the set of all vectors you can produce by forming every possible linear combination:

$$\text{span}\{\mathbf{v}_1, \ldots, \mathbf{v}_k\} = \left\{\, c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k \;\middle|\; c_1, \ldots, c_k \in \mathbb{R} \,\right\}$$

The set-builder notation reads: "the set of all things of the form [left side] where [right side] is the condition." Here: all linear combinations where the scalars are real numbers.

Example — Single Vector Spans a Line

What is $\text{span}\left\{\begin{pmatrix}1\\2\end{pmatrix}\right\}$?

Only one vector, so any linear combination is just $c \cdot \begin{pmatrix}1\\2\end{pmatrix} = \begin{pmatrix}c\\2c\end{pmatrix}$ for any $c \in \mathbb{R}$.

This traces out a line through the origin in the direction $(1, 2)$. Points like $\begin{pmatrix}2\\4\end{pmatrix}$, $\begin{pmatrix}-3\\-6\end{pmatrix}$, $\begin{pmatrix}0\\0\end{pmatrix}$ are all on this line, so they are all in the span. But $\begin{pmatrix}1\\3\end{pmatrix}$ is NOT: matching the first component forces $c = 1$, and $c = 1$ gives $\begin{pmatrix}1\\2\end{pmatrix}$, not $\begin{pmatrix}1\\3\end{pmatrix}$.

Example — Two Non-Parallel Vectors Span a Plane

What is $\text{span}\left\{\begin{pmatrix}1\\0\end{pmatrix}, \begin{pmatrix}0\\1\end{pmatrix}\right\}$?

Any combination: $c_1\begin{pmatrix}1\\0\end{pmatrix} + c_2\begin{pmatrix}0\\1\end{pmatrix} = \begin{pmatrix}c_1\\c_2\end{pmatrix}$. Since $c_1$ and $c_2$ are any real numbers, this produces every point in $\mathbb{R}^2$. The span is the entire 2D plane.

Common mistake — Parallel Vectors Have Limited Span

Consider $\text{span}\left\{\begin{pmatrix}1\\2\end{pmatrix}, \begin{pmatrix}2\\4\end{pmatrix}\right\}$ and notice that $\begin{pmatrix}2\\4\end{pmatrix} = 2\begin{pmatrix}1\\2\end{pmatrix}$. These are parallel (same direction, different length). Any combination $c_1\begin{pmatrix}1\\2\end{pmatrix} + c_2\begin{pmatrix}2\\4\end{pmatrix} = (c_1 + 2c_2)\begin{pmatrix}1\\2\end{pmatrix}$ is still just a multiple of the first vector. The span is still only a line, not the plane, because the second vector adds no new direction.
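Membership in the span of a single vector comes down to one question: is there a scalar $c$ that works in every slot at once? The sketch below is one possible way to check that, assuming list-based vectors. The function name `scalar_multiple_of` and the tolerance `tol` are illustrative choices, not something defined in this chapter.

```python
def scalar_multiple_of(target, v, tol=1e-12):
    """Return c such that c * v equals target, or None if no such scalar exists."""
    if len(target) != len(v):
        raise ValueError("vectors must have the same dimension")
    # Use the first non-zero slot of v to solve for the only possible c.
    pivot = next((i for i, vi in enumerate(v) if vi != 0), None)
    if pivot is None:
        # v is the zero vector, whose span contains only the zero vector.
        return 0.0 if all(t == 0 for t in target) else None
    c = target[pivot] / v[pivot]
    # The same c must reproduce every slot of target, not just the pivot slot.
    if all(abs(c * vi - ti) <= tol for vi, ti in zip(v, target)):
        return c
    return None


print(scalar_multiple_of([2, 4], [1, 2]))    # 2.0
print(scalar_multiple_of([-3, -6], [1, 2]))  # -3.0
print(scalar_multiple_of([1, 3], [1, 2]))    # None: c = 1 fits slot 1 but not slot 2
```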


05 · Vector Spaces

A vector space is a set $V$ with two operations (addition and scalar multiplication) that satisfy 8 rules. The key idea: these operations must never take you outside the set, so the space is self-contained.

Definition — Vector Space

A set $V$ over $\mathbb{R}$ is a vector space if the 8 axioms listed below hold and, $\forall\, \mathbf{u}, \mathbf{v} \in V$ and $\forall\, c \in \mathbb{R}$:

$$\mathbf{u} + \mathbf{v} \in V \qquad \text{(closed under addition)}$$

$$c\mathbf{v} \in V \qquad \text{(closed under scalar multiplication)}$$

$\forall$: "for all." This must hold for every possible pair of vectors, not just some.

Closed: the result stays inside $V$. The space doesn't "leak."

The 8 Axioms

These aren't arbitrary rules. Each one captures something that must be true for the math to be consistent and useful.

| Axiom | Rule | Why it matters |
| --- | --- | --- |
| 1. Commutativity | $\mathbf{u}+\mathbf{v} = \mathbf{v}+\mathbf{u}$ | Order of addition doesn't matter: walking east then north = north then east. |
| 2. Associativity | $(\mathbf{u}+\mathbf{v})+\mathbf{w} = \mathbf{u}+(\mathbf{v}+\mathbf{w})$ | Grouping doesn't matter; lets you drop parentheses. |
| 3. Zero vector | $\exists\,\mathbf{0} \in V$ such that $\mathbf{v}+\mathbf{0}=\mathbf{v}$ | A "do nothing" element exists: the origin. |
| 4. Additive inverse | $\forall\,\mathbf{v},\, \exists\,(-\mathbf{v})$ such that $\mathbf{v}+(-\mathbf{v})=\mathbf{0}$ | Every vector has an opposite; allows subtraction. |
| 5. Scalar identity | $1\cdot\mathbf{v} = \mathbf{v}$ | Multiplying by 1 is neutral. |
| 6. Scalar associativity | $(cd)\mathbf{v} = c(d\mathbf{v})$ | Scaling twice = scaling by product; $2 \times 3 = 6$. |
| 7. Distributive (vector) | $c(\mathbf{u}+\mathbf{v}) = c\mathbf{u}+c\mathbf{v}$ | Scalar distributes over vector sum. |
| 8. Distributive (scalar) | $(c+d)\mathbf{v} = c\mathbf{v}+d\mathbf{v}$ | Vector distributes over scalar sum. |

Example — $\mathbb{R}^2$ Is a Vector Space

Take $\mathbf{u} = \begin{pmatrix}1\\3\end{pmatrix}$ and $\mathbf{v} = \begin{pmatrix}-2\\4\end{pmatrix}$, both in $\mathbb{R}^2$.

$\mathbf{u}+\mathbf{v} = \begin{pmatrix}-1\\7\end{pmatrix} \in \mathbb{R}^2$: still a 2D vector.

$5\mathbf{v} = \begin{pmatrix}-10\\20\end{pmatrix} \in \mathbb{R}^2$: still a 2D vector.

The same closure holds for any pair of 2D vectors and any scalar, and all 8 axioms hold because they hold component by component for real numbers. $\mathbb{R}^2$ is a vector space.

Common mistake — The First Quadrant Is Not a Vector Space

Let $V$ be all vectors in $\mathbb{R}^2$ with non-negative entries, $x \geq 0$ and $y \geq 0$: the first quadrant.

Take $\mathbf{v} = \begin{pmatrix}1\\2\end{pmatrix} \in V$. Compute $(-1)\mathbf{v} = \begin{pmatrix}-1\\-2\end{pmatrix}$. Both components are negative, so $\begin{pmatrix}-1\\-2\end{pmatrix}$ is not in $V$.

Closure under scalar multiplication fails, and so does Axiom 4 (additive inverse): $-\mathbf{v}$ is not in the set. $V$ is not a vector space.


06 · Subspaces

A subspace is a subset $W \subseteq V$ that is itself a vector space under the same operations. Instead of checking all 8 axioms (they're inherited from $V$), you only need 3 checks.

Definition — Subspace Test (3 Conditions)

$W$ is a subspace of $V$ if and only if:

S1. $\mathbf{0} \in W$: the zero vector is in $W$.

S2. $\mathbf{u}, \mathbf{v} \in W \implies \mathbf{u}+\mathbf{v} \in W$: closed under addition.

S3. $\mathbf{v} \in W,\, c \in \mathbb{R} \implies c\mathbf{v} \in W$: closed under scalar multiplication.

Step-by-step — Is the x-axis a subspace of $\mathbb{R}^2$?
Set up: The x-axis in $\mathbb{R}^2$ is all vectors of the form $\begin{pmatrix}x\\0\end{pmatrix}$ where $x \in \mathbb{R}$. Call this set $W$.
Check S1: Does $W$ contain $\mathbf{0}$? Set $x=0$: $\begin{pmatrix}0\\0\end{pmatrix} \in W$. Yes.
Check S2: Take any two elements: $\begin{pmatrix}a\\0\end{pmatrix}+\begin{pmatrix}b\\0\end{pmatrix} = \begin{pmatrix}a+b\\0\end{pmatrix}$. Second component is still $0$, so the result is in $W$. Yes.
Check S3: $c\begin{pmatrix}a\\0\end{pmatrix} = \begin{pmatrix}ca\\0\end{pmatrix}$. Second component stays $0$. Still in $W$. Yes.
Conclusion: All three conditions pass. The x-axis is a subspace of $\mathbb{R}^2$.

Common mistake — A Line Not Through the Origin Is Not a Subspace

Let $W = \left\{\begin{pmatrix}x\\x+1\end{pmatrix} \mid x \in \mathbb{R}\right\}$, a line shifted up by 1.

Check S1: Is $\begin{pmatrix}0\\0\end{pmatrix} \in W$? We'd need $x$ such that $x=0$ and $x+1=0$ simultaneously. Impossible, since $0+1=1 \neq 0$. The zero vector is NOT in $W$.

Conclusion: Fails the very first test. Not a subspace. Every subspace must contain the origin.
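The three conditions can also be probed numerically. The sketch below is a counterexample search over a handful of sample vectors, not a proof: a failure shows that a set is not a subspace, while passing only means no counterexample was found among the samples tried. All names here (`spot_check_subspace`, `on_x_axis`, `on_shifted_line`, `in_first_quadrant`) are hypothetical, and the sets are restricted to $\mathbb{R}^2$ for simplicity.

```python
def spot_check_subspace(contains, samples, scalars):
    """Look for a violation of S1, S2 or S3 among 2D samples taken from the set.

    Returns a description of the first failure, or None if none was found.
    """
    if not contains([0, 0]):
        return "S1 fails: zero vector missing"
    for u in samples:
        for v in samples:
            total = [u[0] + v[0], u[1] + v[1]]
            if not contains(total):
                return f"S2 fails: {u} + {v} = {total}"
        for c in scalars:
            scaled = [c * u[0], c * u[1]]
            if not contains(scaled):
                return f"S3 fails: {c} * {u} = {scaled}"
    return None


def on_x_axis(v):
    return v[1] == 0


def on_shifted_line(v):
    return v[1] == v[0] + 1


def in_first_quadrant(v):
    return v[0] >= 0 and v[1] >= 0


scalars = [-1, 0, 2]
print(spot_check_subspace(on_x_axis, [[1, 0], [-3, 0]], scalars))
# None
print(spot_check_subspace(on_shifted_line, [[0, 1], [1, 2]], scalars))
# S1 fails: zero vector missing
print(spot_check_subspace(in_first_quadrant, [[1, 2], [0, 3]], scalars))
# S3 fails: -1 * [1, 2] = [-1, -2]
```

The last call reproduces the first-quadrant counterexample from the vector spaces section: the set contains the origin and is closed under addition, but scaling by $-1$ leaves it.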


07 · Linear Independence

Vectors are linearly independent if none of them can be built from linear combinations of the others. Each vector contributes a genuinely new direction that the others can't replicate.

Intuition: if you're describing a route, you don't need "go northeast" if you already have "go north" and "go east", because northeast is a combination of the other two. That redundancy is linear dependence.

Definition — Linear Independence

Vectors $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ are linearly independent if the only solution to:

$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k = \mathbf{0}$$

is $c_1 = c_2 = \cdots = c_k = 0$ (all coefficients must be zero).

If a solution exists in which at least one coefficient is non-zero, the vectors are linearly dependent.

Why this definition? If one vector, say $\mathbf{v}_3$, equals $2\mathbf{v}_1 - \mathbf{v}_2$, then $2\mathbf{v}_1 - \mathbf{v}_2 - \mathbf{v}_3 = \mathbf{0}$, which is a non-zero solution with $c_1=2, c_2=-1, c_3=-1$. The equation catches the redundancy.

Example — Linearly Independent Vectors

Test $\mathbf{v}_1 = \begin{pmatrix}1\\0\end{pmatrix}$, $\mathbf{v}_2 = \begin{pmatrix}0\\1\end{pmatrix}$.

Set up: $c_1\begin{pmatrix}1\\0\end{pmatrix} + c_2\begin{pmatrix}0\\1\end{pmatrix} = \begin{pmatrix}0\\0\end{pmatrix}$.

Computing the left side: $\begin{pmatrix}c_1\\0\end{pmatrix} + \begin{pmatrix}0\\c_2\end{pmatrix} = \begin{pmatrix}c_1\\c_2\end{pmatrix} = \begin{pmatrix}0\\0\end{pmatrix}$.

Match components: slot 1 gives $c_1 = 0$. Slot 2 gives $c_2 = 0$. The only solution is $c_1=c_2=0$. Linearly independent.

Common mistake — Parallel Vectors Are Linearly Dependent

Test $\mathbf{v}_1=\begin{pmatrix}1\\2\end{pmatrix}$, $\mathbf{v}_2=\begin{pmatrix}3\\6\end{pmatrix}$.

Notice: $\mathbf{v}_2 = 3\mathbf{v}_1$, since every component of $\mathbf{v}_2$ is exactly 3 times the matching component of $\mathbf{v}_1$. They're parallel.

Try $c_1 = 3$, $c_2 = -1$: $3\begin{pmatrix}1\\2\end{pmatrix} + (-1)\begin{pmatrix}3\\6\end{pmatrix} = \begin{pmatrix}3\\6\end{pmatrix} - \begin{pmatrix}3\\6\end{pmatrix} = \begin{pmatrix}0\\0\end{pmatrix}$.

A non-zero solution exists ($c_1=3, c_2=-1$). Linearly dependent. $\mathbf{v}_2$ is redundant: it adds no new direction.
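For two vectors in $\mathbb{R}^2$ there is a quick arithmetic shortcut that this chapter does not derive: $\mathbf{u}$ and $\mathbf{v}$ are linearly dependent exactly when $u_1 v_2 - u_2 v_1 = 0$. If $\mathbf{v} = k\mathbf{u}$, the expression becomes $u_1(k u_2) - u_2(k u_1) = 0$. The sketch below applies that test with a hypothetical helper `are_dependent_2d`, then confirms the non-zero solution found above. It assumes exact integer inputs; floating-point inputs would need a tolerance in place of `== 0`.

```python
def are_dependent_2d(u, v):
    """True if two vectors in R^2 are linearly dependent (they share one line through the origin)."""
    # Cross-multiplication test: zero exactly when one vector is a multiple of the other.
    return u[0] * v[1] - u[1] * v[0] == 0


v1, v2 = [1, 2], [3, 6]
print(are_dependent_2d(v1, v2))          # True: parallel, so dependent
print(are_dependent_2d([1, 0], [0, 1]))  # False: independent

# The non-zero solution from the example: 3*v1 + (-1)*v2 is the zero vector.
combo = [3 * a + (-1) * b for a, b in zip(v1, v2)]
print(combo)  # [0, 0]
```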


08 · Basis & Dimension

A basis is the most efficient set of vectors that fully describes a vector space — no redundancy, no gaps. Think of it as the minimal vocabulary needed to express every vector in the space.

Definition — Basis

A set $B = \{\mathbf{b}_1, \ldots, \mathbf{b}_k\}$ is a basis for vector space $V$ if:

B1. $B$ is linearly independent (no redundancy).

B2. $\text{span}(B) = V$ (every vector in $V$ can be reached).

Example — Standard Basis of $\mathbb{R}^2$

The standard basis is:

$$\mathbf{e}_1 = \begin{pmatrix}1\\0\end{pmatrix}, \qquad \mathbf{e}_2 = \begin{pmatrix}0\\1\end{pmatrix}$$

Independent? Yes, as shown in the linear independence example above. Spans $\mathbb{R}^2$? Yes: any vector $\begin{pmatrix}x\\y\end{pmatrix} = x\mathbf{e}_1 + y\mathbf{e}_2$, so $c_1=x$, $c_2=y$ gives every point. It's a basis.

The letter $\mathbf{e}$ is the conventional label for standard basis vectors. The subscript tells which slot is $1$; all other slots are $0$.

Bases are not unique

$\left\{\begin{pmatrix}1\\1\end{pmatrix}, \begin{pmatrix}1\\-1\end{pmatrix}\right\}$ is also a valid basis for $\mathbb{R}^2$: it is linearly independent and spans the plane. A space has infinitely many bases, but all bases for the same space have the same number of vectors. That number is the dimension.
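To see why this second set spans the plane, solve $c_1\begin{pmatrix}1\\1\end{pmatrix} + c_2\begin{pmatrix}1\\-1\end{pmatrix} = \begin{pmatrix}x\\y\end{pmatrix}$. The two component equations $c_1 + c_2 = x$ and $c_1 - c_2 = y$ give $c_1 = (x+y)/2$ and $c_2 = (x-y)/2$, so every vector has exactly one pair of coefficients. The sketch below encodes that solution for this specific basis only; `coordinates_in_diagonal_basis` is a hypothetical name.

```python
def coordinates_in_diagonal_basis(v):
    """Coefficients (c1, c2) such that c1*(1, 1) + c2*(1, -1) equals v."""
    x, y = v
    # Adding and subtracting the equations c1 + c2 = x and c1 - c2 = y isolates c1 and c2.
    c1 = (x + y) / 2
    c2 = (x - y) / 2
    return c1, c2


c1, c2 = coordinates_in_diagonal_basis([3, 5])
print(c1, c2)  # 4.0 -1.0

# Rebuild the original vector from its coordinates to confirm the result.
rebuilt = [c1 * 1 + c2 * 1, c1 * 1 + c2 * (-1)]
print(rebuilt)  # [3.0, 5.0]
```

The same vector $(3, 5)$ has coordinates $(3, 5)$ in the standard basis and $(4, -1)$ in this one: the vector is unchanged, only its description differs.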

Definition — Dimension

The dimension of a vector space $V$, written $\dim(V)$, is the number of vectors in any basis of $V$. It is always the same regardless of which basis you choose.

$$\dim(\mathbb{R}^n) = n$$

A line through the origin has $\dim = 1$. A plane through the origin has $\dim = 2$. In machine learning, a dataset with 500 features lives in $\mathbb{R}^{500}$, with $\dim = 500$, until PCA reduces it.


09 · Exercises

EXERCISE 1.1

Compute $\mathbf{u} + 3\mathbf{v}$ where $\mathbf{u}=\begin{pmatrix}2\\-1\\4\end{pmatrix}$ and $\mathbf{v}=\begin{pmatrix}1\\2\\-3\end{pmatrix}$. Show every component step.

Hint: Scale $\mathbf{v}$ by 3 first (multiply every component by 3), then add component-by-component to $\mathbf{u}$.

Solution: Compute $3\mathbf{v} = 3\begin{pmatrix}1\\2\\-3\end{pmatrix} = \begin{pmatrix}3\\6\\-9\end{pmatrix}$.

Then add component-by-component: $\mathbf{u}+3\mathbf{v} = \begin{pmatrix}2+3\\-1+6\\4+(-9)\end{pmatrix} = \begin{pmatrix}5\\5\\-5\end{pmatrix}$.

EXERCISE 1.2

Find the dot product $\mathbf{u}\cdot\mathbf{v}$ for $\mathbf{u}=\begin{pmatrix}2\\-1\\4\end{pmatrix}$ and $\mathbf{v}=\begin{pmatrix}1\\2\\-3\end{pmatrix}$. Is the angle between them acute, right, or obtuse?

Hint: Multiply corresponding components and sum. The sign of the result tells you the angle type: positive means acute, zero means right angle, negative means obtuse.

Solution: Compute each product and sum: $2(1) + (-1)(2) + 4(-3) = 2 - 2 - 12 = -12$.

The dot product is $-12$. Since it is negative, $\cos\theta < 0$, which means $\theta > 90°$: the angle is obtuse.

EXERCISE 1.3

Compute $\|\mathbf{w}\|$ for $\mathbf{w}=\begin{pmatrix}0\\-5\\12\end{pmatrix}$, then find the unit vector $\hat{\mathbf{w}}$.

Hint: Square each component, sum the squares, then take the square root to get the norm. Then divide every component by the norm to obtain the unit vector.

Solution: Square each component and sum: $0^2 + (-5)^2 + 12^2 = 0 + 25 + 144 = 169$.

Take the square root: $\|\mathbf{w}\| = \sqrt{169} = 13$.

Divide every component by 13: $\hat{\mathbf{w}} = \frac{1}{13}\begin{pmatrix}0\\-5\\12\end{pmatrix} = \begin{pmatrix}0\\-5/13\\12/13\end{pmatrix}$.

EXERCISE 1.4

Is $\begin{pmatrix}4\\-2\end{pmatrix}$ in $\text{span}\left\{\begin{pmatrix}2\\-1\end{pmatrix}\right\}$? Show why.

Hint: You need a scalar $c$ such that $c\begin{pmatrix}2\\-1\end{pmatrix}=\begin{pmatrix}4\\-2\end{pmatrix}$. Try solving for $c$ using the first component, then verify the second component is also satisfied.

Solution: From the first component: $2c = 4$, so $c = 2$.

Verify with the second component: $(-1)(2) = -2$. That matches the second component of $\begin{pmatrix}4\\-2\end{pmatrix}$.

Yes, $\begin{pmatrix}4\\-2\end{pmatrix}$ is in $\text{span}\left\{\begin{pmatrix}2\\-1\end{pmatrix}\right\}$ with $c = 2$.

EXERCISE 1.5

Are $\mathbf{v}_1=\begin{pmatrix}1\\0\\1\end{pmatrix}$, $\mathbf{v}_2=\begin{pmatrix}0\\1\\1\end{pmatrix}$, $\mathbf{v}_3=\begin{pmatrix}1\\1\\2\end{pmatrix}$ linearly independent?

Hint: Set up $c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3 = \mathbf{0}$ and check if a non-zero solution exists. Notice whether $\mathbf{v}_3 = \mathbf{v}_1 + \mathbf{v}_2$.

Solution: Observe that $\mathbf{v}_3 = \begin{pmatrix}1\\1\\2\end{pmatrix} = \begin{pmatrix}1\\0\\1\end{pmatrix} + \begin{pmatrix}0\\1\\1\end{pmatrix} = \mathbf{v}_1 + \mathbf{v}_2$.

Therefore $1 \cdot \mathbf{v}_1 + 1 \cdot \mathbf{v}_2 + (-1) \cdot \mathbf{v}_3 = \mathbf{0}$ is a non-zero solution ($c_1=1, c_2=1, c_3=-1$).

Linearly dependent. $\mathbf{v}_3$ is redundant: it is exactly the sum of the other two vectors.

EXERCISE 1.6

A bond portfolio has positions in three assets with weights $\mathbf{w}=\begin{pmatrix}0.4\\0.3\\0.3\end{pmatrix}$ and expected returns $\mathbf{r}=\begin{pmatrix}0.08\\0.05\\0.12\end{pmatrix}$. Compute the portfolio expected return $\mathbf{w}\cdot\mathbf{r}$ using the dot product, and verify that the weights form a valid portfolio ($\mathbf{w}\cdot\mathbf{1}=1$, where $\mathbf{1}$ is the vector whose components are all 1).

Hint: The dot product $\mathbf{w}\cdot\mathbf{r} = \sum w_i r_i$ is a weighted average of the returns. To verify the weights are valid, check that $\mathbf{w}\cdot\mathbf{1} = 0.4 + 0.3 + 0.3 = 1$.

Solution: Compute each product and sum: $0.4(0.08) + 0.3(0.05) + 0.3(0.12) = 0.032 + 0.015 + 0.036 = 0.083$.

Portfolio expected return = 8.3%.

Weights check: $0.4 + 0.3 + 0.3 = 1.0$, so the portfolio is fully invested with no leverage.
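The arithmetic in Exercise 1.6 can be double-checked with a few lines of Python. This is a minimal sketch: the variable names are illustrative, and `ones` stands for the all-ones vector $\mathbf{1}$. Comparisons use rounding or `math.isclose` because decimal fractions such as 0.3 are not stored exactly in floating point.

```python
import math


def dot(u, v):
    """Sum of products of corresponding components."""
    return sum(ui * vi for ui, vi in zip(u, v))


weights = [0.4, 0.3, 0.3]
returns = [0.08, 0.05, 0.12]
ones = [1.0] * len(weights)  # the all-ones vector

portfolio_return = dot(weights, returns)
weights_total = dot(weights, ones)

print(round(portfolio_return, 4))        # 0.083, i.e. 8.3%
print(math.isclose(weights_total, 1.0))  # True: the weights sum to 1
```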


10 · Chapter Summary

| Concept | Formula / Rule |
| --- | --- |
| Vector in $\mathbb{R}^n$ | Ordered list of $n$ reals; bold $\mathbf{v}$ |
| Addition | Component-wise; same dimension required |
| Scalar mult. | Multiply every component by $c$ |
| Dot product | $\mathbf{u}\cdot\mathbf{v}=\sum u_i v_i$; zero means orthogonal |
| Norm | $\lVert\mathbf{v}\rVert=\sqrt{\sum v_i^2}$; length of arrow |
| Unit vector | $\hat{\mathbf{v}}=\mathbf{v}/\lVert\mathbf{v}\rVert$; norm = 1 |
| Linear combination | $\sum c_i\mathbf{v}_i$; the core building block |
| Span | All vectors reachable by linear combinations |
| Vector space | Closed under + and scalar mult.; 8 axioms |
| Subspace test | Contains $\mathbf{0}$; closed under + and scalar mult. |
| Linear independence | Only zero solution to $\sum c_i\mathbf{v}_i=\mathbf{0}$ |
| Basis | Independent set that spans the space |
| Dimension | Number of vectors in any basis |

Next: Chapter 02 — Matrix Operations extends vectors into rectangular grids of numbers, defining addition, scalar multiplication, and the powerful (but non-commutative) matrix product.