Chapter 1 — Vector Spaces
following Axler’s Linear Algebra Done Right
1A. ℝⁿ and ℂⁿ
why complex numbers first?
Axler starts with ℂ immediately, which is a bit unusual: most books do ℝ first and bolt on ℂ at the end. he wants one unified theory, using 𝔽 to mean “either ℝ or ℂ.” the payoff comes in ch5: every operator on a finite-dimensional, nonzero complex vector space has an eigenvalue. that’s false over ℝ: rotation of ℝ² by 90° has no real eigenvalue. so if we built everything over ℝ only, we’d have to redo half of it later.
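a quick sanity check of the rotation claim, as a sketch only. for a 2×2 matrix the eigenvalues are the roots of $t^2 - (\text{trace})\,t + \det$, so a negative discriminant means no real eigenvalue. the helper name `eigenvalues_2x2` is made up for these notes.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Roots of the characteristic polynomial of the matrix [[a, b], [c, d]]."""
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det   # discriminant of t^2 - trace*t + det
    root = cmath.sqrt(disc)          # complex square root, fine when disc < 0
    return (trace + root) / 2, (trace - root) / 2

# rotation by 90 degrees counterclockwise sends (x, y) to (-y, x)
lam1, lam2 = eigenvalues_2x2(0, -1, 1, 0)
print(lam1, lam2)                    # both have nonzero imaginary part
```

over ℂ the two roots are $i$ and $-i$; over ℝ there is nothing to find.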
complex numbers
a complex number is a pair $(a, b)$ with $a, b \in \mathbb{R}$, written $a + bi$ where $i^2 = -1$.
\[\text{addition: } (a+bi) + (c+di) = (a+c) + (b+d)i\]
\[\text{multiplication: } (a+bi)(c+di) = (ac - bd) + (ad + bc)i\]
just expand normally and replace $i^2$ with $-1$ wherever it shows up.
multiplicative inverse of $\alpha = a + bi \neq 0$:
\[\alpha^{-1} = \frac{a - bi}{a^2 + b^2}\]
(check: $(a+bi)(a-bi) = a^2 + b^2$, which is nonzero whenever $\alpha \neq 0$.) so every nonzero element of ℂ has an inverse, and ℂ is a field. so is ℝ. that’s why axler uses 𝔽: everything in this chapter works the same way for both.
[!note] I was confused about whether $i^2 = -1$ “breaks” arithmetic. the rigorous version is just ordered pairs $(a,b)$ with a specific multiplication law — $(0,1)^2 = (-1,0)$. no magic, just a definition.
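to see the "no magic" point concretely, here is a minimal sketch that treats complex numbers as plain pairs and uses only the formulas above. the function names `add`, `mul`, `inv` are my own, and I use integer inputs with `Fraction` so the inverse check is exact.

```python
from fractions import Fraction

def add(z, w):
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def mul(z, w):
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def inv(z):
    a, b = z
    n = a * a + b * b                  # zero only when z is (0, 0)
    if n == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return (Fraction(a, n), Fraction(-b, n))

i = (0, 1)
print(mul(i, i))                       # (-1, 0)

z = (3, 4)
print(mul(z, inv(z)) == (1, 0))        # True
```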
lists
a list of length $n$ is an ordered collection $(x_1, \ldots, x_n)$. order matters, length matters.
$(1, 2) \neq (2, 1)$ and $(1, 2) \neq (1, 2, 0)$
different from a set, where order and repetition don’t matter. lists are the right tool here because coordinates have meaning: slot 1 is $x$, slot 2 is $y$, etc.
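python’s tuples and sets happen to mirror this distinction, so here is a tiny illustration (the variable names are just for this note):

```python
p = (1, 2)                     # a tuple: order and length both matter
print(p == (2, 1))             # False
print(p == (1, 2, 0))          # False

s = {1, 2}                     # a set: order and repetition are ignored
print(s == {2, 1})             # True
print(s == {1, 2, 2})          # True
```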
𝔽ⁿ
\[\mathbb{F}^n = \{(x_1, \ldots, x_n) : x_j \in \mathbb{F}\}\]
addition is componentwise:
\[(x_1,\ldots,x_n) + (y_1,\ldots,y_n) = (x_1+y_1,\ldots,x_n+y_n)\]
scalar multiplication: $\lambda(x_1,\ldots,x_n) = (\lambda x_1,\ldots,\lambda x_n)$
zero vector: $\mathbf{0} = (0,\ldots,0)$
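a minimal sketch of these operations, with vectors as tuples and python’s built-in `complex` standing in for ℂ. the names `vec_add`, `scalar_mul`, `zero` are mine, not axler’s.

```python
def vec_add(x, y):
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return tuple(a + b for a, b in zip(x, y))

def scalar_mul(lam, x):
    return tuple(lam * a for a in x)

def zero(n):
    return (0,) * n

x = (1 + 2j, 3)                 # a vector in C^2
y = (1j, -3)
print(vec_add(x, y))
print(scalar_mul(1j, x))
print(vec_add(x, zero(2)) == x) # True
```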
digression on fields
axler uses 𝔽 throughout to mean ℝ or ℂ. he doesn’t define “field” abstractly — just need to know both have $+$, $\times$, all the usual rules, identities 0 and 1, additive inverses, and multiplicative inverses for nonzero elements. writing every theorem twice would be insane.
1B. Definition of Vector Space
the motivation
$\mathbb{F}^n$, functions $[0,1] \to \mathbb{R}$, polynomials — all have addition and scalar multiplication satisfying the same properties. abstract the common structure. prove theorems once, they apply everywhere.
definition
a vector space over 𝔽 is a set $V$ with addition $V \times V \to V$ and scalar multiplication $\mathbb{F} \times V \to V$ satisfying:
| property | statement |
|---|---|
| commutativity | $u + v = v + u$ |
| associativity (+) | $(u+v)+w = u+(v+w)$ |
| additive identity | $\exists\, \mathbf{0} \in V$ s.t. $v + \mathbf{0} = v$ |
| additive inverse | $\forall\, v,\; \exists\, w$ s.t. $v + w = \mathbf{0}$ |
| multiplicative identity | $1v = v$ |
| associativity (·) | $(ab)v = a(bv)$ |
| distributivity | $\lambda(u+v) = \lambda u + \lambda v$ |
| distributivity | $(a+b)v = av + bv$ |
elements of $V$ are called vectors. $\mathbf{0}$ is the zero vector, not the scalar 0.
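a spot check of the eight properties on a handful of vectors, with exact rational arithmetic. this is only a sketch: it tests a finite sample, so it can catch a mistake but it proves nothing. the helper names and the sample values are made up.

```python
from fractions import Fraction
from itertools import product

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def smul(lam, v):
    return tuple(lam * a for a in v)

ZERO = (Fraction(0), Fraction(0))
scalars = [Fraction(-2), Fraction(0), Fraction(1, 3), Fraction(5)]
vectors = list(product(scalars, repeat=2))     # 16 sample vectors with 2 slots

def axioms_hold():
    for u, v, w in product(vectors, repeat=3):
        if add(u, v) != add(v, u):                       # commutativity
            return False
        if add(add(u, v), w) != add(u, add(v, w)):       # associativity (+)
            return False
    for v in vectors:
        if add(v, ZERO) != v:                            # additive identity
            return False
        if add(v, smul(Fraction(-1), v)) != ZERO:        # an additive inverse exists
            return False
        if smul(Fraction(1), v) != v:                    # multiplicative identity
            return False
    for a, b in product(scalars, repeat=2):
        for u, v in product(vectors, repeat=2):
            if smul(a * b, v) != smul(a, smul(b, v)):    # associativity (scalars)
                return False
            if smul(a, add(u, v)) != add(smul(a, u), smul(a, v)):
                return False
            if smul(a + b, v) != add(smul(a, v), smul(b, v)):
                return False
    return True

print(axioms_hold())                                     # True
```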
examples
| space | vectors are |
|---|---|
| $\mathbb{F}^n$ | $n$-tuples |
| $\mathbb{F}^\infty$ | infinite sequences |
| $\mathcal{P}(\mathbb{F})$ | polynomials with coefficients in 𝔽 |
| $\mathcal{P}_m(\mathbb{F})$ | polynomials of degree $\leq m$ |
| $\mathbb{F}^S$ | functions $S \to \mathbb{F}$, for any set $S$ |
that last one is huge: for any set $S$, all functions $S \to \mathbb{F}$ form a vector space. when $S = \{1,\ldots,n\}$ you just get $\mathbb{F}^n$ back.
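one way to make $\mathbb{F}^S$ concrete for a finite $S$: store a function as a dict keyed by the elements of $S$ and define the operations pointwise. a sketch under that assumption; `fn_add` and `fn_smul` are names I picked.

```python
def fn_add(f, g):
    """Pointwise sum of two functions S -> F stored as dicts keyed by S."""
    if f.keys() != g.keys():
        raise ValueError("functions must share the same domain S")
    return {s: f[s] + g[s] for s in f}

def fn_smul(lam, f):
    return {s: lam * f[s] for s in f}

# S = {1, 2, 3}: a function S -> R carries the same data as a list in R^3
f = {1: 2.0, 2: 0.0, 3: -1.0}
g = {1: 1.0, 2: 5.0, 3: 4.0}
h = fn_add(f, fn_smul(2, g))
print(tuple(h[j] for j in (1, 2, 3)))   # (4.0, 10.0, 7.0)

# S does not have to be a set of numbers
p = {"red": 1, "green": 0}
q = {"red": 3, "green": 7}
print(fn_add(p, q))
```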
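[!tip] the zero vector is unique. if $\mathbf{0}$ and $\mathbf{0}'$ are both additive identities: $\mathbf{0}' = \mathbf{0}' + \mathbf{0} = \mathbf{0} + \mathbf{0}' = \mathbf{0}$ (first step: $\mathbf{0}$ is an identity; second: commutativity; third: $\mathbf{0}'$ is an identity). a similar argument shows additive inverses are unique too.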
basic consequences (just from the axioms)
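1. $0 \cdot v = \mathbf{0}$
\[0 \cdot v = (0+0) \cdot v = 0 \cdot v + 0 \cdot v\]
add the additive inverse of $0 \cdot v$ to both sides to get $\mathbf{0} = 0 \cdot v$.
2. $\lambda \cdot \mathbf{0} = \mathbf{0}$
\[\lambda \cdot \mathbf{0} = \lambda(\mathbf{0}+\mathbf{0}) = \lambda \cdot \mathbf{0} + \lambda \cdot \mathbf{0}\]
then add the additive inverse of $\lambda \cdot \mathbf{0}$ to both sides, same as in 1.
3. $(-1)v = -v$
\[v + (-1)v = 1 \cdot v + (-1)v = (1 + (-1))v = 0 \cdot v = \mathbf{0}\]
so $(-1)v$ is the additive inverse of $v$, and by uniqueness $(-1)v = -v$.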
no coordinates anywhere in those proofs — that’s the whole point of the axiomatic approach.
1C. Subspaces
the idea
$U \subseteq V$ is a subspace if $U$ is itself a vector space (same operations). don’t need to check all 8 axioms — most are inherited from $V$. only check:
- $\mathbf{0} \in U$
- $u, w \in U \implies u + w \in U$
- $\lambda \in \mathbb{F},\; u \in U \implies \lambda u \in U$
[!note] condition 3 with $\lambda = 0$ gives $0 \cdot u = \mathbf{0} \in U$ for any $u \in U$, so if $U$ is nonempty, condition 1 follows from condition 3 alone. what condition 1 really does is rule out the empty set, which satisfies the other two conditions vacuously.
examples in ℝ³
| subset | subspace? | reason |
|---|---|---|
| $\{(x_1,x_2,x_3) : x_1 = 0\}$ | ✓ | plane through origin |
| $\{(x_1,x_2,x_3) : x_1 = 1\}$ | ✗ | doesn’t contain $\mathbf{0}$ |
| $\{(x_1,x_2,x_3) : x_1 x_2 = 0\}$ | ✗ | $(1,0,0)+(0,1,0)=(1,1,0)$ not in set |
| $\{(x_1,x_2,0) : x_1, x_2 \in \mathbb{R}\}$ | ✓ | the $xy$-plane |
| $\{\mathbf{0}\}$ | ✓ | trivial subspace |
| $V$ itself | ✓ | always |
geometric intuition: subspaces of ℝ³ are exactly $\{\mathbf{0}\}$, lines through the origin, planes through the origin, and ℝ³. anything that doesn’t pass through the origin fails condition 1 immediately.
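the three-condition check can be turned into a counterexample hunt: describe a subset of ℝ³ by a membership test and try the conditions on sample vectors. this is a sketch with made-up names and integer samples. a result of `None` only means no counterexample turned up among the samples, not that the subset is a subspace.

```python
from itertools import product

def find_failure(in_U, samples, scalars):
    """Return a description of the first violated condition, or None."""
    if not in_U((0, 0, 0)):
        return "zero vector missing"
    members = [v for v in samples if in_U(v)]
    for u, w in product(members, repeat=2):
        s = tuple(a + b for a, b in zip(u, w))
        if not in_U(s):
            return f"not closed under addition: {u} + {w} = {s}"
    for lam, u in product(scalars, members):
        m = tuple(lam * a for a in u)
        if not in_U(m):
            return f"not closed under scalar multiplication: {lam} * {u} = {m}"
    return None

samples = list(product([-1, 0, 1, 2], repeat=3))
scalars = [-3, 0, 2]

print(find_failure(lambda v: v[0] == 0, samples, scalars))          # None
print(find_failure(lambda v: v[0] == 1, samples, scalars))          # zero vector missing
print(find_failure(lambda v: v[0] * v[1] == 0, samples, scalars))   # reports a failing sum
```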
sums of subspaces
given subspaces $U_1, \ldots, U_m$ of $V$:
\[U_1 + \cdots + U_m = \{u_1 + \cdots + u_m : u_j \in U_j\}\]
theorem: $U_1 + \cdots + U_m$ is a subspace, and it is the smallest subspace containing all of $U_1, \ldots, U_m$.
proof sketch:
- zero: $\mathbf{0} = \mathbf{0} + \cdots + \mathbf{0}$, each piece in $U_j$ ✓
- closed under $+$: $(u_1+\cdots+u_m)+(v_1+\cdots+v_m) = (u_1+v_1)+\cdots+(u_m+v_m)$, and $u_j+v_j \in U_j$ ✓
- closed under scalar mult: $\lambda(u_1+\cdots+u_m) = \lambda u_1+\cdots+\lambda u_m$, and $\lambda u_j \in U_j$ ✓
example in ℝ³: $U$ = $x$-axis, $W$ = $y$-axis $\implies U + W$ = the entire $xy$-plane.
[!important] $U + W \neq U \cup W$ in general. the union of two distinct lines through the origin is NOT a subspace: it is not closed under addition. the sum is. to combine subspaces, use the sum, not the union.
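the $x$-axis / $y$-axis example in ℝ³, written as membership tests (names are mine; a sketch, nothing more):

```python
def in_x_axis(v):
    return v[1] == 0 and v[2] == 0

def in_y_axis(v):
    return v[0] == 0 and v[2] == 0

def in_union(v):
    return in_x_axis(v) or in_y_axis(v)

def in_sum(v):
    # v = (a, 0, 0) + (0, b, 0) for some a, b exactly when the third slot is 0
    return v[2] == 0

u, w = (1, 0, 0), (0, 1, 0)
s = tuple(a + b for a, b in zip(u, w))

print(in_union(u), in_union(w))   # True True
print(in_union(s))                # False
print(in_sum(s))                  # True
```

both $u$ and $w$ sit in the union, but $u + w = (1,1,0)$ escapes it while staying inside the sum.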
direct sums
sometimes the sum has the extra property that every vector in it can be written uniquely as $u_1 + \cdots + u_m$ with $u_j \in U_j$. then the sum is called a direct sum, and if it is all of $V$ we write
\[V = U_1 \oplus \cdots \oplus U_m\]
uniqueness is the whole point: without it you can’t treat $U_1,\ldots,U_m$ as “independent coordinates.”
theorem (two subspaces): $V = U \oplus W \iff$ both of the following hold:
- $V = U + W$
- $U \cap W = \{\mathbf{0}\}$
proof ($\Rightarrow$): $V = U + W$ holds by definition of direct sum. for the intersection, suppose $v \in U \cap W$. then $v = v + \mathbf{0}$ (with $v \in U$, $\mathbf{0} \in W$) and $v = \mathbf{0} + v$ (with $\mathbf{0} \in U$, $v \in W$). two valid decompositions, so by uniqueness $v = \mathbf{0}$.
proof ($\Leftarrow$): every $v \in V$ has at least one decomposition because $V = U + W$. for uniqueness, suppose $v = u + w = u' + w'$ with $u, u' \in U$ and $w, w' \in W$. then $u - u' = w' - w$. left side $\in U$, right side $\in W$, so both $\in U \cap W = \{\mathbf{0}\}$. therefore $u = u'$ and $w = w'$. $\square$
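a small worked case, chosen by me and not taken from the book: in ℝ² let $U$ be the $x$-axis and $W$ the diagonal $\{(x,x)\}$. they meet only at the origin and together they span ℝ², so the theorem says every vector splits in exactly one way. the sketch below computes that split; `decompose` is a made-up name.

```python
def decompose(v):
    """Split v in R^2 as u + w with u on the x-axis and w on the diagonal."""
    a, b = v
    w = (b, b)         # forced: u has second slot 0, so w carries all of b
    u = (a - b, 0)     # whatever is left over lands on the x-axis
    return u, w

u, w = decompose((5, 2))
print(u, w)                                        # (3, 0) (2, 2)
print(tuple(p + q for p, q in zip(u, w)))          # (5, 2)
```

the comments show why there was no choice at any step, which is the uniqueness in the theorem.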
[!warning] $U \cap W = \{\mathbf{0}\}$ is enough to make the sum of two subspaces direct. for three or more, pairwise intersections all being $\{\mathbf{0}\}$ is NOT enough for a direct sum.
counterexample in ℝ²: let $U_1$ = $x$-axis, $U_2$ = $y$-axis, $U_3 = \{(x,x) : x \in \mathbb{R}\}$ (diagonal).
all pairwise intersections are $\{(0,0)\}$, but:
\[(1,1) = (1,0)+(0,1)+(0,0) = (0,0)+(0,0)+(1,1)\]
two different decompositions, so this is not a direct sum.
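checking the counterexample mechanically, as a sketch with made-up helper names: each piece has to lie in its own subspace and the pieces have to add up to the target.

```python
def in_U1(v):
    return v[1] == 0            # x-axis

def in_U2(v):
    return v[0] == 0            # y-axis

def in_U3(v):
    return v[0] == v[1]         # diagonal

def is_decomposition(parts, target):
    u1, u2, u3 = parts
    total = tuple(a + b + c for a, b, c in zip(u1, u2, u3))
    return in_U1(u1) and in_U2(u2) and in_U3(u3) and total == target

first = ((1, 0), (0, 1), (0, 0))
second = ((0, 0), (0, 0), (1, 1))

print(is_decomposition(first, (1, 1)))    # True
print(is_decomposition(second, (1, 1)))   # True
print(first != second)                    # True
```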
summary
the logical flow of ch1:
- fields 𝔽 (ℝ or ℂ) — give us scalars. both from the start, one unified theory.
- 𝔽ⁿ — the prototype. lists of $n$ scalars, componentwise operations.
- vector space — abstracts the structure of 𝔽ⁿ. 8 axioms. covers functions, polynomials, sequences all at once.
- subspace — a subset that’s itself a vector space. 3-condition check.
- sums — generalize union (which breaks). $U_1 + \cdots + U_m$ is the smallest subspace containing all of them.
- direct sum — sum + uniqueness. the independence condition that underpins all of ch2 (bases, dimension).