Chapter 1 — Vector Spaces

[!note] Notes following Axler’s Linear Algebra Done Right. Thanks to Claude.ai for help with terminology, notation, drawings, etc.


1A. ℝⁿ and ℂⁿ

why start with complex numbers?

Most LA books start with ℝⁿ, do everything there, then awkwardly bolt on ℂ at the end. Axler refuses this. He wants one theory that covers both fields simultaneously — so the very first thing he does is make sure we’re comfortable with ℂ.

The payoff comes in Chapter 5: every operator on a complex vector space has an eigenvalue. That’s false over ℝ (rotation by 90° has no real eigenvalue). If we built the whole theory over ℝ only, we’d have to redo half of it later.

complex numbers

A complex number is a pair $(a, b)$ with $a, b \in \mathbb{R}$, written $a + bi$ where $i^2 = -1$.

DEFINITION — COMPLEX ARITHMETIC

Addition: $(a + bi) + (c + di) = (a+c) + (b+d)i$

Multiplication: $(a + bi)(c + di) = (ac - bd) + (ad + bc)i$

These are just the rules you already know, with $i^2 = -1$ applied wherever it appears.

Properties worth noting:

  • $\mathbb{R} \subset \mathbb{C}$: every real number $a$ is the complex number $a + 0i$
  • Commutativity: $\alpha\beta = \beta\alpha$ for all $\alpha, \beta \in \mathbb{C}$
  • Every nonzero complex number has a multiplicative inverse: if $\alpha = a + bi \neq 0$ then $\alpha^{-1} = \frac{a - bi}{a^2 + b^2}$

The last point matters: ℂ is a field. So is ℝ. This is why Axler uses the letter $\mathbb{F}$ — everything works the same for both fields simultaneously.

Why $i^2 = -1$ gives a consistent system: You might worry that declaring $i^2 = -1$ breaks arithmetic. It doesn't — the pair definition $(a,b)$ with the multiplication rule above is the rigorous version. No "imaginary" magic. It's just ordered pairs of reals with a specific multiplication law that happens to make $(0,1)^2 = (-1,0)$.
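The consistency claim is easy to test mechanically; a minimal sketch in Python (the helper names `cmul` and `cinv` are mine, not Axler’s), treating complex numbers purely as ordered pairs of reals:

```python
# Complex arithmetic on ordered pairs, exactly as defined above:
# (a,b)(c,d) = (ac - bd, ad + bc), with no "imaginary" magic anywhere.
def cmul(p, q):
    a, b = p
    c, d = q
    return (a * c - b * d, a * d + b * c)

def cinv(p):
    # alpha^{-1} = (a - bi) / (a^2 + b^2) for alpha = a + bi != 0
    a, b = p
    n = a * a + b * b
    return (a / n, -b / n)

i = (0, 1)
print(cmul(i, i))                # (-1, 0): the pair (0,1) squares to -1
alpha = (3.0, 4.0)               # 3 + 4i
print(cmul(alpha, cinv(alpha)))  # approximately (1.0, 0.0)
```

Comparing against Python’s built-in `complex` type, `(1 + 2j) * (3 + 4j)` gives the same answer as `cmul((1, 2), (3, 4))`, which is one way to convince yourself the pair definition really is ordinary complex arithmetic.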

lists

A list of length $n$ is an ordered collection $(x_1, \ldots, x_n)$. Order matters. $(1,2) \neq (2,1)$. Length matters. $(1,2) \neq (1,2,0)$.

This is different from a set where order and repetition don’t matter. Lists are the right notion here because coordinates have meaning — the first slot is the $x$-component, the second is $y$, etc.

$\mathbb{F}^n$

DEFINITION — $\mathbb{F}^n$

$\mathbb{F}^n$ is the set of all lists of length $n$ with entries in $\mathbb{F}$:

\[\mathbb{F}^n = \{(x_1, \ldots, x_n) : x_j \in \mathbb{F} \text{ for each } j = 1,\ldots,n\}\]

Addition is componentwise: $(x_1,\ldots,x_n) + (y_1,\ldots,y_n) = (x_1+y_1,\ldots,x_n+y_n)$

The zero vector is $\mathbf{0} = (0,\ldots,0)$.

Scalar multiplication: for $\lambda \in \mathbb{F}$ and $(x_1,\ldots,x_n) \in \mathbb{F}^n$:

\[\lambda(x_1,\ldots,x_n) = (\lambda x_1,\ldots,\lambda x_n)\]
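The componentwise definitions translate directly to code; a small sketch (tuples stand in for lists in $\mathbb{F}^n$, and the names `vadd` and `smul` are mine):

```python
def vadd(x, y):
    # componentwise addition in F^n
    return tuple(xj + yj for xj, yj in zip(x, y))

def smul(lam, x):
    # scalar multiplication: multiply every entry by lambda
    return tuple(lam * xj for xj in x)

x = (1, 2, 3)
y = (4, 5, 6)
print(vadd(x, y))  # (5, 7, 9)
print(smul(2, x))  # (2, 4, 6)
```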

INTERACTIVE — VECTOR ADDITION AND SCALAR MULTIPLICATION IN ℝ²

digression on fields

Axler uses $\mathbb{F}$ throughout to mean “either $\mathbb{R}$ or $\mathbb{C}$.” He doesn’t actually define the abstract notion of a field — you just need to know both have +, ×, commutativity, associativity, distributivity, identities 0 and 1, additive inverses, and multiplicative inverses for nonzero elements.

The reason for $\mathbb{F}$: writing every theorem twice (once for ℝ, once for ℂ) would be insane. Writing it once with $\mathbb{F}$ is cleaner and more honest — the field structure is the only thing that actually gets used.


1B. Definition of Vector Space

the motivation

$\mathbb{F}^n$ has addition and scalar multiplication satisfying a bunch of nice properties. Functions $f : [0,1] \to \mathbb{R}$ also have addition and scalar multiplication satisfying the same properties. Polynomials too.

Rather than proving theorems separately for each, Axler abstracts the common structure. A vector space is any set where addition and scalar multiplication satisfy those same properties — and every theorem proved for abstract vector spaces applies to all these cases at once.

DEFINITION — VECTOR SPACE

A vector space over $\mathbb{F}$ is a set $V$ equipped with:

  • an addition: $V \times V \to V$, written $(u, v) \mapsto u + v$
  • a scalar multiplication: $\mathbb{F} \times V \to V$, written $(\lambda, v) \mapsto \lambda v$

satisfying all of the following:

| Property | Statement |
| --- | --- |
| commutativity | $u + v = v + u$ |
| associativity (add) | $(u+v)+w = u+(v+w)$ |
| additive identity | $\exists\, \mathbf{0} \in V$ s.t. $v + \mathbf{0} = v$ for all $v$ |
| additive inverse | $\forall\, v \in V,\ \exists\, w \in V$ s.t. $v + w = \mathbf{0}$ |
| multiplicative identity | $1v = v$ for all $v$ |
| associativity (scalar mult) | $(ab)v = a(bv)$ |
| distributivity (vector) | $\lambda(u+v) = \lambda u + \lambda v$ |
| distributivity (scalar) | $(a+b)v = av + bv$ |

Elements of $V$ are called vectors. The $\mathbf{0}$ is the zero vector (not the scalar 0).

examples — the zoo of vector spaces

| Space | Vectors are | Addition | Scalar mult |
| --- | --- | --- | --- |
| $\mathbb{F}^n$ | $n$-tuples | componentwise | componentwise |
| $\mathbb{F}^\infty$ | infinite sequences | componentwise | componentwise |
| $\mathcal{P}(\mathbb{F})$ | polynomials with coefficients in $\mathbb{F}$ | polynomial addition | scalar × poly |
| $\mathcal{P}_m(\mathbb{F})$ | polynomials of degree $\leq m$ | same | same |
| $\mathbb{F}^S$ | functions $S \to \mathbb{F}$ | $(f+g)(x)=f(x)+g(x)$ | $(\lambda f)(x)=\lambda f(x)$ |

The last one is huge: for any set $S$, all functions $S \to \mathbb{F}$ form a vector space. When $S = \{1,\ldots,n\}$ you recover $\mathbb{F}^n$ (a function on a finite index set is just a list).

The zero vector is unique. Suppose $\mathbf{0}$ and $\mathbf{0}'$ are both additive identities. Then $\mathbf{0} = \mathbf{0} + \mathbf{0}' = \mathbf{0}'$. So there can only be one. Similarly, additive inverses are unique.

immediate consequences of the definition

These follow purely from the axioms — no extra assumptions:

THEOREM — BASIC CONSEQUENCES

For any vector space $V$ and $v \in V$, $\lambda \in \mathbb{F}$:

  1. $0 \cdot v = \mathbf{0}$ (the scalar zero times anything is the zero vector)
  2. $\lambda \cdot \mathbf{0} = \mathbf{0}$ (any scalar times the zero vector is zero)
  3. $(-1)v = -v$ (minus-one times $v$ is the additive inverse of $v$)

PROOF OF (1)

$0 \cdot v = (0+0) \cdot v = 0 \cdot v + 0 \cdot v$ [distributivity]. Adding the additive inverse of $0 \cdot v$ to both sides gives $\mathbf{0} = 0 \cdot v$. ∎

PROOF OF (3)

$v + (-1)v = 1 \cdot v + (-1)v = (1 + (-1))v = 0 \cdot v = \mathbf{0}$, so $(-1)v$ satisfies the definition of $-v$ (the additive inverse of $v$). By uniqueness of additive inverses, $(-1)v = -v$. ∎

Notice proof (1) is slick: we used only distributivity and the existence of additive inverses (to cancel $0 \cdot v$). No coordinates anywhere. That’s the point of the axiomatic approach.

INTERACTIVE — EXPLORE VECTOR SPACE AXIOMS: pick one and see a geometric/algebraic illustration


1C. Subspaces

the idea

We have a big vector space $V$. A subspace is a subset $U \subseteq V$ that is itself a vector space (using the same addition and scalar multiplication from $V$).

We could check all 8 axioms, but most of them come for free from $V$ — commutativity, associativity, and distributivity all hold in $U$ because they hold in $V$ and $U$’s elements are $V$’s elements. The only things we actually need to verify:

DEFINITION — SUBSPACE (3-CONDITION TEST)

A subset $U \subseteq V$ is a subspace of $V$ if and only if:

  1. Additive identity: $\mathbf{0} \in U$
  2. Closed under addition: $u, w \in U \Rightarrow u + w \in U$
  3. Closed under scalar multiplication: $\lambda \in \mathbb{F},\, u \in U \Rightarrow \lambda u \in U$

Condition 1 ensures $U$ is nonempty and has an identity. Conditions 2 and 3 ensure the operations stay inside $U$ (they don’t “escape” to the rest of $V$).

Why not just 2 and 3? Condition 3 with $\lambda = 0$ gives $0 \cdot u = \mathbf{0} \in U$ for any $u \in U$ — so if $U$ is nonempty, condition 1 follows from condition 3 alone. Axler still lists it separately for clarity.
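The three conditions lend themselves to a numerical spot check; a sketch for subsets of $\mathbb{R}^3$ (all names here are mine, and random sampling can only refute a closure condition, never prove it):

```python
import random

def spot_check_subspace(sample, member, trials=200):
    # Spot-check the 3-condition test for a subset U of R^3:
    # `sample()` draws elements of U, `member(v)` tests membership in U.
    if not member((0.0, 0.0, 0.0)):                          # condition 1
        return False
    for _ in range(trials):
        u, w = sample(), sample()
        lam = random.uniform(-5, 5)
        if not member(tuple(a + b for a, b in zip(u, w))):   # condition 2
            return False
        if not member(tuple(lam * a for a in u)):            # condition 3
            return False
    return True

# the plane {x1 = 0} passes all three conditions...
plane = lambda: (0.0, random.uniform(-5, 5), random.uniform(-5, 5))
print(spot_check_subspace(plane, lambda v: v[0] == 0))    # True

# ...but the shifted plane {x1 = 1} fails condition 1 immediately
shifted = lambda: (1.0, random.uniform(-5, 5), random.uniform(-5, 5))
print(spot_check_subspace(shifted, lambda v: v[0] == 1))  # False
```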

examples

| Subset of $\mathbb{R}^3$ | Subspace? | Why |
| --- | --- | --- |
| $\{(x_1, x_2, x_3) : x_1 = 0\}$ | YES | plane through the origin, closed under both ops |
| $\{(x_1, x_2, x_3) : x_1 = 1\}$ | NO | doesn’t contain $\mathbf{0}$ |
| $\{(x_1, x_2, x_3) : x_1 x_2 = 0\}$ (union of two planes) | NO | not closed under addition: $(1,0,0)+(0,1,0)=(1,1,0)$ is not in the set |
| $\{(x_1, x_2, 0) : x_1, x_2 \in \mathbb{R}\}$ | YES | the $xy$-plane |
| $\{\mathbf{0}\}$ | YES | the trivial subspace |
| $V$ itself | YES | every space is a subspace of itself |

Key geometric intuition: subspaces of $\mathbb{R}^3$ are exactly $\{\mathbf{0}\}$, lines through the origin, planes through the origin, and $\mathbb{R}^3$ itself. Any subset that doesn't pass through the origin is automatically disqualified by condition 1.

INTERACTIVE — SUBSPACE CHECK IN ℝ²: define a subset and test all 3 conditions

sums of subspaces

Given subspaces $U_1, U_2, \ldots, U_m$ of $V$, their sum is:

\[U_1 + U_2 + \cdots + U_m = \{u_1 + u_2 + \cdots + u_m : u_j \in U_j\}\]

This is the set of all possible sums where each piece comes from the corresponding subspace.

THEOREM — SUM OF SUBSPACES IS A SUBSPACE

If $U_1, \ldots, U_m$ are subspaces of $V$, then $U_1 + \cdots + U_m$ is a subspace of $V$.

Moreover, it is the smallest subspace of $V$ containing all of $U_1, \ldots, U_m$.

PROOF (sketch)

Zero: $\mathbf{0} = \mathbf{0} + \cdots + \mathbf{0}$, with each piece in the corresponding $U_j$. ✓
Closed under addition: $(u_1 + \cdots + u_m) + (v_1 + \cdots + v_m) = (u_1 + v_1) + \cdots + (u_m + v_m)$, and $u_j + v_j \in U_j$ since $U_j$ is a subspace. ✓
Closed under scalar mult: $\lambda(u_1 + \cdots + u_m) = \lambda u_1 + \cdots + \lambda u_m$, and $\lambda u_j \in U_j$. ✓

Smallest: any subspace containing all of $U_1, \ldots, U_m$ is closed under addition, so it must contain every sum $u_1 + \cdots + u_m$ with $u_j \in U_j$, i.e. it contains $U_1 + \cdots + U_m$. ∎

Example in $\mathbb{R}^3$: let $U = \{(x,0,0) : x \in \mathbb{R}\}$ (the $x$-axis) and $W = \{(0,y,0) : y \in \mathbb{R}\}$ (the $y$-axis). Then $U + W = \{(x,y,0) : x, y \in \mathbb{R}\}$ — the entire $xy$-plane.

Note: $U + W \neq U \cup W$. The union of two lines is not a subspace (not closed under addition). The sum is.
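The union-versus-sum point can be checked concretely; a small sketch over a few integer sample points of each axis (the finite truncation is mine, just for illustration):

```python
# U = x-axis, W = y-axis in R^3, truncated to a few integer points,
# showing that the sum contains vectors the union misses.
U = {(x, 0, 0) for x in range(-2, 3)}
W = {(0, y, 0) for y in range(-2, 3)}

union = U | W
sum_UW = {tuple(a + b for a, b in zip(u, w)) for u in U for w in W}

v = (1, 1, 0)            # = (1,0,0) + (0,1,0)
print(v in union)        # False: the union is not closed under addition
print(v in sum_UW)       # True:  the sum contains it
```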

direct sums

Sometimes a sum $U_1 + \cdots + U_m$ has a special property: every vector in the sum can be written uniquely as $u_1 + \cdots + u_m$ with $u_j \in U_j$.

DEFINITION — DIRECT SUM

$V = U_1 \oplus \cdots \oplus U_m$ (read: “$V$ is the direct sum of $U_1,\ldots,U_m$”) means:

  • $V = U_1 + \cdots + U_m$, AND
  • every vector in $V$ can be written uniquely as $u_1 + \cdots + u_m$ with $u_j \in U_j$

Why uniqueness matters: if the decomposition isn’t unique, we can’t use $U_1,\ldots,U_m$ as “independent coordinates.” Direct sum is the right condition for that independence.

THEOREM — DIRECT SUM CRITERION (two subspaces)

$V = U \oplus W$ if and only if:

  1. $V = U + W$, and
  2. $U \cap W = \{\mathbf{0}\}$

PROOF (⇒)

Suppose $V = U \oplus W$. We know $V = U + W$; we need $U \cap W = \{\mathbf{0}\}$. Let $v \in U \cap W$. Then $v \in U$ and $v \in W$, so

$v = v + \mathbf{0}$ (with $v \in U$ and $\mathbf{0} \in W$)
$v = \mathbf{0} + v$ (with $\mathbf{0} \in U$ and $v \in W$)

Both are valid decompositions of $v$ as (element of $U$) + (element of $W$). By uniqueness, $v = \mathbf{0}$. So $U \cap W = \{\mathbf{0}\}$. ✓

PROOF (⇐)

Suppose $V = U + W$ and $U \cap W = \{\mathbf{0}\}$. We need uniqueness of the decomposition. Suppose $v = u + w = u' + w'$ with $u, u' \in U$ and $w, w' \in W$. Then $u - u' = w' - w$. The left side is in $U$ (since $U$ is a subspace) and the right side is in $W$, so both lie in $U \cap W = \{\mathbf{0}\}$. Therefore $u = u'$ and $w = w'$. Uniqueness proved. ∎

The criterion $U \cap W = \{\mathbf{0}\}$ is only for TWO subspaces. For three or more subspaces $U_1, U_2, U_3$, requiring all pairwise intersections to be $\{\mathbf{0}\}$ is NOT sufficient for a direct sum. You also need, e.g., $U_1 \cap (U_2 + U_3) = \{\mathbf{0}\}$. See the example below.

Counterexample for three subspaces: in $\mathbb{R}^2$, let:

  • $U_1 = \{(x,0) : x \in \mathbb{R}\}$ (the $x$-axis)
  • $U_2 = \{(0,y) : y \in \mathbb{R}\}$ (the $y$-axis)
  • $U_3 = \{(x,x) : x \in \mathbb{R}\}$ (the diagonal)

All three pairwise intersections are $\{(0,0)\}$. But $U_1 + U_2 + U_3$ is not a direct sum: $(1,1) = (1,0) + (0,1) + (0,0) = (0,0) + (0,0) + (1,1)$. Two decompositions of $(1,1)$.
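The two decompositions of $(1,1)$ can be verified directly; a tiny sketch (the helper name is mine):

```python
# Two different triples (u1, u2, u3), with the j-th piece drawn from U_j:
# the first uses the axes, the second uses the diagonal alone.
def vadd3(a, b, c):
    return tuple(x + y + z for x, y, z in zip(a, b, c))

d1 = ((1, 0), (0, 1), (0, 0))   # u1 in U1, u2 in U2, u3 in U3
d2 = ((0, 0), (0, 0), (1, 1))

print(vadd3(*d1))  # (1, 1)
print(vadd3(*d2))  # (1, 1): same vector, a different decomposition
```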

INTERACTIVE — DIRECT SUMS IN ℝ²: visualize unique decomposition


summary — what we built in chapter 1

The logical flow:

  1. Fields $\mathbb{F}$ (ℝ or ℂ) give us scalars. ℂ is included from the start so the theory works cleanly in both.
  2. $\mathbb{F}^n$ is the prototype — lists of $n$ scalars with componentwise operations.
  3. Vector space abstracts the structure of $\mathbb{F}^n$: any set with compatible addition and scalar multiplication satisfying 8 properties. This covers functions, polynomials, sequences, all at once.
  4. Subspace is a subset that is itself a vector space — checked via the 3-condition test.
  5. Sums of subspaces generalize union (which doesn’t work). $U_1 + \cdots + U_m$ is the smallest subspace containing all of them.
  6. Direct sum $U_1 \oplus \cdots \oplus U_m$ adds uniqueness of decomposition — the key independence condition that will underlie most of Chapter 2 (bases, dimension).

