Vectors, Matrices, and Linear Maps

Module 1 of 5 · 20 min read · Level: Easy

Setup

Why linear algebra in quant finance?

Three operations dominate production quant code:

  1. Portfolio aggregation: multiply a weight vector $w \in \mathbb{R}^n$ by a covariance matrix $\Sigma \in \mathbb{R}^{n \times n}$ to get portfolio variance $w^\top \Sigma w$.
  2. System solving: bootstrap a yield curve by solving a linear system $Ax = b$ where $A$ encodes instrument sensitivities to discount factors.
  3. Factor decomposition: decompose a risk matrix to identify which linear combinations of assets drive most of the variance.

All three reduce to linear algebra. This module builds the precise foundations: what a vector space is, what a matrix does geometrically, and what the rank-nullity theorem tells you about the structure of solutions to linear systems.

INSIGHT

Why this matters on a derivatives desk. When bootstrapping a swap curve, the system $Ax = b$ has $A \in \mathbb{R}^{n \times n}$ encoding instrument-to-discount-factor relationships and $b$ containing observed par swap rates. If two instruments are economically equivalent (redundant), $A$ is singular. The rank-nullity theorem tells you exactly by how many degrees the solution is underdetermined, and therefore how many regularisation constraints you need to add to recover uniqueness.

Conventions

  • All vectors are column vectors in $\mathbb{R}^n$ unless stated otherwise.
  • $\mathbb{R}^{m \times n}$ denotes the set of real $m \times n$ matrices.
  • The standard inner product on $\mathbb{R}^n$ is $\langle x, y \rangle = x^\top y = \sum_{i=1}^n x_i y_i$.
  • Notation: $\ker(A) = \{x \in \mathbb{R}^n : Ax = 0\}$ (null space); $\operatorname{im}(A) = \{Ax : x \in \mathbb{R}^n\}$ (column space / image).
  • Superscript $\top$ denotes transpose: $(A^\top)_{ij} = A_{ji}$.

Theory

1. Vector Spaces

DEFINITION

Definition 1.1 (Vector Space). A vector space over $\mathbb{R}$ is a non-empty set $V$ equipped with two operations, addition $V \times V \to V$ and scalar multiplication $\mathbb{R} \times V \to V$, satisfying eight axioms:

  1. (Closure) $u + v \in V$ and $\alpha v \in V$ for all $u, v \in V$, $\alpha \in \mathbb{R}$.
  2. (Commutativity) $u + v = v + u$.
  3. (Associativity) $(u + v) + w = u + (v + w)$ and $(\alpha\beta)v = \alpha(\beta v)$.
  4. (Zero vector) There exists $\mathbf{0} \in V$ with $v + \mathbf{0} = v$ for all $v$.
  5. (Additive inverses) For each $v \in V$ there exists $-v \in V$ with $v + (-v) = \mathbf{0}$.
  6. (Unit scalar) $1 \cdot v = v$.
  7. (Scalar distributivity) $\alpha(u + v) = \alpha u + \alpha v$.
  8. (Vector distributivity) $(\alpha + \beta)v = \alpha v + \beta v$.

The canonical example is $\mathbb{R}^n$: column vectors with componentwise addition and scaling. Other examples relevant to quant finance:

  • $\mathbb{R}^{m \times n}$: the space of $m \times n$ matrices, used as parameter spaces in calibration.
  • $C[0, T]$: continuous functions on $[0, T]$, the natural home for asset price paths.
  • $L^2(\Omega, \mathcal{F}, \mathbb{P})$: square-integrable random variables; the space in which conditional expectations and martingales live (covered in the Probability Theory course).

DEFINITION

Definition 1.2 (Subspace). A non-empty subset $W \subseteq V$ is a subspace if it is closed under addition and scalar multiplication; equivalently, if $\alpha u + \beta v \in W$ for all $u, v \in W$ and $\alpha, \beta \in \mathbb{R}$.

The subspace test: $W$ is a subspace iff (i) $\mathbf{0} \in W$ and (ii) $W$ is closed under linear combinations.

EXAMPLE

Example 1.3 (Dollar-neutral portfolios). Let $V = \mathbb{R}^n$ be the space of portfolio weight vectors. The set $W = \{w \in \mathbb{R}^n : \mathbf{1}^\top w = 0\}$ of dollar-neutral portfolios (zero net exposure) is a subspace of dimension $n - 1$. The sum of two dollar-neutral portfolios remains dollar-neutral; scaling preserves the constraint. This subspace arises in statistical arbitrage and market-neutral strategies.
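The closure and dimension claims in Example 1.3 can be checked numerically. The following is a minimal sketch assuming NumPy; the asset count `n = 5`, the random seed, and the helper name `is_dollar_neutral` are illustrative choices, not taken from the companion notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5  # number of assets (illustrative)
ones = np.ones(n)

def is_dollar_neutral(w, tol=1e-12):
    """True if the weights sum to zero up to rounding."""
    return abs(ones @ w) < tol

# Subtracting the mean weight maps any vector into W = {w : 1^T w = 0}.
u = rng.normal(size=n)
u -= u.mean()
v = rng.normal(size=n)
v -= v.mean()

# Closure: an arbitrary linear combination stays dollar-neutral.
alpha, beta = 2.5, -0.7
combo = alpha * u + beta * v

# Dimension: W is the null space of the 1 x n matrix 1^T,
# so dim W = n - rank(1^T) = n - 1.
dim_W = n - np.linalg.matrix_rank(ones.reshape(1, n))

print(is_dollar_neutral(combo), dim_W)
```

Demeaning is used here only as a convenient way to generate members of $W$; any weights that sum to zero would do.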

2. Linear Spans and Bases

DEFINITION

Definition 2.1 (Linear span). The span of vectors $v_1, \ldots, v_k \in V$ is $$\operatorname{span}(v_1, \ldots, v_k) = \left\{ \sum_{i=1}^k \alpha_i v_i : \alpha_i \in \mathbb{R} \right\}.$$ This is the smallest subspace of $V$ containing all $v_i$.

DEFINITION

Definition 2.2 (Basis and dimension). A basis of $V$ is a set $\{e_1, \ldots, e_n\}$ that is simultaneously:

  • Linearly independent: $\sum_i \alpha_i e_i = \mathbf{0}$ implies $\alpha_i = 0$ for all $i$.
  • Spanning: $\operatorname{span}(e_1, \ldots, e_n) = V$.

The number of elements in any basis is the dimension $\dim V$. This is well-defined: any two bases of a finite-dimensional space have the same cardinality, which is proved via the Steinitz exchange lemma.

REMARK

Coordinate representation. Once a basis $\{e_1, \ldots, e_n\}$ is fixed, every $v \in V$ has a unique representation $v = \sum_i \alpha_i e_i$. The coefficients $(\alpha_1, \ldots, \alpha_n)$ are the coordinates of $v$ in that basis. A change of basis is itself a linear map (an invertible matrix).
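Finding coordinates in a given basis amounts to solving a linear system. The sketch below assumes NumPy and uses a hypothetical basis of $\mathbb{R}^3$ stored as the columns of `E`; the basis and the vector `v` are illustrative.

```python
import numpy as np

# Columns of E are the basis vectors e_1, e_2, e_3 of R^3 (illustrative choice).
E = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
v = np.array([2.0, 3.0, 1.0])

# The coordinates alpha solve E @ alpha = v.
# The solution is unique because E is invertible.
alpha = np.linalg.solve(E, v)   # here alpha = (0, 2, 1)

# Rebuild v from its coordinates as a check.
v_rebuilt = E @ alpha
print(alpha, v_rebuilt)
```

Using `np.linalg.solve` rather than forming the inverse of `E` is the usual numerical choice; both express the same change of basis.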

3. Matrices as Linear Maps

A matrix $A \in \mathbb{R}^{m \times n}$ defines a linear map $T_A : \mathbb{R}^n \to \mathbb{R}^m$ by $T_A(x) = Ax$. Linearity is immediate: $$T_A(\alpha x + \beta y) = A(\alpha x + \beta y) = \alpha Ax + \beta Ay = \alpha T_A(x) + \beta T_A(y).$$

Conversely, every linear map between finite-dimensional spaces has a unique matrix representation once bases are fixed: the $j$-th column of $A$ is the image of the $j$-th basis vector of the domain.
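The column-by-column construction can be made concrete. In this sketch (assuming NumPy), `T` is a hypothetical linear map from $\mathbb{R}^3$ to $\mathbb{R}^2$ written without any matrix, and its matrix is assembled from the images of the standard basis vectors.

```python
import numpy as np

def T(x):
    """A hypothetical linear map R^3 -> R^2, defined without a matrix."""
    return np.array([x[0] + 2.0 * x[1], 3.0 * x[2] - x[0]])

n = 3
# Column j of A is T applied to the j-th standard basis vector
# (the rows of the identity matrix are exactly those vectors).
A = np.column_stack([T(e) for e in np.eye(n)])

# The matrix reproduces the map on an arbitrary input.
x = np.array([1.0, -2.0, 0.5])
print(A)
print(A @ x, T(x))
```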

DEFINITION

Definition 3.1 (Kernel and image). For $A \in \mathbb{R}^{m \times n}$:

  • The null space (kernel): $\ker(A) = \{x \in \mathbb{R}^n : Ax = 0\}$, a subspace of $\mathbb{R}^n$.
  • The column space (image): $\operatorname{im}(A) = \{Ax : x \in \mathbb{R}^n\} = \operatorname{span}(\text{columns of } A)$, a subspace of $\mathbb{R}^m$.

Solvability: the system $Ax = b$ has a solution iff $b \in \operatorname{im}(A)$. It has a unique solution iff additionally $\ker(A) = \{0\}$.
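The solvability criterion translates into two rank comparisons: $b \in \operatorname{im}(A)$ exactly when appending $b$ as an extra column does not raise the rank, and $\ker(A) = \{0\}$ exactly when the rank equals the number of columns. The sketch below assumes NumPy; `classify_system` and the small matrices are illustrative, and `matrix_rank` uses a floating-point tolerance, so this suits small, well-scaled examples only.

```python
import numpy as np

def classify_system(A, b):
    """Classify Ax = b as 'none', 'unique' or 'infinite' using ranks."""
    rank_A = np.linalg.matrix_rank(A)
    rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
    if rank_Ab > rank_A:        # b is not in im(A)
        return "none"
    if rank_A == A.shape[1]:    # ker(A) = {0}
        return "unique"
    return "infinite"

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [0.0, 1.0]])      # full column rank
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])      # rank 1

print(classify_system(A, np.array([1.0, 2.0, 1.0])))  # b = column 2 - column 1
print(classify_system(A, np.array([1.0, 0.0, 0.0])))
print(classify_system(B, np.array([1.0, 2.0])))
```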

THEOREM

Theorem 3.2 (Rank-Nullity). For any $A \in \mathbb{R}^{m \times n}$: $$\underbrace{\dim \operatorname{im}(A)}_{\operatorname{rank}(A)} + \underbrace{\dim \ker(A)}_{\operatorname{nullity}(A)} = n.$$

Proof sketch. Let $k = \dim \ker(A)$. Choose a basis $\{u_1, \ldots, u_k\}$ for $\ker(A)$, then extend it to a basis $\{u_1, \ldots, u_k, v_1, \ldots, v_{n-k}\}$ of $\mathbb{R}^n$. For any $x = \sum_i \beta_i u_i + \sum_j \gamma_j v_j$, we have $Ax = \sum_j \gamma_j Av_j$, so $\{Av_1, \ldots, Av_{n-k}\}$ spans $\operatorname{im}(A)$. The set is also linearly independent: if $\sum_j \gamma_j Av_j = 0$, then $\sum_j \gamma_j v_j \in \ker(A)$ is a combination of the $u_i$, which contradicts the independence of the extended basis unless every $\gamma_j = 0$. Hence $\operatorname{rank}(A) = \dim \operatorname{im}(A) = n - k$, giving $\operatorname{rank}(A) + \operatorname{nullity}(A) = n$. $\square$
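A numerical illustration of the theorem, as a sketch assuming NumPy. It uses the SVD rather than row reduction to obtain the rank and a null-space basis; the $4 \times 6$ shape, the seed, and the relative tolerance `1e-10` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r = 4, 6, 2
# A product of m x r and r x n Gaussian factors has rank r (with probability one).
A = rng.normal(size=(m, r)) @ rng.normal(size=(r, n))

U, s, Vt = np.linalg.svd(A)            # full SVD: Vt is n x n
rank = int((s > 1e-10 * s[0]).sum())   # singular values above a relative tolerance
null_basis = Vt[rank:].T               # remaining right singular vectors span ker(A)
nullity = null_basis.shape[1]

print(rank, nullity, rank + nullity == n)
```

The right singular vectors beyond the rank form an orthonormal basis of $\ker(A)$, so the count of those vectors is the nullity.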

EXAMPLE

Example 3.3 (Redundant swap instruments). Suppose five market instruments each price as a linear combination of six discount factors. The pricing equations are $Ax = b$ with $A \in \mathbb{R}^{5 \times 6}$. By rank-nullity, $\operatorname{rank}(A) + \operatorname{nullity}(A) = 6$. Since $A$ has only 5 rows, $\operatorname{rank}(A) \leq 5$, so $\operatorname{nullity}(A) \geq 1$: whenever the system is consistent, there is at least a one-dimensional family of solutions. The solution is not unique, so additional constraints (e.g., smoothness of the forward curve) are required to select one.
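The one-parameter family of solutions in Example 3.3 can be exhibited directly. This is a sketch assuming NumPy, with a random $5 \times 6$ matrix standing in for the instrument sensitivities (hypothetical numbers, not real market data).

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(5, 6))   # hypothetical sensitivities: 5 instruments, 6 discount factors
b = rng.normal(size=5)        # hypothetical observed prices

# lstsq returns the minimum-norm solution when the system is underdetermined.
x_p, *_ = np.linalg.lstsq(A, b, rcond=None)

# For a rank-5 matrix the last right singular vector spans ker(A).
z = np.linalg.svd(A)[2][-1]

# Moving along the null direction gives a different but equally valid solution.
x_other = x_p + 3.0 * z

print(np.allclose(A @ x_p, b), np.allclose(A @ x_other, b))
```

Any extra constraint, such as a smoothness penalty, acts by picking one point on the line $x_p + t z$.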

4. The Four Fundamental Subspaces

For $A \in \mathbb{R}^{m \times n}$ with $\operatorname{rank}(A) = r$, there are four canonical subspaces:

| Subspace | Lives in | Dimension |
| --- | --- | --- |
| Column space $\operatorname{im}(A)$ | $\mathbb{R}^m$ | $r$ |
| Left null space $\ker(A^\top)$ | $\mathbb{R}^m$ | $m - r$ |
| Row space $\operatorname{im}(A^\top)$ | $\mathbb{R}^n$ | $r$ |
| Null space $\ker(A)$ | $\mathbb{R}^n$ | $n - r$ |

THEOREM

Theorem 4.1 (Orthogonality of fundamental subspaces). $$\operatorname{im}(A) \perp \ker(A^\top) \quad \text{and} \quad \operatorname{im}(A^\top) \perp \ker(A).$$ Furthermore, $\operatorname{im}(A) \oplus \ker(A^\top) = \mathbb{R}^m$ and $\operatorname{im}(A^\top) \oplus \ker(A) = \mathbb{R}^n$ (orthogonal direct sum decompositions).

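Proof. Let $y \in \ker(A^\top)$ and $b = Ax \in \operatorname{im}(A)$. Then $\langle b, y \rangle = (Ax)^\top y = x^\top A^\top y = x^\top \mathbf{0} = 0$. Applying the same argument to $A^\top$ gives $\operatorname{im}(A^\top) \perp \ker(A)$. The direct sum decompositions follow from orthogonality and dimension counting: the dimensions $r + (m - r) = m$ and $r + (n - r) = n$ fill the ambient spaces. $\square$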
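The four subspaces and their orthogonality relations can be read off a full SVD. The sketch below assumes NumPy; the shape $5 \times 4$, the rank $r = 2$, and the seed are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, r = 5, 4, 2
A = rng.normal(size=(m, r)) @ rng.normal(size=(r, n))   # rank r by construction

U, s, Vt = np.linalg.svd(A)     # U is m x m, Vt is n x n
col_space = U[:, :r]            # orthonormal basis of im(A)
left_null = U[:, r:]            # orthonormal basis of ker(A^T)
row_space = Vt[:r].T            # orthonormal basis of im(A^T)
null_space = Vt[r:].T           # orthonormal basis of ker(A)

# Pairwise orthogonality: all cross inner products vanish.
print(np.abs(col_space.T @ left_null).max())
print(np.abs(row_space.T @ null_space).max())
```

The dimensions match the table: `col_space` and `row_space` have $r$ columns, `left_null` has $m - r$, and `null_space` has $n - r$.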

REMARK

Geometric picture. Think of $A$ as a map from $\mathbb{R}^n$ to $\mathbb{R}^m$. The row space $\operatorname{im}(A^\top)$ is the "effective" part of $\mathbb{R}^n$: the directions $A$ actually acts on. The null space $\ker(A)$ is the "invisible" part: directions $A$ crushes to zero. In $\mathbb{R}^m$, the column space $\operatorname{im}(A)$ is the range and the left null space $\ker(A^\top)$ is the "unachievable" part.

5. Symmetric and Positive (Semi-)Definite Matrices

DEFINITION

Definition 5.1. A matrix $A \in \mathbb{R}^{n \times n}$ is:

  • Symmetric if $A^\top = A$.
  • Positive semi-definite (SPSD) if $A^\top = A$ and $x^\top A x \geq 0$ for all $x \in \mathbb{R}^n$.
  • Positive definite (SPD) if $A^\top = A$ and $x^\top A x > 0$ for all $x \neq 0$.

Notation: $A \succeq 0$ (SPSD), $A \succ 0$ (SPD).

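The quadratic form $x^\top A x = \langle x, Ax \rangle$ has a geometric interpretation: it is the component of $Ax$ along $x$, scaled by $\|x\|$, so it measures how much $A$ stretches $x$ in the direction of $x$ itself. Positive definiteness says this component is strictly positive for every nonzero $x$: $A$ never collapses a direction to zero or reverses it.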

THEOREM

Proposition 5.2 (Covariance matrices are SPSD). Let $X \in \mathbb{R}^{T \times n}$ be a returns matrix ($T$ observations, $n$ assets) with mean-zero columns. The sample covariance matrix $\Sigma = \frac{1}{T} X^\top X \in \mathbb{R}^{n \times n}$ satisfies $\Sigma \succeq 0$. It is positive definite iff $X$ has full column rank (no asset's return series is an exact linear combination of the others).

Proof. $\Sigma^\top = \frac{1}{T}(X^\top X)^\top = \frac{1}{T} X^\top X = \Sigma$ (symmetric). For any $w \in \mathbb{R}^n$: $$w^\top \Sigma w = \frac{1}{T} w^\top X^\top X w = \frac{1}{T} \|Xw\|^2 \geq 0.$$ The form equals zero iff $Xw = 0$, i.e., $w \in \ker(X)$. Full column rank of $X$ means $\ker(X) = \{0\}$, giving positive definiteness; conversely, if $X$ is rank-deficient, any nonzero $w \in \ker(X)$ has $w^\top \Sigma w = 0$, so $\Sigma$ is not positive definite. $\square$
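The identity $w^\top \Sigma w = \frac{1}{T}\|Xw\|^2$ at the heart of the proof can be checked on synthetic data. This sketch assumes NumPy; the sample size, volatility scale, and the redundant fifth asset are illustrative constructions.

```python
import numpy as np

rng = np.random.default_rng(4)
T_obs, n = 250, 4
X = rng.normal(scale=0.01, size=(T_obs, n))   # synthetic daily returns
X -= X.mean(axis=0)                           # mean-zero columns, as assumed
Sigma = X.T @ X / T_obs

W = rng.normal(size=(1000, n))                # 1000 random weight vectors (rows)
quad = np.einsum("ij,jk,ik->i", W, Sigma, W)  # w^T Sigma w for each row
norms = np.linalg.norm(X @ W.T, axis=0) ** 2 / T_obs   # ||Xw||^2 / T
min_eig = np.linalg.eigvalsh(Sigma).min()     # strictly positive: X has full column rank

# Add a redundant asset (sum of assets 0 and 1): X loses full column rank.
X_dup = np.column_stack([X, X[:, 0] + X[:, 1]])
Sigma_dup = X_dup.T @ X_dup / T_obs
w_null = np.array([1.0, 1.0, 0.0, 0.0, -1.0])  # X_dup @ w_null = 0
print(quad.min(), min_eig, w_null @ Sigma_dup @ w_null)
```

The long-short portfolio `w_null` has zero sample variance, which is exactly the failure of positive definiteness described in the proposition.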

WARNING

Near-singular covariance matrices. In practice, assets are highly correlated (e.g., equity index options at adjacent strikes), and $\Sigma$ estimated from finite data is frequently near-singular: it is theoretically SPD but has eigenvalues close to zero. Inverting $\Sigma$ directly (as required for mean-variance weights $w^* \propto \Sigma^{-1}\mu$) turns each small eigenvalue $\lambda$ into a large factor $1/\lambda$, which amplifies estimation noise along the corresponding eigenvectors and produces wildly unstable portfolio weights. The fix, regularisation via SVD or adding a diagonal ridge, is treated in Module 4.


Validation

The companion notebook verifies:

  1. Vector space axioms: checks all eight axioms for $\mathbb{R}^3$ with explicit counterexamples showing what fails when each axiom is dropped.
  2. Subspace test: confirms the dollar-neutral subspace is closed under linear combinations.
  3. Rank-nullity: for a $4 \times 6$ matrix, computes $\operatorname{rank}(A)$ and $\operatorname{nullity}(A)$ via row reduction and verifies their sum equals 6.
  4. Orthogonality of fundamental subspaces: constructs a basis for each of the four subspaces and verifies the pairwise orthogonality via dot products.
  5. Covariance SPSD: builds a sample covariance from synthetic returns and confirms $w^\top \Sigma w \geq 0$ for 1000 random weight vectors.

PRACTICE

By hand before opening the notebook. Let $$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}.$$

  1. Are the rows of $A$ linearly independent? What is $\operatorname{rank}(A)$?
  2. By the rank-nullity theorem, what is $\operatorname{nullity}(A)$?
  3. What is the dimension of $\ker(A^\top)$ (left null space)?
  4. Does the system $Ax = (1, 2, 3)^\top$ have a solution? Does $Ax = (1, 1, 1)^\top$?

(Answers: $\operatorname{rank}(A) = 2$ since row 3 = 2 × row 2 − row 1; nullity = 1; $\dim \ker(A^\top) = 1$; $(1, 2, 3)^\top = -\tfrac{1}{3}\,\text{col}_1 + \tfrac{2}{3}\,\text{col}_2$ lies in the column space, so yes; $(1, 1, 1)^\top = \text{col}_2 - \text{col}_1$ also lies in the column space, so yes.)
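Once the by-hand work is done, the answers can be cross-checked numerically. This is a sketch assuming NumPy; the helper `in_column_space` is illustrative and relies on the default tolerance of `np.linalg.matrix_rank`.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank        # rank-nullity on the columns
left_nullity = A.shape[0] - rank   # rank-nullity applied to A^T

def in_column_space(A, b):
    """b is in im(A) iff appending b as a column does not raise the rank."""
    return np.linalg.matrix_rank(np.column_stack([A, b])) == np.linalg.matrix_rank(A)

print(rank, nullity, left_nullity)
print(in_column_space(A, np.array([1.0, 2.0, 3.0])))
print(in_column_space(A, np.array([1.0, 1.0, 1.0])))
```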


Limitations

WARNING

Numerical rank is not algebraic rank. The definitions above are exact, but in floating-point arithmetic, a matrix that is "theoretically" rank-2 may appear rank-3 due to rounding errors, or vice versa. Production code determines rank using singular values with a tolerance: count the number of singular values exceeding $\varepsilon \cdot \sigma_{\max}(A)$ for a problem-specific threshold $\varepsilon$. This is treated rigorously in Module 4 (SVD and condition numbers).
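The tolerance-based rank count described above can be sketched in a few lines, assuming NumPy. The function name `numerical_rank`, the perturbation size `1e-9`, and the thresholds are illustrative; choosing $\varepsilon$ in practice depends on the noise level of the data.

```python
import numpy as np

def numerical_rank(A, eps):
    """Count singular values exceeding eps * sigma_max."""
    s = np.linalg.svd(A, compute_uv=False)   # sorted in descending order
    return int((s > eps * s[0]).sum())

# Exactly rank 2: row 3 = row 1 + row 2.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [5.0, 7.0, 9.0]])

# The same matrix with a tiny perturbation standing in for data noise.
A_noisy = A.copy()
A_noisy[2, 2] += 1e-9

print(numerical_rank(A, 1e-10))        # clean matrix
print(numerical_rank(A_noisy, 1e-14))  # strict threshold: noise counts as a direction
print(numerical_rank(A_noisy, 1e-6))   # looser threshold: noise is ignored
```

The same noisy matrix is reported as full rank or rank-deficient depending only on $\varepsilon$, which is why the threshold has to be chosen with the problem in mind.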

WARNING

Invertibility is not conditioning. A matrix can be theoretically invertible but numerically catastrophic if its columns are nearly linearly dependent. The condition number $\kappa(A) = \|A\| \cdot \|A^{-1}\|$ quantifies this: a relative error in the data can be amplified by up to a factor of $\kappa(A)$ in the solution. A yield curve system with $\kappa \approx 10^{12}$ can lose about 12 of the roughly 16 significant decimal digits that double precision carries, leaving only about 4 trustworthy digits, and fewer still if the inputs are themselves noisy. Always check condition numbers before trusting a linear solve in calibration.
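A small experiment shows the amplification at work. This sketch assumes NumPy; the $2 \times 2$ matrix with nearly parallel columns and the perturbation size are illustrative, and `np.linalg.cond` returns the 2-norm condition number by default.

```python
import numpy as np

delta = 1e-8
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + delta]])    # invertible, but columns nearly parallel
kappa = np.linalg.cond(A)             # 2-norm condition number

x_true = np.array([1.0, 1.0])
b = A @ x_true
b_pert = b + np.array([0.0, 1e-10])   # tiny perturbation of the data

x_pert = np.linalg.solve(A, b_pert)
rel_in = np.linalg.norm(b_pert - b) / np.linalg.norm(b)
rel_out = np.linalg.norm(x_pert - x_true) / np.linalg.norm(x_true)

print(f"kappa = {kappa:.2e}, input error = {rel_in:.2e}, output error = {rel_out:.2e}")
```

The relative error in the solution is many orders of magnitude larger than the relative error in the right-hand side, while staying within the bound $\kappa(A)$ times the input error.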

Scope limitations:

  • This module assumes finite-dimensional spaces over $\mathbb{R}$. Infinite-dimensional Hilbert spaces ($L^2$, Sobolev spaces) require additional structure (completeness, bounded operators) and arise in the functional analytic foundations of interest rate models.
  • Results extend to complex vector spaces (replacing transpose with conjugate transpose $A^*$), which are needed for Fourier-transform pricing (covered in the fourier-fft-pricing course).

Interview Angle

PRACTICE

L1 (Junior quant / developer).

  1. "State the rank-nullity theorem and give an example with a $3 \times 4$ matrix." Expected answer: $\operatorname{rank}(A) + \operatorname{nullity}(A) = 4$ (number of columns). Example: a rank-2 matrix has a 2-dimensional null space.

  2. "When does the system $Ax = b$ have exactly one solution?" Expected answer: when $b \in \operatorname{im}(A)$ (solvability) and $\ker(A) = \{0\}$ (uniqueness), i.e., $A$ has full column rank and $b$ lies in its column space. For square $A$ this is equivalent to $A$ being invertible.

  3. "What does it mean for a covariance matrix to be positive definite, and why does it matter for portfolio optimisation?" Expected answer: $w^\top \Sigma w > 0$ for all $w \neq 0$, so portfolio variance is always strictly positive (no costless zero-variance portfolio). Positive definiteness guarantees $\Sigma$ is invertible, which is needed to compute mean-variance weights $w^* \propto \Sigma^{-1}\mu$.

PRACTICE

L2 (Senior quant).

  1. "You bootstrap a 5-instrument swap curve and get infinitely many solutions. What does linear algebra tell you?" Expected answer: the system matrix $A$ is rank-deficient. By rank-nullity, $\operatorname{nullity}(A) \geq 1$, so the solution set is an affine subspace of dimension $\geq 1$. The instruments are redundant or the system is underdetermined: either remove a redundant instrument or add a regularisation constraint (smoothness prior on the forward curve).

  2. "Explain the four fundamental subspaces and their orthogonality relations for an $m \times n$ matrix of rank $r$." Expected answer: column space and left null space are orthogonal complements in $\mathbb{R}^m$ with dimensions $r$ and $m - r$; row space and null space are orthogonal complements in $\mathbb{R}^n$ with dimensions $r$ and $n - r$.

  3. "When is a covariance matrix estimated from 50 days of returns on 100 assets positive definite?" Expected answer: never. The data matrix $X \in \mathbb{R}^{50 \times 100}$ has rank at most 50, so $\Sigma = X^\top X / T$ has rank at most $50 < 100$. At least 50 eigenvalues are zero. The matrix is SPSD, not SPD.
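The 50-by-100 case in the last question is easy to reproduce. This sketch assumes NumPy and synthetic Gaussian returns; the seed and the relative threshold `1e-10` for calling an eigenvalue zero are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
T_obs, n = 50, 100
X = rng.normal(size=(T_obs, n))      # 50 days of synthetic returns on 100 assets
X -= X.mean(axis=0)                  # demeaning removes one more degree of freedom
Sigma = X.T @ X / T_obs

rank = np.linalg.matrix_rank(X)
eigvals = np.linalg.eigvalsh(Sigma)  # ascending order
n_zero = int((eigvals < 1e-10 * eigvals[-1]).sum())

print(rank, n_zero)
```

Because the columns are demeaned, the rows of `X` sum to zero and the rank drops below 50, so the count of zero eigenvalues is even larger than the bound in the expected answer.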

PRACTICE

L3 (Researcher).

  1. "In a factor risk model, the factor covariance $F \in \mathbb{R}^{k \times k}$ is full rank, but the full asset covariance $\Sigma = B F B^\top + D$ (where $B \in \mathbb{R}^{n \times k}$ with $n \gg k$) is used in a Markowitz optimisation. How does the structure of $BFB^\top$ affect the optimisation?" Expected answer: $BFB^\top$ has rank at most $k \ll n$ (SPSD, not SPD): it is an $n \times n$ matrix with a null space of dimension at least $n - k$. Portfolio optimisation using $(BFB^\top)^{-1}$ directly is impossible because the inverse does not exist; one must use the full $\Sigma = BFB^\top + D$ (where $D$ is a diagonal idiosyncratic variance matrix with positive entries, making $\Sigma$ SPD), and can use the Woodbury identity for efficient inversion.

  2. "How does the null space of a linear operator relate to regularisation in calibration?" Expected answer: if $\ker(A) \neq \{0\}$, the calibration objective has infinitely many minimisers differing by elements of $\ker(A)$. Tikhonov regularisation (penalising $\|x\|^2$) restores uniqueness: its minimiser has no component in $\ker(A)$, and as the penalty weight tends to zero it converges to the minimum-norm solution, which is the orthogonal projection of any solution onto the row space $\operatorname{im}(A^\top)$.
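The link between Tikhonov regularisation and the minimum-norm solution in the last answer can be illustrated numerically. This sketch assumes NumPy; the $3 \times 5$ system, the seed, the helper name `tikhonov`, and the penalty weights are illustrative. The Tikhonov minimiser is computed from the normal equations $(A^\top A + \lambda I)x = A^\top b$.

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.normal(size=(3, 5))          # underdetermined: ker(A) has dimension 2
b = rng.normal(size=3)

x_min_norm = np.linalg.pinv(A) @ b   # minimum-norm solution of Ax = b

def tikhonov(A, b, lam):
    """Minimiser of ||Ax - b||^2 + lam * ||x||^2 via the normal equations."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

null_basis = np.linalg.svd(A)[2][3:].T   # columns span ker(A)

# Distance to the minimum-norm solution for shrinking penalty weights.
gaps = [np.linalg.norm(tikhonov(A, b, lam) - x_min_norm)
        for lam in (1e-2, 1e-4, 1e-6)]
print(gaps)
print(np.abs(null_basis.T @ tikhonov(A, b, 1e-2)).max())
```

For every positive penalty weight the regularised solution is orthogonal to $\ker(A)$, and the gap to the pseudoinverse solution shrinks as the weight decreases.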

Verify your understanding before moving on.