Eigendecomposition and the Spectral Theorem

Module 3 of 5 | 25 min read | Level: Medium

Setup

Where eigendecomposition appears in quant finance

Three workflows on a quant desk depend directly on eigendecomposition:

  1. Principal component analysis (PCA) of yield curves. The covariance matrix of daily rate changes is decomposed into eigenvectors: the first three principal components (level, slope, curvature) typically explain over 95% of the variance. Traders hedge using these components rather than individual maturities.

  2. Risk factor decomposition. A covariance matrix $\Sigma \in \mathbb{R}^{n \times n}$ for $n$ equity returns is decomposed as $\Sigma = Q\Lambda Q^\top$. The eigenvalues $\lambda_i$ are the variances of uncorrelated risk factors; the eigenvectors $q_i$ are the factor loadings. The smallest eigenvalues flag portfolios with almost no measured variance (apparent near-arbitrage combinations); the largest flag dominant market risk.

  3. Stability of calibration. For a symmetric positive definite matrix, the condition number (treated in Module 4) is the ratio of its largest to smallest eigenvalue. A large condition number signals that small perturbations in input data produce large swings in calibrated parameters, a failure mode PMs care about.

INSIGHT

Why this matters on a rates desk. In PCA-based yield curve risk, each bucket DV01 is projected onto the principal component basis. The level component ($q_1$, approximately the flat vector) captures parallel shifts; the slope component ($q_2$) captures steepening/flattening. A trader who hedges only the level exposure while ignoring slope is exposed to a curve twist, which the spectral decomposition makes mathematically explicit.
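As a rough sketch of that projection, the snippet below splits a bucket DV01 vector into level and slope exposures. The DV01 figures and the idealised orthonormal level and slope shapes are hypothetical stand-ins for eigenvectors estimated from real data.

```python
import numpy as np

# Hypothetical bucket DV01s for (3m, 1y, 2y, 5y, 10y); illustrative numbers only.
dv01 = np.array([5.0, 10.0, 20.0, 30.0, 35.0])

# Idealised orthonormal shapes standing in for the estimated q_1 and q_2.
level = np.ones(5) / np.sqrt(5.0)
slope = np.array([-2.0, -1.0, 0.0, 1.0, 2.0]) / np.sqrt(10.0)

# Exposure to each component is the inner product with the DV01 vector.
level_exposure = level @ dv01
slope_exposure = slope @ dv01

# Hedging only the level component removes the projection onto `level`.
residual = dv01 - level_exposure * level

print(level_exposure, slope_exposure)
print(level @ residual, slope @ residual)  # level is ~0, slope is unchanged
```

The residual book has no level exposure but keeps its full slope exposure, which is the curve-twist risk described above.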

Assumptions and conventions

  • Matrices are real and square: $A \in \mathbb{R}^{n \times n}$.
  • For the Spectral Theorem, $A$ is symmetric: $A^\top = A$.
  • Eigenvectors are normalised to unit length: $\|q\|_2 = 1$.
  • Eigenvalues of a real symmetric matrix are real (proved below) and eigenvectors can be chosen orthonormal.
  • The eigendecomposition $A = Q\Lambda Q^\top$ uses $Q$ orthogonal ($Q^\top Q = I$) and $\Lambda$ diagonal.
  • numpy.linalg.eigh is used for symmetric matrices (faster, guaranteed real eigenvalues); numpy.linalg.eig is the general (possibly complex) version.
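A minimal sketch of the eigh versus eig convention, using a small symmetric matrix chosen here purely for illustration:

```python
import numpy as np

# Hypothetical 3x3 symmetric matrix, chosen only for illustration.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# eigh assumes symmetry: real eigenvalues in ascending order, orthonormal eigenvectors.
w_h, Q = np.linalg.eigh(A)

# eig makes no symmetry assumption: ordering is unspecified and the dtype may be complex.
w_g, V = np.linalg.eig(A)

same_spectrum = np.allclose(np.sort(w_g.real), w_h)
orthonormal = np.allclose(Q.T @ Q, np.eye(3))
print(w_h, same_spectrum, orthonormal)
```

Both routines agree on the spectrum here; the practical difference is that eigh returns sorted real eigenvalues and orthonormal eigenvectors by construction.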

Theory

1. Eigenvalues and Eigenvectors

DEFINITION

Definition 1.1 (Eigenvalue/eigenvector). A scalar $\lambda \in \mathbb{R}$ and non-zero vector $q \in \mathbb{R}^n$ satisfying $Aq = \lambda q$ are called an eigenvalue and its associated eigenvector of $A \in \mathbb{R}^{n \times n}$.

The equation $Aq = \lambda q$ rewrites as $(A - \lambda I)q = 0$, which has a non-zero solution iff $A - \lambda I$ is singular, i.e., iff $\det(A - \lambda I) = 0$.

DEFINITION

Definition 1.2 (Characteristic polynomial). The characteristic polynomial of $A$ is $$p_A(\lambda) = \det(A - \lambda I).$$ This is a degree-$n$ polynomial in $\lambda$. Its roots (counting multiplicity in $\mathbb{C}$) are the eigenvalues of $A$.

REMARK

Algebraic vs. geometric multiplicity. The algebraic multiplicity of an eigenvalue $\lambda_0$ is its multiplicity as a root of $p_A$. The geometric multiplicity is $\dim\ker(A - \lambda_0 I)$, the dimension of the eigenspace. Always: geometric $\leq$ algebraic. Equality for every eigenvalue is the condition for diagonalisability.

EXAMPLE

Example 1.3 (2×2 covariance matrix). Let $\Sigma = \begin{pmatrix} 4 & 2 \\ 2 & 3 \end{pmatrix}$.

Characteristic polynomial: $p(\lambda) = (4-\lambda)(3-\lambda) - 4 = \lambda^2 - 7\lambda + 8$.

Eigenvalues: $\lambda = \frac{7 \pm \sqrt{49 - 32}}{2} = \frac{7 \pm \sqrt{17}}{2}$, so $\lambda_1 \approx 5.56$, $\lambda_2 \approx 1.44$.

Both positive, confirming $\Sigma \succ 0$ (positive eigenvalues $\Leftrightarrow$ positive definite for symmetric matrices, proved via the spectral theorem below).
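The arithmetic in Example 1.3 can be reproduced with the quadratic formula alone. This is a small sketch using only the standard library; the variable names are illustrative.

```python
import math

# Entries of the symmetric 2x2 matrix [[a, b], [b, d]] from Example 1.3.
a, b, d = 4.0, 2.0, 3.0

trace = a + d            # coefficient of -lambda in the characteristic polynomial
det = a * d - b * b      # constant term of the characteristic polynomial
disc = trace**2 - 4.0 * det

lam1 = (trace + math.sqrt(disc)) / 2.0
lam2 = (trace - math.sqrt(disc)) / 2.0
print(round(lam1, 2), round(lam2, 2))
```

A useful sanity check for any 2×2 case: the eigenvalues sum to the trace (7) and multiply to the determinant (8).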

2. Diagonalisation

DEFINITION

Definition 2.1 (Diagonalisation). A matrix $A \in \mathbb{R}^{n \times n}$ is diagonalisable if there exists an invertible $P \in \mathbb{R}^{n \times n}$ and diagonal $\Lambda$ such that $$A = P\Lambda P^{-1}, \qquad \Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_n).$$ The columns of $P$ are the eigenvectors of $A$.

THEOREM

Theorem 2.2 (Sufficient condition for diagonalisability). If $A$ has $n$ distinct eigenvalues, then $A$ is diagonalisable. Eigenvectors corresponding to distinct eigenvalues are linearly independent.

Proof sketch. Suppose $\sum_{i=1}^k \alpha_i q_i = 0$ with $q_i$ eigenvectors for distinct $\lambda_i$. Apply $A$ repeatedly and subtract to eliminate terms, eventually showing all $\alpha_i = 0$. $\square$

REMARK

Distinct eigenvalues are sufficient but not necessary. A matrix with repeated eigenvalues may still be diagonalisable (if geometric multiplicity equals algebraic multiplicity for every eigenvalue) — or it may not (Jordan form is needed in the latter case, but this is rarely relevant in finance practice where matrices are symmetric).
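To make the two repeated-eigenvalue cases concrete, here is a hedged sketch that computes the geometric multiplicity as $n$ minus the numerical rank of $A - \lambda I$. The helper name `geometric_multiplicity` and both test matrices are illustrative choices, not part of the course notebook.

```python
import numpy as np

def geometric_multiplicity(A, lam):
    """Return dim ker(A - lam*I), computed as n minus the numerical rank."""
    n = A.shape[0]
    return n - int(np.linalg.matrix_rank(A - lam * np.eye(n)))

# Both matrices have eigenvalue 2 with algebraic multiplicity 2.
jordan_block = np.array([[2.0, 1.0],
                         [0.0, 2.0]])   # defective: only one eigen-direction
scaled_identity = 2.0 * np.eye(2)       # diagonalisable: every vector is an eigenvector

print(geometric_multiplicity(jordan_block, 2.0))
print(geometric_multiplicity(scaled_identity, 2.0))
```

The Jordan block is not symmetric, which is consistent with the remark: symmetric matrices never fall into the defective case.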

3. The Spectral Theorem

The Spectral Theorem is the cornerstone result for symmetric matrices. It guarantees not just diagonalisability but orthogonal diagonalisation: the eigenvectors form an orthonormal basis of $\mathbb{R}^n$.

THEOREM

Theorem 3.1 (Spectral Theorem for real symmetric matrices). If $A \in \mathbb{R}^{n \times n}$ is symmetric ($A^\top = A$), then:

  1. All eigenvalues of $A$ are real.
  2. Eigenvectors corresponding to distinct eigenvalues are orthogonal.
  3. There exists an orthogonal matrix $Q$ (i.e., $Q^\top Q = QQ^\top = I$) such that $$A = Q\Lambda Q^\top, \qquad \Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_n), \quad \lambda_i \in \mathbb{R}.$$ The columns $q_1, \ldots, q_n$ of $Q$ are orthonormal eigenvectors of $A$.

Proof of (1): eigenvalues are real. Let $Aq = \lambda q$ with $q \in \mathbb{C}^n$, $q \neq 0$. Then $\bar{q}^\top A q = \lambda \bar{q}^\top q = \lambda \|q\|^2$. Taking the complex conjugate and using that $A$ is real and symmetric, $\overline{\bar{q}^\top A q} = q^\top A \bar{q} = (q^\top A \bar{q})^\top = \bar{q}^\top A^\top q = \bar{q}^\top A q$, so $\bar{q}^\top A q$ is real. Hence $\lambda \|q\|^2 = \overline{\lambda \|q\|^2} = \bar{\lambda} \|q\|^2$, and since $\|q\|^2 > 0$ this forces $\lambda = \bar{\lambda}$, i.e. $\lambda \in \mathbb{R}$. $\square$

Proof of (2): orthogonality of distinct eigenvectors. Let $Aq_1 = \lambda_1 q_1$, $Aq_2 = \lambda_2 q_2$, $\lambda_1 \neq \lambda_2$. Then $$\lambda_1 \langle q_1, q_2 \rangle = \langle Aq_1, q_2 \rangle = \langle q_1, Aq_2 \rangle = \lambda_2 \langle q_1, q_2 \rangle$$ (using symmetry of $A$ for the middle step). So $(\lambda_1 - \lambda_2)\langle q_1, q_2 \rangle = 0$, and since $\lambda_1 \neq \lambda_2$, we get $\langle q_1, q_2 \rangle = 0$. $\square$

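The existence of a full orthonormal basis of eigenvectors (part 3) follows by induction on $n$: pick a unit eigenvector $q_1$, note that symmetry makes the orthogonal complement of $q_1$ invariant under $A$, and apply the result to the restriction of $A$ to that $(n-1)$-dimensional subspace. When eigenvalues are repeated, the Gram-Schmidt procedure yields an orthonormal basis within each eigenspace.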
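The three claims of Theorem 3.1 can be checked numerically. The sketch below uses a randomly generated symmetric $4 \times 4$ matrix; the seed and the size are arbitrary choices made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Symmetrise a random matrix so that A equals its transpose.
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2.0

lam, Q = np.linalg.eigh(A)

eigs_real = np.isrealobj(lam)                          # claim 1: real eigenvalues
is_orthogonal = np.allclose(Q.T @ Q, np.eye(4))        # claim 2/3: orthonormal columns
reconstructs = np.allclose(Q @ np.diag(lam) @ Q.T, A)  # claim 3: A = Q Lambda Q^T
print(eigs_real, is_orthogonal, reconstructs)
```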

4. The Spectral Decomposition as a Sum of Rank-1 Projections

The factorisation $A = Q\Lambda Q^\top$ can be written out as: $$A = \sum_{i=1}^{n} \lambda_i \, q_i q_i^\top.$$

Each term $q_i q_i^\top$ is a rank-1 orthogonal projection onto the span of $q_i$. The matrix $A$ is a weighted sum of these projections, with weights $\lambda_i$.
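A short sketch of the rank-1 picture, reusing the matrix from Example 1.3. It builds one projector per eigenvector and checks that the weighted sum rebuilds the matrix; the variable names are illustrative.

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])
lam, Q = np.linalg.eigh(A)

# One rank-1 projector q_i q_i^T per eigenvector (columns of Q).
projectors = [np.outer(Q[:, i], Q[:, i]) for i in range(len(lam))]

# Weighted sum of projectors rebuilds A; the unweighted sum gives the identity.
A_rebuilt = sum(l * P for l, P in zip(lam, projectors))
identity_rebuilt = sum(projectors)

print(np.allclose(A_rebuilt, A), np.allclose(identity_rebuilt, np.eye(2)))
```

Each projector also satisfies $P^2 = P$, which is what makes it a projection rather than an arbitrary rank-1 matrix.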

REMARK

The quadratic form picture. For $x \in \mathbb{R}^n$ with $\|x\| = 1$: $$x^\top A x = \sum_{i=1}^n \lambda_i (q_i^\top x)^2.$$ The coefficient $(q_i^\top x)^2$ is the squared projection of $x$ onto the $i$-th eigenvector, and for unit $x$ these coefficients sum to 1. The quadratic form is therefore a weighted average of the eigenvalues $\lambda_i$, with weights $(q_i^\top x)^2$. Positive definiteness ($x^\top A x > 0$ for all $x \neq 0$) is equivalent to $\lambda_i > 0$ for all $i$, the spectral characterisation of SPD.

THEOREM

Corollary 4.1. A real symmetric matrix $A$ is:

  • Positive definite iff all eigenvalues are strictly positive.
  • Positive semi-definite iff all eigenvalues are non-negative.
  • Indefinite iff it has both positive and negative eigenvalues.
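Corollary 4.1 translates directly into a sign test on the spectrum. The helper below is a sketch: the function name, the tolerance of 1e-10 and the extra negative case are assumptions added here, not part of the corollary.

```python
import numpy as np

def classify_definiteness(A, tol=1e-10):
    """Classify a real symmetric matrix by the signs of its eigenvalues."""
    lam = np.linalg.eigvalsh(A)  # ascending order, so lam[0] is the smallest
    if lam[0] > tol:
        return "positive definite"
    if lam[0] >= -tol:
        return "positive semi-definite"
    if lam[-1] > tol:
        return "indefinite"
    return "negative (semi-)definite"

print(classify_definiteness(np.array([[4.0, 2.0], [2.0, 3.0]])))
print(classify_definiteness(np.array([[1.0, 1.0], [1.0, 1.0]])))
print(classify_definiteness(np.array([[1.0, 0.0], [0.0, -1.0]])))
```

The tolerance matters in floating point: an exactly singular matrix can report a smallest eigenvalue of order 1e-16 with either sign.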

5. Principal Component Analysis

PCA applies the spectral theorem to a sample covariance matrix $\hat{\Sigma} \in \mathbb{R}^{n \times n}$ to find the directions of maximum variance.

DEFINITION

Definition 5.1 (Principal components). Given $\hat{\Sigma} = Q\Lambda Q^\top$ with $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n \geq 0$, the $k$-th principal component is the direction $q_k$ (the $k$-th column of $Q$). The variance explained by the first $k$ components is $$\text{VarExp}(k) = \frac{\sum_{i=1}^k \lambda_i}{\sum_{i=1}^n \lambda_i} = \frac{\sum_{i=1}^k \lambda_i}{\operatorname{tr}(\hat{\Sigma})}.$$
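The VarExp formula is a cumulative sum divided by the total. A minimal sketch with a made-up spectrum (the eigenvalues below are hypothetical and chosen so the shares are easy to read):

```python
from itertools import accumulate

def variance_explained(eigenvalues):
    """Cumulative share of total variance, with eigenvalues sorted descending."""
    lam = sorted(eigenvalues, reverse=True)
    total = sum(lam)
    return [partial / total for partial in accumulate(lam)]

# Hypothetical spectrum: total variance 10, leading eigenvalue 7.5.
shares = variance_explained([7.5, 1.5, 0.5, 0.3, 0.2])
print([round(s, 2) for s in shares])
```

Reading off the list, the first entry is VarExp(1), the third is VarExp(3), and the last is always 1 up to rounding.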

EXAMPLE

Example 5.2 (Yield curve PCA). Daily changes in a 5-maturity yield curve (3m, 1y, 2y, 5y, 10y) are stored in a $T \times 5$ matrix $X$ (columns demeaned). The covariance matrix $\hat{\Sigma} = X^\top X / T$ is computed and eigendecomposed. Typical findings:

  • $q_1 \approx (0.44, 0.46, 0.45, 0.44, 0.43)^\top$ (flat vector): level shift, $\lambda_1$ explains ~75% of variance.
  • $q_2 \approx (-0.55, -0.35, -0.10, 0.42, 0.63)^\top$: slope (steepening/flattening), $\lambda_2$ explains ~15%.
  • $q_3 \approx (0.50, -0.18, -0.63, -0.10, 0.56)^\top$: curvature (butterfly), $\lambda_3$ explains ~5%.

The first 3 components explain ~95% of the variance of daily curve changes. A trader who hedges only the level component leaves ~25% of the variance unhedged (the ~15% slope, ~5% curvature and ~5% residual shares).
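The workflow of Example 5.2 can be sketched on synthetic data. Everything numeric below is an assumption made for illustration: the idealised level, slope and curvature shapes, the factor variances (75, 15, 5), the unit noise variance, the sample size and the seed. It is not real curve data.

```python
import numpy as np

rng = np.random.default_rng(42)
T = 2000  # hypothetical number of daily observations

# Idealised orthonormal shapes for level, slope and curvature (assumed, not estimated).
shapes = np.array([[1.0, 1.0, 1.0, 1.0, 1.0],
                   [-2.0, -1.0, 0.0, 1.0, 2.0],
                   [2.0, -1.0, -2.0, -1.0, 2.0]])
shapes /= np.linalg.norm(shapes, axis=1, keepdims=True)

# Factor variances 75, 15, 5 plus unit-variance noise per maturity (hypothetical units).
factors = rng.standard_normal((T, 3)) * np.sqrt([75.0, 15.0, 5.0])
X = factors @ shapes + rng.standard_normal((T, 5))

X = X - X.mean(axis=0)            # demean each maturity
sigma_hat = X.T @ X / T
lam, Q = np.linalg.eigh(sigma_hat)
lam, Q = lam[::-1], Q[:, ::-1]    # eigh is ascending; PCA wants descending

explained = lam / lam.sum()
unhedged_after_level = 1.0 - explained[0]
print(np.round(explained, 3), round(unhedged_after_level, 3))
```

The leading eigenvector comes out with all entries of the same sign (a level move), and the share left after hedging it is one minus the first explained-variance ratio.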

REMARK

Low-rank approximation. The best rank-$k$ approximation to $\hat{\Sigma}$ in the Frobenius norm is $$\hat{\Sigma}_k = \sum_{i=1}^k \lambda_i q_i q_i^\top.$$ This is the matrix obtained by zeroing out the $n-k$ smallest eigenvalues. For $n = 100$ assets and $k = 5$ factors, the approximation stores roughly $5n$ numbers instead of $n^2/2$, a compression ratio of about $10:1$.
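A hedged sketch of the truncation, checking that the squared Frobenius error equals the sum of the squared discarded eigenvalues. The helper name `low_rank_approx` and the random $6 \times 6$ covariance are illustrative assumptions.

```python
import numpy as np

def low_rank_approx(sigma, k):
    """Keep the k largest eigenvalues of a symmetric PSD matrix, zero the rest."""
    lam, Q = np.linalg.eigh(sigma)          # ascending order
    lam, Q = lam[::-1], Q[:, ::-1]          # reorder to descending
    sigma_k = (Q[:, :k] * lam[:k]) @ Q[:, :k].T
    return sigma_k, lam

rng = np.random.default_rng(1)
B = rng.standard_normal((50, 6))
sigma = B.T @ B / 50                        # hypothetical 6x6 sample covariance

sigma_2, lam = low_rank_approx(sigma, 2)
err_sq = np.linalg.norm(sigma - sigma_2, "fro") ** 2
tail_sq = np.sum(lam[2:] ** 2)              # sum of squared discarded eigenvalues
print(err_sq, tail_sq)
```

In storage terms only the $k$ retained eigenvectors and eigenvalues are needed, which is where the compression comes from.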


Validation

The companion notebook verifies:

  1. Spectral theorem: decomposes a $4 \times 4$ symmetric matrix, verifies $Q^\top Q = I$, $A = Q\Lambda Q^\top$, all eigenvalues real.
  2. SPD characterisation: confirms eigenvalue signs match positive-definiteness check.
  3. Rank-1 projection decomposition: verifies $\sum_i \lambda_i q_i q_i^\top = A$.
  4. PCA on synthetic returns: generates 5-asset correlated returns, computes covariance, runs PCA, verifies variance explained and orthogonality of components.
  5. Low-rank approximation error: measures $\|\hat{\Sigma} - \hat{\Sigma}_k\|_F$ for $k = 1, \ldots, n$.

PRACTICE

By hand before opening the notebook. Let $A = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}$.

  1. Compute the eigenvalues by solving $\det(A - \lambda I) = 0$.
  2. For each eigenvalue, find the eigenvector (up to scaling).
  3. Verify the eigenvectors are orthogonal.
  4. Write down the spectral decomposition $A = \lambda_1 q_1 q_1^\top + \lambda_2 q_2 q_2^\top$.

(Answer: $\lambda_1 = 4$, $q_1 = (1/\sqrt{2})(1,1)^\top$; $\lambda_2 = 2$, $q_2 = (1/\sqrt{2})(1,-1)^\top$. Check: $$4 \cdot \frac{1}{2}\begin{pmatrix}1\\1\end{pmatrix}(1,1) + 2 \cdot \frac{1}{2}\begin{pmatrix}1\\-1\end{pmatrix}(1,-1) = \begin{pmatrix}2&2\\2&2\end{pmatrix} + \begin{pmatrix}1&-1\\-1&1\end{pmatrix} = \begin{pmatrix}3&1\\1&3\end{pmatrix}.$$ ✓)


Limitations

WARNING

Near-repeated eigenvalues make eigenvectors unstable. If $\lambda_1 = \lambda_2$ exactly, the eigenspace is two-dimensional and any orthonormal pair spanning it is a valid choice of eigenvectors. If $\lambda_1 \approx \lambda_2$, numerical algorithms still return one particular pair, but a small perturbation to $A$ can produce a completely different pair; only the two-dimensional span is stable. In PCA of equity returns, individual principal components corresponding to near-equal eigenvalues are therefore not statistically identified: they rotate arbitrarily within the nearly degenerate eigenspace.
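The instability can be reproduced in a few lines. In this sketch the matrix, the eigenvalue gap of 1e-10 and the perturbation of size 1e-6 are all hypothetical values chosen to make the effect visible.

```python
import numpy as np

# Two nearly equal eigenvalues (gap 1e-10) and one well-separated eigenvalue.
A = np.diag([1.0, 1.0 + 1e-10, 5.0])

# A symmetric perturbation of size 1e-6: tiny relative to A, large relative to the gap.
E = np.zeros((3, 3))
E[0, 1] = E[1, 0] = 1e-6

lam0, Q0 = np.linalg.eigh(A)
lam1, Q1 = np.linalg.eigh(A + E)

# Individual eigenvectors rotate by roughly 45 degrees...
overlap = abs(Q0[:, 0] @ Q1[:, 0])

# ...while the projector onto the two-dimensional span barely moves.
P0 = Q0[:, :2] @ Q0[:, :2].T
P1 = Q1[:, :2] @ Q1[:, :2].T
span_shift = np.linalg.norm(P0 - P1)
eig_shift = np.max(np.abs(lam0 - lam1))
print(overlap, span_shift, eig_shift)
```

The eigenvalues and the spanned subspace are stable quantities; the individual eigenvectors inside that subspace are not.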

WARNING

Eigenvalues are not volatilities. $\lambda_i$ is a variance, so the volatility of the $i$-th principal component is $\sqrt{\lambda_i}$, measured at the sampling frequency of the data. A common error is to read $\sqrt{\lambda_i}$ as an annualised volatility; this is correct only if $\hat{\Sigma}$ is itself an annualised covariance matrix. If $\hat{\Sigma}$ is computed from daily returns, $\sqrt{\lambda_i}$ is the daily standard deviation of the $i$-th component; multiply by $\sqrt{252}$ to annualise.
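A small worked conversion, assuming 252 trading days and serially uncorrelated daily returns. The eigenvalue used is a hypothetical daily variance.

```python
import math

TRADING_DAYS = 252  # convention used in the text; assumes uncorrelated daily returns

def annualised_vol(daily_eigenvalue):
    """Convert a daily-variance eigenvalue into an annualised volatility."""
    daily_vol = math.sqrt(daily_eigenvalue)
    return daily_vol * math.sqrt(TRADING_DAYS)

# Hypothetical eigenvalue: daily variance of 1e-4, i.e. 1% daily standard deviation.
print(round(annualised_vol(1e-4), 4))
```

A 1% daily standard deviation therefore corresponds to roughly 15.9% annualised, not 1% and not 1e-4.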

WARNING

The spectral theorem applies to symmetric matrices only. For general (non-symmetric) matrices, eigenvalues may be complex and the eigenvectors need not be orthogonal. The correct generalisation is the singular value decomposition (SVD, Module 4), which always exists and gives orthonormal bases for both domain and range — regardless of symmetry or square shape.

Scope limitations:

  • This module covers the real symmetric case. The complex Hermitian case ($A^* = A$) has an identical spectral theorem with conjugate transpose replacing transpose.
  • Infinite-dimensional operators (integral operators, differential operators) have a spectral theory requiring functional analysis (Hilbert-Schmidt operators, self-adjoint operators). The intuition from finite dimensions carries over, but compactness conditions are needed.

Interview Angle

PRACTICE

L1 (Junior quant / developer).

  1. "What is an eigenvalue and eigenvector, and how do you compute them?" Expected: $Aq = \lambda q$; compute via $\det(A - \lambda I) = 0$ for small matrices; use numpy.linalg.eigh for symmetric matrices in code. Mention eigh vs eig.

  2. "What does the spectral theorem say, and why does it matter for covariance matrices?" Expected: every real symmetric matrix has an orthonormal basis of real eigenvectors ($A = Q\Lambda Q^\top$, $Q$ orthogonal). For covariance matrices this means the principal components are orthogonal risk factors, and eigenvalues are their variances.

  3. "If a covariance matrix has a negative eigenvalue, what does that mean practically?" Expected: the matrix is not positive semi-definite: some portfolio has "negative variance", which is impossible for a true covariance matrix. Causes: estimation error (short history), numerical noise. Fix: project to SPD (clamp eigenvalues at a small positive floor, then reconstruct).
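The clamp-and-reconstruct fix from the last answer can be sketched as follows. The function name `clamp_to_spd`, the floor of 1e-8 and the inconsistent input matrix are illustrative assumptions.

```python
import numpy as np

def clamp_to_spd(sigma, floor=1e-8):
    """Raise eigenvalues below `floor` up to `floor` and rebuild the matrix."""
    sym = (sigma + sigma.T) / 2.0           # remove any asymmetry from round-off
    lam, Q = np.linalg.eigh(sym)
    lam_clamped = np.maximum(lam, floor)
    return (Q * lam_clamped) @ Q.T

# Hypothetical inconsistent "correlation" matrix with one negative eigenvalue.
bad = np.array([[1.0, 0.9, -0.9],
                [0.9, 1.0, 0.9],
                [-0.9, 0.9, 1.0]])

repaired = clamp_to_spd(bad)
print(np.linalg.eigvalsh(bad).min(), np.linalg.eigvalsh(repaired).min())
```

One side effect worth knowing: clamping changes the diagonal, so a repaired correlation matrix no longer has exact ones on it and may need rescaling.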

PRACTICE

L2 (Senior quant).

  1. "Explain PCA of a yield curve in terms of the spectral theorem. What are the first three principal components?" Expected: eigendecompose the covariance of daily rate changes. $q_1$ = level (parallel shift, ~75% variance), $q_2$ = slope (steepening/flattening, ~15%), $q_3$ = curvature (butterfly, ~5%). These are the hedgeable risk factors; the remaining components are noise.

  2. "A covariance matrix for 200 assets is estimated from 60 days of data. How many meaningful principal components are there? How should you use them?" Expected: at most 60 meaningful components (the rank of the data matrix; 59 if the sample mean is subtracted). The remaining eigenvalues, at least 140 of them, are zero (or near-zero noise). Use the top $k$ components that explain e.g. 95% of variance; the rest are estimation artefacts. Build a regularised covariance as $\hat{\Sigma}_k + \hat{\sigma}^2 I$ where $\hat{\sigma}^2$ is the average of the discarded eigenvalues (the Ledoit-Wolf shrinkage intuition).

  3. "How does the low-rank spectral approximation $\hat{\Sigma}_k = \sum_{i=1}^k \lambda_i q_i q_i^\top$ minimise the approximation error?" Expected: it is the best rank-$k$ approximation in the Frobenius norm. The error is $\|\hat{\Sigma} - \hat{\Sigma}_k\|_F^2 = \sum_{i=k+1}^n \lambda_i^2$. This follows from the Eckart-Young theorem (the matrix analogue of truncating a Fourier series at the leading terms).
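The rank argument in the second answer is easy to check numerically. This sketch uses i.i.d. Gaussian data as a hypothetical stand-in for returns, with an arbitrary seed; the sample mean is subtracted, which costs one degree of freedom.

```python
import numpy as np

rng = np.random.default_rng(7)
T, n = 60, 200                      # 60 days of data on 200 assets

X = rng.standard_normal((T, n))     # hypothetical i.i.d. returns, for illustration only
X = X - X.mean(axis=0)              # demeaning removes one more degree of freedom
sigma_hat = X.T @ X / T

lam = np.linalg.eigvalsh(sigma_hat)[::-1]          # descending order
n_nonzero = int(np.sum(lam > 1e-10 * lam[0]))      # count eigenvalues above round-off
print(n_nonzero, n - n_nonzero)
```

The count of non-zero eigenvalues is bounded by the number of observations, not the number of assets, whatever the true covariance looks like.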

PRACTICE

L3 (Researcher).

  1. "In a statistical factor model $r = Bf + \varepsilon$, the covariance is $\Sigma = BFB^\top + D$. How does the spectral structure of $BFB^\top$ compare to that of the full $\Sigma$, and what are the implications for PCA-based risk attribution?" Expected: $BFB^\top$ has at most $k$ non-zero eigenvalues (rank $k$, the number of factors), so its spectrum drops to zero after $k$ components. $\Sigma = BFB^\top + D$ adds the idiosyncratic diagonal $D$, which "fills in" the zero eigenvalues, so the full $\Sigma$ is SPD. PCA of $\Sigma$ recovers the factor structure only approximately; exact recovery requires the structured model (e.g., via EM on the factor model). PCA conflates systematic and idiosyncratic variance.

  2. "What is the variational characterisation of eigenvalues (Courant-Fischer), and how does it explain why the first principal component maximises variance?" Expected: $\lambda_1 = \max_{\|x\|=1} x^\top A x$ (Rayleigh quotient). The argmax is $q_1$ (the leading eigenvector). More generally, $\lambda_k = \max_{\|x\|=1,\; x \perp q_1, \ldots, q_{k-1}} x^\top A x$. This makes precise the statement that $q_1$ is the direction of maximum portfolio variance: it is a theorem, not a definition.
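The Rayleigh quotient characterisation can be probed numerically: no sampled direction should exceed the largest eigenvalue, and the leading eigenvector should attain it. The sketch reuses the matrix from Example 1.3; the sample size and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])          # covariance matrix from Example 1.3

lam, Q = np.linalg.eigh(A)
lam_max, q1 = lam[-1], Q[:, -1]     # eigh sorts ascending, so the last pair is the largest

def rayleigh(A, x):
    """Rayleigh quotient x^T A x / x^T x."""
    return (x @ A @ x) / (x @ x)

# Sample random directions: none should beat the leading eigenvector.
samples = rng.standard_normal((1000, 2))
best_random = max(rayleigh(A, x) for x in samples)
print(best_random, rayleigh(A, q1), lam_max)
```

Random search only approaches the maximum from below, which is a numerical illustration of the bound rather than a proof of it.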