Setup
Every matrix, whether square or rectangular, rank-deficient or full-rank, admits a singular value decomposition (SVD). This is a strictly stronger result than eigendecomposition, which applies only to square matrices and fails for non-diagonalisable ones. The SVD is the canonical tool for understanding the geometry of a linear map and the sensitivity of the linear system $Ax = b$.
Where this lives on a desk. The SVD appears in three distinct quant workflows:
- Calibration and least-squares fitting. When you fit a volatility surface or calibrate a yield curve model, you solve an overdetermined system $Ax \approx b$. The normal equations $A^\top A x = A^\top b$ are solved via the pseudoinverse $x^+ = A^+ b$: the SVD-based answer that minimises the residual $\|Ax - b\|_2$ and, among all minimisers, has minimum norm $\|x\|_2$.
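- Risk factor models. Returns matrices $R$ are decomposed into orthogonal factors. SVD of the (demeaned) $R$ gives the principal components directly without forming $R^\top R$, which is numerically more stable when the smallest singular value is small relative to the largest, because forming the product squares the condition number.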
- Numerical stability diagnostics. The condition number $\kappa(A) = \sigma_1 / \sigma_n$ quantifies how much the solution of $Ax = b$ amplifies errors in $b$. An ill-conditioned calibration problem signals that the model has redundant parameters or the data is insufficient to identify them separately.
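Mathematical setting. Let $A \in \mathbb{R}^{m \times n}$ with $m \ge n$ (the overdetermined case common in calibration). No symmetry assumption. Write $r = \operatorname{rank}(A) \le n$.
Notation. $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n \ge 0$ are the singular values of $A$. Subscripts follow the usual convention: $\sigma_1$ is the largest (the spectral norm $\|A\|_2$).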
Financial Insight. SVD makes the geometry of the calibration problem explicit: $U$ rotates the output space (market observables), $V$ rotates the input space (model parameters), and $\Sigma$ stretches/compresses each independent direction by its singular value. Directions with $\sigma_i \approx 0$ are directions in parameter space that barely affect market observables: the model is near-unidentifiable in those directions.
Theory
1. The Singular Value Decomposition
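Theorem 4.1 (SVD Existence). For any $A \in \mathbb{R}^{m \times n}$ there exist orthogonal matrices $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$, and a matrix $\Sigma \in \mathbb{R}^{m \times n}$ with $\Sigma_{ii} = \sigma_i \ge 0$ and $\Sigma_{ij} = 0$ for $i \ne j$, such that
$$A = U \Sigma V^\top.$$
The diagonal entries of $\Sigma$, taken in non-increasing order $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{\min(m,n)} \ge 0$, are unique and called the singular values of $A$.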
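Derivation. The key is to relate singular values to eigenvalues of the symmetric matrices $A^\top A$ and $A A^\top$.
Since $A^\top A$ is symmetric positive semi-definite (SPSD), the Spectral Theorem (Module 3) guarantees
$$A^\top A = V \Lambda V^\top, \qquad \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n), \qquad \lambda_1 \ge \cdots \ge \lambda_n \ge 0,$$
with $V = [v_1, \dots, v_n]$ orthogonal.
Define $\sigma_i = \sqrt{\lambda_i}$ for $i = 1, \dots, r$ (the non-zero eigenvalues). For $i \le r$, set
$$u_i = \frac{1}{\sigma_i} A v_i.$$
These are orthonormal: for $i, j \le r$,
$$u_i^\top u_j = \frac{v_i^\top A^\top A v_j}{\sigma_i \sigma_j} = \frac{\lambda_j \, v_i^\top v_j}{\sigma_i \sigma_j} = \delta_{ij}.$$
Extend $\{u_1, \dots, u_r\}$ to an orthonormal basis $\{u_1, \dots, u_m\}$ for $\mathbb{R}^m$. Then $A V = U \Sigma$ holds by construction: $A v_i = \sigma_i u_i$ for $i \le r$, and $A v_i = 0$ for $i > r$ because $\|A v_i\|_2^2 = \lambda_i = 0$.
Note: $\sigma_i^2 = \lambda_i(A^\top A) = \lambda_i(A A^\top)$ for $i \le r$, so the non-zero eigenvalues of $A^\top A$ and $A A^\top$ coincide.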
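Definition 4.1 (Thin SVD). The thin (economy) SVD retains only the non-zero singular values:
$$A = U_r \Sigma_r V_r^\top,$$
where $U_r \in \mathbb{R}^{m \times r}$, $\Sigma_r \in \mathbb{R}^{r \times r}$, $V_r \in \mathbb{R}^{n \times r}$. For $r \ll \min(m, n)$ this is far cheaper to compute and store than the full SVD. (Library "economy" routines usually keep $\min(m, n)$ columns; the rank-$r$ form here is also called the compact SVD.)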
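The factorisation in Theorem 4.1 and the link to the eigenvalues of $A^\top A$ can be checked numerically. The following is a minimal NumPy sketch; the matrix size, the seed and all variable names are illustrative assumptions, not part of the text.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 4                      # illustrative sizes with m >= n
A = rng.standard_normal((m, n))

# Full SVD: U is m x m, Vt is n x n, s holds the min(m, n) singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Rebuild the rectangular Sigma and check A = U Sigma V^T.
Sigma = np.zeros((m, n))
Sigma[:n, :n] = np.diag(s)
recon_err = np.linalg.norm(A - U @ Sigma @ Vt)

# Squared singular values should match the eigenvalues of A^T A.
lam = np.linalg.eigvalsh(A.T @ A)[::-1]   # eigvalsh is ascending; reverse it
eig_gap = np.max(np.abs(s**2 - lam))

# Economy factors: U_thin is m x n, enough to reconstruct A.
U_thin, s_thin, Vt_thin = np.linalg.svd(A, full_matrices=False)
thin_err = np.linalg.norm(A - U_thin @ np.diag(s_thin) @ Vt_thin)

print(recon_err, eig_gap, thin_err)
```

All three printed quantities should be at the level of floating-point round-off.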
2. Geometric Interpretation
The decomposition $A = U \Sigma V^\top$ factors the action of $A$ into three steps:
- $V^\top$: rotate the input vector into the right singular vector basis.
- $\Sigma$: independently scale each coordinate by $\sigma_i$ (and discard the null space).
- $U$: rotate the stretched vector into the output (column) space.
Financial Insight. Think of $V^\top$ as transforming raw model parameters into uncorrelated "eigen-parameters". $\Sigma$ tells you how sensitively observable prices respond to each eigen-parameter. Eigen-parameters with near-zero singular values are unobservable from the data: the calibration is degenerate along those directions.
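The three-step picture can be traced on a single vector. This is a small sketch on an assumed random $5 \times 3$ matrix; the names `coords`, `scaled` and `y` are hypothetical labels for the intermediate results.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))
x = rng.standard_normal(3)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

coords = Vt @ x        # step 1: express x in the right singular vector basis
scaled = s * coords    # step 2: scale coordinate i by sigma_i
y = U @ scaled         # step 3: map into the column space of A

step_err = np.linalg.norm(y - A @ x)                        # should be ~0
norm_gap = abs(np.linalg.norm(coords) - np.linalg.norm(x))  # rotation keeps length
print(step_err, norm_gap)
```

Only the middle step changes lengths; the two orthogonal factors leave norms untouched.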
3. The Four Fundamental Subspaces via SVD
The SVD gives explicit orthonormal bases for all four subspaces first seen in Module 1.
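Theorem 4.2 (Fundamental Subspaces). Let $A = U \Sigma V^\top \in \mathbb{R}^{m \times n}$ with rank $r$. Then:
- Column space (image): $\operatorname{col}(A) = \operatorname{span}\{u_1, \dots, u_r\}$, the first $r$ left singular vectors.
- Left null space: $\operatorname{null}(A^\top) = \operatorname{span}\{u_{r+1}, \dots, u_m\}$, the last $m - r$ left singular vectors.
- Row space: $\operatorname{col}(A^\top) = \operatorname{span}\{v_1, \dots, v_r\}$, the first $r$ right singular vectors.
- Null space: $\operatorname{null}(A) = \operatorname{span}\{v_{r+1}, \dots, v_n\}$, the last $n - r$ right singular vectors.
Proof sketch. $A v_i = \sigma_i u_i$ for $i \le r$ (so $u_i \in \operatorname{col}(A)$, and likewise $A^\top u_i = \sigma_i v_i$ gives $v_i \in \operatorname{col}(A^\top)$), and $A v_i = 0$ for $i > r$ (so $v_i \in \operatorname{null}(A)$). The orthogonality of $U$ and $V$ gives the subspace dimensions.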
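The four bases can be read straight off the full SVD. Below is a sketch on an assumed rank-2 matrix built as a product of two random factors; the relative threshold `1e-10` used to decide which singular values count as zero is an illustrative choice, not a recommendation.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, r_true = 6, 4, 2
# Product of (m x 2) and (2 x n) factors has rank 2 by construction.
A = rng.standard_normal((m, r_true)) @ rng.standard_normal((r_true, n))

U, s, Vt = np.linalg.svd(A, full_matrices=True)
tol = 1e-10 * s[0]               # illustrative threshold for "zero"
r = int(np.sum(s > tol))

col_space = U[:, :r]             # basis of col(A)
left_null = U[:, r:]             # basis of null(A^T)
row_space = Vt[:r, :].T          # basis of col(A^T)
null_space = Vt[r:, :].T         # basis of null(A)

null_resid = np.linalg.norm(A @ null_space)        # A v_i ~ 0 for i > r
left_null_resid = np.linalg.norm(A.T @ left_null)  # A^T u_i ~ 0 for i > r
print(r, null_resid, left_null_resid)
```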
4. The Moore-Penrose Pseudoinverse
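When $Ax = b$ is overdetermined ($m > n$, more equations than unknowns) the system generally has no exact solution. The least-squares solution minimising $\|Ax - b\|_2^2$ is:
$$x^+ = A^+ b.$$
Definition 4.2 (Pseudoinverse). The Moore-Penrose pseudoinverse of $A = U \Sigma V^\top$ is
$$A^+ = V \Sigma^+ U^\top,$$
where $\Sigma^+ \in \mathbb{R}^{n \times m}$ is obtained by transposing $\Sigma$ and replacing each non-zero $\sigma_i$ by $1/\sigma_i$, leaving zero entries as zero.
Why this solves the LS problem. For $A$ with full column rank ($r = n$):
$$A^+ = V \Sigma^+ U^\top = (A^\top A)^{-1} A^\top.$$
This is the same formula as the normal equations, but computed via SVD, which stays numerically stable even when $A^\top A$ is ill-conditioned.
Among all minimisers, $x^+$ is the one with smallest norm $\|x\|_2$, which is relevant when the system has a null space and you want the minimum-norm parameter vector.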
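A pseudoinverse built directly from the SVD can be compared against NumPy's own least-squares routines. This is a sketch, not a production solver: `pinv_via_svd` is a hypothetical helper, and its relative cut-off `rcond=1e-12` is an assumed value for deciding which singular values are treated as zero.

```python
import numpy as np

def pinv_via_svd(A, rcond=1e-12):
    """Pseudoinverse V Sigma^+ U^T; singular values below rcond * sigma_1 are zeroed."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    keep = s > rcond * s[0]
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    # Scale the columns of V by 1/sigma_i, then apply U^T.
    return (Vt.T * s_inv) @ U.T

rng = np.random.default_rng(3)
A = rng.standard_normal((8, 3))      # overdetermined, full column rank
b = rng.standard_normal(8)

x_svd = pinv_via_svd(A) @ b
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
x_normal = np.linalg.solve(A.T @ A, A.T @ b)   # normal equations, for comparison only

print(np.linalg.norm(x_svd - x_lstsq), np.linalg.norm(x_svd - x_normal))
```

On this well-conditioned example all three routes agree; the normal-equations route is the one that degrades first as $\kappa(A)$ grows.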
Example 4.1 (Surface calibration as LS). Fitting SABR parameters to market option prices gives, after linearisation, an overdetermined system $Ax \approx b$. The pseudoinverse finds the minimum-residual parameter set. If the condition number $\kappa(A)$ is large, a relative error $\varepsilon$ in market quotes can translate to a relative error of up to $\kappa(A)\,\varepsilon$ in the inferred parameters. Once $\kappa(A)\,\varepsilon$ approaches order one, the calibration is numerically singular and requires regularisation.
5. Condition Number and Perturbation Theory
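Definition 4.3 (Condition number). The 2-norm condition number of invertible $A \in \mathbb{R}^{n \times n}$ is
$$\kappa_2(A) = \|A\|_2 \, \|A^{-1}\|_2 = \frac{\sigma_1}{\sigma_n}.$$
For a rectangular (or rank-deficient) $A$, use $\kappa_2(A) = \sigma_1 / \sigma_r$, where $\sigma_r$ is the smallest non-zero singular value.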
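Why it matters. Consider $Ax = b$ with $A$ invertible. If $b$ is perturbed by $\delta b$ (market quote error, floating-point rounding), the perturbed solution $x + \delta x$ satisfies:
$$A(x + \delta x) = b + \delta b.$$
Theorem 4.3 (Perturbation bound). Let $Ax = b$ and $A(x + \delta x) = b + \delta b$ with $b \ne 0$. Then
$$\frac{\|\delta x\|_2}{\|x\|_2} \le \kappa_2(A) \, \frac{\|\delta b\|_2}{\|b\|_2}.$$
Proof. Subtracting the two systems gives $\delta x = A^{-1} \delta b$, so $\|\delta x\|_2 \le \|A^{-1}\|_2 \, \|\delta b\|_2$. Combine with $\|b\|_2 = \|Ax\|_2 \le \|A\|_2 \, \|x\|_2$, i.e. $1/\|x\|_2 \le \|A\|_2 / \|b\|_2$, and multiply the two inequalities.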
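The bound can be probed empirically by drawing random perturbations and recording the observed amplification. The sketch below assumes a random $5 \times 5$ system and Gaussian perturbations of size $10^{-8}$; both choices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)
b = A @ x

s = np.linalg.svd(A, compute_uv=False)
kappa = s[0] / s[-1]                     # 2-norm condition number

worst = 0.0
for _ in range(1000):
    db = 1e-8 * rng.standard_normal(n)   # random right-hand-side perturbation
    dx = np.linalg.solve(A, db)          # delta x = A^{-1} delta b
    rel_out = np.linalg.norm(dx) / np.linalg.norm(x)
    rel_in = np.linalg.norm(db) / np.linalg.norm(b)
    worst = max(worst, rel_out / rel_in) # observed amplification factor

print(kappa, worst)
```

The largest observed amplification stays below `kappa`, and for random perturbations it is usually well below it.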
Remark. The condition number is a worst-case amplification factor. In practice, errors in $b$ are not aligned with the worst-case direction (the left singular vector $u_n$), so the actual amplification is often much smaller. However, $\kappa(A)$ is the right diagnostic for "is this problem well-posed?"
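Intuition for $\kappa$ values:
| $\kappa(A)$ | Interpretation |
|---|---|
| $\approx 1$ | Well-conditioned; errors not amplified |
| $\approx 10^{k}$ | Machine precision $\approx 10^{-16}$ → solution accurate to about $10^{k-16}$ |
| $\approx 10^{8}$ | Solution may have only 8 significant figures |
| $\gtrsim 10^{16}$ | Numerically singular at double precision |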
6. Low-Rank Approximation (Eckart-Young Theorem)
The spectral truncation from Module 3 extends naturally to non-symmetric matrices via SVD.
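Theorem 4.4 (Eckart-Young, 1936). Among all matrices of rank at most $k$, the best approximation to $A$ in both the spectral norm and the Frobenius norm is
$$A_k = \sum_{i=1}^{k} \sigma_i u_i v_i^\top = U_k \Sigma_k V_k^\top.$$
The approximation errors are:
$$\|A - A_k\|_2 = \sigma_{k+1}, \qquad \|A - A_k\|_F = \sqrt{\sum_{i > k} \sigma_i^2}.$$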
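Comparison with eigendecomposition (Module 3). For symmetric $A$ with eigenpairs $(\lambda_i, q_i)$, singular values equal absolute values of eigenvalues: $\sigma_i = |\lambda_i|$ (after sorting by magnitude). The Eckart-Young theorem for the symmetric case used $A_k = \sum_{i=1}^{k} \lambda_i q_i q_i^\top$, which is identical to the SVD formula when eigenvalues are non-negative.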
Example 4.2 (Returns matrix compression). A returns matrix $R \in \mathbb{R}^{252 \times 100}$ (252 daily returns, 100 assets) has rank at most 100. The rank-$k$ SVD approximation $R_k$ captures the $k$ dominant risk factors. The ratio $\sum_{i=1}^{k} \sigma_i^2 \big/ \sum_{i=1}^{100} \sigma_i^2$ measures the fraction of the total squared Frobenius norm (proportional to total variance when the columns of $R$ are demeaned) explained by the first $k$ factors, equivalent to the PCA variance explained ratio from Module 3.
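The Eckart-Young error formulas and the explained-variance ratio can be verified on synthetic data. The sketch below uses i.i.d. Gaussian "returns" with an assumed 1% scale and an assumed truncation rank `k = 5`; real returns would show a far steeper singular value decay.

```python
import numpy as np

rng = np.random.default_rng(5)
T, N, k = 252, 100, 5                        # days, assets, retained factors (illustrative)
R = 0.01 * rng.standard_normal((T, N))       # synthetic returns, not market data

U, s, Vt = np.linalg.svd(R, full_matrices=False)

# Rank-k truncation: keep the k leading singular triplets.
R_k = (U[:, :k] * s[:k]) @ Vt[:k, :]

err_2 = np.linalg.norm(R - R_k, 2)           # spectral-norm error
err_F = np.linalg.norm(R - R_k, "fro")       # Frobenius-norm error
explained = np.sum(s[:k] ** 2) / np.sum(s ** 2)

print(err_2, s[k])                           # these two should agree
print(err_F, np.sqrt(np.sum(s[k:] ** 2)))    # and so should these
print(explained)
```

Note that `s[k]` is $\sigma_{k+1}$ because NumPy indexes from zero.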
7. Regularisation and Truncated SVD
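When $\kappa(A)$ is large, the naive pseudoinverse amplifies noise in $b$. Two standard remedies:
Truncated SVD (TSVD). Retain only the $k$ largest singular values; set $1/\sigma_i \to 0$ for $i > k$:
$$x_k = \sum_{i=1}^{k} \frac{u_i^\top b}{\sigma_i} \, v_i.$$
Bias increases but variance (noise amplification) decreases as $k$ decreases.
Tikhonov regularisation. Solve $\min_x \|Ax - b\|_2^2 + \lambda \|x\|_2^2$, which has the closed-form solution:
$$x_\lambda = (A^\top A + \lambda I)^{-1} A^\top b = \sum_{i=1}^{r} \frac{\sigma_i}{\sigma_i^2 + \lambda} \, (u_i^\top b) \, v_i.$$
This smoothly down-weights directions with $\sigma_i^2 \ll \lambda$ rather than hard-truncating them.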
Warning. Choosing $k$ (or $\lambda$) is a model selection problem, not a linear algebra problem. Too little regularisation (large $k$, small $\lambda$): noise amplified. Too much (small $k$, large $\lambda$): the solution is biased toward zero. Cross-validation or L-curve methods are the standard ways to choose. On a calibration desk, the regularisation parameter often encodes prior information about parameter magnitude, equivalent to a Bayesian prior.
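Both remedies amount to replacing $1/\sigma_i$ by a filtered version of it. The sketch below assumes the Tikhonov penalty is written as $\lambda \|x\|_2^2$; `tsvd_solve` and `tikhonov_solve` are hypothetical helper names, and the test problem and the value `lam=0.1` are arbitrary.

```python
import numpy as np

def tsvd_solve(A, b, k):
    """Truncated-SVD solution using only the k largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    coeffs = (U[:, :k].T @ b) / s[:k]
    return Vt[:k, :].T @ coeffs

def tikhonov_solve(A, b, lam):
    """Minimiser of ||Ax - b||^2 + lam * ||x||^2 via SVD filter factors."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    filt = s / (s ** 2 + lam)            # replaces 1/sigma_i
    return Vt.T @ (filt * (U.T @ b))

rng = np.random.default_rng(6)
A = rng.standard_normal((30, 6))
b = rng.standard_normal(30)

x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
x_full = tsvd_solve(A, b, k=6)           # no truncation: plain least squares
x_trunc = tsvd_solve(A, b, k=3)
x_tik = tikhonov_solve(A, b, lam=0.1)
x_direct = np.linalg.solve(A.T @ A + 0.1 * np.eye(6), A.T @ b)

print(np.linalg.norm(x_ls), np.linalg.norm(x_trunc), np.linalg.norm(x_tik))
```

Both regularised solutions have norm no larger than the unregularised one, which is the shrinkage toward zero mentioned in the warning above.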
Validation
The companion notebook verifies:
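- SVD factorisation: $\|A - U \Sigma V^\top\|_F \approx 0$ for a random matrix.
- Orthogonality: $\|U^\top U - I\|$, $\|V^\top V - I\|$ both near zero.
- Four subspaces: $A v_i \approx 0$ for $i > r$; $A^\top u_i \approx 0$ for $i > r$.
- Pseudoinverse: $x^+ = A^+ b$ minimises $\|Ax - b\|_2$ over all $x$; verified by perturbing from $x^+$ and checking residual increases.
- Condition number: Hilbert matrix $H_n$ with $(H_n)_{ij} = 1/(i + j - 1)$ is notoriously ill-conditioned; $\kappa(H_n)$ grows exponentially with $n$.
- Eckart-Young error: $\|A - A_k\|_2 = \sigma_{k+1}$ checked against direct computation.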
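The Hilbert-matrix check in the list above is easy to reproduce without the notebook. This is a sketch that builds the matrix by hand; the helper `hilbert` and the chosen sizes are illustrative and need not match the companion notebook.

```python
import numpy as np

def hilbert(n):
    """n x n Hilbert matrix with entries 1 / (i + j - 1), using 1-based i, j."""
    i = np.arange(1, n + 1)
    return 1.0 / (i[:, None] + i[None, :] - 1)

sizes = (2, 4, 6, 8, 10)
kappas = {n: np.linalg.cond(hilbert(n)) for n in sizes}

for n in sizes:
    print(n, f"{kappas[n]:.3e}")
```

The printed condition numbers grow by several orders of magnitude at each step, which is the exponential growth referred to above.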
Hand exercise (before running the notebook). Let . By inspection: (a) What are the singular values of ? (b) What is ? What are its eigenvalues? (c) Write down , , explicitly. (d) What is ? Verify . (Answer: , ; ; , , ; .)
Limitations
Warning: SVD cost vs. eigendecomposition. Computing the SVD of $A \in \mathbb{R}^{m \times n}$ costs $O(\min(mn^2, m^2 n))$ flops for the thin factors. For large, square symmetric matrices (covariance matrices in risk), `eigh` (which exploits symmetry) is faster. Use SVD when $A$ is rectangular, not symmetric, or when you need the pseudoinverse explicitly.
Warning: condition number is a worst-case bound. A large $\kappa(A)$ does not mean the solution is wrong by a factor of $\kappa(A)$, only that it could be. If the right-hand-side perturbation $\delta b$ is nearly orthogonal to the worst-case left singular vector $u_n$, the actual error is much smaller. But in calibration with many market quotes, the error vector can project onto any direction, so worst-case is relevant.
Warning: truncated SVD vs. Tikhonov at the boundary. TSVD is optimal when the signal and noise occupy disjoint singular value subspaces (rare in practice). Tikhonov is optimal when noise is Gaussian and the prior on $x$ is Gaussian. In the common case where neither holds exactly, both are approximations and the choice of threshold $k$ / $\lambda$ dominates the solution quality.
Model failure modes:
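- Near-degenerate singular values: like near-equal eigenvalues (Module 3), the singular vector basis is unstable. A small perturbation in $A$ rotates $u_i$ and $v_i$ arbitrarily if $\sigma_i \approx \sigma_{i+1}$.
- Rank determination: floating-point $\sigma_i$ are never exactly zero. A threshold $\tau$ must be chosen to determine effective rank. The common choice $\tau = \sigma_1 \cdot \max(m, n) \cdot \varepsilon_{\text{mach}}$ is the `numpy.linalg.matrix_rank` default.
- Double-precision breakdown: forming $A^\top A$ squares the condition number, so for matrices with $\kappa(A) \gtrsim 10^{8}$ intermediate computations in $A^\top A$ lose all significant digits (a loss of precision rather than a literal overflow). Work with the matrix $A$ directly via `scipy.linalg.lstsq` rather than forming $A^\top A$ explicitly.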
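The rank threshold and the cost of forming $A^\top A$ can both be illustrated in a few lines. In this sketch `numerical_rank` is a hypothetical helper that mirrors the documented `numpy.linalg.matrix_rank` default tolerance, and the test matrices are random constructions chosen for illustration.

```python
import numpy as np

def numerical_rank(A, tol=None):
    """Count singular values above tol (default: sigma_1 * max(m, n) * eps)."""
    s = np.linalg.svd(A, compute_uv=False)
    if tol is None:
        tol = s.max() * max(A.shape) * np.finfo(A.dtype).eps
    return int(np.sum(s > tol))

rng = np.random.default_rng(9)

# A 50 x 10 matrix of exact rank 3, up to floating-point rounding.
low_rank = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 10))
rank_est = numerical_rank(low_rank)
rank_np = np.linalg.matrix_rank(low_rank)

# Forming the Gram matrix squares the 2-norm condition number.
B = rng.standard_normal((50, 10))
kappa_B = np.linalg.cond(B)
kappa_gram = np.linalg.cond(B.T @ B)

print(rank_est, rank_np)
print(kappa_B ** 2, kappa_gram)
```

Passing an explicit `tol` lets the threshold reflect data noise (for example bid-ask width) instead of machine precision, which is often the more relevant scale in calibration.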
Interview Angle
L1 (Junior Quant Developer / Junior Quant).
Expected depth: definitions, the formula, one worked example.
- "What is the singular value decomposition? How does it differ from eigendecomposition?" Answer: SVD works for any matrix (rectangular, non-symmetric). Eigendecomposition requires a square matrix and may fail (no full eigenbasis). For symmetric $A$, singular values = |eigenvalues|, from the spectral theorem.
- "What does the condition number tell you?" Answer: $\kappa(A) = \sigma_1 / \sigma_n$; it bounds how much relative errors in $b$ are amplified in the solution to $Ax = b$. For calibration, a large condition number signals near-unidentifiability of some parameters.
- "How do you compute the pseudoinverse? When do you need it?" Answer: $A^+ = V \Sigma^+ U^\top$. Needed for overdetermined LS (calibration, regression), rank-deficient systems (degenerate covariance), and minimum-norm solutions.
L2 (Senior Quant / Quant Researcher).
Expected depth: derivation, connection to LS, condition number perturbation theory.
- "Derive the pseudoinverse from first principles and prove it solves the LS problem." Answer: Start from $A = U \Sigma V^\top$. LS minimises $\|Ax - b\|_2^2 = \|\Sigma y - c\|_2^2$ where $y = V^\top x$, $c = U^\top b$ (orthogonal transformations preserve the 2-norm). Minimise: set $y_i = c_i / \sigma_i$ for $i \le r$, $y_i = 0$ for $i > r$ (minimum norm). Back-substitute: $x = V y = V \Sigma^+ U^\top b = A^+ b$.
- "A calibration of 20 SABR parameters to 80 market prices gives condition number $\kappa$. What is the implication and how do you fix it?" Answer: A relative bid-ask spread $\varepsilon$ translates to a relative error of up to $\kappa \, \varepsilon$ in some parameter directions; once that is of order one the result is numerically meaningless. Fix: (i) identify near-zero singular values (those below threshold $\tau$); (ii) either truncate (TSVD) or regularise (Tikhonov with $\lambda$ chosen by L-curve or cross-validation); (iii) consider reducing model complexity, since the ill-conditioning signals over-parameterisation.
L3 (Quant Researcher / Model Risk).
Expected depth: original analysis, regularisation theory, model risk implications.
- "Compare TSVD and Tikhonov regularisation. Under what data-generating processes is each optimal, and how would you choose between them for vol surface calibration?" Answer: TSVD is optimal when the signal lives in a low-dimensional subspace (top $k$ singular directions) and noise is in the complement. Tikhonov is the posterior mean (and MAP) estimator under a Gaussian prior on $x$ and Gaussian noise on $b$; it is biased by construction, trading bias for lower variance. For vol surfaces, neither assumption holds strictly: skew structure creates a soft separation. Practical choice: use Tikhonov with $\lambda$ determined by GCV (generalised cross-validation) or the L-curve. TSVD is preferable when interpretability matters (clear separation of priced vs. unpriced risk factors).
- "How does the SVD of the forward sensitivity matrix inform model risk for a structured product?" Answer: Singular values of the Jacobian $J$ (sensitivities of market observables to model parameters) measure how strongly the observables respond along each singular parameter direction. Directions with $\sigma_i \approx 0$ are parameter combinations that cannot be identified from the hedging instruments. These directions represent model risk: their contribution to the price is not hedged. Monitoring these via daily re-calibration SVD is a model risk control. If a previously identifiable singular direction becomes near-zero, it signals either a regime change or a model breakdown.