Setup
The Ill-Posed Inverse Problem
Calibration is an inverse problem: given observations (market implied vols), infer the underlying model parameters. Hadamard (1902) defined a well-posed problem as one in which a solution exists, is unique, and depends continuously on the data. Calibration typically violates the third condition: small perturbations in market quotes can produce large changes in calibrated parameters.
This ill-posedness is not a modelling pathology; it is a fundamental property of the calibration problem for any sufficiently flexible model. The Heston model has a nearly flat loss surface along parameter combinations that produce similar implied vol surfaces. A nonparametric local vol model is worse: a finite set of quotes is consistent with infinitely many local vol functions, so uniqueness fails as well as stability. Understanding and controlling this instability is what separates a robust production calibrator from an academic prototype.
Conventions
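- Market data: implied vol vector $\sigma^{\mathrm{mkt}} \in \mathbb{R}^N$, perturbed by noise $\varepsilon$ (bid-ask spread, model error).
- Model parameters: $\theta \in \mathbb{R}^p$, constrained to $\Theta \subset \mathbb{R}^p$.
- Calibration operator: $\mathcal{C}: \mathbb{R}^N \to \Theta$, $\mathcal{C}(\sigma^{\mathrm{mkt}}) = \arg\min_{\theta \in \Theta} \|r(\theta)\|^2$, with residual $r(\theta) = \sigma^{\mathrm{model}}(\theta) - \sigma^{\mathrm{mkt}}$.
- Condition number: $\mathrm{cond}(J) = s_{\max}(J) / s_{\min}(J)$, where $J = \partial r / \partial \theta \in \mathbb{R}^{N \times p}$ is the Jacobian and $s_{\max}$, $s_{\min}$ are its largest and smallest singular values. Large $\mathrm{cond}(J)$ indicates ill-conditioning.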
Theory: Sources of Instability
Flat Loss Surfaces and Ridges
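Define the calibration loss $L(\theta) = \|r(\theta)\|^2$. Its curvature is characterised by the Gauss-Newton Hessian approximation $H \approx 2 J^\top J$. If $J^\top J$ has small eigenvalues $s_i^2 \approx 0$ (the squared singular values of $J$), the loss surface is nearly flat in the corresponding eigendirections.
Formally: if $J^\top J v \approx 0$ for a unit vector $v$, then changing $\theta$ by $\epsilon v$ near the optimum changes the objective by only $\epsilon^2 v^\top J^\top J v \approx 0$ to second order. Equivalently, the model output changes by $\epsilon J v$, and $\|J v\| \approx 0$. The parameter direction $v$ is weakly identified: it barely affects the implied vol surface.
When calibrating to noisy data $\sigma^{\mathrm{mkt}} + \varepsilon$, the perturbed optimum is, to first order:
$$\hat\theta_\varepsilon \approx \hat\theta + \delta\theta, \qquad \delta\theta = (J^\top J)^{-1} J^\top \varepsilon.$$
The sensitivity is $\|\delta\theta\| \le \|\varepsilon\| / s_{\min}$, where $s_1 \ge \dots \ge s_p = s_{\min}$ are the singular values of $J$. For ill-conditioned $J$, small data noise $\varepsilon$ causes a large parameter perturbation $\delta\theta$.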
SVD Decomposition of the Calibration Problem
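Write the SVD of the Jacobian: $J = U S V^\top$, where $U \in \mathbb{R}^{N \times p}$ has orthonormal columns $u_i$, $S = \mathrm{diag}(s_1, \dots, s_p)$, and $V \in \mathbb{R}^{p \times p}$ is orthogonal with columns $v_i$.
The unregularised least-squares solution, written as the parameter shift $\delta\theta$ induced by data noise $\varepsilon$, is:
$$\delta\theta = \sum_{i=1}^{p} \frac{u_i^\top \varepsilon}{s_i}\, v_i.$$
Small singular values $s_i$ amplify the noise component $u_i^\top \varepsilon$ arbitrarily. This is the mechanism of instability.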
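The amplification mechanism can be checked numerically. The sketch below is only an illustration: the toy Jacobian `J` (columns scaled so that the singular values span roughly three orders of magnitude) and the noise vector `eps` are made-up inputs, not market data. It builds the noise-induced parameter shift mode by mode from the SVD, then compares it with the pseudo-inverse solution and with the bound given by the noise norm divided by the smallest singular value.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy Jacobian: 20 quotes, 3 parameters, with columns scaled so
# that the third parameter direction is weakly identified.
J = rng.standard_normal((20, 3)) @ np.diag([1.0, 0.1, 1e-3])
eps = 1e-3 * rng.standard_normal(20)      # hypothetical quote noise

U, s, Vt = np.linalg.svd(J, full_matrices=False)

# Per-mode contribution to the parameter shift: (u_i . eps) / s_i
mode_coeffs = (U.T @ eps) / s
dtheta_svd = Vt.T @ mode_coeffs           # sum_i mode_coeffs[i] * v_i

# Cross-check against the pseudo-inverse (least-squares) solution
dtheta_pinv = np.linalg.pinv(J) @ eps

# Worst-case bound: ||eps|| / s_min
bound = np.linalg.norm(eps) / s[-1]

print("singular values:", s)
print("noise gain 1/s_i:", 1.0 / s)
print("||dtheta|| =", np.linalg.norm(dtheta_svd), " bound =", bound)
```

The mode with the smallest singular value dominates the shift, which is why regularisation targets exactly those modes.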
Tikhonov Regularisation
The Penalised Objective
Replace the unregularised calibration objective with:
$$F_\mu(\theta) = \|r(\theta)\|^2 + \mu\,\|\Gamma(\theta - \theta_0)\|^2,$$
where:
- $\mu \ge 0$: regularisation parameter, controls the bias-variance tradeoff.
- $\Gamma$: regularisation matrix (often the identity or a difference operator).
- $\theta_0$: prior (e.g., yesterday's calibrated parameters, or a reference set of parameters).
Interpretation: We seek parameters that fit the market data and are close to the prior $\theta_0$. The penalty $\mu\,\|\Gamma(\theta - \theta_0)\|^2$ discourages large deviations from the prior.
Modified Normal Equations
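The first-order condition for the penalised objective (linearising $r$ around the current iterate $\theta$, with step $\Delta\theta$):
$$(J^\top J + \mu\,\Gamma^\top \Gamma)\,\Delta\theta = -J^\top r(\theta) - \mu\,\Gamma^\top \Gamma\,(\theta - \theta_0).$$
More practically, in the Levenberg-Marquardt (LM) setting, the Tikhonov-LM update is:
$$(J^\top J + \lambda I + \mu\,\Gamma^\top \Gamma)\,\Delta\theta = -J^\top r(\theta) - \mu\,\Gamma^\top \Gamma\,(\theta - \theta_0),$$
where $\lambda$ is the LM damping and $\mu$ is the Tikhonov strength. The regularisation term adds $\mu\,\Gamma^\top \Gamma$ to the system matrix: to the diagonal when $\Gamma = I$, or to a band around the diagonal when $\Gamma$ is a difference operator. With $\Gamma = I$ this bounds the minimum eigenvalue of the system matrix from below by $\lambda + \mu$. A difference operator has a nontrivial null space (constant vectors, for example), so along those directions the lower bound comes from $\lambda$ alone.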
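A single Tikhonov-LM step can be sketched in a few lines. The function name `tikhonov_lm_step` and the toy linear problem are illustrative assumptions, and the sketch ignores box constraints. It solves the damped, regularised normal equations for the step and, as a sanity check, confirms that for a linear residual with zero LM damping one step lands on the closed-form penalised minimiser.

```python
import numpy as np

def tikhonov_lm_step(J, r, theta, theta0, lam, mu, Gamma=None):
    """One damped, regularised Gauss-Newton step (illustrative sketch).

    Solves (J^T J + lam*I + mu*Gamma^T Gamma) step
           = -(J^T r + mu*Gamma^T Gamma (theta - theta0)).
    """
    p = theta.size
    if Gamma is None:
        Gamma = np.eye(p)              # default: penalise distance to the prior
    GtG = Gamma.T @ Gamma
    A = J.T @ J + lam * np.eye(p) + mu * GtG
    b = -(J.T @ r + mu * GtG @ (theta - theta0))
    return np.linalg.solve(A, b)

# Hypothetical linear test problem: r(theta) = J theta - y
rng = np.random.default_rng(0)
J = rng.standard_normal((30, 4))
y = rng.standard_normal(30)
theta0 = np.zeros(4)                   # prior
theta = rng.standard_normal(4)         # arbitrary current iterate
mu = 0.5

step = tikhonov_lm_step(J, J @ theta - y, theta, theta0, lam=0.0, mu=mu)
damped = tikhonov_lm_step(J, J @ theta - y, theta, theta0, lam=10.0, mu=mu)

# Closed-form minimiser of ||J theta - y||^2 + mu ||theta - theta0||^2
theta_ridge = np.linalg.solve(J.T @ J + mu * np.eye(4), J.T @ y + mu * theta0)
print("one-step error:", np.linalg.norm(theta + step - theta_ridge))
print("step norms (lam=0, lam=10):", np.linalg.norm(step), np.linalg.norm(damped))
```

Increasing `lam` shortens the step without changing the point the iteration converges to, whereas `mu` changes the minimiser itself.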
Bias-Variance Tradeoff
Small $\mu$: The solution minimises the data fit with little constraint. The estimator is approximately unbiased ($\mathbb{E}[\hat\theta_\mu] \approx \theta^*$) but highly variable: small noise in $\sigma^{\mathrm{mkt}}$ causes large swings in $\hat\theta_\mu$. High variance.
Large $\mu$: The solution is heavily pulled towards the prior $\theta_0$. The estimator has low variance (stable day-to-day calibration) but is biased away from the true parameters. High bias.
Optimal $\mu$: Minimises the mean squared error $\mathbb{E}\,\|\hat\theta_\mu - \theta^*\|^2 = \|\mathrm{bias}\|^2 + \mathrm{variance}$. Methods to select $\mu$ are discussed below.
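The tradeoff can be made concrete with a small Monte Carlo experiment. Everything in the sketch below is an assumed toy setup: a linear model with a made-up Jacobian `J`, true parameters `theta_true`, a zero prior `theta0`, identity regularisation matrix, and i.i.d. Gaussian noise of standard deviation `delta`. It estimates the squared bias and the variance of the regularised estimator for a few regularisation strengths.

```python
import numpy as np

rng = np.random.default_rng(42)
N, p = 40, 3

# Hypothetical linear problem with one weakly identified direction
J = rng.standard_normal((N, p)) @ np.diag([1.0, 0.3, 0.01])
theta_true = np.array([1.0, -0.5, 2.0])
theta0 = np.zeros(p)          # prior (deliberately wrong)
delta = 0.05                  # noise standard deviation per instrument

def bias2_and_variance(mu, n_trials=2000):
    """Monte Carlo estimate of squared bias and total variance at strength mu."""
    noise = delta * rng.standard_normal((n_trials, N))
    ys = J @ theta_true + noise                     # shape (n_trials, N)
    A = J.T @ J + mu * np.eye(p)
    # Closed-form penalised estimator for every noisy data set at once
    est = theta0 + np.linalg.solve(A, J.T @ (ys - J @ theta0).T).T
    bias2 = float(np.sum((est.mean(axis=0) - theta_true) ** 2))
    variance = float(np.sum(est.var(axis=0)))
    return bias2, variance

for mu in [1e-8, 1e-3, 1e-1, 10.0]:
    b2, var = bias2_and_variance(mu)
    print(f"mu={mu:8.1e}  bias^2={b2:.4f}  variance={var:.4f}  mse={b2 + var:.4f}")
```

In this toy setup the variance falls and the squared bias rises as the strength grows, so the mean squared error is smallest at an intermediate value.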
Effect on Singular Values
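With $\Gamma = I$, the inverse singular values $1/s_i$ of $J$ that appear in the least-squares solution are replaced by their regularised counterparts:
$$\frac{1}{s_i} \;\longrightarrow\; \frac{s_i}{s_i^2 + \mu} = \frac{f_i}{s_i}, \qquad f_i = \frac{s_i^2}{s_i^2 + \mu}.$$
The factor $f_i$ dampens directions with small $s_i$: for $s_i^2 \gg \mu$, it is approximately $1$ (no regularisation effect); for $s_i^2 \ll \mu$, it is approximately $s_i^2 / \mu \approx 0$ (strongly suppressed). This is a soft truncation of the SVD.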
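The filter-factor view is easy to verify on a linear problem. In the sketch below the helper names `tikhonov_filter_factors` and `regularised_solution_svd`, the random matrix and the data vector are all illustrative assumptions; the prior is taken to be zero and the regularisation matrix the identity. The SVD-filtered solution is compared with the solution of the regularised normal equations.

```python
import numpy as np

def tikhonov_filter_factors(s, mu):
    """Filter factors s_i^2 / (s_i^2 + mu) for singular values s."""
    return s**2 / (s**2 + mu)

def regularised_solution_svd(J, y, mu):
    """Minimiser of ||J x - y||^2 + mu ||x||^2 via filtered SVD modes."""
    U, s, Vt = np.linalg.svd(J, full_matrices=False)
    f = tikhonov_filter_factors(s, mu)
    return Vt.T @ (f * (U.T @ y) / s)

rng = np.random.default_rng(5)
J = rng.standard_normal((25, 4)) @ np.diag([1.0, 0.2, 0.02, 0.002])
y = rng.standard_normal(25)
mu = 1e-3

x_svd = regularised_solution_svd(J, y, mu)
x_normal = np.linalg.solve(J.T @ J + mu * np.eye(4), J.T @ y)
print("max abs difference:", np.max(np.abs(x_svd - x_normal)))

# Limiting behaviour of the filter factors
f_demo = tikhonov_filter_factors(np.array([10.0, 1e-3]), 1e-2)
print("f for s^2 >> mu and s^2 << mu:", f_demo)
```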
Parameter Selection Methods
L-Curve Method
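Plot $\|r(\theta_\mu)\|$ (residual norm, vertical axis) versus $\|\Gamma(\theta_\mu - \theta_0)\|$ (regularisation norm, horizontal axis) for a range of $\mu$ values, where $\theta_\mu$ is the calibrated parameter vector at strength $\mu$. The curve is typically L-shaped:
- Horizontal arm (small $\mu$): residual small but regularisation term large (overfitting, unstable parameters).
- Vertical arm (large $\mu$): residual large (underfitting) but regularisation norm small.
- Corner: the optimal $\mu$ that balances fit and stability.
The corner is identified as the point of maximum curvature on the log-log plot:
$$\kappa(\mu) = \frac{|\rho''\,\eta' - \rho'\,\eta''|}{\left(\rho'^2 + \eta'^2\right)^{3/2}},$$
where $\rho = \log \|r(\theta_\mu)\|$ and $\eta = \log \|\Gamma(\theta_\mu - \theta_0)\|$ as functions of $\log \mu$, and primes denote derivatives with respect to $\log \mu$.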
Morozov Discrepancy Principle
If the noise level $\delta$ in the data is known (e.g., from the bid-ask spread), choose $\mu$ such that the residual matches the expected data noise:
$$\|r(\theta_\mu)\| \approx \delta \sqrt{N}.$$
Interpretation: if the calibrator fits significantly better than the noise floor, it is overfitting to noise and parameters will be unstable. The discrepancy principle says: stop when fit is as good as the data quality warrants.
In practice, $\delta$ is estimated from the average bid-ask half-spread in implied vol units.
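For a linear problem the discrepancy principle reduces to a one-dimensional root search, because the residual norm grows monotonically with the regularisation strength. The sketch below is a minimal illustration under stated assumptions: a linear residual, identity regularisation matrix, zero prior, and hypothetical helper names `residual_norm` and `morozov_mu`. It bisects in log-strength until the residual norm matches the noise level times the square root of the number of instruments.

```python
import numpy as np

def residual_norm(J, y, mu):
    """||J x_mu - y|| for the minimiser x_mu of ||J x - y||^2 + mu ||x||^2."""
    p = J.shape[1]
    x = np.linalg.solve(J.T @ J + mu * np.eye(p), J.T @ y)
    return float(np.linalg.norm(J @ x - y))

def morozov_mu(J, y, delta, mu_lo=1e-12, mu_hi=1e6, n_iter=60):
    """Bisection in log(mu) for ||r(mu)|| = delta * sqrt(N)."""
    target = delta * np.sqrt(len(y))
    if residual_norm(J, y, mu_lo) >= target:
        return mu_lo   # even the weakest penalty leaves a residual above the noise floor
    if residual_norm(J, y, mu_hi) <= target:
        return mu_hi   # even the strongest penalty on the bracket fits within noise
    lo, hi = np.log(mu_lo), np.log(mu_hi)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if residual_norm(J, y, np.exp(mid)) < target:
            lo = mid   # fitting better than the noise floor: regularise more
        else:
            hi = mid
    return float(np.exp(0.5 * (lo + hi)))

# Hypothetical data: 50 quotes, 4 parameters, noise of 1 vol point
rng = np.random.default_rng(3)
N, p = 50, 4
J = rng.standard_normal((N, p)) @ np.diag([1.0, 0.5, 0.05, 0.005])
theta_true = np.array([0.5, -1.0, 0.8, 0.3])
delta = 0.01
y = J @ theta_true + delta * rng.standard_normal(N)

mu_star = morozov_mu(J, y, delta)
print("selected mu:", mu_star, " residual norm:", residual_norm(J, y, mu_star))
```

For the nonlinear calibration problem each evaluation of the residual norm is a full calibration, so a coarse grid sweep is often used in place of bisection.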
Generalised Cross-Validation (GCV)
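For linear problems, generalised cross-validation selects $\mu$ by minimising:
$$\mathrm{GCV}(\mu) = \frac{N^{-1}\,\|r(\theta_\mu)\|^2}{\left[N^{-1}\,\mathrm{tr}(I - A_\mu)\right]^2},$$
where $A_\mu = J (J^\top J + \mu\,\Gamma^\top \Gamma)^{-1} J^\top$ is the hat (influence) matrix. GCV approximates leave-one-out cross-validation for linear problems without the computational overhead of fitting $N$ separate models.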
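A compact way to evaluate the GCV score for a linear problem is through the SVD, which avoids forming the hat matrix. The sketch below assumes an identity regularisation matrix and a zero prior; the function names `gcv_score` and `gcv_select` and the synthetic data are illustrative, not taken from any library.

```python
import numpy as np

def gcv_score(J, y, mu):
    """GCV(mu) = (||(I - A) y||^2 / N) / (tr(I - A) / N)^2 with A = U diag(f) U^T."""
    N = len(y)
    U, s, _ = np.linalg.svd(J, full_matrices=False)
    f = s**2 / (s**2 + mu)                 # Tikhonov filter factors
    beta = U.T @ y                         # data coordinates in the range of J
    out_of_range = max(float(y @ y - beta @ beta), 0.0)
    resid2 = float(np.sum(((1.0 - f) * beta) ** 2)) + out_of_range
    trace = N - float(np.sum(f))           # tr(I - A)
    return (resid2 / N) / (trace / N) ** 2

def gcv_select(J, y, mu_grid):
    """Return the grid point with the smallest GCV score, plus all scores."""
    scores = np.array([gcv_score(J, y, mu) for mu in mu_grid])
    return float(mu_grid[int(np.argmin(scores))]), scores

# Hypothetical noisy linear data
rng = np.random.default_rng(11)
J = rng.standard_normal((30, 3)) @ np.diag([1.0, 0.1, 0.01])
y = J @ np.array([1.0, 2.0, -1.0]) + 0.05 * rng.standard_normal(30)

mu_grid = np.logspace(-8, 2, 41)
mu_gcv, scores = gcv_select(J, y, mu_grid)
print("GCV-selected mu:", mu_gcv)
```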
Smoothness Regularisation for Vol Surfaces
For full vol surface calibration (e.g., fitting a nonparametric local vol function), regularisation must additionally enforce smoothness. The penalised objective is:
$$F_\lambda(\sigma_{\mathrm{loc}}) = \sum_{i=1}^N \left(C_{\mathrm{model}}(K_i, T_i; \sigma_{\mathrm{loc}}) - C_{\mathrm{mkt},i}\right)^2 + \lambda_1 \|\partial_{kk} \sigma_{\mathrm{loc}}\|^2 + \lambda_2 \|\partial_T \sigma_{\mathrm{loc}}\|^2.$$
The first penalty penalises curvature in the strike dimension (suppresses oscillations across strikes). The second penalises variation in the maturity dimension (suppresses a jagged term structure). Both are discretised on the calibration grid using finite differences.
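The two smoothness penalties can be discretised with standard finite differences. The sketch below assumes a uniform strike and maturity grid, stores the local vol surface as an array of shape (maturities, strikes), and uses plain sums of squares without quadrature weights; the function name `smoothness_penalty` and the test surfaces are hypothetical.

```python
import numpy as np

def smoothness_penalty(sigma_loc, dK, dT, lam1, lam2):
    """lam1 * ||d^2 sigma / dK^2||^2 + lam2 * ||d sigma / dT||^2 on a uniform grid.

    sigma_loc has shape (n_T, n_K): rows are maturities, columns are strikes.
    """
    # Central second difference along strikes (interior strike nodes only)
    d2_strike = (sigma_loc[:, 2:] - 2.0 * sigma_loc[:, 1:-1] + sigma_loc[:, :-2]) / dK**2
    # Forward first difference along maturities
    d_maturity = (sigma_loc[1:, :] - sigma_loc[:-1, :]) / dT
    return lam1 * float(np.sum(d2_strike**2)) + lam2 * float(np.sum(d_maturity**2))

K = np.linspace(80.0, 120.0, 9)        # strike grid, spacing 5
T = np.linspace(0.25, 1.0, 4)          # maturity grid, spacing 0.25
dK, dT = K[1] - K[0], T[1] - T[0]

flat = np.full((T.size, K.size), 0.2)                              # no curvature, no term structure
smile = 0.2 + 1e-4 * (K - 100.0) ** 2 * np.ones((T.size, 1))       # quadratic in strike, flat in T

pen_flat = smoothness_penalty(flat, dK, dT, lam1=1.0, lam2=1.0)
pen_smile = smoothness_penalty(smile, dK, dT, lam1=1.0, lam2=1.0)
print("flat surface penalty:", pen_flat)
print("quadratic smile penalty:", pen_smile)
```

A flat surface incurs no penalty, while the quadratic smile is charged only through the strike-curvature term, since it has no variation across maturities.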
Implementation
import numpy as np
from scipy.optimize import minimize, Bounds
from scipy.linalg import svd


def tikhonov_calibrate(
    residuals_fn,                      # callable: theta -> r (N-vector)
    jacobian_fn,                       # callable: theta -> J (N x p matrix)
    theta0: np.ndarray,                # prior / initial guess
    lb: np.ndarray,                    # lower bounds on parameters
    ub: np.ndarray,                    # upper bounds on parameters
    mu: float = 1e-3,                  # Tikhonov regularisation strength
    gamma: np.ndarray | None = None,   # regularisation matrix (default: identity)
) -> dict:
    """
    Tikhonov-regularised calibration via L-BFGS-B.

    The penalised objective is:
        F_mu(theta) = ||r(theta)||^2 + mu * ||Gamma(theta - theta0)||^2

    Inputs:
        residuals_fn: callable (p,) -> (N,)
        jacobian_fn: callable (p,) -> (N, p)
        theta0: prior parameter vector (p,), also used as the starting point
        lb, ub: box constraints on parameters
        mu: regularisation strength
        gamma: regularisation matrix, shape (q, p); default = identity

    Returns:
        dict with keys: params, rms, converged, reg_norm
    """
    theta0 = np.asarray(theta0, dtype=float)
    p = len(theta0)
    if gamma is None:
        gamma = np.eye(p)
    gtg = gamma.T @ gamma  # Gamma^T Gamma, constant across iterations

    def penalised_objective_and_grad(theta: np.ndarray):
        r = residuals_fn(theta)
        J = jacobian_fn(theta)
        # Data fit: ||r||^2 and its gradient 2 J^T r
        obj = float(np.dot(r, r))
        g = 2.0 * J.T @ r
        # Tikhonov penalty: mu * ||Gamma (theta - theta0)||^2 and its gradient
        diff = theta - theta0
        obj += mu * float(diff @ gtg @ diff)
        g = g + 2.0 * mu * (gtg @ diff)
        return obj, g

    result = minimize(
        penalised_objective_and_grad,
        theta0,
        method='L-BFGS-B',
        jac=True,  # the objective returns (value, gradient)
        bounds=Bounds(lb=lb, ub=ub),
        options=dict(ftol=1e-14, gtol=1e-10, maxiter=500),
    )

    r_final = residuals_fn(result.x)
    rms = float(np.sqrt(np.mean(r_final**2)))
    reg_norm = float(np.linalg.norm(gamma @ (result.x - theta0)))
    return dict(params=result.x, rms=rms, converged=result.success, reg_norm=reg_norm)
def l_curve_analysis(
    residuals_fn,
    jacobian_fn,
    theta0: np.ndarray,
    lb: np.ndarray,
    ub: np.ndarray,
    mu_grid: np.ndarray | None = None,
) -> dict:
    """
    Compute the L-curve across a range of regularisation strengths.

    Returns dict with keys: mu_grid, rms_grid, reg_norm_grid,
    optimal_mu (corner of L-curve).
    """
    if mu_grid is None:
        mu_grid = np.logspace(-6, 1, 30)
    mu_grid = np.asarray(mu_grid, dtype=float)

    rms_list = []
    reg_list = []
    for mu in mu_grid:
        result = tikhonov_calibrate(residuals_fn, jacobian_fn, theta0, lb, ub, mu=mu)
        rms_list.append(result['rms'])
        reg_list.append(result['reg_norm'])
    rms_arr = np.array(rms_list)
    reg_arr = np.array(reg_list)

    # Identify corner: maximum curvature on the log-log L-curve.
    # The small offset guards against log(0) when a norm vanishes.
    log_mu = np.log(mu_grid)
    log_rms = np.log(rms_arr + 1e-16)
    log_reg = np.log(reg_arr + 1e-16)

    # Numerical derivatives with respect to log(mu) via finite differences
    drms = np.gradient(log_rms, log_mu)
    dreg = np.gradient(log_reg, log_mu)
    d2rms = np.gradient(drms, log_mu)
    d2reg = np.gradient(dreg, log_mu)

    # Curvature of the parametric curve (log_reg, log_rms); the floor on the
    # denominator avoids 0/0 where the curve is locally stationary.
    denom = np.maximum((drms**2 + dreg**2) ** 1.5, 1e-300)
    kappa = np.abs(d2rms * dreg - drms * d2reg) / denom
    corner_idx = int(np.argmax(kappa))

    return dict(
        mu_grid=mu_grid,
        rms_grid=rms_arr,
        reg_norm_grid=reg_arr,
        optimal_mu=float(mu_grid[corner_idx]),
    )
def condition_number_analysis(jacobian: np.ndarray) -> dict:
    """
    SVD analysis of the Jacobian to assess calibration conditioning.

    Returns singular values, condition number, and the singular vectors that
    relate parameter-space directions to implied vol surface variation.
    """
    U, s, Vt = svd(jacobian, full_matrices=False)
    # s is sorted in descending order, so s[0] / s[-1] is the condition number
    cond = float(s[0] / s[-1]) if s[-1] > 0 else float('inf')
    return dict(
        singular_values=s,
        condition_number=cond,
        right_singular_vectors=Vt,  # rows are parameter-space directions
        left_singular_vectors=U,    # columns are data-space directions
    )
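The right singular vectors tell you which parameter combination is poorly identified, not just that one exists. The sketch below uses a made-up Jacobian whose first two columns are almost identical, so the difference of the first two parameters should be the weak direction; it calls `numpy.linalg.svd` directly so that it runs on its own.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical sensitivities for 40 quotes: parameters 1 and 2 move the
# surface in almost the same way, parameter 3 acts independently.
a, b, c = rng.standard_normal((3, 40))
J = np.column_stack([a, a + 1e-3 * b, c])

_, s, Vt = np.linalg.svd(J, full_matrices=False)

# Singular values are sorted in descending order, so the last row of Vt is
# the parameter-space direction that changes the model output the least.
v_weak = Vt[-1]

print("singular values:", s)
print("weakest direction:", v_weak)
```

Here the weakest direction is close to (1, -1, 0) up to sign and normalisation: the data constrain the sum of the first two parameters far better than their difference.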
Practical Calibration Workflow
def daily_recalibration(
    market_vols: np.ndarray,       # today's market implied vols (assumed already captured by model_residuals_fn)
    prev_params: np.ndarray,       # yesterday's calibrated parameters, used as the prior
    model_residuals_fn,            # model residuals function
    model_jacobian_fn,             # model Jacobian function
    lb: np.ndarray,
    ub: np.ndarray,
    noise_level: float = 0.002,    # ~0.2 vol point bid-ask half-spread
    n_instruments: int = 50,       # not used below: rms is already normalised by N
) -> dict:
    """
    Production daily recalibration:
    1. Check the condition number of the Jacobian at the previous day's parameters.
    2. If condition number > threshold, switch to Tikhonov with mu from the
       discrepancy principle (falling back to the L-curve corner).
    3. Cross-validation against held-out instruments is not implemented here.
    """
    # Step 1: Condition number check
    J_prev = model_jacobian_fn(prev_params)
    cond = condition_number_analysis(J_prev)['condition_number']
    print(f"Condition number at prior: {cond:.1f}")

    # Step 2: Choose regularisation
    if cond > 1e4:
        # Morozov: ||r|| ~ noise_level * sqrt(N), i.e. rms ~ noise_level
        target_rms = noise_level
        # Sweep mu once; the same sweep supplies the L-curve corner as a fallback
        lc = l_curve_analysis(model_residuals_fn, model_jacobian_fn, prev_params, lb, ub)
        admissible = lc['rms_grid'] <= target_rms
        if admissible.any():
            # Strongest regularisation that still fits within the noise floor
            mu = float(lc['mu_grid'][admissible].max())
        else:
            # No grid point reaches the noise floor: use the L-curve corner
            mu = float(lc['optimal_mu'])
        print(f"Ill-conditioned: using Tikhonov mu={mu:.2e}")
    else:
        mu = 0.0  # no regularisation needed

    result = tikhonov_calibrate(
        model_residuals_fn, model_jacobian_fn, prev_params, lb, ub, mu=mu
    )
    return result
Limitations
Bias is always present. Tikhonov regularisation introduces bias: the calibrated parameters are pulled towards the prior $\theta_0$. If $\theta_0$ is wrong (e.g., the market has moved significantly overnight), the bias can be substantial. The regularised calibration will underfit the market at some strikes. This must be monitored via residual analysis.
Prior dependence. The choice of prior matters significantly when $\mu$ is large. Starting each day from scratch with $\theta_0 = \bar\theta$ (a historical average) is more stable than using the previous day's calibration when the market has experienced a large move. A jump-aware prior (detecting vol regime changes) is important in production.
Smoothness regularisation for local vol. Regularising with $\|\partial_{kk} \sigma_{\mathrm{loc}}\|^2$ suppresses oscillations but also smooths genuine sharp features (e.g., the steep ATM skew in single-stock options with near-term earnings). The regularisation parameter $\lambda_1$ must be small enough to preserve these features.
Condition number as a calibration health metric. In production, the condition number of $J$ should be logged daily. A sudden increase in condition number may indicate that the model is losing its ability to fit the market with well-identified parameters, a possible sign that a regime shift has occurred or that a model change is needed.
GCV and L-curve are approximate. Both methods assume that the calibration problem is (approximately) linear, which is only valid near the solution. For strongly nonlinear problems (e.g., calibrating a rough vol model), these criteria can select poor regularisation strengths.
Interview Angle
L1. What does it mean for a calibration problem to be ill-posed? Give a concrete example using the Heston model and explain what "instability" means in practical terms for a trading desk.
Ill-posedness: The problem fails at least one of Hadamard's conditions, here continuous dependence on the data. The calibrated Heston parameters change significantly from one day to the next despite a nearly unchanged implied vol surface. Concretely: the mean-reversion speed $\kappa$ and the long-run variance $\bar{v}$ individually are poorly identified; many pairs with the same product $\kappa \bar{v}$ produce indistinguishable surfaces. On the desk, this means the P&L attribution from theta (time decay, via $\kappa$ and $\bar{v}$) fluctuates wildly day-to-day without a corresponding market move. Model risk reserves are inflated to cover this instability.
L2. Explain Tikhonov regularisation. What is the penalised objective, and how does the regularisation parameter control the bias-variance tradeoff? What is the Morozov discrepancy principle, and how do you estimate the noise level from market data?
Penalised objective: $F_\mu(\theta) = \|r(\theta)\|^2 + \mu\,\|\Gamma(\theta - \theta_0)\|^2$. Small $\mu$: low bias (fits market well), high variance (parameters jump with market noise). Large $\mu$: high bias (pulled to prior), low variance (stable calibration). Morozov principle: set $\mu$ such that $\|r(\theta_\mu)\| \approx \delta \sqrt{N}$, where $\delta$ is the noise standard deviation per instrument. Estimate $\delta$ from bid-ask half-spreads in implied vol space (of the order of 1 to 3 vol points for liquid equity index options).
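L3. Analyse the ill-posedness via the SVD of the Jacobian. How does Tikhonov regularisation modify the singular value decomposition? Derive the bias and variance of the regularised estimator as a function of $\mu$ and discuss the optimal $\mu$ in the bias-variance sense.
SVD analysis: Write $J = U S V^\top$ with singular values $s_i$ and singular vectors $u_i$, $v_i$. With $\Gamma = I$ and linearised data $y = J(\theta^* - \theta_0) + \varepsilon$, the regularised solution is $\hat\theta_\mu = \theta_0 + \sum_i f_i \frac{u_i^\top y}{s_i} v_i$, where $f_i = s_i^2 / (s_i^2 + \mu)$ are the Tikhonov filter factors. For the true parameters $\theta^*$ and noise $\varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\mathrm{Cov}(\varepsilon) = \delta^2 I$:
$$\mathbb{E}[\hat\theta_\mu] - \theta^* = -\sum_i (1 - f_i)\, c_i\, v_i, \qquad \mathrm{Cov}(\hat\theta_\mu) = \delta^2 \sum_i \frac{f_i^2}{s_i^2}\, v_i v_i^\top, \qquad c_i = v_i^\top (\theta^* - \theta_0).$$
The mean squared error is $\mathrm{MSE}(\mu) = \sum_i \left[ (1 - f_i)^2 c_i^2 + f_i^2\, \delta^2 / s_i^2 \right]$. The optimal $\mu$ trades off these two terms: for a specific direction $v_i$, the optimal balance gives $\mu_i = \delta^2 / c_i^2$. In practice, the L-curve or GCV approximates this optimum without requiring knowledge of $\theta^*$.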
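The per-direction optimum can be checked numerically. For a single SVD mode with singular value `s`, prior error `c` along that mode (the projection of the true-minus-prior parameter vector onto the mode) and noise level `delta`, the mode's mean squared error is the squared bias `(1 - f)^2 c^2` plus the variance `f^2 delta^2 / s^2`, with filter factor `f = s^2 / (s^2 + mu)`; setting its derivative in `mu` to zero gives `mu = delta^2 / c^2`. The values of `s`, `c` and `delta` below are arbitrary illustrative choices.

```python
import numpy as np

def mode_mse(mu, s, c, delta):
    """MSE contribution of one SVD mode: squared bias plus variance."""
    f = s**2 / (s**2 + mu)                   # Tikhonov filter factor
    return (1.0 - f) ** 2 * c**2 + f**2 * delta**2 / s**2

s, c, delta = 0.05, 0.3, 0.01                # hypothetical mode parameters

mu_grid = np.logspace(-8, 2, 2001)           # fine grid in log(mu)
mse_grid = mode_mse(mu_grid, s, c, delta)
mu_numeric = float(mu_grid[int(np.argmin(mse_grid))])
mu_closed = delta**2 / c**2                  # closed-form optimum

print("grid minimiser :", mu_numeric)
print("delta^2 / c^2  :", mu_closed)
```

The optimum does not depend on the singular value of the mode: it is the ratio of noise variance to squared prior error, which is why a good prior supports a larger regularisation strength.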