Setup
Market Context
Local volatility is a calibration tool, not a dynamical model of volatility. It reproduces any vanilla surface by construction, but it makes a specific and empirically wrong prediction: once the spot moves, the smile stays in place (sticky strike). Observed equity markets are closer to sticky delta: the ATM vol moves with the spot, and the smile translates with the forward. More fundamentally, local vol has zero vol-of-vol: instantaneous volatility is a deterministic function of spot and time, with no source of randomness of its own, which contradicts the persistent empirical observation that volatility is stochastic.
Heston (1993) introduced the first analytically tractable stochastic volatility model. Volatility is driven by a separate mean-reverting process, and the model admits a semi-closed-form pricing formula via the characteristic function. Heston remains the most widely used stochastic vol model in equity and FX derivatives: it is the benchmark for calibration comparisons, the natural starting point for extensions (Bates jumps, rough vol), and the pedagogical foundation for understanding how stochastic vol generates the smile.
Financial insight. On an equity derivatives desk, Heston is calibrated to the vanilla surface daily. It is used directly to price vanilla exotics where the smile dynamics matter more than perfect vanilla fit, and as the stochastic vol component in LSV models where it is paired with a Dupire leverage function for exact vanilla calibration. Understanding what the five parameters control — and what they cannot control — is essential knowledge for every equity quant.
Assumptions
- The underlying $S_t$ and its instantaneous variance $v_t$ follow the Heston SDE system under the risk-neutral measure $\mathbb{Q}$.
- $v_t$ follows a CIR (Cox-Ingersoll-Ross) process: mean-reverting, non-negative (strictly positive under the Feller condition), driven by a Brownian motion $W^v$ correlated with the spot Brownian motion $W^S$.
- Correlation between spot and vol Brownian motions is $\rho \in [-1, 1]$. For equities, $\rho < 0$ (leverage effect: vol spikes when spot falls).
- Interest rate $r$ and dividend yield $q$ are deterministic. The model is single-factor in rates.
- No jumps. Heston is a pure diffusion model; adding jumps (Bates 1996) is an extension.
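- The market price of variance risk is proportional to $\sqrt{v_t}$, so the variance risk premium is linear in $v_t$: $\lambda(S_t, v_t, t) = \lambda v_t$, as in Heston (1993). Under this specification, the risk-neutral dynamics preserve the affine CIR structure.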
Theory
1. The Heston SDE System
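Under the risk-neutral measure $\mathbb{Q}$:
Definition 4.1 (Heston Model). The Heston (1993) model specifies the joint dynamics of spot $S_t$ and variance $v_t$ as:

$$
\begin{aligned}
dS_t &= (r - q)\,S_t\,dt + \sqrt{v_t}\,S_t\,dW_t^S, \\
dv_t &= \kappa(\theta - v_t)\,dt + \xi\sqrt{v_t}\,dW_t^v, \\
d\langle W^S, W^v\rangle_t &= \rho\,dt.
\end{aligned}
$$

Parameters:
- $\kappa > 0$: speed of mean reversion of variance (units: yr$^{-1}$)
- $\theta > 0$: long-run mean of variance ($\theta = \sigma_\infty^2$ where $\sigma_\infty$ is long-run vol)
- $\xi > 0$: vol of vol, the coefficient scaling the diffusion term of $v_t$ (units: yr$^{-1}$)
- $\rho \in [-1, 1]$: correlation between spot returns and variance increments
- $v_0 > 0$: initial variance (often written $\sigma_0^2$)

The log-price $x_t = \ln S_t$ satisfies:

$$
dx_t = \left(r - q - \tfrac{1}{2} v_t\right) dt + \sqrt{v_t}\,dW_t^S.
$$

The variance process $v_t$ is a CIR process, the same process used to model short rates in Cox-Ingersoll-Ross (1985). Key properties: $v_t > 0$ almost surely (under the Feller condition); the process is mean-reverting to $\theta$ at rate $\kappa$; the diffusion scales as $\sqrt{v_t}$, so the noise vanishes as $v_t \to 0$ and variance cannot go negative.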
2. The Feller Condition
Theorem 4.1 (Feller Condition). The CIR process $v_t$ is strictly positive ($v_t > 0$ a.s.) for all $t > 0$ if and only if:

$$
2\kappa\theta \ge \xi^2.
$$

When $2\kappa\theta < \xi^2$, the process hits zero with positive probability. The origin is then instantaneously reflecting for the continuous-time process, and any discretisation scheme must handle the boundary explicitly.
Interpretation. The Feller condition compares the drift pulling $v_t$ away from zero (proportional to $\kappa\theta$) against the diffusion pushing it toward zero (proportional to $\xi^2$). When $\kappa\theta$ is large relative to $\xi^2$, the process stays safely positive. In practice, calibrated Heston parameters often violate the Feller condition, particularly for short maturities with strong smiles where a large $\xi$ is required. Monte Carlo implementations must handle the boundary at zero numerically (full-truncation Euler or Andersen QE scheme).
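The comparison between $2\kappa\theta$ and $\xi^2$ is easy to automate as a post-calibration check. The snippet below is a minimal sketch: the function name `feller_ratio`, the symbol `xi` for vol of vol, and the parameter values are illustrative assumptions, not calibrated figures.

```python
def feller_ratio(kappa: float, theta: float, xi: float) -> float:
    """Return 2*kappa*theta / xi^2; a value >= 1 means the Feller condition holds."""
    if xi <= 0.0:
        raise ValueError("xi (vol of vol) must be positive")
    return 2.0 * kappa * theta / (xi * xi)


# Illustrative parameters (assumed for this example only).
ratio = feller_ratio(kappa=2.0, theta=0.04, xi=0.5)
print(f"Feller ratio = {ratio:.2f}, condition satisfied: {ratio >= 1.0}")
```

Reporting the ratio rather than a boolean makes it easier to see how far a calibrated parameter set sits from the boundary.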
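Remark. In Monte Carlo simulation under Heston, the naive Euler scheme for $v_t$ can produce negative values. The full-truncation Euler scheme lets the discretised state $v$ go negative but replaces it with $v^+ = \max(v, 0)$ in both the drift and the diffusion terms. Using the truncated value only in the diffusion coefficient, while leaving the drift as $\kappa(\theta - v)$, is the partial-truncation variant. Full truncation is a standard industry scheme: it keeps the square root well defined and has low discretisation bias, although it only approximates the boundary behaviour at zero.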
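A minimal sketch of a full-truncation Euler simulation is given below, using only the Python standard library. It assumes the truncated variance `v_plus = max(v, 0)` enters both the drift and the diffusion of the variance update, and that the log-spot is driven by the same truncated value. The function name, the symbol `xi` for vol of vol, and all parameter values are illustrative assumptions.

```python
import math
import random


def simulate_heston_full_truncation(S0, v0, r, q, kappa, theta, xi, rho,
                                    T, n_steps, n_paths, seed=0):
    """Return lists of terminal spots S_T and truncated terminal variances."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    rho_bar = math.sqrt(1.0 - rho * rho)
    spots, variances = [], []
    for _ in range(n_paths):
        x, v = math.log(S0), v0
        for _ in range(n_steps):
            v_plus = max(v, 0.0)  # truncated variance, used in drift AND diffusion
            z_v = rng.gauss(0.0, 1.0)
            z_s = rho * z_v + rho_bar * rng.gauss(0.0, 1.0)  # corr(z_s, z_v) = rho
            x += (r - q - 0.5 * v_plus) * dt + math.sqrt(v_plus) * sqrt_dt * z_s
            v += kappa * (theta - v_plus) * dt + xi * math.sqrt(v_plus) * sqrt_dt * z_v
        spots.append(math.exp(x))
        variances.append(max(v, 0.0))
    return spots, variances


# Illustrative parameters (assumed); Feller is violated since 2*kappa*theta < xi^2.
spots, variances = simulate_heston_full_truncation(
    S0=100.0, v0=0.04, r=0.02, q=0.0, kappa=2.0, theta=0.04, xi=0.5, rho=-0.7,
    T=1.0, n_steps=50, n_paths=3000, seed=42)
mean_spot = sum(spots) / len(spots)
mean_var = sum(variances) / len(variances)
print(f"mean S_T = {mean_spot:.2f}  (forward = {100.0 * math.exp(0.02):.2f})")
print(f"mean v_T = {mean_var:.4f}  (theta = 0.04)")
```

With a pure-Python loop the path count is kept small, so the sample means only agree with the forward and with the long-run variance up to Monte Carlo noise. A production implementation would vectorise the loop and would likely prefer the QE scheme when the Feller condition is badly violated.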
3. Financial Interpretation of Parameters
Each parameter controls a specific feature of the implied vol surface:
| Parameter | Controls | Effect on smile |
|---|---|---|
| $v_0$ | Short-dated ATM vol | Level of short-end ATM implied vol |
| $\theta$ | Long-run vol | ATM vol at long maturities (term structure level) |
| $\kappa$ | Mean-reversion speed | How quickly ATM vol term structure decays from $\sqrt{v_0}$ to $\sqrt{\theta}$; also affects smile curvature |
| $\xi$ | Vol of vol | Curvature of the smile (butterfly / kurtosis) |
| $\rho$ | Spot-vol correlation | Skew of the smile (25d risk reversal); $\rho < 0$ gives a left-skewed smile |
Financial insight. A practitioner's first sanity check after calibration: (1) Does $\sqrt{\theta}$ correspond to the long-run ATM vol observed in the vol surface? (2) Is $\rho$ negative and in a plausible range for equity indices? (3) Is $\xi$ large enough to fit the curvature but not so large that the Feller condition is violated by a factor greater than 4 (that is, $\xi^2 > 8\kappa\theta$)? If any of these checks fails, the calibration has likely found a degenerate solution.
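The term-structure roles of the initial variance, the long-run variance and the mean-reversion speed can be illustrated with the expected variance of a CIR process, $\mathbb{E}[v_t] = \theta + (v_0 - \theta)e^{-\kappa t}$. The sketch below uses the square root of the time-averaged expected variance as a rough proxy for the ATM vol term structure; that proxy ignores convexity and smile effects, and the function names and parameter values are illustrative assumptions.

```python
import math


def expected_variance(t, v0, kappa, theta):
    """E[v_t] for a CIR variance process started at v0."""
    return theta + (v0 - theta) * math.exp(-kappa * t)


def expected_average_vol(T, v0, kappa, theta):
    """Square root of the time-averaged expected variance over [0, T]."""
    total = theta * T + (v0 - theta) * (1.0 - math.exp(-kappa * T)) / kappa
    return math.sqrt(total / T)


# Illustrative case: initial vol 30%, long-run vol 20%, mean-reversion speed 2 per year.
for T in (0.1, 0.5, 1.0, 5.0):
    vol = expected_average_vol(T, v0=0.09, kappa=2.0, theta=0.04)
    print(f"T = {T:>4}: proxy ATM vol = {vol:.4f}")
```

The proxy starts near the square root of the initial variance at short maturities and drifts toward the square root of the long-run variance, faster when the mean-reversion speed is larger.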
4. The Characteristic Function
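The key analytical result is that the Heston model is affine: the log-price characteristic function has an exponential-affine form in the state $(x_t, v_t)$.
Theorem 4.2 (Heston Characteristic Function, Albrecher et al. 2007, stable form). The characteristic function of $x_T = \ln S_T$ under $\mathbb{Q}$ is:

$$
\varphi(u) = \mathbb{E}^{\mathbb{Q}}\!\left[e^{iu x_T}\right] = \exp\!\big(iu\,x_0 + C(T,u) + D(T,u)\,v_0\big)
$$

where, defining $d = \sqrt{(\kappa - \rho\xi iu)^2 + \xi^2(iu + u^2)}$ and $g = \dfrac{\kappa - \rho\xi iu - d}{\kappa - \rho\xi iu + d}$:

$$
\begin{aligned}
C(T,u) &= iu(r - q)T + \frac{\kappa\theta}{\xi^2}\left[(\kappa - \rho\xi iu - d)\,T - 2\ln\frac{1 - g e^{-dT}}{1 - g}\right], \\
D(T,u) &= \frac{\kappa - \rho\xi iu - d}{\xi^2}\cdot\frac{1 - e^{-dT}}{1 - g e^{-dT}}.
\end{aligned}
$$

Remark: Two branches of the characteristic function. Heston's original (1993) paper used a formulation involving $e^{+dT}$, which has a branch-cut discontinuity causing the "rotation" problem: the imaginary part of the complex logarithm jumps by $2\pi$ for large $T$ or large $u$. The Albrecher et al. (2007) formulation above avoids this by expressing $\varphi$ in terms of $e^{-dT}$, for which the logarithm stays on its principal branch for the parameter regimes encountered in calibration. Always use the Albrecher form in production code.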
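The stable form of the characteristic function translates almost line by line into code. The sketch below is one possible transcription using only `cmath` and `math`. It assumes the standard Albrecher et al. expressions with vol of vol written `xi`, includes the drift term $iu(r-q)T$ in `C`, and uses illustrative parameter values. Two quick checks are built in: the value at $u = 0$ should be 1, and the value at $u = -i$ should equal the forward $S_0 e^{(r-q)T}$.

```python
import cmath
import math


def heston_log_price_cf(u, S0, T, r, q, v0, kappa, theta, xi, rho):
    """E[exp(i*u*ln S_T)] in the Albrecher et al. (2007) form; u may be complex."""
    iu = 1j * u
    b = kappa - rho * xi * iu
    d = cmath.sqrt(b * b + xi * xi * (iu + u * u))  # principal root, Re(d) >= 0
    g = (b - d) / (b + d)
    decay = cmath.exp(-d * T)  # only decaying exponentials appear
    C = iu * (r - q) * T + (kappa * theta / xi**2) * (
        (b - d) * T - 2.0 * cmath.log((1.0 - g * decay) / (1.0 - g)))
    D = ((b - d) / xi**2) * (1.0 - decay) / (1.0 - g * decay)
    return cmath.exp(iu * math.log(S0) + C + D * v0)


# Illustrative parameters (assumed for this example only).
params = dict(S0=100.0, T=1.0, r=0.03, q=0.01, v0=0.04,
              kappa=2.0, theta=0.04, xi=0.5, rho=-0.7)
cf_at_zero = heston_log_price_cf(0.0, **params)      # normalisation check
cf_at_minus_i = heston_log_price_cf(-1j, **params)   # martingale check
print(cf_at_zero, cf_at_minus_i, 100.0 * math.exp(0.02))
```

The martingale check at $u = -i$ is a cheap way to catch sign errors in `d` or `g` before the function is wired into a pricer.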
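Given the characteristic function, European option prices follow via the Lewis (2001) formula (covered in the Fourier & FFT Pricing course):

$$
\mathrm{Call}(K,T) = e^{-rT}\left[F - \frac{\sqrt{FK}}{\pi}\int_0^\infty \operatorname{Re}\!\left(e^{-iuk}\,\tilde\varphi\!\left(u - \tfrac{i}{2}\right)\right)\frac{du}{u^2 + \tfrac{1}{4}}\right]
$$

where $F = S_0 e^{(r-q)T}$ is the forward, $k = \ln(K/F)$ is the log-moneyness, and $\tilde\varphi(u) = e^{-iu\ln F}\varphi(u)$ is the characteristic function of $\ln(S_T/F)$. This integral is evaluated via numerical quadrature (Gauss-Laguerre or adaptive Simpson) for a single strike-maturity pair, or via FFT for a full grid of strikes at fixed maturity.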
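A minimal sketch of the single-integral Lewis pricer follows. It assumes the Lewis formula written on the forward, with the characteristic function of $\ln(S_T/F)$ evaluated at $u - i/2$, and it uses a fixed-grid composite Simpson rule in place of the Gauss-Laguerre or adaptive quadrature mentioned above. Function names, the truncation point `u_max`, the grid size `n`, and the parameter values are illustrative assumptions.

```python
import cmath
import math


def heston_cf_forward(u, T, v0, kappa, theta, xi, rho):
    """CF of ln(S_T / F), Albrecher et al. (2007) form; u may be complex."""
    iu = 1j * u
    b = kappa - rho * xi * iu
    d = cmath.sqrt(b * b + xi * xi * (iu + u * u))
    g = (b - d) / (b + d)
    decay = cmath.exp(-d * T)
    C = (kappa * theta / xi**2) * (
        (b - d) * T - 2.0 * cmath.log((1.0 - g * decay) / (1.0 - g)))
    D = ((b - d) / xi**2) * (1.0 - decay) / (1.0 - g * decay)
    return cmath.exp(C + D * v0)


def heston_call_lewis(S0, K, T, r, q, v0, kappa, theta, xi, rho,
                      u_max=200.0, n=4000):
    """European call via the Lewis integral; n must be even (Simpson rule)."""
    F = S0 * math.exp((r - q) * T)
    k = math.log(K / F)

    def integrand(u):
        phi = heston_cf_forward(u - 0.5j, T, v0, kappa, theta, xi, rho)
        return (cmath.exp(-1j * u * k) * phi).real / (u * u + 0.25)

    h = u_max / n
    total = integrand(0.0) + integrand(u_max)
    for j in range(1, n):
        total += (4.0 if j % 2 else 2.0) * integrand(j * h)
    integral = total * h / 3.0
    return math.exp(-r * T) * (F - math.sqrt(F * K) / math.pi * integral)


# Illustrative parameters (assumed for this example only).
price = heston_call_lewis(S0=100.0, K=100.0, T=1.0, r=0.02, q=0.0,
                          v0=0.04, kappa=2.0, theta=0.04, xi=0.5, rho=-0.7)
print(f"Heston call price: {price:.4f}")
```

A useful sanity check is the small vol-of-vol limit: with `xi` close to zero, zero correlation and initial variance equal to the long-run variance, the price should approach the Black-Scholes value at the corresponding constant vol.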
5. The Smile Generated by Heston
The Heston model generates a non-flat implied vol surface with both skew and curvature. Understanding the smile shape qualitatively is as important as computing it numerically.
Short-maturity behaviour. As $T \to 0$, the smile is driven by $v_0$: the ATM implied vol converges to $\sqrt{v_0}$, and the ATM skew stays bounded rather than steepening as market skews do, because the vol of vol $\xi$ and correlation $\rho$ need time to inflate the wings. Short-dated skew is therefore limited in Heston, a known failure mode for index options at short maturities.
Long-maturity behaviour. As $T \to \infty$, the variance mean-reverts to $\theta$ and the smile becomes approximately flat at $\sqrt{\theta}$. The skew and butterfly flatten at a rate determined by $\kappa$: faster mean reversion → faster smile decay.
Skew. Correlation $\rho$ drives the skew: when $\rho < 0$, spot-down moves are accompanied by variance-up moves, inflating the left wing and suppressing the right. The risk reversal at 25 delta increases (in absolute value) as $|\rho|$ increases.
Curvature. Vol of vol $\xi$ drives the curvature (butterfly): large $\xi$ inflates both wings symmetrically. The butterfly spread increases with $\xi$.
Smile dynamics. Unlike local vol, Heston generates sticky delta behaviour to leading order: implied vol depends on spot only through the moneyness $K/S_t$ and on the separate state variable $v_t$, so the smile moves with the spot. This is more consistent with observed market behaviour. Heston is still imperfect (empirically, its forward smile decays too fast), but it is a substantial improvement over local vol for forward-starting products.
Remark: Skew-butterfly decomposition. In Heston, skew and curvature are not independently controlled: $\rho$ affects both the skew and (through its interaction with $\xi$) the curvature. This limited parameterisation is a structural constraint of the two-factor affine architecture. Models with more free parameters (SABR, rough vol) achieve better decoupling.
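The qualitative statements about skew and curvature can be checked numerically by pricing a few strikes and inverting to Black implied vols. The sketch below is self-contained and makes several assumptions: zero rates and dividends (so spot equals forward), a Lewis-type integral evaluated with a fixed-grid Simpson rule, bisection for the implied vol, and illustrative parameter values. All function names are hypothetical.

```python
import cmath
import math


def heston_cf_forward(u, T, v0, kappa, theta, xi, rho):
    """CF of ln(S_T / F), Albrecher et al. (2007) form; u may be complex."""
    iu = 1j * u
    b = kappa - rho * xi * iu
    d = cmath.sqrt(b * b + xi * xi * (iu + u * u))
    g = (b - d) / (b + d)
    decay = cmath.exp(-d * T)
    C = (kappa * theta / xi**2) * (
        (b - d) * T - 2.0 * cmath.log((1.0 - g * decay) / (1.0 - g)))
    D = ((b - d) / xi**2) * (1.0 - decay) / (1.0 - g * decay)
    return cmath.exp(C + D * v0)


def heston_call(F, K, T, v0, kappa, theta, xi, rho, u_max=200.0, n=4000):
    """Undiscounted call on the forward via the Lewis integral (Simpson rule)."""
    k = math.log(K / F)

    def f(u):
        phi = heston_cf_forward(u - 0.5j, T, v0, kappa, theta, xi, rho)
        return (cmath.exp(-1j * u * k) * phi).real / (u * u + 0.25)

    h = u_max / n
    s = f(0.0) + f(u_max) + sum((4.0 if j % 2 else 2.0) * f(j * h)
                                for j in range(1, n))
    return F - math.sqrt(F * K) / math.pi * s * h / 3.0


def black_call(F, K, T, vol):
    """Undiscounted Black call on the forward."""
    sd = vol * math.sqrt(T)
    d1 = math.log(F / K) / sd + 0.5 * sd
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return F * cdf(d1) - K * cdf(d1 - sd)


def implied_vol(price, F, K, T, lo=1e-4, hi=3.0):
    """Bisection on the Black formula, which is increasing in vol."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if black_call(F, K, T, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def smile(rho, strikes=(80.0, 100.0, 120.0), F=100.0, T=0.5):
    """Implied vols at the given strikes for illustrative Heston parameters."""
    return [implied_vol(heston_call(F, K, T, 0.04, 2.0, 0.04, 0.5, rho), F, K, T)
            for K in strikes]


skewed = smile(rho=-0.7)
symmetric = smile(rho=0.0)
print("rho = -0.7:", [round(v, 4) for v in skewed])
print("rho =  0.0:", [round(v, 4) for v in symmetric])
```

With negative correlation the low-strike vol should sit above the ATM and high-strike vols, while with zero correlation both wings should sit above the ATM vol, which is the curvature attributed to vol of vol above.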
6. Calibration
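Heston is calibrated by minimising the weighted squared error between model and market implied vols:

$$
\min_{\Theta}\ \sum_{i=1}^{N} w_i\left(\sigma^{\text{model}}(K_i, T_i; \Theta) - \sigma^{\text{mkt}}(K_i, T_i)\right)^2, \qquad \Theta = (v_0, \kappa, \theta, \xi, \rho),
$$

subject to: $v_0 > 0$, $\kappa > 0$, $\theta > 0$, $\xi > 0$, $\rho \in (-1, 1)$.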
Warning: Non-convexity. The Heston calibration objective is non-convex with multiple local minima. Gradient-based solvers (L-BFGS-B, trust-region) are sensitive to initialisation. Standard practice: (1) initialise with a moment-matching estimate; (2) run multiple restarts from randomly perturbed initial conditions; (3) apply soft penalisation of the Feller condition to avoid degenerate solutions.
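The objective and the soft Feller penalty can be sketched as a single function that an optimiser would call repeatedly. The example below is an assumption-laden illustration: the quadratic form of the penalty, the name `calibration_loss`, the quote layout, and the `toy_model_vol` stand-in (which replaces a real Heston implied-vol routine) are all hypothetical.

```python
from typing import Callable, Sequence, Tuple

Quote = Tuple[float, float, float, float]  # (strike, maturity, market_vol, weight)


def calibration_loss(params: Sequence[float],
                     quotes: Sequence[Quote],
                     model_vol: Callable[[float, float, Sequence[float]], float],
                     feller_weight: float = 0.0) -> float:
    """Weighted squared vol error plus a soft penalty on Feller violation."""
    v0, kappa, theta, xi, rho = params
    fit = sum(w * (model_vol(K, T, params) - mkt) ** 2 for K, T, mkt, w in quotes)
    feller_gap = max(0.0, xi * xi - 2.0 * kappa * theta)  # zero when Feller holds
    return fit + feller_weight * feller_gap ** 2


def toy_model_vol(K, T, params):
    """Hypothetical stand-in for a Heston implied-vol routine: flat at sqrt(theta)."""
    return params[2] ** 0.5


# Illustrative quotes and parameters (assumed for this example only).
quotes = [(90.0, 1.0, 0.22, 1.0), (100.0, 1.0, 0.20, 2.0), (110.0, 1.0, 0.19, 1.0)]
params = (0.04, 2.0, 0.04, 0.5, -0.7)  # (v0, kappa, theta, xi, rho)
unpenalised = calibration_loss(params, quotes, toy_model_vol)
penalised = calibration_loss(params, quotes, toy_model_vol, feller_weight=10.0)
print(unpenalised, penalised)
```

Keeping the penalty separate from the fit term makes it simple to run the restarts with different penalty weights and to report the raw fit error afterwards.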
A practical calibration sequence:
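- Estimate $v_0$ from the short-dated ATM implied vol: $v_0 \approx \sigma_{\text{ATM}}(T_{\text{short}})^2$.
- Estimate $\theta$ from the long-dated ATM implied vol: $\theta \approx \sigma_{\text{ATM}}(T_{\text{long}})^2$.
- Estimate $\rho$ from the short-dated ATM skew, using the small-$T$ relation $\partial\sigma_{\text{imp}}/\partial k\,|_{k=0} \approx \rho\xi / (4\sqrt{v_0})$ with $k$ the log-moneyness and a starting guess for $\xi$ (rough approximation).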
- Run the full optimisation with these as initial parameters.
Remark: Weights. Standard weighting schemes: (a) equal weights; (b) vega weighting — up-weights near-ATM options which are most liquid; (c) inverse bid-offer weighting — downweights illiquid far-wing quotes. Vega weighting is the most common in practice.
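Vega weighting can be sketched with the Black vega of each quoted option, normalised so the weights sum to one. The snippet below assumes options quoted on a common forward with no discounting, and the function names and quotes are illustrative.

```python
import math


def black_vega(F, K, T, vol):
    """Undiscounted Black vega of a European option on forward F."""
    sd = vol * math.sqrt(T)
    d1 = math.log(F / K) / sd + 0.5 * sd
    pdf = math.exp(-0.5 * d1 * d1) / math.sqrt(2.0 * math.pi)
    return F * pdf * math.sqrt(T)


def vega_weights(F, quotes):
    """quotes: list of (strike, maturity, market_vol); returns normalised weights."""
    vegas = [black_vega(F, K, T, vol) for K, T, vol in quotes]
    total = sum(vegas)
    return [v / total for v in vegas]


# Illustrative one-year quotes (assumed for this example only).
quotes = [(70.0, 1.0, 0.28), (100.0, 1.0, 0.20), (130.0, 1.0, 0.18)]
weights = vega_weights(100.0, quotes)
print([round(w, 3) for w in weights])
```

Because vega peaks near the money, the ATM quote receives the largest weight and far-wing quotes are down-weighted automatically.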
Validation
The companion notebook verifies the following:
- CIR path simulation: simulate $v_t$ under Heston and verify empirical mean $\approx \theta$ and variance $\approx \xi^2\theta/(2\kappa)$ (the CIR stationary variance).
- Characteristic function moment check: verify $\mathbb{E}[S_T] = S_0 e^{(r-q)T}$ from the characteristic function at $u = -i$.
- MC vs analytic pricing: compare Heston Monte Carlo call prices to Lewis formula prices at several strikes and maturities.
- Parameter sensitivity: display how the implied vol smile changes as each parameter ($\kappa$, $\theta$, $\xi$, $\rho$) varies, with the others fixed.
Before opening the notebook. Consider the Heston model with , , (), , , , , .
(a) Is the Feller condition satisfied? Compute $2\kappa\theta$ and $\xi^2$.
(b) What is the stationary variance of ? What is the stationary variance of ?
(c) Qualitatively, does this parametrisation generate a left-skewed or right-skewed smile? Does it generate positive or negative curvature?
Limitations
Short-dated smile calibration. Heston cannot fit steep short-dated smiles (maturities under 1 month for equity indices) because a diffusive variance process needs time to generate skew. The short-dated ATM skew in Heston stays bounded as $T \to 0$, while observed equity skews steepen roughly as a power law in $1/T$. Rough volatility models (rough Bergomi, rough Heston) were developed specifically to address this.
Feller violation. Calibrated Heston parameters frequently violate the Feller condition $2\kappa\theta \ge \xi^2$. When the condition is violated, $v_t$ hits zero with positive probability. Numerical schemes that do not handle this correctly (e.g., reflection schemes) can introduce systematic pricing bias. The Andersen (2008) QE scheme is the gold standard for zero-touching variance processes; full-truncation Euler is the standard approximation.
Forward smile decay. Heston's forward smile flattens at a rate set by $\kappa$. For large mean-reversion speeds, the forward smile at 1 year is already significantly flatter than the spot smile. This underestimates the premium of cliquets and forward-starting options relative to the market. Bergomi (2005) and subsequent models were designed to control the forward smile decay directly.
Correlation instability. For extreme values ($|\rho| \to 1$), the Heston model can exhibit numerical instability in the characteristic function evaluation: the argument of the complex square root can have a small modulus, causing branch-cut issues even in the Albrecher formulation. In calibration, constraining $\rho$ away from $\pm 1$ for equities is advisable.
Appropriate use cases:
- Vanilla exotics where the smile dynamics (forward vol, skew persistence) matter: cliquets, forward-starting options, timer options.
- Pricing barriers and digitals under stochastic vol to capture the smile impact on digital risk.
- As the stochastic vol component in LSV models (leveraged Heston).
- Research baseline for stochastic vol model comparisons.
Inappropriate use cases (where extensions are needed):
- Very short-dated smile fitting (< 1 month): use rough vol or jump models.
- Precise long-dated skew fitting (> 5 years): use multi-factor models (Bergomi, ZABR).
- Variance swap vol-of-vol: Heston over-predicts vol-of-vol for long maturities.
Interview Angle
L1 — Junior Quant / Quant Developer.
- "Write down the Heston model SDEs. What does each parameter control?" Expected: the two SDEs for $S_t$ and $v_t$, plus the correlation $d\langle W^S, W^v\rangle_t = \rho\,dt$. For each of the five parameters, the candidate should identify which surface feature it controls (see the table in §3). A common mistake: confusing $\xi$ (vol of vol, in yr$^{-1}$) with a vol level.
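- "What is the Feller condition and why does it matter for Monte Carlo?" Expected: $2\kappa\theta \ge \xi^2$ ensures $v_t > 0$ a.s. For MC: a naive Euler scheme can produce negative variance values, and does so more often when the condition is violated; the full-truncation scheme uses $v^+ = \max(v, 0)$ in the drift and diffusion coefficients so that the square root stays well defined.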
- "Why is Heston preferred over local vol for cliquets?" Expected: Heston has stochastic vol (positive vol-of-vol) which generates richer forward smile dynamics. Local vol's forward smile collapses; Heston's does not collapse as fast. The forward smile in Heston is controlled by $\kappa$ and $\xi$.
L2 — Senior Quant / Structurer.
- "How is the Heston call price computed analytically, and what is the role of the characteristic function?" Expected: the Lewis (2001) formula expresses the call price as a single Fourier integral of the characteristic function. The characteristic function is available in closed form (Theorem 4.2) because the Heston model is affine. The key exponential-affine structure follows from the Riccati ODEs for $C$ and $D$, which are solvable in closed form for affine models. The candidate should mention the branch-cut issue in the original Heston (1993) formulation and the Albrecher et al. (2007) fix.
- "How does the Heston smile decay with maturity, and how does this differ from what markets show?" Expected: in Heston, the ATM skew is bounded at short maturities and decays approximately as $1/T$ at long maturities; it goes to zero as $T \to \infty$ at a rate controlled by $\kappa$. Observed equity index smiles often show a slower decay. Rough vol models (rough Bergomi, RFSV, rough Heston) show skew decaying as $T^{H - 1/2}$ where $H \in (0, 1/2)$ is the Hurst exponent, which better matches data.
- "If calibrated Heston gives , , , , , is the Feller condition satisfied? What are the implications?" Expected: , . Feller is violated ($2\kappa\theta < \xi^2$). $v_t$ hits zero with positive probability. For pricing: use full-truncation Euler or the Andersen QE scheme. For calibration: consider adding a Feller penalty to the objective; or accept the violation and trust that the pricing bias is small if variance stays away from zero in practice.
L3 — Quant Researcher.
- "Derive the Riccati ODEs that the functions $C$ and $D$ in the Heston characteristic function satisfy. What property of the Heston model makes them solvable in closed form?" Expected: the Heston model is affine: the drift and squared diffusion coefficients of $(x_t, v_t)$ are affine functions of the state $v_t$. By the Duffie-Pan-Singleton (2000) theorem, the characteristic function of an affine process is exponential-affine in the state. Substituting $\varphi = \exp\big(iux + C(\tau, u) + D(\tau, u)\,v\big)$, with $\tau = T - t$, into the Feynman-Kac PDE for the characteristic function and matching coefficients yields:
  - $\partial_\tau D = \tfrac{1}{2}\xi^2 D^2 - (\kappa - \rho\xi iu)\,D - \tfrac{1}{2}(u^2 + iu)$, with $D(0, u) = 0$ (Riccati ODE for $D$)
  - $\partial_\tau C = \kappa\theta\,D + iu(r - q)$, with $C(0, u) = 0$ (linear ODE for $C$, given $D$)

  The Riccati ODE for $D$ is solvable in closed form because its coefficients are constant (independent of $\tau$), giving the explicit formula in Theorem 4.2.
- "How does the Gyöngy theorem connect Heston to Dupire local vol? What does the Heston local vol surface look like?" Expected: by Gyöngy, the Dupire local vol of the Heston model is $\sigma_{\text{loc}}^2(K, T) = \mathbb{E}^{\mathbb{Q}}[v_T \mid S_T = K]$. This conditional expectation can be computed numerically. For $\rho < 0$, the Heston local vol surface is steeper (more skewed) than the Heston implied vol surface, consistent with the half-skew result from Dupire. In an LSV model, the ratio $L^2(K, T) = \sigma_{\text{Dupire}}^2(K, T) / \mathbb{E}^{\mathbb{Q}}[v_T \mid S_T = K]$ defines the leverage function $L$ that adjusts the stochastic vol to match the market surface.
- "What is the variance term structure in Heston, and how is it related to the VIX?" Expected: the expected total variance from 0 to $T$ under Heston is $\int_0^T \mathbb{E}[v_t]\,dt = \theta T + (v_0 - \theta)\,\dfrac{1 - e^{-\kappa T}}{\kappa}$. The instantaneous forward variance is $\mathbb{E}[v_T] = \theta + (v_0 - \theta)e^{-\kappa T}$. The VIX is a model-free measure of 30-day expected variance; calibrating Heston to the VIX requires the model's annualised 30-day total variance to match the squared VIX. Multi-factor variance models (Bergomi, 2-factor CIR) allow independent control of the short-end and long-end variance term structure, which single-factor Heston cannot achieve.