Finite Difference Schemes and Convergence Analysis

Hard·25 min read
Numerical MethodsFinite DifferencesBlack-Scholes PDEConvergence Analysis

Setup

The Black-Scholes PDE as a Parabolic Problem

The Black-Scholes PDE for a European option price C(t,S)C(t, S) is:

Ct+12σ2S22CS2+rSCSrC=0,(t,S)[0,T)×(0,).\frac{\partial C}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 C}{\partial S^2} + rS \frac{\partial C}{\partial S} - rC = 0, \qquad (t, S) \in [0, T) \times (0, \infty).

This is a parabolic PDE (heat equation type) in the variables tt and SS. It is solved backwards in time from the terminal condition C(T,S)=g(S)C(T, S) = g(S) (the payoff function).

Log-Spot Transformation

The variable coefficient σ2S2\sigma^2 S^2 in the second-order term introduces numerical difficulty. Setting x=lnSx = \ln S and reversing time τ=Tt\tau = T - t (time to expiry), the PDE becomes:

Cτ=12σ22Cx2+(r12σ2)CxrC,\frac{\partial C}{\partial \tau} = \frac{1}{2}\sigma^2 \frac{\partial^2 C}{\partial x^2} + \left(r - \frac{1}{2}\sigma^2\right)\frac{\partial C}{\partial x} - rC,

with constant coefficients. This is the canonical form used in finite difference implementations. It reduces to a heat equation modulo first-order and zero-order terms.

Grid Setup

Discretise on a uniform grid in xx and τ\tau:

  • xi=xmin+iΔxx_i = x_{\min} + i \Delta x, for i=0,1,,Mi = 0, 1, \ldots, M.
  • τj=jΔτ\tau_j = j \Delta\tau, for j=0,1,,Nj = 0, 1, \ldots, N; so τN=T\tau_N = T.

Let CijC(xi,τj)C_i^j \approx C(x_i, \tau_j). Boundary conditions:

  • At xminx_{\min} (far-left, S0S \approx 0): Cij=0C_i^j = 0 for a call.
  • At xmaxx_{\max} (far-right, SS \to \infty): Cij=exiKerτjC_i^j = e^{x_i} - K e^{-r\tau_j} for a call (forward price minus discounted strike).

Initial condition (τ=0\tau = 0): Ci0=g(exi)C_i^0 = g(e^{x_i}) (the payoff at expiry).


Explicit Scheme (FTCS)

Discretisation

Forward in time, central in space (FTCS — Forward Time, Centred Space):

Cij+1CijΔτ=σ22Ci+1j2Cij+Ci1jΔx2+(rσ22)Ci+1jCi1j2ΔxrCij.\frac{C_i^{j+1} - C_i^j}{\Delta\tau} = \frac{\sigma^2}{2} \frac{C_{i+1}^j - 2C_i^j + C_{i-1}^j}{\Delta x^2} + \left(r - \frac{\sigma^2}{2}\right)\frac{C_{i+1}^j - C_{i-1}^j}{2\Delta x} - rC_i^j.

Define λ=σ2Δτ2Δx2\lambda = \frac{\sigma^2 \Delta\tau}{2\Delta x^2}, ν=(rσ2/2)Δτ2Δx\nu = \frac{(r - \sigma^2/2)\Delta\tau}{2\Delta x}. Rearranging:

Cij+1=(λν)Ci1j+(12λrΔτ)Cij+(λ+ν)Ci+1j.C_i^{j+1} = (\lambda - \nu) C_{i-1}^j + (1 - 2\lambda - r\Delta\tau) C_i^j + (\lambda + \nu) C_{i+1}^j.

This is explicit: each new value Cij+1C_i^{j+1} is computed directly from the previous time level. No linear system to solve. Cost per time step: O(M)O(M).

Stability: Von Neumann Analysis

Assume Cij=ξjeiαxiC_i^j = \xi^j e^{i\alpha x_i} for some complex amplitude ξ\xi. Substituting into the FTCS stencil (considering only the diffusion term for clarity):

ξ=1+2λ(cos(αΔx)1).\xi = 1 + 2\lambda(\cos(\alpha\Delta x) - 1).

Stability requires ξ1|\xi| \leq 1 for all frequencies α\alpha. Since cos(αΔx)1[2,0]\cos(\alpha\Delta x) - 1 \in [-2, 0]:

ξmax=14λ1    λ12    σ2ΔτΔx21.|\xi|_{\max} = |1 - 4\lambda| \leq 1 \implies \lambda \leq \frac{1}{2} \implies \frac{\sigma^2 \Delta\tau}{\Delta x^2} \leq 1.

This is the CFL (Courant-Friedrichs-Lewy) condition for the parabolic problem. In physical variables: ΔtΔx2/σ2\Delta t \leq \Delta x^2 / \sigma^2. The time step is quadratically constrained by the spatial step — a serious limitation for fine grids.

Accuracy

Truncation error: Taylor-expanding each term shows the FTCS scheme is O(Δτ)+O(Δx2)O(\Delta\tau) + O(\Delta x^2): first-order in time, second-order in space. The overall convergence rate is O(Δτ)+O(Δx2)O(\Delta\tau) + O(\Delta x^2).


Implicit Scheme (BTCS)

Discretisation

Backward in time, central in space (BTCS):

Cij+1CijΔτ=σ22Ci+1j+12Cij+1+Ci1j+1Δx2+(rσ22)Ci+1j+1Ci1j+12ΔxrCij+1.\frac{C_i^{j+1} - C_i^j}{\Delta\tau} = \frac{\sigma^2}{2} \frac{C_{i+1}^{j+1} - 2C_i^{j+1} + C_{i-1}^{j+1}}{\Delta x^2} + \left(r - \frac{\sigma^2}{2}\right)\frac{C_{i+1}^{j+1} - C_{i-1}^{j+1}}{2\Delta x} - rC_i^{j+1}.

Rearranging into standard tridiagonal form:

(λν)Ci1j+1+(1+2λ+rΔτ)Cij+1(λ+ν)Ci+1j+1=Cij.-(\lambda - \nu) C_{i-1}^{j+1} + (1 + 2\lambda + r\Delta\tau) C_i^{j+1} - (\lambda + \nu) C_{i+1}^{j+1} = C_i^j.

This is a tridiagonal linear system of size (M1)×(M1)(M-1)\times(M-1). Solved in O(M)O(M) operations by the Thomas algorithm (forward elimination + back substitution).

Stability

Von Neumann analysis on the BTCS scheme gives:

ξ=11+2λ(1cos(αΔx)).\xi = \frac{1}{1 + 2\lambda(1 - \cos(\alpha\Delta x))}.

Since the denominator is always 1\geq 1, we have ξ1|\xi| \leq 1 for all α\alpha and all λ>0\lambda > 0. The BTCS scheme is unconditionally stable: no constraint between Δτ\Delta\tau and Δx\Delta x. Large time steps are permitted. The accuracy remains O(Δτ)+O(Δx2)O(\Delta\tau) + O(\Delta x^2).


Crank-Nicolson Scheme

Discretisation

The θ\theta-method averages the explicit and implicit discretisations with weights θ\theta and 1θ1-\theta:

Cij+1CijΔτ=θLj+1Ci+(1θ)LjCi,\frac{C_i^{j+1} - C_i^j}{\Delta\tau} = \theta \cdot \mathcal{L}^{j+1} C_i + (1-\theta) \cdot \mathcal{L}^j C_i,

where L\mathcal{L} is the finite difference operator for the spatial terms. Crank-Nicolson sets θ=1/2\theta = 1/2:

λν2Ci1j+1+(1+λ+rΔτ2)Cij+1λ+ν2Ci+1j+1=λν2Ci1j+(1λrΔτ2)Cij+λ+ν2Ci+1j.-\frac{\lambda - \nu}{2} C_{i-1}^{j+1} + \left(1 + \lambda + \frac{r\Delta\tau}{2}\right) C_i^{j+1} - \frac{\lambda + \nu}{2} C_{i+1}^{j+1} = \frac{\lambda - \nu}{2} C_{i-1}^j + \left(1 - \lambda - \frac{r\Delta\tau}{2}\right) C_i^j + \frac{\lambda + \nu}{2} C_{i+1}^j.

The left side is a tridiagonal system. The right side is the explicit contribution from the current time level.

Accuracy and Stability

Accuracy: Crank-Nicolson is O(Δτ2)+O(Δx2)O(\Delta\tau^2) + O(\Delta x^2): second-order in both time and space. This is the critical advantage over explicit and implicit schemes, which are only first-order in time.

Stability: Von Neumann analysis gives:

ξ2=12λsin2(αΔx/2)1+2λsin2(αΔx/2)11+(rΔτ)2/(4λ2sin4)1.|\xi|^2 = \frac{1 - 2\lambda\sin^2(\alpha\Delta x/2)}{1 + 2\lambda\sin^2(\alpha\Delta x/2)} \cdot \frac{1}{1 + (r\Delta\tau)^2/(4\lambda^2 \sin^4)} \leq 1.

More simply: ξ1|\xi| \leq 1 always, so CN is unconditionally stable.

Spurious Oscillations

CN is second-order accurate but can produce non-monotone oscillations near payoff discontinuities (e.g., digital options, or the kink at the strike for vanilla options on a coarse grid). The oscillation arises because CN uses θ=1/2\theta = 1/2 which has no dissipation (it does not damp high-frequency modes).

Rannacher smoothing: replace the first two time steps with BTCS (which is dissipative) and then switch to CN for all remaining steps. This eliminates oscillations while preserving second-order accuracy in the overall scheme (since the error from two BTCS steps is O(Δτ)O(\Delta\tau) but only contributes 2Δτ/T02\Delta\tau / T \to 0 of the total).


Consistency, Stability, and Convergence

Definitions

Let LhL_h be the finite difference operator on a grid with spacing h=(Δx,Δτ)h = (\Delta x, \Delta\tau), and LL the continuous PDE operator.

Consistency: The scheme is consistent if the truncation error τh=LhCLC\tau_h = L_h C - L C satisfies τh0\tau_h \to 0 as h0h \to 0. All three schemes above are consistent.

Stability: The scheme is stable if the amplification matrix is uniformly bounded: AnC\|A^n\| \leq C for all nT/Δτn \leq T/\Delta\tau. Von Neumann stability (all ξ1|\xi| \leq 1) is sufficient for constant-coefficient problems.

Convergence: The numerical solution converges to the true solution: ChC0\|C_h - C\| \to 0 as h0h \to 0.

Lax Equivalence Theorem

Theorem (Lax-Richtmyer). For a consistent finite difference scheme applied to a well-posed linear initial-value problem:

Stability    Convergence.\text{Stability} \iff \text{Convergence}.

Consistency alone is not sufficient — an unstable consistent scheme may diverge. The Lax theorem is the fundamental result justifying stability analysis as a proxy for convergence. It applies to linear problems; for nonlinear problems (e.g., American option PSOR) a separate convergence argument is needed.

Convergence Orders Summary

SchemeTime orderSpace orderStability
Explicit (FTCS)O(Δτ)O(\Delta\tau)O(Δx2)O(\Delta x^2)Conditional: σ2ΔτΔx2\sigma^2\Delta\tau \leq \Delta x^2
Implicit (BTCS)O(Δτ)O(\Delta\tau)O(Δx2)O(\Delta x^2)Unconditional
Crank-NicolsonO(Δτ2)O(\Delta\tau^2)O(Δx2)O(\Delta x^2)Unconditional

Grid Design in Practice

Log-spot transformation. Always apply x=lnSx = \ln S before discretising. This makes the PDE coefficients constant and the grid uniform in log-moneyness, giving denser coverage near the money.

Grid boundaries. Set xmin=ln(S0)5σTx_{\min} = \ln(S_0) - 5\sigma\sqrt{T} and xmax=ln(S0)+5σTx_{\max} = \ln(S_0) + 5\sigma\sqrt{T}: five standard deviations captures essentially all of the distribution. Using smaller boundaries introduces truncation error at the boundaries.

Grid spacing. A rule of thumb for CN: ΔxσΔτ\Delta x \approx \sigma\sqrt{\Delta\tau} ensures λ1/2\lambda \approx 1/2, which is near-optimal for accuracy without triggering stability issues. For vanilla options: M=200M = 200400400 spatial points and N=100N = 100200200 time steps typically give 4–5 significant figures.

Non-uniform grids. For barrier options or near-expiry options, use a finer grid near the barrier or the strike and coarser grids elsewhere. Requires modified stencils.


Limitations

Multidimensional PDEs. For two-factor models (Heston, two-asset options), the PDE is two-dimensional in state space. Standard CN in 2D requires solving a system with O(M2)O(M^2) unknowns and a non-tridiagonal matrix. The Alternating Direction Implicit (ADI) method (Craig-Sneyd, Hout-Foulon) splits the 2D problem into a sequence of 1D tridiagonal solves, recovering O(M)O(M) cost per time step while maintaining second-order accuracy.

American options. The early exercise constraint converts the PDE into a variational inequality, requiring either PSOR (see next module) or the penalty method. The Lax theorem does not directly apply; a separate monotonicity argument (Barles-Souganidis) establishes convergence.

Rough payoffs. Digital options have a discontinuous payoff, causing Gibbs-like oscillations with CN. Rannacher smoothing or use of BTCS throughout is recommended.

Greeks by finite difference on the grid. Delta and Gamma can be read directly from the grid values at adjacent nodes (no bump-reval needed), making FD particularly attractive for Greeks-intensive applications.


Interview Angle

L1. What is the Black-Scholes PDE? Write down the FTCS discretisation. Why must the time step satisfy the CFL condition?

BS PDE: tC+12σ2S2SSC+rSSCrC=0\partial_t C + \frac{1}{2}\sigma^2 S^2 \partial_{SS} C + rS\partial_S C - rC = 0. After log-transform: τC=σ22xxC+(rσ22)xCrC\partial_\tau C = \frac{\sigma^2}{2}\partial_{xx}C + (r-\frac{\sigma^2}{2})\partial_x C - rC.

FTCS CFL: The explicit scheme computes Cij+1C_i^{j+1} as a linear combination of Ci1j,Cij,Ci+1jC_{i-1}^j, C_i^j, C_{i+1}^j. For stability, the coefficients must be non-negative (monotone scheme). The coefficient of CijC_i^j is 12λrΔτ0λ1/2σ2ΔτΔx21 - 2\lambda - r\Delta\tau \geq 0 \Rightarrow \lambda \leq 1/2 \Rightarrow \sigma^2\Delta\tau \leq \Delta x^2. If this is violated, the scheme amplifies high-frequency noise exponentially — the solution blows up.

L2. Prove that the BTCS scheme is unconditionally stable via Von Neumann analysis. State the Lax equivalence theorem and explain its significance. Why is Crank-Nicolson preferred over BTCS in practice?

BTCS stability proof: Assume Cij=ξjeiαxiC_i^j = \xi^j e^{i\alpha x_i}. The BTCS relation for the diffusion part gives ξ(1+2λ(1cosαΔx))=1\xi(1 + 2\lambda(1-\cos\alpha\Delta x)) = 1, so ξ=1/(1+2λ(1cosαΔx))\xi = 1/(1 + 2\lambda(1-\cos\alpha\Delta x)). Since 1cosαΔx01 - \cos\alpha\Delta x \geq 0 for all α\alpha, the denominator is 1\geq 1, hence ξ1|\xi| \leq 1 for all λ>0\lambda > 0. No CFL condition.

Lax: Consistency + stability \Leftrightarrow convergence (for linear well-posed problems). Without Lax, proving convergence directly for FD schemes would require detailed error analysis at every grid point; Lax reduces it to: prove consistency (a Taylor expansion check) and prove stability (an eigenvalue/Von Neumann check).

CN vs. BTCS: CN achieves O(Δτ2)O(\Delta\tau^2) vs. O(Δτ)O(\Delta\tau) time accuracy, so for the same total error, CN requires N\sqrt{N} fewer time steps than BTCS. In practice, CN halves the number of time steps needed for the same accuracy, at negligible additional cost (same tridiagonal structure).

L3. Derive the Von Neumann stability condition for Crank-Nicolson applied to the heat equation. Explain why CN oscillates near discontinuities and how Rannacher smoothing eliminates this. Describe the ADI extension to two-dimensional PDEs.

CN stability for heat equation: With τC=DxxC\partial_\tau C = D \partial_{xx} C, the CN amplification factor is ξ=(1λ(1cosαΔx))/(1+λ(1cosαΔx))\xi = (1 - \lambda(1-\cos\alpha\Delta x))/(1 + \lambda(1-\cos\alpha\Delta x)) where λ=DΔτ/Δx2\lambda = D\Delta\tau/\Delta x^2. Since 1λ()1+λ()1 - \lambda(\cdot) \leq 1 + \lambda(\cdot), we have ξ1\xi \leq 1; and ξ(1+λ())/(1+λ())=1\xi \geq -(1+\lambda(\cdot))/(1+\lambda(\cdot)) = -1 only if numerator is non-negative, but for λ>1/2\lambda > 1/2 and cosαΔx=1\cos\alpha\Delta x = -1, ξ=(12λ)/(1+2λ)\xi = (1-2\lambda)/(1+2\lambda), which has ξ<1|\xi| < 1 for all λ>0\lambda > 0. So unconditionally stable.

Oscillations: CN has zero dissipation at the Nyquist frequency (ξ=1|\xi| = 1 for α=π/Δx\alpha = \pi/\Delta x): high-frequency errors are not damped. A discontinuous payoff excites all frequencies; the undamped high-frequency components produce oscillations that persist for many time steps.

Rannacher: Two BTCS steps at the start damp all high-frequency modes (BTCS has ξ<1|\xi| < 1 strictly). After this "smoothing phase," CN continues with second-order accuracy. The two BTCS steps contribute O(Δτ)O(\Delta\tau) local error but only over a time interval 2Δτ02\Delta\tau \to 0, leaving the global error dominated by the CN phase.

ADI (Craig-Sneyd): For a 2D PDE τC=(Lx+Lxy+Ly)C\partial_\tau C = (L_x + L_{xy} + L_y)C, split each time step into substeps that solve 1D tridiagonal systems alternately in xx and yy directions. The Craig-Sneyd scheme (1988) is second-order in time while maintaining the O(M)O(M) cost of 1D solves. The cross-derivative term LxyL_{xy} (arising from the correlation ρ\rho in Heston) is treated explicitly in the predictor step and corrected in a subsequent corrector step.

Read the theory? Run the code.

View Notebook