Itô's Lemma: Derivation and Applications

Setup

We work on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with filtration $\mathbb{F} = (\mathcal{F}_t)_{t \geq 0}$ satisfying the usual conditions. Let $(W_t)_{t \geq 0}$ be a standard $\mathbb{P}$-Brownian motion.

An Itô process is a continuous adapted process of the form
$$X_t = X_0 + \int_0^t \mu_s \, ds + \int_0^t \sigma_s \, dW_s,$$
where $\mu$ (drift) and $\sigma$ (diffusion) are $\mathbb{F}$-progressively measurable and satisfy
$$\int_0^T |\mu_s| \, ds < \infty \quad \text{and} \quad \int_0^T \sigma_s^2 \, ds < \infty \quad \text{a.s.}$$

In differential notation: $dX_t = \mu_t \, dt + \sigma_t \, dW_t$.


The Itô Integral

The stochastic integral $\int_0^T \sigma_s \, dW_s$ is defined as the $L^2$ limit of non-anticipating Riemann sums:
$$\int_0^T \sigma_s \, dW_s = L^2\text{-}\!\lim_{|\pi| \to 0} \sum_{i} \sigma_{t_{i-1}} \bigl(W_{t_i} - W_{t_{i-1}}\bigr).$$

The requirement to evaluate $\sigma$ at the left endpoint $t_{i-1}$, not the midpoint, is what makes the integral Itô. Left-endpoint evaluation ensures $\sigma_{t_{i-1}}$ is $\mathcal{F}_{t_{i-1}}$-measurable (non-anticipating). The choice of evaluation point matters: different conventions yield different integrals (see Itô vs Stratonovich below).

Itô isometry. For square-integrable adapted $\sigma$:
$$\mathbb{E}\!\left[\left(\int_0^T \sigma_s \, dW_s\right)^2\right] = \mathbb{E}\!\left[\int_0^T \sigma_s^2 \, ds\right].$$

This is the key $L^2$ norm identity. It follows from expanding the square and using independence of non-overlapping Brownian increments: the cross terms $\mathbb{E}[\sigma_{t_{i-1}}(W_{t_i} - W_{t_{i-1}}) \cdot \sigma_{t_{j-1}}(W_{t_j} - W_{t_{j-1}})]$ vanish for $i \neq j$.

A square-integrable adapted integrand produces a martingale:
$$\mathbb{E}\!\left[\int_0^t \sigma_s \, dW_s \,\Big|\, \mathcal{F}_r\right] = \int_0^r \sigma_s \, dW_s, \quad r \leq t.$$
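The isometry and the zero-mean property can be checked numerically. The sketch below is a Monte Carlo illustration only: the integrand $\sigma_s = W_s$, the step count, the path count and the function name `ito_isometry_check` are all assumptions chosen for the example, not part of the theory above.

```python
import numpy as np

def ito_isometry_check(T=1.0, n_steps=200, n_paths=20_000, seed=0):
    """Monte Carlo comparison of E[(int W dW)^2] and E[int W^2 ds] on [0, T]."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Brownian increments, one row per simulated path.
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    W = np.cumsum(dW, axis=1)
    # Left-endpoint values W_{t_{i-1}}: prepend W_0 = 0, drop the final value.
    W_left = np.hstack([np.zeros((n_paths, 1)), W[:, :-1]])
    ito_integral = np.sum(W_left * dW, axis=1)      # non-anticipating Riemann sum
    lhs = np.mean(ito_integral**2)                  # estimates E[(int W dW)^2]
    rhs = np.mean(np.sum(W_left**2, axis=1) * dt)   # estimates E[int W^2 ds]
    return lhs, rhs, np.mean(ito_integral)

if __name__ == "__main__":
    lhs, rhs, mean = ito_isometry_check()
    print(f"E[(int W dW)^2] ~ {lhs:.4f}, E[int W^2 ds] ~ {rhs:.4f}, mean ~ {mean:.4f}")
```

For this integrand both sides should be close to $T^2/2$, and the sample mean of the integral should be close to zero, up to Monte Carlo and discretisation error.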


Itô's Lemma: Statement

Let $f: \mathbb{R}_+ \times \mathbb{R} \to \mathbb{R}$ be $C^{1,2}$ (once continuously differentiable in $t$, twice in $x$). Define $Y_t = f(t, X_t)$. Then $Y$ is again an Itô process and
$$dY_t = \frac{\partial f}{\partial t}(t, X_t) \, dt + \frac{\partial f}{\partial x}(t, X_t) \, dX_t + \frac{1}{2} \frac{\partial^2 f}{\partial x^2}(t, X_t) \, (dX_t)^2,$$
where $(dX_t)^2$ is evaluated using the Itô multiplication table:
$$dt \cdot dt = 0, \qquad dt \cdot dW_t = 0, \qquad dW_t \cdot dW_t = dt.$$

Substituting $dX_t = \mu_t \, dt + \sigma_t \, dW_t$ and $(dX_t)^2 = \sigma_t^2 \, dt$:
$$\boxed{dY_t = \left(\frac{\partial f}{\partial t} + \mu_t \frac{\partial f}{\partial x} + \frac{1}{2}\sigma_t^2 \frac{\partial^2 f}{\partial x^2}\right) dt + \sigma_t \frac{\partial f}{\partial x} \, dW_t.}$$

The term $\frac{1}{2}\sigma_t^2 f_{xx}$ is the Itô correction. It has no analogue in classical calculus and arises solely from the non-zero quadratic variation of Brownian motion.
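As a quick worked instance of the formula, take $X_t = W_t$ (so $\mu_t = 0$, $\sigma_t = 1$) and $f(x) = x^2$; this choice is made here purely for illustration. Then $f_t = 0$, $f_x = 2x$, $f_{xx} = 2$, and

$$d(W_t^2) = 2 W_t \, dW_t + dt, \qquad \text{so} \qquad \int_0^T W_t \, dW_t = \tfrac{1}{2}\bigl(W_T^2 - T\bigr).$$

Classical calculus would give $\tfrac{1}{2} W_T^2$; the extra $-T/2$ is exactly the Itô correction.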


Derivation

Apply Taylor's theorem to $f(t + dt, X_{t+dt})$ around $(t, X_t)$:
$$df = f_t \, dt + f_x \, dX + \frac{1}{2}f_{xx}(dX)^2 + \frac{1}{2}f_{tt}(dt)^2 + f_{xt} \, dt \, dX + \cdots$$

Substitute $dX = \mu \, dt + \sigma \, dW$ and expand $(dX)^2$:
$$(dX)^2 = \mu^2(dt)^2 + 2\mu\sigma \, dt \, dW + \sigma^2(dW)^2.$$

Apply the Itô multiplication table. The key step: $(dW)^2 = dt$ is the quadratic variation result $[W]_t = t$ in differential form. The other products vanish as $o(dt)$:
$$(dX)^2 = \sigma^2 \, dt + O\!\left((dt)^{3/2}\right).$$

Similarly, $(dt)^2 = 0$ and $dt \cdot dW = 0$ in the $L^2$ sense. Retaining only terms of order $dt$:
$$df = f_t \, dt + f_x(\mu \, dt + \sigma \, dW) + \frac{1}{2}f_{xx}\sigma^2 \, dt.$$

Collecting drift and diffusion terms:
$$df = \left(f_t + \mu f_x + \frac{1}{2}\sigma^2 f_{xx}\right) dt + \sigma f_x \, dW.$$

This is Itô's lemma. The heuristic argument is exact in identifying the correct terms; the rigorous version replaces the Taylor remainder analysis with an $L^2$ convergence argument using the Itô isometry.


Application: Solving the GBM SDE

Let $S_t$ satisfy the geometric Brownian motion SDE:
$$dS_t = \mu S_t \, dt + \sigma S_t \, dW_t.$$

Apply Itô's lemma to $f(S_t) = \ln S_t$, with $f_x = 1/x$, $f_{xx} = -1/x^2$, $f_t = 0$:
$$d(\ln S_t) = \frac{1}{S_t} \, dS_t - \frac{1}{2} \cdot \frac{1}{S_t^2} \cdot \sigma^2 S_t^2 \, dt = \left(\mu - \frac{\sigma^2}{2}\right) dt + \sigma \, dW_t.$$

Integrating from 0 to $T$:
$$\ln S_T - \ln S_0 = \left(\mu - \frac{\sigma^2}{2}\right)T + \sigma W_T.$$

Therefore:
$$S_T = S_0 \exp\!\left(\left(\mu - \frac{\sigma^2}{2}\right)T + \sigma W_T\right).$$
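One way to sanity-check the closed form is to compare it with a direct Euler-Maruyama discretisation of the SDE on the same Brownian path. The sketch below does that; the parameter values, step count and function name `gbm_exact_vs_euler` are illustrative assumptions, and Euler-Maruyama is just one convenient discretisation, not a scheme prescribed by the text.

```python
import numpy as np

def gbm_exact_vs_euler(s0=100.0, mu=0.08, sigma=0.2, T=1.0, n_steps=10_000, seed=1):
    """Return (closed-form S_T, Euler-Maruyama S_T) driven by the same increments."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    # Closed form from Ito's lemma: S_T = S_0 exp((mu - sigma^2/2) T + sigma W_T).
    exact = s0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * dW.sum())
    # Euler-Maruyama: S_{k+1} = S_k (1 + mu dt + sigma dW_k), applied step by step.
    euler = s0 * np.prod(1.0 + mu * dt + sigma * dW)
    return exact, euler

if __name__ == "__main__":
    exact, euler = gbm_exact_vs_euler()
    print(f"closed form: {exact:.4f}, Euler-Maruyama: {euler:.4f}")
```

With a fine grid the two values should agree closely, which would not happen if the $-\sigma^2/2$ term were dropped from the exponent.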

The Role of the Itô Correction

The term $-\sigma^2/2$ is not a typo. It is the Itô correction from $f_{xx} = -1/S^2$. Taking expectations in the SDE gives $d\mathbb{E}[S_t] = \mu \mathbb{E}[S_t] \, dt$, so the correct mean is
$$\mathbb{E}[S_T] = S_0 e^{\mu T}.$$

If one naively wrote $\ln S_T = \ln S_0 + \mu T + \sigma W_T$, the expectation would be wrong. Since $\mathbb{E}[e^{\sigma W_T}] = e^{\sigma^2 T/2}$,
$$\mathbb{E}\bigl[e^{\mu T + \sigma W_T}\bigr] = e^{\mu T} \cdot e^{\sigma^2 T/2} = e^{(\mu + \sigma^2/2)T} \neq e^{\mu T}.$$
The Itô correction $-\sigma^2/2$ is what reconciles the arithmetic mean growth $\mu$ of the SDE with the geometric mean growth $\mu - \sigma^2/2$ of the log process.
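The effect of the correction on expectations can be seen by sampling $W_T$ directly. In this sketch the parameter values and the function name `gbm_moments` are assumptions for illustration; it compares the sample mean of the corrected solution, the sample mean of the naive formula without the $-\sigma^2/2$ term, and the sample mean log-return.

```python
import numpy as np

def gbm_moments(s0=1.0, mu=0.08, sigma=0.2, T=1.0, n_paths=400_000, seed=2):
    """Sample means of S_T with and without the Ito correction in the exponent."""
    rng = np.random.default_rng(seed)
    W_T = rng.normal(0.0, np.sqrt(T), size=n_paths)
    corrected = s0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * W_T)
    naive = s0 * np.exp(mu * T + sigma * W_T)  # drops the -sigma^2/2 term
    return {
        "mean_corrected": corrected.mean(),                 # target: s0 * exp(mu T)
        "mean_naive": naive.mean(),                         # target: s0 * exp((mu + sigma^2/2) T)
        "mean_log_return": np.log(corrected / s0).mean(),   # target: (mu - sigma^2/2) T
    }

if __name__ == "__main__":
    for name, value in gbm_moments().items():
        print(f"{name}: {value:.4f}")
```

The corrected mean should sit near $S_0 e^{\mu T}$ while the naive one overshoots by roughly the factor $e^{\sigma^2 T/2}$.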


Itô vs Stratonovich

The Stratonovich integral evaluates the integrand at the midpoint, here taken as the average of the two endpoint values:
$$\int_0^T \sigma_s \circ dW_s = L^2\text{-}\!\lim_{|\pi| \to 0} \sum_i \frac{\sigma_{t_i} + \sigma_{t_{i-1}}}{2}\bigl(W_{t_i} - W_{t_{i-1}}\bigr).$$

The Stratonovich integral obeys the classical chain rule, with no correction term:
$$df(W_t) = f'(W_t) \circ dW_t.$$

When the integrand has the form $\sigma_s = \sigma(X_s)$ for a differentiable function $\sigma$, and $X$ itself has diffusion coefficient $\sigma(X_s)$, the two integrals are related by:
$$\int_0^T \sigma_s \circ dW_s = \int_0^T \sigma_s \, dW_s + \frac{1}{2}\int_0^T \frac{\partial \sigma}{\partial x}(X_s) \, \sigma_s \, ds.$$

The Stratonovich convention is preferred in physics and differential geometry (because it obeys the classical chain rule and behaves well under coordinate changes). The Itô convention is standard in financial mathematics because:

  1. The Itô integral of an adapted square-integrable process is a martingale — essential for risk-neutral pricing.
  2. The Stratonovich integral anticipates future information through the midpoint evaluation, which is economically meaningless.
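The difference between the two conventions shows up already for the integrand $\sigma_s = W_s$. The sketch below (function name and grid size are illustrative assumptions) computes the left-endpoint sum and the endpoint-average sum on one simulated path. The averaged sum telescopes to $W_T^2/2$, the classical answer, while the left-endpoint sum approaches $(W_T^2 - T)/2$.

```python
import numpy as np

def ito_and_stratonovich_sums(T=1.0, n_steps=100_000, seed=3):
    """Left-endpoint (Ito) and endpoint-average (Stratonovich) sums for int W dW."""
    rng = np.random.default_rng(seed)
    dW = rng.normal(0.0, np.sqrt(T / n_steps), size=n_steps)
    W = np.concatenate([[0.0], np.cumsum(dW)])           # W[0] = 0, W[-1] = W_T
    ito = np.sum(W[:-1] * dW)                            # integrand at t_{i-1}
    stratonovich = np.sum(0.5 * (W[:-1] + W[1:]) * dW)   # average of both endpoints
    return ito, stratonovich, W[-1]

if __name__ == "__main__":
    ito, strat, w_T = ito_and_stratonovich_sums()
    print(f"Ito sum: {ito:.4f}  vs (W_T^2 - T)/2 = {(w_T**2 - 1.0) / 2:.4f}")
    print(f"Stratonovich sum: {strat:.4f}  vs W_T^2/2 = {w_T**2 / 2:.4f}")
```

The gap between the two sums is close to $T/2$, which matches the conversion term $\tfrac{1}{2}\int_0^T ds$ for this integrand.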

Multidimensional Itô Lemma

For a vector of Itô processes $X = (X^1, \ldots, X^d)$ driven by correlated Brownian motions $W^1, \ldots, W^m$ with $d\langle W^i, W^j\rangle_t = \rho_{ij} \, dt$, and a function $f \in C^{1,2}$:
$$df(t, X_t) = f_t \, dt + \sum_{i=1}^d f_{x_i} \, dX_t^i + \frac{1}{2}\sum_{i,j=1}^d f_{x_i x_j} \, d\langle X^i, X^j\rangle_t.$$

Here $d\langle X^i, X^j\rangle_t = \sum_{k,l=1}^m \sigma^{ik}\rho_{kl}\sigma^{jl} \, dt$ is the quadratic covariation, where $\sigma^{ik}$ is the diffusion coefficient of $X^i$ with respect to $W^k$. When the Brownian motions are independent ($\rho_{kl} = \delta_{kl}$), this reduces to $\sum_{k=1}^m \sigma^{ik}\sigma^{jk} \, dt$.
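With correlated drivers, the covariation matrix of $X$ over $[0, T]$ is $\sigma \rho \sigma^\top T$ in matrix form for constant coefficients. The sketch below checks this on one simulated path by summing products of increments. The constant diffusion matrix, the correlation matrix, the zero drift and the function name `realized_covariation` are assumptions made for the illustration.

```python
import numpy as np

def realized_covariation(sigma, rho, T=1.0, n_steps=200_000, seed=4):
    """Realized covariation of dX = sigma dW versus sigma @ rho @ sigma.T * T."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    chol = np.linalg.cholesky(rho)   # rho = chol @ chol.T
    # Rows are correlated increments with covariance rho * dt.
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_steps, rho.shape[0])) @ chol.T
    dX = dW @ sigma.T                # dX^i = sum_k sigma^{ik} dW^k (zero drift assumed)
    realized = dX.T @ dX             # sum over steps of dX^i * dX^j
    theoretical = sigma @ rho @ sigma.T * T
    return realized, theoretical

if __name__ == "__main__":
    sigma = np.array([[0.2, 0.0], [0.1, 0.3]])
    rho = np.array([[1.0, 0.5], [0.5, 1.0]])
    realized, theoretical = realized_covariation(sigma, rho)
    print("realized:\n", realized)
    print("sigma rho sigma^T T:\n", theoretical)
```

Setting `rho` to the identity recovers the independent-driver case $\sigma \sigma^\top T$.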


Limitations

Regularity. Itô's lemma requires $f \in C^{1,2}$. For payoffs with kinks, such as the call payoff $(x - K)^+$, whose second derivative exists only as the distribution $f_{xx} = \delta_{x=K}$, the formula breaks down. The correct generalization uses Tanaka's formula and local time.
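For reference, one standard form of Tanaka's formula for the call payoff is sketched here. It assumes $X$ is a continuous semimartingale and writes $L_t^K$ for its local time at level $K$; the notation $L_t^K$ is introduced for this illustration, and normalisation conventions for local time vary between texts.

$$(X_t - K)^+ = (X_0 - K)^+ + \int_0^t \mathbf{1}_{\{X_s > K\}} \, dX_s + \frac{1}{2} L_t^K.$$

The local-time term plays the role of $\tfrac{1}{2}\int f_{xx} \, d\langle X \rangle$ when $f_{xx}$ is a Dirac mass at $K$.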

Pathwise inapplicability. The Itô integral cannot be defined sample-path by sample-path: the paths of $W$ have infinite total variation. The $L^2$ construction is inherently probabilistic. All Itô calculus identities hold in the almost-sure or $L^2$ sense, not pointwise in $\omega$.

Itô correction in parameter estimation. If one observes a log-price process and estimates $\mu$ from the empirical mean of log-returns, the estimate targets $\mu - \sigma^2/2$, not $\mu$. The distinction matters for maximum likelihood estimation of drift in GBM models.
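The estimation point can be made concrete with simulated data. In the sketch below, the helper names `simulate_gbm_prices` and `estimate_gbm_params`, the sampling interval and the "true" parameters are all assumptions for illustration. The mean log-return per unit time estimates $\mu - \sigma^2/2$, and $\mu$ is recovered by adding back half the estimated variance.

```python
import numpy as np

def simulate_gbm_prices(s0, mu, sigma, dt, n_steps, seed=5):
    """Exact GBM sampling on a regular grid via the log-price solution."""
    rng = np.random.default_rng(seed)
    dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    log_increments = (mu - 0.5 * sigma**2) * dt + sigma * dW
    return s0 * np.exp(np.concatenate([[0.0], np.cumsum(log_increments)]))

def estimate_gbm_params(prices, dt):
    """Return (log-drift estimate, sigma estimate, mu estimate) from log-returns."""
    log_returns = np.diff(np.log(prices))
    log_drift = log_returns.mean() / dt        # targets mu - sigma^2/2, not mu
    sigma2 = log_returns.var(ddof=1) / dt      # sample variance per unit time
    mu_hat = log_drift + 0.5 * sigma2          # add the Ito correction back
    return log_drift, np.sqrt(sigma2), mu_hat

if __name__ == "__main__":
    prices = simulate_gbm_prices(100.0, mu=0.08, sigma=0.2, dt=1 / 250, n_steps=250_000)
    log_drift, sigma_hat, mu_hat = estimate_gbm_params(prices, dt=1 / 250)
    print(f"log-drift: {log_drift:.4f}, sigma: {sigma_hat:.4f}, mu: {mu_hat:.4f}")
```

Note that the drift estimate stays noisy even with many observations, since its standard error depends on the total time span rather than the sampling frequency; the volatility estimate is much tighter.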


Interview Angle

L1: State Itô's lemma. Apply it to $f(S_t) = \ln S_t$ for $dS_t = \mu S_t \, dt + \sigma S_t \, dW_t$. What is the economic meaning of the $-\sigma^2/2$ term?

Statement. For $f \in C^{1,2}$ and $dX_t = \mu_t \, dt + \sigma_t \, dW_t$:
$$df(t, X_t) = \left(f_t + \mu_t f_x + \tfrac{1}{2}\sigma_t^2 f_{xx}\right) dt + \sigma_t f_x \, dW_t.$$

Application to $\ln S_t$. With $f(x) = \ln x$: $f_x = 1/x$, $f_{xx} = -1/x^2$, $f_t = 0$, $\mu_t = \mu S_t$, $\sigma_t = \sigma S_t$:
$$d(\ln S_t) = \frac{1}{S_t} \cdot \mu S_t \, dt - \frac{1}{2} \cdot \frac{1}{S_t^2} \cdot \sigma^2 S_t^2 \, dt + \frac{\sigma S_t}{S_t} \, dW_t = \left(\mu - \frac{\sigma^2}{2}\right) dt + \sigma \, dW_t.$$
Integrating: $S_T = S_0 \exp\!\left((\mu - \sigma^2/2)T + \sigma W_T\right)$.

Economic meaning of $-\sigma^2/2$. This is the volatility drag: the gap between arithmetic and geometric growth. The SDE $dS = \mu S \, dt + \sigma S \, dW$ implies $\mathbb{E}[S_T] = S_0 e^{\mu T}$: the expected stock price grows at rate $\mu$. But $\mathbb{E}[\ln(S_T/S_0)] = (\mu - \sigma^2/2)T$: the expected log-return grows at the lower rate $\mu - \sigma^2/2$. The gap $\sigma^2/2$ arises from the convexity of $\exp$ (Jensen's inequality): because log-returns are normally distributed, the log of the expected gross return, $\ln \mathbb{E}[S_T/S_0] = \mu T$, exceeds the expected log-return by exactly $\sigma^2 T/2$. In practice: a fund with annualised vol $\sigma = 20\%$ suffers a drag of $\sigma^2/2 = 2\%$ per year, so its compound growth rate is 2 percentage points below its average return. Practitioners who confuse arithmetic and geometric returns misprice long-horizon options and misreport expected performance.

L2: Derive Itô's lemma from a Taylor expansion. Why does $(dW)^2 = dt$ in the Itô table? What is the Itô isometry, and why does it imply the stochastic integral is a martingale?

Derivation. Apply the second-order Taylor expansion to $f(t + dt, X_{t+dt})$ around $(t, X_t)$:
$$df = f_t \, dt + f_x \, dX + \frac{1}{2}f_{xx}(dX)^2 + O(dt^{3/2}).$$
Expand $(dX)^2 = (\mu \, dt + \sigma \, dW)^2 = \mu^2 (dt)^2 + 2\mu\sigma \, dt \cdot dW + \sigma^2(dW)^2$. Apply the multiplication table: $(dt)^2 = 0$, $dt \cdot dW = 0$, and $(dW)^2 = dt$. Only the last term survives, giving $(dX)^2 = \sigma^2 \, dt$. Substituting:
$$df = f_t \, dt + f_x(\mu \, dt + \sigma \, dW) + \frac{1}{2}f_{xx}\sigma^2 \, dt = \left(f_t + \mu f_x + \tfrac{1}{2}\sigma^2 f_{xx}\right)dt + \sigma f_x \, dW. \qquad \square$$

Why $(dW)^2 = dt$. This is the differential form of $[W]_t = t$. On a partition of mesh $h$, the squared increment $(W_{t+h} - W_t)^2$ has mean $h$ and variance $2h^2$. Summing over $n = T/h$ intervals: the total has mean $T$ and variance $2T^2/n \to 0$ as $n \to \infty$. The sum concentrates on $T$ in $L^2$ (not just in expectation): the fluctuations vanish, and the quadratic variation is deterministically equal to $t$. In the Taylor expansion this means $(dW)^2$ contributes a deterministic correction of size $dt$, not a random term: it is absorbed into the drift, not the martingale part.
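The concentration of the summed squared increments is easy to observe numerically. This sketch (function name, grid sizes and path count are illustrative assumptions) estimates the mean and variance of $\sum_i (\Delta W_i)^2$ across simulated paths for a given number of steps $n$; the targets from the argument above are $T$ and $2T^2/n$.

```python
import numpy as np

def quadratic_variation_stats(T=1.0, n_steps=1000, n_paths=5000, seed=6):
    """Sample mean and variance of sum_i (dW_i)^2 over n_steps equal intervals."""
    rng = np.random.default_rng(seed)
    dW = rng.normal(0.0, np.sqrt(T / n_steps), size=(n_paths, n_steps))
    realized_qv = np.sum(dW**2, axis=1)   # one realized quadratic variation per path
    return realized_qv.mean(), realized_qv.var()

if __name__ == "__main__":
    for n in (10, 100, 1000):
        mean, var = quadratic_variation_stats(n_steps=n)
        print(f"n = {n:5d}: mean ~ {mean:.4f}, variance ~ {var:.5f}, 2T^2/n = {2.0 / n:.5f}")
```

The mean stays near $T$ for every $n$, while the variance shrinks roughly like $1/n$.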

Itô isometry. For square-integrable adapted $\sigma$:
$$\mathbb{E}\!\left[\left(\int_0^T \sigma_s \, dW_s\right)^2\right] = \mathbb{E}\!\left[\int_0^T \sigma_s^2 \, ds\right].$$
Proof. Expand the square of the Riemann-sum approximant $\sum_i \sigma_{t_{i-1}} \Delta W_i$:
$$\mathbb{E}\!\left[\left(\sum_i \sigma_{t_{i-1}} \Delta W_i\right)^2\right] = \sum_{i,j} \mathbb{E}[\sigma_{t_{i-1}}\sigma_{t_{j-1}} \Delta W_i \Delta W_j].$$
For $i \neq j$ (say $i < j$): $\Delta W_j$ is independent of $\mathcal{F}_{t_{j-1}}$, which contains $\sigma_{t_{i-1}}, \sigma_{t_{j-1}}, \Delta W_i$. So $\mathbb{E}[\sigma_{t_{i-1}}\sigma_{t_{j-1}} \Delta W_i \Delta W_j] = \mathbb{E}[\sigma_{t_{i-1}}\sigma_{t_{j-1}} \Delta W_i] \cdot \mathbb{E}[\Delta W_j] = 0$. Only diagonal terms contribute, and on a uniform partition of mesh $h$:
$$\sum_i \mathbb{E}[\sigma_{t_{i-1}}^2 (\Delta W_i)^2] = \sum_i \mathbb{E}[\sigma_{t_{i-1}}^2] \cdot h \to \mathbb{E}\!\left[\int_0^T \sigma_s^2 \, ds\right].$$

Why the stochastic integral is a martingale. For an adapted square-integrable integrand, $M_t = \int_0^t \sigma_s \, dW_s$ is a martingale:
$$\mathbb{E}[M_t \mid \mathcal{F}_r] = M_r, \quad r \leq t.$$
The key is the non-anticipating (left-endpoint) evaluation: $\sigma_{t_{i-1}}$ is $\mathcal{F}_{t_{i-1}}$-measurable and hence independent of $\Delta W_i = W_{t_i} - W_{t_{i-1}}$. So $\mathbb{E}[\sigma_{t_{i-1}} \Delta W_i \mid \mathcal{F}_{t_{i-1}}] = \sigma_{t_{i-1}} \mathbb{E}[\Delta W_i \mid \mathcal{F}_{t_{i-1}}] = 0$. Summing over future steps: $\mathbb{E}[M_t - M_r \mid \mathcal{F}_r] = 0$. This is why the Itô convention (left-endpoint) is essential for finance: the resulting stochastic integral represents the gains from a non-anticipating trading strategy, and the martingale property ensures no-arbitrage.

L3: Compare Itô and Stratonovich conventions. Why is the Stratonovich integral not a martingale? State the multidimensional Itô lemma and apply it to a two-factor stochastic volatility model where $dS$ and $d\nu$ are correlated.

Itô vs Stratonovich. The Itô integral evaluates the integrand at the left endpoint of each interval; the Stratonovich integral at the midpoint:
$$\int_0^T \sigma_s \circ dW_s = L^2\text{-}\lim \sum_i \frac{\sigma_{t_{i-1}} + \sigma_{t_i}}{2}(W_{t_i} - W_{t_{i-1}}).$$
The midpoint value $\frac{1}{2}(\sigma_{t_{i-1}} + \sigma_{t_i})$ partially anticipates the future: $\sigma_{t_i}$ is $\mathcal{F}_{t_i}$-measurable, and there is a non-trivial correlation $\mathbb{E}[(\sigma_{t_i} - \sigma_{t_{i-1}})(W_{t_i} - W_{t_{i-1}})] \neq 0$ when $\sigma$ is itself driven by $W$. Concretely, if $d\sigma = a \, dt + b \, dW$ and $h$ is the interval length, then:
$$\mathbb{E}\!\left[\frac{\sigma_{t_{i-1}} + \sigma_{t_i}}{2}(W_{t_i} - W_{t_{i-1}})\right] \approx \frac{1}{2} b \cdot h \neq 0.$$
So $\mathbb{E}[M^{\mathrm{Strat}}_t - M^{\mathrm{Strat}}_r \mid \mathcal{F}_r] \neq 0$: the conditional increment has a non-zero drift, and the Stratonovich integral is not a martingale. The conversion formula makes this precise. For $\sigma_s = \sigma(X_s)$, where $X$ has diffusion coefficient $\sigma(X_s)$ (so that $b = \frac{\partial\sigma}{\partial x}\sigma$):
$$\int_0^T \sigma_s \circ dW_s = \int_0^T \sigma_s \, dW_s + \frac{1}{2}\int_0^T \frac{\partial\sigma}{\partial x}(X_s) \, \sigma_s \, ds.$$
The correction term $\frac{1}{2}\int \frac{\partial\sigma}{\partial x}\sigma \, ds$ is a finite-variation process: it is the martingale-killing drift added by midpoint evaluation. Stratonovich is standard in physics (where it obeys the classical chain rule and is appropriate for ODEs perturbed by smooth noise) and differential geometry, but it is the wrong convention for financial modelling.

Multidimensional Itô lemma. For Itô processes $X^1, \ldots, X^d$ driven by correlated Brownian motions $W^1, \ldots, W^m$ with $d\langle W^i, W^j\rangle_t = \rho_{ij} \, dt$, and $f \in C^{1,2}$:
$$df(t, X_t) = f_t \, dt + \sum_i f_{x_i} \, dX^i_t + \frac{1}{2}\sum_{i,j} f_{x_i x_j} \, d\langle X^i, X^j\rangle_t.$$

Application: Heston-type model. Let $S_t$ (spot) and $\nu_t$ (instantaneous variance) satisfy:
$$dS_t = \mu S_t \, dt + \sqrt{\nu_t} S_t \, dW^1_t, \qquad d\nu_t = \kappa(\bar\nu - \nu_t) \, dt + \xi\sqrt{\nu_t} \, dW^2_t, \qquad d\langle W^1, W^2\rangle_t = \rho \, dt.$$
The quadratic covariations are:
$$d\langle S, S\rangle_t = \nu_t S_t^2 \, dt, \quad d\langle \nu, \nu\rangle_t = \xi^2 \nu_t \, dt, \quad d\langle S, \nu\rangle_t = \rho\xi\nu_t S_t \, dt.$$
Applying the multidimensional Itô lemma to $V(t, S_t, \nu_t)$:
$$dV = \underbrace{\left[V_t + \mu S V_S + \kappa(\bar\nu - \nu) V_\nu + \tfrac{1}{2}\nu S^2 V_{SS} + \rho\xi\nu S \, V_{S\nu} + \tfrac{1}{2}\xi^2\nu V_{\nu\nu}\right]}_{\text{drift}} dt + \sqrt{\nu} S V_S \, dW^1 + \xi\sqrt{\nu} V_\nu \, dW^2.$$
Under the risk-neutral measure $\mathbb{Q}$, we replace $\mu \to r$ and the two Brownian terms constitute the option's hedge portfolio. Setting the discounted option price to be a $\mathbb{Q}$-martingale forces the drift to equal $rV$, giving the Heston PDE:
$$V_t + rS V_S + \kappa(\bar\nu - \nu) V_\nu + \tfrac{1}{2}\nu S^2 V_{SS} + \rho\xi\nu S \, V_{S\nu} + \tfrac{1}{2}\xi^2\nu V_{\nu\nu} - rV = 0.$$
Note the cross-derivative term $\rho\xi\nu S \, V_{S\nu}$: it is absent in Black-Scholes and directly encodes the vol-of-vol and spot-vol correlation that generates the implied vol skew.
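The covariation structure of the model can be checked on a simulated path. The sketch below is one possible discretisation, not the only one: it uses an Euler scheme with the variance floored at zero inside the coefficients (often called full truncation) and steps the log-spot using $d\ln S = (\mu - \nu/2)\,dt + \sqrt{\nu}\,dW^1$ from Itô's lemma. All parameter values and function names are assumptions for illustration. The realized correlation between log-spot and variance increments should be close to $\rho$, consistent with $d\langle S, \nu\rangle_t = \rho\xi\nu_t S_t \, dt$.

```python
import numpy as np

def simulate_heston(s0=100.0, v0=0.04, mu=0.05, kappa=2.0, v_bar=0.04,
                    xi=0.3, rho=-0.7, T=1.0, n_steps=50_000, seed=7):
    """Euler path of (log S, nu) with nu floored at zero inside the coefficients."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    z1 = rng.normal(size=n_steps)
    # Build the second driver so that corr(dW1, dW2) = rho.
    z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.normal(size=n_steps)
    dW1, dW2 = np.sqrt(dt) * z1, np.sqrt(dt) * z2
    log_s = np.empty(n_steps + 1)
    v = np.empty(n_steps + 1)
    log_s[0], v[0] = np.log(s0), v0
    for k in range(n_steps):
        v_pos = max(v[k], 0.0)  # keep the square root well defined
        log_s[k + 1] = log_s[k] + (mu - 0.5 * v_pos) * dt + np.sqrt(v_pos) * dW1[k]
        v[k + 1] = v[k] + kappa * (v_bar - v_pos) * dt + xi * np.sqrt(v_pos) * dW2[k]
    return log_s, v

def realized_spot_vol_correlation(log_s, v):
    """Realized covariation of (log S, nu), normalised by the realized variations."""
    d_log_s, d_v = np.diff(log_s), np.diff(v)
    return np.sum(d_log_s * d_v) / np.sqrt(np.sum(d_log_s**2) * np.sum(d_v**2))

if __name__ == "__main__":
    log_s, v = simulate_heston()
    print(f"realized spot-vol correlation: {realized_spot_vol_correlation(log_s, v):.3f}")
```

A negative `rho` here is the usual equity-style choice and is what produces a downward-sloping implied vol skew in this kind of model.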
