Brownian Bridge

Setup

What Monte Carlo Computes

Monte Carlo (MC) methods estimate expectations of the form

$\mu = \mathbb{E}[f(X)],$

where $X$ is a random variable (or vector) with a known distribution and $f$ is a payoff or functional. In derivatives pricing, $X$ is typically a path of the risk-neutral process and $f$ is the discounted payoff.

The standard MC estimator with $N$ i.i.d. samples $X_1, \ldots, X_N$ is

$\hat{\mu}_N = \frac{1}{N}\sum_{i=1}^N f(X_i).$

By the law of large numbers, $\hat{\mu}_N \to \mu$ almost surely. By the central limit theorem,

$\sqrt{N}\left(\hat{\mu}_N - \mu\right) \xrightarrow{d} \mathcal{N}(0, \sigma_f^2), \qquad \sigma_f^2 = \mathrm{Var}(f(X)).$

The standard error is $\mathrm{SE} = \sigma_f / \sqrt{N}$, converging at rate $O(N^{-1/2})$ regardless of the dimension of $X$. This is the central virtue of MC: the convergence rate is dimension-independent.

Notation

$\sigma_f^2 = \mathrm{Var}(f(X))$: per-sample variance of the payoff; the estimator $\hat{\mu}_N$ itself has variance $\sigma_f^2 / N$.
$N$ : number of independent sample paths.
$\hat{\sigma}_f^2$ : sample variance estimator, used to construct confidence intervals.
The $95\%$ confidence interval for $\mu$ is $\hat{\mu}_N \pm 1.96 \cdot \hat{\sigma}_f / \sqrt{N}$ .

A key principle: reducing $\sigma_f$ by a factor of $k$ has the same effect on the standard error as increasing $N$ by a factor of $k^2$. Variance reduction techniques achieve this without generating more samples.
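The estimator, standard error and confidence interval defined above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function names (`mc_call_price`, `bs_call`) and the parameter values are assumptions made for the example, and the Black-Scholes closed form is included only as a reference value.

```python
import math
import random
import statistics

def mc_call_price(s0, k, r, sigma, t, n_paths, seed=0):
    """Plain Monte Carlo price of a European call under GBM.

    Returns (estimate, standard_error, (ci_low, ci_high))."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * t
    vol = sigma * math.sqrt(t)
    disc = math.exp(-r * t)
    payoffs = []
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        s_t = s0 * math.exp(drift + vol * z)
        payoffs.append(disc * max(s_t - k, 0.0))
    mu_hat = statistics.fmean(payoffs)
    se = statistics.stdev(payoffs) / math.sqrt(n_paths)  # sigma_hat_f / sqrt(N)
    return mu_hat, se, (mu_hat - 1.96 * se, mu_hat + 1.96 * se)

def bs_call(s0, k, r, sigma, t):
    """Black-Scholes closed form, used only as a reference value."""
    phi = statistics.NormalDist().cdf
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * phi(d1) - k * math.exp(-r * t) * phi(d2)

# Illustrative parameters (assumed): at-the-money one-year call.
est, se, ci = mc_call_price(100.0, 100.0, 0.05, 0.2, 1.0, n_paths=40_000)
# Four times as many paths should roughly halve the standard error.
_, se4, _ = mc_call_price(100.0, 100.0, 0.05, 0.2, 1.0, n_paths=160_000, seed=1)
print(f"estimate={est:.4f}  SE={se:.4f}  SE with 4N paths={se4:.4f}")
```

Comparing `se` and `se4` shows the $k^2$ rule in action: quadrupling $N$ buys only a factor of two in accuracy.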

Antithetic Variates

Principle

For each sample $X_i$ , generate a paired sample $X_i'$ from the antithetic distribution — typically by negating the underlying normal draw. The antithetic estimator is:

$\hat{\mu}_N^{\mathrm{AV}} = \frac{1}{N}\sum_{i=1}^N \frac{f(X_i) + f(X_i')}{2}.$

This is unbiased: $\mathbb{E}\!\left[\frac{f(X) + f(X')}{2}\right] = \mu$ since $X'$ has the same marginal distribution as $X$ .

Variance Analysis

$\mathrm{Var}\!\left(\frac{f(X) + f(X')}{2}\right) = \frac{\mathrm{Var}(f(X)) + \mathrm{Var}(f(X')) + 2\,\mathrm{Cov}(f(X), f(X'))}{4}.$

Since $X$ and $X'$ are identically distributed, $\mathrm{Var}(f(X)) = \mathrm{Var}(f(X')) = \sigma_f^2$ . Therefore:

$\mathrm{Var}\!\left(\frac{f(X) + f(X')}{2}\right) = \frac{\sigma_f^2 + \mathrm{Cov}(f(X), f(X'))}{2}.$

Variance is reduced whenever $\mathrm{Cov}(f(X), f(X')) < 0$. The fair comparison is with a standard estimator that uses $2N$ independent samples, whose variance is $\sigma_f^2 / (2N)$ (this matches the computational cost of two paths per antithetic pair). The two variances are:

$\frac{1}{N} \cdot \frac{\sigma_f^2 + \mathrm{Cov}(f(X), f(X'))}{2} \quad \text{vs.} \quad \frac{\sigma_f^2}{2N}.$

The efficiency gain is:

$\text{Efficiency gain} = \frac{\sigma_f^2 / (2N)}{\left(\sigma_f^2 + \mathrm{Cov}(f(X), f(X'))\right)/(2N)} = \frac{\sigma_f^2}{\sigma_f^2 + \mathrm{Cov}(f(X), f(X'))} = \frac{1}{1 + \rho_{f,f'}},$

where $\rho_{f,f'}$ is the correlation between $f(X)$ and $f(X')$ . For this to be a gain, we need $\rho_{f,f'} < 0$ .

Application to GBM Call Option

Under GBM, one path realises $W_T = \sqrt{T} Z$ for $Z \sim \mathcal{N}(0,1)$ . The antithetic path uses $-Z$ :

$S_T = S_0 \exp\!\left((r - \tfrac{1}{2}\sigma^2)T + \sigma\sqrt{T}\, Z\right), \qquad S_T' = S_0 \exp\!\left((r - \tfrac{1}{2}\sigma^2)T - \sigma\sqrt{T}\, Z\right).$

The payoff $f(Z) = e^{-rT}(S_T - K)^+$ is an increasing function of $Z$. When $Z$ is large (call in the money), $-Z$ is large and negative (call out of the money). So $\mathrm{Cov}(f(Z), f(-Z)) < 0$. What matters here is that the payoff is monotone in $Z$ (convexity is not required), and for such payoffs antithetic variates are effective.
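The GBM call case can be checked numerically. The sketch below, with assumed parameter values and a hypothetical helper name `antithetic_call`, estimates the price from antithetic pairs and reports the sample correlation $\rho_{f,f'}$, from which the efficiency gain $1/(1+\rho_{f,f'})$ follows.

```python
import math
import random
import statistics

def antithetic_call(s0, k, r, sigma, t, n_pairs, seed=0):
    """Antithetic-variates estimate of a European call under GBM.

    Returns (estimate, standard_error, rho), where rho is the sample
    correlation between f(Z) and f(-Z)."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * t
    vol = sigma * math.sqrt(t)
    disc = math.exp(-r * t)

    def payoff(z):
        return disc * max(s0 * math.exp(drift + vol * z) - k, 0.0)

    f_plus, f_minus = [], []
    for _ in range(n_pairs):
        z = rng.gauss(0.0, 1.0)
        f_plus.append(payoff(z))
        f_minus.append(payoff(-z))

    # The pair averages are i.i.d., so the usual SE formula applies to them.
    pair_means = [0.5 * (a + b) for a, b in zip(f_plus, f_minus)]
    est = statistics.fmean(pair_means)
    se = statistics.stdev(pair_means) / math.sqrt(n_pairs)

    m_p, m_m = statistics.fmean(f_plus), statistics.fmean(f_minus)
    cov = sum((a - m_p) * (b - m_m) for a, b in zip(f_plus, f_minus)) / (n_pairs - 1)
    rho = cov / (statistics.stdev(f_plus) * statistics.stdev(f_minus))
    return est, se, rho

# Illustrative parameters (assumed): at-the-money one-year call.
est_av, se_av, rho = antithetic_call(100.0, 100.0, 0.05, 0.2, 1.0, n_pairs=20_000)
gain = 1.0 / (1.0 + rho)  # efficiency gain at equal cost
print(f"estimate={est_av:.4f}  SE={se_av:.4f}  rho={rho:.3f}  gain={gain:.2f}")
```

The standard error is computed from the pair averages rather than from the $2N$ individual payoffs, because the two members of a pair are not independent.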

For path-dependent options, antithetic variates require generating the full antithetic path $(-Z_1, \ldots, -Z_n)$, where $n$ is the number of time steps: every normal increment along the path is negated.

Control Variates

Principle

Let $Y$ be a random variable correlated with $f(X)$ such that $\mathbb{E}[Y] = \mu_Y$ is known analytically. The control variate estimator is:

$\hat{\mu}_N^{\mathrm{CV}}(c) = \frac{1}{N}\sum_{i=1}^N \left[f(X_i) - c(Y_i - \mu_Y)\right].$

This is unbiased for any constant $c$ : the correction term $c(Y_i - \mu_Y)$ has mean zero.

Optimal Control Coefficient

The variance of $f(X) - c(Y - \mu_Y)$ is:

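$\mathrm{Var}(f(X) - c(Y - \mu_Y)) = \sigma_f^2 - 2c\,\mathrm{Cov}(f(X),Y) + c^2 \sigma_Y^2,$

where $\sigma_Y^2 = \mathrm{Var}(Y)$ (the constant $\mu_Y$ does not affect the variance). Minimising over $c$: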

$c^* = \frac{\mathrm{Cov}(f(X), Y)}{\mathrm{Var}(Y)}.$

The variance after control variate at $c = c^*$ :

$\mathrm{Var}^* = \sigma_f^2\left(1 - \rho_{f,Y}^2\right),$

where $\rho_{f,Y}$ is the correlation between $f(X)$ and $Y$. The remaining variance is a fraction $1 - \rho_{f,Y}^2$ of the original; a correlation of $|\rho_{f,Y}| = 0.9$ leaves $19\%$ of the variance, an $81\%$ reduction.

In practice, $c^*$ is estimated from a pilot sample or computed jointly with the main estimator using the sample covariance. The bias from estimating $c^*$ is $O(1/N)$ — negligible for large $N$ .

Common Control Variates in Option Pricing

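Underlying as control. Under $\mathbb{Q}$, $\mathbb{E}^{\mathbb{Q}}[S_T] = S_0 e^{rT}$. Set $Y = e^{-rT} S_T$; then $\mu_Y = S_0$. For a call option, $\mathrm{Corr}(f(S_T), S_T)$ is high for in-the-money strikes (approaching 1 deep in the money, where the payoff is nearly linear in $S_T$) and falls as the strike rises, so the reduction is substantial at low strikes and modest far out of the money.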
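A minimal sketch of the discounted underlying used as a control variate follows. It estimates $c^*$ on a separate pilot sample and then holds it fixed for the main run; the helper name `cv_call`, the pilot size and the parameter values are assumptions for illustration only.

```python
import math
import random
import statistics

def cv_call(s0, k, r, sigma, t, n_paths, n_pilot=2_000, seed=0):
    """European call with Y = exp(-rT) * S_T as control variate (E[Y] = S0).

    c* is estimated on a separate pilot sample, so the main estimator
    stays unbiased. Returns (plain_est, plain_se, cv_est, cv_se, c_star)."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * t
    vol = sigma * math.sqrt(t)
    disc = math.exp(-r * t)

    def draw(n):
        f, y = [], []
        for _ in range(n):
            s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
            f.append(disc * max(s_t - k, 0.0))
            y.append(disc * s_t)
        return f, y

    # Pilot run: c* = Cov(f, Y) / Var(Y).
    f_p, y_p = draw(n_pilot)
    mf, my = statistics.fmean(f_p), statistics.fmean(y_p)
    cov = sum((a - mf) * (b - my) for a, b in zip(f_p, y_p)) / (n_pilot - 1)
    c_star = cov / statistics.variance(y_p)

    # Main run with the coefficient held fixed.
    f, y = draw(n_paths)
    adjusted = [a - c_star * (b - s0) for a, b in zip(f, y)]
    root_n = math.sqrt(n_paths)
    return (statistics.fmean(f), statistics.stdev(f) / root_n,
            statistics.fmean(adjusted), statistics.stdev(adjusted) / root_n,
            c_star)

# Illustrative parameters (assumed): at-the-money one-year call.
plain, plain_se, cv, cv_se, c_star = cv_call(100.0, 100.0, 0.05, 0.2, 1.0, 40_000)
print(f"plain SE={plain_se:.4f}  CV SE={cv_se:.4f}  c*={c_star:.3f}")
```

The ratio `cv_se / plain_se` is an empirical estimate of $\sqrt{1 - \rho_{f,Y}^2}$ for the chosen strike.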

Geometric Asian as control for arithmetic Asian. For an Asian option on an arithmetic average, the corresponding geometric-average Asian has a known closed form under GBM. The correlation between the arithmetic and geometric payoffs is high ($\approx 0.99$ for typical parameters), yielding near-perfect variance reduction.
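The arithmetic/geometric pairing can be sketched as below. The closed form assumes discrete monitoring at equally spaced dates $t_i = iT/n$ under GBM, for which $\ln G$ is normal with the mean and variance shown in the comments; the function names, the 12 monitoring dates and the parameter values are assumptions for illustration. The coefficient is estimated from the same sample here, which carries the $O(1/N)$ bias mentioned earlier.

```python
import math
import random
import statistics

def geometric_asian_call(s0, k, r, sigma, t, n_steps):
    """Closed form for a discretely monitored geometric-average call under GBM
    (monitoring dates t_i = i*T/n, i = 1..n)."""
    n = n_steps
    m = math.log(s0) + (r - 0.5 * sigma**2) * t * (n + 1) / (2 * n)  # mean of ln G
    v = sigma**2 * t * (n + 1) * (2 * n + 1) / (6 * n**2)            # variance of ln G
    phi = statistics.NormalDist().cdf
    d1 = (m + v - math.log(k)) / math.sqrt(v)
    d2 = d1 - math.sqrt(v)
    return math.exp(-r * t) * (math.exp(m + 0.5 * v) * phi(d1) - k * phi(d2))

def asian_call_cv(s0, k, r, sigma, t, n_steps, n_paths, seed=0):
    """Arithmetic Asian call with the geometric Asian payoff as control.

    Returns (plain_est, plain_se, cv_est, cv_se)."""
    rng = random.Random(seed)
    dt = t / n_steps
    drift, vol = (r - 0.5 * sigma**2) * dt, sigma * math.sqrt(dt)
    disc = math.exp(-r * t)
    arith, geom = [], []
    for _ in range(n_paths):
        log_s, sum_s, sum_log = math.log(s0), 0.0, 0.0
        for _step in range(n_steps):
            log_s += drift + vol * rng.gauss(0.0, 1.0)
            sum_s += math.exp(log_s)
            sum_log += log_s
        arith.append(disc * max(sum_s / n_steps - k, 0.0))
        geom.append(disc * max(math.exp(sum_log / n_steps) - k, 0.0))
    mu_y = geometric_asian_call(s0, k, r, sigma, t, n_steps)
    ma, mg = statistics.fmean(arith), statistics.fmean(geom)
    cov = sum((a - ma) * (g - mg) for a, g in zip(arith, geom)) / (n_paths - 1)
    c = cov / statistics.variance(geom)  # same-sample estimate of c*
    adj = [a - c * (g - mu_y) for a, g in zip(arith, geom)]
    rn = math.sqrt(n_paths)
    return (ma, statistics.stdev(arith) / rn,
            statistics.fmean(adj), statistics.stdev(adj) / rn)

# Illustrative parameters (assumed): monthly monitoring over one year.
plain, plain_se, cv, cv_se = asian_call_cv(100.0, 100.0, 0.05, 0.2, 1.0, 12, 10_000)
print(f"plain={plain:.4f} (SE {plain_se:.4f})  CV={cv:.4f} (SE {cv_se:.4f})")
```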

Delta-hedged portfolio. For any smooth payoff, the delta-weighted integral of discounted stock price increments, $\int_0^T \partial_S C(t, S_t)\, d\tilde{S}_t$ with $\tilde{S}_t = e^{-rt}S_t$, has mean zero (it is a martingale under $\mathbb{Q}$, given suitable integrability). Using this as a control variate is known as the martingale control variate method and can reach $\rho^2 \approx 0.99$, i.e. remove about $99\%$ of the variance, for smooth payoffs.

Quasi-Monte Carlo

Motivation

Standard MC uses pseudo-random numbers, which are designed to behave like i.i.d. uniform draws on $[0,1]^d$. The root-mean-square error is $O(N^{-1/2})$. Can we do better by replacing pseudo-random sequences with deterministic low-discrepancy sequences that fill the hypercube more uniformly?

Discrepancy

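The star discrepancy of a point set $\{x_1, \ldots, x_N\} \subset [0,1]^d$ is:

$D_N^* = \sup_{b \in [0,1]^d} \left|\frac{\#\{i : x_i \in [0,b)\}}{N} - \mathrm{Vol}([0,b))\right|,$

where $[0,b) = \prod_{j=1}^d [0,b_j)$ is a box anchored at the origin (taking the supremum over all boxes $[a,b]$ instead gives the extreme discrepancy).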

It measures the worst-case deviation between the empirical distribution of the sequence and the uniform distribution. A sequence is low-discrepancy (LD) if $D_N^* = O\!\left((\log N)^d / N\right)$ — much smaller than the $O(N^{-1/2})$ discrepancy of a random sequence.
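In one dimension the star discrepancy can be computed exactly from the sorted points, which makes the gap between a low-discrepancy sequence and random points easy to see. The sketch below uses the base-2 van der Corput sequence (the one-dimensional building block of Halton-type sequences) and the standard closed-form expression for $D_N^*$ in 1-D; the function names and the choice $N = 1024$ are assumptions for illustration.

```python
import random

def van_der_corput(i, base=2):
    """Radical inverse of the integer i in the given base."""
    x, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        x += digit / denom
    return x

def star_discrepancy_1d(points):
    """Exact star discrepancy of a 1-D point set:
    D*_N = 1/(2N) + max_i |x_(i) - (2i - 1)/(2N)| over the sorted points."""
    xs = sorted(points)
    n = len(xs)
    return 1.0 / (2 * n) + max(abs(x - (2 * i - 1) / (2 * n))
                               for i, x in enumerate(xs, start=1))

n = 1024
rng = random.Random(0)
# The first 2^10 van der Corput points are exactly the grid k/1024, k = 0..1023.
d_ld = star_discrepancy_1d([van_der_corput(i) for i in range(n)])
d_rand = star_discrepancy_1d([rng.random() for _ in range(n)])
print(f"van der Corput D* = {d_ld:.6f}   pseudo-random D* = {d_rand:.6f}")
```

For this $N$ the van der Corput points give $D_N^* = 1/N$, while the pseudo-random points land on the $N^{-1/2}$ scale.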

Koksma-Hlawka Inequality

For a function $f$ of bounded Hardy-Krause variation $V(f)$ :

$\left|\frac{1}{N}\sum_{i=1}^N f(x_i) - \int_{[0,1]^d} f(u)\, du\right| \leq V(f) \cdot D_N^*.$

For a low-discrepancy sequence: error $\leq V(f) \cdot O\!\left((\log N)^d / N\right)$ , which beats $O(N^{-1/2})$ for fixed $d$ and large $N$ . In practice, the improvement is dramatic for $d \leq 10$ and smooth $f$ .
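A small experiment illustrates the comparison for a smooth integrand in $d = 2$. The sketch uses a Halton sequence (bases 2 and 3) rather than Sobol, purely because it fits in a few lines of standard-library Python; the integrand $f(u) = u_1 u_2$, whose exact integral is $1/4$, and the sample sizes are assumptions chosen for illustration.

```python
import random

def radical_inverse(i, base):
    """Radical inverse of the integer i in the given base."""
    x, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        x += digit / denom
    return x

def halton_2d(n):
    """First n points of the 2-D Halton sequence (bases 2 and 3), skipping the origin."""
    return [(radical_inverse(i, 2), radical_inverse(i, 3)) for i in range(1, n + 1)]

def integrate(points, f):
    """Equal-weight average of f over the given points."""
    return sum(f(u1, u2) for u1, u2 in points) / len(points)

def product(u1, u2):
    """Smooth test integrand; its integral over the unit square is 1/4."""
    return u1 * u2

exact = 0.25
rng = random.Random(0)
for n in (256, 1024, 4096):
    qmc_err = abs(integrate(halton_2d(n), product) - exact)
    mc_points = [(rng.random(), rng.random()) for _ in range(n)]
    mc_err = abs(integrate(mc_points, product) - exact)
    print(f"N={n:5d}  QMC error={qmc_err:.2e}  MC error={mc_err:.2e}")
```

A single pseudo-random run can be lucky, so the MC column is best read as one draw from an error distribution with scale $\sigma_f/\sqrt{N}$.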

Sobol Sequences

Sobol sequences are the most widely used LD sequences in finance. They are digital nets constructed in base 2 using generating matrices that ensure equidistribution across dyadic subintervals. Key properties:

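Each coordinate is built from its own primitive polynomial over $\mathbb{F}_2$ and a set of direction numbers; the uniformity of low-dimensional projections depends on how those direction numbers are chosen, not on scrambling.
The first $2^k$ points of a Sobol sequence (for $k \geq t$) form a $(t,k,d)$-net in base 2: every dyadic box of volume $2^{t-k}$ contains exactly $2^t$ points.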
Scrambled Sobol (Owen 1995) randomises the sequence while preserving the LD property, enabling unbiased estimation and confidence intervals.
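The idea that randomisation turns a deterministic point set into an unbiased estimator with a confidence interval can be sketched with a much simpler device than Owen scrambling: a random shift modulo 1 (Cranley-Patterson rotation) applied to a one-dimensional van der Corput point set. This is a stand-in technique, not scrambled Sobol; the function names, the integrand $u^2$ and the replicate count are assumptions for illustration.

```python
import math
import random
import statistics

def van_der_corput(i, base=2):
    """Radical inverse of the integer i in the given base."""
    x, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        x += digit / denom
    return x

def rqmc_estimate(f, n_points, n_reps, seed=0):
    """Randomly shifted van der Corput estimate of the integral of f on [0, 1].

    Each replicate shifts the whole point set by one uniform draw (mod 1).
    Replicates are i.i.d. and unbiased, so their spread gives a standard error.
    Returns (estimate, standard_error)."""
    rng = random.Random(seed)
    base_points = [van_der_corput(i) for i in range(n_points)]
    rep_means = []
    for _ in range(n_reps):
        shift = rng.random()
        rep_means.append(statistics.fmean(f((x + shift) % 1.0) for x in base_points))
    est = statistics.fmean(rep_means)
    se = statistics.stdev(rep_means) / math.sqrt(n_reps)
    return est, se

def square(u):
    """Test integrand with exact integral 1/3 on [0, 1]."""
    return u * u

est, se = rqmc_estimate(square, n_points=1024, n_reps=16)
print(f"estimate={est:.6f}  SE={se:.2e}  (exact 1/3)")
```

The confidence interval comes from the independent replicates, not from the points within a replicate, which are deliberately dependent.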

Effective Dimension

The Koksma-Hlawka bound grows with $d$. For high-dimensional problems (e.g., a 252-step path has $d = 252$), the theoretical advantage of QMC may be lost. In practice, what matters is the effective dimension: for a GBM path, the variation in the payoff is dominated by a few directions (the terminal value and other large-scale moves). The Brownian bridge construction assigns the leading Sobol dimensions to the most influential time points first, concentrating the low-discrepancy property where it matters most.
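The Brownian bridge ordering can be sketched as follows, assuming an equally spaced grid with $n = 2^m$ steps. The function name `brownian_bridge_path` is hypothetical; in a QMC setting the input `z` would be the inverse-normal transform of one low-discrepancy point, but any list of standard normals works for the construction itself.

```python
import math

def brownian_bridge_path(z, t_final):
    """Build W at times T/n, 2T/n, ..., T from n = 2^m standard normals z.

    z[0] sets the terminal value W_T; later entries fill in midpoints level
    by level, so the leading coordinates of z control the coarse shape."""
    n = len(z)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("len(z) must be a power of two")
    w = [0.0] * (n + 1)  # w[k] = W at time k*T/n, with w[0] = W_0 = 0
    w[n] = math.sqrt(t_final) * z[0]
    used, step = 1, n
    while step > 1:
        half = step // 2
        # Conditional on both endpoints of an interval of length dt, the
        # midpoint is normal with mean = endpoint average and variance dt / 4.
        std = math.sqrt(step * t_final / n / 4.0)
        for left in range(0, n, step):
            right, mid = left + step, left + half
            w[mid] = 0.5 * (w[left] + w[right]) + std * z[used]
            used += 1
        step = half
    return w[1:]

# With only the first coordinate non-zero, the path is a straight line to W_T.
line = brownian_bridge_path([1.0] + [0.0] * 7, t_final=1.0)
# With only the second coordinate non-zero, W_T = 0 and the path bulges at T/2.
bump = brownian_bridge_path([0.0, 1.0] + [0.0] * 6, t_final=1.0)
print(line)
print(bump)
```

The two test inputs show why the effective dimension drops: the first coordinate alone fixes the terminal value and overall trend, and each later coordinate only adds progressively finer detail.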

Convergence Comparison

Method	Expected error	Notes
Standard MC	$O(N^{-1/2})$	Dimension-independent, easy to implement
Antithetic variates	$O(N^{-1/2})$	Variance multiplied by $1+\rho$ at equal cost; no extra random draws
Control variates	$O(N^{-1/2})$	Variance multiplied by $1 - \rho^2$; needs known $\mathbb{E}[Y]$
Quasi-MC (Sobol)	$O((\log N)^d / N)$	Pre-asymptotic gains for $d \leq 20$ , smooth integrands
Randomised QMC	$O(N^{-3/2}(\log N)^{(d-1)/2})$ RMSE for scrambled nets (theoretical, smooth integrands)	Confidence intervals available; best for smooth functions

All methods have the same asymptotic $O(N^{-1/2})$ rate in the worst case for non-smooth payoffs. The gains are problem-dependent.

Limitations

Non-smooth payoffs. Digital options have a payoff with a jump discontinuity. The Koksma-Hlawka bound requires bounded variation, and for $d \geq 2$ a digital payoff generally has $V(f) = \infty$ in the Hardy-Krause sense because its discontinuity is not aligned with the coordinate axes. The bound then gives no guarantee and QMC loses much of its advantage; standard MC performs comparably. Smoothing the payoff (e.g., replacing the digital with a call spread) restores QMC efficiency.
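The call-spread smoothing mentioned above can be written down directly. In this sketch the half-width `eps` and the function names are assumptions; the spread is continuous and matches the digital outside $[K - \varepsilon, K + \varepsilon]$, at the cost of a bias that shrinks as `eps` goes to zero.

```python
def digital_payoff(s_t, k):
    """Cash-or-nothing digital call paying 1 when S_T > K."""
    return 1.0 if s_t > k else 0.0

def call_spread_payoff(s_t, k, eps):
    """Call spread with strikes K - eps and K + eps, scaled by 1 / (2 * eps).

    Continuous and piecewise linear; equals the digital payoff outside
    [K - eps, K + eps] and ramps linearly from 0 to 1 inside it."""
    long_call = max(s_t - (k - eps), 0.0)
    short_call = max(s_t - (k + eps), 0.0)
    return (long_call - short_call) / (2.0 * eps)

for s in (90.0, 99.5, 100.0, 100.5, 110.0):
    print(s, digital_payoff(s, 100.0), call_spread_payoff(s, 100.0, 1.0))
```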

Curse of dimensionality. The $(\log N)^d$ factor in QMC becomes prohibitive for $d > 50$ at typical sample sizes. The effective-dimension reduction via Brownian bridge or principal component analysis (PCA) construction is essential.

Antithetic for non-monotone payoffs. Antithetic variates can increase variance if the payoff is not monotone (e.g., a butterfly spread). Always verify the sign of $\mathrm{Cov}(f(X), f(X'))$ before applying.
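The failure mode is easiest to see with a payoff that is symmetric in the normal draw, so that $f(Z) = f(-Z)$ and the correlation is exactly $+1$. The sketch below uses a hypothetical tent-shaped payoff `tent` (a stylised butterfly in $Z$, not a priced instrument) and compares the per-pair variance of antithetic pairs with that of two independent draws at equal cost.

```python
import random
import statistics

def tent(z):
    """Hypothetical butterfly-shaped payoff in the normal draw, symmetric in z."""
    return max(1.0 - abs(z), 0.0)

def equal_cost_variances(payoff, n_pairs, seed=0):
    """Variance of a pair average: antithetic pair vs. two independent draws.

    Returns (var_antithetic, var_independent)."""
    rng = random.Random(seed)
    av, indep = [], []
    for _ in range(n_pairs):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        av.append(0.5 * (payoff(z1) + payoff(-z1)))
        indep.append(0.5 * (payoff(z1) + payoff(z2)))
    return statistics.variance(av), statistics.variance(indep)

var_av, var_indep = equal_cost_variances(tent, n_pairs=20_000)
print(f"antithetic={var_av:.4f}  independent={var_indep:.4f}  ratio={var_av / var_indep:.2f}")
```

For a symmetric payoff the antithetic draw adds no information, so the ratio sits near $1 + \rho_{f,f'} = 2$: the second evaluation is wasted.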

Control variate estimation. If $c^*$ is estimated from the same sample used for the main estimate, the combined estimator is biased (though at rate $O(1/N)$ ). For small $N$ or high-stakes applications, use a separate pilot sample for $c^*$ estimation.

Interview Angle

L1. State the Monte Carlo convergence rate. How many additional paths are needed to halve the standard error? What is the standard error of an MC estimate of a Black-Scholes call price?

Rate: $\mathrm{SE} = \sigma_f / \sqrt{N}$. To halve $\mathrm{SE}$, multiply $N$ by 4, i.e. add $3N$ paths. For a BS call: simulate $S_T^{(i)} = S_0 \exp((r - \sigma^2/2)T + \sigma\sqrt{T} Z_i)$, compute $f_i = e^{-rT}(S_T^{(i)} - K)^+$. Then $\hat{\mu}_N = \bar{f}_N$ and $\mathrm{SE} = \hat{\sigma}_f / \sqrt{N}$ where $\hat{\sigma}_f^2 = \frac{1}{N-1}\sum_{i=1}^N (f_i - \bar{f}_N)^2$.

L2. Derive the optimal control variate coefficient $c^*$ and the resulting variance reduction formula $1 - \rho_{f,Y}^2$ . Explain why $\mathbb{E}^{\mathbb{Q}}[S_T] = S_0 e^{rT}$ makes the discounted stock a valid control variate.

Derivation: $\mathrm{Var}(f - cY) = \sigma_f^2 - 2c\,\mathrm{Cov}(f,Y) + c^2\sigma_Y^2$ . Differentiate in $c$ and set to zero: $c^* = \mathrm{Cov}(f,Y)/\sigma_Y^2$ . Substituting back: $\mathrm{Var}^* = \sigma_f^2 - [\mathrm{Cov}(f,Y)]^2/\sigma_Y^2 = \sigma_f^2(1 - \rho_{f,Y}^2)$ .

Why $S_T$ is valid: Under $\mathbb{Q}$, the discounted stock price process $e^{-rt}S_t$ is a martingale, so $\mathbb{E}^{\mathbb{Q}}[e^{-rT}S_T] = S_0$, which is known analytically. Setting $Y_i = e^{-rT}S_T^{(i)}$ with $\mu_Y = S_0$, the correction term $c^*(Y_i - S_0)$ has zero mean, making the control variate estimator unbiased.

L3. State the Koksma-Hlawka inequality. Under what conditions does quasi-MC outperform standard MC? Explain the role of the Brownian bridge construction in making Sobol sequences effective for path-dependent options.

Koksma-Hlawka: $\left|\frac{1}{N}\sum_{i=1}^N f(x_i) - \int_{[0,1]^d} f(u)\,du\right| \leq V(f) \cdot D_N^*$. For LD sequences $D_N^* = O((\log N)^d/N)$, so the error is $O(V(f)(\log N)^d/N)$ vs. $O(\sigma_f/\sqrt{N})$ for standard MC. QMC wins when: (1) $f$ has bounded Hardy-Krause variation $V(f) < \infty$ (in practice, sufficiently smooth payoffs), (2) $d$ is moderate (the $(\log N)^d$ factor is manageable), and (3) $N$ is large enough for the asymptotic advantage to materialise.

Brownian bridge: For a path $\{W_{t_1}, \ldots, W_{t_n}\}$ , the standard construction maps Sobol dimension $j$ to time step $t_j$ in order. But the payoff variance is dominated by the terminal value $W_T$ and large-scale moves, not fine-scale increments. The Brownian bridge construction uses Sobol dimension 1 for $W_T$ , dimension 2 for $W_{T/2}$ (interpolated given $W_0$ and $W_T$ ), then fills in midpoints recursively. This assigns the lowest-discrepancy dimensions to the highest-variance components of the path, making the effective dimension of the integral much smaller than $n$ and restoring QMC efficiency.