Lebesgue Integration and Expectation

Setup

From Riemann to Lebesgue

The Riemann integral partitions the domain — the x-axis — into thin vertical slices and approximates $f$ by step functions on those slices. For continuous functions on bounded intervals this works well. It fails in three ways that matter for finance:

Discontinuous functions. The indicator $\mathbf{1}_{\mathbb{Q}}$ (1 on rationals, 0 elsewhere) is not Riemann integrable. Its Lebesgue integral over $[0,1]$ is $0$ — the rationals have measure zero — yet this cannot be recovered from Riemann's construction.
Limit interchange. A uniformly bounded sequence of Riemann-integrable functions can converge pointwise to a function that is not Riemann integrable. The Lebesgue theory has clean theorems (MCT, DCT) governing when limits and integrals commute.
Abstract sample spaces. $\Omega$ is not the real line. Asset price paths live in $C([0,T])$ ; there is no natural x-axis to partition. Expectation must be defined on an arbitrary measurable space.

The Lebesgue integral resolves all three by partitioning the range of $f$ — the y-axis — into thin horizontal strips, measuring the probability of the preimage of each strip, and summing.

INSIGHT

Why this matters on a derivatives desk. Every option price is an expectation: $V_0 = e^{-rT} \mathbb{E}^\mathbb{Q}[\Phi(S_T)]$ . Monte Carlo is a law-of-large-numbers approximation to this Lebesgue integral. Changing between the physical measure $\mathbb{P}$ and the risk-neutral measure $\mathbb{Q}$ (Girsanov's theorem, Module 3 of this course) is a change in the integrating measure — it has precise meaning only in the Lebesgue framework.

Conventions

$(\Omega, \mathcal{F}, \mathbb{P})$ is a probability space as defined in Module 1.
$X, Y : \Omega \to \mathbb{R}$ always denote $\mathcal{F}$ -measurable random variables.
$\mathbb{E}[X]$ and $\int_\Omega X \, d\mathbb{P}$ are used interchangeably — they are the same object.
"Integrable" without qualification means $\mathbb{E}[|X|] < \infty$ (i.e., $X \in L^1(\Omega, \mathcal{F}, \mathbb{P})$ ).
"a.s." means " $\mathbb{P}$ -almost surely" — holding except on a set of probability zero.

Theory

1. Simple Functions

The Lebesgue integral is built in three ascending stages: (1) simple functions, (2) non-negative measurables, (3) general integrable functions.

DEFINITION

Definition 1.1 (Simple Function). A measurable function $s : \Omega \to \mathbb{R}$ is simple if it takes only finitely many values. Every simple function has a unique standard form

$s = \sum_{i=1}^n a_i \, \mathbf{1}_{A_i},$

where $a_1, \ldots, a_n$ are the distinct values of $s$ and $A_i = s^{-1}(\{a_i\}) \in \mathcal{F}$ are the corresponding preimages. The $A_i$ are disjoint and cover $\Omega$ .

The sets $A_i$ belong to $\mathcal{F}$ because $s$ is measurable ( $\{s = a_i\}$ is the preimage of a Borel set). They are disjoint because the $a_i$ are distinct, and they cover $\Omega$ because the $a_i$ exhaust the values of $s$ .

DEFINITION

Definition 1.2 (Integral of a Non-Negative Simple Function). If $s = \sum_{i=1}^n a_i \mathbf{1}_{A_i}$ with $a_i \ge 0$ , the Lebesgue integral of $s$ is

$\int_\Omega s \, d\mathbb{P} := \sum_{i=1}^n a_i \, \mathbb{P}(A_i) \in [0, \infty],$

with the convention $0 \cdot \infty = 0$ .

The integral is a weighted average: value $a_i$ times the probability of the region where $s$ takes that value. This is the natural generalisation of the discrete expected value formula $\mathbb{E}[X] = \sum_k x_k \mathbb{P}(X = x_k)$ .

EXAMPLE

Example 1.3. Let $\Omega = \{H, T\}$ (fair coin), $\mathbb{P}(\{H\}) = \mathbb{P}(\{T\}) = 1/2$ , and $X(H) = 3$ , $X(T) = -1$ . Then $X = 3 \cdot \mathbf{1}_{\{H\}} + (-1) \cdot \mathbf{1}_{\{T\}}$ is simple (but not non-negative; we handle the sign in §3). If instead $X(H) = 3$ , $X(T) = 1$ , then $X$ is a non-negative simple function and

$\int_\Omega X \, d\mathbb{P} = 3 \cdot \tfrac{1}{2} + 1 \cdot \tfrac{1}{2} = 2.$
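Definition 1.2 can be sketched on a finite sample space with exact rational arithmetic. The helper name `integrate_simple` and the dictionary representation of $\mathbb{P}$ and $X$ are illustrative assumptions, not part of the course material.

```python
from fractions import Fraction

def integrate_simple(prob, s):
    """Integral of a simple function s on a finite sample space.

    prob maps each outcome to its probability; s maps each outcome to a value.
    Outcomes are grouped into level sets A_i = {s = a_i}, then the function
    returns sum_i a_i * P(A_i), as in Definition 1.2.
    """
    level_sets = {}
    for omega, p in prob.items():
        a = s[omega]
        level_sets[a] = level_sets.get(a, Fraction(0)) + p
    return sum(a * p_a for a, p_a in level_sets.items())

# Fair coin from Example 1.3 with the non-negative payoff X(H) = 3, X(T) = 1.
coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
X = {"H": Fraction(3), "T": Fraction(1)}
integral = integrate_simple(coin, X)
print(integral)  # 2
```

Grouping by value first mirrors the standard form $\sum_i a_i \mathbf{1}_{A_i}$ ; summing outcome by outcome would give the same number on a finite space.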

2. Non-Negative Measurable Functions

Every non-negative measurable function is a pointwise limit of increasing simple functions. This is the foundational approximation lemma of Lebesgue integration.

THEOREM

Theorem 2.1 (Monotone Approximation). For every measurable $f : \Omega \to [0, \infty]$ , there exists a sequence of non-negative simple functions $0 \le s_1 \le s_2 \le \cdots$ with $s_n \nearrow f$ pointwise. One canonical construction is

$s_n(\omega) := \sum_{k=0}^{n \cdot 2^n - 1} \frac{k}{2^n} \, \mathbf{1}_{\left\{\frac{k}{2^n} \le f < \frac{k+1}{2^n}\right\}}(\omega) + n \cdot \mathbf{1}_{\{f \ge n\}}(\omega).$

This truncates $f$ at level $n$ and quantises the range into intervals of width $2^{-n}$ . Each preimage is in $\mathcal{F}$ by measurability.
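The canonical construction can be evaluated pointwise. The sketch below computes $s_n(\omega)$ from the value $f(\omega)$ alone; the function name `dyadic_approx` and the test value $22/7$ are arbitrary choices made for illustration.

```python
from fractions import Fraction
import math

def dyadic_approx(value, n):
    """Value of s_n at a point where f equals `value` (value >= 0)."""
    if value >= n:
        return Fraction(n)              # truncation at level n
    k = math.floor(value * 2**n)        # largest k with k / 2^n <= value
    return Fraction(k, 2**n)            # round down to the dyadic grid

f_value = Fraction(22, 7)
approximations = [dyadic_approx(f_value, n) for n in range(1, 7)]
for n, s_n in enumerate(approximations, start=1):
    print(n, s_n)
```

For $n \le 3$ the truncation term is active and $s_n = n$ ; once $n$ exceeds $f(\omega)$ the error is below $2^{-n}$ .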

DEFINITION

Definition 2.2 (Integral of a Non-Negative Measurable Function). For measurable $f : \Omega \to [0, \infty]$ ,

$\int_\Omega f \, d\mathbb{P} := \sup \left\{ \int_\Omega s \, d\mathbb{P} : 0 \le s \le f,\ s \text{ simple} \right\}.$

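The value may be $+\infty$ . By the MCT (Theorem 2.3 below) applied to the sequence of Theorem 2.1, the supremum equals $\lim_{n \to \infty} \int_\Omega s_n \, d\mathbb{P}$ for the canonical approximating sequence; it need not be attained by any single simple function.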

THEOREM

Theorem 2.3 (Monotone Convergence Theorem — MCT). Let $0 \le f_1 \le f_2 \le \cdots$ be measurable functions with $f_n \nearrow f$ pointwise ( $\mathbb{P}$ -a.s.). Then

$\lim_{n \to \infty} \int_\Omega f_n \, d\mathbb{P} = \int_\Omega f \, d\mathbb{P}.$

The MCT justifies swapping the limit and the integral whenever the sequence is increasing. It is used to compute the expectation of any non-negative random variable as the limit of expectations of bounded approximations.
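A small exact check of the MCT, under assumptions chosen for illustration: the countable space $\Omega = \{1, 2, \ldots\}$ with $\mathbb{P}(\{k\}) = 2^{-k}$ , the function $f(k) = k$ (so $\mathbb{E}[f] = 2$ ), and the increasing truncations $f_n = f \cdot \mathbf{1}_{\{k \le n\}}$ . None of these choices come from the text.

```python
from fractions import Fraction

def truncated_expectation(n):
    """E[f_n] for f_n(k) = k * 1{k <= n} under P({k}) = 2^-k, k = 1, 2, ..."""
    return sum(Fraction(k, 2**k) for k in range(1, n + 1))

levels = (1, 2, 5, 10, 20)
integrals = [truncated_expectation(n) for n in levels]
gaps = [2 - value for value in integrals]   # distance to E[f] = 2

for n, value, gap in zip(levels, integrals, gaps):
    print(n, value, float(gap))
```

The integrals increase and the gap to $\mathbb{E}[f]$ shrinks to zero, which is the behaviour the theorem guarantees for any increasing non-negative sequence.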

REMARK

The increasing condition is essential. Let $\Omega = \mathbb{R}$ with Lebesgue measure $\lambda$ and set $f_n = \mathbf{1}_{[n, n+1]}$ . Then $f_n \to 0$ pointwise, yet $\int f_n \, d\lambda = 1$ for all $n$ . The limit and integral do not commute because the mass "escapes to infinity": the sequence is not monotone, so the MCT does not apply. A related effect appears with heavy-tailed return distributions in finance, where contributions to a moment from very large values need not vanish.
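The same escape of mass can be reproduced on a probability space. The sketch assumes $\Omega = \{1, 2, \ldots\}$ with $\mathbb{P}(\{k\}) = 2^{-k}$ and $f_n = 2^n \mathbf{1}_{\{n\}}$ ; this example is an illustration added here, not one taken from the course.

```python
from fractions import Fraction

def prob(k):
    return Fraction(1, 2**k)

def f(n, k):
    """f_n(k) = 2^n at k = n and 0 elsewhere; not monotone in n."""
    return 2**n if k == n else 0

def expectation(n, k_max=64):
    # Terms with k != n vanish, so truncating at k_max >= n is exact.
    return sum(f(n, k) * prob(k) for k in range(1, k_max + 1))

expectations = [expectation(n) for n in range(1, 11)]

# Pointwise envelope g(k) = sup_n f_n(k) = 2^k; its partial expectations grow
# without bound, so no integrable dominating function exists.
envelope_partial = [sum(f(k, k) * prob(k) for k in range(1, m + 1))
                    for m in (1, 10, 100)]
```

Each $\mathbb{E}[f_n]$ equals 1 while $f_n(k) = 0$ as soon as $n > k$ , so the pointwise limit is 0 and limit and expectation disagree.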

3. General Integrable Functions

Every measurable $f : \Omega \to \mathbb{R}$ decomposes as $f = f^+ - f^-$ , where $f^+ := \max(f, 0) \ge 0$ and $f^- := \max(-f, 0) \ge 0$ are both non-negative and measurable.

DEFINITION

Definition 3.1 (Lebesgue Integral). For a measurable $f : \Omega \to \mathbb{R}$ :

$\int_\Omega f \, d\mathbb{P} := \int_\Omega f^+ \, d\mathbb{P} - \int_\Omega f^- \, d\mathbb{P},$

provided at least one term is finite. If both terms are finite — equivalently, if $\int_\Omega |f| \, d\mathbb{P} < \infty$ — then $f$ is $\mathbb{P}$ -integrable, written $f \in L^1(\Omega, \mathcal{F}, \mathbb{P})$ .
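As a worked instance of Definition 3.1, take the signed payoff from Example 1.3 on the fair coin, $X(H) = 3$ and $X(T) = -1$ . Then $X^+ = 3 \cdot \mathbf{1}_{\{H\}}$ and $X^- = 1 \cdot \mathbf{1}_{\{T\}}$ , both non-negative simple functions, and

$\int_\Omega X \, d\mathbb{P} = \int_\Omega X^+ \, d\mathbb{P} - \int_\Omega X^- \, d\mathbb{P} = 3 \cdot \tfrac{1}{2} - 1 \cdot \tfrac{1}{2} = 1.$

Both parts are finite, so $X \in L^1$ , and $\int_\Omega |X| \, d\mathbb{P} = \tfrac{3}{2} + \tfrac{1}{2} = 2$ .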

THEOREM

Theorem 3.2 (Standard Properties). For $f, g \in L^1$ and constants $a, b \in \mathbb{R}$ :

Linearity: $\displaystyle\int (af + bg) \, d\mathbb{P} = a\!\int f \, d\mathbb{P} + b\!\int g \, d\mathbb{P}$ .
Monotonicity: $f \le g$ a.s. $\Rightarrow$ $\displaystyle\int f \, d\mathbb{P} \le \int g \, d\mathbb{P}$ .
Null sets: modifying $f$ on a $\mathbb{P}$ -null set does not change $\displaystyle\int f \, d\mathbb{P}$ .
Triangle inequality: $\displaystyle\left|\int f \, d\mathbb{P}\right| \le \int |f| \, d\mathbb{P}$ .

THEOREM

Theorem 3.3 (Dominated Convergence Theorem — DCT). Let $(f_n)$ be measurable functions with $f_n \to f$ $\mathbb{P}$ -a.s. If there exists $g \in L^1$ with $|f_n| \le g$ $\mathbb{P}$ -a.s. for all $n$ , then $f \in L^1$ and

$\lim_{n \to \infty} \int_\Omega f_n \, d\mathbb{P} = \int_\Omega f \, d\mathbb{P}.$

The dominating function $g$ prevents mass from escaping. In pricing, the DCT is the justification for differentiating expectations to compute Greeks — provided the derivative of the payoff with respect to the parameter (spot, vol, rate) is dominated by an integrable function.
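A case where the domination condition does hold is the delta of a European call. The sketch below assumes Black-Scholes lognormal dynamics under $\mathbb{Q}$ and illustrative parameter values, neither of which is specified in this section. For the call, $|\partial_{S_0} (S_T - K)^+| \le S_T / S_0$ , which is integrable, so the derivative may be moved inside the expectation and compared with the closed-form delta $N(d_1)$ .

```python
import math
import random

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def pathwise_call_delta(s0, k, r, sigma, t, n_paths, seed=0):
    """Monte Carlo estimate of e^{-rT} E[ 1{S_T > K} * S_T / S_0 ]."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * t
    vol = sigma * math.sqrt(t)
    total = 0.0
    for _ in range(n_paths):
        s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        if s_t > k:
            total += s_t / s0   # derivative of (S_T - K)^+ with respect to S_0
    return math.exp(-r * t) * total / n_paths

# Illustrative parameters (assumed, not from the text).
s0, k, r, sigma, t = 100.0, 100.0, 0.05, 0.2, 1.0
d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
closed_form_delta = norm_cdf(d1)
mc_delta = pathwise_call_delta(s0, k, r, sigma, t, n_paths=100_000)
print(closed_form_delta, mc_delta)
```

The two numbers should agree up to Monte Carlo error. For a digital payoff the same estimator would return zero on every path, which is the failure discussed under Limitations.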

4. Expectation

DEFINITION

Definition 4.1 (Expectation). The expectation (expected value) of an integrable random variable $X$ on $(\Omega, \mathcal{F}, \mathbb{P})$ is

$\mathbb{E}[X] := \int_\Omega X \, d\mathbb{P}.$

This is not a separate definition. The symbol $\mathbb{E}[X]$ is shorthand for the Lebesgue integral of $X$ against $\mathbb{P}$ . All of Theorem 3.2 applies directly:

$\mathbb{E}[aX + bY] = a\,\mathbb{E}[X] + b\,\mathbb{E}[Y], \qquad X \le Y \text{ a.s.} \Rightarrow \mathbb{E}[X] \le \mathbb{E}[Y], \qquad |\mathbb{E}[X]| \le \mathbb{E}[|X|].$

When $X$ has a density $p_X$ with respect to Lebesgue measure on $\mathbb{R}$ , the change-of-variables formula (image measure) recovers the classical expression:

$\mathbb{E}[X] = \int_{-\infty}^\infty x \, p_X(x) \, dx.$

This is not a separate theorem — it follows from the general machinery of pushforward measures.

INSIGHT

Risk-neutral pricing is an expectation. Under the risk-neutral measure $\mathbb{Q}$ , a European derivative with payoff $\Phi(S_T)$ is priced as

$V_0 = e^{-rT} \mathbb{E}^\mathbb{Q}[\Phi(S_T)] = e^{-rT} \int_\Omega \Phi(S_T(\omega)) \, d\mathbb{Q}(\omega).$

Monte Carlo samples $N$ independent paths under $\mathbb{Q}$ and estimates this as $e^{-rT} \cdot N^{-1} \sum_{k=1}^N \Phi(S_T(\omega_k))$ . The strong law of large numbers (a theorem about Lebesgue integrals) guarantees convergence $\mathbb{Q}$ -a.s. as $N \to \infty$ , provided $\Phi(S_T)$ is $\mathbb{Q}$ -integrable.
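The estimator can be sketched in a few lines. The code assumes Black-Scholes lognormal dynamics for $S_T$ under $\mathbb{Q}$ , a call payoff, and illustrative parameters; all of these are assumptions made for the example, and the closed-form price is included only as a reference value.

```python
import math
import random

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def mc_call_price(s0, k, r, sigma, t, n_paths, seed=0):
    """e^{-rT} times the sample mean of (S_T - K)^+ over i.i.d. draws of S_T."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * t
    vol = sigma * math.sqrt(t)
    payoff_sum = 0.0
    for _ in range(n_paths):
        s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        payoff_sum += max(s_t - k, 0.0)
    return math.exp(-r * t) * payoff_sum / n_paths

def bs_call_price(s0, k, r, sigma, t):
    """Closed-form Black-Scholes call price, used here as a benchmark."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)

# Illustrative parameters (assumed, not from the text).
params = dict(s0=100.0, k=100.0, r=0.05, sigma=0.2, t=1.0)
mc_price = mc_call_price(n_paths=100_000, **params)
exact_price = bs_call_price(**params)
print(mc_price, exact_price)
```

The strong law gives convergence but no rate; the usual $N^{-1/2}$ error estimate comes from the central limit theorem and requires $\Phi(S_T) \in L^2$ .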

5. Key Inequalities

THEOREM

Theorem 5.1 (Jensen's Inequality). Let $X \in L^1$ and $\varphi : \mathbb{R} \to \mathbb{R}$ be a convex function with $\varphi(X) \in L^1$ . Then

$\varphi\!\bigl(\mathbb{E}[X]\bigr) \le \mathbb{E}[\varphi(X)].$

Proof. Set $\mu = \mathbb{E}[X]$ . Convexity of $\varphi$ at $\mu$ means there exists a supporting line: a constant $c \in \mathbb{R}$ such that $\varphi(x) \ge \varphi(\mu) + c(x - \mu)$ for all $x$ . Substituting $x = X(\omega)$ and taking expectations: $\mathbb{E}[\varphi(X)] \ge \varphi(\mu) + c\,(\mathbb{E}[X] - \mu) = \varphi(\mu) = \varphi(\mathbb{E}[X]). \qquad \square$

EXAMPLE

Example 5.2 (Call lower bound). The call payoff $(S_T - K)^+$ is convex in $S_T$ . Jensen's inequality gives

$\bigl(\mathbb{E}^\mathbb{Q}[S_T] - K\bigr)^+ \le \mathbb{E}^\mathbb{Q}[(S_T - K)^+].$

Under the risk-neutral measure, $\mathbb{E}^\mathbb{Q}[S_T] = S_0 e^{rT}$ (forward price). Multiplying by $e^{-rT}$ :

$V_0 = e^{-rT}\mathbb{E}^\mathbb{Q}[(S_T-K)^+] \ge (S_0 - Ke^{-rT})^+.$

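This is the standard no-arbitrage lower bound for a European call on a non-dividend-paying asset; for $r \ge 0$ it is at least the intrinsic value $(S_0 - K)^+$ . It is derived purely from convexity, without Black-Scholes.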
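The bound can be checked exactly in a one-period binomial model. The model, the numbers, and the use of a gross rate $R$ in place of $e^{rT}$ are assumptions made to keep the arithmetic rational; they do not appear in the text.

```python
from fractions import Fraction

s0, strike = Fraction(100), Fraction(100)
gross_rate = Fraction(105, 100)             # stands in for e^{rT}
up, down = Fraction(12, 10), Fraction(9, 10)
q_up = (gross_rate - down) / (up - down)    # risk-neutral up-probability

# Law of S_T under Q: terminal price -> probability.
terminal = {s0 * up: q_up, s0 * down: 1 - q_up}

forward = sum(s * q for s, q in terminal.items())
call_value = sum(max(s - strike, 0) * q for s, q in terminal.items()) / gross_rate
lower_bound = max(s0 - strike / gross_rate, 0)

print(forward, call_value, lower_bound)
```

Here `forward` equals $S_0 R$ , which is the discrete counterpart of $\mathbb{E}^\mathbb{Q}[S_T] = S_0 e^{rT}$ , and the call value sits strictly above the Jensen bound because the payoff is not affine across the two terminal prices.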

THEOREM

Theorem 5.3 (Markov's Inequality). For a non-negative $X$ and $\alpha > 0$ :

$\mathbb{P}(X \ge \alpha) \le \frac{\mathbb{E}[X]}{\alpha}.$

Proof. $\mathbb{E}[X] \ge \int_{\{X \ge \alpha\}} X \, d\mathbb{P} \ge \alpha \, \mathbb{P}(X \ge \alpha)$ . $\square$

THEOREM

Theorem 5.4 (Chebyshev's Inequality). For $X \in L^2$ with $\mu = \mathbb{E}[X]$ , $\sigma^2 = \text{Var}(X) > 0$ , and any $k > 0$ :

$\mathbb{P}(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}.$

Proof. Apply Markov's inequality to $(X - \mu)^2$ with $\alpha = k^2\sigma^2$ . $\square$
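Both inequalities can be verified exactly on a fair six-sided die, an example chosen here for illustration. To stay in rational arithmetic, Chebyshev is checked in the equivalent form $\mathbb{P}(|X - \mu| \ge t) \le \sigma^2 / t^2$ with $t = k\sigma$ .

```python
from fractions import Fraction

die = {face: Fraction(1, 6) for face in range(1, 7)}

def expectation(g):
    return sum(g(x) * p for x, p in die.items())

def probability(event):
    return sum(p for x, p in die.items() if event(x))

mean = expectation(lambda x: x)                      # 7/2
variance = expectation(lambda x: (x - mean) ** 2)    # 35/12

# Markov with alpha = 5: P(X >= 5) <= E[X] / 5.
markov_lhs = probability(lambda x: x >= 5)           # 1/3
markov_rhs = mean / 5                                # 7/10

# Chebyshev with t = 5/2: P(|X - mean| >= t) <= variance / t^2.
t = Fraction(5, 2)
chebyshev_lhs = probability(lambda x: abs(x - mean) >= t)   # 1/3
chebyshev_rhs = variance / t**2                             # 7/15

print(markov_lhs <= markov_rhs, chebyshev_lhs <= chebyshev_rhs)
```

Neither bound is tight here, which is typical: both hold for every distribution with the given mean or variance.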

REMARK

Markov and model-free risk bounds. If a portfolio's daily P&L has known mean $\mu$ and the loss is non-negative after shifting, Markov's inequality bounds the tail probability without any distributional assumption. The bound is often loose in practice, since actual tail probabilities are usually far smaller than $\mathbb{E}[X]/\alpha$ , but it is the best bound available from non-negativity and the mean alone: for $\alpha \ge \mathbb{E}[X]$ , a two-point distribution on $\{0, \alpha\}$ attains it.

6. Geometric Intuition

Range-partitioning versus domain-partitioning. The Riemann integral approximates the area under a curve by summing thin vertical rectangles of width $\Delta x$ . The Lebesgue integral sums thin horizontal strips of height $\Delta y$ : for each level $y$ , the "width" is not $\Delta x$ but $\mathbb{P}(\{y \le f < y + \Delta y\})$ , the probability of the preimage. On abstract $\Omega$ there is no natural x-axis to partition — the range-based approach is the only one available.

Jensen geometrically. If $\varphi$ is convex, its graph curves upward. The mean $\mu = \mathbb{E}[X]$ is the centre of mass of $X$ 's distribution under $\mathbb{P}$ . The point $(\mu, \varphi(\mu))$ lies on the curve. The value $\mathbb{E}[\varphi(X)]$ is the probability-weighted average of heights of the curve, which is always at least as high as the height at the centre of mass because curvature pushes the average up. Equality holds if and only if $\varphi$ is affine on the convex hull of the support of $X$ ; for strictly convex $\varphi$ this forces $X$ to be constant a.s.

MCT geometrically. The integral $\int f \, d\mathbb{P}$ is the total probability-weighted "volume under $f$ ." An increasing sequence $f_n \nearrow f$ fills this volume from below: each additional slice of the range adds non-negative probability mass. The integral accumulates monotonically to the full volume.

Validation

The companion notebook verifies the following claims computationally on finite probability spaces using exact rational arithmetic:

Simple function integral — $\mathbb{E}[X]$ computed as $\sum_i a_i \mathbb{P}(A_i)$ on a discrete $\Omega$ , confirmed against the naive weighted sum.
MCT verification — a non-decreasing sequence $f_n \nearrow f$ on a discrete space; confirm $\int f_n \nearrow \int f$ .
DCT verification — a bounded convergent sequence on a discrete space; confirm limit and integral commute.
Jensen's inequality — checked for $\varphi(x) = x^2$ , $\varphi(x) = e^x$ , and the call payoff $(x-K)^+$ .
Markov and Chebyshev — verified that the bounds hold exactly for specific distributions.

PRACTICE

Before opening the notebook, try the following by hand:

Let $\Omega = \{1, 2, 3, 4\}$ with $\mathbb{P}(\{k\}) = 1/4$ for each $k$ . Define $X(\omega) = \omega - 2.5$ , so $X \in \{-1.5,\ -0.5,\ 0.5,\ 1.5\}$ .

Compute $\mathbb{E}[X]$ directly from Definitions 1.2 and 3.1 (split $X$ into $X^+$ and $X^-$ , since $X$ takes negative values).
Compute $\mathbb{E}[X^2]$ and verify Jensen with $\varphi = x^2$ : check $(\mathbb{E}[X])^2 \le \mathbb{E}[X^2]$ .
Compute $\mathbb{P}(|X| \ge 1)$ and verify Markov with $\alpha = 1$ : check $\mathbb{P}(|X| \ge 1) \le \mathbb{E}[|X|]/1$ .
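Once the hand computations are done, they can be checked with a few lines of exact arithmetic. This is a sketch in the spirit of the companion notebook, not the notebook's own code; all names are illustrative.

```python
from fractions import Fraction

omega = [1, 2, 3, 4]
prob = {w: Fraction(1, 4) for w in omega}
X = {w: Fraction(w) - Fraction(5, 2) for w in omega}

def expect(g):
    """E[g(X)] as a probability-weighted sum over the four outcomes."""
    return sum(g(X[w]) * prob[w] for w in omega)

mean = expect(lambda x: x)
second_moment = expect(lambda x: x * x)
mean_abs = expect(abs)
tail = sum(prob[w] for w in omega if abs(X[w]) >= 1)

jensen_holds = mean**2 <= second_moment     # Jensen with phi(x) = x^2
markov_holds = tail <= mean_abs / 1         # Markov applied to |X|, alpha = 1

print(mean, second_moment, tail, mean_abs, jensen_holds, markov_holds)
```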

Limitations

$L^1$ is the minimum; $L^2$ is what stochastic calculus needs. The Itô integral (Module 1 of Stochastic Calculus) is constructed for processes satisfying $\mathbb{E}[\int_0^T f_t^2 \, dt] < \infty$ . This is an $L^2$ condition: square-integrability, not just integrability. A process that fails it lies outside this basic construction; the integral can still be defined by localisation when $\int_0^T f_t^2 \, dt < \infty$ a.s., but it is then only a local martingale and the Itô isometry is no longer available. Practically: power-law tailed payoffs (e.g., leveraged volatility products) may be in $L^1$ but fail $L^2$ integrability, in which case standard Itô calculus arguments that rely on square-integrability do not apply.

Non-integrable random variables are real. The Cauchy distribution has $\mathbb{E}[|X|] = \infty$ — the expectation does not exist. A model that assumes finite mean when the true distribution is heavy-tailed (Pareto with tail index $\alpha < 1$ ) is not a conservative approximation; it is a misspecification. This arises in extreme credit events and in certain volatility-of-volatility models.

Fubini's theorem is not free. To interchange the order of integration in a double integral $\int \int f \, d\mu \, d\nu$ over two $\sigma$ -finite measure spaces, the Fubini–Tonelli theorem requires $f \in L^1(\mu \otimes \nu)$ , i.e. integrability with respect to the product measure (Fubini), or $f \ge 0$ (Tonelli). This condition is non-trivial for joint distributions under stochastic vol models, pricing in multi-currency frameworks (FX triangle), or computing expected values of path-dependent integrals.

Differentiating under $\mathbb{E}$ requires DCT. The formula $\partial_\theta \mathbb{E}[f(X, \theta)] = \mathbb{E}[\partial_\theta f(X, \theta)]$ holds if $f(X, \cdot)$ is differentiable a.s. and $|\partial_\theta f(X, \theta)| \le g$ for all $\theta$ in a neighbourhood of the point of interest, with $g \in L^1$ not depending on $\theta$ . For discontinuous payoffs (digitals, barriers, first-touch), $\partial_\theta f$ is a Dirac delta in the distributional sense, not an integrable function, and the formula fails. The resulting bias in pathwise Greek estimators, which average $\partial_\theta f$ over simulated paths, is not a numerical artefact; it is a mathematical failure of the interchange condition.

WARNING

" $\mathbb{E}[X]$ exists" is not automatic. Always verify $\mathbb{E}[|X|] < \infty$ before applying linearity, Jensen, or the DCT. For a product $XY$ , existence of $\mathbb{E}[X]$ and $\mathbb{E}[Y]$ separately does not imply $\mathbb{E}[XY]$ is finite. A sufficient condition is $X \in L^2$ and $Y \in L^2$ (Cauchy-Schwarz: $\mathbb{E}[|XY|] \le \|X\|_2 \|Y\|_2$ ); independence of integrable $X$ and $Y$ is another.
