Setup
From Riemann to Lebesgue
The Riemann integral partitions the domain — the x-axis — into thin vertical slices and approximates by step functions on those slices. For continuous functions on bounded intervals this works well. It fails in three ways that matter for finance:
- Discontinuous functions. The indicator $\mathbf{1}_{\mathbb{Q}}$ (1 on rationals, 0 elsewhere) is not Riemann integrable on $[0,1]$. Its Lebesgue integral over $[0,1]$ is $0$, because the rationals have measure zero, yet this cannot be recovered from Riemann's construction.
- Limit interchange. A uniformly bounded sequence of Riemann-integrable functions can converge pointwise to a function that is not Riemann integrable. The Lebesgue theory has clean theorems (MCT, DCT) governing when limits and integrals commute.
- Abstract sample spaces. $\Omega$ is not the real line. Asset price paths live in $C[0,T]$; there is no natural x-axis to partition. Expectation must be defined on an arbitrary measurable space.
The Lebesgue integral resolves all three by partitioning the range of $X$ (the y-axis) into thin horizontal strips, measuring the probability of the preimage of each strip, and summing.
Why this matters on a derivatives desk. Every option price is an expectation: $V_0 = e^{-rT}\,\mathbb{E}^{\mathbb{Q}}[\text{payoff}]$. Monte Carlo is a law-of-large-numbers approximation to this Lebesgue integral. Changing between the physical measure $\mathbb{P}$ and the risk-neutral measure $\mathbb{Q}$ (Girsanov's theorem, Module 3 of this course) is a change in the integrating measure, which has precise meaning only in the Lebesgue framework.
Conventions
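- $(\Omega, \mathcal{F}, \mathbb{P})$ is a probability space as defined in Module 1.
- $X$, $Y$, and $X_n$ always denote $\mathcal{F}$-measurable random variables.
- $\mathbb{E}[X]$ and $\int_\Omega X\,d\mathbb{P}$ are used interchangeably; they are the same object.
- "Integrable" without qualification means $X \in L^1(\Omega, \mathcal{F}, \mathbb{P})$ (i.e., $\mathbb{E}[|X|] < \infty$).
- "a.s." means "$\mathbb{P}$-almost surely": holding except on a set of probability zero.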
Theory
1. Simple Functions
The Lebesgue integral is built in three ascending stages: (1) simple functions, (2) non-negative measurables, (3) general integrable functions.
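Definition 1.1 (Simple Function). A measurable function $\varphi: \Omega \to \mathbb{R}$ is simple if it takes only finitely many values. Every simple function has a unique standard form
$$\varphi = \sum_{i=1}^{n} a_i\,\mathbf{1}_{A_i},$$
where $a_1, \dots, a_n$ are the distinct values of $\varphi$ and $A_i = \varphi^{-1}(\{a_i\})$ are the corresponding preimages. The $A_i$ are disjoint and cover $\Omega$.
The sets $A_i$ belong to $\mathcal{F}$ because $\varphi$ is measurable ($A_i$ is the preimage of the Borel set $\{a_i\}$). They are disjoint because the $a_i$ are distinct, and they cover $\Omega$ because the $a_i$ exhaust the values of $\varphi$.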
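Definition 1.2 (Integral of a Non-Negative Simple Function). If $\varphi = \sum_{i=1}^{n} a_i\,\mathbf{1}_{A_i}$ with $a_i \ge 0$, the Lebesgue integral of $\varphi$ is
$$\int_\Omega \varphi\,d\mathbb{P} = \sum_{i=1}^{n} a_i\,\mathbb{P}(A_i),$$
with the convention $0 \cdot \infty = 0$.
The integral is a weighted average: the value $a_i$ times the probability $\mathbb{P}(A_i)$ of the region where $\varphi$ takes that value. This is the natural generalisation of the discrete expected value formula $\sum_i x_i\,p_i$.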
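Example 1.3. Let $\Omega = \{H, T\}$ (fair coin), $\mathbb{P}(\{H\}) = \mathbb{P}(\{T\}) = \tfrac12$, and $X(H) = 1$, $X(T) = -1$. Then $X$ is simple (but not non-negative; we handle the sign in §3). If instead $X(H) = 1$, $X(T) = 0$, then $X$ is a non-negative simple function and
$$\int_\Omega X\,d\mathbb{P} = 1 \cdot \tfrac12 + 0 \cdot \tfrac12 = \tfrac12.$$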
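Definition 1.2 can be checked mechanically on a finite space. The following is a minimal sketch, not the companion notebook's code: the helper `integrate_simple` and the die example are assumptions introduced here for illustration. It groups outcomes into the preimages $A_i$ and sums value times probability, using exact rational arithmetic.

```python
from fractions import Fraction

def integrate_simple(values, prob):
    """Integral of a non-negative simple function on a finite space.

    values: dict mapping outcome -> value of the function at that outcome
    prob:   dict mapping outcome -> probability of that outcome
    """
    # Group outcomes by value: A_i is the preimage of the distinct value a_i.
    preimages = {}
    for outcome, a in values.items():
        preimages.setdefault(a, []).append(outcome)
    # Sum a_i * P(A_i) over the distinct values (Definition 1.2).
    return sum(a * sum(prob[w] for w in A) for a, A in preimages.items())

# Hypothetical example: fair die, phi = 10 on even faces and 0 on odd faces.
prob = {w: Fraction(1, 6) for w in range(1, 7)}
phi = {w: Fraction(10) if w % 2 == 0 else Fraction(0) for w in range(1, 7)}

standard_form = integrate_simple(phi, prob)        # uses the standard form
naive_sum = sum(phi[w] * prob[w] for w in prob)    # outcome-by-outcome sum
print(standard_form, naive_sum)
```

Both routes give the same number because the standard form only regroups the terms of the outcome-by-outcome sum.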
2. Non-Negative Measurable Functions
Every non-negative measurable function is a pointwise limit of increasing simple functions. This is the foundational approximation lemma of Lebesgue integration.
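Theorem 2.1 (Monotone Approximation). For every measurable $X: \Omega \to [0, \infty]$, there exists a sequence of non-negative simple functions $\varphi_n$ with $\varphi_n \uparrow X$ pointwise. One canonical construction is
$$\varphi_n = \sum_{k=0}^{n 2^n - 1} \frac{k}{2^n}\,\mathbf{1}_{\{k/2^n \le X < (k+1)/2^n\}} + n\,\mathbf{1}_{\{X \ge n\}}.$$
This truncates $X$ at level $n$ and quantises the range $[0, n)$ into intervals of width $2^{-n}$. Each preimage $\{k/2^n \le X < (k+1)/2^n\}$ is in $\mathcal{F}$ by measurability.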
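The dyadic construction in Theorem 2.1 can be sketched on a small finite space. The function name `dyadic_approx` and the four sample values of $X$ are assumptions chosen for illustration; the point is only to watch the integrals of the approximations increase towards the integral of $X$.

```python
from fractions import Fraction
from math import floor

def dyadic_approx(x, n):
    """Value of the n-th canonical simple approximation at a point where X = x >= 0."""
    if x >= n:
        return Fraction(n)                       # truncate at level n
    return Fraction(floor(x * 2**n), 2**n)       # round down to the grid of width 2^-n

# Hypothetical finite space: four equally likely outcomes with these values of X.
X = [Fraction(1, 3), Fraction(5, 4), Fraction(7, 3), Fraction(10, 3)]
p = Fraction(1, 4)

exact = sum(x * p for x in X)
# Integrals of the approximations for n = 1, ..., 7.
integrals = [sum(dyadic_approx(x, n) * p for x in X) for n in range(1, 8)]

for n, value in enumerate(integrals, start=1):
    print(n, value, float(exact - value))
```

Once $n$ exceeds the largest value of $X$, truncation no longer bites and the remaining error is at most the grid width $2^{-n}$.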
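Definition 2.2 (Integral of a Non-Negative Measurable Function). For measurable $X \ge 0$,
$$\int_\Omega X\,d\mathbb{P} = \sup\left\{ \int_\Omega \varphi\,d\mathbb{P} \;:\; \varphi \text{ simple},\ 0 \le \varphi \le X \right\}.$$
The value may be $+\infty$. Combined with the Monotone Convergence Theorem (Theorem 2.3), Theorem 2.1 shows that the supremum equals $\lim_{n} \int_\Omega \varphi_n\,d\mathbb{P}$ along the canonical approximating sequence.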
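Theorem 2.3 (Monotone Convergence Theorem, MCT). Let $0 \le X_1 \le X_2 \le \cdots$ be measurable functions with $X_n \uparrow X$ pointwise ($\mathbb{P}$-a.s.). Then
$$\lim_{n \to \infty} \int_\Omega X_n\,d\mathbb{P} = \int_\Omega X\,d\mathbb{P}.$$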
The MCT justifies swapping the limit and the integral whenever the sequence is non-negative and increasing. It is used to compute the expectation of any non-negative random variable as the limit of expectations of bounded approximations.
The increasing condition is essential. Let $\Omega = \mathbb{R}$ with Lebesgue measure $\lambda$ and set $X_n = \mathbf{1}_{[n, n+1]}$. Then $X_n \to 0$ pointwise, yet $\int X_n\,d\lambda = 1$ for all $n$. The limit and integral do not commute because the mass "escapes to infinity"; the sequence is not increasing, so the MCT does not apply. The same mechanism matters for heavy-tailed return distributions in finance, where contributions from large values do not vanish.
3. General Integrable Functions
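Every measurable $X$ decomposes as $X = X^+ - X^-$, where $X^+ = \max(X, 0)$ and $X^- = \max(-X, 0)$ are both non-negative and measurable.
Definition 3.1 (Lebesgue Integral). For a measurable $X$:
$$\int_\Omega X\,d\mathbb{P} = \int_\Omega X^+\,d\mathbb{P} - \int_\Omega X^-\,d\mathbb{P},$$
provided at least one term is finite. If both terms are finite (equivalently, if $\int_\Omega |X|\,d\mathbb{P} < \infty$), then $X$ is $\mathbb{P}$-integrable, written $X \in L^1(\Omega, \mathcal{F}, \mathbb{P})$.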
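Theorem 3.2 (Standard Properties). For $X, Y \in L^1$ and constants $a, b \in \mathbb{R}$:
- Linearity: $\int (aX + bY)\,d\mathbb{P} = a \int X\,d\mathbb{P} + b \int Y\,d\mathbb{P}$.
- Monotonicity: $X \le Y$ a.s. $\implies \int X\,d\mathbb{P} \le \int Y\,d\mathbb{P}$.
- Null sets: modifying $X$ on a $\mathbb{P}$-null set does not change $\int X\,d\mathbb{P}$.
- Triangle inequality: $\left| \int X\,d\mathbb{P} \right| \le \int |X|\,d\mathbb{P}$.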
Theorem 3.3 (Dominated Convergence Theorem, DCT). Let $X_n$ be measurable functions with $X_n \to X$ $\mathbb{P}$-a.s. If there exists $Y \in L^1$ with $|X_n| \le Y$ $\mathbb{P}$-a.s. for all $n$, then $X \in L^1$ and
$$\lim_{n \to \infty} \int_\Omega X_n\,d\mathbb{P} = \int_\Omega X\,d\mathbb{P}.$$
The dominating function $Y$ prevents mass from escaping. In pricing, the DCT is the justification for differentiating expectations to compute Greeks, provided the derivative of the payoff with respect to the parameter (spot, vol, rate) is dominated by an integrable function.
4. Expectation
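Definition 4.1 (Expectation). The expectation (expected value) of an integrable random variable $X$ on $(\Omega, \mathcal{F}, \mathbb{P})$ is
$$\mathbb{E}[X] = \int_\Omega X\,d\mathbb{P}.$$
This is not a separate definition. The symbol $\mathbb{E}[X]$ is shorthand for the Lebesgue integral of $X$ against $\mathbb{P}$. All of Theorem 3.2 applies directly:
$$\mathbb{E}[aX + bY] = a\,\mathbb{E}[X] + b\,\mathbb{E}[Y], \qquad X \le Y \text{ a.s.} \implies \mathbb{E}[X] \le \mathbb{E}[Y], \qquad |\mathbb{E}[X]| \le \mathbb{E}[|X|].$$
When $X$ has a density $f_X$ with respect to Lebesgue measure on $\mathbb{R}$, the change-of-variables formula (image measure) recovers the classical expression:
$$\mathbb{E}[X] = \int_{\mathbb{R}} x\,f_X(x)\,dx.$$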
This is not a separate theorem — it follows from the general machinery of pushforward measures.
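Risk-neutral pricing is an expectation. Under the risk-neutral measure $\mathbb{Q}$, a European derivative with payoff $\Phi(S_T)$ is priced as
$$V_0 = e^{-rT}\,\mathbb{E}^{\mathbb{Q}}[\Phi(S_T)].$$
Monte Carlo samples $N$ independent paths and estimates this as $\hat{V}_0 = e^{-rT}\,\frac{1}{N} \sum_{i=1}^{N} \Phi(S_T^{(i)})$. The strong law of large numbers, itself a theorem about Lebesgue integrals, guarantees convergence $\mathbb{Q}$-a.s. as $N \to \infty$.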
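The Monte Carlo estimator described above can be sketched for a European call. This is an illustration under assumptions not made in the text: geometric Brownian motion under $\mathbb{Q}$ with constant rate and volatility, no dividends, and hypothetical parameter values. The function names `bs_call` and `mc_call` are introduced here; the Black-Scholes formula serves only as a reference value.

```python
import math
import random

def bs_call(s0, k, r, sigma, t):
    """Black-Scholes price of a European call (no dividends), used as a reference."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s0 * cdf(d1) - k * math.exp(-r * t) * cdf(d2)

def mc_call(s0, k, r, sigma, t, n_paths, seed=0):
    """Monte Carlo estimate of exp(-rT) * E^Q[(S_T - K)^+] under GBM."""
    rng = random.Random(seed)                    # fixed seed for reproducibility
    drift = (r - 0.5 * sigma**2) * t
    vol = sigma * math.sqrt(t)
    total = 0.0
    for _ in range(n_paths):
        # Terminal price under the risk-neutral measure.
        s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += max(s_t - k, 0.0)
    return math.exp(-r * t) * total / n_paths

# Hypothetical contract and market parameters.
params = dict(s0=100.0, k=100.0, r=0.02, sigma=0.2, t=1.0)
estimate = mc_call(n_paths=200_000, **params)
reference = bs_call(**params)
print(estimate, reference)
```

The sample average is the integral of the payoff against the empirical measure of the simulated draws; the strong law says this converges to the integral against $\mathbb{Q}$.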
5. Key Inequalities
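Theorem 5.1 (Jensen's Inequality). Let $X \in L^1$ and $g: \mathbb{R} \to \mathbb{R}$ be a convex function with $g(X) \in L^1$. Then
$$g(\mathbb{E}[X]) \le \mathbb{E}[g(X)].$$
Proof. Set $\mu = \mathbb{E}[X]$. Convexity of $g$ at $\mu$ means there exists a supporting line: a constant $c$ such that $g(x) \ge g(\mu) + c\,(x - \mu)$ for all $x$. Substituting $x = X$ and taking expectations:
$$\mathbb{E}[g(X)] \ge g(\mu) + c\,(\mathbb{E}[X] - \mu) = g(\mathbb{E}[X]).$$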
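Example 5.2 (Call lower bound). The call payoff $(S_T - K)^+$ is convex in $S_T$. Jensen's inequality gives
$$\mathbb{E}[(S_T - K)^+] \ge (\mathbb{E}[S_T] - K)^+.$$
Under the risk-neutral measure, $\mathbb{E}^{\mathbb{Q}}[S_T] = S_0 e^{rT}$ (forward price). Multiplying by $e^{-rT}$:
$$C_0 = e^{-rT}\,\mathbb{E}^{\mathbb{Q}}[(S_T - K)^+] \ge (S_0 - K e^{-rT})^+.$$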
This is the intrinsic-value lower bound for a European call — derived purely from convexity, without Black-Scholes.
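The bound itself is model-free, but it can be sanity-checked against any specific model. The sketch below assumes Black-Scholes dynamics with hypothetical parameters and verifies over a grid of strikes that the model price never falls below the discounted-strike lower bound. The names `bs_call` and `lower_bound` are introduced here for illustration.

```python
import math

def bs_call(s0, k, r, sigma, t):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s0 * cdf(d1) - k * math.exp(-r * t) * cdf(d2)

def lower_bound(s0, k, r, t):
    """Jensen lower bound: positive part of spot minus discounted strike."""
    return max(s0 - k * math.exp(-r * t), 0.0)

# Hypothetical market parameters and a grid of strikes from 60 to 140.
s0, r, sigma, t = 100.0, 0.03, 0.25, 0.5
gaps = {
    k: bs_call(s0, k, r, sigma, t) - lower_bound(s0, k, r, t)
    for k in range(60, 141, 10)
}
for k, gap in gaps.items():
    print(k, gap)   # price minus bound: the option's time value
```

The gap is the time value of the option; it is smallest deep in or out of the money, where the payoff is nearly affine over the bulk of the distribution and Jensen is close to an equality.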
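Theorem 5.3 (Markov's Inequality). For a non-negative $X$ and $a > 0$:
$$\mathbb{P}(X \ge a) \le \frac{\mathbb{E}[X]}{a}.$$
Proof. $\mathbb{E}[X] \ge \mathbb{E}[X\,\mathbf{1}_{\{X \ge a\}}] \ge a\,\mathbb{P}(X \ge a)$.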
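Theorem 5.4 (Chebyshev's Inequality). For $X \in L^2$ with $\mu = \mathbb{E}[X]$, $\sigma^2 = \mathrm{Var}(X)$, and any $\varepsilon > 0$:
$$\mathbb{P}(|X - \mu| \ge \varepsilon) \le \frac{\sigma^2}{\varepsilon^2}.$$
Proof. Apply Markov's inequality to $(X - \mu)^2$ with $a = \varepsilon^2$.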
Markov and model-free risk bounds. If a portfolio's daily P&L has known mean and the loss is non-negative after shifting, Markov's inequality bounds the tail probability without any distributional assumption. The bound is often loose in practice, since actual tail probabilities are usually far smaller than $\mathbb{E}[X]/a$, but it cannot be improved using the mean alone: for any given mean and threshold there is a non-negative distribution that attains it.
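Both inequalities can be verified exactly on a finite space. The following sketch assumes a fair six-sided die as the distribution (a hypothetical choice for illustration) and uses rational arithmetic so the comparisons are exact rather than subject to floating-point error.

```python
from fractions import Fraction

# Hypothetical example: X is the face of a fair six-sided die.
faces = range(1, 7)
p = Fraction(1, 6)

mean = sum(x * p for x in faces)                  # E[X]
var = sum((x - mean) ** 2 * p for x in faces)     # Var(X)

def tail(a):
    """Exact P(X >= a)."""
    return sum(p for x in faces if x >= a)

def deviation(eps):
    """Exact P(|X - E[X]| >= eps)."""
    return sum(p for x in faces if abs(x - mean) >= eps)

# Markov: P(X >= a) <= E[X] / a for each threshold a.
markov_ok = all(tail(a) <= mean / a for a in range(1, 7))

# Chebyshev: P(|X - mu| >= eps) <= Var(X) / eps^2 for a few deviations.
eps_grid = (Fraction(1, 2), Fraction(3, 2), Fraction(5, 2))
chebyshev_ok = all(deviation(e) <= var / e**2 for e in eps_grid)

print(mean, var, markov_ok, chebyshev_ok)
```

Comparing `tail(a)` with `mean / a` for individual thresholds shows how much slack the bound leaves for a light-tailed distribution like this one.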
6. Geometric Intuition
Range-partitioning versus domain-partitioning. The Riemann integral approximates the area under a curve by summing thin vertical rectangles of width $\Delta x$. The Lebesgue integral sums thin horizontal strips of height $\Delta y$: for each level $y$, the "width" is not $\Delta x$ but $\mathbb{P}(y \le X < y + \Delta y)$, the probability of the preimage. On abstract $\Omega$ there is no natural x-axis to partition, so the range-based approach is the only one available.
Jensen geometrically. If $g$ is convex, its graph curves upward. The mean $\mathbb{E}[X]$ is the centre of mass of $X$'s distribution under $\mathbb{P}$. The point $(\mathbb{E}[X], g(\mathbb{E}[X]))$ lies on the curve. The value $\mathbb{E}[g(X)]$ is the probability-weighted average of heights of the curve, which is always at least as high as the height at the centre of mass because curvature pushes the average up. Equality holds if and only if $X$ is constant a.s. or $g$ is affine on an interval containing the support of $X$.
MCT geometrically. The integral $\int_\Omega X\,d\mathbb{P}$ is the total probability-weighted "volume under $X$." An increasing sequence $X_n \uparrow X$ fills this volume from below: each additional slice of the range adds non-negative probability mass. The integral accumulates monotonically to the full volume.
Validation
The companion notebook verifies the following claims computationally on finite probability spaces using exact rational arithmetic:
- Simple function integral: computed as $\sum_i a_i\,\mathbb{P}(A_i)$ on a discrete $\Omega$, confirmed against the naive weighted sum.
- MCT verification: a non-decreasing sequence $X_n \uparrow X$ on a discrete space; confirm $\lim_n \mathbb{E}[X_n] = \mathbb{E}[X]$.
- DCT verification: a bounded convergent sequence on a discrete space; confirm limit and integral commute.
- Jensen's inequality: checked for $g(x) = x^2$, $g(x) = |x|$, and the call payoff $g(x) = (x - K)^+$.
- Markov and Chebyshev: verified that the bounds hold exactly for specific distributions.
Before opening the notebook, try the following by hand:
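Let $\Omega = \{1, 2, 3, 4, 5, 6\}$ with $\mathbb{P}(\{\omega\}) = \tfrac16$ for each $\omega$. Define $X(\omega) = \omega$, so $X \ge 0$.
- Compute $\mathbb{E}[X]$ directly from Definition 1.2.
- Compute $\mathbb{E}[X^2]$ and verify Jensen with $g(x) = x^2$: check $(\mathbb{E}[X])^2 \le \mathbb{E}[X^2]$.
- Compute $\mathbb{P}(X \ge 5)$ and verify Markov with $a = 5$: check $\mathbb{P}(X \ge 5) \le \mathbb{E}[X]/5$.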
Limitations
$L^1$ is the minimum; $L^2$ is what stochastic calculus needs. The Itô integral (Module 1 of Stochastic Calculus) is constructed for processes $H$ satisfying $\mathbb{E}\left[\int_0^T H_t^2\,dt\right] < \infty$. This is an $L^2$ condition: square-integrability, not just integrability. A random variable in $L^1 \setminus L^2$ cannot serve as an integrand in this $L^2$ construction. Practically: power-law tailed payoffs (e.g., leveraged volatility products) may be in $L^1$ but fail $L^2$ integrability, invalidating standard Itô calculus arguments.
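A worked example makes the gap between integrable and square-integrable concrete. The Pareto family with scale 1 and tail index $\alpha$ is assumed here purely for illustration:

```latex
% Assumed for illustration: X is Pareto with density f(x) = \alpha x^{-\alpha-1} on [1, \infty).
\mathbb{E}[X] = \int_1^\infty \alpha\, x^{-\alpha}\,dx = \frac{\alpha}{\alpha - 1}
\quad (\alpha > 1),
\qquad
\mathbb{E}[X^2] = \int_1^\infty \alpha\, x^{1-\alpha}\,dx =
\begin{cases}
  \dfrac{\alpha}{\alpha - 2}, & \alpha > 2, \\[6pt]
  \infty, & \alpha \le 2.
\end{cases}
```

So for $1 < \alpha \le 2$ the variable has a finite mean but infinite second moment: it lies in $L^1$ but not in $L^2$.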
Non-integrable random variables are real. The Cauchy distribution has $\mathbb{E}[|X|] = \infty$, so the expectation does not exist. A model that assumes finite mean when the true distribution is heavy-tailed (Pareto with tail index $\alpha \le 1$) is not a conservative approximation; it is a misspecification. This arises in extreme credit events and in certain volatility-of-volatility models.
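The practical symptom of a non-existent mean is a sample average that never settles. The sketch below draws standard Cauchy samples by inverse-CDF sampling and records the running mean at a few sample sizes; the seed, sample sizes, and names are arbitrary choices for illustration. The sample median is computed alongside as a contrast, since the median of a Cauchy distribution is well defined.

```python
import math
import random
import statistics

def cauchy_sample(rng):
    """Standard Cauchy draw via the inverse CDF: tan(pi * (U - 1/2))."""
    return math.tan(math.pi * (rng.random() - 0.5))

rng = random.Random(0)                            # fixed seed for reproducibility
samples = [cauchy_sample(rng) for _ in range(100_000)]

# Running means at increasing sample sizes. With no finite mean, the law of
# large numbers does not apply and these need not stabilise as n grows.
checkpoints = [100, 1_000, 10_000, 100_000]
running_means = {n: sum(samples[:n]) / n for n in checkpoints}

# The median, by contrast, is a consistent location estimate for Cauchy data.
sample_median = statistics.median(samples)

print(running_means)
print(sample_median)
```

Rerunning with different seeds is instructive: the median stays near zero, while the running means are typically dominated by a handful of extreme draws.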
Fubini's theorem is not free. To interchange the order of a double integral $\int\!\!\int f(x, y)\,\mu(dx)\,\nu(dy)$, the Fubini–Tonelli theorem requires $f \in L^1$ of the product measure (or $f \ge 0$ for Tonelli), with both measures σ-finite. This condition is non-trivial for joint distributions under stochastic vol models, pricing in multi-currency frameworks (FX triangle), or computing expected values of path-dependent integrals.
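Differentiating under $\mathbb{E}$ requires DCT. For a payoff $\Phi(\theta)$ depending on a parameter $\theta$, the formula $\frac{\partial}{\partial\theta}\,\mathbb{E}[\Phi(\theta)] = \mathbb{E}\left[\frac{\partial \Phi}{\partial\theta}(\theta)\right]$ holds if $\left|\frac{\partial \Phi}{\partial\theta}\right| \le Y$ with $Y \in L^1$. For discontinuous payoffs (digitals, barriers, first-touch), $\frac{\partial \Phi}{\partial\theta}$ can be a Dirac delta, which is not an integrable function, and the formula fails. The pathwise estimator is then biased (for a digital it returns zero on almost every path), and this is not a numerical artefact; it is a mathematical failure of the interchange condition.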
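The contrast between a payoff where the interchange is valid and one where it fails can be seen numerically. The sketch below assumes Black-Scholes dynamics with hypothetical parameters and a cash-or-nothing digital paying 1; the function `pathwise_deltas` is introduced here for illustration. It differentiates each payoff along the path and compares the resulting averages with closed-form deltas.

```python
import math
import random

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def pathwise_deltas(s0, k, r, sigma, t, n_paths, seed=0):
    """Pathwise (differentiate-inside-the-expectation) deltas under GBM."""
    rng = random.Random(seed)
    drift, vol = (r - 0.5 * sigma**2) * t, sigma * math.sqrt(t)
    disc = math.exp(-r * t)
    call_sum, digital_sum = 0.0, 0.0
    for _ in range(n_paths):
        s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        ds_t = s_t / s0                                  # dS_T/dS_0 along the path
        call_sum += (1.0 if s_t > k else 0.0) * ds_t     # call payoff derivative: 1{S_T > K}
        digital_sum += 0.0 * ds_t                        # digital payoff derivative: 0 wherever it exists
    return disc * call_sum / n_paths, disc * digital_sum / n_paths

# Hypothetical parameters.
s0, k, r, sigma, t = 100.0, 100.0, 0.02, 0.2, 1.0
d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
d2 = d1 - sigma * math.sqrt(t)

exact_call_delta = norm_cdf(d1)
exact_digital_delta = math.exp(-r * t) * norm_pdf(d2) / (s0 * sigma * math.sqrt(t))

call_pw, digital_pw = pathwise_deltas(s0, k, r, sigma, t, n_paths=200_000)
print(call_pw, exact_call_delta)
print(digital_pw, exact_digital_delta)
```

For the call, the pathwise average approximates the closed-form delta. For the digital, the pathwise average is exactly zero while the true delta is strictly positive: the sensitivity sits entirely in the jump at the strike, which pathwise differentiation cannot see.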
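"$\mathbb{E}[X]$ exists" is not automatic. Always verify $\mathbb{E}[|X|] < \infty$ before applying linearity, Jensen, or the DCT. For a product $XY$, existence of $\mathbb{E}[X]$ and $\mathbb{E}[Y]$ separately does not imply $\mathbb{E}[XY]$ is finite; a sufficient condition is $X \in L^2$ and $Y \in L^2$ (Cauchy-Schwarz: $\mathbb{E}[|XY|] \le \sqrt{\mathbb{E}[X^2]\,\mathbb{E}[Y^2]}$).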
Interview Angle
L1 — Junior quant / quant developer
Expected depth: Compute $\mathbb{E}[X]$ on discrete spaces from first principles; state linearity and monotonicity; apply Jensen to options; know the Markov bound.
Q1. "What is $\mathbb{E}[X]$? Write the formula for a discrete random variable and connect it to the Lebesgue integral."
Expected: $\mathbb{E}[X] = \sum_i x_i\,\mathbb{P}(X = x_i)$, which is $\int_\Omega X\,d\mathbb{P}$ when $X$ is a simple (discrete) function. Linearity and monotonicity follow immediately from the integral definition. Weak answer: "the average" without probability weights or without connecting to the formal definition.
Q2. "State Jensen's inequality and use it to derive a lower bound for a European call."
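Expected: $g(\mathbb{E}[X]) \le \mathbb{E}[g(X)]$ for convex $g$. With $g(x) = (x - K)^+$ and $\mathbb{E}^{\mathbb{Q}}[S_T] = S_0 e^{rT}$: $C_0 \ge (S_0 - K e^{-rT})^+$. Weak answer: states the inequality but cannot derive the finance application.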
Q3. "Can $\mathbb{E}[X]$ fail to exist? Give a concrete example."
Expected: Yes. The Cauchy distribution with density $f(x) = \frac{1}{\pi(1 + x^2)}$: the integral $\int_{\mathbb{R}} |x|\,f(x)\,dx$ diverges. Common in finance: power-law tails with tail index $\alpha \le 1$. The sample mean of Cauchy samples does not converge as sample size increases; it fluctuates permanently.
L2 — Senior quant
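Expected depth: Explain when DCT justifies differentiating under $\mathbb{E}$, derive Jensen via supporting hyperplanes, distinguish $L^1$ from $L^2$.
Q1. "When can you compute $\Delta = \partial V_0 / \partial S_0$ by differentiating under the expectation sign? When does it fail?"
Expected: DCT justifies it when the derivative of the payoff is dominated by an $L^1$ function. For a call, $\frac{\partial}{\partial S_T}(S_T - K)^+ = \mathbf{1}_{\{S_T > K\}}$, which is bounded by 1; when the chain-rule factor $\partial S_T / \partial S_0$ is integrable (in Black-Scholes it equals $S_T / S_0$), DCT applies, yielding $\Delta = e^{-rT}\,\mathbb{E}^{\mathbb{Q}}\left[\mathbf{1}_{\{S_T > K\}}\,\frac{\partial S_T}{\partial S_0}\right]$. For a digital $\mathbf{1}_{\{S_T > K\}}$, the derivative with respect to $S_T$ is $\delta(S_T - K)$, a Dirac delta, which is not an integrable function: DCT fails and naive differentiation gives a wrong value.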
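Q2. "What is the difference between $L^1$ and $L^2$? Why does the Itô integral require $L^2$?"
Expected: $L^1$: $\mathbb{E}[|X|] < \infty$; $L^2$: $\mathbb{E}[X^2] < \infty$. $L^2 \subset L^1$ on probability spaces (Cauchy-Schwarz). The Itô isometry $\mathbb{E}\left[\left(\int_0^T H_t\,dW_t\right)^2\right] = \mathbb{E}\left[\int_0^T H_t^2\,dt\right]$ is meaningful only when the right side is finite, i.e., when the integrand is $L^2$. Without this, the stochastic integral need not be a martingale (in general it is only a local martingale) and its variance is not controlled.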
Q3. "State the Dominated Convergence Theorem precisely and give a pricing application."
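Expected: $X_n \to X$ a.s., $|X_n| \le Y \in L^1$ for all $n$ $\implies \mathbb{E}[X_n] \to \mathbb{E}[X]$. Application: continuity of option prices in parameters. For a call with strikes $K_n \to K$, $(S_T - K_n)^+ \to (S_T - K)^+$ a.s. and $(S_T - K_n)^+ \le S_T \in L^1$ (under log-normal, $\mathbb{E}^{\mathbb{Q}}[S_T] = S_0 e^{rT} < \infty$), so DCT gives continuity of the call price in the strike.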
L3 — Quant researcher
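Expected depth: Fubini–Tonelli and its conditions, uniform integrability, $L^p$ interpolation, measurability on path space.
Q1. "When can you swap $\mathbb{E}$ and an infinite sum? State the precise theorem."
Expected: If $X_n \ge 0$: MCT gives $\mathbb{E}\left[\sum_n X_n\right] = \sum_n \mathbb{E}[X_n]$ always (no integrability required). For general $X_n$: if $\sum_n \mathbb{E}[|X_n|] < \infty$, set $Y = \sum_n |X_n|$ (MCT applied to partial sums gives $Y \in L^1$), then $\left|\sum_{n \le N} X_n\right| \le Y$ for every $N$, and DCT gives $\mathbb{E}\left[\sum_n X_n\right] = \sum_n \mathbb{E}[X_n]$. For double integrals: Tonelli (non-negative, no $L^1$ condition) or Fubini ($L^1$ of the product measure) allows interchange of order.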
Q2. "What is uniform integrability? Why is it the correct condition for martingale convergence?"
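Expected: $\{X_n\}$ is uniformly integrable (UI) if $\sup_n \mathbb{E}\left[|X_n|\,\mathbf{1}_{\{|X_n| > K\}}\right] \to 0$ as $K \to \infty$, meaning the tails of $X_n$ are uniformly controlled. A martingale converges in $L^1$ (not just a.s.) if and only if it is UI. Boundedness in $L^2$ ($\sup_n \mathbb{E}[X_n^2] < \infty$) implies UI by Cauchy-Schwarz. This underpins the Optional Stopping Theorem: $\mathbb{E}[M_\tau] = \mathbb{E}[M_0]$ holds for bounded stopping times, or for UI martingales.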
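Q3. "Brownian paths live in $C[0,T]$. What is the natural σ-algebra and measure on this space?"
Expected: The Borel σ-algebra $\mathcal{B}(C[0,T])$ on $C[0,T]$ with the sup-norm topology. This is generated by cylinder sets $\{\omega : (\omega(t_1), \dots, \omega(t_n)) \in B\}$ with $B$ a Borel subset of $\mathbb{R}^n$. The Wiener measure $\mathbb{W}$ is the unique probability measure on $C[0,T]$ under which $\omega(0) = 0$ a.s., $\omega(t) - \omega(s) \sim N(0, t - s)$ for $s < t$, and increments on disjoint intervals are independent. Its existence follows from the Kolmogorov extension theorem together with the Kolmogorov continuity criterion, which supplies the continuous modification. Integration of functionals $F: C[0,T] \to \mathbb{R}$ with respect to $\mathbb{W}$ is what every path-dependent pricing formula computes.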