Context
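Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $\mathcal{G} \subseteq \mathcal{F}$ a sub-$\sigma$-algebra.
We work in the Hilbert space $L^2(\mathcal{F}) = L^2(\Omega, \mathcal{F}, \mathbb{P})$ with inner product
$$\langle X, Y \rangle = \mathbb{E}[XY].$$
Define the closed subspace
$$L^2(\mathcal{G}) = \{ Z \in L^2(\mathcal{F}) : Z \text{ is } \mathcal{G}\text{-measurable} \}.$$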
Theorem (Projection Characterization)
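For any $X \in L^2(\mathcal{F})$, the conditional expectation $\mathbb{E}[X \mid \mathcal{G}]$
is the unique orthogonal projection of $X$ onto $L^2(\mathcal{G})$.
Equivalently:
$$\mathbb{E}[X \mid \mathcal{G}] = \arg\min_{Z \in L^2(\mathcal{G})} \mathbb{E}\big[(X - Z)^2\big].$$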
Orthogonality Property
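For all $Z \in L^2(\mathcal{G})$,
$$\mathbb{E}\big[(X - \mathbb{E}[X \mid \mathcal{G}])\, Z\big] = 0.$$
In particular, it is sufficient to check this for
$Z = \mathbf{1}_A$, $A \in \mathcal{G}$.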
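As a short worked sketch, the orthogonality property is exactly what makes the projection the minimiser. Writing $PX$ as shorthand (introduced here for illustration) for $\mathbb{E}[X \mid \mathcal{G}]$, and taking any $Z \in L^2(\mathcal{G})$:

$$
\begin{aligned}
\mathbb{E}\big[(X - Z)^2\big]
&= \mathbb{E}\big[(X - PX)^2\big] + 2\,\mathbb{E}\big[(X - PX)(PX - Z)\big] + \mathbb{E}\big[(PX - Z)^2\big] \\
&= \mathbb{E}\big[(X - PX)^2\big] + \mathbb{E}\big[(PX - Z)^2\big] \\
&\ge \mathbb{E}\big[(X - PX)^2\big].
\end{aligned}
$$

The cross term vanishes because $PX - Z$ lies in $L^2(\mathcal{G})$ and the residual $X - PX$ is orthogonal to that subspace. Equality holds only when $Z = PX$ almost surely, which gives uniqueness.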
Proof Sketch
- $L^2(\mathcal{F})$ is a Hilbert space.
- $L^2(\mathcal{G})$ is a closed subspace.
- Every element of a Hilbert space admits a unique orthogonal projection onto a closed subspace.
- The projection satisfies the defining property of conditional expectation.
Hence:
$$\mathbb{E}[X \mid \mathcal{G}] = \operatorname{proj}_{L^2(\mathcal{G})}(X).$$
Interpretation
- $X$: future payoff or random outcome
- $\mathcal{G}$: available information
- $\mathbb{E}[X \mid \mathcal{G}]$: best mean-square predictor using only $\mathcal{G}$
This underlies:
- filtering
- least-squares Monte Carlo
- regression-based pricing
- risk-neutral valuation
Finite Partition Example
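If $\mathcal{G} = \sigma(A_1, \dots, A_n)$ for a finite partition $\{A_1, \dots, A_n\}$ of $\Omega$ with $\mathbb{P}(A_i) > 0$,
$$\mathbb{E}[X \mid \mathcal{G}] = \sum_{i=1}^{n} \frac{\mathbb{E}[X \mathbf{1}_{A_i}]}{\mathbb{P}(A_i)}\, \mathbf{1}_{A_i}.$$
The projection is piecewise constant on the partition.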
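The finite-partition case can be checked numerically. The following is a minimal sketch, assuming a hypothetical 12-point sample space with uniform probability and an arbitrary three-cell partition (none of these choices come from the text). It computes the cell averages, then checks orthogonality of the residual against each cell indicator and compares the mean-square error with that of a perturbed predictor.

```python
import random

random.seed(0)

# Hypothetical setup: Omega = {0, ..., 11} with uniform probability,
# split into three cells; X is an arbitrary random variable on Omega.
omega = list(range(12))
cells = [omega[0:4], omega[4:6], omega[6:12]]
X = [random.gauss(0.0, 1.0) for _ in omega]
p = 1.0 / len(omega)  # probability of each single outcome


def cond_exp(X, cells):
    """E[X | G] for G generated by the partition: the cell average on each cell."""
    out = [0.0] * len(X)
    for cell in cells:
        avg = sum(X[w] for w in cell) / len(cell)  # equals E[X 1_A] / P(A) under uniform P
        for w in cell:
            out[w] = avg
    return out


def mse(X, Z):
    """Mean-square error E[(X - Z)^2] under the uniform measure."""
    return sum(p * (x - z) ** 2 for x, z in zip(X, Z))


EX_G = cond_exp(X, cells)

# Orthogonality: E[(X - E[X|G]) 1_A] should vanish for every cell A.
residual_inner = [sum(p * (X[w] - EX_G[w]) for w in cell) for cell in cells]

# Optimality: shifting the predictor by a constant can only increase the error.
perturbed = [z + 0.1 for z in EX_G]
print(residual_inner, mse(X, EX_G), mse(X, perturbed))
```

Any other $\mathcal{G}$-measurable predictor is also constant on each cell, so comparing against a perturbed piecewise-constant candidate is the relevant test.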
Quant Finance Example
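Let $X = e^{-r(T-t)}(S_T - K)^+$ and $\mathcal{G} = \sigma(S_t)$, with $S$ a geometric Brownian motion under the risk-neutral measure $\mathbb{Q}$.
Then: $\mathbb{E}^{\mathbb{Q}}[X \mid \mathcal{G}] = C(t, S_t)$, where $C$ is the Black–Scholes call price function.
This is the best mean-square estimator of the discounted payoff given today’s spot.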
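To make the example concrete, here is a minimal sketch comparing the closed-form Black–Scholes call price with a Monte Carlo estimate of the conditional expectation of the discounted payoff given the spot. It assumes risk-neutral GBM dynamics with constant rate and volatility; the function names and parameter values are illustrative choices, not taken from the text.

```python
import math
import random


def bs_call(s, k, r, sigma, tau):
    """Black-Scholes price of a European call with time to maturity tau."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)

    def cdf(x):
        # Standard normal CDF via the error function.
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return s * cdf(d1) - k * math.exp(-r * tau) * cdf(d2)


def mc_call(s, k, r, sigma, tau, n_paths, seed=0):
    """Monte Carlo estimate of E[exp(-r tau) (S_T - K)^+ | S_t = s] under GBM."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma ** 2) * tau
    vol = sigma * math.sqrt(tau)
    total = 0.0
    for _ in range(n_paths):
        s_T = s * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += max(s_T - k, 0.0)
    return math.exp(-r * tau) * total / n_paths


# Illustrative parameters (assumptions for this sketch).
price_closed = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
price_mc = mc_call(100.0, 100.0, 0.05, 0.2, 1.0, n_paths=200_000)
print(price_closed, price_mc)
```

The two numbers should agree up to Monte Carlo error, which shrinks like one over the square root of the number of paths.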
Key Takeaways
- Conditional expectation = orthogonal projection in $L^2$
- Optimal under squared loss
- Foundational for pricing, filtering, and regression methods
Interview Angle
L1: What does it mean that conditional expectation is the best predictor? Why do we use squared loss rather than absolute loss in this context?
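$\mathbb{E}[X \mid \mathcal{G}]$ is the unique (up to almost-sure equality) $\mathcal{G}$-measurable random variable minimising $\mathbb{E}[(X - Z)^2]$ over all $\mathcal{G}$-measurable $Z \in L^2$. "Best" means no other predictor using only the information in $\mathcal{G}$ achieves smaller expected squared error. The optimality is global over the whole class of competing predictors, and it holds conditionally as well as on average: $\mathbb{E}[X \mid \mathcal{G}]$ also minimises the conditional error $\mathbb{E}[(X - Z)^2 \mid \mathcal{G}]$ almost surely.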
Squared loss is preferred in this Hilbert space framework for structural reasons. $L^2$ is a Hilbert space: it has an inner product $\langle X, Y \rangle = \mathbb{E}[XY]$, an induced norm $\|X\|_2 = \sqrt{\mathbb{E}[X^2]}$, and, crucially, the projection theorem holds. Every element has a unique orthogonal projection onto any closed subspace, and the projection onto $L^2(\mathcal{G})$ is exactly the conditional expectation. Under absolute loss ($\mathbb{E}|X - Z|$), the optimal predictor is the conditional median, not the conditional mean; $L^1$ is a Banach space but not a Hilbert space, so the orthogonal projection structure does not apply and the mathematics is considerably more complex.
In quant finance, squared loss also has a natural economic interpretation: it penalises large hedging errors more than small ones, matching the quadratic P&L structure of a delta-hedged position.
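The mean-versus-median contrast can be seen in the simplest case of a trivial $\mathcal{G}$, where the predictor is a single constant. The sketch below uses a small, made-up skewed sample and scans a grid of candidate constants, which is a brute-force illustration rather than how one would compute these in practice.

```python
import statistics

# Illustrative skewed sample (an assumption for this demo).
sample = [1.0, 2.0, 2.0, 3.0, 10.0]


def sq_loss(c):
    """Average squared error of the constant predictor c."""
    return sum((x - c) ** 2 for x in sample) / len(sample)


def abs_loss(c):
    """Average absolute error of the constant predictor c."""
    return sum(abs(x - c) for x in sample) / len(sample)


mean = statistics.mean(sample)
median = statistics.median(sample)

# Scan constant predictors on a grid from 0.00 to 10.00 in steps of 0.01.
grid = [i / 100 for i in range(0, 1001)]
best_sq = min(grid, key=sq_loss)    # lands at the sample mean
best_abs = min(grid, key=abs_loss)  # lands at the sample median
print(mean, best_sq, median, best_abs)
```

The outlier at 10 pulls the squared-loss minimiser (the mean) upward, while the absolute-loss minimiser (the median) stays with the bulk of the data.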
L2: Prove that $L^2(\mathcal{G})$ is a closed subspace of $L^2(\mathcal{F})$. Why does closedness matter for the projection theorem?
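$L^2(\mathcal{G})$ is a subspace. It is clearly non-empty (contains 0). If $Z_1, Z_2 \in L^2(\mathcal{G})$ and $a, b \in \mathbb{R}$, then $aZ_1 + bZ_2$ is $\mathcal{G}$-measurable (measurability is preserved under linear combinations) and square-integrable (by the triangle inequality in $L^2$), so it belongs to $L^2(\mathcal{G})$.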
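Closedness. Let $(Z_n) \subset L^2(\mathcal{G})$ be a sequence converging to $Z$ in $L^2$-norm: $\|Z_n - Z\|_2 \to 0$. We need $Z \in L^2(\mathcal{G})$, i.e., $Z$ has a $\mathcal{G}$-measurable version. $L^2$-convergence implies that a subsequence $Z_{n_k} \to Z$ almost surely. Define $\tilde{Z} = \limsup_k Z_{n_k}$ where this is finite and $\tilde{Z} = 0$ elsewhere. Since each $Z_{n_k}$ is $\mathcal{G}$-measurable, so is $\tilde{Z}$, and $Z = \tilde{Z}$ almost surely. Elements of $L^2$ are equivalence classes under almost-sure equality (equivalently, one may assume $\mathcal{G}$ contains the $\mathbb{P}$-null sets), so we conclude $Z \in L^2(\mathcal{G})$.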
Why closedness matters. The projection theorem in Hilbert spaces states: for every $x \in H$ and every closed subspace $M \subseteq H$, there exists a unique $Px \in M$ with $\langle x - Px, m \rangle = 0$ for all $m \in M$. Without closedness, the infimum of $\|x - m\|$ over $m \in M$ may not be attained: a minimising sequence can converge to a limit outside $M$, in which case no projection exists. Closedness ensures the minimiser is achieved inside the subspace.
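A standard illustration of the failure, offered here as a sketch outside the probabilistic setting: take $H = \ell^2$ and let $M$ be the (non-closed) subspace of sequences with only finitely many non-zero entries. For $x = (1, \tfrac{1}{2}, \tfrac{1}{3}, \dots)$ and its truncations $x^{(n)} = (1, \tfrac{1}{2}, \dots, \tfrac{1}{n}, 0, 0, \dots) \in M$,

$$
\|x - x^{(n)}\|^2 = \sum_{k > n} \frac{1}{k^2} \longrightarrow 0,
\qquad\text{so}\qquad
\inf_{m \in M} \|x - m\| = 0 .
$$

Since $x \notin M$, no element of $M$ attains distance zero, and $x$ has no nearest point in $M$.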
L3: How does the projection interpretation of conditional expectation underpin least-squares Monte Carlo (Longstaff-Schwartz)? What is the connection to the martingale representation theorem?
Least-squares Monte Carlo (Longstaff-Schwartz). For an American option, the continuation value at time $t_k$ along path $i$ is $\mathbb{E}[V_{t_{k+1}} \mid S_{t_k} = S_{t_k}^{(i)}]$, where $V_{t_{k+1}}$ is the (discounted) option value at the next step. This conditional expectation is a function of $S_{t_k}$ alone (by the Markov property of GBM), so it lies in $L^2(\sigma(S_{t_k}))$. Longstaff-Schwartz approximates this function by projecting $V_{t_{k+1}}$ onto a finite-dimensional subspace of $L^2(\sigma(S_{t_k}))$, spanned by a chosen basis of functions $\phi_1(S_{t_k}), \dots, \phi_m(S_{t_k})$ (e.g., Laguerre polynomials). The $L^2$-projection coefficient vector $\beta$ is estimated by OLS regression across Monte Carlo paths. The algorithm is a sample-based approximation of the projection theorem.
The key structural insight: because the regression target is an $L^2$ projection, the OLS estimator converges to the true conditional expectation as the number of paths $N \to \infty$ and the basis dimension $m \to \infty$ at appropriate rates, provided the span of the basis functions is dense in $L^2(\sigma(S_{t_k}))$. Convergence is measured in $L^2$, matching the norm of the underlying Hilbert space.
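The regression step can be sketched in isolation. The code below is a minimal illustration of a single continuation-value regression for a put, not a full Longstaff-Schwartz implementation: it regresses on all paths rather than only in-the-money ones, uses a monomial basis in moneyness instead of Laguerre polynomials, and all names and parameters are hypothetical choices made for the demo.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda row_idx: abs(m[row_idx][col]))
        m[col], m[piv] = m[piv], m[col]
        for row_idx in range(col + 1, n):
            f = m[row_idx][col] / m[col][col]
            for c in range(col, n + 1):
                m[row_idx][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(features, y):
    """OLS coefficients from the normal equations (Phi^T Phi) beta = Phi^T y."""
    k = len(features[0])
    gram = [[sum(f[i] * f[j] for f in features) for j in range(k)] for i in range(k)]
    rhs = [sum(f[i] * yi for f, yi in zip(features, y)) for i in range(k)]
    return solve(gram, rhs)


def basis(s, strike):
    """Hypothetical basis: monomials 1, x, x^2 in moneyness x = s / strike."""
    x = s / strike
    return [1.0, x, x * x]


# Illustrative parameters (assumptions): regression date t, next date T.
rng = random.Random(0)
s0, strike, r, sigma, t, T = 100.0, 100.0, 0.05, 0.2, 0.5, 1.0
n_paths = 20_000


def gbm_step(s, dt):
    """One exact GBM step of length dt under the risk-neutral measure."""
    z = rng.gauss(0.0, 1.0)
    return s * math.exp((r - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z)


s_t = [gbm_step(s0, t) for _ in range(n_paths)]
s_T = [gbm_step(s, T - t) for s in s_t]
# Regression target: put payoff at T, discounted back to t.
y = [math.exp(-r * (T - t)) * max(strike - s, 0.0) for s in s_T]

beta = ols([basis(s, strike) for s in s_t], y)


def continuation(s):
    """Fitted continuation value at spot s: the projection onto span(basis)."""
    return sum(b * f for b, f in zip(beta, basis(s, strike)))


print(beta, continuation(90.0), continuation(110.0))
```

By construction the fitted residuals are orthogonal, in the sample inner product, to every basis function; this is the finite-sample counterpart of the orthogonality property of conditional expectation.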
Connection to the martingale representation theorem (MRT). The MRT states that in a Brownian filtration $(\mathcal{F}_t)$, any $L^2$-martingale $M$ can be written as $M_t = M_0 + \int_0^t \phi_s \, dW_s$ for an adapted process $\phi$ that is unique up to $dt \otimes d\mathbb{P}$-null sets. This is a representation theorem in $L^2$: the space of square-integrable $(\mathcal{F}_t)$-martingales started at zero is isometric (via the Itô isometry) to the space of square-integrable adapted processes $\phi$. In projection terms, the MRT says that constants plus stochastic integrals exhaust $L^2(\mathcal{F}_T)$: projecting any square-integrable functional of the Brownian path onto that subspace leaves zero residual, so every such functional is reachable. In pricing terms: the option payoff can be projected onto the subspace of self-financing portfolios (stochastic integrals), and the resulting $\phi$ gives the delta hedge. The projection that is conditional expectation produces the option price; the representation given by the MRT produces the hedge. Both are faces of the same Hilbert space geometry.
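A small worked example, chosen here for illustration, shows both objects at once. Take the functional $X = W_T^2$. Itô's formula gives $d(W_t^2) = 2 W_t \, dW_t + dt$, hence

$$
W_T^2 = T + \int_0^T 2 W_s \, dW_s ,
\qquad
M_t = \mathbb{E}[W_T^2 \mid \mathcal{F}_t] = W_t^2 + (T - t) = T + \int_0^t 2 W_s \, dW_s .
$$

The conditional expectation supplies the "price" ($M_0 = T$ at time zero), and the integrand $\phi_s = 2 W_s$ is the representing process that plays the role of the hedge.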