Brownian Bridge™

Setup

Why Market Impact Estimation Matters

Market impact is the price change caused by a trader's own activity. Correctly estimating market impact is essential for:

Transaction cost analysis (TCA): decomposing the implementation shortfall into pre-trade estimate and realised impact to assess execution quality.
Optimal execution: the Almgren-Chriss model requires the impact parameters $\eta$ (temporary) and $\gamma$ (permanent) as inputs; these must be estimated from data.
Portfolio construction: large-scale optimisers (BlackRock Aladdin, Axioma) incorporate impact costs as a non-linear transaction cost penalty; wrong impact estimates lead to over-trading.
Strategy capacity analysis: impact costs grow with trade size, so a strategy's net alpha shrinks as AUM (assets under management) rises and reaches zero at some level; impact estimation determines that capacity limit.

Conventions

Price impact: the change in mid-price attributable to a trade of size $Q$ shares. Measured as a fraction of the pre-trade mid (in basis points) or in ticks.
Participation rate: $\eta_{\mathrm{part}} = Q / V$ , the fraction of average daily volume (ADV) traded. ADV is typically the 20- or 60-day average.
Normalised quantity: $q = Q / (V \cdot \sigma_{\mathrm{daily}}^{1/2})$ in some models (Kyle normalisation).
Convention: impact is measured as the mid-price move from the start of the trade to some point after completion. "Instantaneous" impact is measured at the moment of the last fill; "realised" or "total" impact is measured 15–30 minutes after completion (allowing temporary impact to decay).

Kyle's Lambda: The Linear Impact Model

The Kyle (1985) Model

Kyle (1985) studies a single-period model with one informed trader, one market maker, and uninformed noise traders. The informed trader submits order $x$ (unknown to the market maker); noise traders submit $u \sim \mathcal{N}(0, \sigma_u^2)$ . The market maker observes total order flow $y = x + u$ and sets price $P = \mathbb{E}[V \mid y]$ .

Assumptions:

True value: $V \sim \mathcal{N}(\mu_0, \Sigma_0)$ (within the Kyle model, $V$ denotes the asset's true value, not ADV).
Informed trader maximises expected profit: $\Pi = (V - P)x$ , choosing $x$ optimally.
Market maker is competitive: earns zero expected profit, so the price equals the conditional expectation $P = \mathbb{E}[V \mid y]$.

Equilibrium (Linear): In the unique linear equilibrium:

$P = \mu_0 + \lambda\, y, \qquad \lambda = \frac{1}{2\sigma_u}\sqrt{\Sigma_0},$

where $\lambda > 0$ is Kyle's lambda — the price sensitivity to order flow. The informed trader submits:

$x^* = \frac{V - \mu_0}{2\lambda} = \frac{\sigma_u}{\sqrt{\Sigma_0}}(V - \mu_0),$

which hides in the noise: $x^* \sim \mathcal{N}(0, \sigma_u^2)$ , indistinguishable from noise trading.
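The equilibrium can be checked numerically. The sketch below is a minimal simulation, assuming illustrative parameter values ($\mu_0 = 100$, $\Sigma_0 = 4$, $\sigma_u = 1000$) that are not calibrated to any market; the helper name `kyle_equilibrium` is hypothetical. It computes $\lambda$ from the formula, simulates the informed and noise flows, and confirms that the regression slope of $V$ on $y$ reproduces $\lambda$.

```python
import numpy as np

def kyle_equilibrium(sigma0_sq: float, sigma_u: float) -> tuple[float, float]:
    """Return (lambda, beta) for the single-period Kyle linear equilibrium.

    sigma0_sq is the prior variance Sigma_0 of the true value;
    sigma_u is the standard deviation of noise-trader flow.
    """
    lam = np.sqrt(sigma0_sq) / (2.0 * sigma_u)   # price sensitivity to flow
    beta = 1.0 / (2.0 * lam)                     # informed-trader intensity
    return lam, beta

# Illustrative (assumed) parameters, not calibrated to any market.
mu0, sigma0_sq, sigma_u = 100.0, 4.0, 1_000.0
lam, beta = kyle_equilibrium(sigma0_sq, sigma_u)

rng = np.random.default_rng(0)
n = 200_000
V = rng.normal(mu0, np.sqrt(sigma0_sq), n)   # true value
u = rng.normal(0.0, sigma_u, n)              # noise-trader flow
x = beta * (V - mu0)                         # informed order x*
y = x + u                                    # total flow seen by the market maker

# The regression slope of V on y should reproduce lambda, because the
# market maker prices at E[V | y] = mu0 + lambda * y.
slope = np.cov(V, y)[0, 1] / np.var(y, ddof=1)
print(f"lambda (formula) = {lam:.6f}, slope (simulated) = {slope:.6f}")
print(f"std of informed flow = {x.std():.1f} vs sigma_u = {sigma_u:.1f}")
```

With these inputs the formula gives $\lambda = 0.001$, and the simulated slope and the standard deviation of $x^*$ should agree with $\lambda$ and $\sigma_u$ up to sampling error.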

Kyle's Lambda as a Price Impact Coefficient

In continuous time, Kyle (1985) and subsequent work show that the permanent price impact per unit of signed order flow is:

$\Delta P = \lambda \cdot Q_{\mathrm{signed}},$

where $Q_{\mathrm{signed}}$ is the net signed order flow (positive for buys, negative for sells) and $\lambda$ has units of \$/share$^2$ (that is, \$/share of price move per share of flow).

Estimation from tick data. Regress mid-price changes on signed order flow:

$\Delta M_{t} = \alpha + \lambda \cdot \mathrm{OF}_t + \varepsilon_t,$

where $\mathrm{OF}_t = \sum_{i: t_i \in \Delta t} q_i \cdot \mathrm{sign}(P_i - M_{i-1})$ is the signed order flow in the interval $[t-\Delta t, t]$ (the Lee-Ready rule assigns sign using the trade direction relative to the preceding mid). The slope $\lambda$ is Kyle's lambda for that stock and time period.

Units: for a stock with ADV = 10 million shares and daily volatility of 1%, a typical $\lambda \approx 10^{-7}$ to $10^{-6}$ \$/share$^2$. Trading 1% of ADV (100,000 shares) then moves the price by \$0.01 to \$0.10, which is 1–10 bps for a share price near \$100.
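The unit conversion can be made concrete with a few lines of arithmetic. This is a sketch under one assumption that is not in the text: a share price of \$100, which is needed to turn a dollar move into basis points. The function name `impact_bps` is illustrative.

```python
def impact_bps(lam: float, shares: float, price: float) -> float:
    """Linear impact lambda * Q, expressed in basis points of the price."""
    return lam * shares / price * 1e4

adv = 10_000_000          # shares per day (from the text)
price = 100.0             # assumed share price in dollars (not in the text)
q = 0.01 * adv            # trade 1% of ADV = 100,000 shares

for lam in (1e-7, 1e-6):
    print(f"lambda = {lam:.0e} $/share^2 -> {impact_bps(lam, q, price):.1f} bps")
```

Under that price assumption the two ends of the $\lambda$ range map to roughly 1 bp and 10 bps; a cheaper stock with the same $\lambda$ would show proportionally larger impact in basis points.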

The Square-Root Law

Empirical Evidence

A robust empirical finding across asset classes (equities, futures, FX) is that the expected price impact of a meta-order of size $Q$ follows a square-root law:

$\mathrm{Impact}(Q) \approx \sigma_{\mathrm{daily}} \cdot Y \cdot \sqrt{\frac{Q}{V}},$

where:

$\sigma_{\mathrm{daily}}$ : daily volatility of the asset.
$V$ : average daily volume (ADV).
$Y$ : a dimensionless constant, typically $Y \in [0.5, 1.5]$ (empirically $\approx 1$ for large-cap equities).
$Q/V$ : participation rate (fraction of ADV).

Key references: Almgren et al. (2005), Torre and Ferrari (1997), Grinold and Kahn (1999), Zarinelli et al. (2015).

The square-root law has several remarkable properties:

Universal across markets: the same functional form fits equities, futures, and FX with different $Y$ but the same square-root dependence on quantity.
Concave in $Q$: large orders have sublinear impact; doubling the order size less than doubles the price move. This is consistent with limit order book dynamics: a large order is filled across many price levels, and if the volume available grows with distance from the best quote, each additional price level absorbs more shares than the one before.
Inconsistency with linear models: the linear model ( $\lambda Q$ ) used in Kyle and Almgren-Chriss is an approximation for small $Q/V$ . The empirically correct law is nonlinear.
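The formula and its concavity can be illustrated directly. The sketch below assumes $Y = 1$, an ADV of 10 million shares, and 1% daily volatility; the function name `sqrt_impact_bps` is hypothetical.

```python
import math

def sqrt_impact_bps(q_shares: float, adv: float, sigma_daily: float, y: float = 1.0) -> float:
    """Square-root law impact in basis points: Y * sigma_daily * sqrt(Q / V)."""
    if q_shares < 0 or adv <= 0:
        raise ValueError("q_shares must be non-negative and adv positive")
    return y * sigma_daily * math.sqrt(q_shares / adv) * 1e4

adv, sigma_daily = 10_000_000, 0.01     # assumed: 10M shares ADV, 1% daily vol
for pct in (0.01, 0.02, 0.04):
    print(f"{pct:.0%} of ADV -> {sqrt_impact_bps(pct * adv, adv, sigma_daily):.1f} bps")

# Concavity check: doubling the order size scales impact by sqrt(2), not 2.
ratio = (sqrt_impact_bps(0.02 * adv, adv, sigma_daily)
         / sqrt_impact_bps(0.01 * adv, adv, sigma_daily))
print(f"impact ratio for a doubled order: {ratio:.4f}")
```

Under these assumptions 1% of ADV costs about 10 bps, and quadrupling the order to 4% of ADV only doubles the impact.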

Theoretical Justification

Gabaix et al. (2006) and Farmer et al. (2013) derive the square-root law from order book theory. The argument:

Consider a book with a power-law distribution of order sizes $P[q \geq x] \propto x^{-\alpha}$ with $\alpha \approx 3/2$. A meta-order of size $Q$ must consume all orders up to some depth $d$. The expected cumulative volume available up to depth $d$ scales as $\mathbb{E}[Q_{\mathrm{book}}(d)] \propto d^{1/(\alpha-1)} = d^{2}$ for $\alpha = 3/2$. Inverting: $d \propto Q^{\alpha-1} = Q^{1/2}$. The impact is proportional to $d$, giving $\mathrm{Impact} \propto Q^{1/2}$.

More formally, the concave impact function $h(v) \propto v^\beta$ with $\beta \approx 1/2$ is consistent with market impact theory. Almgren et al. (2005) propose the power-law model:

$h(v) = \eta\, \sigma \left(\frac{v}{V_{\mathrm{ADV}}}\right)^\beta, \qquad \beta \approx 0.6.$

Transient vs Permanent Impact

Decomposition

The total price move after a trade consists of:

Permanent impact: the component that remains permanently in the mid-price after the trade is completed. Attributable to information: the trade reveals private information about the true value.
Temporary (transient) impact: the component that reverts after the trade is completed. Attributable to liquidity costs: the market maker charges a premium for immediacy, which dissipates as new limit orders refill the book.

Over a horizon $\tau$ after completion:

$\Delta P_\tau = \underbrace{\Delta P_\infty}_{\text{permanent}} + \underbrace{(\Delta P_0 - \Delta P_\infty) e^{-\rho\tau}}_{\text{transient, decaying at rate }\rho}.$

In practice: immediate impact $\Delta P_0$ is measured at execution; permanent impact $\Delta P_\infty$ is measured 1–2 hours after completion (when temporary effects have decayed). The transient fraction $(\Delta P_0 - \Delta P_\infty)/\Delta P_0$ is the reversion fraction, typically 40–60% for equities.
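The decay formula is easy to tabulate. The following sketch assumes illustrative numbers (10 bps at the last fill, 5 bps permanent, a 15-minute half-life for the transient part); the function names are hypothetical and the exponential form is the one given above.

```python
import numpy as np

def impact_after(tau, dp0: float, dp_inf: float, rho: float):
    """Impact remaining tau time units after completion (exponential decay)."""
    tau = np.asarray(tau, dtype=float)
    return dp_inf + (dp0 - dp_inf) * np.exp(-rho * tau)

def reversion_fraction(dp0: float, dp_inf: float) -> float:
    """Share of the immediate impact that reverts: (dP_0 - dP_inf) / dP_0."""
    return (dp0 - dp_inf) / dp0

# Assumed numbers: 10 bps at the last fill, 5 bps permanent,
# and a transient half-life of 15 minutes (rho = ln 2 / 15 per minute).
dp0, dp_inf, rho = 10.0, 5.0, np.log(2) / 15.0
minutes = np.array([0, 15, 30, 60, 120])
path = impact_after(minutes, dp0, dp_inf, rho)

for m, p in zip(minutes, path):
    print(f"{m:>3d} min after completion: {p:.2f} bps")
print(f"reversion fraction: {reversion_fraction(dp0, dp_inf):.0%}")
```

With a 15-minute half-life, the remaining impact two hours after completion is within a few hundredths of a basis point of the permanent level, which is why a 1–2 hour horizon is a workable proxy for $\Delta P_\infty$ under these assumptions.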

Estimation from Tick Data

Data Pipeline

To estimate $\lambda$ (or the square-root law constant $Y$ ) from trade and quote data:

Clean tick data: remove outliers and clearly erroneous prints; apply venue-specific adjustments (for US equities, use the NBBO and the consolidated tape).
Trade signing: assign each trade a sign (+1 for buyer-initiated, -1 for seller-initiated) using the Lee-Ready (1991) algorithm:
- Tick rule: if trade price $> P_{\text{prev}}$ , buyer-initiated; if $< P_{\text{prev}}$ , seller-initiated.
- Quote rule: if trade price $> M$ , buyer-initiated; if $< M$ , seller-initiated.
- Combine: quote rule takes precedence; tick rule used when trade is at the mid.
Aggregate to meta-orders: identify sequences of trades on the same side likely belonging to a single institutional order (e.g., using the Almgren-Tetlock-Atkins algorithm, which clusters consecutive same-side trades within a time window with low volume between them).
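The trade-signing step can be sketched as follows. This is a simplified illustration of the quote-rule-then-tick-rule logic described above, not a full Lee-Ready implementation: it assumes each trade is already matched to the mid prevailing just before it, and it resolves zero ticks by carrying the last non-zero price direction forward (a common convention that the text does not specify). The function name `lee_ready_sign` is hypothetical.

```python
import numpy as np

def lee_ready_sign(prices: np.ndarray, mids: np.ndarray) -> np.ndarray:
    """Sign trades: +1 buyer-initiated, -1 seller-initiated, 0 if undetermined.

    prices[i] is the trade price; mids[i] is the prevailing mid just before
    trade i. The quote rule is applied first; trades exactly at the mid fall
    back to the tick rule (direction of the last non-zero price change).
    """
    prices = np.asarray(prices, dtype=float)
    mids = np.asarray(mids, dtype=float)

    quote = np.sign(prices - mids)                   # quote rule

    tick = np.zeros(len(prices))
    tick[1:] = np.sign(np.diff(prices))              # tick rule, raw
    for i in range(1, len(tick)):                    # zero tick: carry the
        if tick[i] == 0:                             # previous direction forward
            tick[i] = tick[i - 1]

    return np.where(quote != 0, quote, tick).astype(int)

prices = np.array([10.01, 10.01, 10.00, 9.99, 10.00])
mids   = np.array([10.00, 10.00, 10.00, 10.00, 10.00])
sizes  = np.array([100, 200, 300, 100, 50])

signs = lee_ready_sign(prices, mids)
signed_flow = int(np.sum(signs * sizes))   # net signed order flow for the interval
print(signs, signed_flow)
```

The third and fifth trades print at the mid, so the quote rule is silent and the tick rule decides: a downtick signs the third trade as a sell and an uptick signs the fifth as a buy. Summing `signs * sizes` over an interval gives the $\mathrm{OF}_t$ regressor used in the lambda regression.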

OLS Regression for Kyle's Lambda

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

def estimate_kyle_lambda(
    mid_prices: np.ndarray,     # mid-price time series
    signed_flow: np.ndarray,    # signed order flow per interval (shares, signed)
    interval_sec: float = 60.0  # interval length in seconds
) -> dict:
    """
    Estimate Kyle's lambda from tick data via OLS:
        delta_M = alpha + lambda * OF + epsilon

    Inputs:
        mid_prices:  array of mid-prices, shape (N+1,)
        signed_flow: array of signed order flow per interval, shape (N,)
        interval_sec: duration of each interval in seconds (not used in the
                      fit; it documents the horizon to which lambda applies)

    Returns:
        dict with keys: lambda_, alpha, r_squared, t_stat_lambda
    """
    delta_M = np.diff(mid_prices)   # N observations
    X       = signed_flow.reshape(-1, 1)
    y       = delta_M

    model   = LinearRegression(fit_intercept=True).fit(X, y)
    lam     = float(model.coef_[0])
    alpha   = float(model.intercept_)
    r_sq    = float(model.score(X, y))

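    # t-statistic for lambda: OLS standard error with n - 2 degrees of freedom
    residuals  = y - model.predict(X)
    n          = len(y)
    x_centred  = X.ravel() - X.mean()
    se_lam     = np.sqrt(np.sum(residuals**2) / (n - 2)) / np.sqrt(np.sum(x_centred**2))
    t_stat     = lam / se_lam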

    return dict(lambda_=lam, alpha=alpha, r_squared=r_sq, t_stat_lambda=t_stat)


def estimate_sqrt_impact(
    meta_orders: pd.DataFrame,
    # Columns: quantity (shares), adv (shares/day), daily_vol (fraction),
    #          impact_bps (measured impact in basis points)
) -> dict:
    """
    Estimate the square-root impact law: impact = Y * sigma * sqrt(Q/V)
    via OLS on log-transformed data.

    Model: log(impact / sigma) = log(Y) + 0.5 * log(Q/V)
    Estimated exponent beta (free, not constrained to 0.5).
    """
    participation = meta_orders['quantity'] / meta_orders['adv']
    normalised_impact = meta_orders['impact_bps'] / (meta_orders['daily_vol'] * 1e4)

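    # OLS in log space. Drop non-positive observations rather than clipping:
    # clipping to 1e-8 would create extreme log outliers that distort the fit.
    # Note that dropping negative-impact orders is itself a selection effect.
    valid  = (participation.values > 0) & (normalised_impact.values > 0)
    log_x  = np.log(participation.values[valid])
    log_y  = np.log(normalised_impact.values[valid])

    X      = np.column_stack([np.ones_like(log_x), log_x])
    beta_hat = np.linalg.lstsq(X, log_y, rcond=None)[0]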

    log_Y  = beta_hat[0]
    beta   = beta_hat[1]   # should be ~0.5 if square-root law holds
    Y      = np.exp(log_Y)

    return dict(Y=Y, beta=beta,
                model_str=f"impact = {Y:.3f} * sigma * (Q/V)^{beta:.3f}")

Bias Corrections

Adverse selection bias. Large trades are more likely to come from informed traders, who trade when they hold private information, that is, when the market is about to move in their direction anyway. Regressing impact on quantity conflates the informational component (not caused by the trade) with the mechanical component (caused by the trade). Correction: use instrumental variables (IV) or restrict the sample to liquidity-motivated trades.

Bid-ask bounce. Alternating buys and sells make transaction prices oscillate between the bid and the ask, which creates artificial negative autocorrelation in transaction-price changes. Use mid-price (not transaction price) changes and trade-signed flow (not tick direction) to mitigate this.

Time aggregation. Impact measured over different horizons (15 min, 1 hour, 1 day) gives different $\lambda$ estimates due to impact decay. Always state the measurement horizon explicitly.

Limitations

Model identification. The square-root law is a reduced-form relationship; it does not identify the mechanism (permanent vs temporary) or separate information from liquidity. Structural models (Kyle, Glosten-Milgrom) separate these but require strong assumptions.

Non-stationarity. Kyle's lambda varies with market conditions: it is low in liquid markets (tight spreads, deep books) and high in stressed markets (wide spreads, thin books). A single estimated $\lambda$ is a time-average that misrepresents the dynamic nature of impact.
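One simple way to expose this time variation is a rolling-window estimate. The sketch below uses synthetic data in which $\lambda$ is assumed to jump from $10^{-7}$ to $5 \times 10^{-7}$ halfway through the sample; the function name `rolling_kyle_lambda`, the window length, and all parameter values are illustrative assumptions. The rolling slope is computed as rolling covariance over rolling variance, which equals the OLS slope with an intercept within each window.

```python
import numpy as np
import pandas as pd

def rolling_kyle_lambda(delta_mid: pd.Series, signed_flow: pd.Series, window: int) -> pd.Series:
    """Rolling OLS slope of mid-price changes on signed flow: Cov / Var."""
    cov = delta_mid.rolling(window).cov(signed_flow)
    var = signed_flow.rolling(window).var()
    return cov / var

# Synthetic data (assumed): lambda jumps from 1e-7 to 5e-7 halfway through,
# mimicking a shift from a liquid to a stressed regime.
rng = np.random.default_rng(1)
n = 2_000
flow = pd.Series(rng.normal(0.0, 5_000.0, n))
true_lam = np.where(np.arange(n) < n // 2, 1e-7, 5e-7)
noise = rng.normal(0.0, 2e-4, n)
d_mid = pd.Series(true_lam * flow.values + noise)

lam_t = rolling_kyle_lambda(d_mid, flow, window=250)
pooled = np.polyfit(flow.values, d_mid.values, 1)[0]   # single full-sample slope

print(f"rolling lambda, end of calm regime:     {lam_t.iloc[n // 2 - 1]:.2e}")
print(f"rolling lambda, end of stressed regime: {lam_t.iloc[-1]:.2e}")
print(f"pooled full-sample lambda:              {pooled:.2e}")
```

The pooled slope lands between the two regime values and describes neither, which is the sense in which a single time-averaged $\lambda$ misrepresents impact.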

Simultaneity bias. Large market participants affect prices by their trades, but they also respond to prices. Regression of price change on signed flow has a simultaneity bias: the trader accelerates execution when the price moves against them (buying faster as the price rises), creating a positive bias in the estimated $\lambda$ .

Meta-order identification. Correctly identifying the boundaries of a single institutional order from exchange-level trade data is an unsolved problem. Different identification algorithms produce different estimates of $Q$ (the order size) and therefore different estimates of the impact function. Almgren, Tetlock, and Atkins (2005) provide one approach; the practical accuracy is limited.

Cross-impact. In a multi-asset portfolio, trading asset $i$ can impact the price of correlated asset $j$ . This cross-impact is often ignored in single-asset models but is material for ETF trading, pairs strategies, and multi-asset execution.

Interview Angle

L1. What is Kyle's lambda? Give the formula and explain its units. How would you estimate it from Level-2 tick data for a liquid equity? What biases might affect your estimate?

Kyle's lambda: the coefficient relating signed order flow to permanent price impact, $\Delta P = \lambda \cdot \mathrm{OF}$. Units: \$/share$^2$, that is, dollars of price move per share of net signed flow (each share of net buying raises the price by $\lambda$ dollars). Estimation: regress 1-minute mid-price changes on 1-minute net signed order flow; $\lambda$ is the OLS slope. Biases: adverse selection bias (informed traders trade when prices are about to move anyway, so part of the measured move is not caused by the trade); bid-ask bounce (transaction price oscillation inflates impact estimates if transaction prices are used instead of mids); time aggregation (short-horizon $\lambda$ is higher than long-horizon $\lambda$ because part of the impact reverts).

L2. State the square-root law of market impact. Derive the optimal temporary impact cost in the Almgren-Chriss framework if $h(v) \propto \sigma\sqrt{v/V}$ (square-root, not linear). Why does the square-root form make sense from a limit order book perspective?

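Square-root law: $\mathrm{Impact}(Q) \approx Y\sigma_{\mathrm{daily}}\sqrt{Q/V}$. Temporary impact cost is $\int v\,h(v)\,dt$. For the linear AC model, $h(v) = \eta v$ gives cost $\eta\int v^2\, dt$; for square-root impact, $h(v) = \eta\sigma\sqrt{v/V}$ gives cost $\eta\sigma V^{-1/2}\int v^{3/2}\, dt$, which is still convex in the trading rate $v$ but less steeply than the quadratic case. The AC optimisation with $h(v) \propto v^\beta$ trades off $\int v^{1+\beta}\, dt$ against the risk penalty $\lambda\sigma^2\int x^2\, dt$ (here $\lambda$ is the risk-aversion parameter, not Kyle's lambda, and $x$ is the remaining position). Because the integrand does not depend on $t$ explicitly, the Euler-Lagrange equation has a first integral, $v^{1+\beta} \propto x^2 + c$, with $c \geq 0$ a constant fixed by the requirement $x(T) = 0$. The trading rate therefore rises with remaining inventory, so the schedule is front-loaded relative to TWAP. In the long-horizon limit $c \to 0$, $v \propto x^{2/(1+\beta)}$, which for $\beta = 1/2$ gives $v \propto x^{4/3}$ and a power-law decay $x^*(t) \propto (1 + t/t_0)^{-3}$ rather than the exponential decay of the linear case. LOB intuition: the book has finite depth at each price. The first $\Delta Q$ shares execute at the best ask (cheapest); subsequent shares execute at successively worse prices. If order sizes follow a power law with $\alpha \approx 3/2$, the cumulative volume within price distance $d$ scales as $d^2$, so the price move needed to absorb $Q$ shares scales as $Q^{1/2}$.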

L3. Explain the permanent vs temporary decomposition of market impact. How would you estimate each component from execution data? Discuss the implications for the Almgren-Chriss framework: what does it mean if the permanent impact is close to zero, and what does it mean if it is large?

Decomposition: $\Delta P_\tau = \Delta P_\infty + (\Delta P_0 - \Delta P_\infty)e^{-\rho\tau}$ . Measure $\Delta P_0$ (impact at last fill) and $\Delta P_\infty$ (impact 1 hour later, when temporary has decayed). The permanent fraction $f = \Delta P_\infty / \Delta P_0$ and transient fraction $1 - f$ are estimated from a sample of meta-orders.

Implications: If $\Delta P_\infty \approx 0$ (nearly all impact is transient): the trade contained little information; the market maker's premium reverts as new limit orders refill the book. Execution cost is primarily temporary. This favours patient execution (TWAP or slower) because temporary impact decays. If $\Delta P_\infty \approx \Delta P_0$ (nearly all impact is permanent): the trade is informative; the new price level reflects updated fair value. Patient execution is dangerous, because waiting exposes the trader to the permanent price move on the unexecuted residual. In Almgren-Chriss, the permanent parameter $\gamma$ captures this cost. With linear permanent impact the total permanent cost is $\tfrac{1}{2}\gamma X^2$ for an order of size $X$, independent of the schedule, so a large $\gamma$ raises the cost of the whole order but does not by itself change the optimal trajectory; urgency enters through the risk-aversion term and through any expected price drift against the unexecuted position.