Setup
Why Market Impact Estimation Matters
Market impact is the price change caused by a trader's own activity. Correctly estimating market impact is essential for:
- Transaction cost analysis (TCA): decomposing the implementation shortfall into pre-trade estimate and realised impact to assess execution quality.
- Optimal execution: the Almgren-Chriss model requires the impact parameters $\eta$ (temporary) and $\gamma$ (permanent) as inputs; these must be estimated from data.
- Portfolio construction: large-scale optimisers (BlackRock Aladdin, Axioma) incorporate impact costs as a non-linear transaction cost penalty; wrong impact estimates lead to over-trading.
- Strategy capacity analysis: impact costs grow with traded size, so above a certain AUM (assets under management) they consume a strategy's alpha entirely and the strategy cannot scale further; impact estimation defines this capacity limit.
Conventions
- Price impact: the change in mid-price attributable to a trade of size $Q$ shares. Measured as a fraction of the pre-trade mid (in basis points) or in ticks.
- Participation rate: $Q/V$, the fraction of average daily volume (ADV, denoted $V$) traded. ADV is typically the 20- or 60-day average.
- Normalised quantity: in some models (Kyle normalisation).
- Convention: impact is measured as the mid-price move from the start of the trade to some point after completion. "Instantaneous" impact is measured at the moment of the last fill; "realised" or "total" impact is measured 15–30 minutes after completion (allowing temporary impact to decay).
Kyle's Lambda: The Linear Impact Model
The Kyle (1985) Model
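Kyle (1985) studies a single-period model with one informed trader, one market maker, and uninformed noise traders. The informed trader submits order $x$ (unknown to the market maker); noise traders submit $u \sim N(0, \sigma_u^2)$. The market maker observes total order flow $y = x + u$ and sets price $p$.
Assumptions:
- True value: $v \sim N(p_0, \sigma_v^2)$.
- Informed trader maximises expected profit: $E[(v - p)\,x \mid v]$, choosing $x$ optimally.
- Market maker is competitive: sets $p = E[v \mid y]$, so the price is a martingale and the market maker earns zero expected profit.
Equilibrium (Linear): In the unique linear equilibrium:
$$p = p_0 + \lambda y, \qquad \lambda = \frac{\sigma_v}{2\sigma_u}$$
where $\lambda$ is Kyle's lambda, the price sensitivity to order flow. The informed trader submits:
$$x = \frac{\sigma_u}{\sigma_v}\,(v - p_0)$$
which hides in the noise: $\mathrm{Var}(x) = \sigma_u^2$, so informed flow is statistically indistinguishable from noise trading.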
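A quick simulation can make the linear equilibrium concrete. The sketch below assumes the standard closed-form solution of the single-period model, $\lambda = \sigma_v / (2\sigma_u)$ with informed demand $x = (\sigma_u/\sigma_v)(v - p_0)$; the function name and all parameter values are illustrative, not taken from any data set.

```python
import numpy as np

def simulate_kyle(p0=100.0, sigma_v=2.0, sigma_u=1000.0, n=200_000, seed=0):
    """Monte Carlo check of the single-period Kyle equilibrium (illustrative values)."""
    rng = np.random.default_rng(seed)
    v = p0 + sigma_v * rng.standard_normal(n)   # true value
    u = sigma_u * rng.standard_normal(n)        # noise-trader flow
    x = (sigma_u / sigma_v) * (v - p0)          # informed demand in the linear equilibrium
    y = x + u                                   # total order flow seen by the market maker

    lam_theory = sigma_v / (2.0 * sigma_u)
    # The competitive price is E[v | y] = p0 + lambda * y, so lambda is the
    # regression slope of (v - p0) on y.
    lam_hat = np.cov(v - p0, y)[0, 1] / np.var(y, ddof=1)
    return dict(lam_theory=lam_theory, lam_hat=lam_hat,
                var_x=np.var(x), var_u=np.var(u))

res = simulate_kyle()
print(res)
```

The estimated slope should sit close to the theoretical value, and the variance of informed flow should match the variance of noise flow, which is the sense in which the informed trader "hides in the noise".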
Kyle's Lambda as a Price Impact Coefficient
In continuous time, Kyle (1985) and subsequent work show that the permanent price impact per unit of signed order flow is:
$$\Delta p = \lambda \cdot OF$$
where $OF$ is the net signed order flow (positive for buys, negative for sells) and $\lambda$ has units of \$/share$^2$ (that is, \$/share per share of flow).
Estimation from tick data. Regress mid-price changes on signed order flow:
$$\Delta M_t = \alpha + \lambda \, OF_t + \varepsilon_t$$
where $OF_t$ is the signed order flow in interval $t$ (the Lee-Ready rule assigns each trade's sign from its price relative to the prevailing mid). The slope $\hat{\lambda}$ is Kyle's lambda for that stock and time period.
Magnitude: for a stock with ADV = 10 million shares and daily volatility of 1%, typical values of $\lambda$ are consistent with 1–10 bps of impact per 1% of ADV traded.
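To see what an impact of 1–10 bps per 1% of ADV means in the units of Kyle's lambda, the conversion can be sketched as below. The share price of \$50 is a purely hypothetical assumption (the text does not give one), and the helper name is illustrative.

```python
def lambda_from_bps(impact_bps, pct_adv, adv_shares, price):
    """Implied Kyle's lambda in $/share per share of signed flow.

    impact_bps : price move in basis points
    pct_adv    : order size as a percentage of ADV
    adv_shares : average daily volume in shares
    price      : share price in dollars (hypothetical input)
    """
    price_move = impact_bps * 1e-4 * price      # dollars per share
    shares = pct_adv / 100.0 * adv_shares       # shares traded
    return price_move / shares

# ADV = 10 million shares as in the text; price = $50 is an assumed value.
lam_low = lambda_from_bps(1.0, 1.0, 10_000_000, 50.0)    # about 5e-8
lam_high = lambda_from_bps(10.0, 1.0, 10_000_000, 50.0)  # about 5e-7
print(lam_low, lam_high)
```

Because the result scales linearly with the assumed price, a different price level shifts both bounds proportionally.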
The Square-Root Law
Empirical Evidence
A robust empirical finding across asset classes (equities, futures, FX) is that the expected price impact of a meta-order of size $Q$ follows a square-root law:
$$I(Q) = Y \, \sigma \, \sqrt{\frac{Q}{V}}$$
where:
- $\sigma$: daily volatility of the asset.
- $V$: average daily volume (ADV).
- $Y$: a dimensionless constant of order one, estimated empirically (for example on large-cap equities).
- $Q/V$: participation rate (fraction of ADV).
Key references: Almgren et al. (2005), Torre and Ferrari (1997), Grinold and Kahn (1999), Zarinelli et al. (2015).
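As a worked illustration of the formula, the sketch below evaluates the square-root law for a given order. The default $Y = 1$ is an assumption chosen for illustration only; in practice the constant must be fitted to data.

```python
import math

def sqrt_law_impact_bps(quantity, adv, daily_vol, Y=1.0):
    """Expected impact in basis points under I = Y * sigma * sqrt(Q / V).

    quantity  : meta-order size Q in shares
    adv       : average daily volume V in shares
    daily_vol : daily volatility sigma as a fraction (0.01 = 1%)
    Y         : dimensionless constant (Y = 1.0 is an illustrative assumption)
    """
    return 1e4 * Y * daily_vol * math.sqrt(quantity / adv)

# 1% of ADV in a stock with 1% daily volatility
i1 = sqrt_law_impact_bps(100_000, 10_000_000, 0.01)
# Doubling the order size multiplies impact by sqrt(2), not by 2
i2 = sqrt_law_impact_bps(200_000, 10_000_000, 0.01)
print(i1, i2, i2 / i1)
```

With these inputs the first order costs about 10 bps, and doubling it raises impact by a factor of roughly 1.41, which is the concavity discussed in the properties that follow.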
The square-root law has several remarkable properties:
- Universal across markets: the same functional form fits equities, futures, and FX with different $Y$ but the same square-root dependence on quantity.
- Concave in $Q$: large orders have sublinear impact, so doubling the order size less than doubles the price move. This is consistent with limit order book dynamics: large orders are filled across many price levels; the first shares hit the best ask, and subsequent shares walk through successively deeper levels of the book.
- Inconsistency with linear models: the linear model ($I \propto Q$) used in Kyle and Almgren-Chriss is an approximation for small $Q$. The empirically observed law is nonlinear.
Theoretical Justification
Gabaix et al. (2006) and Farmer et al. (2013) derive the square-root law from order book theory. The argument:
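Consider a book in which resting order volume follows a power law. A meta-order of size $Q$ must consume all orders up to some depth $d$ from the best quote. If the cumulative volume available within depth $d$ scales as $d^2$, the depth required satisfies $Q \propto d^2$. Inverting: $d \propto \sqrt{Q}$. The impact is proportional to $d$, giving $I \propto \sqrt{Q}$.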
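More formally, the concave impact function $I \propto (Q/V)^{\beta}$ with $\beta < 1$ is consistent with market impact theory. Almgren et al. (2005) propose a power-law model of the form:
$$I(Q) = Y \, \sigma \left(\frac{Q}{V}\right)^{\beta}$$
in which $\beta = 1/2$ recovers the square-root law.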
Transient vs Permanent Impact
Decomposition
The total price move after a trade consists of:
- Permanent impact: the component that remains in the mid-price after the trade is completed. Attributable to information: the trade reveals private information about the true value.
- Temporary (transient) impact: the component that reverts after the trade is completed. Attributable to liquidity costs: the market maker charges a premium for immediacy, which dissipates as new limit orders refill the book.
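Over a horizon $\tau$ after completion:
$$I(\tau) = I_{\text{perm}} + I_{\text{temp}}(\tau), \qquad I_{\text{temp}}(\tau) \to 0 \text{ as } \tau \text{ grows}$$
In practice: immediate impact $I_0$ is measured at execution; permanent impact $I_\infty$ is measured 1–2 hours after completion (when temporary effects have decayed). The transient fraction $(I_0 - I_\infty)/I_0$ is the reversion fraction, typically 40–60% for equities.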
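The decomposition can be estimated from a sample of meta-orders with two impact measurements each: one at the last fill and one after the temporary component has decayed. The sketch below is a minimal version under that assumption; the function name and the sample numbers are hypothetical.

```python
import numpy as np

def impact_decomposition(impact_at_last_fill_bps, impact_after_decay_bps):
    """Permanent and transient fractions from paired impact measurements.

    Both inputs are signed impacts in bps, measured in the direction of the
    trade (positive = price moved against the trader).
    """
    i0 = np.asarray(impact_at_last_fill_bps, dtype=float)
    i_inf = np.asarray(impact_after_decay_bps, dtype=float)
    if i0.shape != i_inf.shape:
        raise ValueError("inputs must have the same shape")
    # Ratio of sample means: less sensitive to orders with near-zero
    # immediate impact than the mean of per-order ratios.
    permanent_fraction = i_inf.mean() / i0.mean()
    return dict(permanent_fraction=permanent_fraction,
                transient_fraction=1.0 - permanent_fraction)

# Hypothetical sample: impact halves after the decay window
fractions = impact_decomposition([10.0, 20.0, 30.0], [5.0, 10.0, 15.0])
print(fractions)
```

Using the ratio of means is a design choice rather than a requirement; per-order ratios can be used instead when immediate impacts are reliably far from zero.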
Estimation from Tick Data
Data Pipeline
To estimate $\lambda$ (or the square-root law constant $Y$) from trade and quote data:
- Clean tick data: remove outliers (clearly erroneous prints) and apply exchange-specific adjustments (NBBO for US equities, consolidated tape).
- Trade signing: assign each trade a sign (+1 for buyer-initiated, -1 for seller-initiated) using the Lee-Ready (1991) algorithm:
  - Tick rule: if trade price > previous trade price, buyer-initiated; if trade price < previous trade price, seller-initiated.
  - Quote rule: if trade price > prevailing mid, buyer-initiated; if trade price < prevailing mid, seller-initiated.
  - Combine: the quote rule takes precedence; the tick rule is used when the trade is at the mid.
- Aggregate to meta-orders: identify sequences of trades on the same side likely belonging to a single institutional order (e.g., using the Almgren-Tetlock-Atkins algorithm, which clusters consecutive same-side trades within a time window with low volume between them).
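The trade-signing step can be sketched as follows. This is a simplified version of the Lee-Ready procedure that assumes the bid and ask passed in are already the quotes prevailing at each trade (quote lagging and timestamp alignment are left out), and the function name is illustrative.

```python
import numpy as np

def lee_ready_sign(trade_price, bid, ask):
    """Sign trades: +1 buyer-initiated, -1 seller-initiated, 0 if undetermined.

    Quote rule first (trade price vs prevailing mid); tick rule as a
    fallback for trades executed exactly at the mid.
    """
    trade_price = np.asarray(trade_price, dtype=float)
    mid = (np.asarray(bid, dtype=float) + np.asarray(ask, dtype=float)) / 2.0

    # Quote rule: above mid -> buy, below mid -> sell, at mid -> 0 for now
    sign = np.sign(trade_price - mid)

    # Tick rule: sign of the last non-zero trade-price change
    tick = np.sign(np.diff(trade_price, prepend=trade_price[0]))
    for i in range(1, len(tick)):
        if tick[i] == 0:
            tick[i] = tick[i - 1]   # carry forward the previous direction

    at_mid = sign == 0
    sign[at_mid] = tick[at_mid]     # stays 0 if no earlier price change exists
    return sign.astype(int)

# Hypothetical prints; prices chosen so that mids are exact
prices = [101.0, 100.0, 101.0, 101.0]
bids = [99.0, 99.0, 100.0, 100.0]
asks = [101.0, 101.0, 102.0, 102.0]
signs = lee_ready_sign(prices, bids, asks)
print(signs)
```

Signed order flow per interval is then the sum of sign times trade size within the interval.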
OLS Regression for Kyle's Lambda

    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LinearRegression


    def estimate_kyle_lambda(
        mid_prices: np.ndarray,      # mid-price time series, shape (N+1,)
        signed_flow: np.ndarray,     # signed order flow per interval (shares), shape (N,)
        interval_sec: float = 60.0,  # interval length in seconds
    ) -> dict:
        """
        Estimate Kyle's lambda from tick data via OLS:
            delta_M = alpha + lambda * OF + epsilon

        Inputs:
            mid_prices:   array of mid-prices, shape (N+1,)
            signed_flow:  array of signed order flow per interval, shape (N,)
            interval_sec: duration of each interval; kept for reference only,
                          it does not enter the regression

        Returns:
            dict with keys: lambda_, alpha, r_squared, t_stat_lambda
        """
        delta_M = np.diff(mid_prices)  # N mid-price changes
        if len(delta_M) != len(signed_flow):
            raise ValueError("signed_flow must have one fewer element than mid_prices")
        X = np.asarray(signed_flow, dtype=float).reshape(-1, 1)
        y = delta_M

        model = LinearRegression(fit_intercept=True).fit(X, y)
        lam = float(model.coef_[0])
        alpha = float(model.intercept_)
        r_sq = float(model.score(X, y))

        # Classical OLS t-statistic for the slope (homoskedastic errors assumed)
        residuals = y - model.predict(X)
        n = len(y)
        sigma2 = np.sum(residuals ** 2) / (n - 2)   # residual variance, 2 fitted parameters
        sxx = np.sum((X.ravel() - X.mean()) ** 2)   # sum of squared deviations of the flow
        se_lam = np.sqrt(sigma2 / sxx)
        t_stat = float(lam / se_lam)

        return dict(lambda_=lam, alpha=alpha, r_squared=r_sq, t_stat_lambda=t_stat)


    def estimate_sqrt_impact(
        meta_orders: pd.DataFrame,
        # Columns: quantity (shares), adv (shares/day), daily_vol (fraction),
        #          impact_bps (measured impact in basis points)
    ) -> dict:
        """
        Estimate the power-law impact model impact = Y * sigma * (Q/V)^beta
        via OLS on log-transformed data.

        Model: log(impact / sigma) = log(Y) + beta * log(Q/V)
        The exponent beta is left free (not constrained to 0.5), so the fit
        tests the square-root law rather than assuming it.
        """
        participation = meta_orders['quantity'] / meta_orders['adv']
        normalised_impact = meta_orders['impact_bps'] / (meta_orders['daily_vol'] * 1e4)

        # Logs need positive values: drop meta-orders with non-positive impact
        # or participation rather than clipping them, which would distort the fit.
        valid = (participation > 0) & (normalised_impact > 0)
        log_x = np.log(participation[valid].values)
        log_y = np.log(normalised_impact[valid].values)

        # OLS in log space: intercept = log(Y), slope = beta
        X = np.column_stack([np.ones_like(log_x), log_x])
        beta_hat = np.linalg.lstsq(X, log_y, rcond=None)[0]
        Y = float(np.exp(beta_hat[0]))
        beta = float(beta_hat[1])  # close to 0.5 if the square-root law holds

        return dict(Y=Y, beta=beta,
                    model_str=f"impact = {Y:.3f} * sigma * (Q/V)^{beta:.3f}")

Bias Corrections
Adverse selection bias. Large trades are more likely to come from informed traders, who trade when they hold private information, i.e., when the market is about to move in their direction anyway. Regressing impact on quantity conflates the informational component (not caused by the trade) with the mechanical component (caused by the trade). Correction: use instrumental variables (IV) or restrict the sample to liquidity-motivated trades.
Bid-ask bounce. Alternating buys and sells create artificial negative autocorrelation in transaction-price changes, because prints bounce between the bid and the ask. Use mid-price (not transaction price) changes and trade-signed flow (not tick direction) to mitigate this.
Time aggregation. Impact measured over different horizons (15 min, 1 hour, 1 day) gives different estimates due to impact decay. Always state the measurement horizon explicitly.
Limitations
Model identification. The square-root law is a reduced-form relationship; it does not identify the mechanism (permanent vs temporary) or separate information from liquidity. Structural models (Kyle, Glosten-Milgrom) separate these but require strong assumptions.
Non-stationarity. Kyle's lambda varies with market conditions: it is low in liquid markets (tight spreads, deep books) and high in stressed markets (wide spreads, thin books). A single estimated $\hat{\lambda}$ is a time-average that misrepresents the dynamic nature of impact.
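One simple way to expose this time variation is a rolling-window estimate. The sketch below computes the OLS slope of mid-price changes on signed flow over a moving window as rolling covariance divided by rolling variance; the window length and the two-regime synthetic data are illustrative assumptions, not a recommended calibration.

```python
import numpy as np
import pandas as pd

def rolling_kyle_lambda(delta_mid: pd.Series, signed_flow: pd.Series,
                        window: int = 200) -> pd.Series:
    """Rolling OLS slope of delta_mid on signed_flow (intercept included)."""
    cov = delta_mid.rolling(window).cov(signed_flow)
    var = signed_flow.rolling(window).var()
    return cov / var

# Synthetic data with a regime shift: lambda jumps from 1e-6 to 5e-6
rng = np.random.default_rng(1)
n = 2000
flow = pd.Series(rng.normal(0.0, 1000.0, n))
lam_true = np.where(np.arange(n) < n // 2, 1e-6, 5e-6)
d_mid = pd.Series(lam_true * flow.values + rng.normal(0.0, 1e-4, n))

lam_roll = rolling_kyle_lambda(d_mid, flow, window=200)
print(lam_roll.iloc[999], lam_roll.iloc[1999])
```

A full-sample regression on the same data would report a single slope between the two regime values, which is the averaging problem described above.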
Simultaneity bias. Large market participants move prices by trading, but they also respond to prices. A regression of price change on signed flow therefore suffers from simultaneity bias: a trader who accelerates execution when the price moves against them (buying faster as the price rises) creates a positive bias in the estimated $\hat{\lambda}$.
Meta-order identification. Correctly identifying the boundaries of a single institutional order from exchange-level trade data is an unsolved problem. Different identification algorithms produce different estimates of $Q$ (the order size) and therefore different estimates of the impact function. Almgren, Tetlock, and Atkins (2005) provide one approach; the practical accuracy is limited.
Cross-impact. In a multi-asset portfolio, trading asset $i$ can impact the price of correlated asset $j$. This cross-impact is often ignored in single-asset models but is material for ETF trading, pairs strategies, and multi-asset execution.
Interview Angle
L1. What is Kyle's lambda? Give the formula and explain its units. How would you estimate it from Level-2 tick data for a liquid equity? What biases might affect your estimate?
Kyle's lambda: the coefficient relating signed order flow to permanent price impact: $\Delta p = \lambda \cdot OF$, where $\lambda$ has units of \$/share per share (price per quantity). Estimation: regress 1-minute mid-price changes on 1-minute net signed order flow; $\hat{\lambda}$ is the OLS slope. Units: \$/share per share of flow, i.e. \$/share$^2$ (each share of net buy flow raises the price by $\lambda$ dollars per share). Biases: adverse selection bias (informed traders trade when prices are about to move anyway, so part of the measured move is not caused by the trade); bid-ask bounce (transaction price oscillation inflates impact estimates); time-aggregation (short-horizon $\lambda$ is higher than long-horizon $\lambda$ due to impact reversion).
L2. State the square-root law of market impact. Derive the optimal temporary impact cost in the Almgren-Chriss framework if $h(v) = \eta \sqrt{v}$ (square-root, not linear). Why does the square-root form make sense from a limit order book perspective?
Square-root law: $I(Q) = Y \sigma \sqrt{Q/V}$. For the linear AC model, temporary impact $h(v) = \eta v$ at a constant trading rate $v = Q/T$ gives cost $\eta Q^2 / T$; for square-root impact, $h(v) = \eta \sqrt{v}$ gives cost $\eta Q^{3/2} / \sqrt{T}$, so the cost per share is concave in the trading rate. The AC optimisation with $h(v) = \eta \sqrt{v}$ gives a trade-off between expected impact cost and timing risk (the variance of execution cost); with positive risk aversion the optimal schedule is more front-loaded than TWAP. LOB intuition: the book has finite depth. The first shares sit at the best ask (cheapest); subsequent shares sit at successively worse prices. If order sizes follow a power law, the price move needed to consume quantity $Q$ scales as $\sqrt{Q}$, a square root of quantity consumed.
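The difference between linear and square-root temporary impact can be checked numerically for a constant-rate (TWAP) schedule. The sketch below assumes a temporary impact function of the form $h(v) = \eta v^{\beta}$, where $v$ is the trading rate; the value of $\eta$ and the order parameters are hypothetical, and the units of $\eta$ differ between the two cases.

```python
import numpy as np

def twap_temporary_cost(Q, T, eta, beta, n_steps=1000):
    """Temporary impact cost of a constant-rate schedule with h(v) = eta * v**beta.

    Cost = sum over steps of (shares traded in step) * (per-share impact).
    """
    dt = T / n_steps
    v = np.full(n_steps, Q / T)               # constant trading rate
    per_share_impact = eta * v ** beta
    return float(np.sum(v * dt * per_share_impact))

Q, T, eta = 1_000_000.0, 1.0, 1e-6            # hypothetical order and coefficient

cost_linear = twap_temporary_cost(Q, T, eta, beta=1.0)   # eta * Q^2 / T
cost_sqrt = twap_temporary_cost(Q, T, eta, beta=0.5)     # eta * Q^1.5 / sqrt(T)

# Doubling the order size: x4 under linear impact, x2^1.5 under square-root impact
ratio_linear = twap_temporary_cost(2 * Q, T, eta, 1.0) / cost_linear
ratio_sqrt = twap_temporary_cost(2 * Q, T, eta, 0.5) / cost_sqrt
print(cost_linear, cost_sqrt, ratio_linear, ratio_sqrt)
```

The doubling ratios make the scaling visible: total cost grows more slowly with order size under square-root impact than under linear impact.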
L3. Explain the permanent vs temporary decomposition of market impact. How would you estimate each component from execution data? Discuss the implications for the Almgren-Chriss framework: what does it mean if the permanent impact is close to zero, and what does it mean if it is large?
Decomposition: $I_{\text{total}} = I_{\text{perm}} + I_{\text{temp}}$. Measure $I_0$ (impact at last fill) and $I_\infty$ (impact 1 hour later, when temporary has decayed). The permanent fraction $I_\infty / I_0$ and transient fraction $1 - I_\infty / I_0$ are estimated from a sample of meta-orders.
Implications: If $I_\infty / I_0 \approx 0$ (nearly all impact is transient): the trade contained little information; the market maker's premium reverts as new limit orders refill the book. Execution cost is primarily temporary. This favours patient execution (TWAP or slower) because temporary impact decays. If $I_\infty / I_0 \approx 1$ (nearly all impact is permanent): the trade is informative; the new price level reflects updated fair value. Patient execution is dangerous: waiting exposes the trader to the permanent price move on the unexecuted residual. In Almgren-Chriss, the permanent parameter $\gamma$ captures this: large $\gamma$ means the unexecuted position marks against the trader, increasing the urgency of execution.